How does an SSD find data?

A solid state drive (SSD) is a data storage device that uses integrated circuit assemblies and flash memory to store data persistently. Unlike a traditional hard disk drive (HDD), which has spinning platters and moving read/write heads, an SSD has no moving mechanical parts and stores its data on flash memory chips.

This key difference gives SSDs advantages over HDDs in terms of performance, power consumption, physical durability, and operating noise. SSDs have much faster data read/write speeds, use less power, can withstand more physical shock, and run silently. However, HDDs have traditionally had a cost advantage in terms of dollars per gigabyte. Over time the price of SSDs has been decreasing while the storage capacity has increased.

In summary, the key differences between SSDs and HDDs are:

  • SSDs use flash memory and have no moving parts, while HDDs store data on spinning platters and move read/write heads.
  • SSDs are faster, quieter, and more durable/shock-resistant than HDDs.
  • HDDs traditionally had a price advantage per gigabyte, but SSD prices have been dropping.

Flash Memory

At a fundamental level, SSDs use flash memory to store data. Flash memory is made up of memory cells, each of which stores data in the form of an electric charge. The most common type of flash memory used in SSDs is NAND flash memory.

NAND flash memory gets its name from the NAND logic gate, which its cell arrangement resembles. The memory cells are connected in series to form a NAND string. Data is written to a cell by applying a voltage to the control gate to inject electrons into the floating gate, which changes the cell’s threshold voltage. In a single-level cell, this threshold voltage represents either a 1 or a 0 for binary data storage. Reading the cell determines whether the threshold voltage is above the reference value, indicating a 0, or below it, indicating a 1.
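The read step can be illustrated with a minimal sketch. The voltage values and the name read_slc_cell are made up for illustration; real reference voltages depend on the flash part, and cells that store more than one bit compare against several references.

```python
# Hypothetical reference voltage; real values depend on the flash part.
REFERENCE_VOLTS = 2.5


def read_slc_cell(threshold_volts: float, reference_volts: float = REFERENCE_VOLTS) -> int:
    """Return the bit stored in a single-level cell.

    Trapped electrons raise the threshold voltage, so a programmed cell
    sits above the reference (0) and an erased cell sits below it (1).
    """
    return 0 if threshold_volts > reference_volts else 1


erased_cell = 1.0       # no extra charge: low threshold (illustrative value)
programmed_cell = 4.0   # electrons trapped: high threshold (illustrative value)

print(read_slc_cell(erased_cell))      # 1
print(read_slc_cell(programmed_cell))  # 0
```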

NAND flash memory is organized into pages and blocks. A page is the smallest unit that can be programmed, while a block, consisting of multiple pages, is the smallest unit that can be erased. Because data is written at the page level but erased only at the block level, a page cannot be overwritten in place: its whole block must be erased before the page can be programmed again.
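The page and block relationship can be modeled in a few lines. This is a toy sketch, not how any real controller is written: the Block class and its four-page size are assumptions chosen to keep the example short (real blocks hold far more pages).

```python
class Block:
    """A toy erase block: pages are programmed one at a time, erased all at once."""

    def __init__(self, pages_per_block: int = 4):
        self.pages = [None] * pages_per_block  # None means erased (programmable)
        self.erase_count = 0

    def program(self, page_index: int, data: bytes) -> None:
        # A page that already holds data cannot be programmed again.
        if self.pages[page_index] is not None:
            raise ValueError("page already programmed; erase the whole block first")
        self.pages[page_index] = data

    def erase(self) -> None:
        # Erasing always clears every page in the block.
        self.pages = [None] * len(self.pages)
        self.erase_count += 1


block = Block()
block.program(0, b"hello")
try:
    block.program(0, b"world")   # rewriting a programmed page is rejected
except ValueError as err:
    print(err)
block.erase()                    # clears every page in the block
block.program(0, b"world")       # now the page can be programmed again
```

The erase_count field is there because each erase wears the block a little, which is the quantity wear leveling tries to balance.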

Memory Controller

The SSD memory controller acts as the interface between the host system and the NAND flash memory components inside the SSD. It is considered one of the most important parts of the SSD (Sabrent). The controller manages all of the memory access requests to the flash memory chips and ensures accurate and efficient data reads and writes (Storage Review).

Some key responsibilities of the SSD controller include:

  • Translating requests from the host interface into instructions for the flash memory
  • Managing the NAND flash to ensure data integrity and proper wear leveling
  • Conducting error correction, encryption, compression/decompression, caching, and other data processing tasks
  • Monitoring and reporting on SSD performance and lifespan

The capabilities of the SSD controller have a major influence on the overall speed, lifespan, and features of the SSD. More advanced controllers enable faster data transfer speeds, improve endurance, and support extra features like hardware encryption. Many SSD manufacturers design their own proprietary controllers tailored for their specific NAND flash and SSDs (Sabrent).

File System

The file system is responsible for organizing data on the SSD and determining how it is stored, accessed and updated. Some common file systems used on SSDs include:

NTFS – This is the most common file system used on Windows operating systems. It supports large partition sizes and advanced data storage features like encryption and compression. However, it was designed for hard drives and lacks some SSD optimization features.

exFAT – A lightweight file system optimized for flash memory and external storage devices. It’s supported on most operating systems and lacks some advanced NTFS features, but works well on SSDs.

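EXT4 – The most widely used Linux file system, EXT4 is considered reliable and efficient. It works well on SSDs: it supports TRIM (through the discard mount option or periodic fstrim runs), and its delayed allocation feature lets it group writes before deciding where to place them.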

Btrfs – A newer Linux copy-on-write file system with an SSD-aware mount option and strong checksums. It does not perform wear leveling itself; that remains the job of the SSD controller. However, it’s not as mature or widely supported as EXT4.

F2FS – Designed specifically for NAND flash storage such as SSDs, F2FS uses a log-structured layout and supports TRIM and other flash-friendly features. It’s gaining adoption, especially on Linux.

For most consumer uses, NTFS or exFAT provide a good balance of compatibility and features for SSDs on Windows, while EXT4 remains the standard for Linux SSD file storage.

Wear Leveling

Wear leveling is a technique used in SSDs to distribute writes evenly across all the blocks in the flash memory and prevent any one block from wearing out prematurely. This extends the lifespan of the SSD.

SSDs are made up of many NAND flash memory cells which can endure a limited number of erase cycles before wearing out. Wear leveling aims to distribute these erase cycles evenly so that the drive does not become unusable when a small number of cells fail. This is done by remapping logical block addresses to different physical locations in the flash memory over time. The SSD controller transparently handles these remappings in the background.
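The remapping idea can be sketched as a small logical-to-physical table. This is a simplified illustration under stated assumptions: the WearLevelingMap class is hypothetical, it tracks whole blocks rather than pages, and it erases a stale block immediately, whereas a real controller defers that work.

```python
class WearLevelingMap:
    """Toy logical-to-physical map that steers writes to the least-worn free block."""

    def __init__(self, physical_blocks: int):
        self.erase_counts = [0] * physical_blocks
        self.free = set(range(physical_blocks))
        self.l2p = {}  # logical block address -> physical block

    def write(self, lba: int) -> int:
        # Pick the free physical block with the fewest erases so far
        # (ties broken by block number to keep the example deterministic).
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        old = self.l2p.get(lba)
        if old is not None:
            # The previous copy is stale: erase it and return it to the free pool.
            self.erase_counts[old] += 1
            self.free.add(old)
        self.l2p[lba] = target
        return target

    def read(self, lba: int) -> int:
        # Finding data is a table lookup from logical to physical location.
        return self.l2p[lba]


ftl = WearLevelingMap(physical_blocks=4)
for _ in range(12):
    ftl.write(lba=0)  # the host keeps rewriting the same logical address
print(ftl.erase_counts)
```

Even though the host rewrites a single logical address, the erases end up spread across all four physical blocks rather than landing on one.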

Overprovisioning refers to reserving a portion of the total flash capacity for wear leveling, garbage collection, and other background tasks. Typically 7-28% extra capacity is provisioned for this. Having this spare area allows more flexible wear leveling and improves performance by reducing write amplification. Overprovisioning combined with wear leveling enables SSDs to keep functioning reliably for years.

“Overprovisioning along with wear leveling algorithms allows SSDs to extend endurance while maintaining peak performance throughout the life of the SSD.” (https://www.atpinc.com/blog/how-SSD-wear-leveling-works)
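As a rough worked example, the overprovisioning percentage can be computed from raw and user-visible capacity. The sketch below assumes the common convention of expressing spare capacity relative to user capacity; the capacities are hypothetical and not taken from any specific drive.

```python
def overprovisioning_percent(physical_gb: float, user_gb: float) -> float:
    """Spare capacity as a percentage of user-visible capacity (assumed convention)."""
    return (physical_gb - user_gb) / user_gb * 100


# Hypothetical drive: 512 GB of raw flash exposed as 480 GB.
print(round(overprovisioning_percent(512, 480), 1))  # 6.7

# The same raw flash exposed as 400 GB leaves far more spare area.
print(round(overprovisioning_percent(512, 400), 1))  # 28.0
```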

Garbage Collection

Garbage collection is an important process that helps SSDs maintain performance. When data is deleted or overwritten on an SSD, the pages that data occupied don’t get erased right away. Instead, they are marked as invalid. Garbage collection works in the background to find blocks that contain invalid pages and erase them so they can be reused.

The garbage collection process involves copying the valid pages that remain in such a block to a new block so the old block can be erased. This helps ensure there are always free erased blocks available for new writes. The copying adds extra flash writes (write amplification), so controllers try to pick blocks with few valid pages left. Without garbage collection, SSD performance would degrade over time as invalid data accumulates.
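One pass of this process can be sketched as follows. It is a simplified model, assuming a greedy policy that picks the block with the fewest valid pages; the function name collect_garbage and the page-state strings are made up for illustration, and real controllers use more elaborate policies.

```python
def collect_garbage(blocks: list) -> int:
    """Reclaim one block in place; return how many valid pages had to be copied.

    Each block is a list of page states: "valid", "invalid", or "free".
    """
    # Only blocks that hold invalid pages are worth reclaiming.
    candidates = [b for b in blocks if "invalid" in b]
    if not candidates:
        return 0

    # Greedy choice (an assumption): the candidate with the fewest valid pages.
    victim = min(candidates, key=lambda b: b.count("valid"))
    to_copy = victim.count("valid")

    free_elsewhere = sum(b.count("free") for b in blocks if b is not victim)
    if free_elsewhere < to_copy:
        raise RuntimeError("not enough free pages to relocate valid data")

    # Relocate the victim's valid pages into free pages of other blocks.
    remaining = to_copy
    for block in blocks:
        if block is victim:
            continue
        for i, state in enumerate(block):
            if remaining and state == "free":
                block[i] = "valid"
                remaining -= 1

    # Erase the victim: every page becomes free again.
    victim[:] = ["free"] * len(victim)
    return to_copy


blocks = [
    ["valid", "invalid", "invalid", "invalid"],
    ["valid", "valid", "invalid", "valid"],
    ["free", "free", "free", "free"],
]
copied = collect_garbage(blocks)
print(copied)     # 1
print(blocks[0])  # ['free', 'free', 'free', 'free']
```

The return value counts page writes the drive performs on its own, in addition to whatever the host asked for.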

TRIM

TRIM is a command that lets the operating system tell an SSD which blocks of data are no longer in use. Here’s how it works:

NAND flash is read and written in pages (typically a few kilobytes each) and erased in larger blocks. When a file is deleted, the file system only marks the file’s logical blocks as free in its own metadata. The SSD has no way of knowing that those blocks no longer hold useful data, so it keeps treating them as valid.

This is where the TRIM command comes in. The operating system sends the TRIM command to the SSD, either as files are deleted or periodically in batches, notifying it of which blocks of data are no longer needed. The SSD can then mark the corresponding pages as invalid, so garbage collection can erase them and add them to the pool of free blocks without copying stale data. This helps maintain the performance of the SSD over time.

Enabling TRIM lets the drive learn about deleted data, rather than only the file system removing its file pointers. This prevents a gradual loss of performance as stale data fills up the SSD. (https://www.donemax.com/wiki/ssd-trim-mac.html)
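The effect of TRIM on the drive’s bookkeeping can be shown with a toy model. The ToyDrive class and the address range are hypothetical; the sketch only tracks which logical addresses the drive believes are still in use, which is the information TRIM updates.

```python
class ToyDrive:
    """Tracks which logical addresses the drive believes hold live data."""

    def __init__(self):
        self.live = set()

    def write(self, lba: int) -> None:
        self.live.add(lba)

    def trim(self, lbas) -> None:
        # The host reports that these addresses no longer hold useful data.
        self.live.difference_update(lbas)

    def pages_gc_must_preserve(self) -> int:
        # Anything still considered live has to be copied during garbage collection.
        return len(self.live)


file_lbas = range(100, 108)   # hypothetical file occupying 8 logical pages

drive = ToyDrive()
for lba in file_lbas:
    drive.write(lba)

# Deleting the file changes only file system metadata, so the drive sees nothing.
print(drive.pages_gc_must_preserve())  # 8

drive.trim(file_lbas)
print(drive.pages_gc_must_preserve())  # 0
```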

Caching

SSD caching utilizes the SSD’s high speed and low latency to accelerate read/write operations to the primary HDD storage. There are several common SSD caching techniques:

  • Read cache – Copies frequently accessed data from the HDD to the SSD. Subsequent reads are served from the SSD cache instead of the HDD, improving read performance.
  • Write cache – Writes are initially stored in the SSD cache and later flushed to the HDD in the background. This improves write responsiveness.
  • Read/write cache – Combines both read and write caching. This provides the full benefit of SSD caching for both reads and writes.

SSD caching is managed dynamically without user intervention. The cache contents evolve over time as usage patterns change. Advanced SSD controllers use algorithms that weigh factors such as access frequency, recency, and partition hotness to optimize caching effectiveness.
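A recency-based read cache, one of the simpler policies mentioned above, can be sketched like this. The ReadCache class is a hypothetical illustration that uses a least-recently-used (LRU) eviction rule; a plain dictionary stands in for the HDD, and actual caching software may combine several signals.

```python
from collections import OrderedDict


class ReadCache:
    """Toy recency-based (LRU) read cache in front of a slower backing store."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing          # stands in for the HDD
        self.capacity = capacity
        self.cache = OrderedDict()      # stands in for the SSD cache
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)        # mark as most recently used
            return self.cache[key]
        self.misses += 1
        value = self.backing[key]              # slow path: go to the backing store
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)     # evict the least recently used entry
        return value


hdd = {n: f"block-{n}" for n in range(10)}
cache = ReadCache(hdd, capacity=2)
for n in [1, 2, 1, 3, 1, 2]:
    cache.read(n)
print(cache.hits, cache.misses)  # 2 4
```

Block 1 stays cached because it keeps being read, while blocks 2 and 3 are evicted in turn.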

Conclusion

In summary, SSDs locate data using several key processes and components. The flash memory stores data across interconnected chips, while the memory controller handles read and write requests and manages where data gets written. The file system organizes the drive’s logical blocks into files and directories. Wear leveling helps distribute writes evenly to avoid wearing out frequently written cells. Garbage collection reclaims space by relocating valid data and erasing blocks that hold invalid data. TRIM enables the operating system to notify the SSD of data that is no longer needed. SSD caching improves performance by keeping frequently accessed data on the SSD instead of a slower HDD. By leveraging these techniques, SSDs can quickly and reliably access data written across their flash memory.

Further Reading

Here are some additional resources to learn more about how SSDs work: