Why is read speed faster than write speed?

Read speed and write speed are important specifications to consider when selecting storage devices like hard disk drives (HDDs) and solid-state drives (SSDs). Read speed measures how fast data can be retrieved from the storage device, while write speed measures how quickly new data can be saved to it. Generally, read speed is faster than write speed for most storage media.

There are several key reasons why read operations tend to be faster than writes:

  • Caching and prefetching allow recently accessed data to be read quickly from faster memory rather than slower storage media
  • Reads can seek directly to the data they need, while writes often involve extra placement steps
  • Reading can be parallelized across multiple channels, heads, dies, etc.
  • Writes have additional overhead for things like error checking and wear leveling

We’ll explore these factors in more depth throughout this article.

Physical Differences

A human analogy helps illustrate why reading is typically faster than writing: eye movements are faster than hand and finger movements. The eyes move very quickly and take in information rapidly through saccades, which are quick jumps between fixation points. Saccades during reading can reach speeds of up to 500 degrees per second (Jiang et al., 2020). In contrast, hand and finger movements during typing are constrained by physiological limits: fingers can only move so quickly, and they must travel between keys on the keyboard. The maximum typing speed records are around 350 words per minute, significantly slower than reading speeds, which can exceed 500 words per minute (Reddit, 2019).

Eye movements also require less physical effort than typing. The eyes can take in information passively by looking at text, while typing requires active finger and hand motions that repeatedly strike keys. Because reading is less fatiguing for the body overall, it can be sustained for longer periods than active typing or writing.

Caching and Prefetching

Computers can optimize read speeds by caching and prefetching data. Caching involves storing frequently accessed data in a location that is faster to access than the original storage location, typically in a computer’s RAM or solid state drives (SSDs) rather than slower mechanical hard disk drives (HDDs) (Why is caching used to increase read performance?, 2022). When data is requested, the system will first check the cache to see if it is already available there, which is much faster than having to retrieve it from the original storage location. This avoids the latency of mechanical components like drive heads moving into position, allowing for much faster read speeds.
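The check-the-cache-first behavior described above can be sketched in a few lines. This is a minimal illustration, not how any particular operating system or drive implements its cache: the class name ReadThroughCache, the slow_disk_read stand-in, and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Tiny LRU read-through cache in front of a slow backing store."""

    def __init__(self, backing_read, capacity=2):
        self.backing_read = backing_read  # hypothetical slow read function
        self.capacity = capacity
        self.entries = OrderedDict()      # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.entries:
            self.hits += 1
            self.entries.move_to_end(block)   # mark as most recently used
            return self.entries[block]
        self.misses += 1
        data = self.backing_read(block)       # slow path: go to storage
        self.entries[block] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return data


def slow_disk_read(block):
    # Stand-in for a real device read; returns fake block contents.
    return f"data-{block}".encode()


cache = ReadThroughCache(slow_disk_read, capacity=2)
for block in [1, 2, 1, 3, 1]:
    cache.read(block)

print(cache.hits, cache.misses)
```

In this access pattern, block 1 is requested three times but only fetched from the backing store once; the other two requests are served from the cache.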

In addition to caching recently accessed data, computers can also prefetch data by predicting what data is likely to be needed soon and fetching it ahead of time into the cache. This enables extremely fast read speeds when the predicted data is requested since it is already sitting readily available in the fast cache. Prefetching exploits locality of reference principles, anticipating future data needs based on patterns and algorithms (What is Caching and How it Works, 2022). Overall, intelligently leveraging caching and prefetching allows computers to optimize for fast read speeds by reducing mechanical motions and taking advantage of the speed of solid state memory.
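A simple form of prefetching is read-ahead: when one block is requested, the next few are fetched as well on the assumption that access is sequential. The sketch below is illustrative only. The names ReadAheadReader and fake_block_read are hypothetical, and the prefetch runs synchronously here for simplicity, whereas a real system would typically fetch ahead in the background.

```python
class ReadAheadReader:
    """Fetches the requested block plus the next few, assuming sequential access."""

    def __init__(self, backing_read, read_ahead=2):
        self.backing_read = backing_read  # hypothetical slow read function
        self.read_ahead = read_ahead      # how many blocks to fetch ahead
        self.buffer = {}                  # prefetched blocks not yet consumed
        self.waits = 0                    # requests that had to go to storage
        self.prefetch_hits = 0            # requests served from prefetched data

    def read(self, block):
        if block in self.buffer:
            self.prefetch_hits += 1
            return self.buffer.pop(block)
        self.waits += 1
        data = self.backing_read(block)
        # Predict that the following blocks will be wanted next.
        for nxt in range(block + 1, block + 1 + self.read_ahead):
            self.buffer[nxt] = self.backing_read(nxt)
        return data


def fake_block_read(block):
    # Stand-in for a real device read.
    return f"block-{block}".encode()


reader = ReadAheadReader(fake_block_read, read_ahead=2)
results = [reader.read(b) for b in range(6)]

print(reader.waits, reader.prefetch_hits)
```

For six sequential requests with a read-ahead of two, only two requests have to wait on storage; the other four find their data already buffered.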

Seeking vs Sequential

There is a significant difference between locating data to read and placing data to write. With random access, a read can go directly to any location in storage: the drive looks up the target address and, on an HDD, moves the head straight to it (Source). A sequential write cannot skip ahead in the same way, because each new piece of data is laid down after the preceding one, and on an HDD the head moves across the sectors in order as it writes (Source).

Overall, the ability to seek directly to a known location helps keep reads fast, since a read never has to work through data it does not need. Note that this compares a targeted lookup with a longer placement process; when the access pattern is the same, sequential transfers on an HDD are generally faster than random ones for reads and writes alike.
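Seeking directly to a location can be illustrated at the file level. The sketch below assumes a file of fixed-size records, so that the byte offset of any record can be computed; RECORD_SIZE, read_record, and the demo file are made up for the example.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed record size in bytes


def read_record(path, index):
    """Jump straight to record `index` without reading the ones before it."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)   # move to the record's byte offset
        return f.read(RECORD_SIZE)


# Build a small demo file of 100 fixed-size, space-padded records.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(100):
        tmp.write(f"record-{i:04d}".encode().ljust(RECORD_SIZE, b" "))
    demo_path = tmp.name

record_42 = read_record(demo_path, 42).rstrip()
os.remove(demo_path)

print(record_42)
```

Only the 16 bytes of the requested record are read; the 42 records before it are skipped entirely.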

Parallelism

One key difference between reading and writing is that reads are easier to parallelize across multiple disks or controllers than writes. As this Stack Overflow answer explains, reading from multiple disks at once improves performance because the relatively slow seeks can happen in parallel across disks: “A disk’s read speed is typically much faster than its seek speed, so if you do a lot of seeking, it will surely slow down. I predict that your parallel reads will go faster than your serial reads.”

In contrast, writes that touch the same data must be serialized and coordinated, because overlapping writes to the same location risk data corruption. Independent writes can still be spread across disks, but the coordination cost means writes usually benefit less from parallelism than reads do.
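Issuing reads to several devices at once can be sketched with a thread pool. This is only an illustration under simplifying assumptions: the disks dictionary and read_stripe function are hypothetical stand-ins for real devices, and each "disk" holds one stripe of the same logical data.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical "disks": each holds one stripe of the same logical file.
disks = {
    0: b"The quick ",
    1: b"brown fox ",
    2: b"jumps over",
}


def read_stripe(disk_id):
    # Stand-in for a blocking device read.
    return disks[disk_id]


with ThreadPoolExecutor(max_workers=len(disks)) as pool:
    # map() yields results in input order, even if reads finish out of order.
    stripes = list(pool.map(read_stripe, sorted(disks)))

data = b"".join(stripes)
print(data)
```

Because the reads do not modify anything, no locking is needed. Concurrent writes to the same stripe would need coordination to avoid overlapping.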

Write Overhead

Write operations often require substantial overhead beyond simply writing the data to storage. For example, writes may require error checking to confirm successful write completion, data integrity checks, logging for potential rollbacks or debugging, and index updates to support efficient reads (MongoDB — Overhead on write speed as indexes increase, https://medium.com/@rishabh011/mongodb-overhead-on-write-speed-as-indexes-increase-f28ac24d5e6b). Since writes modify the state of the data, they require extra processing before and after the physical write to storage. In contrast, reads are simply accessing existing data and do not incur as much ancillary processing.

Additionally, the OS and hardware may introduce write overhead. The OS needs to receive the write request, allocate resources, update metadata, and finally schedule the physical write. The storage hardware also processes write requests differently than reads, including caching, queueing, error correction, and destaging to physical media. All these factors introduce delays and variability beyond the raw write speed of the underlying storage media.
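Some of this overhead is visible even in application code. The sketch below shows one possible shape of a careful write: it adds a checksum for integrity checking and then asks the OS to push the data to the device. The durable_write and checked_read helpers and the file layout (a 4-byte CRC32 followed by the payload) are assumptions made for the example, not a standard format.

```python
import os
import shutil
import tempfile
import zlib


def durable_write(path, payload):
    """Write payload plus a CRC32, then force it to stable storage."""
    checksum = zlib.crc32(payload).to_bytes(4, "big")
    with open(path, "wb") as f:
        f.write(checksum + payload)
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to push its cache to the device


def checked_read(path):
    """Read the file back and verify the stored checksum."""
    with open(path, "rb") as f:
        raw = f.read()
    checksum, payload = raw[:4], raw[4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        raise ValueError("checksum mismatch")
    return payload


demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "record.bin")
durable_write(demo_file, b"hello storage")
read_back = checked_read(demo_file)
shutil.rmtree(demo_dir)

print(read_back)
```

The write path does three things beyond copying bytes (compute a checksum, flush, and sync), and the sync in particular waits on the device rather than returning as soon as the data reaches a memory cache.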

Write Amplification

Write amplification (WA) is an undesirable phenomenon associated with flash memory and solid-state drives (SSDs) where the actual amount of information physically written to the storage media is a multiple of the logical amount intended to be written (Source: https://en.wikipedia.org/wiki/Write_amplification). This happens due to the way SSDs handle writes at the hardware level.

Specifically, SSDs write data in pages but erase it in larger blocks made up of many pages. If a page contains existing data that needs to be overwritten, the SSD can’t simply overwrite it in place. In the simplest case, it has to copy the other valid pages out of the block, erase the old block, apply the change to the affected page, and write the block’s contents back to flash. This amplification effect means even a small 512-byte write may result in 4 KB or more actually being written to the physical media (Source: https://www.tuxera.com/blog/what-is-write-amplification-why-is-it-bad-what-causes-it/).

This write amplification can have a major impact on performance for write-intensive workloads. Even though the SSD may have high theoretical write speeds, the observed speeds can be much lower because each logical write triggers additional physical writes. The effect is particularly pronounced for random write workloads. Write amplification also reduces the lifespan of SSDs, since more writes occur at the hardware level.
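Write amplification is usually expressed as a ratio: the bytes physically written to flash divided by the bytes the host asked to write. The short calculation below works through two cases from this section. The page size of 4096 bytes and the figure of 64 pages per block are assumed values chosen for illustration; real drives vary.

```python
def write_amplification(host_bytes, flash_bytes):
    """Ratio of bytes physically written to bytes the host asked to write."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return flash_bytes / host_bytes


PAGE_SIZE = 4096        # assumed page size in bytes
PAGES_PER_BLOCK = 64    # assumed number of pages per erase block

# Case 1: a 512-byte host write still programs one full page.
small_write_wa = write_amplification(512, PAGE_SIZE)

# Case 2: worst case under the simplified model in this section, where
# changing one page forces the whole block to be rewritten.
block_rewrite_wa = write_amplification(PAGE_SIZE, PAGE_SIZE * PAGES_PER_BLOCK)

print(small_write_wa, block_rewrite_wa)
```

Under these assumptions the small write is amplified 8 times and the single-page update 64 times, which is why random small writes tend to suffer most.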

Read Optimization

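There are several techniques that can optimize database read speed. Some examples, drawn from the mechanisms discussed above, include:

Caching – Keeping frequently accessed data in faster memory avoids repeated trips to slower storage.

Prefetching – Fetching data that is likely to be needed soon hides storage latency when access patterns are predictable.

Indexing – Maintaining indexes supports efficient reads, at the cost of extra work on every write.

Parallel reads – Spreading reads across multiple disks or controllers lets slow seeks overlap.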

Write Optimization

There are various techniques that can optimize write speed in databases and storage systems. Some examples include:

Write caching – Caching recently written data in memory can reduce the number of slow physical writes to disk (https://danielfoo.medium.com/11-database-optimization-techniques-97fdbed1b627).

Write combining – Combining multiple pending writes into a single larger physical write can improve performance by reducing total disk operations (https://www.britannica.com/science/optimization).

Write offloading – Offloading writes to specialized logging or queueing servers can free up the main database for reads and other queries (https://en.wikipedia.org/wiki/Mathematical_optimization).

Compression – Compressing written data before storing can reduce the total amount needing to be written. This is especially effective for text or numeric data.

Caching indexes – Storing indexes in memory can avoid expensive disk reads for every index update.

Batching – Grouping multiple writes together into transactions or batches amortizes fixed per-write overhead across larger operations.

Asynchronous I/O – Performing writes asynchronously allows the application to continue while writes happen in background.

Partitioning – Splitting data across multiple disks/servers allows writes to occur in parallel.
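Batching, from the list above, is easy to demonstrate with Python's built-in sqlite3 module. The sketch below inserts many rows inside a single transaction so that the fixed commit cost is paid once rather than per row. The events table and its columns are made up for the example, and an in-memory database is used only to keep it self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [(i, f"event-{i}") for i in range(1000)]

# One transaction for all rows: the commit cost is paid once, not 1000 times.
# Using the connection as a context manager commits on success and rolls
# back if an exception is raised.
with conn:
    conn.executemany("INSERT INTO events (id, payload) VALUES (?, ?)", rows)

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
conn.close()

print(row_count)
```

On a disk-backed database the difference is typically much larger than in memory, because each separate commit would otherwise wait for the data to reach storage.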

Conclusion

In summary, there are several key reasons why read speed tends to be faster than write speed:

Physical differences – Read heads are simpler than write heads. Also, reading data does not require the heads to actually modify the storage medium.

Caching and prefetching – Reads can leverage various caches and prefetched data to boost speeds.

Seeking vs sequential – A read can seek directly to the data it needs, while a write often involves extra placement steps.

Parallelism – Reads can be spread across multiple disks or controllers, which increases total read bandwidth.

Write overhead – Additional encoding, error checking, and sync processes slow writes compared to reads.

Write amplification – Because an SSD must erase old blocks before rewriting them, it performs significantly more physical writes behind the scenes than the user requested.

In general, modern storage systems employ a variety of optimizations on the read path that are difficult or impossible to mirror on the write path. So while both read and write performance continue to improve, reads will likely maintain a healthy speed advantage due to fundamental differences in the underlying physical processes.