Are SSDs good for archival?

Archival storage refers to the long-term storage of digital information for preservation and future reference. The purpose of archival storage is to retain important data, documents, images, recordings, and other files indefinitely and protect them from being altered or destroyed (https://www.collinsdictionary.com/us/dictionary/english/archival-storage).

Solid State Drives (SSDs) are a type of computer storage device that uses flash memory and integrated circuit assemblies to store data. Unlike traditional hard disk drives that have spinning platters, SSDs have no moving mechanical components and are less prone to physical damage. SSDs also provide faster access times, better durability, and lower power consumption compared to HDDs (https://www.pcmag.com/encyclopedia/term/solid-state-drive).

This article examines the suitability of SSDs for archival data storage needs and the advantages and disadvantages of using SSDs for long-term information retention.

Benefits of SSDs for Archival

SSDs offer some potential benefits for archival storage use cases:

Fast access speeds – SSDs provide much faster random read/write performance compared to traditional hard disk drives (HDDs). This enables quicker access to archived data.

Low latency – The all-electronic nature of SSDs allows them to access data with very low latency. This results in near instantaneous data retrieval.

High throughput – SSDs are capable of high sustained throughput thanks to parallelization techniques like multiple channels. This enables fast data transfers for archived data.

Compact physical size – SSDs come in a much smaller form factor than HDDs with similar storage capacity. This allows fitting more storage capacity in a smaller space for archives.

Overall, the fast performance and compact size of SSDs are advantageous for some archive use cases that require frequent and fast access. However, there are also downsides to using SSDs for long-term archival storage as we’ll explore next.

Drawbacks of SSDs for Archival

SSDs have some key drawbacks that make them less than ideal for long-term archival storage:

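Limited write endurance – Each flash cell in an SSD can only withstand a finite number of program/erase (P/E) cycles before it wears out, whereas HDD platters have no comparable rewrite limit. Consumer SSDs are typically rated for anywhere from 300 to 3000 cycles. This makes SSDs better suited for read-heavy workloads, although an archive that is written once and then mostly read rarely comes close to exhausting that budget.[1]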

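Data retention concerns – SSDs hold data as electrical charge in flash cells, and that charge slowly leaks when the drive is left unpowered, so stored data can degrade over time. HDDs store data magnetically and do not depend on trapped charge. Higher-end SSDs may include capacitors that let in-progress writes finish if power is lost, but these only protect against sudden power loss; they do not stop retention from declining over the years.[2]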

Higher cost per GB – SSDs currently cost significantly more per gigabyte compared to HDDs. As archival requires large amounts of storage, HDDs tend to be much more cost effective for the huge storage capacities required.
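To put the write endurance figures above in perspective, a rough back-of-the-envelope estimate can help. The sketch below is illustrative only: the function names, the write amplification factor of 2.0, the 1 TB drive size, and the 20 GB/day write volume are all assumptions chosen for the example, not figures from any vendor.

```python
def total_host_writes_tb(capacity_tb, pe_cycles, write_amplification=2.0):
    """Approximate terabytes the host can write before the rated P/E cycles are used up.

    write_amplification is an assumed ratio of flash writes to host writes.
    """
    if write_amplification <= 0:
        raise ValueError("write amplification must be positive")
    return capacity_tb * pe_cycles / write_amplification


def years_until_worn(capacity_tb, pe_cycles, host_writes_gb_per_day,
                     write_amplification=2.0):
    """Years of service at a steady daily write volume (1 TB = 1000 GB)."""
    tbw = total_host_writes_tb(capacity_tb, pe_cycles, write_amplification)
    return tbw * 1000 / host_writes_gb_per_day / 365


if __name__ == "__main__":
    # Assumed example: a 1 TB drive receiving 20 GB of new data per day.
    for cycles in (300, 3000):
        years = years_until_worn(1.0, cycles, 20.0)
        print(f"{cycles} P/E cycles: roughly {years:.0f} years at 20 GB/day")
```

Under these assumptions even the low end of the cycle range lasts for decades, which suggests that for write-once archives the retention and cost drawbacks tend to matter more than endurance.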

Archival Use Cases for SSDs

While SSDs may not be well-suited for long-term archival storage, they can provide benefits in certain archival use cases where their strengths matter most:

Staging Area for Backups

SSDs can serve as a fast staging area for data backups before the data moves to long-term archival storage like tape or cloud storage. Their high speed allows quick backups, and their lack of moving parts makes them more rugged for temporary data dumps. For example, the SanDisk Extreme Portable SSD provides read speeds up to 1050MB/s and write speeds up to 1000MB/s.

Caching Layer

SSDs make an excellent caching layer to speed up access to slower long-term archival storage. Their high random read/write performance can cache hot data for low latency access compared to mechanical disks. For example, adding even a small SSD cache to a large archive server with HDD storage can dramatically improve performance.
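The caching idea can be illustrated with a small least-recently-used (LRU) read cache. This is a toy model, not how any particular storage product works: the `ReadCache` class, its `max_items` limit, and the dictionary standing in for the slow archive are all hypothetical names made up for the example.

```python
from collections import OrderedDict


class ReadCache:
    """Tiny LRU read cache standing in for an SSD tier in front of slow archive storage."""

    def __init__(self, fetch_from_archive, max_items=2):
        self._fetch = fetch_from_archive   # slow path, e.g. a read from HDD or tape
        self._max_items = max_items        # assumed cache capacity, counted in items
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self._items:
            self.hits += 1
            self._items.move_to_end(key)         # mark as most recently used
            return self._items[key]
        self.misses += 1
        data = self._fetch(key)                  # fall back to the slow archive
        self._items[key] = data
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)      # evict the least recently used item
        return data


if __name__ == "__main__":
    archive = {"a": b"alpha", "b": b"beta", "c": b"gamma"}
    cache = ReadCache(archive.__getitem__, max_items=2)
    for key in ["a", "b", "a", "c", "b"]:
        cache.read(key)
    print(f"hits={cache.hits} misses={cache.misses}")
```

The hit rate depends on how skewed the access pattern is: the more often the same "hot" items are requested, the more reads are served from the fast tier.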

Temporary Storage

For data that only needs to be archived for a short time (weeks to months), SSDs provide benefits like silent operation, resilience to vibration, and low power consumption. Their lack of moving parts also reduces the chance of mechanical failure during temporary storage. SSDs are a good option for short-term archiving needs before data deletion or long-term archival.

Alternatives to SSDs for Archival

When it comes to archival storage, SSDs have some drawbacks compared to other options like HDDs, tape drives, and optical discs:

HDDs (hard disk drives) have historically been the go-to for long-term data storage. Compared to SSDs, HDDs are less expensive per gigabyte of storage and are not subject to the flash wear that affects SSDs. HDDs are better suited for infrequently accessed “cold storage” of data. However, HDDs are mechanically more complex than SSDs and thus may be more prone to failure from shock or vibration. Overall, HDDs remain a very viable option for archival purposes, especially for large amounts of rarely accessed data [1].

Tape drives are still considered the gold standard for true archival storage. Tape is very inexpensive per gigabyte compared to HDDs and SSDs. Tape cartridges can be stored offline and have an expected lifetime of 30 years or more. The main downsides are slow access times and the cost of a tape drive. Tape is best suited for very large archives that do not need frequent access [2].

Optical discs like Blu-ray can provide archival storage for decades. They are relatively inexpensive, portable, and immune to electromagnetic disturbances. However, capacities are generally low compared to HDDs and SSDs. Optical discs must be physically inserted into and removed from a drive, so they are not as convenient as always-on HDD/SSD storage. Still, for small archives that only need infrequent access, optical discs remain a viable cold storage medium.

Overall, when choosing storage for archival purposes, SSDs have advantages like fast access, compact size, and resistance to shocks. But their limited endurance and higher $/GB make HDDs, tape, and optical discs potentially better choices for certain long-term cold storage use cases.

Maximizing SSD Endurance

There are several techniques that can help maximize the lifespan and endurance of SSDs:

Over-provisioning refers to leaving additional spare capacity on the SSD beyond what is exposed to the operating system. This allows the controller to better distribute writes across all the flash memory cells, preventing any single cell from wearing out prematurely. Most SSDs already come with some built-in over-provisioning from the manufacturer, but you can configure additional over-provisioning through disk utility software up to 20-30% of total capacity [1].

Enabling write caching can reduce the number of writes to the SSD by buffering data in RAM and combining small or repeated writes before they reach the drive. Just be aware that data still held in the cache will be lost in the event of power failure or improper shutdown [2].

Wear leveling balances out the number of program/erase cycles across all the flash memory cells in the SSD. This prevents any one block from degrading substantially faster than the rest. Wear leveling is handled automatically by the SSD controller and firmware [3].
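The effect of wear leveling can be seen in a toy simulation. This is a deliberately simplified sketch, not a model of real controller firmware: the `simulate_writes` function, the block count, and the assumption that 90% of unleveled writes land on the same tenth of the blocks are all invented for illustration.

```python
import random
from typing import List


def simulate_writes(num_blocks: int, num_writes: int, wear_leveling: bool,
                    seed: int = 0) -> List[int]:
    """Return per-block erase counts after a skewed write workload."""
    rng = random.Random(seed)
    erase_counts = [0] * num_blocks
    hot_blocks = max(1, num_blocks // 10)  # assumed "hot" region: 10% of blocks
    for _ in range(num_writes):
        if wear_leveling:
            # The controller redirects each write to the least-worn block.
            target = min(range(num_blocks), key=erase_counts.__getitem__)
        elif rng.random() < 0.9:
            # Without leveling, most writes keep hitting the same hot blocks.
            target = rng.randrange(hot_blocks)
        else:
            target = rng.randrange(num_blocks)
        erase_counts[target] += 1
    return erase_counts


if __name__ == "__main__":
    for enabled in (False, True):
        counts = simulate_writes(num_blocks=100, num_writes=1000, wear_leveling=enabled)
        print(f"wear_leveling={enabled}: most-worn block erased {max(counts)} times")
```

With leveling enabled the writes spread evenly, so no block is erased much more often than the average; without it, the hot blocks absorb most of the wear.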

SSD Data Retention Methods

There are a few key ways to maximize data retention on SSDs:

Refreshing data – To prevent data loss from charge leakage, it is recommended to power up SSDs that are stored unpowered every 6-12 months and read and rewrite their data. This refreshes the electrical charge and helps retain the data for longer periods. Some SSD controllers have a built-in data refresh feature that can automate this process, but it can only run while the drive is powered. Manually copying data off and back onto the SSD periodically can also work [1].

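Using enterprise SSDs – Enterprise SSDs designed for server and data center use often have higher endurance ratings and stronger error correction than consumer models. Their unpowered data retention is not necessarily longer, however. The JEDEC JESD218 endurance standard requires client SSDs to retain data for 1 year powered off at 30°C, but enterprise SSDs for only 3 months at 40°C, in both cases measured at the end of rated endurance. Check the retention figure in the drive datasheet rather than assuming the enterprise label means longer unpowered storage [2].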

Error correction – Error correcting code (ECC) logic in the SSD controller can detect and recover from bit errors that accumulate over time. The more advanced the ECC algorithm, the better it can handle retention issues from charge leakage. Enterprise SSDs tend to have more powerful ECC capabilities than consumer models [3].
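To show what error correction does in principle, the sketch below implements a Hamming(7,4) code, which can repair any single flipped bit in a 7-bit codeword. It is a stand-in chosen for brevity: real SSD controllers use much stronger codes (such as BCH or LDPC) over far larger blocks, and the function names here are illustrative only.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(codeword):
    """Return (data_bits, error_position); error_position is 0 if no error was found."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4   # syndrome names the flipped position
    if error_position:
        c[error_position - 1] ^= 1          # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], error_position


if __name__ == "__main__":
    stored = encode([1, 0, 1, 1])
    stored[4] ^= 1                          # simulate one bit flipped by charge leakage
    data, position = decode(stored)
    print(f"recovered {data}, corrected bit at position {position}")
```

A code like this fails once two bits flip in the same codeword, which is why retention errors that build up unchecked can eventually exceed what the ECC can repair.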

Cost Analysis

One of the key differences between SSDs and HDDs is cost. SSDs generally have a higher cost per gigabyte compared to HDDs.

According to Amazon Web Services, data storage on an SSD typically costs $0.08 – $0.10 per gigabyte, while data storage on an HDD costs just $0.03 – $0.06 per gigabyte. Across those ranges, HDDs are roughly 1.3 to 3.3 times cheaper per gigabyte than SSDs, or about 2 times cheaper at the midpoints.
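The ratio implied by the per-gigabyte figures above can be checked with a few lines of arithmetic. The helper names and the 100 TB archive size below are assumptions made for the example; the prices are the ranges quoted above.

```python
def price_ratio_range(ssd_per_gb, hdd_per_gb):
    """Return (smallest, largest) SSD-to-HDD price ratio for two (low, high) ranges."""
    ssd_low, ssd_high = ssd_per_gb
    hdd_low, hdd_high = hdd_per_gb
    return ssd_low / hdd_high, ssd_high / hdd_low


def archive_cost(capacity_tb, price_per_gb):
    """Cost of storing capacity_tb terabytes (1 TB = 1000 GB) at a per-gigabyte price."""
    return capacity_tb * 1000 * price_per_gb


if __name__ == "__main__":
    ssd = (0.08, 0.10)   # dollars per gigabyte, from the text above
    hdd = (0.03, 0.06)
    low, high = price_ratio_range(ssd, hdd)
    print(f"SSD costs {low:.1f}x to {high:.1f}x as much as HDD per gigabyte")
    for label, (low_price, high_price) in (("SSD", ssd), ("HDD", hdd)):
        print(f"100 TB on {label}: ${archive_cost(100, low_price):,.0f} "
              f"to ${archive_cost(100, high_price):,.0f}")
```

For a hypothetical 100 TB archive, the gap comes to several thousand dollars at these prices, which is why capacity-driven archives usually favor HDDs.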

According to a price history analysis posted on Reddit, the cost per terabyte for HDDs has decreased steadily since 2011, from around $30/TB to $13/TB in 2021. The decline has been gradual, without major drops when new technologies were introduced.

When looking at total cost of ownership, the higher upfront cost per gigabyte of SSDs needs to be weighed against other potential long-term savings from their benefits like faster performance, lower power usage, and higher reliability. While HDDs have a lower initial purchase price, over time SSDs can provide a better return on investment depending on the use case.

For archival use cases focused strictly on large amounts of inexpensive storage, HDDs tend to be more cost effective. But for applications where performance, reliability, and power efficiency matter, SSDs can justify their higher price through lower ongoing maintenance and operating costs.

Best Practices

When using SSDs for archival storage, it’s important to follow best practices to maximize data retention:

Use for temporary archival storage – While SSDs provide fast access, they are not suitable for long-term archival storage due to limited data retention. SSDs are best used for temporary archival storage where data will be accessed and refreshed periodically.

Refresh data periodically – To mitigate potential data loss over time, it’s critical to read and rewrite data on SSDs periodically, such as every 6-12 months. This refreshes the storage cells and avoids data decay.

Use enterprise SSDs – Enterprise SSDs designed for 24/7 operation have better endurance with higher write cycles before failure. Consumer SSDs have lower write endurance that wears out faster under archival workloads.

Additionally, enterprise SSDs often have capacitors that allow them to complete in-progress writes if power is lost. This avoids potential data corruption that can occur with consumer SSDs.
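The periodic refresh recommended above can be paired with checksum verification, so that decay is detected rather than silently copied forward. The following is a minimal sketch using only the Python standard library; the function names and the in-memory manifest format are hypothetical, and whether a file-level rewrite actually reprograms the underlying flash cells depends on the drive and filesystem, so treat it as an approximation of a refresh.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archive files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {str(p.relative_to(root)): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def verify(root, manifest):
    """Return the relative paths that are missing or whose checksum has changed."""
    root = Path(root)
    return [rel for rel, expected in manifest.items()
            if not (root / rel).is_file() or sha256_of(root / rel) != expected]


def refresh_file(path):
    """Rewrite a file via a temporary copy so its contents are written out again."""
    path = Path(path)
    with tempfile.NamedTemporaryFile(dir=path.parent, delete=False) as tmp:
        with path.open("rb") as src:
            shutil.copyfileobj(src, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())          # make sure the new copy reaches the drive
    shutil.copystat(path, tmp.name)     # keep timestamps and permissions
    os.replace(tmp.name, path)          # atomically swap in the fresh copy
```

A typical cycle would build the manifest when the archive is written and keep a copy of it somewhere other than the SSD. At each refresh, run `verify` first, restore any failed files from another copy, and only then call `refresh_file` on the rest.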

Conclusion

In summary, SSDs have both benefits and drawbacks for archival use cases. The benefits include faster read/write speeds, lower latency, smaller physical size, and lower power usage compared to HDDs. However, SSDs also have lower overall storage capacity per dollar, potential performance degradation over time, and concerns around long-term data retention.

The optimal use cases for SSD archiving tend to be for smaller datasets that need fast access speeds, such as databases, metadata, indexes, and caching. For larger archival datasets where capacity is key and access speeds are less important, HDDs remain the primary choice. To maximize endurance and data retention, it’s recommended to purchase enterprise-grade SSDs with high TBW ratings, leave spare capacity so the controller’s wear leveling can work effectively, keep TRIM enabled, and follow the other best practices above.

For most cost-effective archiving, a tiered storage approach combining SSDs and HDDs allows organizations to balance performance, capacity, and budget. SSDs can serve as a caching layer for frequently accessed “hot” data, while colder archives reside on larger HDDs. Testing retention yearly and migrating data to new drives periodically also helps ensure long-term accessibility.
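A tiering policy of this kind can be as simple as a rule based on how recently each item was accessed. The sketch below is one possible approach, not a recommendation: the function names, the 30-day "hot" window, and the sample file names are assumptions picked for illustration.

```python
from datetime import datetime, timedelta


def choose_tier(last_access, now, hot_window=timedelta(days=30)):
    """Return "ssd" for recently accessed data and "hdd" for everything colder."""
    return "ssd" if now - last_access <= hot_window else "hdd"


def plan_placement(last_access_by_name, now, hot_window=timedelta(days=30)):
    """Map each item name to the tier it should live on."""
    return {name: choose_tier(accessed, now, hot_window)
            for name, accessed in last_access_by_name.items()}


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    # Hypothetical archive contents with their last access times.
    items = {
        "index.db": datetime(2024, 1, 30),
        "scans-2015.tar": datetime(2021, 6, 1),
    }
    for name, tier in plan_placement(items, now).items():
        print(f"{name} -> {tier}")
```

In practice the window would be tuned to the archive's access pattern and to how much SSD capacity the budget allows.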