Which RAID option is best for backup?

RAID (Redundant Array of Independent Disks) is a data storage technology that combines multiple disk drive components into a logical unit. There are several different RAID levels, each with its own set of pros, cons and use cases:

RAID 0 – Disk striping without parity or mirroring. Fast performance but no redundancy. If one drive fails, all data is lost.

RAID 1 – Disk mirroring. Provides redundancy by duplicating all data on secondary disks, but doubles disk cost.

RAID 5 – Block-level striping with distributed parity. Provides fault tolerance with minimal storage overhead. Can withstand one disk failure.

RAID 6 – Block-level striping with double distributed parity. Provides fault tolerance for up to two disk failures.

RAID 10 – Mirroring plus striping. Provides fast performance and can withstand multiple drive failures, as long as no mirrored pair loses both of its disks. Requires a minimum of four disks.

When it comes to choosing the best RAID level for backups, there is an ongoing debate over the merits of RAID 1 vs RAID 5/6. This article will explore the pros and cons of each option.

RAID 0

RAID 0, also known as disk striping, spreads data evenly across multiple drives without parity or redundancy. This RAID level breaks up data into blocks and stripes the blocks across all the drives in the array (https://www.stellarinfo.co.in/blog/advantages-and-disadvantages-popular-raid-systems/). The main advantage of RAID 0 is faster performance than a single drive, since data can be read from and written to several disks at the same time. However, RAID 0 provides no redundancy: if one drive fails, all data will be lost. For this reason, RAID 0 is generally not recommended for backup purposes where data protection is critical.

While RAID 0 improves performance, it comes at the cost of increased risk. With no parity or mirroring, the failure of just a single drive will result in total data loss for the RAID set. This lack of fault tolerance makes RAID 0 unsuitable for mission critical or highly available applications (https://www.techtarget.com/searchdatabackup/tip/RAID-1-vs-RAID-0-Which-level-is-best-for-data-protection). The benefits of speed must be weighed carefully against the high risk of irrecoverable data loss.
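
To make the striping idea concrete, here is a minimal sketch of how a logical block number could map onto disks in a RAID 0 array. It assumes a simple round-robin layout with one block per stripe unit; the function name locate_block is hypothetical, and real controllers use configurable chunk sizes.

```python
def locate_block(block_index: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk, row) in a round-robin RAID 0 layout."""
    disk = block_index % num_disks    # consecutive blocks rotate across the disks
    row = block_index // num_disks    # position of the block on its disk
    return disk, row


# Layout of the first eight logical blocks on a hypothetical 4-disk array.
for block in range(8):
    disk, row = locate_block(block, num_disks=4)
    print(f"block {block} -> disk {disk}, row {row}")
```

Because every disk holds a slice of every large file, losing any one disk leaves gaps throughout the data, which is why a single failure destroys the whole set.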

RAID 1

RAID 1, also known as disk mirroring, involves duplicating data across two or more drives (referred to as a mirrored set). The rationale behind RAID 1 is to provide redundancy in case one drive fails.

All data is written to every drive in the mirror at the same time, providing fault tolerance. If one drive goes down, the system keeps running from the remaining drive, typically without any interruption in service. This protects against data loss due to hardware failure.

The tradeoff with RAID 1 is cost. A two-way mirror needs twice as many drives as the same capacity would without redundancy, so there is a significant additional expense. However, for mission critical systems where downtime is unacceptable, the extra cost may be justified.

In summary, RAID 1 offers complete data redundancy and protection through drive mirroring, though at a higher hardware cost.

RAID 5

RAID 5 provides distributed parity and block-level striping. This means the parity information is distributed across multiple drives, providing redundancy without requiring a dedicated parity drive like in RAID 3 and RAID 4. With RAID 5, if one drive fails, the RAID system can rebuild the data on the failed drive using the parity information spread across the remaining drives.

Compared to RAID 1, which requires full duplication of all data, RAID 5 provides redundancy at lower cost, since parity consumes only one drive's worth of capacity regardless of how many drives are in the array. This makes RAID 5 a popular option for cost-effective redundancy in situations that require high read performance and can accept lower write performance compared to RAID 0 or RAID 10.
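
The rebuild step relies on XOR parity: the parity block is the XOR of the data blocks in a stripe, so any one missing block equals the XOR of everything that survives. The sketch below illustrates this for a single stripe. The helper name xor_blocks and the sample data are made up for the example, and it ignores the way a real RAID 5 array rotates the parity block from disk to disk.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks of one stripe, on three disks
parity = xor_blocks(data)            # parity block, stored on a fourth disk

# Simulate losing the second disk and rebuild its block from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The same property explains the single-failure limit: with two blocks of a stripe missing, one XOR equation is not enough to solve for both.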

RAID 6

RAID 6 provides double distributed parity, which means that data is striped across multiple disks like RAID 5, but each stripe carries two independent parity blocks instead of one. This provides extra fault tolerance compared to RAID 5, as RAID 6 can withstand the failure of up to two disks without data loss (whereas RAID 5 can only handle one disk failure).

The tradeoff for this additional resilience is reduced write performance. Because RAID 6 has to calculate and write two parity blocks for every stripe it updates, rather than one as in RAID 5, there is more computational and I/O overhead involved. This can result in slower write speeds in RAID 6 arrays, especially as the number of disks increases.

However, the extra fault tolerance makes RAID 6 a popular choice for mission critical storage and applications where uptime and data integrity are paramount. The likelihood of two disks failing simultaneously is low, but RAID 6 provides an extra layer of protection compared to RAID 5. This makes it well-suited for backup storage, where avoiding data loss is a key priority.

Overall, RAID 6 offers excellent resilience against disk failures, at the cost of reduced write performance. This tradeoff makes it ideal for backup applications where data integrity is critical, and write speed is less of a concern.
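
A rough way to quantify the write tradeoff is the commonly cited "write penalty": the number of back-end disk operations needed for one small random write. The sketch below uses the usual textbook values (4 for RAID 5, 6 for RAID 6) as assumptions, not measurements; the function name and the 200 IOPS per disk figure are hypothetical, and caching or full-stripe writes can change the picture considerably.

```python
# Assumed textbook values: back-end I/O operations per small random write.
WRITE_PENALTY = {"RAID 0": 1, "RAID 1": 2, "RAID 5": 4, "RAID 6": 6, "RAID 10": 2}


def effective_write_iops(level: str, num_disks: int, iops_per_disk: float) -> float:
    """Estimate random-write IOPS for an array from the raw per-disk IOPS."""
    return num_disks * iops_per_disk / WRITE_PENALTY[level]


# Hypothetical array: 8 disks, each able to sustain 200 random IOPS.
for level in ("RAID 5", "RAID 6"):
    print(f"{level}: {effective_write_iops(level, 8, 200):.0f} write IOPS")
```

For backup workloads, which tend to be large sequential writes rather than small random ones, this penalty usually matters less than the estimate suggests.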

RAID 10

RAID 10, also known as RAID 1+0, is a nested or hybrid RAID level that combines mirroring and striping. It provides redundancy through mirroring and performance gains through striping.

RAID 10 works by grouping disks into mirrored pairs, then striping data across those pairs in chunks. For example, with four disks, disks 1 and 2 form one mirrored pair and disks 3 and 4 form a second. Data is split into chunks that alternate between the two pairs, and each chunk is written to both disks of its pair. This provides the redundancy of RAID 1 with the performance of RAID 0.

The advantage of RAID 10 is very high read/write performance compared to a single disk, made possible through striping. It also offers good fault tolerance thanks to mirroring, as data remains intact as long as at least one disk in each mirrored set survives. However, it requires a minimum of four disks.

Overall, RAID 10 provides faster writes and faster rebuilds than RAID 5 or RAID 6. Its fault tolerance depends on which disks fail: it always survives one failure, and survives further failures only if they land in different mirrored pairs, whereas RAID 6 survives any two. The tradeoff is higher cost, since half of the raw capacity goes to mirroring. It’s ideal for applications that demand faster data transfer speeds for critical data.
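
Since RAID 10 survives a failure only while each mirrored pair keeps at least one working disk, it can help to enumerate the cases. The following sketch checks every possible two-disk failure on a four-disk array; the pair layout and the function name raid10_survives are assumptions made for the illustration.

```python
from itertools import combinations


def raid10_survives(failed: set[int], pairs: list[tuple[int, int]]) -> bool:
    """RAID 10 survives while every mirrored pair keeps at least one disk."""
    return all(not (a in failed and b in failed) for a, b in pairs)


pairs = [(0, 1), (2, 3)]   # hypothetical 4-disk layout: two mirrored pairs
two_disk_failures = [set(c) for c in combinations(range(4), 2)]
survived = [f for f in two_disk_failures if raid10_survives(f, pairs)]
print(f"RAID 10 survives {len(survived)} of {len(two_disk_failures)} two-disk failures")
```

On this layout only the two combinations that take out both halves of a pair are fatal. A RAID 6 array of the same size would survive all six.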

Cost Comparison

When evaluating the cost of different RAID levels, one important metric to consider is the dollar cost per terabyte (TB) of usable storage capacity. This factors in the number of disks required for each RAID level and the amount of capacity that is usable after accounting for parity and mirroring.

RAID 0 provides the lowest cost per TB since it uses all disks for data with no parity or mirroring. However, it offers no redundancy. RAID 1 is the most expensive per TB since the usable capacity is equivalent to a single disk while requiring two or more disks. RAID 5 provides a good balance of redundancy with a moderate cost per TB. RAID 6 costs more per TB than RAID 5 because a second disk's worth of capacity goes to parity. RAID 10 is also expensive per TB because half of the raw capacity is used for mirroring.

For example, with four 2TB disks, assuming an illustrative price of $150 per disk ($600 in total):

  • RAID 0: 8 TB usable, $75/TB
  • RAID 1 (four-way mirror): 2 TB usable, $300/TB
  • RAID 5: 6 TB usable, $100/TB
  • RAID 6: 4 TB usable, $150/TB
  • RAID 10: 4 TB usable, $150/TB

While RAID 0 provides the most capacity per dollar, the lack of fault tolerance eliminates it as an option for backup and critical data. For backup purposes, RAID 1, 5, 6, and 10 provide various tradeoffs between redundancy, usable capacity, and cost that should be evaluated.
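
Usable capacity and cost per TB follow directly from the disk count, so they are easy to recompute for your own hardware. Below is a minimal sketch using the standard capacity rules for equal-size disks; the function names and the $150 per disk price are assumptions for illustration, not quoted prices.

```python
def usable_tb(level: str, num_disks: int, disk_tb: float) -> float:
    """Usable capacity for equal-size disks under the standard capacity rules."""
    if level == "RAID 0":
        return num_disks * disk_tb
    if level == "RAID 1":
        return disk_tb                       # every disk holds the same copy
    if level == "RAID 5":
        return (num_disks - 1) * disk_tb     # one disk's worth of parity
    if level == "RAID 6":
        return (num_disks - 2) * disk_tb     # two disks' worth of parity
    if level == "RAID 10":
        return num_disks * disk_tb / 2       # half the disks are mirrors
    raise ValueError(f"unsupported level: {level}")


def cost_per_tb(level: str, num_disks: int, disk_tb: float, price_per_disk: float) -> float:
    """Total hardware cost divided by usable capacity."""
    return num_disks * price_per_disk / usable_tb(level, num_disks, disk_tb)


PRICE_PER_DISK = 150.0   # hypothetical price in dollars
for level in ("RAID 0", "RAID 1", "RAID 5", "RAID 6", "RAID 10"):
    tb = usable_tb(level, 4, 2)
    print(f"{level}: {tb:g} TB usable, ${cost_per_tb(level, 4, 2, PRICE_PER_DISK):.0f}/TB")
```

Running the same functions with more disks shows why parity RAID gets relatively cheaper as arrays grow, while the mirroring overhead of RAID 10 stays fixed at half.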

Performance

The performance of RAID levels varies greatly depending on the configuration. Here is a comparison of read/write speeds for common RAID levels:

RAID 0 provides the fastest read and write speeds since data is striped across multiple disks. However, it offers no redundancy. Benchmarks show RAID 0 can achieve sequential read speeds over 1000MB/s with multiple SSDs. (https://www.dell.com/community/PowerEdge-HDD-SCSI-RAID/Proliant-Server-PERC-755N-8-x-NVME-SSDs-slow-Raid-performance/m-p/8401616)

RAID 1 provides excellent read speeds, almost as fast as RAID 0, but write speeds are limited to roughly that of a single drive because every write goes to both disks. RAID 1 read performance can exceed 600MB/s with SSDs.

RAID 5 and 6 are slower than RAID 0/1, mainly on writes, because parity has to be calculated and updated with the data. RAID 5 has faster speeds than RAID 6. Benchmarks show RAID 5 sequential reads around 500-700MB/s and writes around 400-600MB/s depending on drive types.

RAID 10 read speeds can match those of RAID 0, and it provides better write speeds than RAID 5/6. With multiple SSDs, RAID 10 can achieve over 900MB/s for reads and writes.

In summary, RAID 0 provides the fastest performance but no redundancy. RAID 10 offers excellent performance while providing redundancy. RAID 1 offers great reads but slower writes. RAID 5/6 offer good performance with redundancy, but not as fast as RAID 10.

Ease of Recovery

Rebuilding a RAID array after a disk failure is crucial to restoring full redundancy and protection against data loss. The ease and speed of rebuilding depends on the RAID level.

RAID 0 has no redundancy, so a disk failure will result in complete data loss and no ability to rebuild the array. RAID 1 rebuild time is the fastest since data only needs to be copied from the surviving disk. RAID 5 and RAID 6 have longer rebuild times since the missing data has to be reconstructed from the data and parity on every surviving disk. RAID 6 takes even longer than RAID 5 as it uses double distributed parity.

One source states RAID 5/6 rebuild times can range from 36-72 hours for 8-12TB drives, depending on the controller. Larger capacity drives and arrays will have longer rebuild times. RAID 10 combines mirrored sets in a striped set, so a rebuild only copies from the surviving disk of the affected mirrored set, and rebuild time depends on the size of that set. Overall, RAID 1 has the fastest and easiest rebuild, while RAID 5/6 take much longer.
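
As a back-of-the-envelope check on rebuild times, the time to refill a replacement disk is roughly its capacity divided by the sustained rebuild rate. The sketch below assumes decimal units and a constant rate; the function name and the sample rates are hypothetical, and real rebuilds slow down when the array is also serving normal I/O.

```python
def rebuild_hours(disk_tb: float, rebuild_mb_per_s: float) -> float:
    """Hours needed to rewrite one whole disk at a sustained rebuild rate."""
    disk_mb = disk_tb * 1_000_000    # decimal units: 1 TB = 1,000,000 MB
    return disk_mb / rebuild_mb_per_s / 3600


# Hypothetical rates: a busy parity array versus a mostly idle mirror copy.
for rate in (50, 200):
    print(f"12 TB at {rate} MB/s: {rebuild_hours(12, rate):.1f} hours")
```

At an assumed 50 MB/s, a 12TB drive takes close to three days, which is in line with the upper end of the range quoted above.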

Conclusion

In summary, each RAID level has its own pros and cons for backup:

RAID 0 provides no redundancy or fault tolerance, making it a poor choice for backup. RAID 1 provides excellent redundancy through mirroring but has high storage overhead. RAID 5 provides good redundancy through striping with distributed parity but rebuild times can be very long with larger drives. RAID 6 is similar to RAID 5 but with double distributed parity for added redundancy. RAID 10 combines mirroring and striping for excellent performance and redundancy but also has high storage overhead.

For backup purposes where redundancy and fault tolerance are critical, I recommend RAID 10 or RAID 6. RAID 10 provides faster rebuild times compared to RAID 6, while RAID 6 can tolerate any two disk failures, whereas RAID 10 is only guaranteed to survive one. Both provide excellent protection against data loss. The choice depends on your specific performance and storage capacity requirements. Overall, RAID 10 may be preferable for smaller backup systems while RAID 6 scales better for larger deployments. The high fault tolerance of these RAID levels makes them a smart choice for reliably backing up important data.