How many disks can RAID 6 lose without losing data?

RAID 6 is a type of RAID (Redundant Array of Independent Disks) that provides fault tolerance by using two parity blocks per stripe, distributed across the array (Wikipedia, 2022). This allows RAID 6 to sustain two disk failures without losing data or access to the array. RAID 6 achieves this by striping data across multiple drives and distributing the parity blocks among them so that the two parity blocks of a stripe are never stored on the same drive (PCMag, 2022).

The key benefits of RAID 6 include high fault tolerance, the ability to rebuild a RAID array with up to two failed drives, and read performance comparable to RAID 5. However, RAID 6 gives up two drives' worth of capacity to parity and writes more slowly than RAID 5. Overall, RAID 6 provides a balance of performance, capacity, and strong protection against multiple disk failures in large storage arrays.

How RAID 6 Protects Against Disk Failures

RAID 6 offers protection against multiple disk failures by using dual parity. Parity is extra data calculated from the data being stored, used to reconstruct data if a drive fails. In contrast to RAID 5, which uses single parity, RAID 6 uses double parity: each stripe carries two parity blocks, stored on two different drives, with the parity positions rotating across all drives in the array (TechTarget).

This dual parity allows RAID 6 to continue operating if up to two drives fail. If a drive fails, the array can recreate the data that was on that drive from the surviving data and parity. If a second drive fails before the first failed drive is replaced, the array can still serve all data using the remaining parity information, but it has no redundancy left, and a third failure would cause data loss (IBM).
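The basic idea of parity reconstruction can be sketched in a few lines of Python. This is an illustration of plain XOR parity (the kind RAID 5 uses, and one of the two parities in RAID 6), not a real controller implementation; the helper name `xor_blocks` and the sample blocks are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks in one stripe (hypothetical sample contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] failed: XOR the survivors with the parity.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```

XOR parity alone can only solve for one missing block per stripe, which is why RAID 6 adds a second, independent parity.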

RAID 6’s dual parity provides an extra layer of protection compared to RAID 5. The likelihood of two drives failing simultaneously is much lower than a single drive failure. By withstanding up to two drive failures, RAID 6 provides excellent protection against data loss.

Dual Parity in RAID 6

RAID 6 provides fault tolerance using dual parity. This means there are two independent sets of parity data stored across the disks in the array, known as P parity and Q parity. P parity is calculated using XOR across the data strips, as in RAID 5, while Q parity is typically calculated with a Reed-Solomon code, which weights each data strip by a different coefficient using Galois field arithmetic. (Some implementations use diagonal parity schemes instead.)

Having two sets of parity data allows RAID 6 to survive up to two disk failures without data loss. If one data disk fails, the array can rebuild the lost data using P parity alone. If two data disks fail, P and Q together give two independent equations, which is enough to solve for both missing strips. This provides an extra layer of redundancy compared to RAID 5.

When writing new data, both the P and Q parity blocks of the affected stripe are recalculated and written, which is why RAID 6 carries a larger write penalty than RAID 5. When rebuilding after a disk failure, the lost strips are reconstructed from the surviving data and parity, and any P or Q blocks that lived on the failed drive are recomputed onto the replacement. The array returns to a fully redundant state only once the rebuild completes.
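As a rough sketch of how two parities can recover two lost strips, the Python below computes P and Q for one byte per data disk and then solves for two missing bytes. It assumes the commonly described Reed-Solomon construction over GF(2^8) with the polynomial 0x11D and generator 2; a given controller may use a different scheme, and all function names here are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the polynomial 0x11D (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result

def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a):
    """Multiplicative inverse: a^254, because a^255 == 1 for nonzero a."""
    return gf_pow(a, 254)

def compute_pq(data):
    """Return (P, Q) for a list holding one byte per data disk."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d                          # P: plain XOR
        q ^= gf_mul(gf_pow(2, i), d)    # Q: XOR weighted by 2^i
    return p, q

def recover_two(data, x, y, p, q):
    """Recover the bytes of failed data disks x and y (entries are ignored)."""
    pxy, qxy = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            pxy ^= d
            qxy ^= gf_mul(gf_pow(2, i), d)
    # Now pxy = Dx ^ Dy and qxy = 2^x*Dx ^ 2^y*Dy: two equations, two unknowns.
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(qxy ^ gf_mul(gy, pxy), gf_inv(gx ^ gy))
    return dx, pxy ^ dx

stripe = [0x11, 0x22, 0x33, 0x44]
p, q = compute_pq(stripe)
damaged = [0x11, None, 0x33, None]      # disks 1 and 3 have failed
print(recover_two(damaged, 1, 3, p, q) == (0x22, 0x44))
```

A real array applies the same arithmetic to every byte of each strip, usually with lookup tables or SIMD instructions rather than the bit-by-bit loop used here for readability.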

Rebuilding RAID 6 After Disk Failure

When a disk fails in a RAID 6 array, the array enters a degraded state and needs to be rebuilt to restore full redundancy. The rebuild process involves reconstructing the missing data from the failed disk using the parity information spread across the remaining disks.

Since RAID 6 uses dual parity, it can withstand up to two disk failures without data loss. If one disk fails, the RAID controller reads the surviving data and parity on the other disks to reconstruct everything that was on the failed drive and writes it to the replacement. Completing this rebuild restores full redundancy, so the array can once again tolerate two failures.

According to an article on Spiceworks, the RAID 6 rebuild process causes a lot of disk activity and stress, because every remaining disk must be read in full. As a result, there is a higher chance of a second disk failure during rebuild compared to other RAID levels like RAID 10, where only the failed drive's mirror partner is read. Rebuilding a large RAID 6 array can take days and puts significant strain on the remaining disks.
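A back-of-the-envelope estimate shows why rebuilds stretch out as drives grow: the replacement drive has to be written in full, so the rebuild cannot finish faster than the drive's capacity divided by the sustained rebuild rate. The sketch below uses a hypothetical helper, `rebuild_hours`, and an assumed rate of 100 MB/s; real rates vary widely and drop further when the array is also serving normal traffic.

```python
def rebuild_hours(drive_tb, rebuild_mb_per_s):
    """Lower-bound hours needed to rewrite one whole drive at a steady rate."""
    drive_mb = drive_tb * 1_000_000      # decimal terabytes to megabytes
    return drive_mb / rebuild_mb_per_s / 3600

for size_tb in (4, 8, 16):
    # 100 MB/s is an assumed rebuild rate, not a measured figure.
    print(f"{size_tb} TB drive: at least {rebuild_hours(size_tb, 100):.1f} hours")
```

At the assumed rate, a 16 TB drive needs nearly two days even in the best case, which is consistent with multi-day rebuilds on busy arrays.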

Performance of RAID 6

RAID 6 provides good read performance as data can be read in parallel from multiple disks similar to RAID 5. However, write performance suffers compared to RAID 5 due to the dual parity calculation. Each write requires the parity data to be calculated and written across two disks instead of one like in RAID 5.

Overall, RAID 6 read speeds are comparable to RAID 5 while write speeds are significantly slower. According to benchmarks by Ars Technica, RAID 6 write performance gets progressively worse as the array size increases due to the increased parity calculation overhead. In their tests, RAID 6 write speed was approximately half of RAID 5 write speed in an 8-disk array.

For optimal performance, RAID 6 works best for read-intensive workloads that require the added redundancy. For write-heavy applications, RAID 10 provides better overall performance.
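The difference between RAID levels on write-heavy workloads can be approximated with the usual rule-of-thumb write penalties: a small random write costs about 2 disk operations on RAID 10, 4 on RAID 5 (read data, read parity, write data, write parity), and 6 on RAID 6 (the same, plus reading and writing the second parity). The sketch below applies that rule; the function name and the 8-disk, 150-IOPS-per-disk figures are assumptions for illustration, and the model ignores caches and full-stripe writes.

```python
# Rule-of-thumb disk operations per small random write.
WRITE_PENALTY = {"RAID 0": 1, "RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

def effective_iops(raw_iops, write_fraction, level):
    """Estimate host-visible IOPS from raw disk IOPS and the write share."""
    penalty = WRITE_PENALTY[level]
    return raw_iops / ((1 - write_fraction) + write_fraction * penalty)

raw = 8 * 150   # assumed: 8 disks at roughly 150 IOPS each
for level in ("RAID 10", "RAID 5", "RAID 6"):
    read_heavy = effective_iops(raw, 0.1, level)
    write_heavy = effective_iops(raw, 0.9, level)
    print(f"{level}: {read_heavy:.0f} IOPS at 10% writes, "
          f"{write_heavy:.0f} IOPS at 90% writes")
```

Under this simple model the three levels are close together for read-heavy workloads and diverge sharply as the write share grows.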

RAID 6 Array Sizes

RAID 6 requires a minimum of 4 drives to implement dual parity. With fewer than 4 drives, RAID 6 cannot protect against multiple disk failures. The minimum 4 drive configuration provides 2 drives of usable storage capacity (Source: https://serverfault.com/questions/122168/minimum-number-of-disks-to-implement-raid6).
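The capacity arithmetic is simple enough to sketch: with equal-sized drives, two drives' worth of space goes to parity and the rest is usable. The helper names below are illustrative, and the sketch ignores filesystem and formatting overhead.

```python
def raid6_usable_tb(num_drives, drive_tb):
    """Usable capacity of a RAID 6 array built from equal-sized drives."""
    if num_drives < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (num_drives - 2) * drive_tb

def raid6_efficiency(num_drives):
    """Fraction of raw capacity that remains usable."""
    return (num_drives - 2) / num_drives

for n in (4, 8, 12):
    print(f"{n} x 8 TB: {raid6_usable_tb(n, 8)} TB usable "
          f"({raid6_efficiency(n):.0%} of raw capacity)")
```

The parity overhead is fixed at two drives, so storage efficiency improves as drives are added: 50% with 4 drives, 75% with 8.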

There is no set upper limit on the maximum number of drives in a RAID 6 array. However, as the array size grows, the rebuild times also increase. Most experts recommend limiting RAID 6 arrays to 10-14 drives for rebuild time considerations (Source: https://recoverit.wondershare.com/windows-tips/what-is-raid-6.html). The largest arrays may contain 20+ drives but experience very long rebuild times.

In summary, RAID 6 requires a minimum of 4 drives and can support arrays of 20 or more drives. However, for practical purposes, an ideal RAID 6 array size is between 4 and 14 drives, depending on the required capacity and the acceptable rebuild time.

Choosing Drives for RAID 6

When constructing a RAID 6 array, one of the most important considerations is choosing the right type of drives. The main options are hard disk drives (HDDs) or solid state drives (SSDs). There are pros and cons to each for use in RAID 6.

HDDs offer large capacities at low cost, commonly ranging from a few terabytes to 20TB or more per drive. This allows you to maximize storage space in the array. HDDs are also significantly less expensive per gigabyte compared to SSDs. However, HDDs have moving parts that are prone to failure over time, and their slower access times can limit performance.

SSDs have no moving parts and very fast access times. High-capacity SSDs are available, but they carry a hefty price premium over HDDs. For large storage arrays where capacity is key, HDDs tend to be the more cost-effective choice.

When selecting HDDs for RAID 6, look for NAS- or enterprise-class drives designed for 24/7 operation, like Western Digital’s Red Pro series or Seagate’s IronWolf Pro. Consumer drives may fail faster under heavy workloads. Larger capacity HDDs maximize storage density, at the cost of longer rebuild times.

For SSDs, choose drives with power-loss protection to prevent data loss during sudden power disruptions. Models with high endurance ratings, such as Samsung and Micron enterprise SSDs, handle heavy write workloads better.

In summary, HDDs are preferred for maximizing capacity in RAID 6 arrays while SSDs provide performance benefits. Carefully weigh the capacity, cost, and workload requirements when selecting drives.

RAID 6 vs Other RAID Levels

RAID 6 is often compared to other common RAID levels like RAID 5, RAID 10, and RAID 0 when deciding which to implement. Compared to RAID 5, RAID 6 offers an additional disk failure tolerance with its dual parity (IBM). Whereas RAID 5 can only handle a single disk failure without data loss, RAID 6 can survive up to two disk failures. This makes RAID 6 the better option for larger arrays or mission critical data where uptime is crucial.

RAID 10 combines mirroring and striping for both performance and redundancy. Like RAID 6, RAID 10 requires a minimum of 4 disks, but it always gives up half of its raw capacity to mirroring, while RAID 6 gives up only two drives' worth. Additionally, RAID 10 can only tolerate one disk failure per mirrored pair, so losing both disks of the same pair destroys the array, while RAID 6 can handle two failures anywhere in the array (Pits Data Recovery). So RAID 6 provides more flexible redundancy, though RAID 10 generally offers better write performance and faster rebuilds.
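The difference in failure tolerance can be checked by brute force: enumerate every possible set of failed drives and count how many each layout survives. The sketch below assumes a RAID 10 layout in which drives (0,1), (2,3), and so on form the mirrored pairs; the function names are illustrative.

```python
from itertools import combinations

def raid10_survives(failed, num_drives):
    """RAID 10 survives unless both drives of some mirrored pair have failed."""
    failed = set(failed)
    return all(not {a, a + 1} <= failed for a in range(0, num_drives, 2))

def raid6_survives(failed):
    """RAID 6 survives any combination of at most two failed drives."""
    return len(set(failed)) <= 2

def survival_rate(num_drives, num_failures, survives):
    """Fraction of all possible failure combinations the array survives."""
    combos = list(combinations(range(num_drives), num_failures))
    return sum(1 for c in combos if survives(c)) / len(combos)

n = 8
for k in (1, 2, 3):
    r10 = survival_rate(n, k, lambda f: raid10_survives(f, n))
    r6 = survival_rate(n, k, raid6_survives)
    print(f"{k} failed of {n}: RAID 10 survives {r10:.0%}, RAID 6 survives {r6:.0%}")
```

With 8 drives, RAID 6 survives every two-drive failure while RAID 10 survives 24 of the 28 possible combinations; with three failures RAID 6 always fails, whereas RAID 10 survives whenever the failures land in different pairs.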

Compared to RAID 0, RAID 6 provides fault tolerance and redundancy, while RAID 0 has no parity or mirroring. However, RAID 0 outperforms RAID 6 in read/write speeds. So RAID 6 is preferable when data protection is critical, though RAID 0 is better for pure performance.

RAID 6 Use Cases

RAID 6 is most applicable in use cases where large capacity and high fault tolerance are required. According to the IT Enterpriser article (https://itenterpriser.com/knowledge-base/what-is-raid-6/), RAID 6 should be used when data safety and availability are the top priorities, and you don’t want to sacrifice too much performance or capacity. Specifically, RAID 6 is a good choice for the following use cases:

Mission-critical applications where downtime is unacceptable – The dual parity provided by RAID 6 allows the array to withstand the failure of up to two drives without data loss. This makes RAID 6 well-suited for databases, email servers, virtualization hosts, and other business critical systems.

Large drive arrays – As drive sizes continue to increase, rebuilds take longer, which widens the window in which a second drive can fail or an unreadable sector can surface. RAID 6 provides an extra level of protection compared to RAID 5 when using large capacity (8TB+) hard drives.

Media storage and editing – The balanced blend of performance, capacity, and fault tolerance offered by RAID 6 makes it a popular choice for media production workflows. Video editing projects with huge storage requirements can benefit from RAID 6.

Archival and backup storage – RAID 6 provides excellent protection for infrequently accessed data that needs to be stored and preserved over the long term. The dual parity allows archives to withstand multiple drive failures over time.

Remote sites – For storage devices located offsite or in unreliable environments, RAID 6 provides additional protection where drive replacement may be difficult. The array can withstand multiple failures until repairs can be made.

Conclusion

In summary, RAID 6 is a type of RAID configuration that uses dual parity to protect against the failure of up to two drives in the array. The dual parity provides an extra layer of redundancy compared to RAID 5, which can only handle a single drive failure.

Some key points about RAID 6:

  • RAID 6 can sustain up to two failed drives without losing data.
  • It requires a minimum of 4 drives to implement.
  • Write performance is slower than RAID 0 or RAID 5 due to the dual parity calculations and extra parity writes.
  • RAID 6 is best suited for large arrays where uptime and data protection are critical.
  • The two distributed parity blocks per stripe protect against data loss and allow the array to be rebuilt after up to two drive failures.
  • RAID 6 provides excellent redundancy for mission critical data at the cost of usable capacity and performance.

In environments where downtime is unacceptable, RAID 6 offers a strong combination of data protection and redundancy. The tradeoff is reduced usable storage capacity and write performance. But for critical data that cannot be lost, RAID 6 is often the right choice.