What is RAID 6 in computer architecture?

RAID 6 is a RAID (Redundant Array of Independent Disks) level that provides fault tolerance by storing two independent parity blocks for every stripe of data. This allows data to be recovered even if any two drives in the array fail at the same time.

What is RAID?

RAID stands for Redundant Array of Independent Disks. It is a data storage technology that combines multiple physical drives into one logical unit. The main purposes of RAID are to provide redundancy, improve performance, and increase storage capacity beyond what a single drive can offer.

There are several RAID levels, each combining striping, mirroring, and parity in a different way. Each level offers a different degree of redundancy and requires a different minimum number of drives. The most common RAID levels are:

  • RAID 0 – Data striping across drives for improved performance. No redundancy.
  • RAID 1 – Disk mirroring for 100% redundancy. Minimum 2 drives needed.
  • RAID 5 – Block-level striping with distributed parity. Minimum 3 drives needed.
  • RAID 6 – Block-level striping with double distributed parity. Minimum 4 drives needed.

The RAID level used depends on the needed blend of performance, capacity, and fault tolerance. RAID 6 offers high redundancy for critical data at the cost of reduced usable capacity.
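The trade-off between capacity and fault tolerance can be made concrete with a little arithmetic. The following is a minimal sketch, assuming equal-sized drives and treating RAID 1 as an n-way mirror of a single drive; the function name `raid_summary` and the 4 TB drive size are illustrative choices, not part of any real tool.

```python
def raid_summary(level: str, num_drives: int, drive_tb: float) -> dict:
    """Return usable capacity and tolerated drive failures for one array."""
    rules = {
        # level: (minimum drives, drives' worth of usable space, failures tolerated)
        "RAID 0": (2, num_drives, 0),
        "RAID 1": (2, 1, num_drives - 1),  # assumption: n-way mirror of one drive
        "RAID 5": (3, num_drives - 1, 1),
        "RAID 6": (4, num_drives - 2, 2),
    }
    min_drives, usable_drives, failures = rules[level]
    if num_drives < min_drives:
        raise ValueError(f"{level} needs at least {min_drives} drives")
    return {
        "level": level,
        "usable_tb": usable_drives * drive_tb,
        "efficiency": usable_drives / num_drives,
        "failures_tolerated": failures,
    }


# Compare the four levels on a hypothetical array of four 4 TB drives.
for level in ("RAID 0", "RAID 1", "RAID 5", "RAID 6"):
    s = raid_summary(level, num_drives=4, drive_tb=4.0)
    print(f"{s['level']}: {s['usable_tb']:.0f} TB usable, "
          f"{s['efficiency']:.0%} efficient, "
          f"survives {s['failures_tolerated']} failure(s)")
```

With four drives RAID 6 keeps only half of the raw capacity, but the overhead stays fixed at two drives, so efficiency improves as the array grows.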

What is RAID 6?

RAID 6 is a RAID configuration that uses block-level striping with double distributed parity. This means the data is broken down into blocks that are striped and written across the drives in the array. Then, two sets of parity data are calculated and written across the drives.

The dual parity provides fault tolerance for up to two simultaneous drive failures. If one or two drives fail, the missing blocks can be recreated from the data and parity blocks on the remaining drives. The price of this protection is usable capacity: two drives' worth of space in every array is given over to parity.

At minimum, RAID 6 requires four physical drives. However, it is common to use more drives, both to increase overall capacity and to reduce the share of space lost to parity. For example, a RAID 6 array with 8 drives offers the usable capacity of 6 drives and can tolerate the failure of any two.

How does RAID 6 work?

RAID 6 works by using block-level striping with two independent distributed parity schemes known as P and Q. Here is a high-level overview of how it works:

  1. Data is broken down into blocks which are striped and written across the data drives in the array.
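  2. A first parity block, P, is calculated for each stripe as the XOR of that stripe's data blocks, just as in RAID 5.
  3. A second, independent parity block, Q, is calculated for the same stripe using a different function, typically a Reed-Solomon style code over a Galois field.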
  4. The parity blocks P and Q are distributed across different drives rather than stored on a single dedicated drive.
  5. If up to two drives fail, the missing data and parity blocks can be calculated using P and Q to reconstruct the data.
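The steps above can be sketched in code. This is an illustrative toy, not any particular controller's implementation: it assumes one common construction in which P is the XOR of the data blocks and Q is a weighted sum in the Galois field GF(2^8), using the field polynomial 0x11d and generator 2. Those constants, and all function names, are assumptions made for the example; other implementations use different codes.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse of a nonzero byte: a^254 = a^-1 in GF(2^8)."""
    return gf_pow(a, 254)


def compute_pq(data_blocks):
    """Return (P, Q) for one stripe: P = xor(D_i), Q = xor(2^i * D_i)."""
    size = len(data_blocks[0])
    p, q = bytearray(size), bytearray(size)
    for i, block in enumerate(data_blocks):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two_data(blocks, p, q, x, y):
    """Rebuild lost data blocks x and y from the survivors plus P and Q."""
    size = len(p)
    pxy, qxy = bytearray(p), bytearray(q)
    # Cancel the surviving blocks out of P and Q, leaving only D_x and D_y.
    for i, block in enumerate(blocks):
        if i in (x, y):
            continue
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            pxy[j] ^= byte
            qxy[j] ^= gf_mul(coeff, byte)
    # Now pxy = D_x ^ D_y and qxy = 2^x*D_x ^ 2^y*D_y: two equations,
    # two unknowns. Solve for D_x, then D_y = pxy ^ D_x.
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(size), bytearray(size)
    for j in range(size):
        dx[j] = gf_mul(qxy[j] ^ gf_mul(gy, pxy[j]), denom_inv)
        dy[j] = pxy[j] ^ dx[j]
    return bytes(dx), bytes(dy)


# Usage: lose two of four data blocks, then rebuild them.
data = [b"RAID", b"six!", b"dual", b"pari"]
p, q = compute_pq(data)
damaged = list(data)
damaged[1] = damaged[3] = None  # simulate two failed drives
rebuilt_1, rebuilt_3 = recover_two_data(damaged, p, q, 1, 3)
assert (rebuilt_1, rebuilt_3) == (data[1], data[3])
```

The sketch covers only the hardest case, two lost data blocks. Losing one data block plus Q reduces to the RAID 5 case (XOR with P), and losing P or Q alone only requires recomputing that parity.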

This dual parity provides excellent fault tolerance and the distributed layout avoids bottlenecks from dedicated parity drives. However, the dual parity calculation reduces potential write performance compared to single parity RAID 5.
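The write penalty is easiest to see for a small write that changes a single data block. A minimal sketch, assuming a plain read-modify-write cycle with no caching and no full-stripe writes (real controllers often soften this with a write-back cache); the function names are illustrative:

```python
def small_write_ios(parity_blocks_per_stripe: int) -> int:
    """Disk I/Os needed to overwrite one data block via read-modify-write."""
    reads = 1 + parity_blocks_per_stripe   # old data + each old parity block
    writes = 1 + parity_blocks_per_stripe  # new data + each new parity block
    return reads + writes


def update_p(old_p: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New P after overwriting one data block: P' = P xor D_old xor D_new."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_p, old_data, new_data))


print("RAID 5 small write:", small_write_ios(1), "I/Os")  # 4
print("RAID 6 small write:", small_write_ios(2), "I/Os")  # 6
```

The `update_p` helper shows why the old data and old parity must be read first: the new parity is derived from the difference between old and new data, not from the whole stripe. Q is updated the same way, with the difference scaled by that block's coefficient.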

Advantages of RAID 6

Here are some of the key advantages of using RAID 6:

  • High fault tolerance – Can withstand failure of up to 2 drives without data loss.
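  • Safer rebuilds – A RAID 5 array has no redundancy left while it rebuilds, so a second drive failure or an unreadable sector during the rebuild loses data. RAID 6 still has one level of parity to fall back on.
  • Suits larger arrays – The second parity makes it practical to build arrays from more, and larger, drives than would be prudent with RAID 5.
  • Distributed parity – Avoids the dedicated parity drive bottleneck of RAID 3/4.
  • Auto-rebuild – Most RAID 6 implementations keep serving data by reconstructing it on the fly when a drive fails, and rebuild automatically onto a hot spare or replacement drive.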

For mission-critical storage and larger drive arrays, RAID 6 provides excellent redundancy to protect against up to two drive failures. The second parity block helps avoid the rebuild failures that can occur with RAID 5 in larger arrays.

Disadvantages of RAID 6

There are also some potential downsides to using RAID 6 to consider:

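  • Lower usable capacity – Two drives' worth of space goes to parity, so an array of N drives offers the capacity of N − 2. In a minimum 4-drive array, half the raw capacity is lost.
  • Slower writes – Every write must update both P and Q, which reduces write performance, especially for small random writes.
  • Not available on older hardware – Requires a RAID controller or software RAID implementation that supports RAID 6.
  • Long rebuild times – A rebuild must read the surviving drives and recompute every block of the failed drive, which can take many hours or days with large drives.
  • Higher cost – Needs one more drive than RAID 5, and two more than RAID 0, for the same usable capacity.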

The trade-off for high fault tolerance is decreased usable capacity, slower writes, and a higher system cost. Rebuild times can also be extensive with very large arrays. Overall, RAID 6 works best for critical data that requires maximum redundancy.
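A rough lower bound on rebuild time follows from the fact that the whole replacement drive has to be rewritten. The sketch below assumes the rebuild is limited only by a sustained rebuild rate; the 16 TB drive size and 150 MB/s rate are hypothetical inputs, not measurements, and real rebuilds under production load are usually slower.

```python
def min_rebuild_hours(drive_size_tb: float, rebuild_mb_per_s: float) -> float:
    """Lower bound on rebuild time: drive capacity / sustained rebuild rate."""
    total_bytes = drive_size_tb * 1e12        # decimal terabytes
    bytes_per_second = rebuild_mb_per_s * 1e6  # decimal megabytes
    return total_bytes / bytes_per_second / 3600


# Hypothetical example: one 16 TB drive rebuilt at a steady 150 MB/s.
hours = min_rebuild_hours(drive_size_tb=16, rebuild_mb_per_s=150)
print(f"At least {hours:.1f} hours")  # about 29.6 hours
```

Note that the estimate depends on the size of a single drive rather than on the number of drives, which is why rebuild windows have grown along with drive capacities.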

When to use RAID 6

Here are some examples of use cases where implementing RAID 6 can be beneficial:

  • Database servers – Provides maximum redundancy for critical databases.
  • File servers – Protects against loss of essential files or documents.
  • Media servers – Guard against losing valuable media assets that are hard to replace.
  • Transactional systems – Reduces the risk of losing transaction data to drive failures in applications like banking.
  • Archival storage – Safeguards data that must be preserved and retained long-term.
  • High-capacity arrays – More robust than RAID 5 for large arrays with 8+ drives.

Any system that must stay available through drive failures is a good candidate for RAID 6. The dual parity provides excellent redundancy for mission-critical data. RAID 6 is commonly used for databases, file servers, backup storage, and large storage arrays.

Alternatives to RAID 6

There are a few alternatives that can be considered instead of or in addition to RAID 6:

  • RAID 10 – Mirroring plus striping for redundancy and performance, but with lower capacity efficiency (50% with two-way mirrors).
  • RAID 50/60 – Nested RAID levels that stripe data (RAID 0) across several RAID 5 or RAID 6 sub-arrays.
  • Backups – Regular backups provide an additional layer of protection against data loss, including deletions and corruption that RAID cannot protect against.
  • Hot spares – Dedicated standby drives onto which the array can automatically rebuild when a drive fails.
  • Erasure coding – A generalization of RAID 6 parity that can be configured to survive more than two failures, often across servers rather than within one array.

Each option involves different trade-offs between things like cost, performance, capacity, and data protection. A blended approach using RAID 6 plus other technologies like snapshots and backups can provide comprehensive data protection.

RAID 6 in action

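Here is an example RAID 6 layout with 8 drives. RAID 6 has no dedicated parity drives: every stripe holds six data blocks plus one P block and one Q block, and the parity positions rotate from stripe to stripe. The exact rotation depends on the implementation; one possible pattern is:

Stripe    Drive 1  Drive 2  Drive 3  Drive 4  Drive 5  Drive 6  Drive 7  Drive 8
Stripe 1  Data     Data     Data     Data     Data     Data     P        Q
Stripe 2  Data     Data     Data     Data     Data     P        Q        Data
Stripe 3  Data     Data     Data     Data     P        Q        Data     Data
Stripe 4  Data     Data     Data     P        Q        Data     Data     Data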

With this setup, the array can lose any two drives and still rebuild their contents. For example, if drives 3 and 6 failed, every block they held could be recomputed, stripe by stripe, from the surviving data, P, and Q blocks. Because one level of redundancy remains after the first failure, RAID 6 avoids the main rebuild risk of RAID 5 in larger arrays.
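How parity placement rotates across stripes can be sketched with a short function. The rotation rule used here (P moves one drive to the left on each stripe, with Q immediately after it) is a hypothetical choice for illustration; real implementations offer several layouts, and `raid6_layout` is not a real API.

```python
def raid6_layout(num_drives: int, num_stripes: int) -> list:
    """Return, per stripe, what each drive holds: 'P', 'Q', or 'D<n>'."""
    if num_drives < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    rows = []
    for stripe in range(num_stripes):
        # Assumed rotation: P starts on the second-to-last drive and
        # shifts left by one drive per stripe; Q follows P, wrapping around.
        p_drive = (num_drives - 2 - stripe) % num_drives
        q_drive = (p_drive + 1) % num_drives
        row, data_index = [], 0
        for drive in range(num_drives):
            if drive == p_drive:
                row.append("P")
            elif drive == q_drive:
                row.append("Q")
            else:
                row.append(f"D{data_index}")
                data_index += 1
        rows.append(row)
    return rows


for number, row in enumerate(raid6_layout(num_drives=8, num_stripes=4), start=1):
    print(f"Stripe {number}:", " ".join(f"{cell:<3}" for cell in row))
```

Over a full cycle of eight stripes, each drive holds P exactly once and Q exactly once, so parity reads and writes are spread evenly rather than concentrated on one or two drives.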

Conclusion

RAID 6 is a high-redundancy RAID level that uses double distributed parity to protect against up to two drive failures. It provides excellent data protection for mission-critical systems and for large drive arrays, where the chance of a second failure during a rebuild is higher. The trade-off is reduced usable capacity and slower writes compared with RAID 5 or RAID 10.

RAID 6 works best for use cases that demand maximum data redundancy like databases, file servers, and archival storage. It can help avoid lengthy downtimes and expensive data recovery in the event of multiple drive failures. When combined with regular backups and other data protection methods, RAID 6 offers a robust storage solution for business-critical data.