Which RAID level is the most cost-effective option providing both performance and redundancy?

When configuring storage for a computer system, one of the most important considerations is which RAID level to use. RAID, which stands for Redundant Array of Independent Disks, allows multiple disk drives to be combined to improve performance, capacity, and reliability. The most cost-effective RAID level balances performance, capacity, and redundancy while minimizing cost. In most cases, RAID 5 or RAID 6 offers the best combination of these factors for a cost-effective solution.

What is RAID?

RAID is a technology that combines multiple disk drive components into a logical unit. Data is distributed across the drives to provide redundancy and/or improve performance. The different RAID levels each have advantages and disadvantages in terms of performance, capacity, and fault tolerance. The most commonly used RAID levels are:

  • RAID 0: Data is striped across drives for performance, but there is no redundancy. RAID 0 provides the best performance, but no fault tolerance.
  • RAID 1: Disk mirroring is used to duplicate data across drives. Provides redundancy but capacity is limited to one disk worth of storage.
  • RAID 5: Data is striped across drives with parity distributed across the array. Provides redundancy with better storage capacity than RAID 1.
  • RAID 6: Similar to RAID 5 but with double distributed parity. Tolerates two drive failures but provides lower storage capacity than RAID 5.
  • RAID 10: Combination of RAID 1 and RAID 0 by striping data across mirrored pairs. Provides increased performance plus redundancy, at the cost of half the raw capacity.

The right RAID level depends on the required level of fault tolerance and performance. Most of the time, RAID 5 or RAID 6 provides the best balance for a cost-effective solution with reasonable performance and good redundancy.

Benefits of RAID

Implementing RAID provides several key benefits:

  • Increased storage capacity – Combining multiple drives adds their capacities together for more storage space.
  • Improved performance – RAID 0 and other striped levels provide better speed by distributing reads/writes.
  • Fault tolerance – Redundant RAID levels like 1, 5, and 6 provide protection against drive failures.
  • Increased reliability – Redundancy provides higher reliability and availability.

By combining multiple drives together into RAID configurations, systems can gain important advantages. The ideal RAID level provides the right blend of benefits for a particular environment’s needs.

Comparing RAID Levels

When choosing a RAID level, it is helpful to compare the technical differences between the most commonly used configurations:

RAID 0

  • Data is striped across drives for performance
  • No parity or redundancy
  • Highest performance of any RAID level
  • Total capacity equals sum of all drives
  • Not fault tolerant – any drive failure results in data loss
  • Useful when performance is most important
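The striping listed above can be sketched as a mapping from a logical block number to a drive and an offset on that drive. This is a minimal illustration that assumes a stripe unit of exactly one block; `locate_block` is a hypothetical helper, not part of any RAID tool.

```python
def locate_block(logical_block: int, num_drives: int) -> tuple[int, int]:
    """Map a logical block to (drive index, block offset on that drive).

    Assumes a stripe unit of exactly one block; real arrays use larger chunks.
    """
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least two drives")
    return logical_block % num_drives, logical_block // num_drives


# Consecutive logical blocks land on different drives, so they can be
# read or written in parallel.
for block in range(6):
    drive, offset = locate_block(block, num_drives=3)
    print(f"logical block {block} -> drive {drive}, offset {offset}")
```

Because neighbouring blocks sit on different drives, a large sequential transfer keeps every drive busy at once, which is where the performance gain comes from.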

RAID 1

  • Disk mirroring provides 100% redundancy
  • Data is duplicated on secondary drive
  • Good read performance; writes run at roughly single-disk speed
  • Capacity limited to single disk size
  • Very fault tolerant with good failure protection
  • Ideal for critical data that needs high availability

RAID 5

  • Block-level striping with distributed parity
  • Parity allows recovery from a single drive failure
  • Better write performance than RAID 6, but slower random writes than RAID 1
  • Capacity is (N-1) * smallest disk size
  • Decent failure protection and good performance
  • Well-balanced overall RAID level

RAID 6

  • Similar to RAID 5 with dual parity
  • Can withstand two drive failures
  • Read performance is good, writes are slow
  • Capacity is (N-2) * smallest disk size
  • Excellent fault tolerance and redundancy
  • Ideal for mission critical storage that needs high availability

This comparison shows some of the tradeoffs between performance, capacity, and redundancy for each RAID type. RAID 5 and 6 provide the best overall balance for most applications.

Factors to Consider

When selecting a RAID level, administrators need to consider:

Application Performance Requirements

The performance needs of the applications using the storage should guide the RAID selection. Streaming workloads like media servers prefer higher throughput while transactional systems need faster random I/O.

Capacity Requirements

The amount of usable storage after RAID calculations and parity overhead should be sufficient for requirements.

Redundancy Requirements

Mission critical systems may need dual parity or mirroring for enhanced redundancy while other uses like media storage can use single parity.

Cost

Higher levels of performance and redundancy require more drives, which increases cost. The balance depends on the storage budget.

Ease of Recovery

How easy is it to recover data if drives fail? RAID 6 rebuilds generally take longer than RAID 5 rebuilds after a failure.

Analyzing these factors will guide the right RAID decision for a specific use case scenario.

RAID 5 Arrays

RAID 5 is one of the most popular and cost-effective RAID levels because it provides redundancy along with good performance and storage capacity. Here is an overview of RAID 5:

  • Data and parity blocks are striped across all drives
  • Parity allows recovery from a single disk failure
  • I/O performance is very good for most workloads
  • Write performance can suffer due to parity calculation overhead
  • Usable capacity is (N-1)/N of raw capacity, where N is the number of disks
  • Cost effective protection for most applications
  • Straightforward rebuild after a drive failure, though the array is unprotected until it completes

With support for redundancy and fast performance across multiple disks, RAID 5 delivers a versatile solution at a reasonable cost.
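To illustrate how parity is spread across the drives, the sketch below prints which disk holds the parity block in each stripe. It assumes one simple rotation, with parity starting on the last disk and moving one disk to the left per stripe; real implementations offer several layouts, and `stripe_layout` is a hypothetical helper written for this example.

```python
def stripe_layout(stripe: int, num_disks: int) -> list[str]:
    """Return the role of each disk ("P" or "D<k>") for one stripe.

    Parity starts on the last disk and moves one disk to the left per
    stripe, so no single disk holds all the parity.
    """
    parity_disk = (num_disks - 1) - (stripe % num_disks)
    roles = []
    data_index = 0
    for disk in range(num_disks):
        if disk == parity_disk:
            roles.append("P")
        else:
            roles.append(f"D{data_index}")
            data_index += 1
    return roles


for stripe in range(4):
    print(f"stripe {stripe}: {' '.join(stripe_layout(stripe, 4))}")
```

Rotating the parity block spreads parity writes over all members, so no single drive becomes a write bottleneck.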

RAID 5 Performance

RAID 5 provides excellent overall performance for most applications. Read operations can be distributed in parallel across all the disks for fast read throughput. Small writes are slower because each one triggers a read-modify-write cycle: the controller reads the existing data block and the existing parity block, computes the new parity, and then writes the new data block and the new parity block.
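The parity update behind a small write can be sketched with XOR, which is the operation single-parity RAID uses. The function names here are hypothetical and the two-byte blocks are only for illustration.

```python
def update_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Return the new parity block after one data block in a stripe changes.

    XOR-ing out the old data and XOR-ing in the new data updates the parity
    without reading the other data blocks in the stripe.
    """
    if not (len(old_data) == len(new_data) == len(old_parity)):
        raise ValueError("blocks must be the same size")
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


def full_parity(blocks: list[bytes]) -> bytes:
    """Compute parity from scratch as the XOR of every data block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


stripe = [b"\x0f\xf0", b"\x33\xcc", b"\x55\xaa"]
parity = full_parity(stripe)

# Overwrite the second block: two reads (old data, old parity) and
# two writes (new data, new parity), which is four I/Os in total.
new_block = b"\xff\x00"
new_parity = update_parity(stripe[1], new_block, parity)
stripe[1] = new_block
assert new_parity == full_parity(stripe)
```

The final assertion checks that the shortcut gives the same parity as recomputing it across the whole stripe.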

RAID 5 performance characteristics:

  • Very fast reads – all member disks read in parallel
  • Good sequential read and write throughput
  • Slower random writes due to parity update calculations
  • Good transactional performance with average read/write speeds

For general purpose use with combined reads and writes, RAID 5 offers excellent performance. The parity tradeoff is reasonable to gain the benefits of fault tolerance.

RAID 5 Reliability

A key benefit of RAID 5 is the added redundancy provided by the distributed parity mechanism. With the parity blocks distributed across the array, the RAID can withstand a single disk failure without data loss. When a disk fails, the missing data can be recalculated from the parity block and remaining data disks.
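The recovery step can be shown in a few lines: XOR-ing the surviving data blocks with the parity block yields the missing block. This is a toy sketch on four-byte blocks, and `xor_blocks` is a hypothetical helper.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-sized blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks in one stripe, plus their parity block.
data = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data)

# Simulate losing the drive that held data[1].
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)  # b'five'
```

The same calculation is repeated for every stripe when a replacement drive is rebuilt.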

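RAID 5 provides good protection against a single disk failure. After the failed disk is replaced, the array rebuilds its contents by reading every surviving disk in full and recomputing the missing blocks, so rebuild time grows with disk size and with the load on the array. Until the rebuild completes, the array has no redundancy, and a second disk failure or an unreadable sector on a surviving disk can cause data loss. As a rough guide, rebuilds take on the order of 1-2 hours per terabyte, though the actual rate depends heavily on the controller, the drives, and the workload.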
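To get a feel for the exposure while a rebuild is running, the sketch below estimates the chance that another drive fails before the rebuild finishes. It assumes independent drive failures at a constant rate, which real arrays often violate because drives from the same batch tend to fail together. The inputs are illustrative, not measurements, and `second_failure_probability` is a hypothetical helper.

```python
import math


def second_failure_probability(surviving_drives: int,
                               annual_failure_rate: float,
                               rebuild_hours: float) -> float:
    """Probability that at least one surviving drive fails during a rebuild.

    Assumes independent failures with a constant (exponential) failure rate.
    """
    hours_per_year = 24 * 365
    # Convert the annual failure probability to a per-hour hazard rate.
    hourly_rate = -math.log(1.0 - annual_failure_rate) / hours_per_year
    return 1.0 - math.exp(-surviving_drives * hourly_rate * rebuild_hours)


# Illustrative inputs only: 7 surviving drives, 2% annual failure rate,
# and a 24-hour rebuild.
risk = second_failure_probability(7, 0.02, 24)
print(f"{risk:.4%}")
```

Under this model the risk grows with both the number of surviving drives and the length of the rebuild, which is why larger arrays of large drives tend to push the choice toward dual parity.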

RAID 5 Capacity and Cost

For an array with N member disks of equal size, the total usable capacity in a RAID 5 configuration is:

RAID 5 Capacity = (N – 1) x Disk Size

This accounts for the space used for parity information. With today’s large high-capacity SATA drives, the parity overhead is relatively small given the gain in redundancy and performance.
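The capacity formula is easy to check in code. The sketch below also covers arrays with mixed disk sizes, where each member contributes only as much as the smallest disk; `raid5_usable_capacity` is a hypothetical helper.

```python
def raid5_usable_capacity(disk_sizes_tb: list[float]) -> float:
    """Usable capacity of a RAID 5 array in TB.

    Every member contributes only as much as the smallest disk, and one
    disk's worth of space is consumed by parity.
    """
    if len(disk_sizes_tb) < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (len(disk_sizes_tb) - 1) * min(disk_sizes_tb)


# Four 4 TB disks: 12 TB usable.
print(raid5_usable_capacity([4, 4, 4, 4]))
# Swapping one disk for an 8 TB model adds nothing: still 12 TB usable.
print(raid5_usable_capacity([4, 4, 4, 8]))
```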

In terms of cost, RAID 5 provides very good overall value per gigabyte while still delivering performance, capacity and redundancy. The technology has proven itself as a workhorse RAID level with widespread use across all types of systems and applications.

RAID 6 Arrays

RAID 6 is an advanced form of RAID that provides dual parity for added redundancy and protection:

  • Similar to RAID 5 with second set of parity blocks
  • Withstands up to two drive failures with no data loss
  • Slower write performance than RAID 5 due to dual parity
  • Usable capacity is (N-2)/N of raw capacity, where N is the drive count
  • Excellent fault tolerance for mission critical data
  • Higher cost than RAID 5 and longer rebuilds

The dual parity provides an extra layer of protection but comes at the cost of decreased write performance and higher storage overhead. However, for certain critical applications, the added redundancy justifies the downsides.

RAID 6 Performance

The performance profile of RAID 6 is similar to RAID 5 but with slower write speeds. Specifically:

  • Excellent sustained read speeds
  • Very good sequential reads and writes
  • Slow random writes due to dual parity calculations
  • Often used with enterprise SSDs to offset write penalty

The write penalty is more pronounced than in RAID 5 because both the P and Q parity blocks must be updated on each small write. To alleviate this, RAID 6 is often paired with enterprise SSDs, whose random write performance is much higher than that of HDDs.
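A common rule of thumb puts the cost of one small random write at four back-end I/Os for RAID 5 and six for RAID 6, compared with two for mirrored levels. The sketch below uses those figures to estimate host-visible IOPS for a given read/write mix. The drive count, per-drive IOPS, and write fraction are hypothetical inputs, and the model ignores controller caches and full-stripe writes, both of which reduce the penalty.

```python
# Back-end I/Os generated by one small random write (rule-of-thumb values).
WRITE_PENALTY = {"RAID 0": 1, "RAID 10": 2, "RAID 5": 4, "RAID 6": 6}


def effective_iops(raw_iops: float, write_fraction: float, level: str) -> float:
    """Estimate host-visible random IOPS for a given read/write mix.

    Each read costs one back-end I/O; each write costs WRITE_PENALTY[level].
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    cost_per_io = (1.0 - write_fraction) + write_fraction * WRITE_PENALTY[level]
    return raw_iops / cost_per_io


# Hypothetical array: 8 drives at 150 IOPS each, 30% writes.
raw = 8 * 150
for level in WRITE_PENALTY:
    print(f"{level}: {effective_iops(raw, 0.3, level):.0f} IOPS")
```

The gap between RAID 5 and RAID 6 widens as the write fraction rises and disappears for read-only workloads.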

RAID 6 Reliability

With support for dual drive failures, RAID 6 offers the highest level of fault tolerance among the standard parity RAID levels. The dual distributed parity provides an added layer of protection and resistance to multiple drive failures.

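Note that dual parity does not by itself eliminate the RAID “write hole”, in which a crash or power loss between the data write and the parity write leaves a stripe inconsistent. Both RAID 5 and RAID 6 are exposed to it unless the implementation adds protection such as a battery-backed write cache or a write journal.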

For mission critical data or large drive arrays, RAID 6 is the premier choice to prevent data loss and downtime. Losing data requires a third drive to fail before the first two have been rebuilt, which is very unlikely in most environments.

RAID 6 Capacity and Cost

The usable capacity in a RAID 6 array is a function of the number of disks:

RAID 6 Capacity = (N – 2) x Disk Size

With the need to store two sets of parity data, total capacity is lower compared to RAID 5. However, with today’s massive drive sizes, this is usually not a major issue.
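How much the second parity disk costs in relative terms depends on the size of the array. The sketch below tabulates the fraction of raw capacity left for data under single and dual parity; `parity_efficiency` is a hypothetical helper and the drive counts are arbitrary examples.

```python
def parity_efficiency(num_disks: int, parity_disks: int) -> float:
    """Fraction of raw capacity left for data after parity overhead."""
    if num_disks <= parity_disks:
        raise ValueError("need more disks than parity disks")
    return (num_disks - parity_disks) / num_disks


print("disks  RAID 5  RAID 6")
for n in (4, 6, 8, 12):
    r5 = parity_efficiency(n, parity_disks=1)
    r6 = parity_efficiency(n, parity_disks=2)
    print(f"{n:>5}  {r5:>6.0%}  {r6:>6.0%}")
```

The gap between the two levels narrows as drives are added, which is one reason dual parity is easier to justify on larger arrays.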

In terms of cost, RAID 6 has a higher expense for a given amount of storage vs RAID 5. The increased hardware requirements also add to the overall system cost. However, for certain use cases, the high redundancy and availability justify the added investment.
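The cost difference can be expressed as drive cost per usable terabyte. The figures below are hypothetical and count only the drives, leaving out controllers, enclosures, and power; `cost_per_usable_tb` is a helper written for this example.

```python
def cost_per_usable_tb(num_disks: int, disk_tb: float, disk_price: float,
                       parity_disks: int) -> float:
    """Drive cost divided by usable capacity for a parity RAID array."""
    usable_tb = (num_disks - parity_disks) * disk_tb
    if usable_tb <= 0:
        raise ValueError("array has no usable capacity")
    return num_disks * disk_price / usable_tb


# Hypothetical figures: six 8 TB drives at 200 currency units each.
raid5 = cost_per_usable_tb(6, 8, 200, parity_disks=1)
raid6 = cost_per_usable_tb(6, 8, 200, parity_disks=2)
print(f"RAID 5: {raid5:.2f} per usable TB")
print(f"RAID 6: {raid6:.2f} per usable TB")
```

With these inputs, the same six drives cost 25% more per usable terabyte under RAID 6 than under RAID 5.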

Comparison of RAID 5 and RAID 6

                           RAID 5                 RAID 6
Redundancy                 Single parity stripe   Double parity stripe
Drive Failures Tolerated   1                      2
Read Performance           Excellent              Excellent
Write Performance          Good                   Moderate
Usable Capacity (disks)    N-1                    N-2
Rebuild Time               Fast                   Slow

In summary, RAID 5 offers faster rebuild times, better write performance, lower capacity overhead, and a lower cost than RAID 6. However, RAID 6 provides significantly better redundancy and protection against multiple drive failures. For mission critical data or large arrays, RAID 6 is preferred. In other scenarios where redundancy needs are lower, RAID 5 offers excellent performance and value.

Choosing Between RAID 5 and RAID 6

When deciding between RAID 5 and RAID 6, consider the following factors:

  • Redundancy requirements – How costly would a second drive failure during a rebuild be?
  • Performance needs – Will the dual parity writes of RAID 6 cause issues?
  • Number of drives – RAID 6 preferred for larger arrays
  • Drive types – RAID 6 suits large HDDs, which take longer to rebuild; RAID 5 is a more reasonable fit for SSDs, which rebuild quickly
  • Budget – RAID 6 has higher hardware and capacity costs
  • Expected workload – random-write-heavy OLTP feels the RAID 6 write penalty most, while sequential workloads such as media streaming are largely unaffected by it

In most general purpose applications, RAID 5 provides the right blend of redundancy, performance, and value for money. RAID 6 is preferred for arrays with 8+ large drives or mission critical transactional databases.
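The rule of thumb in the paragraph above can be written down as a small helper. This is only a sketch of that guidance: the 8-drive threshold comes from the text, the function name and parameters are hypothetical, and a real decision would also weigh workload and budget.

```python
def suggest_raid_level(num_drives: int, mission_critical: bool) -> str:
    """Pick RAID 5 or RAID 6 using the rule of thumb stated above.

    Treat the 8-drive threshold as a starting point, not a hard limit.
    """
    if num_drives < 3:
        raise ValueError("parity RAID needs at least three drives")
    if num_drives >= 8 or mission_critical:
        if num_drives < 4:
            raise ValueError("RAID 6 needs at least four drives")
        return "RAID 6"
    return "RAID 5"


print(suggest_raid_level(4, mission_critical=False))
print(suggest_raid_level(12, mission_critical=False))
print(suggest_raid_level(6, mission_critical=True))
```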

Conclusion

Determining the most cost-effective RAID level depends on the specific needs of an environment. In review, here are some final recommendations:

  • RAID 5 hits the sweet spot for overall value, performance, and redundancy for most uses
  • RAID 6 provides excellent fault tolerance for critical applications or large drive counts
  • RAID 10 offers strong random write performance but costs more per usable gigabyte because half the raw capacity goes to mirroring
  • RAID 0 maximizes speed and capacity with no redundancy
  • RAID 1 ensures high availability through mirroring

For a versatile, balanced solution, RAID 5 is an excellent choice for general purpose applications while RAID 6 offers superior redundancy for mission critical systems and large drive arrays. Carefully evaluate the tradeoffs and select the RAID level that best aligns with the requirements and constraints of your specific environment.