What are the downsides of redundant disks?

Redundant disks, most commonly implemented as RAID (Redundant Array of Independent Disks), provide fault tolerance, and in some configurations better performance, by combining multiple disks into one storage system. The basic idea is that if one disk fails, the data can still be accessed from the other disks in the array. This prevents data loss and downtime in the event of a disk failure.

Some common RAID levels and their pros and cons are:

RAID 0

RAID 0 provides improved performance by striping data across multiple disks. This allows reads and writes to be done in parallel. However, RAID 0 provides no redundancy – if any disk fails, all data will be lost.

RAID 1

RAID 1 provides redundancy by mirroring data across two disks. If one disk fails, data can still be accessed from the other mirror disk. The downside is that usable capacity is only equal to one disk, as the second disk is an exact copy.

RAID 5

RAID 5 stripes data and parity information across three or more disks. If any single disk fails, its contents can be recreated from the remaining data and parity. Usable capacity is higher than with RAID 1: an array of n disks provides n-1 disks' worth of space. However, write performance suffers because every write must also update parity.
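As a rough illustration of how parity allows a lost disk to be recreated, the sketch below uses XOR, the operation commonly used for single-parity RAID. The block contents and the helper name xor_blocks are made up for this example; a real array works on much larger blocks and rotates parity across all disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical three-disk array: two data blocks plus parity.
data_a = b"RAID"
data_b = b"demo"
parity = xor_blocks([data_a, data_b])

# Simulate losing the disk that held data_b and rebuild it from the survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)
```

The same property explains why a second failure is fatal for single parity: with two blocks of a stripe missing, one XOR equation is no longer enough to solve for both.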

RAID 6

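RAID 6 is similar to RAID 5, but stores two independent parity blocks per stripe, distributed across four or more disks, so that any two disks can fail without data loss. This further improves redundancy but reduces usable capacity by one more disk compared to RAID 5, and writes are slower because two parity blocks must be updated.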

While redundant disks provide important benefits, some downsides and considerations include:

Increased Cost

The most obvious downside of redundant disks is increased monetary cost. RAID configurations require purchasing extra drives, which can get expensive. For example, a 4 disk RAID 10 configuration requires buying 4 drives but provides the usable capacity of only 2. Enterprise quality SAS or SSD drives designed for RAID setups also have a higher upfront cost than consumer grade drives. The extra cost may be justified for mission critical data, but is overkill for less important data.

Complexity

Configuring, managing, and troubleshooting RAID arrays requires more knowledge and expertise compared to standalone disks. Choosing the right RAID level, properly configuring the array, monitoring for disk failures, replacing failed drives, and recovering from failures are complex tasks. Many small businesses lack dedicated storage administrators to handle this complexity.

Lower Usable Capacity

Depending on the RAID level, redundant disks provide less overall usable capacity compared to standalone disks. For example, in a 2 disk RAID 1 mirror, total usable capacity is equal to only a single disk. In RAID 5 with 3 disks, usable capacity is equal to 2 disks worth of space. The redundancy overhead has to be accounted for when planning storage capacity.
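The capacity arithmetic above can be sketched as a small helper. This is a minimal example assuming equal-size disks and two-way mirrors for RAID 10; the function name usable_disks and the 4 TB disk size are illustrative choices, not part of any RAID tool.

```python
def usable_disks(level, disks):
    """Return how many disks' worth of capacity is usable (equal-size disks assumed)."""
    if level == "RAID 0":
        minimum, usable = 2, disks          # no redundancy overhead
    elif level == "RAID 1":
        minimum, usable = 2, 1              # every disk holds the same copy
    elif level == "RAID 5":
        minimum, usable = 3, disks - 1      # one disk's worth of parity
    elif level == "RAID 6":
        minimum, usable = 4, disks - 2      # two disks' worth of parity
    elif level == "RAID 10":
        minimum, usable = 4, disks // 2     # striped two-way mirrors
    else:
        raise ValueError(f"unsupported level: {level}")
    if disks < minimum:
        raise ValueError(f"{level} needs at least {minimum} disks")
    if level == "RAID 10" and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return usable


DISK_TB = 4  # hypothetical disk size
for level, disks in [("RAID 1", 2), ("RAID 5", 3), ("RAID 6", 6), ("RAID 10", 4)]:
    usable_tb = usable_disks(level, disks) * DISK_TB
    print(f"{level} with {disks} x {DISK_TB} TB: {usable_tb} TB usable of {disks * DISK_TB} TB raw")
```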

Slower Rebuild Times

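When a disk in a RAID array fails, the data has to be rebuilt onto a replacement disk to restore redundancy. For large capacity disks, rebuilds can take many hours or even days, especially while the array keeps serving normal traffic. During the rebuild window, redundancy is reduced: a RAID 5 array or a two-disk mirror has no protection against a second disk failure. Because disk capacity has grown faster than disk throughput, rebuild times keep increasing as disks get larger.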
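A back-of-the-envelope estimate shows why rebuild time scales with disk size. The sketch below assumes the whole replacement disk is rewritten at a constant rate; the 100 MB/s figure is an assumed value for illustration, and real rebuilds are often slower because the array is still serving other I/O.

```python
def rebuild_hours(disk_tb, rebuild_mb_per_s):
    """Best-case rebuild time in hours for one replacement disk."""
    disk_mb = disk_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return disk_mb / rebuild_mb_per_s / 3600


ASSUMED_RATE_MB_S = 100  # hypothetical sustained rebuild rate
for disk_tb in (1, 4, 16):
    hours = rebuild_hours(disk_tb, ASSUMED_RATE_MB_S)
    print(f"{disk_tb} TB disk: about {hours:.1f} hours")
```

At a fixed rate the estimate grows linearly with capacity, so a disk sixteen times larger leaves the array exposed for sixteen times as long.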

Performance Overhead

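RAID levels that provide redundancy have to do extra work on every write. Mirrored levels write each block twice, while parity levels must also update parity. On RAID 5, a small write typically means reading the old data and old parity, then writing the new data and new parity: four disk operations for one logical write. As a result, RAID 5 random write performance is well below what the same disks deliver in a non-redundant stripe, and can approach that of a single disk.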
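A common rule of thumb expresses this overhead as a write penalty: the number of back-end disk operations behind each small random write. The sketch below uses the usual textbook penalty values; the per-disk figure of 150 IOPS is an assumption for illustration, and controller caches or full-stripe writes can change the picture considerably.

```python
# Back-end disk operations per small random write (rule-of-thumb values).
WRITE_PENALTY = {"RAID 0": 1, "RAID 1": 2, "RAID 10": 2, "RAID 5": 4, "RAID 6": 6}


def random_write_iops(level, disks, iops_per_disk):
    """Estimate the random write IOPS an array can sustain."""
    return disks * iops_per_disk / WRITE_PENALTY[level]


ASSUMED_IOPS_PER_DISK = 150  # hypothetical figure for a single disk
for level in ("RAID 0", "RAID 10", "RAID 5", "RAID 6"):
    estimate = random_write_iops(level, 4, ASSUMED_IOPS_PER_DISK)
    print(f"{level} on 4 disks: about {estimate:.0f} random write IOPS")
```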

Single Point of Failure

Many RAID controllers and enclosures represent a single point of failure. If the RAID controller fails, access to the entire array is lost. Some RAID enclosures also have a single power supply or expander module that can take down the entire enclosure if faulty.

Difficult to Upgrade

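Once a RAID array is configured, expanding capacity or changing the RAID level is rarely simple. Some controllers and software RAID implementations support adding disks or migrating between levels online, but these operations are slow and put the array at risk while they run. Mixing disks of different sizes typically wastes the extra space on the larger disks. In many cases the safest option is still to back up the data, destroy the array, and recreate it from scratch.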

Fault Tolerance, Not Backup

It’s important to note that RAID provides fault tolerance against disk failures, but is not a backup solution. If data gets corrupted or deleted for any reason, it will also get corrupted or deleted on the redundant disks. Systematic backup to an external location is still required to protect against data loss.

Most Failures are Not Disk Related

Disk failures are often reported to account for only a minority of data loss incidents, with figures of 10-20% commonly quoted. More common causes include human error, software bugs, malware, power outages, other hardware faults, and natural disasters. RAID arrays can do nothing against these non-disk failures.

Hidden Latent Disk Defects

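Disks that pass initial testing can still develop latent defects, such as bad sectors that go unnoticed because the affected data is rarely read. These defects often surface only during a rebuild, when every surviving disk must be read in full; at that point the mirror or parity data needed to reconstruct the affected blocks may itself be unreadable. Regular scrubbing of the array reduces this risk, but it cannot be totally eliminated.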

Increased Array Management

Managing large RAID arrays with tens or hundreds of disks introduces additional challenges. More hot spares may be needed to handle frequent rebuilds. Tracking disk inventory and replacing older disks gets more complex. Larger arrays are also impacted more severely by simultaneous disk failures.

Extra Hardware

Hardware RAID requires a dedicated RAID controller and, in some cases, a RAID enclosure. These add to the hardware cost and can become single points of failure if the cards or enclosures are poorly designed or implemented. Software RAID avoids the extra hardware but uses host CPU and memory instead.

Difficult Data Recovery

Recovering data from failed RAID arrays can be difficult, tedious, and expensive. Advanced skills with RAID configuration and reconstruction are needed. In some cases, data recovery services may be the only option if onsite staff lacks expertise. This quickly adds up in costs.

Testing and Validation Overhead

Extensively testing a RAID array to verify that it provides the expected redundancy and performance requires dedicated tools, time, and expertise. Under cost pressure, some organizations cut corners on validation, which leaves configuration issues undetected until a real failure exposes them.

Install and Configure Properly

To benefit from redundant disks while avoiding drawbacks, RAID arrays must be carefully designed, implemented, and tested. Some best practices include:

– Choose the optimal RAID level based on performance vs redundancy needs. Don’t blindly default to RAID 5 or RAID 10.

– Use enterprise class disks designed for RAID instead of consumer grade drives.

– Buy disks from multiple suppliers or production batches to reduce the risk of several disks failing together.

– Implement hot spares so rebuilds start immediately after a failure.

– Use battery-backed cache to prevent data loss on power failure.

– Monitor arrays proactively for impending disk failures.

– Validate configurations and walk through failover test scenarios.

– Have spare parts and documented procedures to facilitate fast recovery.

– Back up RAID arrays regularly; RAID alone is not a backup solution.
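As one illustration of proactive monitoring, the sketch below scans status text in the format Linux software RAID (md) exposes through /proc/mdstat and reports arrays with a missing member. It assumes a Linux md setup; the sample text is made up for the example, and hardware controllers report status through their own vendor tools instead.

```python
import re

# Made-up sample in the /proc/mdstat format; "_" marks a missing member.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sdc1[0] sdd1[1] sde1[2](F)
      1953260544 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose member status shows a missing disk."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[([U_]+)\]\s*$", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real Linux host the same function could be fed the contents of /proc/mdstat from a scheduled job that raises an alert when the returned list is not empty.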

The Bottom Line

Redundant disks can deliver valuable protection against downtime from disk failures. But the additional complexity and cost tradeoffs should be evaluated carefully based on actual business needs. For less critical data, backups or cloud storage may provide sufficient data protection at lower complexity. Like any technology, the pros and cons must be weighed before jumping blindly on the RAID bandwagon.

Frequently Asked Questions

What are the downsides of RAID 5?

Some key downsides of RAID 5 include:

– Write performance overhead from parity calculations

– Rebuild times are very long for large capacity drives

– Increased risk of unrecoverable read errors during rebuilds

– Double disk failure leads to total data loss

– RAID 5 is not recommended for write intensive applications
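The rebuild risk from unrecoverable read errors (UREs) can be estimated with a simple probability model. The sketch below assumes independent errors at a rate of one per 10^14 bits read, a figure often quoted on consumer drive datasheets; both the rate and the independence assumption are simplifications, and many drives behave better in practice.

```python
import math


def p_unrecoverable_read(data_read_tb, bits_per_error=1e14):
    """Probability of at least one URE while reading data_read_tb terabytes."""
    bits_read = data_read_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_error))


# Rebuilding a hypothetical 4 x 4 TB RAID 5 array reads the 3 surviving disks.
print(f"{p_unrecoverable_read(3 * 4):.0%}")
```

Under these assumptions, reading 12 TB without a single error is less likely than not, which is the usual argument for dual parity on large disks.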

What are the downsides of RAID 1?

Some key downsides of RAID 1 mirroring are:

– 50% storage efficiency as duplicates are maintained

– Increased cost since matching disks are needed for mirrors

– Rebuild time can still be lengthy for large drives

– Write performance may be slower on some controllers

Is RAID 6 better than RAID 5?

RAID 6 is generally considered better than RAID 5 today because:

– Dual parity allows two disk failures without data loss

– The array stays protected against a further disk failure while a single-disk rebuild is running

– Larger drive capacities have made RAID 5 rebuilds risky

– Read performance is comparable to RAID 5, although writes are slower because of the second parity update

– RAID 6 provides excellent data protection for critical data

What are the disadvantages of RAID?

Some major disadvantages of RAID include:

– Increased complexity and cost compared to single disks

– Rebuild times can be very long leading to vulnerability

– Many RAID levels provide lower usable capacity

– Controller or enclosure failure can take entire RAID offline

– Does not protect against data corruption, accidental deletion, or catastrophic failures that affect the whole array

– Difficult to upgrade and transition between RAID levels

When should you not use RAID?

RAID may not be suitable if:

– Budget is very tight and data integrity is not critical

– Drives are small enough that restoring from backup is quick

– Data protection via backups or cloud storage is deemed sufficient

– The application is read-intensive, gains no performance benefit from the array, and can tolerate downtime

– There is a lack of expertise to properly configure and manage RAID

Conclusion

Redundant disks via RAID provide valuable protection against downtime from disk failures, but the complexity and expense tradeoffs need careful evaluation. RAID is complex to implement and manage properly, and costs multiply quickly for large arrays. Alternatives like backups and cloud storage may meet data protection needs more cost efficiently. The pros and cons of the different RAID levels should be analyzed in detail. RAID can deliver substantial benefits for mission critical applications, but it is not a silver bullet to be applied universally. Careful planning and design are necessary to maximize the benefits while minimizing the drawbacks.