What RAID array should I use?

Choosing the right RAID array for your storage needs is an important decision that requires careful consideration. The main factors to think about are performance, capacity, redundancy, and cost. This article will provide a comprehensive overview of the different RAID levels and help you determine which one is best for your specific use case.

What is RAID?

RAID stands for Redundant Array of Independent Disks. It is a data storage technology that combines multiple disk drive components into a logical unit. Data is distributed across the drives in one of several ways called RAID levels. Each RAID level provides a different balance of performance, capacity, and redundancy.

The main reasons to use RAID are to increase storage capacity beyond the limit of a single drive, improve read and write performance, and protect against drive failures. The redundant RAID levels protect data by rebuilding a failed drive's contents from mirror copies or parity information held on the remaining drives.

Key Advantages of RAID

  • Increased storage capacity – Combining drives together in an array allows for larger volumes beyond the capacity limit of any one disk.
  • Improved performance – Spreading data across multiple disks can optimize read and write speeds through parallelization.
  • Redundancy – Parity and mirroring provide fault tolerance if a drive fails.

RAID Levels

There are several standardized RAID levels, each with specific data distribution and redundancy characteristics. Here is an overview of the most commonly used RAID levels:

RAID 0

  • Data is striped across multiple drives without parity or mirroring.
  • Provides improved performance through parallelization.
  • No redundancy – One drive failure results in total data loss.
  • Best for non-critical data where speed is most important.
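
To make striping concrete, here is a minimal sketch of how a logical block number could map onto the drives of a RAID 0 array. It assumes a stripe unit of exactly one block, which is a simplification, and the function name `locate_block` is made up for this illustration.

```python
def locate_block(logical_block: int, num_drives: int) -> tuple[int, int]:
    """Map a logical block number to (drive index, block offset on that drive).

    Assumption: the stripe unit is exactly one block. Real arrays use a
    configurable stripe size, but the round-robin idea is the same.
    """
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least two drives")
    return logical_block % num_drives, logical_block // num_drives


# Eight consecutive logical blocks on a 4-drive array rotate through
# drives 0, 1, 2, 3, 0, 1, 2, 3, so sequential I/O is spread evenly.
for block in range(8):
    drive, offset = locate_block(block, num_drives=4)
    print(f"logical block {block} -> drive {drive}, offset {offset}")
```

Because every drive holds a slice of every large file, losing any one drive leaves gaps throughout the data, which is why a single failure is fatal for the whole array.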

RAID 1

  • Drives are mirrored, with data duplicated on a second drive.
  • Good performance with simultaneous read operations.
  • Usable capacity is equal to that of a single drive, because every other drive holds a copy.
  • Can recover from a single drive failure.
  • Ideal for critical data that needs high availability.

RAID 5

  • Data is striped across drives with distributed parity information.
  • Can survive a single drive failure without data loss.
  • Usable capacity is equal to the number of drives minus one, multiplied by the size of a single drive; the equivalent of one drive is used for parity.
  • Good balance of capacity, performance, and redundancy for most applications.
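
The parity used by RAID 5 is commonly computed as a bytewise XOR of the data blocks in a stripe. The sketch below shows why that is enough to rebuild one missing block. It is a toy illustration with in-memory byte strings, not a model of any particular controller; the helper name `xor_blocks` is an assumption made for this example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, one byte position at a time."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same non-zero count and length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across three data drives, plus its parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second drive: XOR the survivors with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

The same XOR that produced the parity also recovers the lost block, because XORing a value with itself cancels out. With two blocks missing there is not enough information left, which is the single-failure limit of RAID 5.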

RAID 6

  • Similar to RAID 5 but with double distributed parity.
  • Can survive the failure of any two drives without data loss.
  • Write performance may be slower than RAID 5 due to parity calculations.
  • Recommended where avoiding downtime is crucial.
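
A common rule of thumb for comparing small random-write performance is the "write penalty": the number of disk I/Os each logical write turns into. The sketch below uses the conventional figures (2 for mirroring, 4 for RAID 5, 6 for RAID 6). Treat it as a rough estimate only: it ignores controller caches and full-stripe writes, and the per-drive IOPS value is an assumed input, not a measurement.

```python
# Conventional rule-of-thumb I/Os per small random write, per RAID level.
WRITE_PENALTY = {"RAID 0": 1, "RAID 1": 2, "RAID 10": 2, "RAID 5": 4, "RAID 6": 6}


def estimated_write_iops(level: str, num_drives: int, iops_per_drive: float) -> float:
    """Rough small-random-write IOPS: raw drive IOPS divided by the write penalty."""
    return num_drives * iops_per_drive / WRITE_PENALTY[level]


# Six drives, assuming 150 IOPS each (an illustrative figure for an HDD).
for level in ("RAID 5", "RAID 6", "RAID 10"):
    print(level, estimated_write_iops(level, num_drives=6, iops_per_drive=150))
```

Under these assumptions RAID 6 comes out at about two thirds of RAID 5's small-write throughput, which matches the direction of the trade-off described above.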

RAID 10

  • Combination of mirroring and striping for high performance and redundancy.
  • Can withstand multiple drive failures as long as no mirror loses all drives.
  • 50% storage efficiency due to mirroring.
  • Ideal for mission-critical applications that require speed and availability and can accept the capacity overhead of mirroring.
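
Whether a RAID 10 array survives several failures depends on which drives fail, not just how many. The following sketch checks that condition for a given set of failed drives. The drive numbering and the function name `raid10_survives` are assumptions for illustration.

```python
def raid10_survives(mirror_pairs: list[tuple[int, int]], failed: set[int]) -> bool:
    """Return True if every mirror pair still has at least one working drive."""
    return all(not set(pair) <= failed for pair in mirror_pairs)


# Four drives arranged as two mirrored pairs that are striped together.
pairs = [(0, 1), (2, 3)]

print(raid10_survives(pairs, {0, 2}))  # True: one drive lost from each pair
print(raid10_survives(pairs, {0, 1}))  # False: both halves of one mirror lost
```

Both cases involve two failed drives, but only the first leaves a complete copy of the data.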

Choosing the Right RAID Level

Selecting the optimal RAID level involves balancing performance, capacity, redundancy, and budgetary requirements. Here are some guidelines for choosing the right RAID level:

Situation                                         Recommended RAID Level
Need maximum performance for non-critical data    RAID 0
Critical data requiring high availability         RAID 1 or RAID 10
General-purpose file and application servers      RAID 5 or RAID 6
Mission-critical transactional databases          RAID 10
Media streaming and video editing                 RAID 5 or RAID 6
Budget constraints limit number of drives         RAID 1

You should also consider the number of drives you want to deploy and their individual capacities. More drives allow for larger arrays and greater flexibility in how you configure the RAID level.
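
To see how drive count and drive size translate into usable space, here is a small calculator sketch. It assumes all drives in the array are the same size and applies the usual minimum drive counts; the function name `usable_capacity_tb` is made up for this example.

```python
def usable_capacity_tb(level: str, num_drives: int, drive_tb: float) -> float:
    """Usable capacity in TB for an array of identical drives."""
    if level == "RAID 0":
        minimum, data_drives = 2, num_drives           # no redundancy
    elif level == "RAID 1":
        minimum, data_drives = 2, 1                    # every drive is a copy
    elif level == "RAID 5":
        minimum, data_drives = 3, num_drives - 1       # one drive's worth of parity
    elif level == "RAID 6":
        minimum, data_drives = 4, num_drives - 2       # two drives' worth of parity
    elif level == "RAID 10":
        if num_drives % 2:
            raise ValueError("RAID 10 needs an even number of drives")
        minimum, data_drives = 4, num_drives // 2      # half the drives are mirrors
    else:
        raise ValueError(f"unsupported level: {level}")
    if num_drives < minimum:
        raise ValueError(f"{level} needs at least {minimum} drives")
    return data_drives * drive_tb


# The same four 4TB drives give very different usable space per level.
for level in ("RAID 0", "RAID 1", "RAID 5", "RAID 6", "RAID 10"):
    print(f"{level}: {usable_capacity_tb(level, 4, 4.0)} TB usable")
```

With four 4TB drives this yields 16TB for RAID 0, 12TB for RAID 5, 8TB for RAID 6 and RAID 10, and 4TB for a four-way RAID 1 mirror.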

RAID Configuration Scenarios

Let’s look at some common scenarios and recommended RAID configurations:

General File Server

For general-purpose file and print serving for a small office, you might configure:

  • 4 x 4TB HDD in RAID 5 = 12TB usable capacity
  • Good balance of decent capacity and performance with single drive fault tolerance

Database Server

For a mission critical database server, higher performance and redundancy are required. A sample setup might be:

  • 8 x 800GB SSD in RAID 10 = 3.2TB usable capacity
  • Very high performance read/writes with ability to survive multiple drive failures

Media Editing

For a video editing workstation with large storage needs, you could implement:

  • 6 x 8TB HDD in RAID 6 = 32TB usable capacity
  • Large capacity for holding raw video footage with good redundancy

Critical Virtual Host

For a virtualization host running many critical VMs, high performance and redundancy are crucial:

  • 4 x 200GB SSD in RAID 10 (400GB usable) + 4 x 1TB HDD in RAID 10 (2TB usable), configured as two separate arrays
  • Fast SSD performance for caching and HDD capacity for VMs
  • Can survive multiple drive failures

Software vs Hardware RAID

RAID can be implemented in software, hardware, or a combination of both. The choice depends on your performance and budget requirements:

  • Software RAID – RAID is handled by the operating system. More affordable but uses CPU resources.
  • Hardware RAID – Uses a dedicated RAID controller card. Faster but more expensive.
  • Hybrid – Combine hardware RAID with software RAID for flexibility.

For home and small offices, software RAID is common. Hardware RAID becomes important for mission critical servers and high performance workstations managing large volumes of data.

RAID Configuration Steps

Once you choose the optimal RAID level and hardware configuration, these are the general steps to set up and configure RAID:

  1. Install the physical hard drives and RAID controller (if using hardware RAID).
  2. Configure RAID settings in the system BIOS/UEFI or the controller's setup utility if using hardware RAID.
  3. Create the RAID arrays through RAID management software.
  4. Install the operating system on the RAID volume.
  5. Test and verify the RAID configuration for errors.
  6. Add a hot spare drive, if desired, so that a rebuild can start automatically when a drive fails.
  7. Monitor the health of the RAID arrays over time.

The RAID management software will allow you to configure the specific RAID level, stripe size, drive assignments, and other settings for your arrays. Take care when configuring RAID: creating a new array typically destroys all existing data on the member drives, so back up anything you need first.
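
As one illustration of the array-creation step, the sketch below builds the command line for creating a Linux software RAID array with `mdadm`. This assumes a Linux host using `mdadm`; hardware controllers use their own tools. The device paths are hypothetical placeholders, and the helper `build_mdadm_create` is invented for this example. It only builds and prints the command and does not run it.

```python
import shlex


def build_mdadm_create(array: str, level: int, devices: list[str],
                       spares: tuple[str, ...] = ()) -> list[str]:
    """Build (but do not execute) an `mdadm --create` command line."""
    argv = ["mdadm", "--create", array,
            f"--level={level}", f"--raid-devices={len(devices)}"]
    if spares:
        argv.append(f"--spare-devices={len(spares)}")
    # Active members are listed first, followed by any hot spares.
    return argv + list(devices) + list(spares)


# Hypothetical device names: a 3-drive RAID 5 array with one hot spare.
command = build_mdadm_create(
    "/dev/md0", 5, ["/dev/sdb", "/dev/sdc", "/dev/sdd"], spares=("/dev/sde",)
)
print(shlex.join(command))
```

Printing the command first gives you a chance to double-check the device names before running anything destructive.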

Choosing RAID for Virtualized Environments

Virtualized servers and infrastructure require special consideration when choosing a RAID configuration:

  • Use RAID 10 for hypervisors and virtual machine storage when possible for performance.
  • Ensure virtual disk components are distributed across multiple physical RAID arrays.
  • Use SSDs for virtual machine cache and logs to boost performance.
  • Configure separate RAID arrays for hypervisor versus virtual machine files.

Monitoring RAID status and disk health is also critical in virtual environments to identify and replace failed drives promptly. Keep spare drives available for quick replacement to avoid virtual machine downtime.

Maintaining and Monitoring RAID Arrays

Like any storage system, RAID arrays require ongoing maintenance and monitoring. Here are some best practices:

  • Monitor disk health statistics and events in management tools.
  • Replace failed drives immediately to rebuild redundancy.
  • Keep spare drives on hand to swap into arrays.
  • Scrub arrays periodically to check data integrity.
  • Upgrade disk firmware and RAID controller software when needed.
  • Test redundancy by simulating drive failures.
  • Monitor system logs for any RAID errors or events.

By taking steps to proactively monitor and maintain RAID configurations, you can minimize the chances of any service disruptions. Consider a remote monitoring service to automatically check RAID health statistics if staffing resources are limited.
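
As a sketch of what automated health checking can look like, the example below scans status text in the style of Linux's `/proc/mdstat` for arrays that are missing a member. This assumes Linux software RAID; the sample text is hand-written for illustration, and the parser handles only the simple `mdN` naming shown here. The function name `degraded_arrays` is an assumption.

```python
import re

# Hand-written sample in the style of /proc/mdstat (illustrative only).
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sde1[3] sdd1[2] sdc1[1](F) sdb1[0]
      2929889280 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [U_UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text: str) -> dict[str, str]:
    """Return {array name: member status string} for arrays missing a member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # "[4/3] [U_UU]" means 4 members expected, 3 active; "_" marks a gap.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded[current] = status.group(3)
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))  # {'md1': 'U_UU'}
```

A check like this could run on a schedule and raise an alert whenever the returned dictionary is not empty.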

Conclusion

Choosing the optimal RAID setup requires weighing factors like application performance needs, required capacity, budget, and data redundancy requirements. More redundancy provides more data protection, but at the cost of usable storage capacity and, for parity-based levels, write performance. The right balance depends on your specific use case.

In general, opt for RAID 1 or RAID 10 for mission critical data, RAID 5/6 for general purpose use, and RAID 0 when performance is the primary goal over redundancy. Take growth needs into account and use higher capacity drives if possible.

Carefully testing configurations, monitoring RAID health, and following redundancy best practices will help ensure your array keeps protecting your data and justifies its cost.