How do I know which RAID to use?

Choosing the right RAID configuration for your storage needs can be confusing. There are several RAID levels to choose from, each with its own pros and cons. In this guide, we’ll walk you through the key factors to consider when deciding which RAID to use.

What is RAID?

RAID stands for Redundant Array of Independent Disks. It is a data storage technology that combines multiple disk drives into a single logical unit. Depending on the level chosen, RAID provides more storage capacity, better reliability, higher performance, or some combination of these compared to a single drive.

The main goals of RAID are:

  • Increase data reliability and fault tolerance
  • Improve performance for read and write operations
  • Provide larger storage capacities than single drives

RAID achieves this through techniques like disk striping, mirroring, and parity checking. The different RAID levels use these techniques in various combinations to meet different goals.
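
As a rough illustration of striping and parity, the sketch below splits a small buffer across three simulated data drives, computes an XOR parity block, and rebuilds one lost chunk from the survivors. The 4-byte chunk size and the sample data are assumptions chosen for readability; real arrays use far larger stripes and handle layout very differently.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of several equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Striping: consecutive chunks of the data land on different drives.
data = b"RAIDDATA1234"
chunk_size = 4  # illustrative only
stripes = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

# Parity: the XOR of all data chunks in the stripe.
parity = xor_blocks(stripes)

# Simulate losing the second drive, then rebuild its chunk
# from the surviving chunks plus the parity block.
survivors = [stripes[0], stripes[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == stripes[1])
```

Mirroring needs no calculation at all: every write is simply repeated on each drive in the mirror.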

Key Factors for Choosing a RAID Level

Here are some key considerations when deciding which RAID level to implement:

1. Required capacity

The total storage capacity needed is a major factor. RAID 0 provides the full combined capacity of the drives in the array. RAID 1 provides only the capacity of a single drive no matter how many mirrors are added, and RAID 10 provides half of the raw capacity. With striped levels (RAID 0, 5, 6, and 10), adding drives increases the usable capacity.

2. Performance needs

RAID can provide performance improvements through parallelization. RAID 0 stripes data across all drives for fast reads and writes. RAID 5/6 also stripe data, but every write must update parity as well, which slows writes, especially small random ones. Consider whether raw performance or parity protection is more important.
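
To put a rough number on the write overhead, a common rule of thumb assigns each level a "write penalty": the number of back-end disk operations one small random write turns into. The sketch below uses the commonly quoted values (1 for RAID 0, 2 for RAID 1 and 10, 4 for RAID 5, 6 for RAID 6). These figures, and the function name, are assumptions for illustration; caching and full-stripe writes change the picture in practice.

```python
# Commonly quoted back-end operations per small random write (rule of thumb).
WRITE_PENALTY = {"0": 1, "1": 2, "10": 2, "5": 4, "6": 6}


def effective_iops(level, drives, iops_per_drive, read_fraction):
    """Estimate host-visible random IOPS for a given read/write mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    raw_iops = drives * iops_per_drive
    write_fraction = 1.0 - read_fraction
    return raw_iops / (read_fraction + write_fraction * WRITE_PENALTY[level])


for level in ("0", "10", "5", "6"):
    print(level, round(effective_iops(level, 8, 150, 0.5)))
```

With eight drives at 150 IOPS each and an even read/write mix, this estimate gives about 800 IOPS for RAID 10 and about 480 for RAID 5, which is why write-heavy workloads tend to favor mirroring.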

3. Availability and redundancy

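RAID aims to prevent data loss if a drive fails, but not all RAID levels provide redundancy. RAID 0 has none: one failed drive loses the whole array. RAID 1 and RAID 5 survive a single drive failure (a RAID 1 mirror with more than two drives survives more), RAID 6 survives any two, and RAID 10 survives one failure per mirror pair. Evaluate your uptime requirements when deciding on redundancy.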
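
The difference between levels can be made concrete with a small check that answers "does the array still hold its data after these drives fail?". This is a simplified model, not how any controller decides: drives are numbered from 0, and for RAID 10 it assumes drives (0, 1), (2, 3), and so on form the mirror pairs.

```python
def survives(level, n_drives, failed):
    """Return True if the array keeps its data with the given drives failed."""
    failed = set(failed)
    if level == "0":
        return not failed                      # any failure is fatal
    if level == "1":
        return len(failed) < n_drives          # one good copy is enough
    if level == "5":
        return len(failed) <= 1                # single parity
    if level == "6":
        return len(failed) <= 2                # double parity
    if level == "10":
        # Assumed layout: consecutive drives form mirror pairs.
        pairs = [(i, i + 1) for i in range(0, n_drives, 2)]
        return not any(a in failed and b in failed for a, b in pairs)
    raise ValueError(f"unsupported RAID level: {level}")


print(survives("10", 4, {0, 2}))  # failures in different pairs
print(survives("10", 4, {0, 1}))  # both halves of one pair
```

The two RAID 10 calls show why "survives multiple failures" depends on which drives fail: losing one drive from each pair is fine, losing both drives of a pair is not.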

4. Rebuild times

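The time to rebuild an array after a failed drive is replaced varies by RAID level. RAID 5/6 generally have longer rebuild times than RAID 1 or 10, because a parity rebuild must read every surviving drive and recompute the missing data, while a mirror rebuild copies from a single partner drive. If uptime is critical, consider RAID levels with faster rebuilds.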
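
For a first estimate of rebuild time, it helps to compute the minimum time needed to rewrite one replacement drive from end to end. The sketch below does only that arithmetic. The sustained rebuild rate is an assumption you supply; real rebuilds, particularly parity rebuilds on a busy array, often run well below a drive's rated speed.

```python
def min_rebuild_hours(drive_tb, rebuild_mb_per_s):
    """Lower-bound hours to rewrite one replacement drive completely."""
    if rebuild_mb_per_s <= 0:
        raise ValueError("rebuild rate must be positive")
    drive_bytes = drive_tb * 1e12              # decimal terabytes
    seconds = drive_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 3600


# An 8 TB drive at an assumed sustained 100 MB/s.
print(round(min_rebuild_hours(8, 100), 1))
```

Even this best case works out to roughly 22 hours, during which a RAID 5 array has no remaining redundancy.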

5. Cost effectiveness

The overall storage efficiency varies between RAID levels. Mirrored RAID levels effectively double the cost per usable gigabyte. RAID 5 uses capacity efficiently, giving up only one drive’s worth of space to parity (the parity itself is distributed across all drives). Weigh the costs of redundancy versus usable capacity.

Overview of RAID Levels

Now that we’ve covered some key considerations, let’s take a look at the most common RAID levels and their differences:

RAID 0

  • Description: Disk striping without parity or mirroring.
  • Pros: High performance, full capacity utilization.
  • Cons: No fault tolerance, high risk of data loss.
  • Use cases: Temporary storage, non-critical data.

RAID 1

  • Description: Disk mirroring.
  • Pros: High performance reads, simple redundancy.
  • Cons: 50% storage efficiency, no performance gain on writes.
  • Use cases: Small servers, transactional databases.

RAID 5

  • Description: Striping with distributed parity.
  • Pros: Good performance, efficient use of capacity.
  • Cons: Slower writes, long rebuild times.
  • Use cases: File servers, web servers, media streaming.

RAID 6

  • Description: Striping with double distributed parity.
  • Pros: Protects against dual drive failures.
  • Cons: Added overhead reduces write performance.
  • Use cases: Mission critical storage, large arrays.

RAID 10

  • Description: Stripe of mirrors.
  • Pros: High performance, can survive multiple drive failures as long as no mirror pair loses both drives.
  • Cons: 50% storage efficiency.
  • Use cases: Database servers, transactional systems.

RAID Capacity Calculations

To determine the total usable capacity provided by different RAID configurations, you can use these basic formulas:

  • RAID 0 capacity = Sum of all drive capacities (N * Smallest drive capacity on implementations that stripe evenly across mismatched drives)
  • RAID 1 capacity = Capacity of the smallest drive
  • RAID 5 capacity = (N – 1) * Smallest drive capacity
  • RAID 6 capacity = (N – 2) * Smallest drive capacity
  • RAID 10 capacity = (N/2) * Smallest drive capacity

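Where N is the total number of drives in the array. RAID 5 needs at least three drives, RAID 6 at least four, and RAID 10 an even number of at least four.

For example, a RAID 10 array with four 2TB drives would provide 4TB of usable capacity: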

  • N = 4 drives
  • Smallest drive capacity = 2TB
  • RAID 10 capacity = (N/2) * 2TB = (4/2) * 2TB = 4TB
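
The formulas above translate directly into a small helper. This is a minimal sketch that assumes all drives in the array are the same size; the function name and the minimum drive counts it enforces are illustrative choices, not part of any RAID tool.

```python
# Usual minimum drive counts per level (assumed for this sketch).
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_tb(level, n, drive_tb):
    """Usable capacity in TB for n identical drives of drive_tb each."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if n < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "10" and n % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    data_drives = {"0": n, "1": 1, "5": n - 1, "6": n - 2, "10": n // 2}[level]
    return data_drives * drive_tb


for level in ("0", "5", "6", "10"):
    print(f"RAID {level}: {usable_tb(level, 4, 2)} TB usable from four 2TB drives")
```

Running the same four 2TB drives through each level makes the trade-off visible: 8TB for RAID 0, 6TB for RAID 5, and 4TB for both RAID 6 and RAID 10.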

When to Use Each RAID Level

Now that we’ve covered the basics of how each RAID level works, here are some general guidelines on which ones are best suited for different use cases:

RAID 0

RAID 0 is ideal for applications where high performance is critical but redundancy is not. Examples include:

  • Video editing workstations
  • Scratch disks
  • Game recordings

RAID 1

RAID 1 provides simple redundancy for cases where uptime and reliability are important but the storage system is small. It is a good fit for:

  • Operating system drives
  • Small business servers
  • Transactional databases

RAID 5

RAID 5 offers a balance of performance, capacity, and redundancy. It works well for a wide range of use cases including:

  • Application and web servers
  • Medium sized databases
  • File and media storage servers
  • Backup systems

RAID 6

RAID 6 provides the strongest protection of the parity levels covered here, tolerating any two drive failures. It suits large, critical storage systems where downtime is unacceptable, such as:

  • Large databases
  • Enterprise file servers
  • Cloud storage systems
  • Data warehousing

RAID 10

RAID 10 balances performance and redundancy for critical applications. It’s ideal for cases like:

  • High performance databases
  • I/O intensive web servers
  • Transactional systems
  • Virtualization

Software vs Hardware RAID

RAID can be implemented in software or hardware. Software RAID uses the operating system and CPU resources to manage the array. Hardware RAID uses a dedicated RAID controller card with its own processor and memory.

Software RAID offers these advantages:

  • Lower cost since no special controller is needed
  • Easier to manage and configure through software
  • Can be used with any drive interface like SATA or SAS

Hardware RAID provides these benefits:

  • Offloads RAID processing from the host CPU and memory
  • Provides caching to boost performance
  • More reliable with battery-backed cache protection
  • Well suited to high-end arrays (12Gbps SAS, etc.)

For most small servers, workstations, and NAS devices, software RAID is preferred. Hardware RAID is best for mission critical storage that demands the highest performance and reliability.

Choosing Drives for a RAID Array

When building a RAID array, the type, size, speed and interface of the drives impact performance and how much usable capacity is available. Here are some tips for drive selection:

  • Use enterprise class drives for RAID configurations rather than consumer drives. They have better reliability and performance.
  • Choose drives with capacities matched to your storage needs so capacity isn’t wasted.
  • Favor higher RPM drives like 10K or 15K for improved IOPS.
  • Use drives with SAS or NVMe interfaces for best performance.
  • SSDs provide huge speed benefits but have higher costs per gigabyte.
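  • Avoid mixing drive types or capacities in one array. Each member is typically treated as being only as large as the smallest drive, and the array tends to run at the pace of the slowest one.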

Standardizing on matching high quality drives designed for RAID is the safest approach. Consumer drives can be used for certain RAID levels like RAID 0 if uptime isn’t critical.

Setting Up and Managing a RAID Array

Once you’ve selected the appropriate RAID level and drives, the array needs to be configured and initialized. This process varies between software and hardware RAID.

For software RAID, most operating systems provide built-in utilities for creating and managing the arrays. For example:

  • Windows has the Disk Management utility
  • Linux distributions include mdadm
  • macOS has Disk Utility

These tools allow specifying the RAID parameters like level, stripe size, and which disks to include. The OS then handles the underlying management of the array.
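
As an example of what specifying those parameters looks like on Linux, the sketch below assembles an mdadm create command as an argument list without running it. The `--create`, `--level`, `--raid-devices`, and `--chunk` options are standard mdadm options; the helper name and the device paths (`/dev/md0`, `/dev/sdb`, and so on) are placeholders. Creating an array destroys existing data on the member disks, so review the command carefully before executing anything like it.

```python
def mdadm_create_command(md_device, level, member_devices, chunk_kib=None):
    """Build (but do not run) an mdadm command that creates a new array."""
    if not member_devices:
        raise ValueError("at least one member device is required")
    args = [
        "mdadm", "--create", md_device,
        f"--level={level}",
        f"--raid-devices={len(member_devices)}",
    ]
    if chunk_kib is not None:
        args.append(f"--chunk={chunk_kib}")    # chunk (stripe unit) size in KiB
    args.extend(member_devices)
    return args


# Placeholder device names; substitute the real ones for your system.
cmd = mdadm_create_command("/dev/md0", 5, ["/dev/sdb", "/dev/sdc", "/dev/sdd"], chunk_kib=512)
print(" ".join(cmd))
```

Returning a list rather than a string keeps the arguments safe to pass to `subprocess.run` later without shell quoting issues.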

Hardware RAID requires configuring options on the RAID controller’s management interface. This is done through a BIOS menu, dedicated software, or command line tools depending on the controller. The controller then transparently handles the RAID operations.

No matter which control method is used, key tasks include:

  • Initializing new arrays
  • Monitoring drive health
  • Rebuilding the array after a failed drive is replaced
  • Expanding capacity
  • Changing RAID levels if needed

Maintenance and monitoring should be performed proactively so that failing drives are identified and replaced before a further failure causes data loss. Most RAID controllers and software provide alerts for drive failures and predictive warnings.
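
On Linux software RAID, one simple way to monitor array health is to read `/proc/mdstat`, where each array reports a status such as `[2/2] [UU]` (expected members / active members, with `_` marking a missing one). The sketch below parses text in that format and lists degraded arrays. The sample text is made up for illustration, and the parser is deliberately minimal; it only recognizes array names of the form `md<number>`.

```python
import re

SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sdd1[1] sdc1[0]
      1953260544 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Map each degraded array name to its member status string."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        name = re.match(r"^(md\d+) :", line)
        if name:
            current = name.group(1)
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded[current] = status.group(3)
            current = None
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real system the same function could be fed the contents of `/proc/mdstat` from a scheduled job that sends an alert when the result is not empty.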

Caching and Battery Backup

RAID performance can be enhanced by using caching mechanisms. This is often provided on hardware RAID controllers which have dedicated cache memory and battery backup units (BBUs).

The benefits of RAID caching include:

  • Faster write speeds by temporarily storing data in cache before writing to disk.
  • Increased read performance by accessing frequently used data in cache.
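  • Improved reliability, because a battery-backed cache preserves unwritten data through a power loss so it can be written to disk afterwards.

A BBU keeps the controller’s cache memory powered when external power is lost. Writes that were acknowledged but had not yet reached the drives are preserved and can be flushed to disk once power returns. This prevents potential data loss or corruption.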
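
The write-back behavior described above can be sketched in a few lines. This toy model uses a dictionary to stand in for the drives and another for the cache; the class and method names are made up for illustration and do not correspond to any controller firmware.

```python
class WriteBackCache:
    """Toy write-back cache: writes are acknowledged before reaching disk."""

    def __init__(self, disk):
        self.disk = disk      # dict standing in for the drives
        self.dirty = {}       # blocks acknowledged but not yet on disk

    def write(self, block, data):
        # Fast path: store in cache and return immediately.
        self.dirty[block] = data

    def read(self, block):
        # Serve the newest copy, which may exist only in cache.
        if block in self.dirty:
            return self.dirty[block]
        return self.disk.get(block)

    def flush(self):
        # Slow path: push every dirty block to the drives.
        self.disk.update(self.dirty)
        self.dirty.clear()


disk = {}
cache = WriteBackCache(disk)
cache.write(7, b"payload")
print(7 in disk)       # the write has been acknowledged but is not on disk yet
cache.flush()
print(disk[7])
```

The window between `write` and `flush` is exactly where a power failure would lose data if the cache contents were not protected.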

Software RAID can also use system memory for caching. However, this lacks the data protection of battery-backed hardware cache if power fails.

Overall, RAID caching delivers a significant boost in array performance, and battery backup protects the writes still held in cache. Caching does not add drive redundancy, but it is a recommended feature for mission critical hardware RAID implementations.

Nested RAID Levels

In some large scale deployments, multiple RAID levels can be combined together in a nested configuration. For example:

  • RAID 10 arrays can be striped together in a RAID 0 setup (RAID 100).
  • RAID 0 arrays can be mirrored using RAID 1 (RAID 01).
  • RAID 6 arrays can be striped together using RAID 0 (RAID 60).

This allows blending performance, capacity, and redundancy attributes across very large arrays. However, it also adds significant complexity to management. Nested RAID is only recommended for storage experts managing enterprise infrastructures. For most use cases, a single RAID level is sufficient.
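
Capacity for a nested layout follows from applying the single-level rules one layer at a time. As a sketch, consider RAID 60, a commonly used nested layout in which several RAID 6 groups are striped together with RAID 0. The function names and the example group sizes below are assumptions for illustration, and all drives are assumed to be the same size.

```python
def raid6_usable_tb(drives, drive_tb):
    """Usable capacity of one RAID 6 group of identical drives."""
    if drives < 4:
        raise ValueError("RAID 6 needs at least four drives")
    return (drives - 2) * drive_tb


def raid60_usable_tb(groups, drives_per_group, drive_tb):
    """Usable capacity of RAID 6 groups striped together with RAID 0."""
    if groups < 2:
        raise ValueError("RAID 60 needs at least two RAID 6 groups")
    return groups * raid6_usable_tb(drives_per_group, drive_tb)


# Two groups of six 4TB drives: each group yields 16TB, striped to 32TB.
print(raid60_usable_tb(2, 6, 4))
```

Each RAID 6 group here can lose two drives independently, which is the main reason for accepting the extra management complexity.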

Conclusion

Configuring the optimal RAID setup requires understanding the performance, reliability, and capacity needs of your specific use case. RAID level numbers are labels rather than a ranking: RAID 0 maximizes performance and capacity utilization, while RAID 1, 5, 6, and 10 give up some capacity or write speed in exchange for fault tolerance.

Software RAID provides a cost-effective solution that leverages the host system’s processing power. Hardware RAID delivers benefits through dedicated caching, processors, and battery backups. Carefully evaluating available RAID options and selecting quality enterprise-class drives enables building arrays tailored to the workload and performance characteristics required.

With a strong grasp of the core RAID concepts and levels, you can choose a RAID solution that balances performance, capacity, and protection against drive failures for your storage needs. Taking the time to understand RAID technology prepares you to deploy storage infrastructure with the speed, reliability, and resilience that modern applications demand.