Why is it called a RAID?

What is RAID?

RAID stands for Redundant Array of Independent Disks. It is a data storage technology that combines multiple disk drive components into a logical unit. Data is distributed across the drives in one of several ways called RAID levels, depending on what level of redundancy and performance is required.

The different RAID levels provide different combinations of increased data reliability and increased input/output performance. RAID technology allows data to be accessed simultaneously from multiple disks, improving performance. In levels that include redundancy, if one disk fails, the RAID system can rebuild the data from the remaining disks.

History of RAID

The term “RAID” was first defined in a 1987 paper titled “A Case for Redundant Arrays of Inexpensive Disks (RAID)” by researchers David Patterson, Garth Gibson, and Randy Katz at the University of California, Berkeley. This seminal paper outlined the fundamental RAID concepts that are still in use today.

Prior to Patterson et al.’s paper, disk arrays had been built by industry leaders like Tandem Computers to improve performance. However, these systems were proprietary, costly, and built on expensive, reliable disks. The RAID paper proposed using an array of inexpensive commodity disks to both improve performance and reliability. This novel approach helped popularize the concept of RAID.

The wording of the acronym has shifted over time. The original paper spelled it out as “Redundant Arrays of Inexpensive Disks,” emphasizing inexpensive commodity disks as the key innovation. Industry vendors later substituted “Independent” for “Inexpensive,” and that is the expansion most commonly used today.

RAID Levels

There are several standard RAID levels, each with specific data distribution and redundancy characteristics:

RAID 0

RAID 0 provides striping, which spreads data evenly across multiple disks in the array, but without parity or mirroring. This improves performance but provides no fault tolerance. If any disk fails, the whole array fails.
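The striping idea can be sketched as a simple round-robin mapping. This is a minimal illustration, assuming one block per stripe unit; the function name `locate_block` is hypothetical, and real implementations use configurable stripe sizes.

```python
def locate_block(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 2:
        raise ValueError("striping needs at least two disks")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    disk = logical_block % num_disks      # round-robin across the disks
    offset = logical_block // num_disks   # which stripe (row) on that disk
    return disk, offset


# Consecutive logical blocks land on different disks,
# which is what lets the disks work in parallel.
layout = [locate_block(b, num_disks=3) for b in range(6)]
print(layout)
```

Because every disk holds a slice of every large file, losing any one disk leaves gaps throughout the data, which is why the whole array fails.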

RAID 1

RAID 1 provides mirroring by duplicating all data from one drive to a second drive. This means you effectively halve your overall storage capacity in return for fault tolerance. If one drive fails, the system can instantly switch to the second mirrored drive without any data loss.
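As a rough illustration of mirroring, the toy class below keeps two in-memory "drives" and writes every block to both. The class name `Mirror` and its methods are hypothetical; a real implementation works on block devices and also handles resynchronizing a replaced drive.

```python
class Mirror:
    """Toy two-way mirror backed by two in-memory dictionaries."""

    def __init__(self):
        self.drives = [{}, {}]
        self.failed = [False, False]

    def write(self, block, data):
        # Every write goes to each healthy drive.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block] = data

    def read(self, block):
        # Serve the read from the first healthy drive.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                return drive[block]
        raise IOError("both drives have failed")


mirror = Mirror()
mirror.write(0, b"payroll")
mirror.failed[0] = True           # simulate the first drive failing
recovered = mirror.read(0)        # still served by the second drive
print(recovered)
```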

RAID 5

RAID 5 stripes data and parity information across the disks. If any single disk fails, the missing data can be calculated from the remaining data and parity. RAID 5 requires at least three disks.
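The recovery step relies on XOR parity. The sketch below shows it for a single stripe, assuming equal-length blocks; the helper name `xor_blocks` is hypothetical, and the rotation of parity across disks that RAID 5 performs is left out for brevity.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block.
data = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data)

# Simulate losing the disk that held the second block, then rebuild it
# by XOR-ing the surviving data blocks with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

This works because XOR-ing a value with itself cancels out, so combining the parity with every surviving block leaves only the missing one.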

RAID 6

RAID 6 is similar to RAID 5, but can withstand the failure of two disks by using a second independent distributed parity scheme. Rebuilding the array is more complex than RAID 5.

RAID 10

RAID 10 provides both striping and mirroring by creating a striped set from mirrored subsets. It requires at least four disks. RAID 10 provides high performance and fault tolerance but at a high cost.
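A striped set of mirrored pairs can be sketched as follows. This assumes two-way mirrors, an even number of disks, and one block per stripe unit; the name `raid10_targets` is hypothetical.

```python
def raid10_targets(logical_block, num_disks):
    """Return ((disk_a, disk_b), offset) for a logical block.

    Disks are grouped into mirrored pairs (0,1), (2,3), ...;
    blocks are striped round-robin across the pairs.
    """
    if num_disks < 4 or num_disks % 2 != 0:
        raise ValueError("this sketch needs an even number of disks, at least four")
    pairs = num_disks // 2
    pair = logical_block % pairs        # which mirrored pair receives the block
    offset = logical_block // pairs     # position within that pair
    return (2 * pair, 2 * pair + 1), offset


for block in range(4):
    print(block, raid10_targets(block, num_disks=4))
```

Each block is written to both disks of one pair, so the array survives a disk failure as long as the failed disk's partner is still healthy.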

Why is it Called “RAID”?

Now that we’ve covered the basics of what RAID is, let’s discuss how it got its name.

As the title of the 1987 paper shows, the “I” in the acronym originally stood for “Inexpensive.” The researchers wanted to highlight inexpensive disks. The now-standard “Independent” came into use afterward, as vendors adopted the technology and preferred a term that did not suggest cheapness.

Beyond the acronym, RAID is also a fitting name as a mnemonic, because it calls to mind a military raid: an organized operation that relies on coordinated teamwork to be robust and resilient.

A RAID system uses an array of disks in a carefully structured organization to deliver improved performance, capacity, and reliability. The redundancy provided by techniques like parity and mirroring gives error and fault tolerance. And the parallelism of spreading data across multiple disks gives speed.

Just like a military raid relies on teamwork, planning, and redundancy to achieve goals a single soldier could not, a RAID uses carefully coordinated disks to provide benefits over a single disk.

So while the literal expansion of the RAID acronym downplays some of the technology’s advantages, the evocative term RAID is still an appropriate shorthand for a set of techniques and benefits related to strength in numbers.

Key Benefits of RAID

Now that we’ve covered what RAID is and where the term comes from, let’s summarize some of the key benefits this technology delivers:

Increased Storage Capacity

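Combining multiple disks in an array allows for much larger overall storage capacity than would be possible with a single disk. How much usable capacity each added drive contributes depends on the RAID level, since mirroring and parity reserve part of the raw space for redundancy.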
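The relationship between raw and usable capacity can be estimated with a small helper. This is a sketch assuming equal-size disks, two-way mirrors for RAID 10, and RAID 1 treated as every disk mirroring the same data; the names `MIN_DISKS` and `usable_capacity` are hypothetical, and the four-disk minimum for RAID 6 is the commonly cited figure rather than one given in this article.

```python
# Minimum number of disks assumed for each level in this sketch.
MIN_DISKS = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_capacity(level, num_disks, disk_size_tb):
    """Estimate usable capacity in TB for an array of equal-size disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == "0":
        return num_disks * disk_size_tb          # no redundancy
    if level == "1":
        return disk_size_tb                      # every disk holds the same data
    if level == "5":
        return (num_disks - 1) * disk_size_tb    # one disk's worth of parity
    if level == "6":
        return (num_disks - 2) * disk_size_tb    # two disks' worth of parity
    # RAID 10: half the disks hold mirror copies
    if num_disks % 2 != 0:
        raise ValueError("RAID 10 with two-way mirrors needs an even disk count")
    return (num_disks // 2) * disk_size_tb


for level in ("0", "1", "5", "6", "10"):
    print(f"RAID {level}: {usable_capacity(level, 4, 2.0)} TB usable from 4 x 2 TB")
```

Under these assumptions, four 2 TB disks yield 8 TB in RAID 0 but 6 TB in RAID 5, which shows the capacity cost of parity.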

Improved Performance

By splitting and distributing data across multiple disks, RAID can support faster read and write speeds. Parallel disk I/O improves access efficiency.

Redundancy and Fault Tolerance

Disk failure is a real risk for single drives. But RAID provides redundancy, so if a single disk fails, the system can recover the lost data from parity or mirrors. This fault tolerance is essential for reliable data storage.

Efficiency

RAID allows businesses to achieve benefits like speed, capacity, and reliability by using arrays of inexpensive commodity disks. This provides advantages over expensive single disks.

RAID Usage Scenarios

Let’s look at some of the typical scenarios where RAID delivers significant value:

Business-Critical Systems

Major enterprise systems that require high availability often leverage RAID data protection. The redundancy makes uninterrupted operation and fault tolerance possible.

Servers

Server systems frequently use RAID to achieve increased storage space, faster data access, and resilient operation. Both physical servers and virtualized servers can take advantage of RAID.

High Performance Computing

Research, science, and other technical computing fields require fast I/O for large datasets. Parallel RAID storage helps meet these performance demands.

Workstations

Graphic design, video editing, and other creative workstation systems benefit from RAID’s speed and capacity. Local data access is faster on an array than on a single standalone disk.

Backup

RAID alone is not a backup solution. But it provides an extra layer of protection against disk failure, which helps guard against data loss between backups and reduces how often a restore from backup is needed.

Software vs Hardware RAID

There are two main ways to implement RAID:

Software RAID

With software RAID, the core logic that manages the RAID set and computes parity/mirroring is handled by the operating system or a software driver. This allows RAID to be set up on commodity hardware. Software RAID provides flexibility but consumes CPU resources.

Hardware RAID

A hardware RAID solution handles the RAID logic in a dedicated controller. This adds cost but offloads the RAID processing overhead from the main CPU. Hardware RAID typically offers more sophisticated caching and other performance advantages.

RAID Controller Implementations

RAID functionality is made possible thanks to RAID controllers. These controller devices manage the distribution of data across the set of disks:

RAID Cards

Early RAID implementations were served by dedicated RAID controller cards installed in a PC or server. These intelligent I/O cards offload RAID logic from the main CPU.

Motherboard RAID

Many modern motherboards include built-in RAID support, often referred to as RAID-On-Chip (ROC). This allows single-board RAID without a discrete card.

Host Bus Adapters (HBAs)

For larger external RAID setups, Host Bus Adapters allow connecting multiple external drives to a server. HBAs provide the I/O connections, and RAID-capable models add RAID processing power.

RAID Enclosures

Complete external RAID systems come packaged together in dedicated RAID enclosures, which contain both the drives and RAID processor. This simplifies major storage expansion.

Controller Type  | Pros                                     | Cons
RAID Cards       | Dedicated card optimizes RAID processing | Added cost over motherboard RAID
Motherboard RAID | Convenient and affordable onboard RAID   | Shared with other system resources
HBAs             | Facilitates large external arrays        | Can be expensive
RAID Enclosures  | Complete plug-and-play storage solution  | Less flexible than roll-your-own options

This table summarizes some of the key pros and cons for each type of RAID controller implementation. The right choice depends on budget, performance needs, and scale requirements.

Major RAID Vendors

Many technology vendors provide RAID products and solutions:

Dell

Dell EMC, one of the largest IT companies, offers a wide selection of servers and storage arrays featuring RAID support. Both hardware and software RAID options are available.

HP

Hewlett Packard Enterprise is another major IT vendor that produces servers and disk enclosures with support for hardware RAID cards and controllers.

Lenovo

Lenovo’s System x servers can be configured with different RAID cards from Lenovo and other vendors to enable RAID functionality.

Supermicro

Supermicro is a more niche provider that offers motherboards, servers, and RAID enclosure products that support both hardware and software RAID implementations.

QSAN

QSAN is one example of a dedicated storage vendor, focused specifically on RAID systems and enclosures for SMBs and enterprises.

Choosing a RAID Level

Selecting the right RAID level involves balancing performance vs redundancy requirements:

RAID 0 for pure performance

When speed is the top priority and redundancy is less important, RAID 0 striping delivers maximum throughput.

RAID 1 for crucial data

If protecting mission-critical data is the primary goal, RAID 1 mirroring provides the assurance of redundancy.

RAID 5 for balance

With modest redundancy plus solid performance gains, RAID 5 offers a versatile option for many workloads.

RAID 10 when budget allows

Top-tier speed and redundancy are possible with RAID 10, though the cost may only make sense for high-value use cases.

Understanding the strengths of each RAID level helps match the right solution to your specific needs.
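The guidance above can be condensed into a toy decision helper. This is only a sketch of the rules of thumb in this section, not a sizing tool; the function name `suggest_raid_level` and its three yes/no parameters are hypothetical simplifications.

```python
def suggest_raid_level(needs_redundancy, needs_speed, budget_allows_extra_disks):
    """Return a RAID level suggestion based on the rules of thumb above."""
    if not needs_redundancy:
        return "RAID 0"    # pure performance, no fault tolerance
    if needs_speed and budget_allows_extra_disks:
        return "RAID 10"   # speed and redundancy, at the highest disk cost
    if needs_speed:
        return "RAID 5"    # balance of redundancy and performance
    return "RAID 1"        # simple mirroring for crucial data


print(suggest_raid_level(needs_redundancy=True, needs_speed=True,
                         budget_allows_extra_disks=False))
```

Real decisions also weigh factors this sketch ignores, such as rebuild time, workload mix, and the number of disks available.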

Conclusion

While the literal expansion of the RAID acronym may seem deceptive, RAID has earned its name by providing coordinated redundancy and parallelism that improves on standalone disk solutions.

The core RAID concepts first outlined in 1987 are still relevant today for delivering high capacity, performance, reliability, and economy across a wide range of usage scenarios.

Careful planning and selection of optimal RAID levels allow IT professionals, businesses, and other organizations to reap these benefits in their data storage environments. RAID continues to justify its place as a foundational technology for modern computing.