What is RAID?
RAID stands for Redundant Array of Independent Disks. It is a data storage technology that combines multiple disk drive components into a logical unit. Data is distributed across the drives in one of several ways called “RAID levels”, depending on the required level of redundancy and performance.
The different RAID levels include:
- RAID 0: Stripes data across drives for performance, but offers no redundancy.
- RAID 1: Mirrors data across drives for redundancy.
- RAID 5: Stripes data across drives with distributed parity for redundancy.
- RAID 6: Stripes data across drives with dual distributed parity for high redundancy.
- RAID 10: Stripes data across mirrored pairs of drives for both performance and redundancy.
The main goals of RAID are to provide increased data reliability through redundancy, improve performance, or both. However, RAID is not a backup solution: it does not protect against file deletion or against more simultaneous drive failures than the chosen RAID level can tolerate.
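The levels listed above also differ in how much raw disk space they turn into usable space. The following is a minimal sketch of that arithmetic for identical drives; the function name `usable_capacity` and the `MIN_DRIVES` table are illustrative and not part of any RAID tool.

```python
# Minimum drive counts commonly required for each level.
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_capacity(level: str, drives: int, size_tb: float) -> float:
    """Usable capacity in TB for `drives` identical drives of `size_tb` each."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        return drives * size_tb          # pure striping, no redundancy
    if level == "1":
        return size_tb                   # every drive holds the same copy
    if level == "5":
        return (drives - 1) * size_tb    # one drive's worth of parity
    if level == "6":
        return (drives - 2) * size_tb    # two drives' worth of parity
    # RAID 10: mirrored pairs, so half of the raw space is usable
    if drives % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    return (drives // 2) * size_tb


for level in MIN_DRIVES:
    print(f"RAID {level}: {usable_capacity(level, 4, 2)} TB usable from 4 x 2 TB")
```

The sketch assumes all drives are the same size; mixed sizes are covered in the sections that follow.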
Benefits of RAID
RAID offers several key benefits that make it a popular data storage solution:
Redundancy – By spreading data across multiple disks, RAID provides fault tolerance if a drive fails. The failed drive can be replaced without losing data. For example, in RAID 1, data is mirrored between two disks. If one fails, the other still has a complete copy of the data.
Improved performance – RAID can increase speed by distributing data across multiple disks, allowing read/write operations to occur in parallel. For instance, RAID 0 stripes data across disks for faster access but provides no redundancy.
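Increased storage capacity – RAID presents several drives as one larger volume. With equal-sized drives, RAID 0 offers the sum of all drives, RAID 5 offers the sum minus one drive's worth of parity, and RAID 10 offers half of the raw total. A RAID setup can scale storage by adding more disks to the array.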
Challenges of mixing drive sizes in RAID
Mixing drive sizes in a RAID array raises some challenges. One is the impact on performance: an array is generally limited by its slowest member. If you mix a large but slower mechanical hard drive with a smaller yet faster SSD, the full performance benefit of the SSD will not be realized (Source). The slower hard drive bottlenecks the entire array.
Another issue with mixed drive sizes is unused storage space. For example, in a single RAID 1 array built from two 2TB drives and two 1TB drives, only 1TB of each larger drive can be used. The remaining 1TB on each larger drive goes unused because the space used per drive is limited by the smallest drive (Source). So mixing drive sizes can result in wasted capacity.
RAID levels that support mixed drive sizes
There are several RAID levels that allow you to combine drives of different sizes in a single array:
RAID 0
RAID 0, also known as disk striping, spreads data evenly across all drives in the array. Drives of varying capacities can be combined, but in a conventional stripe each drive contributes only as much space as the smallest drive; some software implementations, such as Linux md, can also use the extra space on larger drives by striping it across fewer disks. RAID 0 provides no fault tolerance, since data loss will occur if any drive fails (Choosing the Right RAID Configurations).
RAID 10
RAID 10 requires a minimum of four drives and mirrors pairs of drives while also striping data across the drive pairs. This provides fault tolerance through mirroring along with the ability to combine different drive sizes. However, maximum array capacity in RAID 10 is limited to the size of the smallest drive times the number of drive pairs (How to RAID HDDs with different sizes?).
RAID 50
RAID 50 stripes data across multiple RAID 5 drive groups, allowing each RAID 5 group to contain drives of different sizes. This provides fault tolerance through parity while also combining capacity across different drive sizes. However, the size of each RAID 5 group is limited by the size of the smallest drive in that group (Can I set up a RAID 5 with a bunch of drives of different sizes).
RAID 60
Similar to RAID 50, RAID 60 stripes data across RAID 6 groups, providing double distributed parity. This allows combining drives of varying sizes while protecting against up to two drive failures per RAID 6 group. Total capacity remains dependent on the size of the smallest drive in each group.
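The per-group limit in nested levels can be sketched as follows. The function names are hypothetical, and the top-level total assumes the outer stripe truncates every group to the smallest group, which is the conservative case; some implementations may expose more.

```python
def nested_group_capacities(groups, parity_drives):
    """Usable TB of each parity group: (members - parity) * smallest member.

    Use parity_drives=1 for RAID 50 groups and parity_drives=2 for RAID 60.
    """
    capacities = []
    for sizes in groups:
        if len(sizes) < parity_drives + 2:
            raise ValueError("group has too few drives for this parity level")
        capacities.append((len(sizes) - parity_drives) * min(sizes))
    return capacities


def striped_total(group_capacities):
    """Assumption: the outer RAID 0 uses only as much of each group as the smallest group offers."""
    return len(group_capacities) * min(group_capacities)


# RAID 50 from one group of three 4 TB drives and one mixed group (2, 2, 4 TB).
raid50 = nested_group_capacities([[4, 4, 4], [2, 2, 4]], parity_drives=1)
print(raid50, striped_total(raid50))
```

In the mixed group, the 4TB drive is treated as a 2TB drive, so that group contributes half as much as the matched one.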
Best Practices for Drive Size Configuration
When configuring a RAID array with mixed drive sizes, it is generally recommended to group drives by capacity so that each group is built from equally sized drives. This allows the full capacity of larger drives to be utilized while smaller drives still contribute to the overall storage. For example, if you have two 4TB drives and two 2TB drives, the best practice is to create two mirrored (RAID 1) groups – one with both 4TB drives and one with both 2TB drives. The 4TB group would have 4TB of usable space and the 2TB group would have 2TB, for a total of 6TB.
Keeping the smaller drives in their own RAID group means every member of a group holds the same amount of data, which avoids the bottlenecks and idle space that mismatched members cause. The larger drives are likewise protected within their own group. Wasted storage space is minimized since each group consists of equally sized drives used to full capacity.
Overall, grouping drives by size allows you to optimize performance, capacity, and reliability when working with a mixed RAID setup. Aligning to the largest drive size ensures you utilize the full potential of your largest drives. Smaller drives can still contribute to the array when grouped appropriately.
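As a rough illustration of grouping by size, the sketch below pairs equal-sized drives into two-drive mirrors and compares the result with a single mirrored array that truncates every drive to the smallest one. All function names are illustrative.

```python
from collections import Counter


def pair_by_size(sizes_tb):
    """Group drives into equal-sized two-drive mirrors; return (pairs, leftovers)."""
    pairs, leftovers = [], []
    for size, count in sorted(Counter(sizes_tb).items(), reverse=True):
        pairs.extend([(size, size)] * (count // 2))
        leftovers.extend([size] * (count % 2))
    return pairs, leftovers


def grouped_mirror_capacity(sizes_tb):
    """Usable TB when each equal-sized pair is its own mirror."""
    pairs, _ = pair_by_size(sizes_tb)
    return sum(size for size, _ in pairs)


def single_array_mirror_capacity(sizes_tb):
    """Usable TB when all drives share one mirrored array limited by the smallest drive."""
    return (len(sizes_tb) // 2) * min(sizes_tb)


drives = [4, 2, 4, 2]
print(pair_by_size(drives))
print(grouped_mirror_capacity(drives), "TB grouped vs",
      single_array_mirror_capacity(drives), "TB in one array")
```

For two 4TB and two 2TB drives, grouping yields 6TB usable, while a single array limited by the 2TB drives yields 4TB.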
As noted in [1], “The best practice with mixed drive sizes is to group drives of like capacity together in their own subarrays wherever possible.” This maximizes performance and available storage space.
[1] https://forums.freebsd.org/threads/mixed-hdds-brands-in-raid.75546/
Performance Impact
Mixing drive sizes in RAID can have varying effects on performance, particularly throughput and read/write speeds. In general, RAID arrays will perform at the speed of the slowest drive. For example, mixing 5400 RPM and 7200 RPM drives in the same RAID array will cause the array to run at 5400 RPM drive speeds. This is because the RAID controller has to sync operations across all the drives, so it is limited by the slowest component.
Similarly, mixing SSDs and HDDs will result in SSD performance throttling down to HDD levels. The SSDs have to wait for the mechanical HDDs during read/write operations, reducing the overall throughput.
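The slowest-drive effect can be approximated with an idealized model in which every drive handles an equal share of each stripe, so the slowest member sets the pace. This is a simplification that ignores caching, controller overhead, and parity work, and the throughput figures below are hypothetical round numbers rather than measurements.

```python
def striped_throughput(drive_mb_s):
    """Idealized sequential throughput (MB/s) of a stripe across the given drives.

    Each drive moves an equal share of every stripe, so the array can go
    no faster than (number of drives) * (slowest drive).
    """
    if not drive_mb_s:
        raise ValueError("at least one drive is required")
    return len(drive_mb_s) * min(drive_mb_s)


SSD, HDD = 500, 150  # hypothetical per-drive throughput in MB/s

print(striped_throughput([SSD, SSD]))  # two SSDs
print(striped_throughput([SSD, HDD]))  # SSD paired with an HDD
print(striped_throughput([HDD, HDD]))  # two HDDs
```

Under this model, the SSD paired with an HDD performs no better than a pair of HDDs.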
When it comes to rebuilding failed drives in a RAID array, mixed drive sizes can complicate the process. The RAID controller rebuilds onto the replacement drive only as much data as each member holds, which is set by the smallest drive. For example, replacing a failed 2TB drive with a 4TB drive in a mixed-size array will result in only 2TB being rebuilt on the new drive; the other 2TB stays unused.
In summary, mixing drive sizes will almost always result in lower performance compared to uniform sized drives. The RAID array will operate at the level of the lowest common denominator. Carefully consider whether the extra unused storage capacity is worth the tradeoff of reduced throughput.[1]
Effect on Fault Tolerance
The ability to recover data after a drive failure, known as fault tolerance, depends on both the RAID level and number of drives. According to Synology, SHR provides protection equivalent to RAID 1 when using just two disk drives. As more drives are added, the level of fault tolerance increases. However, mixing drive sizes can reduce fault tolerance in some scenarios.
For example, RAID 10 requires at least four drives and can tolerate the failure of up to half of them, provided no mirrored pair loses both of its drives. But with mixed drive sizes, storage capacity and performance are limited by the smallest drive. If that small drive fails, only that capacity can be rebuilt onto the replacement drive, and the extra space on the larger drives stays unused. Until the replacement has finished rebuilding, the affected pair has no redundancy, so a second failure in that pair loses the array's data.
Reddit users on r/synology note that upgrading drive sizes in SHR2 can increase overall capacity while maintaining two drive fault tolerance. However, the array is still vulnerable during the transition period when old and new drives are mixed.
In summary, mixing drive sizes can reduce fault tolerance for some RAID levels and requires careful planning to avoid scenarios that leave data vulnerable.
Wasted storage space
When combining drives of different sizes in RAID, there will often be unused storage capacity on the larger drives. This is because in most RAID configurations, the space used on each drive is limited to the size of the smallest drive (https://www.reddit.com/r/storage/comments/nto6t5/storage_spaces_wasted_capacity/). So if you have a 2TB drive and a 1TB drive in a RAID 1 array, the total capacity will be 1TB, with 1TB unused on the larger drive.
This wasted capacity grows with the number of drives in RAID levels like RAID 5 or RAID 6, which stripe data across all the drives. In these arrays, if you have four 2TB drives and one 1TB drive, each drive is limited to 1TB, leaving 4TB of the 9TB raw capacity unused. Usable capacity is then 4TB in RAID 5 or 3TB in RAID 6, compared with 8TB or 6TB from five matched 2TB drives. This highlights the importance of matching drive sizes as closely as possible in high capacity RAID configurations.
That said, a hot spare only needs to be as large as the space actually used on each array member, so it may sometimes make sense to save costs with a smaller “hot spare” drive while still gaining most of the advantage of the larger drives. But in general, mixing widely disparate drive sizes will lead to substantial wasted storage capacity that should be considered in the planning process.
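To make the wasted-capacity arithmetic concrete, here is a small sketch that splits the raw space of a mixed parity array into usable, parity, and unused portions. The function name and dictionary keys are illustrative, and the model assumes the common behavior of truncating every member to the smallest drive.

```python
def parity_array_breakdown(sizes_tb, parity_drives):
    """Split raw capacity (TB) into usable, parity, and unused space.

    Use parity_drives=1 for RAID 5 and parity_drives=2 for RAID 6.
    Every drive is assumed to be truncated to the smallest member.
    """
    count = len(sizes_tb)
    if count < parity_drives + 2:
        raise ValueError("too few drives for this parity level")
    smallest = min(sizes_tb)
    raw = sum(sizes_tb)
    return {
        "raw": raw,
        "usable": (count - parity_drives) * smallest,
        "parity": parity_drives * smallest,
        "unused": raw - count * smallest,
    }


mixed = [2, 2, 2, 2, 1]
print("RAID 5:", parity_array_breakdown(mixed, 1))
print("RAID 6:", parity_array_breakdown(mixed, 2))
```

The unused figure is the same for both levels because it depends only on how far each drive exceeds the smallest one.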
When mixed size RAID makes sense
Using drives of different sizes in RAID can make sense in certain situations, especially for budget builds or incremental storage growth. As this LTT forum discussion points out, mixing drive sizes allows you to take advantage of old and new drives when expanding storage.
For example, if you have an existing 2 TB hard drive from an old system, you can add that to a RAID 5 array with new 4 TB drives. While the total capacity will be limited by the smallest drive, this allows you to utilize the old drive rather than wasting it. The DataHoarder subreddit recommends this approach for budget-conscious builds.
Using mismatched drives in this way provides a cost-effective way to expand RAID arrays over time. Rather than replacing all drives, you can simply add the new, larger capacity drive to the array. This incremental growth allows you to upgrade storage as budget permits.
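The incremental approach can be sketched by tracking RAID 5 capacity as older drives are swapped for larger ones, one at a time. The helper names are hypothetical, and the sketch assumes the controller or software can expand the array once the smallest member has grown, which not every implementation supports.

```python
def raid5_capacity(sizes_tb):
    """Usable TB of a RAID 5 array limited by its smallest member."""
    if len(sizes_tb) < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (len(sizes_tb) - 1) * min(sizes_tb)


def upgrade_path(sizes_tb, new_size_tb):
    """Replace drives smaller than new_size_tb one by one, smallest first.

    Returns the usable capacity before any swap and after each swap.
    """
    sizes = sorted(sizes_tb)
    history = [raid5_capacity(sizes)]
    for i in range(len(sizes)):
        if sizes[i] >= new_size_tb:
            break  # remaining drives are already large enough
        sizes[i] = new_size_tb
        history.append(raid5_capacity(sizes))
    return history


print(upgrade_path([2, 4, 4, 4], 4))  # one old 2 TB drive among 4 TB drives
print(upgrade_path([2, 2, 2, 2], 4))  # upgrading a whole array drive by drive
```

In this model, capacity increases only after the last small drive has been replaced; until then the larger drives carry unused space.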
Alternatives to mixed size RAID
There are several alternatives to using mixed size drives in a RAID array that provide more flexibility and avoid some of the downsides. Two popular options are JBOD and drive pooling software.
JBOD (Just a Bunch of Disks) simply connects drives together without any parity or striping. JBOD makes it easy to mix drives of any size, but does not provide any redundancy. Each drive acts as an individual volume.
Drive pooling software like unRAID and OpenMediaVault creates a single storage pool from drives of mixed sizes. The pool is presented as one large volume. Data is written across the drives and protected with parity, allowing a drive failure without losing data. Pooling gives more flexibility for upgrading drives than traditional RAID.
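One way to see why pooling wastes less space is to model a pool with dedicated parity drives, where the largest drives hold parity and every other drive contributes its full size. This is a simplified sketch assuming that layout (parity drives at least as large as the largest data drive); the function name is hypothetical and not taken from any product.

```python
def parity_pool_capacity(sizes_tb, parity_drives=1):
    """Usable TB of a pool whose largest drives are dedicated to parity.

    Assumption: the remaining data drives each contribute their full size.
    """
    if len(sizes_tb) <= parity_drives:
        raise ValueError("need at least one data drive besides parity")
    ordered = sorted(sizes_tb, reverse=True)
    return sum(ordered[parity_drives:])


drives = [4, 4, 2, 2]
print(parity_pool_capacity(drives))                   # single parity
print(parity_pool_capacity(drives, parity_drives=2))  # dual parity
print((len(drives) - 1) * min(drives))                # conventional RAID 5 for comparison
```

With these four drives the single-parity pool offers 8TB, against 6TB for a RAID 5 array that truncates every drive to 2TB.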
For NAS and server environments, storage solutions like Synology Hybrid RAID (SHR) and Microsoft Storage Spaces allow creating pools from drives of different sizes while still providing RAID-like redundancy and performance.