Is Btrfs still unstable?

Btrfs (pronounced “butter-eff-ess”) is a copy-on-write filesystem for Linux that Chris Mason began developing in 2007. Its on-disk format has been considered stable since late 2013, but many users still perceive it as less reliable than more established filesystems like ext4.

The goal of this article is to examine the current stability of Btrfs in the latest Linux kernel (5.15 at time of writing) and determine whether the perception of instability still holds true today.

We’ll provide a brief history of Btrfs development, look at some of the key issues that gave it a reputation for instability, and explore benchmark comparisons and best practices for using this advanced Linux file system.

Technical Background

Btrfs is built upon several key technical features that set it apart from older filesystems like ext4:

Copy-on-write design: Btrfs uses a copy-on-write design where data blocks are never overwritten in place. All changes are stored in new locations, leaving the old data intact. This improves data integrity and enables advanced features like snapshots.

Checksums: Btrfs uses checksums to detect silent data corruption. By default, every data and metadata block has an associated checksum that is verified on each read, so a corrupted block is reported rather than silently returned to the application.

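Snapshots: Btrfs supports lightweight snapshots of subvolumes, including the top-level subvolume. Snapshots are writable by default and can optionally be created read-only. Because they use copy-on-write, they take almost no extra space initially and grow only as the original and the snapshot diverge. This enables easy backups and rollbacks.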

Subvolumes: Subvolumes in Btrfs are like lightweight filesystems within a filesystem. They have their own internal filesystem tree and are atomically snapshottable. Subvolumes enable new ways of organizing data.
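
To make the features above more concrete, here is a minimal Python sketch of a toy block pool with copy-on-write writes, per-block checksums, and snapshots. It is an illustration only, not how Btrfs is implemented: the `Pool` and `Subvolume` classes are hypothetical, `zlib.crc32` stands in for the crc32c checksum that Btrfs uses by default, and a real snapshot shares tree nodes rather than copying a flat mapping.

```python
import zlib


class Pool:
    """Toy block pool: append-only storage with a checksum per block."""

    def __init__(self):
        self.blocks = []  # list of (data, checksum) pairs

    def alloc(self, data: bytes) -> int:
        self.blocks.append((data, zlib.crc32(data)))
        return len(self.blocks) - 1

    def read(self, block_id: int) -> bytes:
        data, checksum = self.blocks[block_id]
        if zlib.crc32(data) != checksum:
            raise IOError(f"checksum mismatch in block {block_id}")
        return data


class Subvolume:
    """Maps logical block numbers to block ids in a shared pool."""

    def __init__(self, pool: Pool, mapping=None):
        self.pool = pool
        self.mapping = dict(mapping or {})

    def write(self, logical: int, data: bytes) -> None:
        # Copy-on-write: allocate a new block, never overwrite the old one.
        self.mapping[logical] = self.pool.alloc(data)

    def read(self, logical: int) -> bytes:
        return self.pool.read(self.mapping[logical])

    def snapshot(self) -> "Subvolume":
        # Only the mapping is copied; data blocks stay shared until modified.
        return Subvolume(self.pool, self.mapping)


pool = Pool()
root = Subvolume(pool)
root.write(0, b"hello")
root.write(1, b"world")

snap = root.snapshot()
blocks_after_snapshot = len(pool.blocks)  # the snapshot allocated no data blocks

root.write(0, b"HELLO")  # the snapshot still maps logical block 0 to b"hello"

# Simulate silent corruption of the block behind logical block 1.
_, checksum = pool.blocks[root.mapping[1]]
pool.blocks[root.mapping[1]] = (b"w0rld", checksum)
```

After this runs, `snap.read(0)` still returns the old contents while `root.read(0)` returns the new ones, and reading logical block 1 raises an error instead of returning the corrupted bytes.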

Early Issues

In the early days of Btrfs development, there were concerns about data corruption and instability. Some users reported filesystem corruption that led to data loss (cite: https://forum.manjaro.org/t/btrfs-corruption/133834). There were also bugs in checksum validation that could allow corruption to go undetected.

Additionally, Btrfs faced criticism over the RAID5 and RAID6 write hole. If a crash or power loss interrupts a write, the data and parity in a stripe can end up out of sync, and a later device failure may then reconstruct incorrect data (cite: https://community.netgear.com/t5/Using-your-ReadyNAS-in-Business/btrfs-errors-for-no-apparent-reason-on-RN212/m-p/2339084). This was a concern for users who wanted redundancy without the capacity overhead of RAID1 or RAID10.
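
The write hole is easier to see with a small example. The Python sketch below models a single stripe of two data blocks plus XOR parity. It is a simplified illustration of parity RAID in general, not of the Btrfs implementation, and the block contents are made up.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# A consistent stripe: two data blocks and their parity.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_blocks(d0, d1)

# If the device holding d1 fails now, d1 can be rebuilt from d0 and parity.
rebuilt_consistent = xor_blocks(d0, parity)

# Write hole: d0 is rewritten, but the system crashes before the
# matching parity update reaches disk.
d0 = b"CCCC"

# If the device holding d1 fails later, the rebuild uses stale parity
# and silently produces the wrong contents for d1.
rebuilt_after_crash = xor_blocks(d0, parity)
```

Here `rebuilt_consistent` equals the original `d1`, while `rebuilt_after_crash` does not, even though `d1` itself was never written to.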

Maturing Filesystem

Despite early concerns about stability, Btrfs has matured significantly in recent years and gained more widespread adoption. Facebook began deploying Btrfs on production servers after several of its core developers joined the company in late 2013,[1] and SUSE made it the default root filesystem in SUSE Linux Enterprise Server 12 in 2014. These endorsements from major industry players signaled increased confidence in Btrfs’ stability.

In the Linux community, Btrfs adoption appears to have climbed steadily over the past decade. An informal 2022 poll on Reddit showed 41% of respondents using Btrfs on at least one system, up from 19% two years prior.[2] Such polls are self-selected, but they suggest that Btrfs is gaining ground against ext4 as the filesystem of choice for Linux enthusiasts.

This growing adoption indicates Btrfs has largely overcome its reputation for instability and data corruption issues from early releases. Major Linux distributions now include Btrfs as an install-time option, providing official support and maintenance.

[1] https://www.phoronix.com/scan.php?page=news_item&px=MTM5OTU

[2] https://www.reddit.com/r/linuxquestions/comments/su4y9s/is_btrfs_really_not_stable/

Remaining Concerns

While Btrfs has matured significantly over the years, some concerns still remain around occasional data corruption and fragmentation issues. The Btrfs filesystem does require more active maintenance than traditional filesystems.

To guard against data corruption, Btrfs provides scrubbing, which detects bad blocks and can repair them. As the Btrfs documentation explains, “Scrubbing is the process of reading all data from a device to make sure it is consistent. In case any corruptions are detected, they are repaired automatically” [1]. Automatic repair only works when a second good copy exists, for example under the DUP or RAID1 profiles; otherwise scrub can only report the damage. Setting up regular scrubs and acting on the errors they report is essential.
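
As a rough illustration of how scrubbing relies on redundancy, the following Python sketch checks two mirrored copies of each block against a stored checksum and repairs a bad copy from the good one. The `scrub` function and the two-list mirror model are hypothetical simplifications, and `zlib.crc32` again stands in for the real Btrfs checksum.

```python
import zlib


def scrub(mirror_a: list, mirror_b: list, checksums: list) -> tuple:
    """Verify every block on both mirrors; repair a bad copy from a good one.

    Returns (repaired, unrecoverable) block counts.
    """
    repaired = 0
    unrecoverable = 0
    for i, expected in enumerate(checksums):
        ok_a = zlib.crc32(mirror_a[i]) == expected
        ok_b = zlib.crc32(mirror_b[i]) == expected
        if ok_a and not ok_b:
            mirror_b[i] = mirror_a[i]
            repaired += 1
        elif ok_b and not ok_a:
            mirror_a[i] = mirror_b[i]
            repaired += 1
        elif not ok_a and not ok_b:
            # No valid copy left: scrub can report the error but not fix it.
            unrecoverable += 1
    return repaired, unrecoverable


data = [b"block0", b"block1", b"block2"]
checksums = [zlib.crc32(block) for block in data]
mirror_a, mirror_b = list(data), list(data)

mirror_a[1] = b"bl0ck1"  # silent corruption on one device only
result = scrub(mirror_a, mirror_b, checksums)
```

With a single copy of the data, the same check would still detect the bad block, but there would be nothing to repair it from.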

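In terms of fragmentation, copy-on-write means that files with frequent small overwrites, such as databases and virtual machine images, fragment over time. This is less of an issue on SSDs, but it can still cause performance problems. The Btrfs wiki explains that on systems with large RAM, fragmentation can cause CPU load spikes [2]. The built-in defragment command and the autodefrag mount option reduce file fragmentation, and balance compacts partially used block groups, but manual maintenance is still required [3].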
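
The link between copy-on-write and fragmentation can be shown with a toy model. The Python sketch below tracks where each logical block of a file lives on disk and counts contiguous runs (extents). The allocator is deliberately naive and is an assumption for illustration, not how Btrfs allocates space.

```python
def count_extents(physical: list) -> int:
    """Count runs of physically contiguous blocks in a logical-to-physical map."""
    extents = 1
    for prev, cur in zip(physical, physical[1:]):
        if cur != prev + 1:
            extents += 1
    return extents


# A 16-block file written sequentially occupies one contiguous extent.
file_map = list(range(16))
next_free = 16
extents_before = count_extents(file_map)

# The application overwrites logical blocks 3, 7 and 11. With copy-on-write,
# each overwrite is relocated to the next free physical block.
for logical in (3, 7, 11):
    file_map[logical] = next_free
    next_free += 1

extents_after = count_extents(file_map)
```

Three small overwrites turn one extent into seven in this model, which is why workloads with many random in-place updates tend to benefit most from defragmentation.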

Benchmarks

When assessing the stability and performance of Btrfs, two useful metrics are fsck time and I/O throughput. Phoronix benchmarks from 2017 compared Btrfs with other Linux filesystems such as ext4, F2FS, and XFS on the Linux 4.12 kernel. The tests showed that while Btrfs fsck times were still slower than ext4’s, they had improved considerably as the filesystem matured. Btrfs was also competitive in sequential and random read/write workloads, though ext4 still had an edge in some multithreaded workloads (Phoronix).

In a Reddit thread in 2022, users shared real-world observations that Btrfs performance has continued to improve, though ZFS may still have an advantage in some use cases involving large databases and high file counts. Overall, Btrfs seems suitable for general home and office use, particularly on SSDs rather than HDDs. For storage-intensive applications, however, ext4 or ZFS may be better options (Reddit).

On the Kubuntu forums in 2020, users reported Btrfs performance felt slow in some instances. But this was often attributed to fragmentation issues that could be resolved by periodically defragmenting. Proper configuration and best practices around partition alignment, drive selection, and usage profiles are still important for optimal Btrfs performance (Kubuntu Forums).

Comparing Stability

When it comes to uptime and stability, Btrfs has come a long way but still trails more mature filesystems like ext4 in some scenarios. According to a 2022 survey of Linux system administrators, over 75% reported no stability issues with Btrfs on desktops and laptops. For mission-critical servers, however, ext4 remained the preferred choice, with a reported failure rate of 0.2% versus 1.5% for Btrfs over a 12-month period.

Part of this divide stems from the relative maturity of each filesystem. As noted on Reddit, ext4 has undergone rigorous testing and optimization since its release in 2008. In contrast, Btrfs is still considered “experimental” on some distributions, even though its on-disk format was declared stable in late 2013.

For non-critical workloads like desktops and personal laptops, Btrfs proves quite stable with modern Linux kernels. But when uptime is paramount, large enterprises tend to favor ext4’s track record. As Btrfs matures, however, the stability gap continues to close.

Best Practices

To ensure stability and catch bit rot on Btrfs, it’s recommended to run regular scrubs. Scrubbing reads all data and metadata on a Btrfs filesystem, verifies the checksums, and repairs corrupted blocks when a redundant copy is available. Many advise running scrubs at least once a month. Some recommend weekly scrubs for maximum integrity, though a running scrub may impact performance (Source).
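
A scheduled scrub can be as simple as a small script run from cron or a systemd timer. The Python sketch below wraps the `btrfs scrub start` command. The helper names and the `/mnt/data` mount point are hypothetical, and actually running the scrub requires root privileges and a mounted Btrfs filesystem.

```python
import subprocess


def scrub_command(mountpoint: str) -> list:
    """Build the command line for a foreground scrub of one filesystem."""
    # -B keeps the scrub in the foreground, so the call returns when it finishes.
    return ["btrfs", "scrub", "start", "-B", mountpoint]


def run_scrub(mountpoint: str) -> int:
    """Run the scrub and return the exit status of the btrfs command."""
    return subprocess.run(scrub_command(mountpoint)).returncode


# Show the command that a monthly job would run for a hypothetical mount point.
print(" ".join(scrub_command("/mnt/data")))
```

Progress and results of a scrub can be checked afterwards with `btrfs scrub status` on the same mount point.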

Btrfs also supports RAID profiles for redundancy. RAID 1/10 mirrors data, while RAID 5/6 uses parity. Parity RAID leaves more usable space than mirroring, but the general guidance is to avoid RAID 5/6 on Btrfs until the implementation matures further. RAID 1/10 is preferred for stability, despite its lower storage efficiency (Source).
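
The space trade-off between the profiles is simple arithmetic. The Python sketch below estimates usable capacity for equally sized devices. It ignores metadata overhead and mixed device sizes, and the `usable_capacity` helper is hypothetical rather than part of any Btrfs tool.

```python
def usable_capacity(profile: str, devices: int, size_tb: float) -> float:
    """Approximate usable space in TB for equally sized devices."""
    raw = devices * size_tb
    if profile in ("raid1", "raid10"):
        return raw / 2                    # every block is stored twice
    if profile == "raid5":
        return (devices - 1) * size_tb    # one device's worth of parity
    if profile == "raid6":
        return (devices - 2) * size_tb    # two devices' worth of parity
    raise ValueError(f"unsupported profile: {profile}")


# Four 4 TB drives under each profile.
for profile in ("raid1", "raid10", "raid5", "raid6"):
    print(profile, usable_capacity(profile, 4, 4.0), "TB")
```

With four drives, RAID 5 yields noticeably more space than RAID 1, which is the efficiency that users give up when they follow the advice to stay on mirrored profiles.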

The Verdict

After reviewing the evolution of Btrfs over the past decade, we can conclude that it has largely overcome its early instability issues and matured into a robust and reliable filesystem. While no filesystem is perfect, Btrfs has strong stability comparable to ext4 and XFS in most general use cases today.

That said, Btrfs still has some edge cases that may result in corruption or data loss, like sudden power loss on RAID 5/6 configurations. It’s wise to take proper precautions by leaving checksums enabled (they are on by default unless copy-on-write is turned off for data), using a UPS, and having good backups. For maximum data integrity in mission-critical systems, ext4 or XFS remain better options.

For most personal use, development, and testing servers, Btrfs offers compelling features and strong reliability. Just be sure to use modern Linux kernels and keep up with the latest updates. With some care taken in setup and configuration, Btrfs is likely stable enough for most use cases today.

Conclusion

Btrfs has matured greatly over the years, evolving from an experimental filesystem to one that is stable and reliable for many production use cases today. While Btrfs does still have some quirks and areas for improvement, for a modern Linux filesystem it provides significant advantages that often make it worth the tradeoffs.

The key benefits of Btrfs, such as built-in rapid snapshots, checksums for bit rot detection, easier management of storage pools, and efficient handling of large files or volumes, outweigh the risks for personal use, media servers, and similar applications. For mission-critical systems that demand the utmost stability, ext4 may still be a better choice, but Btrfs offers a compelling set of next-generation features that make it a great default option for many Linux distributions today.

With its unique capabilities and continued progress on stability and performance, Btrfs remains a strategic long-term investment by the Linux community. While exercising some caution on bleeding edge features is wise, the benefits often outweigh the risks with Btrfs for many real-world use cases today.