Is ZFS on Linux good?

ZFS on Linux has become an increasingly popular option for Linux users looking to implement advanced file system and volume management features. As an open source implementation of Sun’s ZFS file system, ZFS on Linux brings powerful features like data integrity checking, snapshotting, cloning, replication, and more to the Linux ecosystem.

What is ZFS?

ZFS was originally developed by Sun Microsystems for Solaris and OpenSolaris operating systems. The key features of ZFS include:

  • Pooled storage – ZFS combines disks into storage pools instead of using a one-to-one mapping between filesystems and disks.
  • Copy-on-write – ZFS writes changed blocks to new locations instead of overwriting data in place, which keeps the on-disk state consistent and makes snapshots cheap.
  • Checksumming – ZFS verifies data integrity for all data and metadata with checksums.
  • Snapshots – ZFS supports lightweight, read-only snapshots for easy backup and recovery.
  • Clones – ZFS clones provide writable snapshots that can be used for testing or branching off work.
  • Scrubbing – A scrub reads all stored data, verifies it against its checksums, and repairs errors where a redundant copy exists.
  • RAID-Z – ZFS includes native software RAID with RAID-Z, RAID-Z2, and RAID-Z3.
  • Deduplication – ZFS can store identical blocks only once across snapshots, clones, and file systems to save space.

These features make ZFS a robust and flexible file system for servers, storage arrays, and other high-end storage needs. ZFS on Linux brings many of these capabilities to Linux environments.
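
As a rough illustration of how the three RAID-Z levels trade capacity for redundancy, the sketch below estimates the usable space of a single RAID-Z group. The helper name raidz_usable_tb is hypothetical and not part of the ZFS tools, and the estimate ignores metadata, padding, and reserved space, so a real pool will report somewhat less.

```shell
# Rough usable capacity of one RAID-Z group: (disks - parity) * disk size.
# raidz_usable_tb is a hypothetical helper, not a ZFS command.
# Usage: raidz_usable_tb DISKS PARITY DISK_SIZE_TB   (PARITY is 1, 2 or 3)
raidz_usable_tb() {
    if [ "$1" -le "$2" ]; then
        echo "need more disks than parity devices" >&2
        return 1
    fi
    echo $(( ($1 - $2) * $3 ))
}

raidz_usable_tb 6 1 4   # RAID-Z:  one disk's worth of parity
raidz_usable_tb 6 2 4   # RAID-Z2: two disks' worth of parity
raidz_usable_tb 6 3 4   # RAID-Z3: three disks' worth of parity
```

With six 4 TB disks, each extra parity level costs roughly one disk of capacity in exchange for surviving one more simultaneous disk failure.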

ZFS on Linux history

Sun released ZFS as open source in 2005 under the Common Development and Distribution License (CDDL) as part of OpenSolaris. The native ZFS on Linux (ZoL) kernel port began in 2008 at Lawrence Livermore National Laboratory. Over time, the port matured with the Solaris Porting Layer (SPL), a POSIX file system layer, DKMS-built kernel modules, and more. Some key events in the history of ZFS on Linux include:

  • 2005 – Sun releases ZFS as open source under the CDDL.
  • 2008 – Work on the native Linux kernel port begins.
  • 2013 – Version 0.6.1 is released as the first version considered ready for production use.
  • 2016 – Ubuntu 16.04 LTS ships with official ZFS support.
  • 2020 – OpenZFS 2.0 release unifies the Linux and FreeBSD implementations in one codebase.

While it began as an independent community port, ZFS on Linux is now developed as part of the OpenZFS project. OpenZFS provides a unified open source codebase that aims for feature parity and compatibility across Linux, FreeBSD, macOS, and other operating systems.

Advantages of ZFS on Linux

ZFS on Linux inherits many of the strengths of ZFS while integrating tightly with Linux systems. Some of the notable advantages include:

  • Data integrity checking – ZFS provides end-to-end checksums on all data and metadata to detect silent data corruption issues.
  • File system snapshots – ZFS snapshots allow easily rolling back to previous versions or recovering deleted files when needed.
  • Clones – ZFS clones provide writable snapshots for testing patches or experimental changes.
  • Pooled storage – Grouping disks into pools abstracts physical layouts and supports scaling capacity.
  • RAID-Z – Native software RAID removes dependency on hardware RAID controllers.
  • Deduplication – Shared block deduplication saves disk space for redundant files or data.
  • Compression – Per-dataset compression squeezes more capacity from storage.
  • Scalability – ZFS is designed to scale to very large storage configurations.

For Linux servers and workstations, ZFS brings enterprise-grade storage management abilities. The combination of integrity checking, snapshots, clones, flexible volumes, RAID-Z, deduplication, and compression provides a robust storage framework for mission-critical data.
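
The snapshot, clone, and rollback features listed above might be combined as in the following sketch. The dataset name tank/projects and the snapshot name before-upgrade are placeholders, and the run wrapper is a hypothetical helper that only prints each command unless DRY_RUN=0 is set, so the block is safe to try on a machine without ZFS.

```shell
# Hypothetical dry-run wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Snapshot a dataset, branch a writable clone from it for testing,
# then roll the original dataset back to the snapshot.
snapshot_workflow() {
    dataset="$1"
    run zfs snapshot "${dataset}@before-upgrade"
    run zfs clone "${dataset}@before-upgrade" "${dataset}-test"
    run zfs rollback "${dataset}@before-upgrade"
}

snapshot_workflow tank/projects   # placeholder dataset name
```

Running with DRY_RUN=0 as root would execute the commands for real. Note that a rollback discards every change made to the dataset after the snapshot was taken.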

Disadvantages of ZFS on Linux

While ZFS on Linux offers many benefits, there are some downsides to consider as well:

  • Memory use – ZFS tends to consume more memory than traditional Linux file systems, mainly for its Adaptive Replacement Cache (ARC), and deduplication increases this further.
  • Limited boot environments – Not all Linux distributions properly support booting from ZFS root file systems.
  • Learning curve – ZFS has many knobs to tweak so it can take time to master configuration best practices.
  • Compatibility – Because ZFS is maintained outside the kernel tree, support for a new kernel release can lag, so kernel upgrades may have to wait for a matching ZFS release.
  • Licensing – CDDL licensing limits mixing of ZFS code into the Linux kernel source.
  • Software RAID – ZFS RAID-Z lacks some performance optimizations of dedicated hardware RAID controllers.

While the advantages typically outweigh the downsides for most use cases, these are factors to consider when evaluating ZFS on Linux.
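
On memory-constrained systems the ARC can be capped through the zfs_arc_max module parameter, which takes a size in bytes. The sketch below only computes and prints the option line; the helper name arc_max_option is hypothetical, and the 4 GiB figure is an arbitrary example rather than a recommendation.

```shell
# Print a modprobe option line that caps the ARC at the given number of GiB.
# arc_max_option is a hypothetical helper; zfs_arc_max is the OpenZFS
# module parameter and is expressed in bytes.
arc_max_option() {
    echo "options zfs zfs_arc_max=$(( $1 * 1024 * 1024 * 1024 ))"
}

arc_max_option 4   # example value only
```

The printed line would typically go in a file under /etc/modprobe.d/ and take effect the next time the zfs module is loaded.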

ZFS on Linux use cases

Common use cases well suited for ZFS on Linux include:

  • File servers – Shared storage with integrity checking and easy snapshots.
  • Backup storage – Deduplication and compression reduce disk space for backups.
  • Database servers – Checksums and snapshots protect databases from corruption.
  • Media archives – Checksums, clones, and snapshots assist with managing media assets.
  • Docker and Kubernetes – Copy-on-write snapshots and clones make container image layers and writable container file systems cheap to create.
  • Virtual machine storage – Snapshots simplify VM image management.

Any Linux server with critical or large storage needs can benefit from ZFS capabilities like snapshots, integrity checks, and pooled flexible volumes.
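
Several of these use cases rely on taking snapshots regularly, which in turn means pruning old ones. A minimal sketch of a retention filter follows. The helper name snapshots_to_prune is hypothetical, the snapshot names are sample data, and the approach assumes names that sort chronologically, such as dataset@YYYY-MM-DD.

```shell
# snapshots_to_prune is a hypothetical helper: it reads snapshot names on
# stdin and prints all but the newest KEEP of them, assuming the names
# sort chronologically (for example dataset@YYYY-MM-DD).
# Usage: snapshots_to_prune KEEP
snapshots_to_prune() {
    sort | awk -v keep="$1" '
        { names[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print names[i] }'
}

# Sample input standing in for the output of:
#   zfs list -H -t snapshot -o name -r tank/data
printf '%s\n' \
    tank/data@2024-01-03 \
    tank/data@2024-01-01 \
    tank/data@2024-01-02 |
    snapshots_to_prune 2
```

On a real system, each name printed could be passed to zfs destroy after review. Keeping the filter separate from the destructive step makes it easy to inspect what would be removed first.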

Getting started with ZFS on Linux

Trying out ZFS on Linux is relatively straightforward. Here are some tips for getting started:

  • Verify your Linux distribution has ZFS packages or repositories available. Ubuntu, Debian, Fedora, openSUSE, CentOS, and others support ZFS.
  • Install the ZFS userspace utilities (zfsutils-linux package on Ubuntu/Debian).
  • Install the ZFS Linux kernel modules (zfs-dkms or zfs-kmod packages).
  • Reboot or load the ZFS modules into the running kernel (modprobe zfs).
  • Create a ZFS storage pool using one or more disks (zpool create poolname vdev1 vdev2).
  • Create ZFS filesystems or volumes within the pool (zfs create poolname/videos).
  • Enable additional features like compression as needed (zfs set compression=lz4 poolname/videos).
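
The steps above can be sketched as a short script. The pool name tank and the device paths /dev/sdX and /dev/sdY are placeholders, and the run wrapper is a hypothetical helper that only prints each command unless DRY_RUN=0 is set. Package installation is left out because it differs between distributions.

```shell
# Hypothetical dry-run wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Load the module, create a pool, then add a compressed file system.
# Usage: setup_pool POOLNAME VDEV_SPEC...
setup_pool() {
    pool="$1"
    shift
    run modprobe zfs
    run zpool create "$pool" "$@"
    run zfs create "$pool/videos"
    run zfs set compression=lz4 "$pool/videos"
    run zpool status "$pool"
}

# Placeholder pool name and device paths; substitute real, empty disks.
setup_pool tank mirror /dev/sdX /dev/sdY
```

Creating a pool overwrites the named disks, so it is worth reading the printed commands before running the script as root with DRY_RUN=0.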

There are also many helpful ZFS tutorials available online from sources like the OpenZFS docs, Linux ZFS wiki, and the FreeBSD ZFS guide.

ZFS vs traditional Linux filesystems

ZFS has some key advantages but also differs from traditional Linux file systems like ext4 in a few areas:

Feature        ZFS                         ext4
-------------  --------------------------  --------------------------------
Checksums      End-to-end checksums        Metadata-only checksums
Snapshots      Read-only snapshots         No built-in snapshots
Clones         Writable cloned snapshots   No native cloning
Compression    Per-dataset compression     No compression
Deduplication  Block-level deduplication   No deduplication
Scaling        Large pooled storage        One block device per file system
Repair         Self-healing via scrubbing  fsck utility required
Memory use     Higher RAM use              Lower memory footprint

For advanced storage needs like integrity checking, deduplication, and scalability, ZFS has clear advantages. But for smaller storage sizes or memory-constrained systems, a conventional Linux file system may be sufficient.

ZFS licensing and Linux

The ZFS source code uses the Common Development and Distribution License (CDDL). This is an open source license, but the Free Software Foundation (FSF) considers it incompatible with the Linux kernel’s GPLv2 license. As a result, the ZFS source code cannot be directly merged into the mainline Linux kernel.

However, CDDL-licensed code can be distributed separately from the kernel as loadable kernel modules. This is the approach taken by the ZFS on Linux project, whose modules are built against the installed kernel (for example through DKMS) and integrate with it through interfaces like the Virtual File System (VFS) layer.

While the FSF prefers GPL-compatible licenses, it is the Linux kernel developers who decide which licenses are acceptable for inclusion in Linux, and they have not accepted GPL-incompatible licenses like the CDDL. This licensing issue prevents Linux from including ZFS natively, but ZFS on Linux provides a workaround through separately distributed kernel modules.

Performance of ZFS on Linux vs other platforms

In general, ZFS on Linux offers performance competitive with or faster than other platforms:

  • File operations like copying, checking, and deletion are very fast on Linux with ZFS.
  • The ZFS intent log (ZIL) and data scrubbing tend to be faster on Linux than other OSes.
  • RAID-Z parity performance used to lag on Linux but is very fast now with recent improvements.
  • The advanced ZFS memory cache known as ARC runs well on Linux.
  • Snapshots and clones complete quickly on Linux and do not slow the file system.

Areas where Linux systems may see lower ZFS performance include:

  • Slower boot times since ZFS must import pools and load cache state.
  • The ARC is separate from the Linux page cache, so memory-mapped files can end up cached twice, which costs memory and can affect some workloads.
  • Small random read workloads may be slower than FreeBSD or Illumos.

Despite a few corner cases, ZFS on Linux generally meets or exceeds the performance on other platforms. The differences on modern systems are usually small.

Conclusion

ZFS on Linux provides enterprise-grade file system and volume management features not easily found elsewhere on Linux. Integrity checks, unlimited snapshots, clones, large storage pools, native software RAID, deduplication, compression, and more give ZFS powerful advantages for managing large and critical datasets.

The main downsides versus traditional Linux filesystems are increased memory use, compatibility concerns, and software RAID versus hardware RAID performance. However, for many use cases, the extensive features of ZFS outweigh the disadvantages.

With a long development history and an active open source community improving the OpenZFS shared codebase, ZFS on Linux is mature and gaining widespread adoption. For Linux users requiring advanced storage management capabilities, ZFS often provides the right solution.