The ext4 file system is the default on many Linux distributions and one of the most widely used Linux file systems. It was designed as the successor to ext3 and offers several advantages for storage and data management. In this guide, we will explore the key features of ext4 and how they benefit Linux users.
Improved Performance
One of the main goals of ext4 was to improve performance compared to ext3. Here are some of the performance enhancements:
- Faster file system checks – ext4 records which parts of each inode table have never been used, so the checker can skip them. Checks typically run much faster than on ext3, especially on large file systems.
- Delayed allocation – ext4 postpones choosing on-disk blocks for newly written data until the data is flushed from memory. Knowing the full size of the write lets the allocator pick contiguous blocks and minimize fragmentation.
- Room for growth – when the file system is created, mke2fs reserves space for the block group descriptor table to grow (the resize_inode feature). This lets the file system be enlarged later without relocating existing metadata.
- Multiblock allocation – ext4 can allocate multiple blocks to a file at once, rather than one block at a time. This reduces overhead when writing large files.
- Improved timestamps – ext4 increases the resolution of file timestamps from 1 second to 1 nanosecond. This provides more accuracy for timestamp-dependent applications.
Together, these improvements make ext4 noticeably faster at common file system operations compared to ext3.
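Sub-second timestamps are visible to programs through the nanosecond fields of stat. The following is a minimal sketch, assuming a Linux system and Python 3; the timestamp value is hypothetical, and a file system with coarser resolution would simply truncate the fractional part.

```python
import os
import tempfile


def set_and_read_mtime_ns(path, mtime_ns):
    """Set a file's modification time in nanoseconds and read it back."""
    atime_ns = os.stat(path).st_atime_ns
    os.utime(path, ns=(atime_ns, mtime_ns))
    return os.stat(path).st_mtime_ns


with tempfile.NamedTemporaryFile() as tmp:
    # Hypothetical timestamp with a sub-second component.
    wanted = 1_700_000_000_123_456_789
    stored = set_and_read_mtime_ns(tmp.name, wanted)
    seconds, fraction_ns = divmod(stored, 1_000_000_000)
    print(f"stored {seconds} s + {fraction_ns} ns")
```

How much of the fractional part survives depends on the file system that holds the temporary file.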
Reliability Features
Ext4 builds on ext3’s existing reliability features and adds new mechanisms to prevent data loss and corruption:
- Journaling – ext4 retains ext3’s standard journaling feature, which records file system changes in a journal before they are applied. If the system crashes or loses power, the journal can be replayed to restore the file system to a consistent state.
- Metadata checksums – ext4 adds checksums to metadata structures like inodes and directories. This can detect corruption early and improve the chances of successful recovery.
- Journaling modes – like ext3, ext4 offers three journaling modes (journal, ordered, and writeback). Ordered is the default; writeback can be faster for some workloads, but files written shortly before a crash may contain stale data afterwards.
- Persistent preallocation – ext4 can preallocate on-disk space for a file so that the full size is guaranteed to be available. This avoids out-of-space errors partway through writing a large file.
With these enhancements, ext4 is more resilient than ext3 to system crashes, power failures, and other failures, although no file system can rule out data loss entirely.
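Persistent preallocation is available to applications through the POSIX fallocate interface. A minimal sketch, assuming Python 3 on a Unix system; the file name and size are placeholders chosen for illustration.

```python
import os
import tempfile


def preallocate(path, size_bytes):
    """Reserve size_bytes of space for path and return the resulting file size."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Ask the file system to reserve the space starting at offset 0.
        os.posix_fallocate(fd, 0, size_bytes)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "prealloc.bin")  # hypothetical file name
    reserved_size = preallocate(target, 1024 * 1024)
    print(reserved_size)
```

If the disk cannot supply the requested space, the call fails immediately instead of partway through a later write.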
Larger File System and File Sizes
Ext4 enables significantly larger storage capacity compared to ext3 in three key areas:
- File system size – ext3 file systems were limited to 16 TiB with 4 KiB blocks. Ext4 raises this limit to 1 EiB (exbibyte), which is roughly one million TiB. Large storage arrays can use ext4 without running into size limits.
- File size – The maximum file size with ext3 was 2 TiB. Ext4 raises this limit to 16 TiB per file with 4 KiB blocks, enabling very large files.
- Number of subdirectories – ext3 allowed at most 31,998 subdirectories in a single directory. Ext4 removes this limit through its dir_nlink feature, which matters for directories with very many entries.
Together, these expanded limits future-proof ext4 for use with ever-larger storage hardware.
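The size limits follow from the width of the on-disk block numbers. The arithmetic below is a sketch that assumes the common 4 KiB block size; other block sizes give different results.

```python
BLOCK_SIZE = 4096  # assumption: 4 KiB blocks
TIB = 2**40
EIB = 2**60

# ext3 addresses blocks with 32-bit numbers.
ext3_fs_limit = 2**32 * BLOCK_SIZE
# ext3 counts a file's size in a 32-bit number of 512-byte sectors.
ext3_file_limit = 2**32 * 512
# ext4 addresses blocks with 48-bit numbers.
ext4_fs_limit = 2**48 * BLOCK_SIZE
# ext4 extents use 32-bit logical block numbers within a file.
ext4_file_limit = 2**32 * BLOCK_SIZE

print(f"ext3 file system: {ext3_fs_limit // TIB} TiB")
print(f"ext3 file:        {ext3_file_limit // TIB} TiB")
print(f"ext4 file system: {ext4_fs_limit // EIB} EiB")
print(f"ext4 file:        {ext4_file_limit // TIB} TiB")
```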
Backwards Compatibility
Ext4 was designed to be backwards compatible with existing ext2 and ext3 file systems. This provides a clear upgrade path in which no full conversion is needed to gain some of the ext4 advantages:
- An ext3 file system can be mounted with the ext4 driver without any modification. It then benefits from driver-level improvements such as the newer block allocator, while on-disk features such as extents must be enabled explicitly (for example with tune2fs) and apply only to files written afterwards.
- The ext4 driver in the Linux kernel provides forwards and backwards compatibility. It can mount ext2 and ext3 file systems just like the native drivers.
- The ext4 disk format is an extension of ext3 and reuses most of its on-disk data structures. However, once ext4-only features such as extents are in use, the file system can no longer be mounted by the ext3 driver.
This compatibility makes transitioning to ext4 straightforward. Admins can start by mounting existing ext3 file systems with the ext4 driver and enable on-disk ext4 features later, keeping in mind that enabling them is effectively a one-way step.
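Whether an older driver can still mount a file system depends on which on-disk features are enabled. The sketch below parses the "Filesystem features:" line as printed by dumpe2fs -h and flags features that the ext3 driver does not understand. The feature set used here is an illustrative, non-exhaustive assumption, and the sample line is made up for the example.

```python
# Assumption: a non-exhaustive set of features that prevent an ext3 mount.
EXT3_INCOMPATIBLE = {"extent", "64bit", "flex_bg", "inline_data"}


def parse_features(dumpe2fs_header):
    """Return the feature names from the 'Filesystem features:' line."""
    for line in dumpe2fs_header.splitlines():
        if line.startswith("Filesystem features:"):
            return set(line.split(":", 1)[1].split())
    return set()


def blocks_ext3_mount(features):
    """List enabled features that the ext3 driver cannot handle."""
    return sorted(features & EXT3_INCOMPATIBLE)


# Hypothetical header line for illustration.
sample = (
    "Filesystem features:      has_journal ext_attr resize_inode "
    "dir_index filetype extent flex_bg sparse_super large_file"
)
print(blocks_ext3_mount(parse_features(sample)))
```

An empty result suggests the file system still uses only features that ext3 knows, under the assumptions above.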
Inline Data Support
Ext4 provides an optional “inline data” feature (inline_data) for very small files, those small enough to fit in the unused space of the inode. With inline data, the file contents are stored directly within the inode metadata structure instead of in separate data blocks. This provides several advantages:
- Faster access for small files since the data does not have to be separately fetched from data blocks
- Less wasted space since small files do not consume a full block
- Faster listing of small directories, which can be stored inline in the same way
Inline data is particularly beneficial on file systems that hold large numbers of tiny files.
Extent-Based Allocation
Instead of the indirect block maps used by ext2 and ext3, ext4 tracks most files with extents. An extent describes a contiguous run of data blocks by its starting block and length. This approach provides several improvements:
- Reduced file fragmentation – a file consists of a few large extents rather than many scattered blocks
- Faster large file access – large reads/writes can access one extent instead of many blocks
- Metadata efficiency – extents require less metadata to track vs individual blocks
Extent-based allocation improves both performance and disk space utilization for most workloads.
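To make the idea concrete, the sketch below collapses a per-block list of physical block numbers into extents. It is a simplified model, not the ext4 on-disk format: the Extent type and its field names are chosen for illustration, and real extents also have a maximum length.

```python
from typing import List, NamedTuple


class Extent(NamedTuple):
    logical_start: int   # first block offset within the file
    physical_start: int  # first block number on disk
    length: int          # number of contiguous blocks


def blocks_to_extents(physical_blocks: List[int]) -> List[Extent]:
    """Merge runs of consecutive physical blocks into extents."""
    extents: List[Extent] = []
    for logical, physical in enumerate(physical_blocks):
        last = extents[-1] if extents else None
        if last is not None and physical == last.physical_start + last.length:
            # The block continues the current run, so grow the extent.
            extents[-1] = last._replace(length=last.length + 1)
        else:
            extents.append(Extent(logical, physical, 1))
    return extents


for extent in blocks_to_extents([100, 101, 102, 103, 500, 501, 900]):
    print(extent)
```

Seven block pointers shrink to three records here, which is the metadata saving the list above refers to.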
Delayed Allocation
As mentioned previously, ext4 uses delayed allocation to boost performance. With delayed allocation, ext4 delays actually assigning data blocks to a file and writing the content until necessary. This allows ext4 to collect larger writes and allocate blocks much more efficiently.
Without delayed allocation, every file write would immediately allocate and write to one or a few blocks. This can result in heavy file system fragmentation over time. By delaying until the data is flushed, ext4 can determine the full set of blocks a file needs and allocate them together contiguously.
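The effect can be illustrated with a toy model in which two files are written in interleaved one-block chunks. The allocator below is a deliberately naive assumption (it hands out the next free block numbers) and is not how ext4 chooses blocks; it only shows why waiting until flush time helps.

```python
class ToyDisk:
    """Hands out block numbers sequentially (a deliberately naive allocator)."""

    def __init__(self):
        self.next_free = 0

    def allocate(self, count):
        start = self.next_free
        self.next_free += count
        return list(range(start, start + count))


def immediate_allocation(writes):
    """Allocate blocks at the moment of each write."""
    disk = ToyDisk()
    layout = {}
    for name, nblocks in writes:
        layout.setdefault(name, []).extend(disk.allocate(nblocks))
    return layout


def delayed_allocation(writes):
    """Only count pending blocks per file, then allocate once at flush time."""
    pending = {}
    for name, nblocks in writes:
        pending[name] = pending.get(name, 0) + nblocks
    disk = ToyDisk()
    return {name: disk.allocate(total) for name, total in pending.items()}


writes = [("a", 1), ("b", 1), ("a", 1), ("b", 1)]
print(immediate_allocation(writes))
print(delayed_allocation(writes))
```

In the first layout the two files alternate on disk, while in the second each file occupies one contiguous run.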
Unlimited Subdirectories per Directory
Ext2 and ext3 limit a directory to 32,000 hard links, which caps it at 31,998 subdirectories because each subdirectory’s ".." entry counts as a link to its parent. Ext4’s dir_nlink feature removes this cap, so a single directory can hold an effectively unlimited number of subdirectories. Directory nesting depth was never the constraint.
Faster fsck
The fsck utility in Linux (e2fsck for the ext family) is used to check and repair file system errors. Several ext4 design choices make checks considerably faster than on ext3, especially on large volumes:
- Skipping unused inodes – each block group records how many inodes at the end of its inode table have never been used (the uninit_bg feature), so the checker does not need to scan them.
- Less mapping metadata – extents describe a large file with a few records instead of many indirect blocks, so there is less to read and verify.
- Grouped metadata – flex_bg places the bitmaps and inode tables of neighboring block groups together, reducing seeks during the check.
Published benchmarks have reported ext4 checks running several times faster than ext3 on the same hardware, although the gain depends on how full the file system is.
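One well-documented source of the speedup is that ext4 block groups record how many inodes at the end of their inode table have never been used, so the checker can skip them. The toy calculation below illustrates the effect; the group sizes and counts are invented for the example and the function is not part of any real tool.

```python
def inodes_to_scan(groups, skip_unused):
    """Count inodes a checker must read.

    groups is a list of (inodes_per_group, unused_at_end) pairs.
    """
    total = 0
    for inodes_per_group, unused_at_end in groups:
        if skip_unused:
            total += inodes_per_group - unused_at_end
        else:
            total += inodes_per_group
    return total


# Hypothetical, mostly empty file system with three block groups.
groups = [(8192, 8000), (8192, 8192), (8192, 100)]
full_scan = inodes_to_scan(groups, skip_unused=False)
short_scan = inodes_to_scan(groups, skip_unused=True)
print(full_scan, short_scan)
```

The benefit shrinks as the file system fills up, since fewer inodes remain unused.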
Swap Files
Swap files provide an alternative to dedicated swap partitions for systems to temporarily write inactive memory pages. With ext4, a swap file is created like a normal file, initialized with mkswap, and enabled with swapon; it then provides the same functionality as a swap partition.
Using swap files can provide more flexibility compared to partitions for systems utilizing swap. The size can be changed without repartitioning by disabling the swap file, recreating it at the new size, and enabling it again, and swap can be turned on or off as needed.
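The usual sequence for setting up a swap file can be sketched as a list of commands. The helper below only builds and prints the commands rather than running them, since they require root privileges; the function name, path, and size are placeholders for illustration.

```python
import shlex


def swap_file_commands(path, size_mib):
    """Return the commands that create and enable a swap file."""
    return [
        # Fill the file with zeros so that it has no holes.
        ["dd", "if=/dev/zero", f"of={path}", "bs=1M", f"count={size_mib}"],
        # Swap files should not be readable by other users.
        ["chmod", "600", path],
        ["mkswap", path],
        ["swapon", path],
    ]


commands = swap_file_commands("/swapfile", 2048)  # hypothetical path and size
for argv in commands:
    print(shlex.join(argv))
```

Running swapoff on the same path reverses the last step before the file is resized or deleted.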
Fast System Recovery
Ext4 uses several features to minimize system downtime and recovery time in the event of power loss or system crash:
- Journal checksumming – adds checksums to journal records to improve journal replay success
- Barrier support – barriers ensure file system changes are flushed to disk in a consistent order
- Improved fsync – with delayed allocation, calling fsync() on one file flushes that file rather than forcing out all pending data, as could happen in ext3’s ordered mode
Together these improvements reduce both the likelihood of file system corruption after a crash and the time needed to recover from it.
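Applications that need their data to survive a crash typically combine fsync with an atomic rename. The following is a minimal sketch of that common pattern, assuming Python 3 on Linux; the function name and the ".tmp" suffix are choices made for this example.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace path with data so that a crash leaves the old or the new version."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # assumption: no other writer uses this name
    with open(tmp_path, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())  # push the file contents to disk
    os.replace(tmp_path, path)     # atomically swap in the new version
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)           # persist the rename itself
    finally:
        os.close(dir_fd)


with tempfile.TemporaryDirectory() as workdir:
    config_path = os.path.join(workdir, "settings.conf")  # hypothetical file
    durable_write(config_path, b"mode=fast\n")
    with open(config_path, "rb") as handle:
        print(handle.read())
```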
Flexibility for Specialized Workloads
For certain specialized workloads and use cases, ext4 offers flexibility to tune the file system behavior:
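- Inline data – ext4 does not implement tail packing, but the optional inline_data feature stores very small files inside the inode to reduce wasted space
- Lazy inode table initialization – speeds up file system creation by zeroing inode tables in the background after the first mount; it is worth disabling when benchmarking a fresh file system so the background work does not skew results
- Journaling mode (data=) – data=ordered is the default; data=writeback can improve throughput for some workloads at the cost of weaker crash guarantees, and data=journal also journals file contents
- Nodelalloc and nobarrier options – nodelalloc disables delayed allocation, and nobarrier skips write barriers, which is only safe on storage with a battery-backed or otherwise non-volatile write cache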
While ext4 aims to serve general-purpose use, it offers tuning options for high performance computing, databases, and other specialized environments. However, most distributions use the default options for maximum stability.
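These tuning choices are usually expressed as mount options. The sketch below assembles an option string and an fstab-style line from them; the helper names, device, and mount point are hypothetical, and the defaults mirror the ordered mode with delayed allocation and barriers left on.

```python
VALID_DATA_MODES = ("journal", "ordered", "writeback")


def ext4_mount_options(data_mode="ordered", delalloc=True, barrier=True):
    """Build a comma-separated ext4 mount option string."""
    if data_mode not in VALID_DATA_MODES:
        raise ValueError(f"unknown data mode: {data_mode}")
    options = [f"data={data_mode}"]
    if not delalloc:
        options.append("nodelalloc")
    if not barrier:
        # Only sensible when the storage has a non-volatile write cache.
        options.append("nobarrier")
    return ",".join(options)


def fstab_line(device, mountpoint, options):
    """Format an fstab entry for an ext4 file system."""
    return f"{device} {mountpoint} ext4 {options} 0 2"


# Hypothetical device and mount point.
print(fstab_line("/dev/sdb1", "/data", ext4_mount_options("writeback")))
```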
Large Volumes and Scalability
Support for large volumes and scalability for the future were key design goals for ext4. It includes several features to provide excellent performance and reliability at very large scales:
- 1 EiB maximum file system size
- 16 TiB maximum file size (with 4 KiB blocks)
- Unlimited subdirectories within a single directory
- Efficient extent-based allocation
- Delayed allocation that keeps large files contiguous
The greatly expanded limits, together with extent-based and delayed allocation, allow ext4 to scale to very large storage deployments and intensive workloads.
Snapshot Support
File system snapshots provide point-in-time copies of file system content that are useful for backups and other use cases. Ext4, like ext3, has no built-in snapshot feature; snapshots are normally taken at the block layer underneath it. A few properties help ext4 work well in that setup:
- freeze support – the file system can be frozen (for example with fsfreeze) so that the snapshot captures a consistent on-disk state
- journaling – a snapshot taken without freezing can still be made consistent by replaying the journal when it is mounted
- metadata checksums – help verify that the metadata in a snapshot is intact
Together with a volume manager such as LVM, ext4 supports solid snapshot workflows in Linux environments. File systems such as Btrfs take a different approach and implement snapshots natively.
Expanded Security Features
For improved security, ext4 expands on the standard Unix permissions and provides additional options like:
- Access Control Lists (ACLs) – finer-grained control over file permissions for multiple users/groups
- Extended attributes – associate custom metadata with files, e.g. for SELinux
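- Encryption – ext4 supports native per-directory encryption through the kernel’s fscrypt framework; alternatively, the whole volume can be encrypted underneath the file system with dm-crypt
ext3 already had ACLs and extended attributes; ext4 makes them cheaper to use because its larger default inodes can hold small extended attributes directly in the inode.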
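Extended attributes can be read and written from user space. The sketch below assumes Python 3 on Linux; the attribute name and value are made up for the example, and the helper returns None where the underlying file system does not allow user attributes.

```python
import os
import tempfile


def tag_file(path, name, value):
    """Set an extended attribute and read it back, or return None if unsupported."""
    try:
        os.setxattr(path, name, value)
        return os.getxattr(path, name)
    except OSError:
        # The file system or mount options may not allow user attributes.
        return None


with tempfile.NamedTemporaryFile() as tmp:
    # Hypothetical attribute in the "user." namespace.
    result = tag_file(tmp.name, "user.review_state", b"draft")
    print(result)
```

Security frameworks such as SELinux use the same mechanism, but in attribute namespaces reserved for the system.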
Conclusion
Ext4 provides a modern, general-purpose file system for Linux with compelling advantages over ext3 in areas like reliability, performance, scalability, and features. It offers backward compatibility while equipping Linux for the storage and data management challenges of the future. The combination of improved engineering and new optimizations makes ext4 a stable base for Linux file storage moving forward.