
Is there an official specification for the ext2/ext3/ext4 filesystems?


I was wondering: Bluetooth has the IEEE 802.15.1 standard and is managed by the Bluetooth Special Interest Group. Wi-Fi has the IEEE 802.11 standards and the Wi-Fi Alliance. NVMe SSDs on PCIe have NVM Express, which maintains and publishes the official documentation.

So there is usually a standards body that decides how things should work, so that several different implementations can interoperate.

But for the ext2/ext3/ext4 filesystems I didn't find any official standard besides the Linux Kernel code.

Are these filesystems basically dictated by the kernel community? And do they commit to not changing them, so that they remain compatible with other operating systems?

Or is there some official specification somewhere? And who provides it?

Thank you


Solution

  • Very few file systems are standardized via standards committees. In practice, the commitment not to break compatibility comes from the need to maintain backwards compatibility with older versions of Linux. The same is true for MacOS, Windows, etc. Users get cranky when a file system that was written using MacOS 10.1 can't be read by MacOS 10.3, etc.

    In the case of ext4, we have feature bitmasks in the superblock. When we add a new feature, we define a new bit in one of three feature bitmasks: compat, r/o compat, incompat. If the kernel sees a bit which it does not know about in the r/o compat bitmask, it will not allow the file system to be mounted read/write, but it will allow it to be mounted read-only. If the kernel sees a bit which it does not understand in the incompat bitmask, then it won't allow the file system to be mounted at all. And if there is a bit set in the compat bitmask that the kernel does not understand, the kernel knows that it is safe to mount the file system regardless. However, the file system consistency checker (e2fsck) and some of the other file system utilities (e.g., resize2fs) may require a more stringent compatibility check, and so they will not try to make changes to a file system that has a compat feature they don't understand.
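The rules above can be sketched in a few lines of Python. This is only an illustration of the decision logic as described in this answer, not the kernel's actual code: the names `MountMode`, `decide_mount`, and `tool_may_modify` are made up for the example, and the "known" masks stand for whatever set of feature bits a given implementation understands.

```python
from enum import Enum


class MountMode(Enum):
    READ_WRITE = "rw"
    READ_ONLY = "ro"
    REFUSED = "refused"


def decide_mount(compat, ro_compat, incompat,
                 known_compat, known_ro_compat, known_incompat):
    """Hypothetical helper: apply the three-bitmask rule for a kernel mount."""
    # Any unknown incompat bit means the on-disk format cannot be
    # interpreted safely, so the mount is refused outright.
    if incompat & ~known_incompat:
        return MountMode.REFUSED
    # Unknown r/o compat bits are safe to read past but not to write past.
    if ro_compat & ~known_ro_compat:
        return MountMode.READ_ONLY
    # Unknown compat bits are ignored by the kernel, so `compat` and
    # `known_compat` deliberately play no part in this decision.
    return MountMode.READ_WRITE


def tool_may_modify(compat, ro_compat, incompat,
                    known_compat, known_ro_compat, known_incompat):
    """Hypothetical stricter check, in the spirit of e2fsck or resize2fs:
    refuse to change anything if any feature bit is unknown."""
    unknown = ((compat & ~known_compat)
               | (ro_compat & ~known_ro_compat)
               | (incompat & ~known_incompat))
    return unknown == 0
```

For example, an implementation that knows no compat bits at all would still mount a file system with an unknown compat bit read/write, while `tool_may_modify` would return `False` for the same file system.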

    In practice, when we add a new feature, we wait quite a while before the mke2fs utility enables the feature by default. This allows more adventurous users to test the file system feature before we enable it by default for everyone. Other operating systems typically implement only a very small subset of the ext4 features. Most commonly, the set of features that a non-Linux implementation of ext2/ext3/ext4 supports roughly corresponds to the file system features that are enabled via "mke2fs -t ext2 /dev/disk".

    These features haven't changed since they were first implemented almost a quarter of a century ago. And they won't change, for the obvious reason that lots of enterprises are still using RHEL 5, which uses a kernel that was released over ten years ago, and we care an awful lot about backwards compatibility with ourselves as well as with other operating systems. So you can look at the paper "The Design and Implementation of Ext2" (http://web.mit.edu/tytso/www/linux/ext2intro.html), published in 1994, and as far as the basics are concerned they haven't changed.

    Of course, we are still adding new features. For example, most recently we have added file-system level encryption (used in Android and soon, hopefully, Chrome OS), project quota, metadata checksums, etc. to ext4. Each of these new features is protected by a feature flag, and none of them is enabled by default in the current version of mke2fs as distributed in the e2fsprogs source distributions. Some community distributions (such as Debian) may enable certain bleeding-edge features, such as metadata checksums, just so they get more exposure and testing before they are enabled for everyone, including the more conservative, corporate users of the enterprise Linux distributions.
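To see which of these newer features a given file system image has enabled, the three feature words can be read straight from the superblock. The sketch below assumes the usual on-disk layout (superblock at byte 1024, little-endian fields, magic 0xEF53 at offset 0x38, and the compat, incompat, and r/o compat words at 0x5C, 0x60, and 0x64) and a revision 1 or later file system, since older revisions do not use these fields. The function names and the small flag table are chosen for this example only; the table covers just the three features mentioned above.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the superblock starts 1024 bytes into the device
MAGIC_OFFSET = 0x38        # s_magic, 16-bit little-endian
FEATURES_OFFSET = 0x5C     # s_feature_compat, then incompat, then ro_compat
EXT_MAGIC = 0xEF53

# Only the features named in the text; not a complete table.
INCOMPAT_ENCRYPT = 0x10000
RO_COMPAT_METADATA_CSUM = 0x0400
RO_COMPAT_PROJECT = 0x2000


def parse_feature_words(superblock):
    """Return (compat, incompat, ro_compat) from raw superblock bytes."""
    if len(superblock) < FEATURES_OFFSET + 12:
        raise ValueError("superblock buffer is too short")
    (magic,) = struct.unpack_from("<H", superblock, MAGIC_OFFSET)
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/ext3/ext4 superblock")
    return struct.unpack_from("<III", superblock, FEATURES_OFFSET)


def read_feature_words(path):
    """Read the feature words from a file system image or block device."""
    with open(path, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return parse_feature_words(f.read(1024))


def recent_features(compat, incompat, ro_compat):
    """Name the newer features from the text that are enabled."""
    names = []
    if incompat & INCOMPAT_ENCRYPT:
        names.append("encrypt")
    if ro_compat & RO_COMPAT_METADATA_CSUM:
        names.append("metadata_csum")
    if ro_compat & RO_COMPAT_PROJECT:
        names.append("project")
    return names
```

Calling `recent_features(*read_feature_words(path))` on an image lists which of the three are set. The `dumpe2fs -h` command from e2fsprogs reports the complete feature list and is the better tool for real use.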

    Naturally, other operating systems won't have support for these latest bleeding-edge features. But that's OK, because you can also create a file system using "mke2fs -t ext2", which will be much more basic and should be easy to use for interoperability. In general, people will use a file system with advanced features for native use, and a very basic file system with all of the advanced features turned off for interchange purposes. This is why many USB sticks use FAT: Linux, Windows, and MacOS can all read FAT file systems without needing any special handling.

    Another possibility is fuse2fs, a userspace implementation of ext4 that ships with the latest version of e2fsprogs. For operating systems that support FUSE (which includes most BSD systems as well as MacOS), this can be a handy way of reading an ext4 file system. It won't be a high-performance read/write implementation, but for someone who just wants to get data off of an ext4 file system image, fuse2fs works quite well.