
How does `do_new_mount_fc()` mount real file systems like ext4?


In relatively old Linux kernel sources, do_new_mount() calls vfs_kern_mount(), which eventually calls mount_fs(). That function then invokes the real filesystem's mount function, as shown below:


struct dentry *
mount_fs(struct file_system_type *type, int flags, const char *name, void *data)
{
    struct dentry *root;
    struct super_block *sb;
......
    root = type->mount(type, flags, name, data);
......
    sb = root->d_sb;
......
}

In relatively new Linux kernel sources, however, do_new_mount() calls do_new_mount_fc() instead, and I cannot find where this function calls the real filesystem's mount function as above.

Can you tell me how it works now?


Solution

  • You don't find the "usual" function calls because of the recent transition to the new Filesystem Mount API, which is built around a "filesystem context". More information is available in the relevant (quite massive) patch series.

    I'm not going to explain the whole thing, as the kernel documentation on the filesystem mount API should already give a pretty good explanation (and I'm no Linux FS internals expert either).

    The "filesystem context" is basically a structure containing useful information that is passed around and updated incrementally as needed. The vfs_get_tree() function is now responsible for creating the mountable root of the filesystem and saving it in the fs_context structure, which is then passed to do_new_mount_fc() and used to do the actual mount. The kernel documentation describes it like this:

    (*) int vfs_get_tree(struct fs_context *fc);
    
         Get or create the mountable root and superblock, using the parameters in
         the filesystem context to select/configure the superblock.  This invokes
         the ->get_tree() method.
    

    So now in do_new_mount() you see that function being called right before do_new_mount_fc():

        fc = fs_context_for_mount(type, sb_flags);
        put_filesystem(type);
        if (IS_ERR(fc))
            return PTR_ERR(fc);
    
        if (subtype)
            err = vfs_parse_fs_string(fc, "subtype",
                          subtype, strlen(subtype));
        if (!err && name)
            err = vfs_parse_fs_string(fc, "source", name, strlen(name));
        if (!err)
            err = parse_monolithic_mount_data(fc, data);
        if (!err && !mount_capable(fc))
            err = -EPERM;
        if (!err)
            err = vfs_get_tree(fc); // <<<<<<<<<<<<<<<<<<<<<<< HERE
        if (!err)
            err = do_new_mount_fc(fc, path, mnt_flags);
    
        put_fs_context(fc);
        return err;
    }
    

    The vfs_get_tree() function calls fc->ops->get_tree(), which is the method responsible for creating the root (if it doesn't already exist) and assigning it to fc->root.

        error = fc->ops->get_tree(fc); // Here fc->root gets assigned.
        if (error < 0)
            return error;
    

    The transition to this new API is not yet complete for all filesystems. For filesystems that still use the old API (for example ext4), fs_context_for_mount(), which is called at the beginning of do_new_mount(), creates the filesystem context through alloc_fs_context(). That function checks whether the filesystem supports the new API and, if it does not, falls back to a default legacy version of the filesystem context operations (a comment in alloc_fs_context() even says "TODO: Make all filesystems support this unconditionally").

    For get_tree(), the legacy version is legacy_get_tree(), which does exactly what you would expect: it calls fc->fs_type->mount(...).

    /*
     * Get a mountable root with the legacy mount command.
     */
    static int legacy_get_tree(struct fs_context *fc)
    {
        struct legacy_fs_context *ctx = fc->fs_private;
        struct super_block *sb;
        struct dentry *root;
    
        root = fc->fs_type->mount(fc->fs_type, fc->sb_flags,
                          fc->source, ctx->legacy_data);
        if (IS_ERR(root))
            return PTR_ERR(root);
    
        sb = root->d_sb;
        BUG_ON(!sb);
    
        fc->root = root;
        return 0;
    }
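
    To make the dispatch easier to follow, here is a minimal userspace sketch of it, not the kernel code itself. The structures are trimmed stand-ins that reuse the kernel names, legacy_data lives directly in fs_context instead of fc->fs_private, a failed mount is signalled with NULL rather than an ERR_PTR value, and oldfs, newfs and demo_get_root() are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Trimmed userspace stand-ins for the kernel structures (assumption). */
struct super_block { const char *s_id; };
struct dentry { struct super_block *d_sb; };
struct fs_context;

struct fs_context_operations {
    int (*get_tree)(struct fs_context *fc);
};

struct file_system_type {
    const char *name;
    /* New API hook: NULL for filesystems that are not converted yet. */
    int (*init_fs_context)(struct fs_context *fc);
    /* Old API hook: only reached through legacy_get_tree(). */
    struct dentry *(*mount)(struct file_system_type *type, int flags,
                            const char *dev_name, void *data);
};

struct fs_context {
    const struct fs_context_operations *ops;
    struct file_system_type *fs_type;
    const char *source;
    void *legacy_data; /* simplification: the kernel keeps this in fs_private */
    int sb_flags;
    struct dentry *root;
};

/* Legacy wrapper: forwards to the old ->mount() and stores the root. */
static int legacy_get_tree(struct fs_context *fc)
{
    struct dentry *root = fc->fs_type->mount(fc->fs_type, fc->sb_flags,
                                             fc->source, fc->legacy_data);
    if (root == NULL) /* the kernel tests IS_ERR(root) here */
        return -ENODEV;
    assert(root->d_sb != NULL); /* stands in for BUG_ON(!sb) */
    fc->root = root;
    return 0;
}

static const struct fs_context_operations legacy_fs_context_ops = {
    .get_tree = legacy_get_tree,
};

static int legacy_init_fs_context(struct fs_context *fc)
{
    fc->ops = &legacy_fs_context_ops;
    return 0;
}

/* Picks the filesystem's own init hook, or the legacy one if it has none. */
static int alloc_fs_context(struct fs_context *fc,
                            struct file_system_type *type, int sb_flags)
{
    int (*init)(struct fs_context *) = type->init_fs_context;

    memset(fc, 0, sizeof(*fc));
    fc->fs_type = type;
    fc->sb_flags = sb_flags;
    if (!init)
        init = legacy_init_fs_context;
    return init(fc);
}

/* Calls ->get_tree(), which is expected to fill in fc->root. */
static int vfs_get_tree(struct fs_context *fc)
{
    if (fc->root)
        return -EBUSY;
    return fc->ops->get_tree(fc);
}

/* Hypothetical unconverted filesystem: provides only ->mount(). */
static struct super_block oldfs_sb = { "oldfs" };
static struct dentry oldfs_root = { &oldfs_sb };
static int oldfs_mount_calls;
static struct dentry *oldfs_mount(struct file_system_type *type, int flags,
                                  const char *dev_name, void *data)
{
    (void)type; (void)flags; (void)dev_name; (void)data;
    oldfs_mount_calls++;
    return &oldfs_root;
}
static struct file_system_type oldfs_type = { .name = "oldfs", .mount = oldfs_mount };

/* Hypothetical converted filesystem: provides ->init_fs_context(). */
static struct super_block newfs_sb = { "newfs" };
static struct dentry newfs_root = { &newfs_sb };
static int newfs_get_tree(struct fs_context *fc) { fc->root = &newfs_root; return 0; }
static const struct fs_context_operations newfs_context_ops = { .get_tree = newfs_get_tree };
static int newfs_init_fs_context(struct fs_context *fc) { fc->ops = &newfs_context_ops; return 0; }
static struct file_system_type newfs_type = { .name = "newfs", .init_fs_context = newfs_init_fs_context };

/* Hypothetical helper: the part of do_new_mount() that produces fc->root. */
static int demo_get_root(struct file_system_type *type, const char *source,
                         struct fs_context *fc)
{
    int err = alloc_fs_context(fc, type, 0);
    if (!err) {
        fc->source = source;
        err = vfs_get_tree(fc);
    }
    return err;
}
```

    In this model, calling demo_get_root(&oldfs_type, ...) ends up in oldfs_mount() through legacy_get_tree(), while demo_get_root(&newfs_type, ...) goes straight to newfs_get_tree() and never touches a ->mount() method. Either way the caller only looks at fc->root afterwards, which is why do_new_mount_fc() no longer needs to know about the filesystem's mount function.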
    

    Sooner or later, all filesystems will be updated to use the new API with filesystem context. At that point those legacy_* functions will be removed completely, and we will see an init_fs_context method in the ext4 file_system_type.