clinuxlinux-kernelfilesystemsebpf

How can I safely build the full path of a struct dentry * in an eBPF (LSM) hook (e.g. inode_rename)?


I'm writing a Linux Security Module (LSM) eBPF program using the hook lsm.s/inode_rename, and I want to capture the full path of the renamed file or directory. I'm using CO-RE and Cilium's toolchain.

However, because the eBPF has strict memory access constraints, and attempts to walk dentry->d_parent repeatedly and copy names via bpf_probe_read_kernel() or bpf_probe_read_kernel_str() frequently result in verifier or runtime errors such as:

load program: invalid argument: value -2147483648 makes ringbuf_mem pointer be out of bounds

What I tried: I simplified the problem by appending only the file names in reverse path order into a preallocated character buffer, like below. (e.g., Suppose there is a path /home/knights/run.sh. Then, I would traverse the dentry nodes like run.sh -> knights -> home -> NULL, because dentry objects connect their parents via a pointer. Anyway, instead of reversing their order, for simplicity, I'm just trying to get the whole string appended like run.sh/knights/home/.)

static __always_inline int build_path(struct dentry *d, char *buf, int buf_len) {
    int off = 0;

#pragma unroll
    for (int i = 0; i < 16; i++) {
        if (!d) break;

        struct qstr q = BPF_CORE_READ(d, d_name);
        __u32 len = q.len;

        if (off + len + 2 > buf_len) // name + slash + NULL
            return -1;

        // Copy name into buffer
        bpf_probe_read_kernel(buf + off, len, q.name);
        off += len;

        // Append slash
        buf[off++] = '/';

        struct dentry *parent = BPF_CORE_READ(d, d_parent);
        if (parent == d) break;
        d = parent;
    }

    if (off >= buf_len)
        return -1;

    buf[off] = '\0';
    return off;
}

However, I still get crashes or verifier rejections when running this code under real-world usage (e.g., mv command in Ubuntu Linux). I could successfully get the filename only (e.g. run.sh if the whole path is /home/knight/run.sh). You can find the whole code of my current progress at: https://github.com/KnightChaser/lsm-bpf-capture-path-cilium/blob/feat/1/bpf/capture_inode_rename.bpf.c.

So, my question is: What’s the correct way to walk struct dentry * trees and build a (reversed or not) full path safely in eBPF, avoiding verifier errors or invalid memory access?


Solution

  • After repeated trials and errors, I could build the full path of a struct dentry * in an eBPF/LSM hook.

    Suppose a set of directory entry(dentry) objects is constructing the path /home/knight/paff.txt. In this case, the directory entry objects will be connected as below:

    paff.txt    (dentry object store the current string (e.g., "path.txt", "knight")
       ↓        (dentry has "struct dentry * d_parent" indicating the parent)
     knight
       ↓
      home ←---*
       |       |   (if the current dentry is root, then current == current->d_parent)
       *-------*   
    

    So, in the Linux kernel, when it has to construct the whole path string like /home/knight/paff.txt, it calls the dentry_path_raw() function, which is defined at fs/d_path.c. dentry_path_raw() calls __dentry_path(), which is defined as below.

    /*
     * Write full pathname from the root of the filesystem into the buffer.
     */
    static char *__dentry_path(const struct dentry *d, struct prepend_buffer *p)
    {
        const struct dentry *dentry;
        struct prepend_buffer b;
        int seq = 0;
    
        rcu_read_lock();
    restart:
        dentry = d;
        b = *p;
        read_seqbegin_or_lock(&rename_lock, &seq);
        while (!IS_ROOT(dentry)) {   // repeat traverse until the current dentry is root
            const struct dentry *parent = dentry->d_parent;  // store its parent
    
            prefetch(parent);
            if (!prepend_name(&b, &dentry->d_name))  // prepend current dentry's name to the current buffer
                break;
            dentry = parent;  // move to its parent
        }
        if (!(seq & 1))
            rcu_read_unlock();
        if (need_seqretry(&rename_lock, seq)) {
            seq = 1;
            goto restart;
        }
        done_seqretry(&rename_lock, seq);
        if (b.len == p->len)
            prepend_char(&b, '/');
        return extract_string(&b);
    }
    

    In short, the given function traverses the dentry chains from the lowest (e.g., paff.txt) to the highest (e.g., /home). And every time it discovers new dentry, it reads the current dentry's string that constructs the full path and append it to the current buffer like below.

    paff.txt                    (discovers a new dentry)
    /paff.txt                   (manually append '/')
    knight/paff.txt             (discovers a new dentry)
    /knight/paff.txt            (manually append '/')
    home/knight/paff.txt        (discovers a new dentry)
    /home/knight/paff.txt       (manually append '/') ----> finished
    

    So the algorithm of constructing the full path is not so difficult, it just need to move string pointers within a buffer. However, eBPF verifier which statistically verify the code safety stringently doesn't allow many conventionally used C practices due to potential string overflow or other errors, making the implementation itself very difficult. For example, attaching '/' in front of the current buffer, moving buffer text to the front after constructing the full path (e.g., /home/knight/paff.txt\0 -> /home/knight/paff.txt\0 ) -- everything was a problem and I couldn't tackle these things at least in my capability.

    Since my eBPF development environment was a combination of C(eBPF/LSM) + Go(Cilium package), I used the following approach.

    1. Discover a new dentry and read its name (e.g., paff.txt) using BPF CORE read function.

    2. Send its name to the user space (to Go program) immediately, because Golang provides easy and safe methods to manipulate the string without the harsh eBPF verifier.

    3. Until the current dentry chain doesn't have nothing new, send all dentry's names to the user space like ["paff.txt", "knight", "home"], then, construct the full path. Done.

    They may not 'desirable', because it sends the data from kernel to user spaces repeatedly for a single full path construction. However, considering its relative occurrence rarity than file open or file read, I think additional overhead caused by this approach is acceptable. Additionally, it is easy to implement and there is no need to wrestle with eBPF verifier, too.

    I want to share my struggle and progress to someone who may confront this issue in the future, you can find my full implementation in C and Go at the following links.

    https://github.com/KnightChaser/lsm-bpf-capture-path-cilium/blob/baa573ee0db80872e514bed4528dbd7b1d4069bd/bpf/capture_inode_rename.bpf.c (C)

    https://github.com/KnightChaser/lsm-bpf-capture-path-cilium/blob/baa573ee0db80872e514bed4528dbd7b1d4069bd/bpf/capture_inode_rename.go (Go)

    Thanks for reading!