ebpf

Why do I need to include vmlinux.h when working with eBPF CO-RE?


Look at this eBPF code:

#include "vmlinux.h"
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

char LICENSE[] SEC("license") = "GPL";

SEC("lsm/path_unlink")
int BPF_PROG(path_unlink, const struct path *dir, struct dentry *dentry) {

    struct qstr q = BPF_CORE_READ(dentry, d_name); // dentry->d_name

    // ...

    return 0;
}

I don't understand something: BPF_CORE_READ uses CO-RE, and CO-RE lets us read kernel structures independently of the kernel version. But vmlinux.h was generated on my computer, so it contains the kernel structures for MY kernel version. I think I shouldn't include this file in my eBPF program when using CO-RE.

I have tried to remove vmlinux.h but I have compilation errors: struct dentry is unknown... What's the good practice when working with CO-RE? What should I do to compile this file without vmlinux.h?

Thanks


Solution

  • CO-RE relies on BTF type information. When you use BPF_CORE_READ, the compiler emits "CO-RE relocations". These relocations take a few forms, but the simplest is the offset of a struct field. At load time, libbpf needs to answer the question "What is the byte offset of field X in struct Y?" on the running kernel. Finding the field by name once we have the correct type is simple; finding the correct type when it might have changed shape is the challenge.

    To do this, libbpf takes the type the user defined and searches the kernel's BTF for it. The algorithm does not care about the full type, just the name and the fields needed for the relocation.

    The rough algorithm is:

    So we essentially only care about the fields actually referenced in your code and the parts of the type considered when checking "compatibility" (see the libbpf rules). You therefore do not need the whole vmlinux.h, just the types you use, and your structures can contain just the fields you use: https://nakryiko.com/posts/bpf-core-reference-guide/#defining-own-co-re-relocatable-type-definitions
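    To picture the load-time question "what is the byte offset of field X in struct Y", here is a small userspace analogy in plain C. It is only a sketch and not how libbpf is implemented: struct field_info, dentry_local, dentry_target, target_fields and resolve_offset are hypothetical names, and the field table stands in for the kernel's BTF.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical stand-in for the BTF description of one struct field. */
    struct field_info {
        const char *name;
        size_t offset;
    };

    /* What the program was compiled against: a minimal local definition. */
    struct dentry_local {
        long d_name;
    };

    /* What the running kernel might have: same field name, different position. */
    struct dentry_target {
        unsigned int d_flags;
        void *d_parent;
        long d_name;
    };

    /* Field table of the "kernel" type, playing the role of kernel BTF. */
    static const struct field_info target_fields[] = {
        { "d_flags",  offsetof(struct dentry_target, d_flags)  },
        { "d_parent", offsetof(struct dentry_target, d_parent) },
        { "d_name",   offsetof(struct dentry_target, d_name)   },
    };

    /* Resolve a field offset by name; returns -1 if the field does not exist. */
    static long resolve_offset(const struct field_info *fields, size_t n,
                               const char *name)
    {
        assert(name != NULL);
        for (size_t i = 0; i < n; i++) {
            if (strcmp(fields[i].name, name) == 0)
                return (long)fields[i].offset;
        }
        return -1;
    }
    ```

    The local struct only declares d_name, yet looking the field up by name in the target table yields the target's offset rather than the local one. That is the idea behind a field-offset relocation: the compiled offset is treated as a placeholder and patched once the real layout is known.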

    The reason for using a vmlinux.h is mostly simplicity. You have the full types and do not have to copy and redefine a type every time you want to use a new type or field. It also means users do not have to understand all of this underlying complexity when they first start.

    What's the good practice when working with CO-RE?

    In my opinion, manually defining the smallest types needed. The reason is that sometimes the kernel types change so significantly that you need to manually provide multiple versions of the type and use conditional logic to essentially query which type is valid on the current kernel. See https://nakryiko.com/posts/bpf-core-reference-guide/#handling-incompatible-field-and-type-changes for details. Defining your types manually makes this process easier.
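    As a sketch of what "multiple versions of the type plus conditional logic" can look like, here is the pattern described in the linked guide, using the task_struct state field as the illustration (assumed here to be named __state on newer kernels and state on older ones). The helper name read_task_state is hypothetical, and the snippet is meant to be built as a BPF object with clang's BPF target, not as a userspace program.

    ```c
    #include <linux/types.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_core_read.h>

    /* Newer layout: only the field we read is declared. */
    struct task_struct {
        unsigned int __state;
    } __attribute__((preserve_access_index));

    /* Older layout. The ___old suffix is a "flavor": libbpf ignores everything
     * from the triple underscore on when matching the name against kernel BTF. */
    struct task_struct___old {
        long state;
    } __attribute__((preserve_access_index));

    /* Hypothetical helper: pick whichever field exists on the running kernel. */
    static __always_inline long read_task_state(struct task_struct *t)
    {
        if (bpf_core_field_exists(t->__state))
            return BPF_CORE_READ(t, __state);

        struct task_struct___old *t_old = (void *)t;
        return BPF_CORE_READ(t_old, state);
    }
    ```

    Because both definitions are tiny and live next to the code that uses them, it is easy to see which kernel layouts the program supports.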

    The second reason is that when working with git, it's nice not to have to track a vmlinux.h or ask users to regenerate it locally.

    I have tried to remove vmlinux.h but I have compilation errors: struct dentry is unknown... [...] What should I do to compile this file without vmlinux.h?

    You should only need to define the following:

    #include <linux/types.h> /* __u32, __u64: the u32/u64 aliases come from vmlinux.h */

    struct qstr {
        union {
            struct {
                __u32 hash;
                __u32 len;
            };
            __u64 hash_len;
        };
        const unsigned char *name;
    };

    struct dentry {
        struct qstr d_name;
    } __attribute__((preserve_access_index));
    

    Note: In the actual kernel definition, a macro orders hash and len based on endianness. I left that out for brevity.
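    Putting it together, a minimal sketch of the program from the question without vmlinux.h could look like this. It assumes the libbpf headers are installed, uses the UAPI __u32/__u64 types because u32/u64 are normally provided by vmlinux.h, and forward-declares struct path since it is only used as an opaque pointer. It needs to be built as a BPF object with clang's BPF target and with debug info enabled so that BTF and the relocations are emitted.

    ```c
    #include <linux/types.h>        /* __u32, __u64 */
    #include <bpf/bpf_helpers.h>    /* SEC() */
    #include <bpf/bpf_tracing.h>    /* BPF_PROG() */
    #include <bpf/bpf_core_read.h>  /* BPF_CORE_READ() */

    struct path; /* opaque: never dereferenced in this program */

    /* Copied by value below, so its size should match the kernel's layout. */
    struct qstr {
        union {
            struct {
                __u32 hash;
                __u32 len;
            };
            __u64 hash_len;
        };
        const unsigned char *name;
    };

    /* Only the field we access; its offset is relocated at load time. */
    struct dentry {
        struct qstr d_name;
    } __attribute__((preserve_access_index));

    char LICENSE[] SEC("license") = "GPL";

    SEC("lsm/path_unlink")
    int BPF_PROG(path_unlink, const struct path *dir, struct dentry *dentry)
    {
        struct qstr q = BPF_CORE_READ(dentry, d_name); /* dentry->d_name */

        /* ... */

        return 0;
    }
    ```

    The local struct dentry has d_name at offset 0, which is not where the kernel keeps it; the relocation recorded for the BPF_CORE_READ access is what lets the loader substitute the real offset.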