
Why use BPF_F_LOCK if XDP map operations are thread safe?


I found another post asking whether eBPF/XDP map operations are thread safe. From the answer to that post and other sources I found, it seems that map operations are thread safe from both the userspace and kernel sides.

I understand that bpf_spin_lock is useful in the kernel, since we can lock a map entry and modify several of its values while the lock is held. However, I am confused as to why we can set the BPF_F_LOCK flag in userspace syscalls to hold the lock.

Since map operations are thread safe, why would we need to hold the lock in a bpf_map_update_elem syscall?

Also, since syscalls return a copy of the data instead of a reference to it, there is no other way to change a value from userspace than the bpf_map_update_elem syscall, right?


Solution

  • Even if userspace is making a copy of the map value, it needs to hold the lock to ensure it's not copying the value in the middle of kernel writes.

    Example. Let's say you have a map counting packets and bytes. Because you want to ensure a coherent view of packets and bytes, in the kernel you hold a spin lock before incrementing both values:

    1: bpf_spin_lock(&value->lock);
    2: value->packets += 1;
    3: value->bytes += skb->len;
    4: bpf_spin_unlock(&value->lock);
    

    If your userspace process didn't hold the lock when copying the map value, it may end up with a value in which value->bytes corresponds to the situation at packet n and value->packets corresponds to the situation at packet n+1.

    Map Updates. The above assumes userspace is doing a map lookup, but the same problem exists for map updates.

    Let's say your userspace process wants to regularly reset the counters. If it didn't hold the lock when resetting the values, the reset may land between lines 2 and 3 of the above BPF program. Line 3 then adds the packet's length to the freshly zeroed value, so value->packets will be zero but value->bytes won't. The userspace process thus needs to hold the lock for updates as well.
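
As a rough sketch of both cases from the userspace side, the code below issues the raw bpf(2) syscall with BPF_F_LOCK set for a lookup and for a reset. It assumes a hash or array map whose value has the layout of the hypothetical struct counters below (matching the fields used by the BPF program above), created with BTF so the kernel knows where the bpf_spin_lock sits, and a kernel recent enough to support BPF_F_LOCK. The helper names map_elem_cmd, counters_read_locked, and counters_reset_locked are made up for illustration; libbpf's bpf_map_lookup_elem_flags and bpf_map_update_elem accept the same flag.

```c
#include <linux/bpf.h>   /* union bpf_attr, enum bpf_cmd, BPF_F_LOCK, struct bpf_spin_lock */
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed value layout (hypothetical name), mirroring value->lock,
 * value->packets and value->bytes in the BPF program above. */
struct counters {
    struct bpf_spin_lock lock;
    __u64 packets;
    __u64 bytes;
};

/* Thin wrapper around bpf(2) for the BPF_MAP_*_ELEM commands.
 * Returns the syscall result: 0 on success, -1 with errno set on failure. */
int map_elem_cmd(enum bpf_cmd cmd, int map_fd, const void *key,
                 void *value, __u64 flags)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_fd = (__u32)map_fd;
    attr.key = (__u64)(unsigned long)key;
    attr.value = (__u64)(unsigned long)value;
    attr.flags = flags;
    return (int)syscall(__NR_bpf, cmd, &attr, sizeof(attr));
}

/* Lookup: BPF_F_LOCK asks the kernel to take the entry's spin lock while
 * it copies the value out, so packets and bytes come from the same moment. */
int counters_read_locked(int map_fd, const void *key, struct counters *out)
{
    return map_elem_cmd(BPF_MAP_LOOKUP_ELEM, map_fd, key, out, BPF_F_LOCK);
}

/* Update: overwrite an existing entry with zeros while holding its lock,
 * so the reset cannot land between the two increments in the BPF program. */
int counters_reset_locked(int map_fd, const void *key)
{
    struct counters zero;

    memset(&zero, 0, sizeof(zero));
    return map_elem_cmd(BPF_MAP_UPDATE_ELEM, map_fd, key, &zero,
                        BPF_EXIST | BPF_F_LOCK);
}
```

Here map_fd would be a file descriptor for the map, for instance one obtained from a pinned path or from the loader. Only the packets and bytes fields of the copied value should be read; the lock field in the userspace copy should not be relied on.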