I'm struggling to find an answer to the question of how much latency overhead is added by calling an eBPF program attached to a kprobe and, more importantly, to a uprobe (where the overhead may be relatively larger than for a kprobe, since a kprobe implies a kernel call, which is already slow). Can you provide any information on that, and on what it depends on, apart from obvious things like CPU and kernel version?
I'm not aware of good benchmarks for this, possibly because so many parameters affect the results. It depends on what is being intercepted and measured, on what the BPF program does, on the kernel version and hardware, on Spectre mitigations, etc. All I can tell you is that uprobe overhead will be much higher, because execution has to cross into kernel space to run the BPF program.
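
Since the numbers depend so heavily on the setup, one way to get a figure for your own case is to measure it: time a tight loop of calls to the probed function with and without the probe attached, then divide the difference by the number of calls. A minimal sketch in Python, where the helper names `time_calls` and `per_call_overhead_ns` are made up for illustration and `os.getpid` is only a stand-in target (assumed here to issue one `getpid` syscall per call, which would suit a kprobe test). Attaching the probe itself is left to whatever tool you already use:

```python
import os
import time


def time_calls(fn, iterations):
    """Return total wall-clock nanoseconds spent calling fn() `iterations` times."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    return time.perf_counter_ns() - start


def per_call_overhead_ns(baseline_ns, probed_ns, iterations):
    """Average extra nanoseconds per call attributable to the probe."""
    return (probed_ns - baseline_ns) / iterations


if __name__ == "__main__":
    # Hypothetical iteration count; raise it if the result is too noisy.
    iterations = 1_000_000
    # os.getpid is an assumed stand-in for the function you actually probe.
    total_ns = time_calls(os.getpid, iterations)
    print(f"{iterations} calls took {total_ns} ns in total")
```

Run it once with no probe attached and once with the probe attached, then pass the two totals to `per_call_overhead_ns`. The interpreter's loop cost is present in both runs and cancels in the subtraction, but it adds noise, so for a uprobe on a cheap user-space function a C loop around that function would probably give a cleaner number.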