Tags: compiler-construction, clang, inline, ebpf

clang bpf: attribute always_inline does not work


I wrote a BPF object file that contains one section and a static inline function, defined as below:

static inline __attribute__((always_inline)) bpf_call_func(...);
__section("entry") bpf_func(...); // called bpf_call_func

It worked well, and llvm-objdump showed that bpf_call_func had been inlined.

But when I defined another section in the same object file that also called bpf_call_func:

static inline __attribute__((always_inline)) bpf_call_func(...);
__section("entry") bpf_func(...); // called bpf_call_func
__section("entry2") bpf_func2(...); // called bpf_call_func

llvm-objdump showed that bpf_call_func was inlined into neither bpf_func nor bpf_func2. It was simply defined in the .text section, and both bpf_func and bpf_func2 used a call instruction to reach it.
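For reference, the layout described above can be sketched as follows. This is only an illustrative reproduction, not the original code: the `__section` macro definition, the `int` signatures, and the function bodies are placeholders I made up, since the real prototypes are elided in the question.

```c
/* Hypothetical reproduction of the two-section layout.
 * The macro, the int signatures, and the bodies are assumptions;
 * the real functions are much larger (hundreds of instructions). */
#define __section(NAME) __attribute__((section(NAME), used))

/* Shared helper that is expected to be inlined into every caller. */
static inline __attribute__((always_inline)) int bpf_call_func(int x)
{
    return x * 2 + 1;
}

/* First program, placed in ELF section "entry". */
__section("entry") int bpf_func(int x)
{
    return bpf_call_func(x) + 1;
}

/* Second program, placed in ELF section "entry2". */
__section("entry2") int bpf_func2(int x)
{
    return bpf_call_func(x) + 2;
}
```

Compiling a file like this with something along the lines of `clang -O2 -target bpf -c` and disassembling it with `llvm-objdump -d` should show whether `entry` and `entry2` contain a call instruction or the inlined body. A helper this small may well behave differently from the 600-instruction one in the question.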

bpf_call_func is about 600 instructions. bpf_func and bpf_func2 are about 250 instructions.

I looked at the GCC manual, which says:

Note that certain usages in a function definition can make it unsuitable for inline substitution. Among these usages are: variadic functions, use of alloca, use of computed goto (see Labels as Values), use of nonlocal goto, use of nested functions, use of setjmp, use of __builtin_longjmp and use of __builtin_return or __builtin_apply_args. Using -Winline warns when a function marked inline could not be substituted, and gives the reason for the failure.

But I don't know which of these conditions, if any, matches my case.

Why is bpf_call_func not inlined when two sections call it? Is it related to the number of instructions in bpf_call_func?


Solution

  • From what I can find, there is no way to actually force clang to inline a function. This is the clang reference for always_inline:

    Inlining heuristics are disabled and inlining is always attempted regardless of optimization level.

    Does not guarantee that inline substitution actually occurs.

    This seems to be specific to clang, since GCC states that it will always inline as the attribute suggests, or raise an error (for calls within a unit):

    always_inline

    Generally, functions are not inlined unless optimization is specified. For functions declared inline, this attribute inlines the function independent of any restrictions that otherwise apply to inlining. Failure to inline such a function is diagnosed as an error. Note that if such a function is called indirectly the compiler may or may not inline it depending on optimization level and a failure to inline an indirect call may or may not be diagnosed.

    GCC provides a -Winline flag so the compiler warns about functions that were not inlined, but clang ignores this:

    -Winline

    This diagnostic flag exists for GCC compatibility, and has no effect in Clang.

    So it seems that clang treats the always_inline attribute as a hint and will happily skip inlining a function without any error or warning. In your case it likely decided that your inline function is too large.

    And to be fair, unless you need to support kernels older than 4.16, it doesn't matter that much, since eBPF supports function calls nowadays.
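If relying on function calls is acceptable, one option is to make that choice explicit instead of leaving it to the compiler. The sketch below assumes the same hypothetical `__section` macro and placeholder `int` signatures as a stand-in for the real code, and uses the standard `noinline` attribute so the helper stays a separate function that both programs call.

```c
/* Hypothetical sketch: keep the helper out of line on purpose.
 * The macro, signatures, and bodies are placeholders, not the original code. */
#define __section(NAME) __attribute__((section(NAME), used))

/* noinline asks the compiler to emit one shared copy of the helper. */
static __attribute__((noinline)) int bpf_call_func(int x)
{
    return x * 2 + 1;
}

/* Both programs reach the helper through an ordinary call. */
__section("entry") int bpf_func(int x)
{
    return bpf_call_func(x) + 1;
}

__section("entry2") int bpf_func2(int x)
{
    return bpf_call_func(x) + 2;
}
```

With this layout the object file is expected to hold a single copy of the helper rather than one copy per program, which also keeps each program smaller. Whether the loader and kernel in use accept such calls still has to be checked for the target environment.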