
Commit 0e9f684

borkmann authored and Alexei Starovoitov committed
bpf, libbpf: Add bpf_tail_call_static helper for bpf programs
Port of the tail_call_static() helper function from Cilium's BPF code base [0] to libbpf, so others can easily consume it as well. We've been using this in production code for some time now. The main idea is that we guarantee that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the JITed BPF insns with direct jumps instead of having to fall back to using expensive retpolines.

By using inline asm, we guarantee that the compiler won't merge the call from different paths with potentially different contents of r2/r3.

We're also using Cilium's __throw_build_bug() macro (here as: __bpf_unreachable()) in different places as a neat trick to trigger compilation errors when the compiler does not remove code at compilation time. This works for the BPF back end as it does not implement __builtin_trap().

[0] cilium/cilium@f5537c2

Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/1656a082e077552eb46642d513b4a6bde9a7dd01.1601477936.git.daniel@iogearbox.net
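For context, a minimal usage sketch of the helper (not part of this commit): a BPF program tail-calling into slot 0 of a prog array map. The map name, section name, and program body below are illustrative assumptions, not code from this change.

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 2);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

SEC("xdp")
int xdp_entry(struct xdp_md *ctx)
{
	/* Slot 0 is a compile-time constant, so the verifier can track it and
	 * the x86-64 JIT can patch the tail call into a direct jump instead
	 * of a retpolined indirect call.
	 */
	bpf_tail_call_static(ctx, &jmp_table, 0);

	/* Only reached if the tail call fails, e.g. the slot is empty. */
	return XDP_PASS;
}

Because the slot argument must be a compile-time constant, passing a runtime value would leave the __bpf_unreachable() call in place and fail the build, which is the intended safety net.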
1 parent b4ab314 commit 0e9f684

File tree

1 file changed: 46 additions, 0 deletions


tools/lib/bpf/bpf_helpers.h

Lines changed: 46 additions & 0 deletions
@@ -53,6 +53,52 @@
 })
 #endif
 
+/*
+ * Helper macro to throw a compilation error if __bpf_unreachable() gets
+ * built into the resulting code. This works given BPF back end does not
+ * implement __builtin_trap(). This is useful to assert that certain paths
+ * of the program code are never used and hence eliminated by the compiler.
+ *
+ * For example, consider a switch statement that covers known cases used by
+ * the program. __bpf_unreachable() can then reside in the default case. If
+ * the program gets extended such that a case is not covered in the switch
+ * statement, then it will throw a build error due to the default case not
+ * being compiled out.
+ */
+#ifndef __bpf_unreachable
+# define __bpf_unreachable() __builtin_trap()
+#endif
+
+/*
+ * Helper function to perform a tail call with a constant/immediate map slot.
+ */
+static __always_inline void
+bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
+{
+	if (!__builtin_constant_p(slot))
+		__bpf_unreachable();
+
+	/*
+	 * Provide a hard guarantee that LLVM won't optimize setting r2 (map
+	 * pointer) and r3 (constant map index) from _different paths_ ending
+	 * up at the _same_ call insn as otherwise we won't be able to use the
+	 * jmpq/nopl retpoline-free patching by the x86-64 JIT in the kernel
+	 * given they mismatch. See also d2e4c1e6c294 ("bpf: Constant map key
+	 * tracking for prog array pokes") for details on verifier tracking.
+	 *
+	 * Note on clobber list: we need to stay in-line with BPF calling
+	 * convention, so even if we don't end up using r0, r4, r5, we need
+	 * to mark them as clobber so that LLVM doesn't end up using them
+	 * before / after the call.
+	 */
+	asm volatile("r1 = %[ctx]\n\t"
+		     "r2 = %[map]\n\t"
+		     "r3 = %[slot]\n\t"
+		     "call 12"
+		     :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
+		     : "r0", "r1", "r2", "r3", "r4", "r5");
+}
+
 /*
  * Helper structure used by eBPF C program
  * to describe BPF map attributes to libbpf loader
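The __bpf_unreachable() macro added above is intended for the pattern its comment describes: place it on paths that the compiler must prove dead. A small illustrative sketch (the enum and function names are made up for this example, not part of the commit):

enum proto { PROTO_TCP, PROTO_UDP };

static __always_inline int proto_to_num(enum proto p)
{
	switch (p) {
	case PROTO_TCP:
		return 6;
	case PROTO_UDP:
		return 17;
	default:
		/* As long as every call site passes a known constant covered
		 * by a case above, LLVM removes this branch and the program
		 * builds. If a new enum value is used without a matching
		 * case, __builtin_trap() survives into the BPF back end,
		 * which cannot emit it, and compilation fails.
		 */
		__bpf_unreachable();
	}
}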

0 commit comments
