Skip to content

Conversation

@kernel-patches-daemon-bpf
Copy link

Pull request for series with
subject: bpf: Recognize special arithmetic shift in the verifier
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1023786

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 4f7bc83
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1023786
version: 2

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 6cc73f3
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1023786
version: 2

@kernel-patches-review-bot
Copy link

commit a5043b2c9e905e8b9fdb9b479b136f5ef5f8415d
Author: Alexei Starovoitov <[email protected]>

selftests/bpf: Add tests for s>>=31 and s>>=63

Add tests for special arithmetic shift right.

> diff --git a/tools/testing/selftests/bpf/progs/verifier_subreg.c b/tools/testing/selftests/bpf/progs/verifier_subreg.c
> index 8613ea160..0b572c067 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_subreg.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_subreg.c
> @@ -670,4 +670,47 @@ __naked void ldx_w_zero_extend_check(void)
>  	: __clobber_all);
>  }
>
> +SEC("socket")
> +__description("s>>=31")
> +__success __success_unpriv __retval(0)
> +__naked void arsh_31(void)
> +{
> +	/* Below is what LLVM generates in cilium's bpf_wiregard.o */
> +	asm volatile ("					\
> +	call %[bpf_get_prandom_u32];			\
> +	w2 = w0;					\
> +	w2 s>>= 31;					\
> +	w2 &= -134; /* w2 becomes 0 or -134 */		\
> +	if w2 s> -1 goto +2;				\
> +	if w2 != -136 goto +1;				\
                   ^^^^

Should this be -134 instead of -136?  The comment on the previous line
states that w2 becomes either 0 or -134 after the AND operation, so w2
can never equal -136.  This means the condition is always true and the
division by zero at "w0 /= 0" is never reachable.

In arsh_31(), after the arithmetic right shift by 31 bits, w2 will be
either 0 (if the sign bit was 0) or -1 (if the sign bit was 1). Then
"w2 &= -134" produces either 0 or -134, never -136.

> +	w0 /= 0;					\
> +	w0 = 0;						\
> +	exit;						\
> +"	:
> +	: __imm(bpf_get_prandom_u32)
> +	: __clobber_all);
> +}
> +
> +SEC("socket")
> +__description("s>>=63")
> +__success __success_unpriv __retval(0)
> +__naked void arsh_63(void)
> +{
> +	/* Copy of arsh_31 with s/w/r/ */
> +	asm volatile ("					\
> +	call %[bpf_get_prandom_u32];			\
> +	r2 = r0;					\
> +	r2 <<= 32;					\
> +	r2 s>>= 63;					\
> +	r2 &= -134;					\
> +	if r2 s> -1 goto +2;				\
> +	if r2 != -136 goto +1;				\
                   ^^^^

Same issue here - should this be -134?  After the shift and AND
operations, r2 can only be 0 or -134, making the -136 check always true
and the "r0 /= 0" unreachable.

In arsh_63(), after left-shifting by 32 bits and then arithmetic
right-shifting by 63 bits, r2 becomes either 0 or -1. Then "r2 &= -134"
produces either 0 or -134, never -136.

> +	r0 /= 0;					\
> +	r0 = 0;						\
> +	exit;						\
> +"	:
> +	: __imm(bpf_get_prandom_u32)
> +	: __clobber_all);
> +}
> +
>  char _license[] SEC("license") = "GPL";



AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: selftests/bpf: Add tests for s>>=31 and s>>=63
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19383126218

Alexei Starovoitov added 2 commits November 14, 2025 19:01
cilium bpf_wiregard.bpf.c when compiled with -O1 fails to load
with the following verifier log:

192: (79) r2 = *(u64 *)(r10 -304)     ; R2=pkt(r=40) R10=fp0 fp-304=pkt(r=40)
...
227: (85) call bpf_skb_store_bytes#9          ; R0=scalar()
228: (bc) w2 = w0                     ; R0=scalar() R2=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
229: (c4) w2 s>>= 31                  ; R2=scalar(smin=0,smax=umax=0xffffffff,smin32=-1,smax32=0,var_off=(0x0; 0xffffffff))
230: (54) w2 &= -134                  ; R2=scalar(smin=0,smax=umax=umax32=0xffffff7a,smax32=0x7fffff7a,var_off=(0x0; 0xffffff7a))
...
232: (66) if w2 s> 0xffffffff goto pc+125     ; R2=scalar(smin=umin=umin32=0x80000000,smax=umax=umax32=0xffffff7a,smax32=-134,var_off=(0x80000000; 0x7fffff7a))
...
238: (79) r4 = *(u64 *)(r10 -304)     ; R4=scalar() R10=fp0 fp-304=scalar()
239: (56) if w2 != 0xffffff78 goto pc+210     ; R2=0xffffff78 // -136
...
258: (71) r1 = *(u8 *)(r4 +0)
R4 invalid mem access 'scalar'

The error might confuse most bpf authors, since fp-304 slot had 'pkt'
pointer at insn 192 and became 'scalar' at 238. That happened because
bpf_skb_store_bytes() clears all packet pointers including those in
the stack. On the first glance it might look like a bug in the source
code, since ctx->data pointer should have been reloaded after the call
to bpf_skb_store_bytes().

The relevant part of cilium source code looks like this:

// bpf/lib/nodeport.h
int dsr_set_ipip6()
{
	if (ctx_adjust_hroom(...))
		return DROP_INVALID; // -134
	if (ctx_store_bytes(...))
		return DROP_WRITE_ERROR; // -141
	return 0;
}

bool dsr_fail_needs_reply(int code)
{
	if (code == DROP_FRAG_NEEDED) // -136
		return true;
	return false;
}

tail_nodeport_ipv6_dsr()
{
	ret = dsr_set_ipip6(...);
	if (!IS_ERR(ret)) {
		...
	} else {
		if (dsr_fail_needs_reply(ret))
			return dsr_reply_icmp6(...);
	}
}

The code doesn't have arithmetic shift by 31 and it reloads ctx->data
every time it needs to access it. So it's not a bug in the source code.

The reason is DAGCombiner::foldSelectCCToShiftAnd() LLVM transformation:

  // If this is a select where the false operand is zero and the compare is a
  // check of the sign bit, see if we can perform the "gzip trick":
  // select_cc setlt X, 0, A, 0 -> and (sra X, size(X)-1), A
  // select_cc setgt X, 0, A, 0 -> and (not (sra X, size(X)-1)), A

The conditional branch in dsr_set_ipip6() and its return values
are optimized into BPF_ARSH plus BPF_AND:

227: (85) call bpf_skb_store_bytes#9
228: (bc) w2 = w0
229: (c4) w2 s>>= 31   ; R2=scalar(smin=0,smax=umax=0xffffffff,smin32=-1,smax32=0,var_off=(0x0; 0xffffffff))
230: (54) w2 &= -134   ; R2=scalar(smin=0,smax=umax=umax32=0xffffff7a,smax32=0x7fffff7a,var_off=(0x0; 0xffffff7a))

after insn 230 the register w2 can only be 0 or -134,
but the verifier approximates it, since there is no way to
represent two scalars in bpf_reg_state.
After fallthough at insn 232 the w2 can only be -134,
hence the branch at insn
239: (56) if w2 != -136 goto pc+210
should be always taken, and trapping insn 258 should never execute.
LLVM generated correct code, but the verifier follows impossible
path and rejects valid program. To fix this issue recognize this
special LLVM optimization and fork the verifier state.
So after insn 229: (c4) w2 s>>= 31
the verifier has two states to explore:
one with w2 = 0 and another with w2 = 0xffffffff
which makes the verifier accept bpf_wiregard.c

Note there are 20+ such patterns in bpf_wiregard.o compiled
with -O1 and -O2, but they're rarely seen in other production
bpf programs, so push_stack() approach is not a concern.

Reported-by: Hao Sun <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Add tests for special arithmetic shift right.

Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 4722981
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1023786
version: 2

@kernel-patches-daemon-bpf
Copy link
Author

Forwarding comment 3535471021 via email
In-Reply-To: [email protected]
Patch: https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant