-
Notifications
You must be signed in to change notification settings - Fork 131
xsk: do not discard packet when NETDEV_TX_BUSY #72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Master branch: 642e450 Pull request is NOT updated. Failed to apply https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/, error message was: |
At least one diff in series https://patchwork.ozlabs.org/project/netdev/list/?series=202227 irrelevant now. Closing PR. |
The test case btrfs/238 reports the warning below: WARNING: CPU: 3 PID: 481 at fs/btrfs/super.c:2509 btrfs_show_devname+0x104/0x1e8 [btrfs] CPU: 2 PID: 1 Comm: systemd Tainted: G W O 5.14.0-rc1-custom #72 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 Call trace: btrfs_show_devname+0x108/0x1b4 [btrfs] show_mountinfo+0x234/0x2c4 m_show+0x28/0x34 seq_read_iter+0x12c/0x3c4 vfs_read+0x29c/0x2c8 ksys_read+0x80/0xec __arm64_sys_read+0x28/0x34 invoke_syscall+0x50/0xf8 do_el0_svc+0x88/0x138 el0_svc+0x2c/0x8c el0t_64_sync_handler+0x84/0xe4 el0t_64_sync+0x198/0x19c Reason: While btrfs_prepare_sprout() moves the fs_devices::devices into fs_devices::seed_list, the btrfs_show_devname() searches for the devices and found none, leading to the warning as in above. Fix: latest_dev is updated according to the changes to the device list. That means we could use the latest_dev->name to show the device name in /proc/self/mounts, the pointer will be always valid as it's assigned before the device is deleted from the list in remove or replace. The RCU protection is sufficient as the device structure is freed after synchronization. Reported-by: Su Yue <[email protected]> Tested-by: Su Yue <[email protected]> Signed-off-by: Anand Jain <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
This patch aims to partially prevent a potential infinite loop issue caused by combination of tailcal and freplace. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [kernel-patches#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty kernel-patches#72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog cannot be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch aims to partially prevent a potential infinite loop issue caused by combination of tailcal and freplace. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [kernel-patches#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty kernel-patches#72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [kernel-patches#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty kernel-patches#72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. BTW, do a minor style fix that converts 8-spaces to a tab. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [kernel-patches#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty kernel-patches#72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. BTW, fix a minor style issue by replacing 8-spaces with a tab. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents a potential infinite loop issue caused by combination of tailcal and freplace partially. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [kernel-patches#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty kernel-patches#72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. BTW, fix a minor style issue by replacing 8-spaces with a tab. Signed-off-by: Leon Hwang <[email protected]>
This patch partially prevents a potential infinite loop issue caused by combination of tailcal and freplace. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; bpf_tail_call_static(skb, &jmp_table, 0); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in entry_freplace is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic, like: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. BTW, fix a minor style issue by replacing 8-spaces with a tab. Signed-off-by: Leon Hwang <[email protected]>
This patch partially prevents a potential infinite loop issue caused by combination of tailcal and freplace. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; bpf_tail_call_static(skb, &jmp_table, 0); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in entry_freplace is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic, like: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch partially prevents this panic by preventing updating extended prog to prog_array map. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. Alongside next patch, the panic will be prevented completely. BTW, fix a minor style issue by replacing 8-spaces with a tab. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents an infinite loop issue caused by combination of tailcall and freplace. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; bpf_tail_call_static(skb, &jmp_table, 0); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in entry_freplace is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic, like: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [kernel-patches#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty kernel-patches#72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch prevents this panic by preventing updating extended prog to prog_array map and preventing extending a prog, which has been updated to prog_array map, with freplace prog. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. If a prog has been updated to prog_array map, it or its subprog can not be extended by freplace prog. BTW, fix a minor code style issue by replacing 8 spaces with a tab. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents an infinite loop issue caused by combination of tailcall and freplace. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; bpf_tail_call_static(skb, &jmp_table, 0); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in entry_freplace is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic, like: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch prevents this panic by preventing updating extended prog to prog_array map and preventing extending a prog, which has been updated to prog_array map, with freplace prog. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. If a prog has been updated to prog_array map, it or its subprog can not be extended by freplace prog. BTW, fix a minor code style issue by replacing 8 spaces with a tab. Signed-off-by: Leon Hwang <[email protected]>
This patch prevents an infinite loop issue caused by combination of tailcall and freplace. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; bpf_tail_call_static(skb, &jmp_table, 0); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in entry_freplace is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic, like: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [kernel-patches#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty kernel-patches#72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch prevents this panic by preventing updating extended prog to prog_array map and preventing extending a prog, which has been updated to prog_array map, with freplace prog. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. If a prog has been updated to prog_array map, it or its subprog can not be extended by freplace prog. BTW, fix a minor code style issue by replacing 8 spaces with a tab. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Leon Hwang <[email protected]>
This patch prevents an infinite loop issue caused by combination of tailcall and freplace. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; bpf_tail_call_static(skb, &jmp_table, 0); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in entry_freplace is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic, like: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch prevents this panic by preventing updating extended prog to prog_array map and preventing extending a prog, which has been updated to prog_array map, with freplace prog. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. If a prog has been updated to prog_array map, it or its subprog can not be extended by freplace prog. BTW, fix a minor code style issue by replacing 8 spaces with a tab. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Leon Hwang <[email protected]>
This patch prevents an infinite loop issue caused by combination of tailcall and freplace. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include <linux/bpf.h> \#include <bpf/bpf_helpers.h> struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; bpf_tail_call_static(skb, &jmp_table, 0); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in entry_freplace is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic, like: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] </#DF> [ 15.310490] <TASK> [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] </TASK> [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch prevents this panic by preventing updating extended prog to prog_array map and preventing extending a prog, which has been updated to prog_array map, with freplace prog. If a prog or its subprog has been extended by freplace prog, the prog can not be updated to prog_array map. If a prog has been updated to prog_array map, it or its subprog can not be extended by freplace prog. BTW, fix a minor code style issue by replacing 8 spaces with a tab. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Leon Hwang <[email protected]>
Pull request for series with
subject: xsk: do not discard packet when NETDEV_TX_BUSY
version: 5
url: https://patchwork.ozlabs.org/project/netdev/list/?series=202227