s390/bpf: Fully order atomic "add", "and", "or" and "xor" #6969
Conversation
Veth calls skb_pp_cow_data(), which makes the underlying memory originate from the system page_pool. With CONFIG_DEBUG_VM=y and an XDP program that uses bpf_xdp_adjust_tail(), the following splat was observed:

[   32.204881] BUG: Bad page state in process test_progs  pfn:11c98b
[   32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b
[   32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff)
[   32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000
[   32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000
[   32.220900] page dumped because: page_pool leak
[   32.222636] Modules linked in: bpf_testmod(O) bpf_preload
[   32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G           O        6.17.0-rc5-gfec474d29325 #6969 PREEMPT
[   32.224638] Tainted: [O]=OOT_MODULE
[   32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[   32.224641] Call Trace:
[   32.224644]  <IRQ>
[   32.224646]  dump_stack_lvl+0x4b/0x70
[   32.224653]  bad_page.cold+0xbd/0xe0
[   32.224657]  __free_frozen_pages+0x838/0x10b0
[   32.224660]  ? skb_pp_cow_data+0x782/0xc30
[   32.224665]  bpf_xdp_shrink_data+0x221/0x530
[   32.224668]  ? skb_pp_cow_data+0x6d1/0xc30
[   32.224671]  bpf_xdp_adjust_tail+0x598/0x810
[   32.224673]  ? xsk_destruct_skb+0x321/0x800
[   32.224678]  bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6
[   32.224681]  veth_xdp_rcv_skb+0x45d/0x15a0
[   32.224684]  ? get_stack_info_noinstr+0x16/0xe0
[   32.224688]  ? veth_set_channels+0x920/0x920
[   32.224691]  ? get_stack_info+0x2f/0x80
[   32.224693]  ? unwind_next_frame+0x3af/0x1df0
[   32.224697]  veth_xdp_rcv.constprop.0+0x38a/0xbe0
[   32.224700]  ? common_startup_64+0x13e/0x148
[   32.224703]  ? veth_xdp_rcv_one+0xcd0/0xcd0
[   32.224706]  ? stack_trace_save+0x84/0xa0
[   32.224709]  ? stack_depot_save_flags+0x28/0x820
[   32.224713]  ? __resched_curr.constprop.0+0x332/0x3b0
[   32.224716]  ? timerqueue_add+0x217/0x320
[   32.224719]  veth_poll+0x115/0x5e0
[   32.224722]  ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0
[   32.224726]  ? update_load_avg+0x1cb/0x12d0
[   32.224730]  ? update_cfs_group+0x121/0x2c0
[   32.224733]  __napi_poll+0xa0/0x420
[   32.224736]  net_rx_action+0x901/0xe90
[   32.224740]  ? run_backlog_napi+0x50/0x50
[   32.224743]  ? clockevents_program_event+0x1cc/0x280
[   32.224746]  ? hrtimer_interrupt+0x31e/0x7c0
[   32.224749]  handle_softirqs+0x151/0x430
[   32.224752]  do_softirq+0x3f/0x60
[   32.224755]  </IRQ>

This happens because an xdp_rxq whose memory model was registered as MEM_TYPE_PAGE_SHARED was used when initializing the xdp_buff. Fix this by using the new helper xdp_update_mem_type(), which checks whether the page backing the linear part of the xdp_buff comes from a page_pool. We assume that the linear data and the frags share the same memory provider, since the current XDP API gives us no way to distinguish them (the memory model is registered for the *whole* Rx queue, while here we are dealing with single-buffer granularity).

Fixes: 0ebab78 ("net: veth: add page_pool for page recycling")
Reported-by: Alexei Starovoitov <[email protected]>
Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/
Signed-off-by: Maciej Fijalkowski <[email protected]>
Veth calls skb_pp_cow_data() which makes the underlying memory to originate from system page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail(), following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? 
timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_update_mem_type() that will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
Veth calls skb_pp_cow_data() which makes the underlying memory to originate from system page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail(), following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? 
timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_update_mem_type() that will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
Veth calls skb_pp_cow_data() which makes the underlying memory to originate from system page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail(), following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? 
timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_update_mem_type() that will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
Veth calls skb_pp_cow_data() which makes the underlying memory to originate from system page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail(), following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? 
timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_update_mem_type() that will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
Veth calls skb_pp_cow_data() which makes the underlying memory to originate from system page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail(), following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? 
timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_update_mem_type() that will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
Veth calls skb_pp_cow_data() which makes the underlying memory to originate from system page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail(), following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? 
timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_update_mem_type() that will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
Veth calls skb_pp_cow_data() which makes the underlying memory to originate from system page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail(), following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? 
timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_update_mem_type() that will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
Veth calls skb_pp_cow_data() which makes the underlying memory to originate from system page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail(), following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? 
timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_update_mem_type() that will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
Veth calls skb_pp_cow_data(), which makes the underlying memory originate from the system page_pool. With CONFIG_DEBUG_VM=y and an XDP program that uses bpf_xdp_adjust_tail(), the following splat was observed:

[ 32.204881] BUG: Bad page state in process test_progs  pfn:11c98b
[ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b
[ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff)
[ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000
[ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000
[ 32.220900] page dumped because: page_pool leak
[ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload
[ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT
[ 32.224638] Tainted: [O]=OOT_MODULE
[ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 32.224641] Call Trace:
[ 32.224644]  <IRQ>
[ 32.224646]  dump_stack_lvl+0x4b/0x70
[ 32.224653]  bad_page.cold+0xbd/0xe0
[ 32.224657]  __free_frozen_pages+0x838/0x10b0
[ 32.224660]  ? skb_pp_cow_data+0x782/0xc30
[ 32.224665]  bpf_xdp_shrink_data+0x221/0x530
[ 32.224668]  ? skb_pp_cow_data+0x6d1/0xc30
[ 32.224671]  bpf_xdp_adjust_tail+0x598/0x810
[ 32.224673]  ? xsk_destruct_skb+0x321/0x800
[ 32.224678]  bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6
[ 32.224681]  veth_xdp_rcv_skb+0x45d/0x15a0
[ 32.224684]  ? get_stack_info_noinstr+0x16/0xe0
[ 32.224688]  ? veth_set_channels+0x920/0x920
[ 32.224691]  ? get_stack_info+0x2f/0x80
[ 32.224693]  ? unwind_next_frame+0x3af/0x1df0
[ 32.224697]  veth_xdp_rcv.constprop.0+0x38a/0xbe0
[ 32.224700]  ? common_startup_64+0x13e/0x148
[ 32.224703]  ? veth_xdp_rcv_one+0xcd0/0xcd0
[ 32.224706]  ? stack_trace_save+0x84/0xa0
[ 32.224709]  ? stack_depot_save_flags+0x28/0x820
[ 32.224713]  ? __resched_curr.constprop.0+0x332/0x3b0
[ 32.224716]  ? timerqueue_add+0x217/0x320
[ 32.224719]  veth_poll+0x115/0x5e0
[ 32.224722]  ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0
[ 32.224726]  ? update_load_avg+0x1cb/0x12d0
[ 32.224730]  ? update_cfs_group+0x121/0x2c0
[ 32.224733]  __napi_poll+0xa0/0x420
[ 32.224736]  net_rx_action+0x901/0xe90
[ 32.224740]  ? run_backlog_napi+0x50/0x50
[ 32.224743]  ? clockevents_program_event+0x1cc/0x280
[ 32.224746]  ? hrtimer_interrupt+0x31e/0x7c0
[ 32.224749]  handle_softirqs+0x151/0x430
[ 32.224752]  do_softirq+0x3f/0x60
[ 32.224755]  </IRQ>

It happens because an xdp_rxq with its mem model set to MEM_TYPE_PAGE_SHARED was used when initializing the xdp_buff. Fix this by using the new helper xdp_convert_skb_to_buff(), which, besides initializing and preparing the xdp_buff, checks whether the page backing the linear part of the xdp_buff comes from a page_pool. We assume that the linear data and the frags have the same memory provider, as the current XDP API gives no way to distinguish them (the mem model is registered for the *whole* Rx queue, while here we operate at single-buffer granularity).

To meet the skb layout expected by the new helper, pull the mac header before converting the skb to an xdp_buff.

That alone is not enough, however: before an xdp_buff leaves veth via XDP_{TX,REDIRECT}, the mem type on the xdp_rxq associated with it is restored to its original model. We need to respect the previous setting at least until the buff is converted to a frame, as the frame carries the mem_type.

Also add a page_pool variant of veth_xdp_get() so that we avoid a refcount underflow when draining a page frag.

Fixes: 0ebab78 ("net: veth: add page_pool for page recycling")
Reported-by: Alexei Starovoitov <[email protected]>
Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/
Signed-off-by: Maciej Fijalkowski <[email protected]>
When skb's headroom is not sufficient for XDP purposes, skb_pp_cow_data() returns new skb with requested headroom space. This skb was provided by page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail() against a skb with frags, and mentioned helper consumed enough amount of bytes that in turn released the page, following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? 
stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_convert_skb_to_buff() that, besides init/prepare xdp_buff, will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Before releasing xdp_buff out of veth via XDP_{TX,REDIRECT}, mem type on xdp_rxq associated with xdp_buff is restored to its original model. We need to respect previous setting at least until buff is converted to frame, as frame carries the mem_type. Add a page_pool variant of veth_xdp_get() so that we avoid refcount underflow when draining page frag. Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
When skb's headroom is not sufficient for XDP purposes, skb_pp_cow_data() returns new skb with requested headroom space. This skb was provided by page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail() against a skb with frags, and mentioned helper consumed enough amount of bytes that in turn released the page, following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? 
stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_convert_skb_to_buff() that, besides init/prepare xdp_buff, will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Before releasing xdp_buff out of veth via XDP_{TX,REDIRECT}, mem type on xdp_rxq associated with xdp_buff is restored to its original model. We need to respect previous setting at least until buff is converted to frame, as frame carries the mem_type. Add a page_pool variant of veth_xdp_get() so that we avoid refcount underflow when draining page frag. Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
When skb's headroom is not sufficient for XDP purposes, skb_pp_cow_data() returns new skb with requested headroom space. This skb was provided by page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail() against a skb with frags, and mentioned helper consumed enough amount of bytes that in turn released the page, following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? 
stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_convert_skb_to_buff() that, besides init/prepare xdp_buff, will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Before releasing xdp_buff out of veth via XDP_{TX,REDIRECT}, mem type on xdp_rxq associated with xdp_buff is restored to its original model. We need to respect previous setting at least until buff is converted to frame, as frame carries the mem_type. Add a page_pool variant of veth_xdp_get() so that we avoid refcount underflow when draining page frag. Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
When skb's headroom is not sufficient for XDP purposes, skb_pp_cow_data() returns new skb with requested headroom space. This skb was provided by page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail() against a skb with frags, and mentioned helper consumed enough amount of bytes that in turn released the page, following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? 
stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_convert_skb_to_buff() that, besides init/prepare xdp_buff, will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Before releasing xdp_buff out of veth via XDP_{TX,REDIRECT}, mem type on xdp_rxq associated with xdp_buff is restored to its original model. We need to respect previous setting at least until buff is converted to frame, as frame carries the mem_type. Add a page_pool variant of veth_xdp_get() so that we avoid refcount underflow when draining page frag. Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
When skb's headroom is not sufficient for XDP purposes, skb_pp_cow_data() returns new skb with requested headroom space. This skb was provided by page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail() against a skb with frags, and mentioned helper consumed enough amount of bytes that in turn released the page, following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? 
stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_convert_skb_to_buff() that, besides init/prepare xdp_buff, will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Before releasing xdp_buff out of veth via XDP_{TX,REDIRECT}, mem type on xdp_rxq associated with xdp_buff is restored to its original model. We need to respect previous setting at least until buff is converted to frame, as frame carries the mem_type. Add a page_pool variant of veth_xdp_get() so that we avoid refcount underflow when draining page frag. Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]>
When skb's headroom is not sufficient for XDP purposes, skb_pp_cow_data() returns new skb with requested headroom space. This skb was provided by page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail() against a skb with frags, and mentioned helper consumed enough amount of bytes that in turn released the page, following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? 
stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_convert_skb_to_buff() that, besides init/prepare xdp_buff, will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Before releasing xdp_buff out of veth via XDP_{TX,REDIRECT}, mem type on xdp_rxq associated with xdp_buff is restored to its original model. We need to respect previous setting at least until buff is converted to frame, as frame carries the mem_type. Add a page_pool variant of veth_xdp_get() so that we avoid refcount underflow when draining page frag. Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]> Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
When skb's headroom is not sufficient for XDP purposes, skb_pp_cow_data() returns new skb with requested headroom space. This skb was provided by page_pool. For CONFIG_DEBUG_VM=y and XDP program that uses bpf_xdp_adjust_tail() against a skb with frags, and mentioned helper consumed enough amount of bytes that in turn released the page, following splat was observed: [ 32.204881] BUG: Bad page state in process test_progs pfn:11c98b [ 32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b [ 32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff) [ 32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000 [ 32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000 [ 32.220900] page dumped because: page_pool leak [ 32.222636] Modules linked in: bpf_testmod(O) bpf_preload [ 32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT [ 32.224638] Tainted: [O]=OOT_MODULE [ 32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 32.224641] Call Trace: [ 32.224644] <IRQ> [ 32.224646] dump_stack_lvl+0x4b/0x70 [ 32.224653] bad_page.cold+0xbd/0xe0 [ 32.224657] __free_frozen_pages+0x838/0x10b0 [ 32.224660] ? skb_pp_cow_data+0x782/0xc30 [ 32.224665] bpf_xdp_shrink_data+0x221/0x530 [ 32.224668] ? skb_pp_cow_data+0x6d1/0xc30 [ 32.224671] bpf_xdp_adjust_tail+0x598/0x810 [ 32.224673] ? xsk_destruct_skb+0x321/0x800 [ 32.224678] bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6 [ 32.224681] veth_xdp_rcv_skb+0x45d/0x15a0 [ 32.224684] ? get_stack_info_noinstr+0x16/0xe0 [ 32.224688] ? veth_set_channels+0x920/0x920 [ 32.224691] ? get_stack_info+0x2f/0x80 [ 32.224693] ? unwind_next_frame+0x3af/0x1df0 [ 32.224697] veth_xdp_rcv.constprop.0+0x38a/0xbe0 [ 32.224700] ? common_startup_64+0x13e/0x148 [ 32.224703] ? veth_xdp_rcv_one+0xcd0/0xcd0 [ 32.224706] ? 
stack_trace_save+0x84/0xa0 [ 32.224709] ? stack_depot_save_flags+0x28/0x820 [ 32.224713] ? __resched_curr.constprop.0+0x332/0x3b0 [ 32.224716] ? timerqueue_add+0x217/0x320 [ 32.224719] veth_poll+0x115/0x5e0 [ 32.224722] ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0 [ 32.224726] ? update_load_avg+0x1cb/0x12d0 [ 32.224730] ? update_cfs_group+0x121/0x2c0 [ 32.224733] __napi_poll+0xa0/0x420 [ 32.224736] net_rx_action+0x901/0xe90 [ 32.224740] ? run_backlog_napi+0x50/0x50 [ 32.224743] ? clockevents_program_event+0x1cc/0x280 [ 32.224746] ? hrtimer_interrupt+0x31e/0x7c0 [ 32.224749] handle_softirqs+0x151/0x430 [ 32.224752] do_softirq+0x3f/0x60 [ 32.224755] </IRQ> It's because xdp_rxq with mem model set to MEM_TYPE_PAGE_SHARED was used when initializing xdp_buff. Fix this by using new helper xdp_convert_skb_to_buff() that, besides init/prepare xdp_buff, will check if page used for linear part of xdp_buff comes from page_pool. We assume that linear data and frags will have same memory provider as currently XDP API does not provide us a way to distinguish it (the mem model is registered for *whole* Rx queue and here we speak about single buffer granularity). Before releasing xdp_buff out of veth via XDP_{TX,REDIRECT}, mem type on xdp_rxq associated with xdp_buff is restored to its original model. We need to respect previous setting at least until buff is converted to frame, as frame carries the mem_type. Add a page_pool variant of veth_xdp_get() so that we avoid refcount underflow when draining page frag. Fixes: 0ebab78 ("net: veth: add page_pool for page recycling") Reported-by: Alexei Starovoitov <[email protected]> Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/ Signed-off-by: Maciej Fijalkowski <[email protected]> Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
When an skb's headroom is not sufficient for XDP purposes, skb_pp_cow_data() returns a new skb with the requested headroom; that skb is backed by a page_pool. With CONFIG_DEBUG_VM=y and an XDP program that calls bpf_xdp_adjust_tail() on an skb with frags, shrinking by enough bytes to release a page, the following splat was observed:

[   32.204881] BUG: Bad page state in process test_progs  pfn:11c98b
[   32.207167] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11c98b
[   32.210084] flags: 0x1fffe0000000000(node=0|zone=1|lastcpupid=0x7fff)
[   32.212493] raw: 01fffe0000000000 dead000000000040 ff11000123c9b000 0000000000000000
[   32.218056] raw: 0000000000000000 0000000000000001 00000000ffffffff 0000000000000000
[   32.220900] page dumped because: page_pool leak
[   32.222636] Modules linked in: bpf_testmod(O) bpf_preload
[   32.224632] CPU: 6 UID: 0 PID: 3612 Comm: test_progs Tainted: G O 6.17.0-rc5-gfec474d29325 #6969 PREEMPT
[   32.224638] Tainted: [O]=OOT_MODULE
[   32.224639] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[   32.224641] Call Trace:
[   32.224644]  <IRQ>
[   32.224646]  dump_stack_lvl+0x4b/0x70
[   32.224653]  bad_page.cold+0xbd/0xe0
[   32.224657]  __free_frozen_pages+0x838/0x10b0
[   32.224660]  ? skb_pp_cow_data+0x782/0xc30
[   32.224665]  bpf_xdp_shrink_data+0x221/0x530
[   32.224668]  ? skb_pp_cow_data+0x6d1/0xc30
[   32.224671]  bpf_xdp_adjust_tail+0x598/0x810
[   32.224673]  ? xsk_destruct_skb+0x321/0x800
[   32.224678]  bpf_prog_004ac6bb21de57a7_xsk_xdp_adjust_tail+0x52/0xd6
[   32.224681]  veth_xdp_rcv_skb+0x45d/0x15a0
[   32.224684]  ? get_stack_info_noinstr+0x16/0xe0
[   32.224688]  ? veth_set_channels+0x920/0x920
[   32.224691]  ? get_stack_info+0x2f/0x80
[   32.224693]  ? unwind_next_frame+0x3af/0x1df0
[   32.224697]  veth_xdp_rcv.constprop.0+0x38a/0xbe0
[   32.224700]  ? common_startup_64+0x13e/0x148
[   32.224703]  ? veth_xdp_rcv_one+0xcd0/0xcd0
[   32.224706]  ? stack_trace_save+0x84/0xa0
[   32.224709]  ? stack_depot_save_flags+0x28/0x820
[   32.224713]  ? __resched_curr.constprop.0+0x332/0x3b0
[   32.224716]  ? timerqueue_add+0x217/0x320
[   32.224719]  veth_poll+0x115/0x5e0
[   32.224722]  ? veth_xdp_rcv.constprop.0+0xbe0/0xbe0
[   32.224726]  ? update_load_avg+0x1cb/0x12d0
[   32.224730]  ? update_cfs_group+0x121/0x2c0
[   32.224733]  __napi_poll+0xa0/0x420
[   32.224736]  net_rx_action+0x901/0xe90
[   32.224740]  ? run_backlog_napi+0x50/0x50
[   32.224743]  ? clockevents_program_event+0x1cc/0x280
[   32.224746]  ? hrtimer_interrupt+0x31e/0x7c0
[   32.224749]  handle_softirqs+0x151/0x430
[   32.224752]  do_softirq+0x3f/0x60
[   32.224755]  </IRQ>

This happens because an xdp_rxq whose memory model was registered as MEM_TYPE_PAGE_SHARED was used when initializing the xdp_buff. Fix this by using the new helper xdp_convert_skb_to_buff(), which, besides initializing and preparing the xdp_buff, checks whether the page backing the linear part of the xdp_buff comes from a page_pool. We assume that the linear data and the frags share the same memory provider, as the XDP API currently gives us no way to distinguish them (the memory model is registered for the *whole* Rx queue, while here we are dealing with single-buffer granularity).

Before an xdp_buff is released from veth via XDP_{TX,REDIRECT}, the mem type on the xdp_rxq associated with it is restored to its original model. We need to respect the previous setting at least until the buff is converted to a frame, since the frame carries the mem_type.

Also add a page_pool variant of veth_xdp_get() so that we avoid a refcount underflow when draining a page frag.

Fixes: 0ebab78 ("net: veth: add page_pool for page recycling")
Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
Reported-by: Alexei Starovoitov <[email protected]>
Closes: https://lore.kernel.org/bpf/CAADnVQ+bBofJDfieyOYzSmSujSfJwDTQhiz3aJw7hE+4E2_iPA@mail.gmail.com/
Signed-off-by: Maciej Fijalkowski <[email protected]>
Pull request for series with
subject: s390/bpf: Fully order atomic "add", "and", "or" and "xor"
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=850815