Skip to content

bpf, sockmap: Fix concurrency issues between memory charge and uncharge #8911

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

kernel-patches-daemon-bpf[bot]
Copy link

Pull request for series with
subject: bpf, sockmap: Fix concurrency issues between memory charge and uncharge
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 43745d1
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: b69d441
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 32c563d
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 0f2d39f
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: cf15cdc
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: cb4a119
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 5a8cb23
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: c8ce7db
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 7220eab
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: d0445d7
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: d0445d7
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 4dd372d
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 9325d53
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 9325d53
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 1cb0f56
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: b615ce5
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: b615ce5
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 25b6d5d
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=960768
version: 1

Triggering WARN_ON_ONCE(sk->sk_forward_alloc) by running the following
command, followed by pressing Ctrl-C after 2 seconds:
./bench sockmap -c 2 -p 1 -a --rx-verdict-ingress
'''
------------[ cut here ]------------
WARNING: CPU: 2 PID: 40 at net/ipv4/af_inet.c inet_sock_destruct

Call Trace:
<TASK>
__sk_destruct+0x46/0x222
sk_psock_destroy+0x22f/0x242
process_one_work+0x504/0x8a8
? process_one_work+0x39d/0x8a8
? __pfx_process_one_work+0x10/0x10
? worker_thread+0x44/0x2ae
? __list_add_valid_or_report+0x83/0xea
? srso_return_thunk+0x5/0x5f
? __list_add+0x45/0x52
process_scheduled_works+0x73/0x82
worker_thread+0x1ce/0x2ae
'''

Reason:
When we are in the backlog process, we allocate sk_msg and then perform
the charge process. Meanwhile, in the user process context, the recvmsg()
operation performs the uncharge process, leading to concurrency issues
between them.

The charge process (2 functions):
1. sk_rmem_schedule(size) -> sk_forward_alloc increases by PAGE_SIZE
                             multiples
2. sk_mem_charge(size)    -> sk_forward_alloc -= size

The uncharge process (sk_mem_uncharge()):
3. sk_forward_alloc += size
4. check if sk_forward_alloc > PAGE_SIZE
5. reclaim    -> sk_forward_alloc decreases, possibly becoming 0

Because the sk performing charge and uncharge is not locked
(mainly because the backlog process does not lock the socket), therefore,
steps 1 to 5 will execute concurrently as follows:

cpu0                                cpu1
1
                                    3
                                    4   --> sk_forward_alloc >= PAGE_SIZE
                                    5   --> reclaim sk_forward_alloc
2 --> sk_forward_alloc may
      become negative

Solution:
1. Add locking to the kfree_sk_msg() process, which is only called in the
   user process context.
2. Integrate the charge process into sk_psock_create_ingress_msg() in the
   backlog process and add locking.
3. Reuse the existing psock->ingress_lock.

Fixes: 799aa7f ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
Signed-off-by: Jiayuan Chen <[email protected]>
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the series/960768=>bpf-next branch from cd7222c to 31d14c9 Compare May 20, 2025 23:28
@kernel-patches-daemon-bpf
Copy link
Author

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=960768 irrelevant now for [Munch({'archived': False, 'project': 399, 'delegate': 121173})] search patterns

@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot deleted the series/960768=>bpf-next branch May 24, 2025 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant