Skip to content

Commit 54b8e76

Browse files
mrpreKernel Patches Daemon
authored andcommitted
bpf, sockmap: Fix psock incorrectly pointing to sk
We observed an issue from the latest selftest: sockmap_redir where sk_psock(psock->sk) != psock in the backlog. The root cause is the special behavior in sockmap_redir - it frequently performs map_update() and map_delete() on the same socket. During map_update(), we create a new psock and during map_delete(), we eventually free the psock via rcu_work in sk_psock_drop(). However, pending workqueues might still exist and not be processed yet. If users immediately perform another map_update(), a new psock will be allocated for the same sk, resulting in two psocks pointing to the same sk. When the pending workqueue is later triggered, it uses the old psock to access sk for I/O operations, which is incorrect. Timing Diagram: cpu0 cpu1 map_update(sk): sk->psock = psock1 psock1->sk = sk map_delete(sk): rcu_work_free(psock1) map_update(sk): sk->psock = psock2 psock2->sk = sk workqueue: wakeup with psock1, but the sk of psock1 doesn't belong to psock1 rcu_handler: clean psock1 free(psock1) Previously, we used reference counting to address the concurrency issue between backlog and sock_map_close(). This logic remains necessary as it prevents the sk from being freed while processing the backlog. But this patch prevents pending backlogs from using a psock after it has been freed. Note: We cannot call cancel_delayed_work_sync() in map_delete() since this might be invoked in BPF context by BPF helper, and the function may sleep. Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Jiayuan Chen <[email protected]>
1 parent 40a30fa commit 54b8e76

File tree

2 files changed

+5
-1
lines changed

2 files changed

+5
-1
lines changed

include/linux/skmsg.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -67,6 +67,7 @@ struct sk_psock_progs {
6767
enum sk_psock_state_bits {
6868
SK_PSOCK_TX_ENABLED,
6969
SK_PSOCK_RX_STRP_ENABLED,
70+
SK_PSOCK_DROPPED,
7071
};
7172

7273
struct sk_psock_link {

net/core/skmsg.c

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -656,6 +656,9 @@ static void sk_psock_backlog(struct work_struct *work)
656656
bool ingress;
657657
int ret;
658658

659+
if (sk_psock_test_state(psock, SK_PSOCK_DROPPED))
660+
return;
661+
659662
/* Increment the psock refcnt to synchronize with close(fd) path in
660663
* sock_map_close(), ensuring we wait for backlog thread completion
661664
* before sk_socket freed. If refcnt increment fails, it indicates
@@ -867,7 +870,7 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
867870
write_unlock_bh(&sk->sk_callback_lock);
868871

869872
sk_psock_stop(psock);
870-
873+
sk_psock_set_state(psock, SK_PSOCK_DROPPED);
871874
INIT_RCU_WORK(&psock->rwork, sk_psock_destroy);
872875
queue_rcu_work(system_wq, &psock->rwork);
873876
}

0 commit comments

Comments
 (0)