[ciqlts8_6-rt] writeback: avoid use-after-free after removing device #151

thefossguy-ciq · 2025-03-05T08:34:38Z

Commit Message Requirements
Built against Vault/LTS Environment
kABI Check Passed, where Valid (Pre 9.4 RT does not have kABI stability)
Boot Test
Kernel SelfTest results
Additional Tests as determined relevant

Commit message

jira VULN-9260
cve CVE-2024-0562
commit-author Khazhismel Kumykov <[email protected]> commit f87904c075515f3e1d8f4a7115869d3b914674fd

When a disk is removed, bdi_unregister gets called to stop further writeback and wait for associated delayed work to complete.  However, wb_inode_writeback_end() may schedule bandwidth estimation dwork after this has completed, which can result in the timer attempting to access the just freed bdi_writeback.

Fix this by checking if the bdi_writeback is alive, similar to when scheduling writeback work.

Since this requires wb->work_lock, and wb_inode_writeback_end() may get called from interrupt, switch wb->work_lock to an irqsafe lock.

Link: https://lkml.kernel.org/r/[email protected] Fixes: 45a2966fd641 ("writeback: fix bandwidth estimate for spiky workload")
	Signed-off-by: Khazhismel Kumykov <[email protected]>
	Reviewed-by: Jan Kara <[email protected]>
	Cc: Michael Stapelberg <[email protected]>
	Cc: Wu Fengguang <[email protected]>
	Cc: Alexander Viro <[email protected]>
	Cc: <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit f87904c075515f3e1d8f4a7115869d3b914674fd)
	Signed-off-by: Pratham Patel <[email protected]>

Kernel build logs

  INSTALL sound/usb/line6/snd-usb-variax.ko
  INSTALL sound/usb/misc/snd-ua101.ko
  INSTALL sound/usb/snd-usb-audio.ko
  INSTALL sound/usb/snd-usbmidi-lib.ko
  INSTALL sound/usb/usx2y/snd-usb-us122l.ko
  INSTALL sound/usb/usx2y/snd-usb-usx2y.ko
  INSTALL sound/virtio/virtio_snd.ko
  INSTALL sound/x86/snd-hdmi-lpe-audio.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-_ppatel__ciqlts8_6-rt+
[TIMER]{MODULES}: 39s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-_ppatel__ciqlts8_6-rt+ arch/x86/boot/bzImage \
	System.map "/boot"
[TIMER]{INSTALL}: 11s
Checking kABI
Error: kABI check failed
Setting Default Kernel to /boot/vmlinuz-4.18.0-_ppatel__ciqlts8_6-rt+ and Index to 0
The default is /boot/loader/entries/74447cfd379141518e0d01ed1a411e3c-4.18.0-_ppatel__ciqlts8_6-rt+.conf with index 0 and kernel /boot/vmlinuz-4.18.0-_ppatel__ciqlts8_6-rt+
The default is /boot/loader/entries/74447cfd379141518e0d01ed1a411e3c-4.18.0-_ppatel__ciqlts8_6-rt+.conf with index 0 and kernel /boot/vmlinuz-4.18.0-_ppatel__ciqlts8_6-rt+
Generating grub configuration file ...
done
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 1027s
[TIMER]{MODULES}: 39s
[TIMER]{INSTALL}: 11s
[TIMER]{TOTAL} 1086s
Rebooting in 10 seconds

kernel_build.log

Kselftests

$ grep '^ok ' kselftest-before.log | wc -l && grep '^ok ' kselftest-after.log | wc -l
214
215

$ grep '^not ok ' kselftest-before.log | wc -l && grep '^not ok ' kselftest-after.log | wc -l
52
51

$ gdiff <(grep '^ok ' kselftest-before.log) <(grep '^ok ' kselftest-after.log)
diff --git a/dev/fd/63 b/dev/fd/62
--- a/dev/fd/63
+++ b/dev/fd/62
@@ -81,6 +81,7 @@ ok 1 selftests: mqueue: mq_open_tests # SKIP
 ok 2 selftests: mqueue: mq_perf_tests # SKIP
 ok 2 selftests: net: reuseport_bpf_cpu
 ok 4 selftests: net: reuseport_dualstack
+ok 6 selftests: net: tls
 ok 7 selftests: net: run_netsocktests
 ok 8 selftests: net: run_afpackettests
 ok 10 selftests: net: netdevice.sh # SKIP

$ gdiff <(grep '^not ok ' kselftest-before.log) <(grep '^not ok ' kselftest-after.log)
diff --git a/dev/fd/63 b/dev/fd/62
--- a/dev/fd/63
+++ b/dev/fd/62
@@ -5,7 +5,6 @@ not ok 4 selftests: lib: scanf.sh
 not ok 2 selftests: memfd: run_fuse_test.sh # exit=1
 not ok 1 selftests: net: reuseport_bpf # exit=1
 not ok 3 selftests: net: reuseport_bpf_numa # exit=1
-not ok 6 selftests: net: tls # exit=1
 not ok 9 selftests: net: test_bpf.sh # exit=1
 not ok 14 selftests: net: fib-onlink-tests.sh # exit=1
 not ok 15 selftests: net: pmtu.sh # exit=1

kselftest-after.log
kselftest-before.log

jira VULN-9260 cve CVE-2024-0562 commit-author Khazhismel Kumykov <[email protected]> commit f87904c When a disk is removed, bdi_unregister gets called to stop further writeback and wait for associated delayed work to complete. However, wb_inode_writeback_end() may schedule bandwidth estimation dwork after this has completed, which can result in the timer attempting to access the just freed bdi_writeback. Fix this by checking if the bdi_writeback is alive, similar to when scheduling writeback work. Since this requires wb->work_lock, and wb_inode_writeback_end() may get called from interrupt, switch wb->work_lock to an irqsafe lock. Link: https://lkml.kernel.org/r/[email protected] Fixes: 45a2966 ("writeback: fix bandwidth estimate for spiky workload") Signed-off-by: Khazhismel Kumykov <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Michael Stapelberg <[email protected]> Cc: Wu Fengguang <[email protected]> Cc: Alexander Viro <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit f87904c) Signed-off-by: Pratham Patel <[email protected]>

bmastbergen

🥌

PlaidCat

…ce_finalize() commit 41cddf8 upstream. If migration succeeded, we called folio_migrate_flags()->mem_cgroup_migrate() to migrate the memcg from the old to the new folio. This will set memcg_data of the old folio to 0. Similarly, if migration failed, memcg_data of the dst folio is left unset. If we call folio_putback_lru() on such folios (memcg_data == 0), we will add the folio to be freed to the LRU, making memcg code unhappy. Running the hmm selftests: # ./hmm-tests ... # RUN hmm.hmm_device_private.migrate ... [ 102.078007][T14893] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff27d200 pfn:0x13cc00 [ 102.079974][T14893] anon flags: 0x17ff00000020018(uptodate|dirty|swapbacked|node=0|zone=2|lastcpupid=0x7ff) [ 102.082037][T14893] raw: 017ff00000020018 dead000000000100 dead000000000122 ffff8881353896c9 [ 102.083687][T14893] raw: 00000007ff27d200 0000000000000000 00000001ffffffff 0000000000000000 [ 102.085331][T14893] page dumped because: VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled()) [ 102.087230][T14893] ------------[ cut here ]------------ [ 102.088279][T14893] WARNING: CPU: 0 PID: 14893 at ./include/linux/memcontrol.h:726 folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.090478][T14893] Modules linked in: [ 102.091244][T14893] CPU: 0 UID: 0 PID: 14893 Comm: hmm-tests Not tainted 6.13.0-09623-g6c216bc522fd #151 [ 102.093089][T14893] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 [ 102.094848][T14893] RIP: 0010:folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.096104][T14893] Code: ... [ 102.099908][T14893] RSP: 0018:ffffc900236c37b0 EFLAGS: 00010293 [ 102.101152][T14893] RAX: 0000000000000000 RBX: ffffea0004f30000 RCX: ffffffff8183f426 [ 102.102684][T14893] RDX: ffff8881063cb880 RSI: ffffffff81b8117f RDI: ffff8881063cb880 [ 102.104227][T14893] RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000 [ 102.105757][T14893] R10: 0000000000000001 R11: 0000000000000002 R12: ffffc900236c37d8 [ 102.107296][T14893] R13: ffff888277a2bcb0 R14: 000000000000001f R15: 0000000000000000 [ 102.108830][T14893] FS: 00007ff27dbdd740(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000 [ 102.110643][T14893] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 102.111924][T14893] CR2: 00007ff27d400000 CR3: 000000010866e000 CR4: 0000000000750ef0 [ 102.113478][T14893] PKRU: 55555554 [ 102.114172][T14893] Call Trace: [ 102.114805][T14893] <TASK> [ 102.115397][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.116547][T14893] ? __warn.cold+0x110/0x210 [ 102.117461][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.118667][T14893] ? report_bug+0x1b9/0x320 [ 102.119571][T14893] ? handle_bug+0x54/0x90 [ 102.120494][T14893] ? exc_invalid_op+0x17/0x50 [ 102.121433][T14893] ? asm_exc_invalid_op+0x1a/0x20 [ 102.122435][T14893] ? __wake_up_klogd.part.0+0x76/0xd0 [ 102.123506][T14893] ? dump_page+0x4f/0x60 [ 102.124352][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.125500][T14893] folio_batch_move_lru+0xd4/0x200 [ 102.126577][T14893] ? __pfx_lru_add+0x10/0x10 [ 102.127505][T14893] __folio_batch_add_and_move+0x391/0x720 [ 102.128633][T14893] ? __pfx_lru_add+0x10/0x10 [ 102.129550][T14893] folio_putback_lru+0x16/0x80 [ 102.130564][T14893] migrate_device_finalize+0x9b/0x530 [ 102.131640][T14893] dmirror_migrate_to_device.constprop.0+0x7c5/0xad0 [ 102.133047][T14893] dmirror_fops_unlocked_ioctl+0x89b/0xc80 Likely, nothing else goes wrong: putting the last folio reference will remove the folio from the LRU again. So besides memcg complaining, adding the folio to be freed to the LRU is just an unnecessary step. The new flow resembles what we have in migrate_folio_move(): add the dst to the lru, remove migration ptes, unlock and unref dst. Link: https://lkml.kernel.org/r/[email protected] Fixes: 8763cb4 ("mm/migrate: new memory migration helper for use with device memory") Signed-off-by: David Hildenbrand <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Alistair Popple <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

…ce_finalize() JIRA: https://issues.redhat.com/browse/RHEL-84184 JIRA: https://issues.redhat.com/browse/RHEL-83249 CVE: CVE-2025-21861 This patch is a backport of the following upstream commit: commit 41cddf8 Author: David Hildenbrand <[email protected]> Date: Mon Feb 10 17:13:17 2025 +0100 mm/migrate_device: don't add folio to be freed to LRU in migrate_device_finalize() If migration succeeded, we called folio_migrate_flags()->mem_cgroup_migrate() to migrate the memcg from the old to the new folio. This will set memcg_data of the old folio to 0. Similarly, if migration failed, memcg_data of the dst folio is left unset. If we call folio_putback_lru() on such folios (memcg_data == 0), we will add the folio to be freed to the LRU, making memcg code unhappy. Running the hmm selftests: # ./hmm-tests ... # RUN hmm.hmm_device_private.migrate ... [ 102.078007][T14893] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff27d200 pfn:0x13cc00 [ 102.079974][T14893] anon flags: 0x17ff00000020018(uptodate|dirty|swapbacked|node=0|zone=2|lastcpupid=0x7ff) [ 102.082037][T14893] raw: 017ff00000020018 dead000000000100 dead000000000122 ffff8881353896c9 [ 102.083687][T14893] raw: 00000007ff27d200 0000000000000000 00000001ffffffff 0000000000000000 [ 102.085331][T14893] page dumped because: VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled()) [ 102.087230][T14893] ------------[ cut here ]------------ [ 102.088279][T14893] WARNING: CPU: 0 PID: 14893 at ./include/linux/memcontrol.h:726 folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.090478][T14893] Modules linked in: [ 102.091244][T14893] CPU: 0 UID: 0 PID: 14893 Comm: hmm-tests Not tainted 6.13.0-09623-g6c216bc522fd #151 [ 102.093089][T14893] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 [ 102.094848][T14893] RIP: 0010:folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.096104][T14893] Code: ... [ 102.099908][T14893] RSP: 0018:ffffc900236c37b0 EFLAGS: 00010293 [ 102.101152][T14893] RAX: 0000000000000000 RBX: ffffea0004f30000 RCX: ffffffff8183f426 [ 102.102684][T14893] RDX: ffff8881063cb880 RSI: ffffffff81b8117f RDI: ffff8881063cb880 [ 102.104227][T14893] RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000 [ 102.105757][T14893] R10: 0000000000000001 R11: 0000000000000002 R12: ffffc900236c37d8 [ 102.107296][T14893] R13: ffff888277a2bcb0 R14: 000000000000001f R15: 0000000000000000 [ 102.108830][T14893] FS: 00007ff27dbdd740(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000 [ 102.110643][T14893] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 102.111924][T14893] CR2: 00007ff27d400000 CR3: 000000010866e000 CR4: 0000000000750ef0 [ 102.113478][T14893] PKRU: 55555554 [ 102.114172][T14893] Call Trace: [ 102.114805][T14893] <TASK> [ 102.115397][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.116547][T14893] ? __warn.cold+0x110/0x210 [ 102.117461][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.118667][T14893] ? report_bug+0x1b9/0x320 [ 102.119571][T14893] ? handle_bug+0x54/0x90 [ 102.120494][T14893] ? exc_invalid_op+0x17/0x50 [ 102.121433][T14893] ? asm_exc_invalid_op+0x1a/0x20 [ 102.122435][T14893] ? __wake_up_klogd.part.0+0x76/0xd0 [ 102.123506][T14893] ? dump_page+0x4f/0x60 [ 102.124352][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170 [ 102.125500][T14893] folio_batch_move_lru+0xd4/0x200 [ 102.126577][T14893] ? __pfx_lru_add+0x10/0x10 [ 102.127505][T14893] __folio_batch_add_and_move+0x391/0x720 [ 102.128633][T14893] ? __pfx_lru_add+0x10/0x10 [ 102.129550][T14893] folio_putback_lru+0x16/0x80 [ 102.130564][T14893] migrate_device_finalize+0x9b/0x530 [ 102.131640][T14893] dmirror_migrate_to_device.constprop.0+0x7c5/0xad0 [ 102.133047][T14893] dmirror_fops_unlocked_ioctl+0x89b/0xc80 Likely, nothing else goes wrong: putting the last folio reference will remove the folio from the LRU again. So besides memcg complaining, adding the folio to be freed to the LRU is just an unnecessary step. The new flow resembles what we have in migrate_folio_move(): add the dst to the lru, remove migration ptes, unlock and unref dst. Link: https://lkml.kernel.org/r/[email protected] Fixes: 8763cb4 ("mm/migrate: new memory migration helper for use with device memory") Signed-off-by: David Hildenbrand <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Alistair Popple <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Rafael Aquini <[email protected]>

jira LE-3201 Rebuild_History Non-Buildable kernel-rt-4.18.0-553.34.1.rt7.375.el8_10 commit-author Sean Christopherson <[email protected]> commit f1187ef Fix a goof where KVM tries to grab source vCPUs from the destination VM when doing intrahost migration. Grabbing the wrong vCPU not only hoses the guest, it also crashes the host due to the VMSA pointer being left NULL. BUG: unable to handle page fault for address: ffffe38687000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 39 PID: 17143 Comm: sev_migrate_tes Tainted: GO 6.5.0-smp--fff2e47e6c3b-next #151 Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.28.0 07/10/2023 RIP: 0010:__free_pages+0x15/0xd0 RSP: 0018:ffff923fcf6e3c78 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffe38687000000 RCX: 0000000000000100 RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffe38687000000 RBP: ffff923fcf6e3c88 R08: ffff923fcafb0000 R09: 0000000000000000 R10: 0000000000000000 R11: ffffffff83619b90 R12: ffff923fa9540000 R13: 0000000000080007 R14: ffff923f6d35d000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff929d0d7c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffe38687000000 CR3: 0000005224c34005 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace: <TASK> sev_free_vcpu+0xcb/0x110 [kvm_amd] svm_vcpu_free+0x75/0xf0 [kvm_amd] kvm_arch_vcpu_destroy+0x36/0x140 [kvm] kvm_destroy_vcpus+0x67/0x100 [kvm] kvm_arch_destroy_vm+0x161/0x1d0 [kvm] kvm_put_kvm+0x276/0x560 [kvm] kvm_vm_release+0x25/0x30 [kvm] __fput+0x106/0x280 ____fput+0x12/0x20 task_work_run+0x86/0xb0 do_exit+0x2e3/0x9c0 do_group_exit+0xb1/0xc0 __x64_sys_exit_group+0x1b/0x20 do_syscall_64+0x41/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> CR2: ffffe38687000000 Fixes: 6defa24 ("KVM: SEV: Init target VMCBs in sev_migrate_from") Cc: [email protected] Cc: Peter Gonda <[email protected]> Reviewed-by: Peter Gonda <[email protected]> Reviewed-by: Pankaj Gupta <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> (cherry picked from commit f1187ef) Signed-off-by: Jonathan Maple <[email protected]>

bmastbergen requested review from bmastbergen, PlaidCat and kerneltoast March 5, 2025 14:08

bmastbergen approved these changes Mar 5, 2025

View reviewed changes

PlaidCat approved these changes Mar 5, 2025

View reviewed changes

thefossguy-ciq merged commit bcdcbf3 into ciqlts8_6-rt Mar 5, 2025
1 check passed

PlaidCat deleted the {ppatel}_ciqlts8_6-rt branch April 18, 2025 15:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[ciqlts8_6-rt] writeback: avoid use-after-free after removing device #151

[ciqlts8_6-rt] writeback: avoid use-after-free after removing device #151

Uh oh!

thefossguy-ciq commented Mar 5, 2025

Uh oh!

bmastbergen left a comment

Uh oh!

PlaidCat left a comment

Uh oh!

Uh oh!

Uh oh!

[ciqlts8_6-rt] writeback: avoid use-after-free after removing device #151

[ciqlts8_6-rt] writeback: avoid use-after-free after removing device #151

Uh oh!

Conversation

thefossguy-ciq commented Mar 5, 2025

Commit message

Kernel build logs

Kselftests

Uh oh!

bmastbergen left a comment

Choose a reason for hiding this comment

Uh oh!

PlaidCat left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!