Skip to content

[fips-9-compliant] net: sched: use RCU read-side critical section in taprio_dump() #419

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

shreeya-patel98
Copy link

  • Commit Message Requirements
  • Built against Vault/LTS Environment
  • kABI Check Passed, where Valid (Pre 9.4 RT does not have kABI stability)
  • Boot Test
  • Kernel SelfTest results
  • Additional Tests as determined relevant

Commit message

jira VULN-40133
cve CVE-2024-50126
commit-author Dmitry Antipov <[email protected]> commit b22db8b8befe90b61c98626ca1a2fbb0505e9fe3
upstream-diff The following commits cause merge conflicts since they are
    not present in the fips-9-compliant/5.14.0-284.30.1 :-
    a54fc09 ("net/sched: taprio: allow user input of per-tc max SDU")
    9dd6ad6 ("net/sched: refactor mqprio qopt reconstruction to a library function")

Fix possible use-after-free in 'taprio_dump()' by adding RCU read-side critical section there. Never seen on x86 but found on a KASAN-enabled arm64 system when investigating https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa:

[T15862] BUG: KASAN: slab-use-after-free in taprio_dump+0xa0c/0xbb0 [T15862] Read of size 4 at addr ffff0000d4bb88f8 by task repro/15862 [T15862]
[T15862] CPU: 0 UID: 0 PID: 15862 Comm: repro Not tainted 6.11.0-rc1-00293-gdefaf1a2113a-dirty #2 [T15862] Hardware name: QEMU QEMU Virtual Machine, BIOS edk2-20240524-5.fc40 05/24/2024 [T15862] Call trace:
[T15862]  dump_backtrace+0x20c/0x220
[T15862]  show_stack+0x2c/0x40
[T15862]  dump_stack_lvl+0xf8/0x174
[T15862]  print_report+0x170/0x4d8
[T15862]  kasan_report+0xb8/0x1d4
[T15862]  __asan_report_load4_noabort+0x20/0x2c
[T15862]  taprio_dump+0xa0c/0xbb0
[T15862]  tc_fill_qdisc+0x540/0x1020
[T15862]  qdisc_notify.isra.0+0x330/0x3a0
[T15862]  tc_modify_qdisc+0x7b8/0x1838
[T15862]  rtnetlink_rcv_msg+0x3c8/0xc20
[T15862]  netlink_rcv_skb+0x1f8/0x3d4
[T15862]  rtnetlink_rcv+0x28/0x40
[T15862]  netlink_unicast+0x51c/0x790
[T15862]  netlink_sendmsg+0x79c/0xc20
[T15862]  __sock_sendmsg+0xe0/0x1a0
[T15862]  ____sys_sendmsg+0x6c0/0x840
[T15862]  ___sys_sendmsg+0x1ac/0x1f0
[T15862]  __sys_sendmsg+0x110/0x1d0
[T15862]  __arm64_sys_sendmsg+0x74/0xb0
[T15862]  invoke_syscall+0x88/0x2e0
[T15862]  el0_svc_common.constprop.0+0xe4/0x2a0
[T15862]  do_el0_svc+0x44/0x60
[T15862]  el0_svc+0x50/0x184
[T15862]  el0t_64_sync_handler+0x120/0x12c
[T15862]  el0t_64_sync+0x190/0x194
[T15862]
[T15862] Allocated by task 15857:
[T15862]  kasan_save_stack+0x3c/0x70
[T15862]  kasan_save_track+0x20/0x3c
[T15862]  kasan_save_alloc_info+0x40/0x60
[T15862]  __kasan_kmalloc+0xd4/0xe0
[T15862]  __kmalloc_cache_noprof+0x194/0x334
[T15862]  taprio_change+0x45c/0x2fe0
[T15862]  tc_modify_qdisc+0x6a8/0x1838
[T15862]  rtnetlink_rcv_msg+0x3c8/0xc20
[T15862]  netlink_rcv_skb+0x1f8/0x3d4
[T15862]  rtnetlink_rcv+0x28/0x40
[T15862]  netlink_unicast+0x51c/0x790
[T15862]  netlink_sendmsg+0x79c/0xc20
[T15862]  __sock_sendmsg+0xe0/0x1a0
[T15862]  ____sys_sendmsg+0x6c0/0x840
[T15862]  ___sys_sendmsg+0x1ac/0x1f0
[T15862]  __sys_sendmsg+0x110/0x1d0
[T15862]  __arm64_sys_sendmsg+0x74/0xb0
[T15862]  invoke_syscall+0x88/0x2e0
[T15862]  el0_svc_common.constprop.0+0xe4/0x2a0
[T15862]  do_el0_svc+0x44/0x60
[T15862]  el0_svc+0x50/0x184
[T15862]  el0t_64_sync_handler+0x120/0x12c
[T15862]  el0t_64_sync+0x190/0x194
[T15862]
[T15862] Freed by task 6192:
[T15862]  kasan_save_stack+0x3c/0x70
[T15862]  kasan_save_track+0x20/0x3c
[T15862]  kasan_save_free_info+0x4c/0x80
[T15862]  poison_slab_object+0x110/0x160
[T15862]  __kasan_slab_free+0x3c/0x74
[T15862]  kfree+0x134/0x3c0
[T15862]  taprio_free_sched_cb+0x18c/0x220
[T15862]  rcu_core+0x920/0x1b7c
[T15862]  rcu_core_si+0x10/0x1c
[T15862]  handle_softirqs+0x2e8/0xd64
[T15862]  __do_softirq+0x14/0x20

Fixes: 18cdd2f0998a ("net/sched: taprio: taprio_dump and taprio_change are protected by rtnl_mutex")
	Acked-by: Vinicius Costa Gomes <[email protected]>
	Signed-off-by: Dmitry Antipov <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Paolo Abeni <[email protected]>

(cherry picked from commit b22db8b8befe90b61c98626ca1a2fbb0505e9fe3)
	Signed-off-by: Shreeya Patel <[email protected]>

Kernel build logs

/mnt/scratch/kernel-src-tree
Skipping mrproper (not requested)
[TIMER]{MRPROPER}: 0s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291"
Making olddefconfig
#
# configuration written to .config
#
Starting Build
  SYNC    include/config/auto.conf.cmd
  UPD     include/config/kernel.release
  DESCEND objtool
  DESCEND bpf/resolve_btfids
  UPD     include/generated/utsrelease.h
  CALL    scripts/atomic/check-atomics.sh
warning: generated include/linux/atomic/atomic-instrumented.h has been modified.
  CALL    scripts/checksyscalls.sh
  CHK     include/generated/compile.h
  CC      init/version.o
  CC      arch/x86/crypto/aesni-intel_glue.o
  AR      init/built-in.a
  CC      kernel/sys.o
  CC      security/integrity/ima/ima_init.o
  CC      crypto/fips.o
  AR      arch/x86/crypto/built-in.a
  AR      security/integrity/ima/built-in.a
  AR      security/integrity/built-in.a
  AR      security/built-in.a
  AR      arch/x86/built-in.a
  CC      crypto/algapi.o
  CC [M]  net/sched/sch_taprio.o
  CC [M]  lib/crypto/des.o
  CC      drivers/base/firmware_loader/fallback_table.o
  CC      drivers/base/firmware_loader/main.o
  CC      crypto/dh.o
  CC [M]  fs/cifs/smbencrypt.o
  CC      kernel/module.o
  LD [M]  lib/crypto/libdes.o
  CC      crypto/rsa.o
  CC      net/ethtool/ioctl.o
  CC      drivers/base/firmware_loader/fallback.o
  <--snip-->
    STRIP   /lib/modules/5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+/kernel/sound/x86/snd-hdmi-lpe-audio.ko
  STRIP   /lib/modules/5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+/kernel/sound/xen/snd_xen_front.ko
  SIGN    /lib/modules/5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+/kernel/sound/virtio/virtio_snd.ko
  SIGN    /lib/modules/5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+/kernel/virt/lib/irqbypass.ko
  SIGN    /lib/modules/5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+/kernel/sound/x86/snd-hdmi-lpe-audio.ko
  SIGN    /lib/modules/5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+/kernel/sound/xen/snd_xen_front.ko
  DEPMOD  /lib/modules/5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+
[TIMER]{MODULES}: 10s
Making Install
sh ./arch/x86/boot/install.sh \
	5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+ arch/x86/boot/bzImage \
	System.map "/boot"
[TIMER]{INSTALL}: 23s
Checking kABI
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+ and Index to 1
The default is /boot/loader/entries/809410938d1447fc931cf787fb714082-5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+.conf with index 1 and kernel /boot/vmlinuz-5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+
The default is /boot/loader/entries/809410938d1447fc931cf787fb714082-5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+.conf with index 1 and kernel /boot/vmlinuz-5.14.0-_spatel__fips-9-compliant_5.14.0-284.30.1-e5f58b654291+
Generating grub configuration file ...
Adding boot menu entry for UEFI Firmware Settings ...
done
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 0s
[TIMER]{BUILD}: 212s
[TIMER]{MODULES}: 10s
[TIMER]{INSTALL}: 23s
[TIMER]{TOTAL} 249s
Rebooting in 10 seconds

kernel-build.log

Kselftests

shreeya@spatel-dev-bom:~/ciq$ grep '^ok ' kselftest-before.log | wc -l && grep '^ok ' kselftest-after.log | wc -l
329
329
shreeya@spatel-dev-bom:~/ciq$ grep '^not ok ' kselftest-before.log | wc -l && grep '^not ok ' kselftest-after.log | wc -l
79
79

kselftest-before.log
kselftest-after.log

jira VULN-40133
cve CVE-2024-50126
commit-author Dmitry Antipov <[email protected]>
commit b22db8b
upstream-diff The following commits cause merge conflicts since they are
    not present in the fips-9-compliant/5.14.0-284.30.1 :-
    a54fc09 ("net/sched: taprio: allow user input of per-tc max SDU")
    9dd6ad6 ("net/sched: refactor mqprio qopt reconstruction to a library function")

Fix possible use-after-free in 'taprio_dump()' by adding RCU
read-side critical section there. Never seen on x86 but
found on a KASAN-enabled arm64 system when investigating
https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa:

[T15862] BUG: KASAN: slab-use-after-free in taprio_dump+0xa0c/0xbb0
[T15862] Read of size 4 at addr ffff0000d4bb88f8 by task repro/15862
[T15862]
[T15862] CPU: 0 UID: 0 PID: 15862 Comm: repro Not tainted 6.11.0-rc1-00293-gdefaf1a2113a-dirty #2
[T15862] Hardware name: QEMU QEMU Virtual Machine, BIOS edk2-20240524-5.fc40 05/24/2024
[T15862] Call trace:
[T15862]  dump_backtrace+0x20c/0x220
[T15862]  show_stack+0x2c/0x40
[T15862]  dump_stack_lvl+0xf8/0x174
[T15862]  print_report+0x170/0x4d8
[T15862]  kasan_report+0xb8/0x1d4
[T15862]  __asan_report_load4_noabort+0x20/0x2c
[T15862]  taprio_dump+0xa0c/0xbb0
[T15862]  tc_fill_qdisc+0x540/0x1020
[T15862]  qdisc_notify.isra.0+0x330/0x3a0
[T15862]  tc_modify_qdisc+0x7b8/0x1838
[T15862]  rtnetlink_rcv_msg+0x3c8/0xc20
[T15862]  netlink_rcv_skb+0x1f8/0x3d4
[T15862]  rtnetlink_rcv+0x28/0x40
[T15862]  netlink_unicast+0x51c/0x790
[T15862]  netlink_sendmsg+0x79c/0xc20
[T15862]  __sock_sendmsg+0xe0/0x1a0
[T15862]  ____sys_sendmsg+0x6c0/0x840
[T15862]  ___sys_sendmsg+0x1ac/0x1f0
[T15862]  __sys_sendmsg+0x110/0x1d0
[T15862]  __arm64_sys_sendmsg+0x74/0xb0
[T15862]  invoke_syscall+0x88/0x2e0
[T15862]  el0_svc_common.constprop.0+0xe4/0x2a0
[T15862]  do_el0_svc+0x44/0x60
[T15862]  el0_svc+0x50/0x184
[T15862]  el0t_64_sync_handler+0x120/0x12c
[T15862]  el0t_64_sync+0x190/0x194
[T15862]
[T15862] Allocated by task 15857:
[T15862]  kasan_save_stack+0x3c/0x70
[T15862]  kasan_save_track+0x20/0x3c
[T15862]  kasan_save_alloc_info+0x40/0x60
[T15862]  __kasan_kmalloc+0xd4/0xe0
[T15862]  __kmalloc_cache_noprof+0x194/0x334
[T15862]  taprio_change+0x45c/0x2fe0
[T15862]  tc_modify_qdisc+0x6a8/0x1838
[T15862]  rtnetlink_rcv_msg+0x3c8/0xc20
[T15862]  netlink_rcv_skb+0x1f8/0x3d4
[T15862]  rtnetlink_rcv+0x28/0x40
[T15862]  netlink_unicast+0x51c/0x790
[T15862]  netlink_sendmsg+0x79c/0xc20
[T15862]  __sock_sendmsg+0xe0/0x1a0
[T15862]  ____sys_sendmsg+0x6c0/0x840
[T15862]  ___sys_sendmsg+0x1ac/0x1f0
[T15862]  __sys_sendmsg+0x110/0x1d0
[T15862]  __arm64_sys_sendmsg+0x74/0xb0
[T15862]  invoke_syscall+0x88/0x2e0
[T15862]  el0_svc_common.constprop.0+0xe4/0x2a0
[T15862]  do_el0_svc+0x44/0x60
[T15862]  el0_svc+0x50/0x184
[T15862]  el0t_64_sync_handler+0x120/0x12c
[T15862]  el0t_64_sync+0x190/0x194
[T15862]
[T15862] Freed by task 6192:
[T15862]  kasan_save_stack+0x3c/0x70
[T15862]  kasan_save_track+0x20/0x3c
[T15862]  kasan_save_free_info+0x4c/0x80
[T15862]  poison_slab_object+0x110/0x160
[T15862]  __kasan_slab_free+0x3c/0x74
[T15862]  kfree+0x134/0x3c0
[T15862]  taprio_free_sched_cb+0x18c/0x220
[T15862]  rcu_core+0x920/0x1b7c
[T15862]  rcu_core_si+0x10/0x1c
[T15862]  handle_softirqs+0x2e8/0xd64
[T15862]  __do_softirq+0x14/0x20

Fixes: 18cdd2f ("net/sched: taprio: taprio_dump and taprio_change are protected by rtnl_mutex")
	Acked-by: Vinicius Costa Gomes <[email protected]>
	Signed-off-by: Dmitry Antipov <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Paolo Abeni <[email protected]>

(cherry picked from commit b22db8b)
	Signed-off-by: Shreeya Patel <[email protected]>
Copy link
Collaborator

@PlaidCat PlaidCat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kernel is not impacted due to missing the Fixes: 18cdd2f0998a4967b1fff4c43ed9aef049e42c39 ("net/sched: taprio: taprio_dump and taprio_change are protected by rtnl_mutex") commit does not exist in this kernel:

[jmaple@devbox kernel-src-tree]$ git branch --show-current
fips-9-compliant/5.14.0-284.30.1

[jmaple@devbox kernel-src-tree]$ git branch --contains 18cdd2f0998a4967b1fff4c43ed9aef049e42c39 | grep  $(git branch --show-current)
[jmaple@devbox kernel-src-tree]$ git log --grep 18cdd2f0998a4967b1fff4c43ed9aef049e42c39
[jmaple@devbox kernel-src-tree]$

Reference this is the sha to the `v4.18` kernel so it should show up in the search since its in the history of kernel.org
[jmaple@devbox kernel-src-tree]$ git branch --contains 94710cac0ef4ee177a63b5227664b38c95bbf703 | grep $(git branch --show-current)
* fips-9-compliant/5.14.0-284.30.1

You can see however it is fixed in CentOS

[jmaple@devbox kernel-src-tree]$ git branch --show-current
rocky9_6

[jmaple@devbox kernel-src-tree]$ git log --grep 18cdd2f0998a4967b1fff4c43ed9aef049e42c39
commit 654c18a9eebaec96787eae06424bcc002d95dd5d
Author: Ivan Vecera <[email protected]>
Date:   Thu Apr 27 13:19:31 2023 +0200

    net/sched: taprio: taprio_dump and taprio_change are protected by rtnl_mutex

    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2172886

    commit 18cdd2f0998a4967b1fff4c43ed9aef049e42c39
    Author: Vladimir Oltean <[email protected]>
    Date:   Thu Sep 15 13:50:41 2022 +0300

        net/sched: taprio: taprio_dump and taprio_change are protected by rtnl_mutex

        Since the writer-side lock is taken here, we do not need to open an RCU
        read-side critical section, instead we can use rtnl_dereference() to
        tell lockdep we are serialized with concurrent writes.

        Signed-off-by: Vladimir Oltean <[email protected]>
        Signed-off-by: Jakub Kicinski <[email protected]>

    Signed-off-by: Ivan Vecera <[email protected]>

[jmaple@devbox kernel-src-tree]$ git describe 654c18a9eebaec96787eae06424bcc002d95dd5d
kernel-5.14.0-311.el9-51-g654c18a9eeba

These are the base CentOS streams branches for 9.2 and 9.4 respectfully

9.2 - kernel-5.14.0-284.30.1.el9_2
9.4 - kernel-5.14.0-427.42.1.el9_4

However didn't look at this initially when pulling up the original commit sha b22db8b from upstream to compere our the diff the thing that jumped out to me was this the - rcu_read_lock() at the top of taprio_dump
1/? inline comment
and
2/? inline comment

This made me start spelunking and then I also realized that these also changed their function calls. I completely missed this on the first glance at the review, as the rcu_lock* stood out.

 +      oper = rcu_dereference(q->oper_sched);
 +      admin = rcu_dereference(q->admin_sched);

->

+       oper = rtnl_dereference(q->oper_sched);
+       admin = rtnl_dereference(q->admin_sched);

This is all a result of 18cdd2f which is the thing this CVE is fixing.

The other commit called out in the upstream-diff section make 100% sense with how git cherry-pick was greedy, I was able to easily workout why the were obvious differences thank you for that.

I replicated the cherry-pick on a frecsh checkout and I can see how the other 2 diffs would cause these to get lost.

This is mostly a case of a short coming on our actionable CVE list internally. However I think I wanted to publicly demonstrate how we we try to evaluate commits.
There is also room for improvment on our tooling here:
https://github.com/ctrliq/kernel-src-tree-tools/blob/mainline/ciq-cherry-pick.py
to validate that the SHA we're cherry-picking actually has the FIXES shas in the branch we're targeting.

Feel free to close this with a comment of Not Affected
I appreciate you letting me post this for the public

Comment on lines -1893 to -1896
rcu_read_lock();
oper = rcu_dereference(q->oper_sched);
admin = rcu_dereference(q->admin_sched);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the first thing that jumped out the rcu_read_lock() this is not in the original sha

Comment on lines +1919 to +1920
rcu_read_lock();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2/?
However its added in here along with the label ptions_error_rcu: below and the fact that the done: label with rcu_unlock was already in the code, (I can't highlight this section because its out of scope. but its on lines 1938-1939

@shreeya-patel98
Copy link
Author

hmmm this was really interesting case to understand how we should evaluate whether we need to patch the kernel or not. Thank you for such a detailed explanation here.
So whenever I work on my next CVE, I will make sure to check for a Fixes tag as it could change a lot of things.

@shreeya-patel98
Copy link
Author

Closing this as 9.2 kernel is not affected

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants