[rocky8_10] History rebuild for kernel-4.18.0-553.62.1.el8_10 #418

Merged: 33 commits, Jul 16, 2025

PlaidCat
Collaborator

General Process:

Checking Rebuild Commits for Potentially missing commits:

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 554396
Number of commits in rpm: 42
Number of commits matched with upstream: 32 (76.19%)
Number of commits in upstream but not in rpm: 554364
Number of commits NOT found in upstream: 10 (23.81%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.62.1.el8_10 for kernel-4.18.0-553.62.1.el8_10
Clean Cherry Picks: 17 (53.12%)
Empty Cherry Picks: 15 (46.88%)
_______________________________

__EMPTY COMMITS__________________________
8231a0e632405a03018034848d3c4620d7ba1dca s390: Add z17 elf platform
54da6a0924311c7cf5015533991e44fb8eb12773 locking: Introduce __cleanup() based infrastructure
85be6d842447067ce76047a14d4258c96fd33b7b cleanup: Make no_free_ptr() __must_check
e4ab322fbaaaf84b23d6cb0e3317a7f68baf36dc cleanup: Add conditional guard support
c80c4490c280a1678e47d34d2a335a58f1318615 cleanup: Standardize the header guard define's name
c6269149cbf7053272d918101981869438ff7c1e file: add take_fd() cleanup helper
d5934e76316e84eced836b6b2bafae1837d1cd58 cleanup: Add usage and style documentation
f730fd535fc51573f982fad629f2fc6b4a0cde2f cleanup: Remove address space of returned pointer
fcc22ac5baf06dd17193de44b60dbceea6461983 cleanup: Adjust scoped_guard() macros to avoid potential warning
36c2cf88808d47e926d11b98734f154fe4a9f50f cleanup: Add conditional guard helper
dc1771f718548f7d4b93991b174c6e7b5e1ba410 Revert "drivers: core: synchronize really_probe() and dev_uevent()"
04d3e5461c1f5cf8eec964ab64948ebed826e95e driver core: introduce device_set_driver() helper
18daa52418e7e4629ed1703b64777294209d2622 driver core: fix potential NULL pointer dereference in dev_uevent()
cd7eb8f83fcf258f71e293f7fc52a70be8ed0128 mm/slab: make __free(kfree) accept error pointers
2ccd42b959aaf490333dbd3b9b102eaf295c036a s390/virtio_ccw: Don't allocate/assign airqs for non-existing queues

__CHANGES NOT IN UPSTREAM________________
Adding prod certs and changed cert date to 20210620
Adding Rocky secure boot certs
Fixing vmlinuz removal
Fixing UEFI CA path
Porting to 8.10, debranding and Rocky branding
Fixing pesign_key_name values
Race between reading mdstat and stopping an md device
fs/dcache: Control # of dentries in list_lru_node
fs/dcache: Add sysctl parameter dentry-fs-klimit to control # of dentries in filesystem
mm/list_lru: Make list_lru_add() return # if items in affected list_lru_node
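
The percentages in rebuild.details.txt can be re-derived from the raw counts. This is a minimal sketch that uses only the numbers printed above; the variable names and the two-decimal formatting are assumptions, not taken from the rebuild tooling.

```python
# Counts copied from the rebuild.details.txt output above.
upstream_range = 554396   # commits in v4.18~1..kernel-mainline
in_rpm = 42               # commits listed in the rpm changelog
matched = 32              # rpm commits matched to an upstream commit
clean_picks = 17
empty_picks = 15


def pct(part, whole):
    """Format part/whole as a percentage with two decimals."""
    return f"{100 * part / whole:.2f}%"


not_in_upstream = in_rpm - matched       # rpm-only changes
not_in_rpm = upstream_range - matched    # upstream commits absent from the rpm

summary = {
    "matched": pct(matched, in_rpm),
    "not_found": pct(not_in_upstream, in_rpm),
    "clean": pct(clean_picks, matched),
    "empty": pct(empty_picks, matched),
}
print(not_in_upstream, not_in_rpm, summary)
```

The derived values agree with the report, and the two lists above have the matching lengths: 15 entries under EMPTY COMMITS and 10 under CHANGES NOT IN UPSTREAM.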

Build

[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree
no .config file found, moving on
[TIMER]{MRPROPER}: 0s
x86_64 architecture detected, copying config
'configs/kernel-x86_64.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky8_10_rebuild-e252413ceae1"
Making olddefconfig
--
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf  --olddefconfig Kconfig
#
# configuration written to .config
#
Starting Build
scripts/kconfig/conf  --syncconfig Kconfig
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
--
  LD [M]  sound/usb/usx2y/snd-usb-usx2y.ko
  LD [M]  sound/virtio/virtio_snd.ko
  LD [M]  sound/x86/snd-hdmi-lpe-audio.ko
  LD [M]  sound/xen/snd_xen_front.ko
  LD [M]  virt/lib/irqbypass.ko
[TIMER]{BUILD}: 1574s
Making Modules
  INSTALL arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  INSTALL arch/x86/crypto/blowfish-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx2.ko
  INSTALL arch/x86/crypto/camellia-x86_64.ko
--
  INSTALL sound/virtio/virtio_snd.ko
  INSTALL sound/x86/snd-hdmi-lpe-audio.ko
  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-rocky8_10_rebuild-e252413ceae1+
[TIMER]{MODULES}: 14s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-rocky8_10_rebuild-e252413ceae1+ arch/x86/boot/bzImage \
        System.map "/boot"
[TIMER]{INSTALL}: 20s
Checking kABI
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-rocky8_10_rebuild-e252413ceae1+ and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 0s
[TIMER]{BUILD}: 1574s
[TIMER]{MODULES}: 14s
[TIMER]{INSTALL}: 20s
[TIMER]{TOTAL} 1613s
Rebooting in 10 seconds
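
As a quick cross-check of the timing summary, the per-phase timers can be added up and compared with the reported total. This is a sketch that parses only the [TIMER] lines shown above. The log does not say what accounts for any difference; steps that run outside the four timers (the kABI check, for example) are one possible explanation, which is an assumption here.

```python
import re

# [TIMER] lines copied from the end of the build log above.
log = """\
[TIMER]{MRPROPER}: 0s
[TIMER]{BUILD}: 1574s
[TIMER]{MODULES}: 14s
[TIMER]{INSTALL}: 20s
[TIMER]{TOTAL} 1613s
"""

# The TOTAL line has no colon, so the colon is optional in the pattern.
timers = {name: int(secs)
          for name, secs in re.findall(r"\[TIMER\]\{(\w+)\}:? (\d+)s", log)}

total = timers.pop("TOTAL")
phase_sum = sum(timers.values())
print(f"phases: {phase_sum}s, total: {total}s, untimed: {total - phase_sum}s")
```

The four phases add up to 1608s, so 5s of the 1613s total falls outside the individual timers.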

KSelfTest

[jmaple@devbox code]$ ls -rt kselftest.* | tail -n2 | while read line; do echo $line; grep '^ok ' $line | wc -l ; done
kselftest.4.18.0-rocky8_10_rebuild-5c89e3d2c056+.log
206
kselftest.4.18.0-rocky8_10_rebuild-e252413ceae1+.log
206
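
The shell pipeline above counts lines that begin with "ok ". A rough Python equivalent is sketched below; the sample lines are made up in TAP style and are not taken from the actual kselftest logs.

```python
def count_passes(lines):
    """Count lines that start with 'ok ', like grep '^ok ' | wc -l."""
    return sum(1 for line in lines if line.startswith("ok "))


# Made-up sample; real kselftest logs are much longer.
sample = [
    "TAP version 13",
    "ok 1 selftests: example: test_a",
    "not ok 2 selftests: example: test_b",
    "ok 3 selftests: example: test_c # SKIP",
]
print(count_passes(sample))
```

Lines starting with "not ok" are excluded, but a skipped test reported as "ok ... # SKIP" is still counted. Equal counts for the two kernels therefore show the same number of "ok" lines, not necessarily the same per-test results.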

PlaidCat added 30 commits July 16, 2025 04:01
jira LE-3587
cve CVE-2025-21991
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Florent Revest <[email protected]>
commit e3e8917

Currently, load_microcode_amd() iterates over all NUMA nodes, retrieves their
CPU masks and unconditionally accesses per-CPU data for the first CPU of each
mask.

According to Documentation/admin-guide/mm/numaperf.rst:

  "Some memory may share the same node as a CPU, and others are provided as
  memory only nodes."

Therefore, some node CPU masks may be empty and wouldn't have a "first CPU".

On a machine with far memory (and therefore CPU-less NUMA nodes):
- cpumask_of_node(nid) is 0
- cpumask_first(0) is CONFIG_NR_CPUS
- cpu_data(CONFIG_NR_CPUS) accesses the cpu_info per-CPU array at an
  index that is 1 out of bounds

This does not have any security implications since flashing microcode is
a privileged operation but I believe this has reliability implications by
potentially corrupting memory while flashing a microcode update.

When booting with CONFIG_UBSAN_BOUNDS=y on an AMD machine that flashes
a microcode update, I get the following splat:

  UBSAN: array-index-out-of-bounds in arch/x86/kernel/cpu/microcode/amd.c:X:Y
  index 512 is out of range for type 'unsigned long[512]'
  [...]
  Call Trace:
   dump_stack
   __ubsan_handle_out_of_bounds
   load_microcode_amd
   request_microcode_amd
   reload_store
   kernfs_fop_write_iter
   vfs_write
   ksys_write
   do_syscall_64
   entry_SYSCALL_64_after_hwframe

Change the loop to go over only NUMA nodes which have CPUs before determining
whether the first CPU on the respective node needs microcode update.
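
The indexing problem described above can be illustrated with a toy model. NR_CPUS = 512 is taken from the UBSAN message; the two-node topology and the helper names are made up for illustration, and the real kernel helpers work on bitmaps rather than Python sets.

```python
NR_CPUS = 512                                    # from 'unsigned long[512]' in the splat
cpu_data = [f"cpu{n}" for n in range(NR_CPUS)]   # stand-in for the per-CPU array


def cpumask_first(mask):
    """Lowest CPU in the mask, or NR_CPUS when the mask is empty."""
    return min(mask) if mask else NR_CPUS


# Hypothetical topology: node 1 is a memory-only node with no CPUs.
node_cpus = {0: {0, 1, 2, 3}, 1: set()}


def first_cpus_unchecked():
    # Old behaviour: every node is visited, even CPU-less ones.
    return [cpumask_first(mask) for mask in node_cpus.values()]


def first_cpus_checked():
    # Fixed behaviour: only nodes that have CPUs are visited.
    return [cpumask_first(mask) for mask in node_cpus.values() if mask]


bad = [cpu for cpu in first_cpus_unchecked() if cpu >= len(cpu_data)]
print("out-of-range indices without the check:", bad)
print("indices with the check:", first_cpus_checked())
```

In this model the CPU-less node yields index 512, one past the end of a 512-entry array, which is the value reported in the splat. Skipping empty nodes removes it.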

  [ bp: Massage commit message, fix typo. ]

Fixes: 7ff6edf ("x86/microcode/AMD: Fix mixed steppings support")
	Signed-off-by: Florent Revest <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit e3e8917)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
cve CVE-2025-22004
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Dan Carpenter <[email protected]>
commit f3009d0

The ->send() operation frees skb so save the length before calling
->send() to avoid a use after free.
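
The ordering issue can be sketched with a toy buffer object. Everything here is illustrative: the Skb class, send() and the stats dictionary are hypothetical stand-ins, and Python raises an error where C would silently read freed memory.

```python
class Skb:
    """Toy packet buffer; free() invalidates it."""

    def __init__(self, payload):
        self.payload = payload
        self.len = len(payload)

    def free(self):
        self.payload = None
        self.len = None          # stands in for memory that is no longer valid


def send(skb):
    """Hypothetical ->send(): consumes the buffer and returns 0 on success."""
    skb.free()
    return 0


def transmit_buggy(skb, stats):
    send(skb)
    stats["bytes"] += skb.len    # reads the buffer after it was freed


def transmit_fixed(skb, stats):
    length = skb.len             # save the length first
    send(skb)
    stats["bytes"] += length


stats = {"bytes": 0}
transmit_fixed(Skb(b"hello"), stats)
print(stats)

try:
    transmit_buggy(Skb(b"hello"), {"bytes": 0})
except TypeError as err:
    print("buggy path failed:", err)
```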

Fixes: 1da177e ("Linux-2.6.12-rc2")
	Signed-off-by: Dan Carpenter <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Paolo Abeni <[email protected]>

(cherry picked from commit f3009d0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
cve CVE-2025-23150
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Artem Sadovnikov <[email protected]>
commit 94824ac

Syzkaller detected a use-after-free issue in ext4_insert_dentry that was
caused by out-of-bounds access due to incorrect splitting in do_split.

BUG: KASAN: use-after-free in ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109
Write of size 251 at addr ffff888074572f14 by task syz-executor335/5847

CPU: 0 UID: 0 PID: 5847 Comm: syz-executor335 Not tainted 6.12.0-rc6-syzkaller-00318-ga9cda7c0ffed #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
 __asan_memcpy+0x40/0x70 mm/kasan/shadow.c:106
 ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109
 add_dirent_to_buf+0x3d9/0x750 fs/ext4/namei.c:2154
 make_indexed_dir+0xf98/0x1600 fs/ext4/namei.c:2351
 ext4_add_entry+0x222a/0x25d0 fs/ext4/namei.c:2455
 ext4_add_nondir+0x8d/0x290 fs/ext4/namei.c:2796
 ext4_symlink+0x920/0xb50 fs/ext4/namei.c:3431
 vfs_symlink+0x137/0x2e0 fs/namei.c:4615
 do_symlinkat+0x222/0x3a0 fs/namei.c:4641
 __do_sys_symlink fs/namei.c:4662 [inline]
 __se_sys_symlink fs/namei.c:4660 [inline]
 __x64_sys_symlink+0x7a/0x90 fs/namei.c:4660
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
 </TASK>

The following loop is located right above the 'if' statement.

for (i = count-1; i >= 0; i--) {
	/* is more than half of this entry in 2nd half of the block? */
	if (size + map[i].size/2 > blocksize/2)
		break;
	size += map[i].size;
	move++;
}

'i' in this case could go down to -1, in which case the sum of active
entries wouldn't exceed half the block size, but the previous behaviour
would also split in half if the sum exceeded that limit at the very last
block, which, with too many files with long names in a single block,
could lead to an out-of-bounds access and a subsequent use-after-free.
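
To make the two ways the loop can end concrete, here is a Python rendering of the loop quoted above. The entry sizes and the block size are made-up numbers, and the code that follows the loop in do_split() is not modelled because it is not shown in this message.

```python
def scan(sizes, blocksize):
    """Run the quoted loop and return (i, move) as they are when it ends.

    i == -1 means every entry was visited without taking the break.
    """
    size = 0
    move = 0
    i = len(sizes) - 1
    while i >= 0:
        # is more than half of this entry in 2nd half of the block?
        if size + sizes[i] // 2 > blocksize // 2:
            break
        size += sizes[i]
        move += 1
        i -= 1
    return i, move


# Made-up entry sizes; entry 0 is the large one in the first case.
print(scan([600, 100, 100, 100], 1024))   # break taken at the final step
print(scan([100, 100, 100, 100], 1024))   # loop runs to completion
```

In the first case the break is taken with i == 0 and move == 3; in the second the loop finishes with i == -1 and move == 4. A check after the loop that treats i == 0 the same way as i == -1 would be the kind of off-by-one this commit describes.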

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

	Cc: [email protected]
Fixes: 5872331 ("ext4: fix potential negative array index in do_split()")
	Signed-off-by: Artem Sadovnikov <[email protected]>
	Reviewed-by: Jan Kara <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Theodore Ts'o <[email protected]>
(cherry picked from commit 94824ac)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
cve CVE-2025-37738
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Bhupesh <[email protected]>
commit c8e008b

Once inside 'ext4_xattr_inode_dec_ref_all' we should
ignore xattrs entries past the 'end' entry.

This fixes the following KASAN reported issue:

==================================================================
BUG: KASAN: slab-use-after-free in ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
Read of size 4 at addr ffff888012c120c4 by task repro/2065

CPU: 1 UID: 0 PID: 2065 Comm: repro Not tainted 6.13.0-rc2+ #11
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x1fd/0x300
 ? tcp_gro_dev_warn+0x260/0x260
 ? _printk+0xc0/0x100
 ? read_lock_is_recursive+0x10/0x10
 ? irq_work_queue+0x72/0xf0
 ? __virt_addr_valid+0x17b/0x4b0
 print_address_description+0x78/0x390
 print_report+0x107/0x1f0
 ? __virt_addr_valid+0x17b/0x4b0
 ? __virt_addr_valid+0x3ff/0x4b0
 ? __phys_addr+0xb5/0x160
 ? ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
 kasan_report+0xcc/0x100
 ? ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
 ext4_xattr_inode_dec_ref_all+0xb8c/0xe90
 ? ext4_xattr_delete_inode+0xd30/0xd30
 ? __ext4_journal_ensure_credits+0x5f0/0x5f0
 ? __ext4_journal_ensure_credits+0x2b/0x5f0
 ? inode_update_timestamps+0x410/0x410
 ext4_xattr_delete_inode+0xb64/0xd30
 ? ext4_truncate+0xb70/0xdc0
 ? ext4_expand_extra_isize_ea+0x1d20/0x1d20
 ? __ext4_mark_inode_dirty+0x670/0x670
 ? ext4_journal_check_start+0x16f/0x240
 ? ext4_inode_is_fast_symlink+0x2f2/0x3a0
 ext4_evict_inode+0xc8c/0xff0
 ? ext4_inode_is_fast_symlink+0x3a0/0x3a0
 ? do_raw_spin_unlock+0x53/0x8a0
 ? ext4_inode_is_fast_symlink+0x3a0/0x3a0
 evict+0x4ac/0x950
 ? proc_nr_inodes+0x310/0x310
 ? trace_ext4_drop_inode+0xa2/0x220
 ? _raw_spin_unlock+0x1a/0x30
 ? iput+0x4cb/0x7e0
 do_unlinkat+0x495/0x7c0
 ? try_break_deleg+0x120/0x120
 ? 0xffffffff81000000
 ? __check_object_size+0x15a/0x210
 ? strncpy_from_user+0x13e/0x250
 ? getname_flags+0x1dc/0x530
 __x64_sys_unlinkat+0xc8/0xf0
 do_syscall_64+0x65/0x110
 entry_SYSCALL_64_after_hwframe+0x67/0x6f
RIP: 0033:0x434ffd
Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
RSP: 002b:00007ffc50fa7b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000107
RAX: ffffffffffffffda RBX: 00007ffc50fa7e18 RCX: 0000000000434ffd
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000005
RBP: 00007ffc50fa7be0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffc50fa7e08 R14: 00000000004bbf30 R15: 0000000000000001
 </TASK>

The buggy address belongs to the object at ffff888012c12000
 which belongs to the cache filp of size 360
The buggy address is located 196 bytes inside of
 freed 360-byte region [ffff888012c12000, ffff888012c12168)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12c12
head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x40(head|node=0|zone=0)
page_type: f5(slab)
raw: 0000000000000040 ffff888000ad7640 ffffea0000497a00 dead000000000004
raw: 0000000000000000 0000000000100010 00000001f5000000 0000000000000000
head: 0000000000000040 ffff888000ad7640 ffffea0000497a00 dead000000000004
head: 0000000000000000 0000000000100010 00000001f5000000 0000000000000000
head: 0000000000000001 ffffea00004b0481 ffffffffffffffff 0000000000000000
head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888012c11f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff888012c12000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff888012c12080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff888012c12100: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
 ffff888012c12180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=b244bda78289b00204ed
	Suggested-by: Thadeu Lima de Souza Cascardo <[email protected]>
	Signed-off-by: Bhupesh <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Theodore Ts'o <[email protected]>
(cherry picked from commit c8e008b)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Vasily Gorbik <[email protected]>
commit 8231a0e
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/8231a0e6.failed

Add detection for machine types 0x9175 and 0x9176 and set ELF platform
name to z17.

	Reviewed-by: Heiko Carstens <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 8231a0e)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/s390/kernel/processor.c
jira LE-3587
cve CVE-2022-49058
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Harshit Mogalapalli <[email protected]>
commit 64c4a37

Smatch printed a warning:
	arch/x86/crypto/poly1305_glue.c:198 poly1305_update_arch() error:
	__memcpy() 'dctx->buf' too small (16 vs u32max)

It's caused because Smatch marks 'link_len' as untrusted since it comes
from sscanf(). Add a check to ensure that 'link_len' is not larger than
the size of the 'link_str' buffer.
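
The kind of check being added can be sketched as follows. The buffer size, the header format and the function name are hypothetical, chosen only to show a parsed, untrusted length being validated before use.

```python
LINK_BUF_SIZE = 8                 # hypothetical size of the 'link_str' buffer


def copy_link(header, payload):
    """Copy link_len bytes of payload, where link_len comes from the header.

    header is untrusted text such as b"0005", so the parsed value is
    checked against the buffer size before it is used.
    """
    link_len = int(header)        # plays the role of the sscanf() result
    if link_len > LINK_BUF_SIZE:
        raise ValueError(f"link_len {link_len} exceeds buffer of {LINK_BUF_SIZE}")
    return payload[:link_len]


print(copy_link(b"0005", b"/tmp/abcdef"))
try:
    copy_link(b"9999", b"/tmp/abcdef")
except ValueError as err:
    print("rejected:", err)
```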

Fixes: c69c1b6 ("cifs: implement CIFSParseMFSymlink()")
	Signed-off-by: Harshit Mogalapalli <[email protected]>
	Reviewed-by: Ronnie Sahlberg <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 64c4a37)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
cve CVE-2024-57980
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Laurent Pinchart <[email protected]>
commit c6ef3a7

If the uvc_status_init() function fails to allocate the int_urb, it will
free the dev->status pointer but doesn't reset the pointer to NULL. This
results in the kfree() call in uvc_status_cleanup() trying to
double-free the memory. Fix it by resetting the dev->status pointer to
NULL after freeing it.
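
A toy model of the double free follows. The allocator class and function names are invented for illustration, with None playing the role of NULL. The model relies on kfree(NULL) being a no-op, which is why resetting the pointer makes the later cleanup harmless.

```python
class Heap:
    """Toy allocator that detects double frees."""

    def __init__(self):
        self.live = set()
        self.next_addr = 1

    def kmalloc(self):
        addr = self.next_addr
        self.next_addr += 1
        self.live.add(addr)
        return addr

    def kfree(self, addr):
        if addr is None:
            return                       # freeing NULL is a no-op
        if addr not in self.live:
            raise RuntimeError(f"double free of {addr}")
        self.live.remove(addr)


class Dev:
    status = None


def status_init_fails(heap, dev, reset_pointer):
    """Init path where a later allocation fails after status was allocated."""
    dev.status = heap.kmalloc()
    heap.kfree(dev.status)               # error path frees the buffer
    if reset_pointer:
        dev.status = None                # the fix: leave no stale pointer


def status_cleanup(heap, dev):
    heap.kfree(dev.status)


heap, dev = Heap(), Dev()
status_init_fails(heap, dev, reset_pointer=False)
try:
    status_cleanup(heap, dev)
except RuntimeError as err:
    print(err)

heap, dev = Heap(), Dev()
status_init_fails(heap, dev, reset_pointer=True)
status_cleanup(heap, dev)                # safe: the pointer is None
```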

Fixes: a31a405 ("V4L/DVB:usbvideo:don't use part of buffer for USB transfer #4")
	Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
Reviewed by: Ricardo Ribalda <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit c6ef3a7)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit 64627da

Avoid using the iterators after the list_for_each() constructs.
This patch should be a NOP, but makes cocci happier:

drivers/media/usb/uvc/uvc_ctrl.c:1861:44-50: ERROR: invalid reference to the index variable of the iterator on line 1850
drivers/media/usb/uvc/uvc_ctrl.c:2195:17-23: ERROR: invalid reference to the index variable of the iterator on line 2179

	Reviewed-by: Sergey Senozhatsky <[email protected]>
	Reviewed-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Ricardo Ribalda <[email protected]>
	Signed-off-by: Hans Verkuil <[email protected]>
(cherry picked from commit 64627da)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit d9fecd0

Now we keep a reference to the active fh for any call to uvc_ctrl_set,
regardless of whether it is an actual set, just a try, or the device
refused the operation.

We should only keep the file handle if the device actually accepted
applying the operation.

	Cc: [email protected]
Fixes: e5225c8 ("media: uvcvideo: Send a control event when a Control Change interrupt arrives")
	Suggested-by: Hans de Goede <[email protected]>
	Reviewed-by: Hans de Goede <[email protected]>
	Reviewed-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Ricardo Ribalda <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit d9fecd0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit 04d3398

ctrl->handle will only be different than NULL for controls that have
mappings. This is because that assignment is only done inside
uvc_ctrl_set() for mapped controls.

	Cc: [email protected]
Fixes: e5225c8 ("media: uvcvideo: Send a control event when a Control Change interrupt arrives")
	Reviewed-by: Laurent Pinchart <[email protected]>
	Reviewed-by: Hans de Goede <[email protected]>
	Signed-off-by: Ricardo Ribalda <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit 04d3398)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
cve CVE-2024-58002
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit 221cd51

When an async control is written, we copy a pointer to the file handle
that started the operation. That pointer will be used when the device is
done, which could be anytime in the future.

If the user closes that file descriptor, its structure will be freed,
and there will be one dangling pointer per pending async control, that
the driver will try to use.

Clean all the dangling pointers during release().

To avoid adding a performance penalty in the most common case (no async
operation), a counter has been introduced with some logic to make sure
that it is properly handled.

	Cc: [email protected]
Fixes: e5225c8 ("media: uvcvideo: Send a control event when a Control Change interrupt arrives")
	Reviewed-by: Hans de Goede <[email protected]>
	Signed-off-by: Ricardo Ribalda <[email protected]>
	Reviewed-by: Laurent Pinchart <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit 221cd51)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit 02baaa0

Make it explicit that the function is always called with ctrl_mutex
being held.

	Suggested-by: Laurent Pinchart <[email protected]>
	Reviewed-by: Laurent Pinchart <[email protected]>
	Reviewed-by: Hans de Goede <[email protected]>
	Signed-off-by: Ricardo Ribalda <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit 02baaa0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit d6b874f

Asynchronous controls trigger an event when they have completed their
operation.

This can cause the cached control value to differ from the value in
the device.

Let's flush the cache to be on the safe side.

	Signed-off-by: Ricardo Ribalda <[email protected]>
	Reviewed-by: Laurent Pinchart <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit d6b874f)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit 87ce177

Now we return VB2_BUF_STATE_DONE for valid and invalid frames. Propagate
the correct value, so the user can know if the frame is valid or not via
struct v4l2_buffer->flags.

	Reported-by: Hans de Goede <[email protected]>
Closes: https://lore.kernel.org/linux-media/[email protected]
Fixes: 6998b6f ("[media] uvcvideo: Use videobuf2-vmalloc")
	Signed-off-by: Ricardo Ribalda <[email protected]>
	Reviewed-by: Laurent Pinchart <[email protected]>
	Reviewed-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit 87ce177)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit 52fbe17

The module param `nodrop` defines what to do with frames that contain an
error: drop them or send them to userspace.

The default in the rest of the media subsystem is to return buffers with
an error to userspace with V4L2_BUF_FLAG_ERROR set in v4l2_buffer.flags.
In UVC we drop buffers with errors by default.

Change the default behaviour of uvcvideo to match the rest of the
drivers and maybe get rid of the module parameter in the future.

	Suggested-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Ricardo Ribalda <[email protected]>
	Reviewed-by: Laurent Pinchart <[email protected]>
	Reviewed-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit 52fbe17)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit 8869eb6

Right now the parameter value is read during video_registration and
cannot be changed afterwards, despite its 0644 permissions, which make
the user believe that the value can be written.

The parameter only affects the behaviour of uvc_queue_buffer_complete(),
with only one check per buffer.

We can read the value directly from uvc_queue_buffer_complete() and
therefore allow changing it through sysfs on the fly.

	Signed-off-by: Ricardo Ribalda <[email protected]>
	Reviewed-by: Laurent Pinchart <[email protected]>
	Reviewed-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit 8869eb6)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ricardo Ribalda <[email protected]>
commit 40ed9e9

If the user sets the nodrop parameter, print a deprecation warning once.
Hopefully they will come to the mailing list if it is an ABI change.

Now that we have a callback, take this chance to parse the parameter as
a boolean. We still say to userspace that it is a uint to avoid ABI
changes.

	Signed-off-by: Ricardo Ribalda <[email protected]>
	Reviewed-by: Laurent Pinchart <[email protected]>
	Reviewed-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laurent Pinchart <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit 40ed9e9)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
cve CVE-2022-49788
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Alexander Potapenko <[email protected]>
commit e5b0d06

`struct vmci_event_qp` allocated by qp_notify_peer() contains padding,
which may carry uninitialized data to the userspace, as observed by
KMSAN:

  BUG: KMSAN: kernel-infoleak in instrument_copy_to_user ./include/linux/instrumented.h:121
   instrument_copy_to_user ./include/linux/instrumented.h:121
   _copy_to_user+0x5f/0xb0 lib/usercopy.c:33
   copy_to_user ./include/linux/uaccess.h:169
   vmci_host_do_receive_datagram drivers/misc/vmw_vmci/vmci_host.c:431
   vmci_host_unlocked_ioctl+0x33d/0x43d0 drivers/misc/vmw_vmci/vmci_host.c:925
   vfs_ioctl fs/ioctl.c:51
  ...

  Uninit was stored to memory at:
   kmemdup+0x74/0xb0 mm/util.c:131
   dg_dispatch_as_host drivers/misc/vmw_vmci/vmci_datagram.c:271
   vmci_datagram_dispatch+0x4f8/0xfc0 drivers/misc/vmw_vmci/vmci_datagram.c:339
   qp_notify_peer+0x19a/0x290 drivers/misc/vmw_vmci/vmci_queue_pair.c:1479
   qp_broker_attach drivers/misc/vmw_vmci/vmci_queue_pair.c:1662
   qp_broker_alloc+0x2977/0x2f30 drivers/misc/vmw_vmci/vmci_queue_pair.c:1750
   vmci_qp_broker_alloc+0x96/0xd0 drivers/misc/vmw_vmci/vmci_queue_pair.c:1940
   vmci_host_do_alloc_queuepair drivers/misc/vmw_vmci/vmci_host.c:488
   vmci_host_unlocked_ioctl+0x24fd/0x43d0 drivers/misc/vmw_vmci/vmci_host.c:927
  ...

  Local variable ev created at:
   qp_notify_peer+0x54/0x290 drivers/misc/vmw_vmci/vmci_queue_pair.c:1456
   qp_broker_attach drivers/misc/vmw_vmci/vmci_queue_pair.c:1662
   qp_broker_alloc+0x2977/0x2f30 drivers/misc/vmw_vmci/vmci_queue_pair.c:1750

  Bytes 28-31 of 48 are uninitialized
  Memory access of size 48 starts at ffff888035155e00
  Data copied to user address 0000000020000100

Use memset() to prevent the infoleaks.
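
The effect of padding bytes can be reproduced with Python's ctypes. The two-field layout below is hypothetical and much simpler than the real struct vmci_event_qp; it only shows that assigning every field leaves the padding untouched unless the whole object is zeroed first.

```python
import ctypes


class Event(ctypes.Structure):
    # Hypothetical layout: a 1-byte field followed by a 4-byte field
    # leaves 3 bytes of padding at offsets 1..3.
    _fields_ = [("tag", ctypes.c_uint8), ("value", ctypes.c_uint32)]


SIZE = ctypes.sizeof(Event)
PAD = slice(ctypes.sizeof(ctypes.c_uint8), Event.value.offset)


def build_event(zero_first):
    # Start from memory full of 0xAA to imitate stale stack contents.
    raw = (ctypes.c_ubyte * SIZE)(*([0xAA] * SIZE))
    ev = Event.from_buffer(raw)
    if zero_first:
        ctypes.memset(ctypes.addressof(ev), 0, SIZE)   # the memset() approach
    ev.tag = 1
    ev.value = 2
    return bytes(raw)             # what a byte-wise copy of the object contains


print("padding without memset:", build_event(False)[PAD].hex())
print("padding with memset:   ", build_event(True)[PAD].hex())
```

Without the memset, the padding still holds the stale 0xAA bytes even though both fields were assigned. This is the same pattern as the "Bytes 28-31 of 48 are uninitialized" line in the report.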

Also speculatively fix qp_notify_peer_local(), which may suffer from the
same problem.

	Reported-by: [email protected]
	Cc: stable <[email protected]>
Fixes: 06164d2 ("VMCI: queue pairs implementation.")
	Signed-off-by: Alexander Potapenko <[email protected]>
	Reviewed-by: Vishnu Dasa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit e5b0d06)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Peter Zijlstra <[email protected]>
commit 54da6a0
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/54da6a09.failed

Use __attribute__((__cleanup__(func))) to build:

 - simple auto-release pointers using __free()

 - 'classes' with constructor and destructor semantics for
   scope-based resource management.

 - lock guards based on the above classes.

	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/20230612093537.614161713%40infradead.org
(cherry picked from commit 54da6a0)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/compiler-clang.h
#	include/linux/device.h
#	include/linux/file.h
#	include/linux/mutex.h
#	include/linux/preempt.h
#	include/linux/sched/task.h
#	include/linux/spinlock.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Peter Zijlstra <[email protected]>
commit 85be6d8
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/85be6d84.failed

Recent discussion brought about the realization that it makes sense for
no_free_ptr() to have __must_check semantics in order to avoid leaking
the resource.

Additionally, add a few comments to clarify why/how things work.

All credit to Linus on how to combine __must_check and the
stmt-expression.

	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
	Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
(cherry picked from commit 85be6d8)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/cleanup.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Peter Zijlstra <[email protected]>
commit e4ab322
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/e4ab322f.failed

Adds:

 - DEFINE_GUARD_COND() / DEFINE_LOCK_GUARD_1_COND() to extend existing
   guards with conditional lock primitives, eg. mutex_trylock(),
   mutex_lock_interruptible().

   nb. both primitives allow NULL 'locks', which cause the lock to
       fail (obviously).

 - extends scoped_guard() to not take the body when the
   conditional guard 'fails', e.g.

     scoped_guard (mutex_intr, &task->signal_cred_guard_mutex) {
	...
     }

   will only execute the body when the mutex is held.

 - provides scoped_cond_guard(name, fail, args...); which extends
   scoped_guard() to run 'fail' when the lock-acquire fails.
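
For readers less familiar with scope-based guards, the idea of a conditional guard can be approximated in Python. This is only an analogy: Python's with statement is a different mechanism from the kernel's __cleanup__-based macros, and try_guard is a made-up helper, not a kernel API.

```python
import threading
from contextlib import contextmanager


@contextmanager
def try_guard(lock):
    """Yield True while holding the lock, or False if it could not be taken."""
    held = lock.acquire(blocking=False)
    try:
        yield held
    finally:
        if held:
            lock.release()        # runs on every exit path from the block


lock = threading.Lock()
with try_guard(lock) as held:
    first = held                  # the lock was free, so it is taken
    with try_guard(lock) as nested:
        second = nested           # already held, so the try-lock fails
print(first, second, lock.locked())
```

Unlike scoped_guard(), the Python body always runs and has to test the yielded flag itself. The release on scope exit is the part that carries over.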

	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/20231102110706.460851167%40infradead.org
(cherry picked from commit e4ab322)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/cleanup.h
#	include/linux/mutex.h
#	include/linux/rwsem.h
#	include/linux/spinlock.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Ingo Molnar <[email protected]>
commit c80c449
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/c80c4490.failed

At some point during early development, the <linux/cleanup.h> header
must have been named <linux/guard.h>, as evidenced by the header
guard name:

  #ifndef __LINUX_GUARDS_H
  #define __LINUX_GUARDS_H

It ended up being <linux/cleanup.h>, but the old guard name for
a file name that was never upstream never changed.

Do that now - and while at it, also use the canonical _LINUX prefix,
instead of the less common __LINUX prefix.

	Signed-off-by: Ingo Molnar <[email protected]>
	Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/171664113181.10875.8784434350512348496.tip-bot2@tip-bot2
(cherry picked from commit c80c449)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/cleanup.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Christian Brauner <[email protected]>
commit c626914
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/c6269149.failed

Add a helper that returns the file descriptor and ensures that the old
variable contains a negative value. This makes it easy to rely on
CLASS(get_unused_fd).
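
A rough userspace sketch of such a helper, assuming GNU C. `take_fd_sketch()`, `close_fd()` and `open_pipe_read_end()` are hypothetical names, and -1 is used here as the negative marker value:

```c
#include <unistd.h>

/* Return the descriptor and leave a negative value in the old variable. */
#define take_fd_sketch(fd) ({ int taken_ = (fd); (fd) = -1; taken_; })

/* Hypothetical scope-exit callback: close only valid descriptors. */
static inline void close_fd(int *fd)
{
	if (*fd >= 0)
		close(*fd);
}

int open_pipe_read_end(void)
{
	int fds[2];

	if (pipe(fds) < 0)
		return -1;
	close(fds[1]);

	int rd __attribute__((cleanup(close_fd))) = fds[0];

	/* Any early return above this line would close rd automatically.
	 * On success, hand the descriptor out; rd becomes -1, so the
	 * cleanup callback leaves the returned descriptor open. */
	return take_fd_sketch(rd);
}
```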

Link: https://lore.kernel.org/r/[email protected]
	Reviewed-by: Jeff Layton <[email protected]>
	Reviewed-by: Josef Bacik <[email protected]>
	Reviewed-by: Alexander Mikhalitsyn <[email protected]>
	Signed-off-by: Christian Brauner <[email protected]>
(cherry picked from commit c626914)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/cleanup.h
#	include/linux/file.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Dan Williams <[email protected]>
commit d5934e7
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/d5934e76.failed

When proposing that PCI grow some new cleanup helpers for pci_dev_put()
and pci_dev_{lock,unlock} [1], Bjorn had some fundamental questions
about expectations and best practices. Upon reviewing an updated
changelog with those details he recommended adding them to documentation
in the header file itself.

Add that documentation and link it into the rendering for
Documentation/core-api/.

	Signed-off-by: Dan Williams <[email protected]>
	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
	Reviewed-by: Jonathan Cameron <[email protected]>
	Reviewed-by: Kevin Tian <[email protected]>
Link: https://lore.kernel.org/r/171175585714.2192972.12661675876300167762.stgit@dwillia2-xfh.jf.intel.com
(cherry picked from commit d5934e7)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	Documentation/core-api/index.rst
#	include/linux/cleanup.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Uros Bizjak <[email protected]>
commit f730fd5
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/f730fd53.failed

Guard functions in local_lock.h are defined using DEFINE_GUARD() and
DEFINE_LOCK_GUARD_1() macros having lock type defined as pointer in
the percpu address space. The functions, defined by these macros
return value in generic address space, causing:

cleanup.h:157:18: error: return from pointer to non-enclosed address space

and

cleanup.h:214:18: error: return from pointer to non-enclosed address space

when strict percpu checks are enabled.

Add explicit casts to remove address space of the returned pointer.

Found by GCC's named address space checks.

Fixes: e4ab322 ("cleanup: Add conditional guard support")
	Signed-off-by: Uros Bizjak <[email protected]>
	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
(cherry picked from commit f730fd5)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/cleanup.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Przemek Kitszel <[email protected]>
commit fcc22ac
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/fcc22ac5.failed

Change scoped_guard() and scoped_cond_guard() macros to make reasoning
about them easier for static analysis tools (smatch, compiler
diagnostics), especially to enable them to tell if the given usage of
scoped_guard() is with a conditional lock class (interruptible-locks,
try-locks) or not (like simple mutex_lock()).

Add compile-time error if scoped_cond_guard() is used for non-conditional
lock class.

Beyond easier tooling and a little shrink reported by bloat-o-meter
this patch enables developers to write code like:

int foo(struct my_drv *adapter)
{
	scoped_guard(spinlock, &adapter->some_spinlock)
		return adapter->spinlock_protected_var;
}

Current scoped_guard() implementation does not support that,
due to compiler complaining:
error: control reaches end of non-void function [-Werror=return-type]

Technical stuff about the change:
scoped_guard() macro uses common idiom of using "for" statement to declare
a scoped variable. Unfortunately, current logic is too hard for compiler
diagnostics to be sure that there is exactly one loop step; fix that.

To make any loop so trivial that there is no above warning, it must not
depend on any non-const variable to tell if there are more steps. There is
no obvious solution for that in C, but one could use the compound
statement expression with "goto" jumping past the "loop", effectively
leaving only the subscope part of the loop semantics.

More impl details:
one more level of macro indirection is now needed to avoid duplicating
label names;
I didn't spot any other place that is using the
"for (...; goto label) if (0) label: break;" idiom, so it's not packed for
reuse beyond the scoped_guard() family, which makes the actual macro code cleaner.

There was also a need to introduce const true/false variable per lock
class, it is used to aid compiler diagnostics reasoning about "exactly
1 step" loops (note that converting that to function would undo the whole
benefit).

Big thanks to Andy Shevchenko for help on this patch, both internal and
public, ranging from whitespace/formatting, through commit message
clarifications, general improvements, ending with presenting alternative
approaches - all despite not even liking the idea.

Big thanks to Dmitry Torokhov for the idea of compile-time check for
scoped_cond_guard() (to use it only with conditional locks), and general
improvements for the patch.

Big thanks to David Lechner for the idea to also cover scoped_cond_guard().

	Signed-off-by: Przemek Kitszel <[email protected]>
	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
	Reviewed-by: Dmitry Torokhov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
(cherry picked from commit fcc22ac)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/cleanup.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author David Lechner <[email protected]>
commit 36c2cf8
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/36c2cf88.failed

Add a new if_not_guard() macro to cleanup.h for handling
conditional guards such as mutex_trylock().

This is more ergonomic than scoped_guard() for most use cases.
Instead of hiding the error handling statement in the macro args, it
works like a normal if statement and allows the error path to be indented
while the normal code flow path is not indented. It also avoids unwanted
side effects from the hidden for loop in scoped_guard().

	Signed-off-by: David Lechner <[email protected]>
Co-developed-by: Fabio M. De Francesco <[email protected]>
	Signed-off-by: Fabio M. De Francesco <[email protected]>
	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
	Reviewed-by: Dan Williams <[email protected]>
Link: https://lkml.kernel.org/r/20241001-cleanup-if_not_cond_guard-v1-1-7753810b0f7a@baylibre.com
(cherry picked from commit 36c2cf8)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/cleanup.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Dmitry Torokhov <[email protected]>
commit dc1771f
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/dc1771f7.failed

This reverts commit c0a4009.

Probing a device can take an arbitrarily long time. In the field we observed
that, for example, probing bad micro-SD cards in an external USB card
reader (or maybe cards were good but cables were flaky) sometimes takes
longer than 2 minutes due to multiple retries at various levels of the
stack. We can not block uevent_show() method for that long because udev
is reading that attribute very often and that blocks udev and interferes
with booting of the system.

The change that introduced locking was concerned with dev_uevent()
racing with unbinding the driver. However we can handle it without
locking (which will be done in subsequent patch).

There was also a claim that synchronization with probe() is needed to
properly load USB drivers, however this is a red herring: the change
adding the lock was introduced in May of last year and USB loading and
probing worked properly for many years before that.

Revert the harmful locking.

	Cc: [email protected]
	Signed-off-by: Dmitry Torokhov <[email protected]>
	Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit dc1771f)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/base/core.c
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Dmitry Torokhov <[email protected]>
commit 04d3e54
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/04d3e546.failed

In preparation to closing a race when reading driver pointer in
dev_uevent() code, instead of setting device->driver pointer directly
introduce device_set_driver() helper.

	Signed-off-by: Dmitry Torokhov <[email protected]>
	Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit 04d3e54)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/base/base.h
#	drivers/base/dd.c
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Dmitry Torokhov <[email protected]>
commit 18daa52
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/18daa524.failed

If userspace reads the "uevent" device attribute at the same time as another
thread unbinds the device from its driver, the change of dev->driver from a
valid pointer to NULL may result in a crash. Fix this by using READ_ONCE()
when fetching the pointer, and take the bus' drivers klist lock to make sure
the driver instance will not disappear while we access it.

Use WRITE_ONCE() when setting the driver pointer to ensure there is no
tearing.
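
A simplified userspace sketch of the pointer access pattern, assuming GNU C. The `*_SKETCH` macros and struct names are hypothetical, reduced forms of the kernel helpers. The sketch covers only the single untorn load and store, not the klist locking that keeps the driver instance alive:

```c
#include <stddef.h>

/* Volatile accesses force exactly one load or store of the full object. */
#define READ_ONCE_SKETCH(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct driver_sketch { const char *name; };
struct device_sketch { struct driver_sketch *driver; };

void set_driver(struct device_sketch *dev, struct driver_sketch *drv)
{
	WRITE_ONCE_SKETCH(dev->driver, drv);
}

const char *driver_name_or_null(struct device_sketch *dev)
{
	/* Fetch the pointer once, then use only the local copy. Testing
	 * dev->driver and dereferencing it in separate reads could see
	 * NULL on the second read if another thread unbinds in between. */
	struct driver_sketch *drv = READ_ONCE_SKETCH(dev->driver);

	return drv ? drv->name : NULL;
}
```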

	Signed-off-by: Dmitry Torokhov <[email protected]>
	Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit 18daa52)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/base/base.h
#	drivers/base/core.c
PlaidCat added 3 commits July 16, 2025 04:01
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author Dan Carpenter <[email protected]>
commit cd7eb8f
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/cd7eb8f8.failed

Currently, if an automatically freed allocation is an error pointer that
will lead to a crash.  An example of this is in wm831x_gpio_dbg_show().

	char *label __free(kfree) = gpiochip_dup_line_label(chip, i);
	if (IS_ERR(label)) {
		dev_err(wm831x->dev, "Failed to duplicate label\n");
		continue;
	}

The auto clean up function should check for error pointers as well,
otherwise we're going to keep hitting issues like this.
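
A userspace sketch of a cleanup callback that tolerates error pointers, assuming GNU C. `err_ptr()`, `is_err_or_null()`, `free_unless_err()`, `use_label()` and the `frees_performed` counter are hypothetical illustrations modelled on the kernel's ERR_PTR convention, not the actual slab.h change:

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in the top page of the address space. */
static inline void *err_ptr(long err)
{
	return (void *)err;
}

static inline int is_err_or_null(const void *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

static int frees_performed;	/* instrumentation for the example only */

/* Scope-exit callback: skip NULL and error pointers, free real memory. */
static inline void free_unless_err(void *pp)
{
	void *p = *(void **)pp;

	if (!is_err_or_null(p)) {
		free(p);
		frees_performed++;
	}
}
#define auto_free_checked __attribute__((cleanup(free_unless_err)))

int use_label(int fail)
{
	char *label auto_free_checked = fail ? err_ptr(-12) : malloc(16);

	if (is_err_or_null(label))
		return -1;	/* cleanup runs, but must not free() here */
	label[0] = '\0';
	return 0;
}
```

Without the check in the callback, the early return would pass the encoded error value to free().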

Fixes: 54da6a0 ("locking: Introduce __cleanup() based infrastructure")
	Cc: <[email protected]>
	Signed-off-by: Dan Carpenter <[email protected]>
	Acked-by: David Rientjes <[email protected]>
	Signed-off-by: Vlastimil Babka <[email protected]>
(cherry picked from commit cd7eb8f)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/slab.h
jira LE-3587
Rebuild_History Non-Buildable kernel-4.18.0-553.62.1.el8_10
commit-author David Hildenbrand <[email protected]>
commit 2ccd42b
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/2ccd42b9.failed

If we find a vq without a name in our input array in
virtio_ccw_find_vqs(), we treat it as "non-existing" and set the vq pointer
to NULL; we will not call virtio_ccw_setup_vq() to allocate/setup a vq.

Consequently, we only create a queue if it actually exists (name != NULL)
and assign an incremental queue index to each such existing queue.

However, in virtio_ccw_register_adapter_ind()->get_airq_indicator() we
will not ignore these "non-existing queues", but instead assign an airq
indicator to them.

Besides never releasing them in virtio_ccw_drop_indicators() (because
there is no virtqueue), the bigger issue seems to be that there will be a
disagreement between the device and the Linux guest about the airq
indicator to be used for notifying a queue, because the indicator bit
for adapter I/O interrupt is derived from the queue index.

The virtio spec states under "Setting Up Two-Stage Queue Indicators":

	... indicator contains the guest address of an area wherein the
	indicators for the devices are contained, starting at bit_nr, one
	bit per virtqueue of the device.

And further in "Notification via Adapter I/O Interrupts":

	For notifying the driver of virtqueue buffers, the device sets the
	bit in the guest-provided indicator area at the corresponding
	offset.

For example, QEMU uses in virtio_ccw_notify() the queue index (passed as
"vector") to select the relevant indicator bit. If a queue does not exist,
it does not have a corresponding indicator bit assigned, because it
effectively doesn't have a queue index.

Using a virtio-balloon-ccw device under QEMU with free-page-hinting
disabled ("free-page-hint=off") but free-page-reporting enabled
("free-page-reporting=on") will result in free page reporting
not working as expected: in the virtio_balloon driver, we'll be stuck
forever in virtballoon_free_page_report()->wait_event(), because the
waitqueue will not be woken up as the notification from the device is
lost: it would use the wrong indicator bit.

Free page reporting stops working and we get splats (when configured to
detect hung wqs) like:

 INFO: task kworker/1:3:463 blocked for more than 61 seconds.
       Not tainted 6.14.0 #4
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:kworker/1:3 [...]
 Workqueue: events page_reporting_process
 Call Trace:
  [<000002f404e6dfb2>] __schedule+0x402/0x1640
  [<000002f404e6f22e>] schedule+0x3e/0xe0
  [<000002f3846a88fa>] virtballoon_free_page_report+0xaa/0x110 [virtio_balloon]
  [<000002f40435c8a4>] page_reporting_process+0x2e4/0x740
  [<000002f403fd3ee2>] process_one_work+0x1c2/0x400
  [<000002f403fd4b96>] worker_thread+0x296/0x420
  [<000002f403fe10b4>] kthread+0x124/0x290
  [<000002f403f4e0dc>] __ret_from_fork+0x3c/0x60
  [<000002f404e77272>] ret_from_fork+0xa/0x38

There was recently a discussion [1] whether the "holes" should be
treated differently again, effectively assigning also non-existing
queues a queue index: that should also fix the issue, but requires other
workarounds to not break existing setups.

Let's fix it without affecting existing setups for now by properly ignoring
the non-existing queues, so the indicator bits will match the queue
indexes.
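
The intended mapping can be sketched as follows. This is an illustrative standalone function, not the virtio_ccw code; `assign_indicator_bits()` and its parameters are hypothetical:

```c
#include <stddef.h>

/*
 * For each slot in names[], store the indicator bit to use, or -1 when
 * the slot is a hole (names[i] == NULL). Bits are handed out only to
 * existing queues, so each bit equals that queue's incremental index.
 * Returns the number of existing queues.
 */
int assign_indicator_bits(const char *const names[], size_t nvqs,
			  int bit_for_slot[])
{
	int queue_idx = 0;

	for (size_t i = 0; i < nvqs; i++) {
		if (!names[i]) {
			bit_for_slot[i] = -1;	/* hole: no bit consumed */
			continue;
		}
		bit_for_slot[i] = queue_idx++;
	}
	return queue_idx;
}
```

With two holes ahead of the last queue, that queue gets bit 2 rather than bit 4, which is the bit a device indexing by queue number would set.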

[1] https://lore.kernel.org/all/[email protected]/

Fixes: a229989 ("virtio: don't allocate vqs when names[i] = NULL")
	Reported-by: Chandra Merla <[email protected]>
	Cc: [email protected]
	Signed-off-by: David Hildenbrand <[email protected]>
	Tested-by: Thomas Huth <[email protected]>
	Reviewed-by: Thomas Huth <[email protected]>
	Reviewed-by: Cornelia Huck <[email protected]>
	Acked-by: Michael S. Tsirkin <[email protected]>
	Acked-by: Christian Borntraeger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 2ccd42b)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/s390/virtio/virtio_ccw.c
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 554396
Number of commits in rpm: 42
Number of commits matched with upstream: 32 (76.19%)
Number of commits in upstream but not in rpm: 554364
Number of commits NOT found in upstream: 10 (23.81%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.62.1.el8_10 for kernel-4.18.0-553.62.1.el8_10
Clean Cherry Picks: 17 (53.12%)
Empty Cherry Picks: 15 (46.88%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-4.18.0-553.62.1.el8_10/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
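
As a small illustration of that naming rule, a sketch that derives the file name from a full upstream SHA. `failed_ref_name()` is a hypothetical helper, not part of the rebuild tooling:

```c
#include <stdio.h>
#include <string.h>

/*
 * Write "<first 8 hex chars>.failed" into out.
 * Returns the length written, or -1 if the SHA is too short or the
 * output buffer is too small.
 */
int failed_ref_name(const char *sha, char *out, size_t outlen)
{
	int n;

	if (strlen(sha) < 8)
		return -1;
	n = snprintf(out, outlen, "%.8s.failed", sha);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return n;
}
```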

@jdieter jdieter left a comment

LGTM

@PlaidCat PlaidCat merged commit e252413 into rocky8_10 Jul 16, 2025
2 checks passed
@PlaidCat PlaidCat deleted the rocky8_10_rebuild branch July 16, 2025 18:16