-
Notifications
You must be signed in to change notification settings - Fork 10
[rocky8_10] History rebuild for kernel-4.18.0-553.82.1.el8_10 #674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
PlaidCat
wants to merge
146
commits into
rocky8_10
Choose a base branch
from
rocky8_10_rebuild
base: rocky8_10
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
jira LE-4669 cve CVE-2023-53226 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Polaris Pi <[email protected]> commit 1195852 Make sure mwifiex_process_mgmt_packet, mwifiex_process_sta_rx_packet and mwifiex_process_uap_rx_packet, mwifiex_uap_queue_bridged_pkt and mwifiex_process_rx_packet not out-of-bounds access the skb->data buffer. Fixes: 2dbaf75 ("mwifiex: report received management frames to cfg80211") Signed-off-by: Polaris Pi <[email protected]> Reviewed-by: Matthew Wang <[email protected]> Reviewed-by: Brian Norris <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] (cherry picked from commit 1195852) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 cve CVE-2023-53226 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Polaris Pi <[email protected]> commit 2785851 Add missed return in mwifiex_uap_queue_bridged_pkt() and mwifiex_process_rx_packet(). Fixes: 1195852 ("wifi: mwifiex: Fix OOB and integer underflow when rx packets") Signed-off-by: Polaris Pi <[email protected]> Reported-by: Dmitry Antipov <[email protected]> Acked-by: Brian Norris <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] (cherry picked from commit 2785851) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 cve CVE-2023-53226 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Pin-yen Lin <[email protected]> commit aef7a03 Only skip the code path trying to access the rfc1042 headers when the buffer is too small, so the driver can still process packets without rfc1042 headers. Fixes: 1195852 ("wifi: mwifiex: Fix OOB and integer underflow when rx packets") Signed-off-by: Pin-yen Lin <[email protected]> Acked-by: Brian Norris <[email protected]> Reviewed-by: Matthew Wang <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] (cherry picked from commit aef7a03) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 cve CVE-2023-53257 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Johannes Berg <[email protected]> commit 19e4a47 Before checking the action code, check that it even exists in the frame. Reported-by: [email protected] Signed-off-by: Johannes Berg <[email protected]> (cherry picked from commit 19e4a47) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Zhang Yi <[email protected]> commit 3fcc2b8 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/3fcc2b88.failed Refactor and cleanup ext4_da_map_blocks(), reduce some unnecessary parameters and branches, no logic changes. Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> (cherry picked from commit 3fcc2b8) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/ext4/inode.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Zhang Yi <[email protected]> commit acf795d Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/acf795dc.failed ext4_da_map_blocks() only hold i_data_sem in shared mode and i_rwsem when inserting delalloc extents, it could be raced by another querying path of ext4_map_blocks() without i_rwsem, .e.g buffered read path. Suppose we buffered read a file containing just a hole, and without any cached extents tree, then it is raced by another delayed buffered write to the same area or the near area belongs to the same hole, and the new delalloc extent could be overwritten to a hole extent. pread() pwrite() filemap_read_folio() ext4_mpage_readpages() ext4_map_blocks() down_read(i_data_sem) ext4_ext_determine_hole() //find hole ext4_ext_put_gap_in_cache() ext4_es_find_extent_range() //no delalloc extent ext4_da_map_blocks() down_read(i_data_sem) ext4_insert_delayed_block() //insert delalloc extent ext4_es_insert_extent() //overwrite delalloc extent to hole This race could lead to inconsistent delalloc extents tree and incorrect reserved space counter. Fix this by converting to hold i_data_sem in exclusive mode when adding a new delalloc extent in ext4_da_map_blocks(). Cc: [email protected] Signed-off-by: Zhang Yi <[email protected]> Suggested-by: Jan Kara <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> (cherry picked from commit acf795d) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/ext4/inode.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Zhang Yi <[email protected]> commit 8e4e5cd Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/8e4e5cdf.failed Factor out a new common helper ext4_map_query_blocks() from the ext4_da_map_blocks(), it query and return the extent map status on the inode's extent path, no logic changes. Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Ritesh Harjani (IBM) <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> (cherry picked from commit 8e4e5cd) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/ext4/inode.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Zhang Yi <[email protected]> commit 0ea6560 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/0ea6560a.failed ext4_da_map_blocks looks up for any extent entry in the extent status tree (w/o i_data_sem) and then the looks up for any ondisk extent mapping (with i_data_sem in read mode). If it finds a hole in the extent status tree or if it couldn't find any entry at all, it then takes the i_data_sem in write mode to add a da entry into the extent status tree. This can actually race with page mkwrite & fallocate path. Note that this is ok between 1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the folio lock 2. ext4 buffered write path v/s ext4 fallocate because of the inode lock. But this can race between ext4_page_mkwrite() & ext4 fallocate path ext4_page_mkwrite() ext4_fallocate() block_page_mkwrite() ext4_da_map_blocks() //find hole in extent status tree ext4_alloc_file_blocks() ext4_map_blocks() //allocate block and unwritten extent ext4_insert_delayed_block() ext4_da_reserve_space() //reserve one more block ext4_es_insert_delayed_block() //drop unwritten extent and add delayed extent by mistake Then, the delalloc extent is wrong until writeback and the extra reserved block can't be released any more and it triggers below warning: EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared! Fix the problem by looking up extent status tree again while the i_data_sem is held in write mode. If it still can't find any entry, then we insert a new da entry into the extent status tree. Cc: [email protected] Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> (cherry picked from commit 0ea6560) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/ext4/inode.c
…ce() jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Zhang Yi <[email protected]> commit 53ce42a When removing space, we should use EXT4_EX_NOCACHE because we don't need to cache extents, and we should also use EXT4_EX_NOFAIL to prevent metadata inconsistencies that may arise from memory allocation failures. While ext4_ext_remove_space() already uses these two flags in most places, they are missing in ext4_ext_search_right() and read_extent_tree_block() calls. Unify the flags to ensure consistent behavior throughout the extent removal process. Signed-off-by: Zhang Yi <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> (cherry picked from commit 53ce42a) Signed-off-by: Jonathan Maple <[email protected]>
…teback jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Zhang Yi <[email protected]> commit 402e38e Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/402e38e6.failed Currently, in the I/O writeback path, ext4_map_blocks() may attempt to cache additional unrelated extents in the extent status tree without holding the inode's i_rwsem and the mapping's invalidate_lock. This can lead to stale extent status entries remaining in certain scenarios, potentially causing data corruption. For example, when performing a collapse range in ext4_collapse_range(), it clears the extent cache and dirty pages before removing blocks and shifting extents. It also holds the i_data_sem during these two operations. However, both ext4_ext_remove_space() and ext4_ext_shift_extents() may briefly release the i_data_sem if journal credits are insufficient (ext4_datasem_ensure_credits()). If another writeback process writes dirty pages from other regions during this interval, it may cache extents that are about to be modified. Unless ext4_collapse_range() explicitly clears the extent cache again, these cached entries can become stale and inconsistent with the actual extents. 0 a n b c m | | | | | | [www][wwwwww][wwwwwwww]...[wwwww][wwww]... | | N M Assume that block a is dirty. The collapse range operation is removing data from n to m and drops i_data_sem immediately after removing the extent from b to c. At the same time, a concurrent writeback begins to write back block a; it will reloads the extent from [n, b) into the extent status tree since it does not hold the i_rwsem or the invalidate_lock. After the collapse range operation, it left the stale extent [n, b), which points logical block n to N, but the actual physical block of n should be M. Similarly, both ext4_insert_range() and ext4_truncate() have the same problem. ext4_punch_hole() survived since it re-add a hole extent entry after removing space since commit 9f11182 ("ext4: add a hole extent entry in cache after punch"). In most cases, during dirty page writeback, the block mapping information is likely to be found in the extent cache, making it less necessary to search for physical extents. Consequently, loading unrelated extent caches during writeback appears to be ineffective. Therefore, fix this by adds EXT4_EX_NOCACHE in the writeback path to prevent caching of unrelated extents, eliminating this potential source of corruption. Signed-off-by: Zhang Yi <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> (cherry picked from commit 402e38e) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/ext4/fast_commit.c # fs/ext4/inode.c
jira LE-4669 cve CVE-2025-39864 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Dmitry Antipov <[email protected]> commit 26e8444 Following bss_free() quirk introduced in commit 776b358 ("cfg80211: track hidden SSID networks properly"), adjust cfg80211_update_known_bss() to free the last beacon frame elements only if they're not shared via the corresponding 'hidden_beacon_bss' pointer. Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=30754ca335e6fb7e3092 Fixes: 3ab8227 ("cfg80211: refactor cfg80211_bss_update") Signed-off-by: Dmitry Antipov <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Johannes Berg <[email protected]> (cherry picked from commit 26e8444) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 703a4af Move gfs2_log_pointers_init to recovery.c: there is no need for inlining this function. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 703a4af) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Bob Peterson <[email protected]> commit 9d9b160 We must not call gfs2_consist (which does a file system withdraw) from the freeze glock's freeze_go_xmote_bh function because the withdraw will try to use the freeze glock, thus causing a glock recursion error. This patch changes freeze_go_xmote_bh to call function gfs2_assert_withdraw_delayed instead of gfs2_consist to avoid recursion. Signed-off-by: Bob Peterson <[email protected]> (cherry picked from commit 9d9b160) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 8a43d21 Move the initialization of sdp->sd_log_sequence and sdp->sd_log_flush_head inside gfs2_log_pointers_init(). Use gfs2_replay_incr_blk(). Before this change, the log head lookup code in freeze_go_xmote_bh() didn't update sdp->sd_log_flush_head. This is now fixed, but the code in freeze_go_xmote_bh() appears to be pretty useless in the first place: on a frozen filesystem, the log head will not change. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 8a43d21) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 2ebb94a In function clean_journal(), update @Head to point at the log header that indicates successful recovery: this is where logging needs to resume. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 2ebb94a) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit b66f723 In gfs2_make_fs_rw(), make sure to call gfs2_consist() to report an inconsistency and mark the filesystem as withdrawn when gfs2_find_jhead() fails. At the end of gfs2_make_fs_rw(), when we discover that the filesystem has been withdrawn, make sure we report an error. This also replaces the gfs2_withdrawn() check after gfs2_find_jhead(). Reported-by: Tetsuo Handa <[email protected]> Cc: [email protected] Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit b66f723) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit 93bd5ed Currently at mount time, the recovery code looks up the current log head and, if necessary, replays the log and writes a recovery header to indicate that the log is clean. It does that for each log that may need recovery. We also know that our own log will always be checked as part of that process. Then, the mount code looks up the log head of our own log again. The double log head lookup can be costly, but more importantly, it is unnecessary because we can trivially compute the position of the log head after recovery; all we need to do for that is bump the position and lh_sequence by one when writing a recovery header. With that in mind, move the call to gfs2_log_pointers_init() into gfs2_recover_func() and get rid of the double lookup in gfs2_make_fs_rw(). Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 93bd5ed) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Bob Peterson <[email protected]> commit f5456b5 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/f5456b5d.failed Before this patch, the system ail lists were cleaned up if the logd process withdrew, but on other withdraws, they were not cleaned up. This included the cleaning up of the revokes as well. This patch reorganizes things a bit so that all withdraws (not just logd) clean up the ail lists, including any pending revokes. Signed-off-by: Bob Peterson <[email protected]> Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit f5456b5) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/log.h
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Andreas Gruenbacher <[email protected]> commit e320050 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/e320050e.failed We are no longer calling gfs2_find_jhead() on the same log twice, so there is no more reason for keeping the log contents cached across those calls. In addition, log head lookup and log header writing didn't go through the same address space and so the caching wasn't even fully working, anyway. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit e320050) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/gfs2/lops.h
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Vasily Averin <[email protected]> commit 961c613 veth netdevice defines own rx queues and allocates array containing up to 4095 ~750-bytes-long 'struct veth_rq' elements. Such allocation is quite huge and should be accounted to memcg. Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]> (cherry picked from commit 961c613) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Jakub Kicinski <[email protected]> commit 1ce7d30 struct veth_rq is pretty large, 832B total without debug options enabled. Since commit under Fixes we try to pre-allocate enough queues for every possible CPU. Miao Wang reports that this may lead to order-5 allocations which will fail in production. Let the allocation fallback to vmalloc() and try harder. These are the same flags we pass to netdev queue allocation. Reported-and-tested-by: Miao Wang <[email protected]> Fixes: 9d3684c ("veth: create by default nr_possible_cpus queues") Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]> (cherry picked from commit 1ce7d30) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Steve French <[email protected]> commit 65af8f0 Applications that create and extend and write to a file do not expect to see 0 allocation size. When file is extended, set its allocation size to a plausible value until we have a chance to query the server for it. When the file is cached this will prevent showing an impossible number of allocated blocks (like 0). This fixes e.g. xfstests 614 which does 1) create a file and set its size to 64K 2) mmap write 64K to the file 3) stat -c %b for the file (to query the number of allocated blocks) It was failing because we returned 0 blocks. Even though we would return the correct cached file size, we returned an impossible allocation size. Signed-off-by: Steve French <[email protected]> CC: <[email protected]> Reviewed-by: Aurelien Aptel <[email protected]> (cherry picked from commit 65af8f0) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 45a4546 For AES256 encryption (GCM and CCM), we need to adjust the size of a few fields to 32 bytes instead of 16 to accommodate the larger keys. Also, the L value supplied to the key generator needs to be changed from to 256 when these algorithms are used. Keeping the ioctl struct for dumping keys of the same size for now. Will send out a different patch for that one. Signed-off-by: Shyam Prasad N <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]> CC: <[email protected]> # v5.10+ Signed-off-by: Steve French <[email protected]> (cherry picked from commit 45a4546) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit d1a931c We needed a way to identify the channels under the smb session which are in reconnect, so that the traffic to other channels can continue. So I replaced the bool need_reconnect with a bitmask identifying all the channels that need reconnection (named chans_need_reconnect). When a channel needs reconnection, the bit corresponding to the index of the server in ses->chans is used to set this bitmask. Checking if no channels or all the channels need reconnect then becomes very easy. Also wrote some helper macros for checking and setting the bits. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit d1a931c) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit f486ef8 We use the concept of "binding" when one of the secondary channel is in the process of connecting/reconnecting to the server. Till this binding process completes, and the channel is bound to an existing session, we redirect traffic from other established channels on the binding channel, effectively blocking all traffic till individual channels get reconnected. With my last set of commits, we can get rid of this binding serialization. We now have a bitmap of connection states for each channel. We will use this bitmap instead for tracking channel status. Having a bitmap also now enables us to keep the session alive, as long as even a single channel underneath is alive. Unfortunately, this also meant that we need to supply the tcp connection info for the channel during all negotiate and session setup functions. These changes have resulted in a slightly bigger code churn. However, I expect perf and robustness improvements in the mchan scenario after this change. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit f486ef8) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 66eb0c6 Use ses->chans_need_reconnect bitmask to print the connection status of each channel under an SMB session. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 66eb0c6) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 2e0fa29 chan_count keeps track of the total number of channels. Since at least the primary channel will always be connected, this value can never go below 1. Warn if that happens. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 2e0fa29) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 183eea2 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/183eea2e.failed With the new per-channel bitmask for reconnect, we have an option to reconnect the tcp session associated with the channel without reconnecting the smb session. i.e. if there are still channels to operate on, we can continue to use the smb session and tcon. However, there are cases where it makes sense to reconnect the smb session even when there are active channels underneath. For example for SMB session expiry. With this patch, we'll have an option to do either, and use the correct option for specific cases. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 183eea2) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/connect.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 080dc5e Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/080dc5e5.failed While checking/updating status for tcp ses, smb ses or tcon, we take GlobalMid_Lock. This doesn't make any sense. Replaced it with cifs_tcp_ses_lock. Ideally, we should take a spin lock per struct. But since tcp ses, smb ses and tcon objects won't add up to a lot, I think there should not be too much contention. Also, in few other places, these are checked without locking. Added locking for these. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 080dc5e) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/connect.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Enzo Matsumiya <[email protected]> commit 1913e11 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/1913e111.failed Mount will hang if using SMB1 and DFS. This is because every call to get_next_mid() will, unconditionally, mark tcpStatus to CifsNeedReconnect before even establishing the initial connect, because "reconnect" variable was not initialized. Initializing "reconnect" to false fix this issue. Fixes: 220c5bc25d87 ("cifs: take cifs_tcp_ses_lock for status checks") Signed-off-by: Enzo Matsumiya <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 1913e11) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/smb1ops.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Winston Wen <[email protected]> commit 99f2807 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/99f28070.failed Don't collect exiting session in smb2_reconnect_server(), because it will be released soon. Note that the exiting session will stay in server->smb_ses_list until it complete the cifs_free_ipc() and logoff() and then delete itself from the list. Signed-off-by: Winston Wen <[email protected]> Reviewed-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 99f2807) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/smb2pdu.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Winston Wen <[email protected]> commit 66be5c4 Chech the session state and skip it if it's exiting. Signed-off-by: Winston Wen <[email protected]> Reviewed-by: Shyam Prasad N <[email protected]> Cc: [email protected] Signed-off-by: Steve French <[email protected]> (cherry picked from commit 66be5c4) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit ac615db We do not log the session id in crypt_setup when a matching session is not found. Printing the session id helps debugging here. This change does just that. This change also changes this log to FYI, since it is normal to see then during a reconnect. Doing the same for a similar log in case of signed connections. The plan is to have a tracepoint for this event, so that we will be able to see this event if need be. That will be done as another change. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit ac615db) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Winston Wen <[email protected]> commit ff7d80a Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/ff7d80a9.failed We switch session state to SES_EXITING without cifs_tcp_ses_lock now, it may lead to potential use-after-free issue. Consider the following execution processes: Thread 1: __cifs_put_smb_ses() spin_lock(&cifs_tcp_ses_lock) if (--ses->ses_count > 0) spin_unlock(&cifs_tcp_ses_lock) return spin_unlock(&cifs_tcp_ses_lock) ---> **GAP** spin_lock(&ses->ses_lock) if (ses->ses_status == SES_GOOD) ses->ses_status = SES_EXITING spin_unlock(&ses->ses_lock) Thread 2: cifs_find_smb_ses() spin_lock(&cifs_tcp_ses_lock) list_for_each_entry(ses, ...) spin_lock(&ses->ses_lock) if (ses->ses_status == SES_EXITING) spin_unlock(&ses->ses_lock) continue ... spin_unlock(&ses->ses_lock) if (ret) cifs_smb_ses_inc_refcount(ret) spin_unlock(&cifs_tcp_ses_lock) If thread 1 is preempted in the gap and thread 2 start executing, thread 2 will get the session, and soon thread 1 will switch the session state to SES_EXITING and start releasing it, even though thread 1 had increased the session's refcount and still uses it. So switch session state under cifs_tcp_ses_lock to eliminate this gap. Signed-off-by: Winston Wen <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit ff7d80a) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/connect.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 05844bd We store the last updated time for interface list while parsing the interfaces. This change is to just print that info in DebugData. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 05844bd) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 cve CVE-2023-52752 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Paulo Alcantara <[email protected]> commit d328c09 Skip SMB sessions that are being teared down (e.g. @ses->ses_status == SES_EXITING) in cifs_debug_data_proc_show() to avoid use-after-free in @SES. This fixes the following GPF when reading from /proc/fs/cifs/DebugData while mounting and umounting [ 816.251274] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d81: 0000 [#1] PREEMPT SMP NOPTI ... [ 816.260138] Call Trace: [ 816.260329] <TASK> [ 816.260499] ? die_addr+0x36/0x90 [ 816.260762] ? exc_general_protection+0x1b3/0x410 [ 816.261126] ? asm_exc_general_protection+0x26/0x30 [ 816.261502] ? cifs_debug_tcon+0xbd/0x240 [cifs] [ 816.261878] ? cifs_debug_tcon+0xab/0x240 [cifs] [ 816.262249] cifs_debug_data_proc_show+0x516/0xdb0 [cifs] [ 816.262689] ? seq_read_iter+0x379/0x470 [ 816.262995] seq_read_iter+0x118/0x470 [ 816.263291] proc_reg_read_iter+0x53/0x90 [ 816.263596] ? srso_alias_return_thunk+0x5/0x7f [ 816.263945] vfs_read+0x201/0x350 [ 816.264211] ksys_read+0x75/0x100 [ 816.264472] do_syscall_64+0x3f/0x90 [ 816.264750] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 816.265135] RIP: 0033:0x7fd5e669d381 Cc: [email protected] Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit d328c09) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit c3326a6 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/c3326a61.failed We introduced a helper function to be used by non-cifsd threads to mark the connection for reconnect. For multichannel, when only a particular channel needs to be reconnected, this had a bug. This change fixes that by marking that particular channel for reconnect. Fixes: dca6581 ("cifs: use a different reconnect helper for non-cifsd threads") Cc: [email protected] Signed-off-by: Shyam Prasad N <[email protected]> Reviewed-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit c3326a6) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/connect.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 6e5e64c If the mount command has specified multichannel as a mount option, but multichannel is found to be unsupported by the server at the time of mount, we set chan_max to 1. Which means that the user needs to remount the share if the server starts supporting multichannel. This change removes this reset. What it means is that if the user specified multichannel or max_channels during mount, and at this time, multichannel is not supported, but the server starts supporting it at a later point, the client will be capable of scaling out the number of channels. Cc: [email protected] Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 6e5e64c) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit d9a6d78 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/d9a6d780.failed During a session reconnect, it is possible that the server moved to another physical server (happens in case of Azure files). So at this time, force a query of server interfaces again (in case of multichannel session), such that the secondary channels connect to the right IP addresses (possibly updated now). Cc: [email protected] Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit d9a6d78) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/connect.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 0c51cc6 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/0c51cc6f.failed So far, SMB multichannel could only scale up, but not scale down the number of channels. In this series of patch, we now allow the client to deal with the case of multichannel disabled on the server when the share is mounted. With that change, we now need the ability to scale down the channels. This change allows the client to deal with cases of missing channels more gracefully. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 0c51cc6) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/connect.c # fs/cifs/sess.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit a6d8fb5 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/a6d8fb54.failed Today, if the server interfaces RSS capable, we simply choose the fastest interface to setup a channel. This is not a scalable approach, and does not make a lot of attempt to distribute the connections. This change does a weighted distribution of channels across all the available server interfaces, where the weight is a function of the advertised interface speed. Also make sure that we don't mix rdma and non-rdma for channels. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit a6d8fb5) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/sess.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit fa1d050 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/fa1d0508.failed The refcounting of server interfaces should account for the primary channel too. Although this is not strictly necessary, doing so will account for the primary channel in DebugData. Cc: [email protected] Reviewed-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit fa1d050) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/sess.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 7257bcf Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/7257bcf3.failed cifs_chan_is_iface_active checks the channels of a session to see if the associated iface is active. This should always happen with chan_lock held. However, these two callers of this function were missing this locking. This change makes sure the function calls are protected with proper locking. Fixes: b54034a ("cifs: during reconnect, update interface if necessary") Fixes: fa1d050 ("cifs: account for primary channel in the interface list") Cc: [email protected] Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 7257bcf) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/connect.c # fs/cifs/smb2ops.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 09eeb07 parse_server_interfaces should be in complete charge of maintaining the iface_list linked list. Today, iface entries are removed from the list only when the last refcount is dropped. i.e. in release_iface. However, this can result in undercounting of refcount if the server stops advertising interfaces (which Azure SMB server does). This change puts parse_server_interfaces in full charge of maintaining the iface_list. So if an empty list is returned by the server, the entries in the list will immediately be removed. This way, a following call to the same function will not find entries in the list. Fixes: aa45dad ("cifs: change iface_list from array to sorted linked list") Cc: [email protected] Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 09eeb07) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 78e727e Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/78e727e5.failed iface_last_update was an unused field when it was introduced. Later, when we had periodic update of server interface list, this field was used regularly to decide when to update next. However, with the new logic of updating the interfaces, it becomes crucial that this field be updated whenever parse_server_interfaces runs successfully. This change updates this field when either the server does not support query of interfaces; so that we do not query the interfaces repeatedly. It also updates the field when the function reaches the end. Fixes: aa45dad ("cifs: change iface_list from array to sorted linked list") Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 78e727e) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/smb2ops.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 6aac002 After the interface selection policy change to do a weighted round robin, each iface maintains a weight_fulfilled. When the weight_fulfilled reaches the total weight for the iface, we know that the weights can be reset and ifaces can be allocated from scratch again. During channel allocation failures on a particular channel, weight_fulfilled is not incremented. If a few interfaces are inactive, we could end up in a situation where the active interfaces are all allocated for the total_weight, and inactive ones are all that remain. This can cause a situation where no more channels can be allocated further. This change fixes it by increasing weight_fulfilled, even when channel allocation failure happens. This could mean that if there are temporary failures in channel allocation, the iface weights may not strictly be adhered to. But that's still okay. Fixes: a6d8fb5 ("cifs: distribute channels across interfaces based on speed") Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 6aac002) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 cve CVE-2024-35870 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Paulo Alcantara <[email protected]> commit 24a9799 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/24a9799a.failed The UAF bug is due to smb2_reconnect_server() accessing a session that is already being teared down by another thread that is executing __cifs_put_smb_ses(). This can happen when (a) the client has connection to the server but no session or (b) another thread ends up setting @ses->ses_status again to something different than SES_EXITING. To fix this, we need to make sure to unconditionally set @ses->ses_status to SES_EXITING and prevent any other threads from setting a new status while we're still tearing it down. The following can be reproduced by adding some delay to right after the ipc is freed in __cifs_put_smb_ses() - which will give smb2_reconnect_server() worker a chance to run and then accessing @ses->ipc: kinit ... mount.cifs //srv/share /mnt/1 -o sec=krb5,nohandlecache,echo_interval=10 [disconnect srv] ls /mnt/1 &>/dev/null sleep 30 kdestroy [reconnect srv] sleep 10 umount /mnt/1 ... CIFS: VFS: Verify user has a krb5 ticket and keyutils is installed CIFS: VFS: \\srv Send error in SessSetup = -126 CIFS: VFS: Verify user has a krb5 ticket and keyutils is installed CIFS: VFS: \\srv Send error in SessSetup = -126 general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 50 Comm: kworker/3:1 Not tainted 6.9.0-rc2 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39 04/01/2014 Workqueue: cifsiod smb2_reconnect_server [cifs] RIP: 0010:__list_del_entry_valid_or_report+0x33/0xf0 Code: 4f 08 48 85 d2 74 42 48 85 c9 74 59 48 b8 00 01 00 00 00 00 ad de 48 39 c2 74 61 48 b8 22 01 00 00 00 00 74 69 <48> 8b 01 48 39 f8 75 7b 48 8b 72 08 48 39 c6 0f 85 88 00 00 00 b8 RSP: 0018:ffffc900001bfd70 EFLAGS: 00010a83 RAX: dead000000000122 RBX: ffff88810da53838 RCX: 6b6b6b6b6b6b6b6b RDX: 6b6b6b6b6b6b6b6b RSI: ffffffffc02f6878 RDI: ffff88810da53800 RBP: ffff88810da53800 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88810c064000 R13: 0000000000000001 R14: ffff88810c064000 R15: ffff8881039cc000 FS: 0000000000000000(0000) GS:ffff888157c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe3728b1000 CR3: 000000010caa4000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? die_addr+0x36/0x90 ? exc_general_protection+0x1c1/0x3f0 ? asm_exc_general_protection+0x26/0x30 ? __list_del_entry_valid_or_report+0x33/0xf0 __cifs_put_smb_ses+0x1ae/0x500 [cifs] smb2_reconnect_server+0x4ed/0x710 [cifs] process_one_work+0x205/0x6b0 worker_thread+0x191/0x360 ? __pfx_worker_thread+0x10/0x10 kthread+0xe2/0x110 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Cc: [email protected] Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 24a9799) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/connect.c
jira LE-4669 cve CVE-2024-53179 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Paulo Alcantara <[email protected]> commit 343d7fe Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/343d7fe6.failed Customers have reported use-after-free in @ses->auth_key.response with SMB2.1 + sign mounts which occurs due to following race: task A task B cifs_mount() dfs_mount_share() get_session() cifs_mount_get_session() cifs_send_recv() cifs_get_smb_ses() compound_send_recv() cifs_setup_session() smb2_setup_request() kfree_sensitive() smb2_calc_signature() crypto_shash_setkey() *UAF* Fix this by ensuring that we have a valid @ses->auth_key.response by checking whether @ses->ses_status is SES_GOOD or SES_EXITING with @ses->ses_lock held. After commit 24a9799 ("smb: client: fix UAF in smb2_reconnect_server()"), we made sure to call ->logoff() only when @SES was known to be good (e.g. valid ->auth_key.response), so it's safe to access signing key when @ses->ses_status == SES_EXITING. Cc: [email protected] Reported-by: Jay Shin <[email protected]> Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 343d7fe) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/smb2transport.c
jira LE-4669 cve CVE-2025-21725 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Paulo Alcantara <[email protected]> commit be7a6a7 It isn't guaranteed that NETWORK_INTERFACE_INFO::LinkSpeed will always be set by the server, so the client must handle any values and then prevent oopses like below from happening: Oops: divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 0 UID: 0 PID: 1323 Comm: cat Not tainted 6.13.0-rc7 #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014 RIP: 0010:cifs_debug_data_proc_show+0xa45/0x1460 [cifs] Code: 00 00 48 89 df e8 3b cd 1b c1 41 f6 44 24 2c 04 0f 84 50 01 00 00 48 89 ef e8 e7 d0 1b c1 49 8b 44 24 18 31 d2 49 8d 7c 24 28 <48> f7 74 24 18 48 89 c3 e8 6e cf 1b c1 41 8b 6c 24 28 49 8d 7c 24 RSP: 0018:ffffc90001817be0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88811230022c RCX: ffffffffc041bd99 RDX: 0000000000000000 RSI: 0000000000000567 RDI: ffff888112300228 RBP: ffff888112300218 R08: fffff52000302f5f R09: ffffed1022fa58ac R10: ffff888117d2c566 R11: 00000000fffffffe R12: ffff888112300200 R13: 000000012a15343f R14: 0000000000000001 R15: ffff888113f2db58 FS: 00007fe27119e740(0000) GS:ffff888148600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe2633c5000 CR3: 0000000124da0000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? die+0x2e/0x50 ? do_trap+0x159/0x1b0 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs] ? do_error_trap+0x90/0x130 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs] ? exc_divide_error+0x39/0x50 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs] ? asm_exc_divide_error+0x1a/0x20 ? cifs_debug_data_proc_show+0xa39/0x1460 [cifs] ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs] ? seq_read_iter+0x42e/0x790 seq_read_iter+0x19a/0x790 proc_reg_read_iter+0xbe/0x110 ? __pfx_proc_reg_read_iter+0x10/0x10 vfs_read+0x469/0x570 ? do_user_addr_fault+0x398/0x760 ? __pfx_vfs_read+0x10/0x10 ? find_held_lock+0x8a/0xa0 ? __pfx_lock_release+0x10/0x10 ksys_read+0xd3/0x170 ? __pfx_ksys_read+0x10/0x10 ? __rcu_read_unlock+0x50/0x270 ? mark_held_locks+0x1a/0x90 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fe271288911 Code: 00 48 8b 15 01 25 10 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8 20 ad 01 00 f3 0f 1e fa 80 3d b5 a7 10 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec RSP: 002b:00007ffe87c079d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000040000 RCX: 00007fe271288911 RDX: 0000000000040000 RSI: 00007fe2633c6000 RDI: 0000000000000003 RBP: 00007ffe87c07a00 R08: 0000000000000000 R09: 00007fe2713e6380 R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000040000 R13: 00007fe2633c6000 R14: 0000000000000003 R15: 0000000000000000 </TASK> Fix this by setting cifs_server_iface::speed to a sane value (1Gbps) by default when link speed is unset. Cc: Shyam Prasad N <[email protected]> Cc: Tom Talpey <[email protected]> Fixes: a6d8fb5 ("cifs: distribute channels across interfaces based on speed") Reported-by: Frank Sorenson <[email protected]> Reported-by: Jay Shin <[email protected]> Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit be7a6a7) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit c184689 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/c1846893.failed When the server interface info changes (more common in clustered servers like Azure Files), the per-channel iface gets updated. However, this did not update the corresponding dstaddr. As a result these channels will still connect (or try connecting) to older addresses. Fixes: b54034a ("cifs: during reconnect, update interface if necessary") Cc: <[email protected]> Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit c184689) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/sess.c
jira LE-4669 cve CVE-2025-38244 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Paulo Alcantara <[email protected]> commit 711741f Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/711741f9.failed Fix cifs_signal_cifsd_for_reconnect() to take the correct lock order and prevent the following deadlock from happening ====================================================== WARNING: possible circular locking dependency detected 6.16.0-rc3-build2+ #1301 Tainted: G S W ------------------------------------------------------ cifsd/6055 is trying to acquire lock: ffff88810ad56038 (&tcp_ses->srv_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x134/0x200 but task is already holding lock: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&ret_buf->chan_lock){+.+.}-{3:3}: validate_chain+0x1cf/0x270 __lock_acquire+0x60e/0x780 lock_acquire.part.0+0xb4/0x1f0 _raw_spin_lock+0x2f/0x40 cifs_setup_session+0x81/0x4b0 cifs_get_smb_ses+0x771/0x900 cifs_mount_get_session+0x7e/0x170 cifs_mount+0x92/0x2d0 cifs_smb3_do_mount+0x161/0x460 smb3_get_tree+0x55/0x90 vfs_get_tree+0x46/0x180 do_new_mount+0x1b0/0x2e0 path_mount+0x6ee/0x740 do_mount+0x98/0xe0 __do_sys_mount+0x148/0x180 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #1 (&ret_buf->ses_lock){+.+.}-{3:3}: validate_chain+0x1cf/0x270 __lock_acquire+0x60e/0x780 lock_acquire.part.0+0xb4/0x1f0 _raw_spin_lock+0x2f/0x40 cifs_match_super+0x101/0x320 sget+0xab/0x270 cifs_smb3_do_mount+0x1e0/0x460 smb3_get_tree+0x55/0x90 vfs_get_tree+0x46/0x180 do_new_mount+0x1b0/0x2e0 path_mount+0x6ee/0x740 do_mount+0x98/0xe0 __do_sys_mount+0x148/0x180 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #0 (&tcp_ses->srv_lock){+.+.}-{3:3}: check_noncircular+0x95/0xc0 check_prev_add+0x115/0x2f0 validate_chain+0x1cf/0x270 __lock_acquire+0x60e/0x780 lock_acquire.part.0+0xb4/0x1f0 _raw_spin_lock+0x2f/0x40 cifs_signal_cifsd_for_reconnect+0x134/0x200 __cifs_reconnect+0x8f/0x500 cifs_handle_standard+0x112/0x280 cifs_demultiplex_thread+0x64d/0xbc0 kthread+0x2f7/0x310 ret_from_fork+0x2a/0x230 ret_from_fork_asm+0x1a/0x30 other info that might help us debug this: Chain exists of: &tcp_ses->srv_lock --> &ret_buf->ses_lock --> &ret_buf->chan_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ret_buf->chan_lock); lock(&ret_buf->ses_lock); lock(&ret_buf->chan_lock); lock(&tcp_ses->srv_lock); *** DEADLOCK *** 3 locks held by cifsd/6055: #0: ffffffff857de398 (&cifs_tcp_ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x7b/0x200 #1: ffff888119c64060 (&ret_buf->ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x9c/0x200 #2: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200 Cc: [email protected] Reported-by: David Howells <[email protected]> Fixes: d7d7a66 ("cifs: avoid use of global locks for high contention data") Reviewed-by: David Howells <[email protected]> Tested-by: David Howells <[email protected]> Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 711741f) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/cifsglob.h # fs/cifs/connect.c
jira LE-4669 cve CVE-2024-35999 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Steve French <[email protected]> commit 8094a60 Coverity spotted a place where we should have been holding the channel lock when accessing the ses channel index. Addresses-Coverity: 1582039 ("Data race condition (MISSING_LOCK)") Cc: [email protected] Reviewed-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 8094a60) Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 66d590b Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/66d590b8.failed Our current approach to select a channel for sending requests is this: 1. iterate all channels to find the min and max queue depth 2. if min and max are not the same, pick the channel with min depth 3. if min and max are same, round robin, as all channels are equally loaded The problem with this approach is that there's a lag between selecting a channel and sending the request (that increases the queue depth on the channel). While these numbers will eventually catch up, there could be a skew in the channel usage, depending on the application's I/O parallelism and the server's speed of handling requests. With sufficient parallelism, this lag can artificially increase the queue depth, thereby impacting the performance negatively. This change will change the step 1 above to start the iteration from the last selected channel. This is to reduce the skew in channel usage even in the presence of this lag. Fixes: ea90708 ("cifs: use the least loaded channel for sending requests") Cc: <[email protected]> Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 66d590b) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/transport.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 9d5eff7 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/9d5eff78.failed We now do a weighted selection of server interfaces when allocating new channels. The weights are decided based on the speed advertised. The fulfilled weight for an interface is a counter that is used to track the interface selection. It should be reset back to zero once all interfaces fulfilling their weight. In cifs_chan_update_iface, this reset logic was missing. As a result when the server interface list changes, the client may not be able to find a new candidate for other channels after all interfaces have been fulfilled. Fixes: a6d8fb5 ("cifs: distribute channels across interfaces based on speed") Cc: <[email protected]> Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 9d5eff7) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/cifs/sess.c
jira LE-4669 Rebuild_History Non-Buildable kernel-4.18.0-553.82.1.el8_10 commit-author Shyam Prasad N <[email protected]> commit 29954d5 My last change in this area introduced a change which accounted for primary channel in the interface ref count. However, it did not reduce this ref count on deallocation of the primary channel. i.e. during umount. Fixing this leak here, by dropping this ref count for primary channel while freeing up the session. Fixes: fa1d050 ("cifs: account for primary channel in the interface list") Cc: [email protected] Reported-by: Paulo Alcantara <[email protected]> Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 29954d5) Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50% Number of commits in upstream range v4.18~1..kernel-mainline: 567757 Number of commits in rpm: 155 Number of commits matched with upstream: 145 (93.55%) Number of commits in upstream but not in rpm: 567612 Number of commits NOT found in upstream: 10 (6.45%) Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.82.1.el8_10 for kernel-4.18.0-553.82.1.el8_10 Clean Cherry Picks: 74 (51.03%) Empty Cherry Picks: 71 (48.97%) _______________________________ Full Details Located here: ciq/ciq_backports/kernel-4.18.0-553.82.1.el8_10/rebuild.details.txt Includes: * git commit header above * Empty Commits with upstream SHA * RPM ChangeLog Entries that could not be matched Individual Empty Commit failures contained in the same containing directory. The git message for empty commits will have the path for the failed commit. File names are the first 8 characters of the upstream SHA
thefossguy-ciq
approved these changes
Nov 7, 2025
thefossguy-ciq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚤
jdieter
approved these changes
Nov 7, 2025
jdieter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
General Process:
src.rpm4.18.0-553git cherry-pickrpmbuild -bpfrom corresponding src.rpm.Checking Rebuild Commits for Potentially missing commits:
kernel-4.18.0-553.82.1.el8_10
BUILD
KSelfTest