[rocky9_6] History Rebuild for kernel-5.14.0-570.26.1.el9_6 #416

Merged
merged 37 commits into from
Jul 16, 2025

PlaidCat
Collaborator

This is an attempt at a rebuilder built on cron and some internal tools, but the process is the same as in previous rebuilds:

  • Download all unprocessed src.rpm
  • For each src.rpm:
    • Find all commits in the changelog up to the last known tag ... in this case 5.14.0-570
    • Replay the commits in reverse order (oldest in the changelog to newest) with git cherry-pick
    • After the replay, replace the ENTIRE code in the branch with the rpmbuild -bp output from the corresponding src.rpm
    • Tag Rebuild branch
  • Use New Local Build with prodman and test (note test results will be different than usual)
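
The replay step above can be sketched roughly as follows. This is a minimal illustration, not the internal tooling: the `replay` and `git` helper names and the placeholder commit wording are assumptions, and the real rebuild also saves a `.failed` reference file for each conflict, which is not reproduced here. Only standard git subcommands (`cherry-pick -x`, `cherry-pick --abort`, `commit --allow-empty`) are used.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def git(*args: str) -> int:
    """Run a git command in the current directory and return its exit code."""
    return subprocess.run(["git", *args]).returncode


def replay(shas: Iterable[str],
           run: Callable[..., int] = git) -> Tuple[List[str], List[str]]:
    """Cherry-pick commits in the given order (oldest first).

    A pick that fails is aborted and recorded as an empty placeholder
    commit, so the history still lists the change even though its content
    only arrives with the final source replacement.
    """
    clean: List[str] = []
    empty: List[str] = []
    for sha in shas:
        if run("cherry-pick", "-x", sha) == 0:
            clean.append(sha)
            continue
        # Conflict: drop the partial pick, then leave a marker commit.
        run("cherry-pick", "--abort")
        # Hypothetical message wording, for illustration only.
        run("commit", "--allow-empty", "-m",
            f"Empty-Commit: cherry-pick of {sha} conflicted")
        empty.append(sha)
    return clean, empty
```

Since the rpm changelog lists the newest change first, a caller would pass the commits reversed, for example `replay(reversed(changelog_shas))`. The `run` parameter exists only so the loop can be exercised without a repository.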

Checking Rebuild Commits

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..kernel-mainline: 309912
Number of commits in rpm: 41
Number of commits matched with upstream: 39 (95.12%)
Number of commits in upstream but not in rpm: 309873
Number of commits NOT found in upstream: 2 (4.88%)

Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.26.1.el9_6 for kernel-5.14.0-570.26.1.el9_6
Clean Cherry Picks: 14 (35.90%)
Empty Cherry Picks: 22 (56.41%)
_______________________________

__EMPTY COMMITS__________________________
6857be5fecaebd9773ff27b6d29b6fff3b1abbce mm: introduce ARCH_SUPPORTS_HUGE_PFNMAP and special bits to pmd/pud
ef713ec3a566d3e5e011c5d6201eb661ebf94c1f mm: drop is_huge_zero_pud()
10d83d7781a8a6ff02bafd172c1ab183b27f8d5a mm/pagewalk: check pfnmap for folio_walk_start()
cb10c28ac82c9b7a5e9b3b1dc7157036c20c36dd mm: remove follow_pfn
6da8e9634bb7e3fdad9ae0e4db873a05036c4343 mm: new follow_pfnmap API
b1b46751671be5a426982f037a47ae05f37ff80b mm: fix follow_pfnmap API lockdep assert
5b34b76cb0cd8a21dee5c7677eae98480b0d05cc mm: move follow_phys to arch/x86/mm/pat/memtype.c
29ae7d96d166fa08c7232daf8a314ef5ba1efd20 mm: pass VMA instead of MM to follow_pte()
5731aacd54a883dd2c1a5e8c85e1fe78fc728dc7 KVM: use follow_pfnmap API
bd8c2d18bf5cccd8842d00b17d6f222beb98b1b3 s390/pci_mmio: use follow_pfnmap API
cbea8536d933d546ceb1005bf9c04f9d01da8092 mm/x86/pat: use the new follow_pfnmap API
a77f9489f1d7873a56e1d6640cc0c4865f64176b vfio: use the new follow_pfnmap API
b17269a51cc7f046a6f2cf9a6c314a0de885e5a5 mm/access_process_vm: use the new follow_pfnmap API
c5541ba378e3d36ea88bf5839d5b23e33e7d1627 mm: follow_pte() improvements
b0a1c0d0edcd75a0f8ec5fd19dbd64b8d097f534 mm: remove follow_pte()
75182022a0439788415b2dd1db3086e07aa506f7 mm/x86: support large pfn mappings
3e509c9b03f9abc7804c80bed266a6cc4286a5a8 mm/arm64: support large pfn mappings
f9e54c3a2f5b79ecc57c7bc7d0d3521e461a2101 vfio/pci: implement huge_fault support
09dfc8a5f2ce897005a94bf66cca4f91e4e03700 vfio/pci: Fallback huge faults for unaligned pfn
62fb8adc43afad5fa1c9cadc6f3a8e9fb72af194 mm: Provide address mask in struct follow_pfnmap_args
0fd06844de5d063cb384384e06a11ec7141a35d5 vfio/type1: Use mapping page mask for pfnmaps
c1d9dac0db168198b6f63f460665256dedad9b6e vfio/pci: Align huge faults to order

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 9, debranding and Rocky branding'
Ensure aarch64 kernel is not compressed
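
As a quick sanity check, the percentages in the report above can be recomputed from its raw counts. The sketch below assumes the match percentages are relative to the 41 commits in the rpm and the cherry-pick percentages are relative to the 39 matched commits; the variable names are illustrative.

```python
def pct(part: int, whole: int) -> str:
    """Format part/whole as a percentage with two decimals."""
    return f"{100 * part / whole:.2f}%"


upstream_range = 309912  # commits in v5.14~1..kernel-mainline
in_rpm = 41              # commits listed in the rpm changelog
matched = 39             # commits matched with upstream
not_found = 2            # commits not found upstream
clean = 14               # clean cherry-picks
empty = 22               # empty (conflicted) cherry-picks

print(pct(matched, in_rpm))       # 95.12%
print(pct(not_found, in_rpm))     # 4.88%
print(upstream_range - matched)   # 309873
print(pct(clean, matched))        # 35.90%
print(pct(empty, matched))        # 56.41%
print(matched - clean - empty)    # 3
```

Under that assumption every figure in the report is reproduced. The last line shows that 3 of the 39 matched commits are in neither the clean nor the empty bucket; the excerpt does not say how those were classified.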

Build

[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree
  CLEAN   scripts/basic
  CLEAN   scripts/kconfig
  CLEAN   include/config include/generated .config .config.old .version
[TIMER]{MRPROPER}: 6s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky9_6_rebuild-8cc6f289778f"
Making olddefconfig
--
  HOSTCC  scripts/kconfig/util.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
Starting Build
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
--
  LD [M]  sound/xen/snd_xen_front.ko
  BTF [M] sound/usb/snd-usb-audio.ko
  BTF [M] sound/virtio/virtio_snd.ko
  BTF [M] sound/x86/snd-hdmi-lpe-audio.ko
  BTF [M] sound/xen/snd_xen_front.ko
[TIMER]{BUILD}: 1616s
Making Modules
  INSTALL /lib/modules/5.14.0-rocky9_6_rebuild-8cc6f289778f/kernel/arch/x86/crypto/blake2s-x86_64.ko
  INSTALL /lib/modules/5.14.0-rocky9_6_rebuild-8cc6f289778f/kernel/arch/x86/crypto/blowfish-x86_64.ko
  INSTALL /lib/modules/5.14.0-rocky9_6_rebuild-8cc6f289778f/kernel/arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  INSTALL /lib/modules/5.14.0-rocky9_6_rebuild-8cc6f289778f/kernel/arch/x86/crypto/camellia-aesni-avx2.ko
--
  SIGN    /lib/modules/5.14.0-rocky9_6_rebuild-8cc6f289778f/kernel/sound/virtio/virtio_snd.ko
  STRIP   /lib/modules/5.14.0-rocky9_6_rebuild-8cc6f289778f/kernel/sound/xen/snd_xen_front.ko
  SIGN    /lib/modules/5.14.0-rocky9_6_rebuild-8cc6f289778f/kernel/sound/x86/snd-hdmi-lpe-audio.ko
  SIGN    /lib/modules/5.14.0-rocky9_6_rebuild-8cc6f289778f/kernel/sound/xen/snd_xen_front.ko
  DEPMOD  /lib/modules/5.14.0-rocky9_6_rebuild-8cc6f289778f
[TIMER]{MODULES}: 12s
Making Install
sh ./arch/x86/boot/install.sh 5.14.0-rocky9_6_rebuild-8cc6f289778f \
        arch/x86/boot/bzImage System.map "/boot"
[TIMER]{INSTALL}: 22s
Checking kABI
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-5.14.0-rocky9_6_rebuild-8cc6f289778f and Index to 0
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 6s
[TIMER]{BUILD}: 1616s
[TIMER]{MODULES}: 12s
[TIMER]{INSTALL}: 22s
[TIMER]{TOTAL} 1662s
Rebooting in 10 seconds
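
The `[TIMER]` summary lines lend themselves to a small parser when comparing builds. The following is a sketch only: the regular expression is inferred from the lines shown above (note that the `TOTAL` line has no colon), and `parse_timers` is a hypothetical helper, not part of the build script.

```python
import re

# Matches e.g. "[TIMER]{BUILD}: 1616s" and "[TIMER]{TOTAL} 1662s".
TIMER = re.compile(r"^\[TIMER\]\{(\w+)\}:?\s*(\d+)s$")


def parse_timers(lines):
    """Return {phase: seconds}; a repeated phase keeps its last value."""
    timers = {}
    for line in lines:
        match = TIMER.match(line.strip())
        if match:
            timers[match.group(1)] = int(match.group(2))
    return timers


log = """\
[TIMER]{MRPROPER}: 6s
[TIMER]{BUILD}: 1616s
[TIMER]{MODULES}: 12s
[TIMER]{INSTALL}: 22s
[TIMER]{TOTAL} 1662s
"""

timers = parse_timers(log.splitlines())
phases = sum(v for k, v in timers.items() if k != "TOTAL")
print(phases, timers["TOTAL"])  # 1656 1662
```

The four phases sum to 1656s while the log reports a total of 1662s. Presumably the total also covers steps without a timer of their own, but the log does not say so.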

KSelfTests

[jmaple@devbox code]$ ls -rt kselftest.* | tail -n2 | while read line; do echo $line; grep '^ok ' $line | wc -l ; done
kselftest.5.14.0-rocky9_6_rebuild-cad0cbcb03be.log
312
kselftest.5.14.0-rocky9_6_rebuild-8cc6f289778f.log
317
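
The shell loop above counts lines that start with `ok ` in the two most recent kselftest logs. An equivalent check could be sketched as below; `count_ok` and `count_ok_in_file` are illustrative names, and the sample lines are made up to show which lines count.

```python
def count_ok(lines):
    """Count passing results: lines that begin with 'ok ' (like grep '^ok ')."""
    return sum(1 for line in lines if line.startswith("ok "))


def count_ok_in_file(path):
    """Apply count_ok to a kselftest log on disk."""
    with open(path, errors="replace") as handle:
        return count_ok(handle)


# Made-up sample: 'not ok' and comment lines are not counted.
sample = [
    "ok 1 selftests: example: first",
    "not ok 2 selftests: example: second",
    "# ok inside a comment",
    "ok 3 selftests: example: third",
]
print(count_ok(sample))  # 2
```

With the counts reported above, the newer build (8cc6f289778f) passes 317 tests against 312 for the previous one (cad0cbcb03be), a difference of 5.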

PlaidCat added 30 commits July 15, 2025 00:01
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit 6857be5
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/6857be5f.failed

Patch series "mm: Support huge pfnmaps", v2.

Overview
========

This series implements huge pfnmaps support for mm in general.  Huge
pfnmap allows e.g.  VM_PFNMAP vmas to map in either PMD or PUD levels,
similar to what we do with dax / thp / hugetlb so far to benefit from TLB
hits.  Now we extend that idea to PFN mappings, e.g.  PCI MMIO bars where
it can grow as large as 8GB or even bigger.

Currently, only x86_64 (1G+2M) and arm64 (2M) are supported.  The last
patch (from Alex Williamson) will be the first user of huge pfnmap, so as
to enable vfio-pci driver to fault in huge pfn mappings.

Implementation
==============

In reality, it's relatively simple to add such support comparing to many
other types of mappings, because of PFNMAP's specialties when there's no
vmemmap backing it, so that most of the kernel routines on huge mappings
should simply already fail for them, like GUPs or old-school follow_page()
(which is recently rewritten to be folio_walk* APIs by David).

One trick here is that we're still immature on PUDs in generic paths here
and there, as DAX is so far the only user.  This patchset will add the 2nd
user of it.  Hugetlb can be a 3rd user if the hugetlb unification work can
go on smoothly, but to be discussed later.

The other trick is how to allow gup-fast working for such huge mappings
even if there's no direct sign of knowing whether it's a normal page or
MMIO mapping.  This series chose to keep the pte_special solution, so that
it reuses similar idea on setting a special bit to pfnmap PMDs/PUDs so
that gup-fast will be able to identify them and fail properly.

Along the way, we'll also notice that the major pgtable pfn walker, aka,
follow_pte(), will need to retire soon due to the fact that it only works
with ptes.  A new set of simple API is introduced (follow_pfnmap* API) to
be able to do whatever follow_pte() can already do, plus that it can also
process huge pfnmaps now.  Half of this series is about that and
converting all existing pfnmap walkers to use the new API properly.
Hopefully the new API also looks better to avoid exposing e.g.  pgtable
lock details into the callers, so that it can be used in an even more
straightforward way.

Here, three more options will be introduced and involved in huge pfnmap:

  - ARCH_SUPPORTS_HUGE_PFNMAP

    Arch developers will need to select this option when huge pfnmap is
    supported in arch's Kconfig.  After this patchset applied, both x86_64
    and arm64 will start to enable it by default.

  - ARCH_SUPPORTS_PMD_PFNMAP / ARCH_SUPPORTS_PUD_PFNMAP

    These options are for driver developers to identify whether current
    arch / config supports huge pfnmaps, making decision on whether it can
    use the huge pfnmap APIs to inject them.  One can refer to the last
    vfio-pci patch from Alex on the use of them properly in a device
    driver.

So after the whole set applied, and if one would enable some dynamic debug
lines in vfio-pci core files, we should observe things like:

  vfio-pci 0000:00:06.0: vfio_pci_mmap_huge_fault(,order = 9) BAR 0 page offset 0x0: 0x100
  vfio-pci 0000:00:06.0: vfio_pci_mmap_huge_fault(,order = 9) BAR 0 page offset 0x200: 0x100
  vfio-pci 0000:00:06.0: vfio_pci_mmap_huge_fault(,order = 9) BAR 0 page offset 0x400: 0x100

In this specific case, it says that vfio-pci faults in PMDs properly for a
few BAR0 offsets.

Patch Layout
============

Patch 1:         Introduce the new options mentioned above for huge PFNMAPs
Patch 2:         A tiny cleanup
Patch 3-8:       Preparation patches for huge pfnmap (include introduce
                 special bit for pmd/pud)
Patch 9-16:      Introduce follow_pfnmap*() API, use it everywhere, and
                 then drop follow_pte() API
Patch 17:        Add huge pfnmap support for x86_64
Patch 18:        Add huge pfnmap support for arm64
Patch 19:        Add vfio-pci support for all kinds of huge pfnmaps (Alex)

TODO
====

More architectures / More page sizes
------------------------------------

Currently only x86_64 (2M+1G) and arm64 (2M) are supported.  There seems
to have plan to support arm64 1G later on top of this series [2].

Any arch will need to first support THP / THP_1G, then provide a special
bit in pmds/puds to support huge pfnmaps.

remap_pfn_range() support
-------------------------

Currently, remap_pfn_range() still only maps PTEs.  With the new option,
remap_pfn_range() can logically start to inject either PMDs or PUDs when
the alignment requirements match on the VAs.

When the support is there, it should be able to silently benefit all
drivers that is using remap_pfn_range() in its mmap() handler on better
TLB hit rate and overall faster MMIO accesses similar to processor on
hugepages.

More driver support
-------------------

VFIO is so far the only consumer for the huge pfnmaps after this series
applied.  Besides above remap_pfn_range() generic optimization, device
driver can also try to optimize its mmap() on a better VA alignment for
either PMD/PUD sizes.  This may, iiuc, normally require userspace changes,
as the driver doesn't normally decide the VA to map a bar.  But I don't
think I know all the drivers to know the full picture.

Credits all go to Alex on help testing the GPU/NIC use cases above.

[0] https://lore.kernel.org/r/[email protected]
[1] https://lore.kernel.org/r/[email protected]
[2] https://lore.kernel.org/r/[email protected]

This patch (of 19):

This patch introduces the option to introduce special pte bit into
pmd/puds.  Archs can start to define pmd_special / pud_special when
supported by selecting the new option.  Per-arch support will be added
later.

Before that, create fallbacks for these helpers so that they are always
available.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 6857be5)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	mm/Kconfig
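
Each rebuilt commit message in this series follows the same layout: a `commit <sha>` header, an optional `Empty-Commit:` marker, and an optional `# Conflicts:` trailer listing files. A reviewer who wants to tabulate the series could parse that layout along these lines. This is a sketch under the assumption that the layout is exactly as shown in this PR; `summarize` is a hypothetical helper and the sample message is abridged.

```python
def summarize(message: str) -> dict:
    """Extract the upstream sha, the empty-commit flag and conflicted files."""
    sha = None
    empty = False
    conflicts = []
    in_conflicts = False
    for line in message.splitlines():
        if sha is None and line.startswith("commit "):
            sha = line.split()[1]
        elif line.startswith("Empty-Commit:"):
            empty = True
        elif line.startswith("# Conflicts:"):
            in_conflicts = True
        elif in_conflicts and line.startswith("#"):
            # Trailer entries look like "#<tab>path/to/file".
            conflicts.append(line[1:].strip())
        else:
            in_conflicts = False
    return {"sha": sha, "empty": empty, "conflicts": conflicts}


sample = """\
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit 6857be5
Empty-Commit: Cherry-Pick Conflicts during history rebuild.

(cherry picked from commit 6857be5)

# Conflicts:
#\tmm/Kconfig
"""

print(summarize(sample))
```
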
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit ef713ec
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/ef713ec3.failed

It constantly returns false since 2017.  One assertion is added in 2019 but
it should never have triggered, IOW it means what is checked should be
asserted instead.

If it didn't exist for 7 years maybe it's good idea to remove it and only
add it when it comes.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Reviewed-by: Jason Gunthorpe <[email protected]>
	Acked-by: David Hildenbrand <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit ef713ec)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/huge_mm.h
#	mm/huge_memory.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit 3c8e44c

We need these special bits to be around on pfnmaps.  Mark properly for
!devmap case, reflecting that there's no page struct backing the entry.

Link: https://lkml.kernel.org/r/[email protected]
	Reviewed-by: Jason Gunthorpe <[email protected]>
	Signed-off-by: Peter Xu <[email protected]>
	Acked-by: David Hildenbrand <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 3c8e44c)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit ae3c99e

Since gup-fast doesn't have the vma reference, teach it to detect such huge
pfnmaps by checking the special bit for pmd/pud too, just like ptes.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Acked-by: David Hildenbrand <[email protected]>
	Reviewed-by: Jason Gunthorpe <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit ae3c99e)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit 10d83d7
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/10d83d77.failed

Teach folio_walk_start() to recognize special pmd/pud mappings, and fail
them properly as it means there's no folio backing them.

[[email protected]: remove some stale comments, per David]
  Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 10d83d7)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	mm/pagewalk.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit bc02afb

Teach the fork code to properly copy pfnmaps for pmd/pud levels.  Pud is
much easier, the write bit needs to be persisted though for writable and
shared pud mappings like PFNMAP ones, otherwise a follow up write in
either parent or child process will trigger a write fault.

Do the same for pmd level.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit bc02afb)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author David Hildenbrand <[email protected]>
commit 47fa301

We should only check for pmd_special() after we made sure that we have a
present PMD.  For example, if we have a migration PMD, pmd_special() might
indicate that we have a special PMD although we really don't.

This fixes confusing migration entries as PFN mappings, and not doing what
we are supposed to do in the "is_swap_pmd()" case further down in the
function -- including messing up COW, page table handling and accounting.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: bc02afb ("mm/fork: accept huge pfnmap entries")
	Signed-off-by: David Hildenbrand <[email protected]>
	Reported-by: [email protected]
Closes: https://lore.kernel.org/lkml/[email protected]/
	Reviewed-by: Peter Xu <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 47fa301)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit 0515e02

There're:

  - 8 archs (arc, arm64, include, mips, powerpc, s390, sh, x86) that
  support pte_pgprot().

  - 2 archs (x86, sparc) that support pmd_pgprot().

  - 1 arch (x86) that support pud_pgprot().

Always define them to be used in generic code, and then we don't need to
fiddle with "#ifdef"s when doing so.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Reviewed-by: Jason Gunthorpe <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 0515e02)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Christoph Hellwig <[email protected]>
commit cb10c28
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/cb10c28a.failed

Remove follow_pfn now that the last user is gone.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Christoph Hellwig <[email protected]>
	Reviewed-by: David Hildenbrand <[email protected]>
	Cc: Andy Lutomirski <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: Fei Li <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Peter Zijlstra <[email protected]>
	Cc: Nathan Chancellor <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit cb10c28)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	mm/nommu.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit 6da8e96
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/6da8e963.failed

Introduce a pair of APIs to follow pfn mappings to get entry information.
It's very similar to what follow_pte() does before, but different in that
it recognizes huge pfn mappings.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 6da8e96)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	mm/memory.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Linus Torvalds <[email protected]>
commit b1b4675
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/b1b46751.failed

The lockdep asserts for the new follow_pfnmap() API "knows" that a
pfnmap always has a vma->vm_file, since that's the only way to create
such a mapping.

And that's actually true for all the normal cases.  But not for the mmap
failure case, where the incomplete mapping is torn down and we have
cleared vma->vm_file because the failure occurred before the file was
linked to the vma.

So this codepath does actually need to check for vm_file being NULL.

	Reported-by: Jann Horn <[email protected]>
Fixes: 6da8e96 ("mm: new follow_pfnmap API")
	Cc: Peter Xu <[email protected]>
	Cc: Andrew Morton <[email protected]>
	Signed-off-by: Linus Torvalds <[email protected]>
(cherry picked from commit b1b4675)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	mm/memory.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Christoph Hellwig <[email protected]>
commit 5b34b76
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/5b34b76c.failed

follow_phys is only used by two callers in arch/x86/mm/pat/memtype.c.
Move it there and hardcode the two arguments that get the same values
passed by both callers.

[[email protected]: conflict resolutions]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Christoph Hellwig <[email protected]>
	Signed-off-by: David Hildenbrand <[email protected]>
	Reviewed-by: David Hildenbrand <[email protected]>
	Cc: Andy Lutomirski <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: Fei Li <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Peter Zijlstra <[email protected]>
	Cc: Nathan Chancellor <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 5b34b76)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/mm.h
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author David Hildenbrand <[email protected]>
commit 29ae7d9
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/29ae7d96.failed

... and centralize the VM_IO/VM_PFNMAP sanity check in there. We'll
now also perform these sanity checks for direct follow_pte()
invocations.

For generic_access_phys(), we might now check multiple times: nothing to
worry about, really.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: David Hildenbrand <[email protected]>
	Acked-by: Sean Christopherson <[email protected]>	[KVM]
	Cc: Alex Williamson <[email protected]>
	Cc: Christoph Hellwig <[email protected]>
	Cc: Fei Li <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Yonghua Huang <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 29ae7d9)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/mm/pat/memtype.c
#	drivers/virt/acrn/mm.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit 5731aac
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/5731aacd.failed

Use the new pfnmap API to allow huge MMIO mappings for VMs.  The rest work
is done perfectly on the other side (host_pfn_mapping_level()).

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 5731aac)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	virt/kvm/kvm_main.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit bd8c2d1
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/bd8c2d18.failed

Use the new API that can understand huge pfn mappings.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit bd8c2d1)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/s390/pci/pci_mmio.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit cbea853
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/cbea8536.failed

Use the new API that can understand huge pfn mappings.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit cbea853)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/mm/pat/memtype.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit a77f948
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/a77f9489.failed

Use the new API that can understand huge pfn mappings.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit a77f948)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/vfio/vfio_iommu_type1.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit b17269a
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/b17269a5.failed

Use the new API that can understand huge pfn mappings.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit b17269a)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	mm/memory.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author David Hildenbrand <[email protected]>
commit c5541ba
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/c5541ba3.failed

follow_pte() is now our main function to look up PTEs in VM_PFNMAP/VM_IO
VMAs.  Let's perform some more sanity checks to make this exported
function harder to abuse.

Further, extend the doc a bit, it still focuses on the KVM use case with
MMU notifiers.  Drop the KVM+follow_pfn() comment, follow_pfn() is no
more, and we have other users nowadays.

Also extend the doc regarding refcounted pages and the interaction with
MMU notifiers.

KVM is one example that uses MMU notifiers and can deal with refcounted
pages properly.  VFIO is one example that doesn't use MMU notifiers, and
to prevent use-after-free, rejects refcounted pages: pfn_valid(pfn) &&
!PageReserved(pfn_to_page(pfn)).  Protection changes are less of a concern
for users like VFIO: the behavior is similar to longterm-pinning a page,
and getting the PTE protection changed afterwards.

The primary concern with refcounted pages is use-after-free, which callers
should be aware of.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: David Hildenbrand <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Christoph Hellwig <[email protected]>
	Cc: Fei Li <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Yonghua Huang <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit c5541ba)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	mm/memory.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit b0a1c0d
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/b0a1c0d0.failed

follow_pte() users have been converted to follow_pfnmap*().  Remove the
API.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit b0a1c0d)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/mm.h
#	mm/memory.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit 7518202
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/75182022.failed

Helpers to install and detect special pmd/pud entries.  In short, bit 9 on
x86 is not used for pmd/pud, so we can directly define them the same as
the pte level.  One note is that it's also used in _PAGE_BIT_CPA_TEST but
that is only used in the debug test, and shouldn't conflict in this case.

Note also that pxx_set|clear_flags() for pmd/pud need to be moved up so
that the new special bit helpers can reference them.  The moved code
itself is unchanged.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 7518202)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/Kconfig
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Peter Xu <[email protected]>
commit 3e509c9
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/3e509c9b.failed

Support huge pfnmaps by using bit 56 (PTE_SPECIAL) for "special" on
pmds/puds.  Provide the pmd/pud helpers to set/get special bit.

There's one more thing missing for arm64 which is the pxx_pgprot() for
pmd/pud.  Add them too, which is mostly the same as the pte version by
dropping the pfn field.  These helpers are essential to be used in the new
follow_pfnmap*() API to report valid pgprot_t results.

Note that arm64 doesn't support huge PUDs yet, but it's still
straightforward to provide the pud helpers that we need altogether.  Only
the PMD helpers will be of immediate benefit until arm64 supports huge
PUDs in general (e.g. in THPs).

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Alex Williamson <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 3e509c9)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/arm64/Kconfig
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Alex Williamson <[email protected]>
commit f9e54c3
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/f9e54c3a.failed

With the addition of pfnmap support in vmf_insert_pfn_{pmd,pud}() we can
take advantage of PMD and PUD faults to PCI BAR mmaps and create more
efficient mappings.  PCI BARs are always a power of two and will typically
get at least PMD alignment without userspace even trying.  Userspace
alignment for PUD mappings is also not too difficult.

Consolidate faults through a single handler with a new wrapper for
standard single page faults.  The pre-faulting behavior of commit
d71a989 ("vfio/pci: Insert full vma on mmap'd MMIO fault") is removed
in this refactoring since huge_fault will cover the bulk of the faults and
results in more efficient page table usage.  We also want to keep
pre-faulted single page mappings from preempting huge page mappings.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Alex Williamson <[email protected]>
	Signed-off-by: Peter Xu <[email protected]>
	Cc: Alexander Gordeev <[email protected]>
	Cc: Aneesh Kumar K.V <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Christian Borntraeger <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Gavin Shan <[email protected]>
	Cc: Gerald Schaefer <[email protected]>
	Cc: Heiko Carstens <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: Matthew Wilcox <[email protected]>
	Cc: Niklas Schnelle <[email protected]>
	Cc: Paolo Bonzini <[email protected]>
	Cc: Ryan Roberts <[email protected]>
	Cc: Sean Christopherson <[email protected]>
	Cc: Sven Schnelle <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Vasily Gorbik <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Zi Yan <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit f9e54c3)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/vfio/pci/vfio_pci_core.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Alex Williamson <[email protected]>
commit 09dfc8a
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/09dfc8a5.failed

The PFN must also be aligned to the fault order to insert a huge
pfnmap.  Test the alignment and fallback when unaligned.
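
The alignment test described above can be sketched in isolation. This is a minimal illustration, not the vfio-pci code: the helper name pfn_aligned_to_order() is hypothetical, and it assumes the fault order is a power-of-two page count (order 9 meaning 512 pages).

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of vfio-pci): returns true when pfn is a
 * multiple of 2^order pages, which is the condition for inserting a huge
 * pfnmap of that order. An unaligned pfn would take the fallback path.
 */
static inline bool pfn_aligned_to_order(unsigned long pfn, unsigned int order)
{
	return (pfn & ((1UL << order) - 1)) == 0;
}
```

With order 0 the mask is empty, so every pfn counts as aligned and single page faults are unaffected.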

Fixes: f9e54c3 ("vfio/pci: implement huge_fault support")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219619
	Reported-by: Athul Krishna <[email protected]>
	Reported-by: Precific <[email protected]>
	Reviewed-by: Peter Xu <[email protected]>
	Tested-by: Precific <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Cc: [email protected]
	Signed-off-by: Alex Williamson <[email protected]>
(cherry picked from commit 09dfc8a)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/vfio/pci/vfio_pci_core.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Alex Williamson <[email protected]>
commit afe84f3

pin_user_pages_remote() can currently return zero for invalid args
or zero nr_pages, neither of which should ever happen.  However
vaddr_get_pfns() indicates it should only ever return a positive
value or -errno and there's a theoretical case where this can slip
through and be unhandled by callers.  Therefore convert zero to
-EFAULT.
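
As a rough sketch of the conversion described above, assuming a caller contract of "positive count or -errno": the helper name pin_result_or_efault() is hypothetical and only models the return-value handling, not the actual vaddr_get_pfns() code.

```c
#include <errno.h>

/*
 * Hypothetical helper: callers expect either a positive page count or a
 * negative errno, so a zero result is mapped to -EFAULT instead of being
 * passed through unhandled.
 */
static inline long pin_result_or_efault(long ret)
{
	return ret == 0 ? -EFAULT : ret;
}
```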

	Reviewed-by: Peter Xu <[email protected]>
	Reviewed-by: Mitchell Augustin <[email protected]>
	Tested-by: Mitchell Augustin <[email protected]>
	Reviewed-by: Jason Gunthorpe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Alex Williamson <[email protected]>
(cherry picked from commit afe84f3)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Alex Williamson <[email protected]>
commit 7a701e9

This is a step towards passing the structure to vaddr_get_pfns()
directly in order to provide greater distinction between page backed
pfns and pfnmaps.

	Reviewed-by: Peter Xu <[email protected]>
	Reviewed-by: Mitchell Augustin <[email protected]>
	Tested-by: Mitchell Augustin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Alex Williamson <[email protected]>
(cherry picked from commit 7a701e9)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Alex Williamson <[email protected]>
commit eb996ee

Passing the vfio_batch to vaddr_get_pfns() allows for greater
distinction between page backed pfns and pfnmaps.  In the case of page
backed pfns, vfio_batch.size is set to a positive value matching the
number of pages filled in vfio_batch.pages.  For a pfnmap,
vfio_batch.size remains zero as vfio_batch.pages are not used.  In both
cases the return value continues to indicate the number of pfns and the
provided pfn arg is set to the initial pfn value.

This allows us to shortcut the pfnmap case, which is detected by the
zero vfio_batch.size.  pfnmaps do not contribute to locked memory
accounting, therefore we can update counters and continue directly,
which also enables a future where vaddr_get_pfns() can return a value
greater than one for consecutive pfnmaps.

NB. Now that we're not guessing whether the initial pfn is page backed
or pfnmap, we no longer need to special case the put_pfn() and batch
size reset.  It's safe for vfio_batch_unpin() to handle this case.

	Reviewed-by: Peter Xu <[email protected]>
	Reviewed-by: Mitchell Augustin <[email protected]>
	Tested-by: Mitchell Augustin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Alex Williamson <[email protected]>
(cherry picked from commit eb996ee)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Alex Williamson <[email protected]>
commit 0635559

Page count should more consistently be an unsigned long when passed as
an argument while functions returning a number of pages should use a
signed long to allow for -errno.

vaddr_get_pfns() can therefore be upgraded to return long, though in
practice it's currently limited by the batch capacity.  In fact, the
batch indexes are noted to never hold negative values, so while it
doesn't make sense to bloat the structure with unsigned longs in this
case, it does make sense to specify these as unsigned.

No change in behavior expected.

	Reviewed-by: Peter Xu <[email protected]>
	Reviewed-by: Mitchell Augustin <[email protected]>
	Tested-by: Mitchell Augustin <[email protected]>
	Reviewed-by: Jason Gunthorpe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Alex Williamson <[email protected]>
(cherry picked from commit 0635559)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Alex Williamson <[email protected]>
commit 62fb8ad
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/62fb8adc.failed

follow_pfnmap_start() walks the page table for a given address and
fills out the struct follow_pfnmap_args in pfnmap_args_setup().
The address mask of the page table level is already provided to this
latter function for calculating the pfn.  This address mask can also
be useful for the caller to determine the extent of the contiguous
mapping.

For example, vfio-pci now supports huge_fault for pfnmaps and is able
to insert pud and pmd mappings.  When we DMA map these pfnmaps, ex.
PCI MMIO BARs, we iterate follow_pfnmap_start() to get each pfn to test
for a contiguous pfn range.  Providing the mapping address mask allows
us to skip the extent of the mapping level.  Assuming a 1GB pud level
and 4KB page size, iterations are reduced by a factor of 256K.  In wall
clock time, mapping a 32GB PCI BAR is reduced from ~1s to <1ms.

	Cc: Andrew Morton <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: [email protected]
	Reviewed-by: Peter Xu <[email protected]>
	Reviewed-by: Mitchell Augustin <[email protected]>
	Tested-by: Mitchell Augustin <[email protected]>
	Reviewed-by: Jason Gunthorpe <[email protected]>
	Acked-by: David Hildenbrand <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Alex Williamson <[email protected]>
(cherry picked from commit 62fb8ad)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/mm.h
#	mm/memory.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Alex Williamson <[email protected]>
commit 0fd0684
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/0fd06844.failed

vfio-pci supports huge_fault for PCI MMIO BARs and will insert pud and
pmd mappings for well aligned mappings.  follow_pfnmap_start() walks the
page table and therefore knows the page mask of the level where the
address is found and returns this through follow_pfnmap_args.addr_mask.
Subsequent pfns from this address until the end of the mapping page are
necessarily consecutive.  Use this information to retrieve a range of
pfnmap pfns in a single pass.

With optimal mappings and alignment on systems with 1GB pud and 4KB
page size, this reduces iterations for DMA mapping PCI BARs by a
factor of 256K.  In real world testing, the overhead of iterating
pfns for a VM DMA mapping a 32GB PCI BAR is reduced from ~1s to
sub-millisecond overhead.
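
The 256K figure follows from the address mask arithmetic. The sketch below is an illustration under stated assumptions, not the vfio_iommu_type1 code: pages_to_level_end() and SKETCH_PAGE_SHIFT are hypothetical names, a 4KB page size is assumed, and addr_mask is taken to be the mask of the page table level as reported through follow_pfnmap_args.addr_mask.

```c
#define SKETCH_PAGE_SHIFT 12UL	/* assumption: 4KB pages */

/*
 * Hypothetical helper: number of pages from addr up to the end of the
 * mapping level described by addr_mask. All pfns in that span are
 * consecutive, so they can be accounted for in one step.
 */
static inline unsigned long pages_to_level_end(unsigned long addr,
					       unsigned long addr_mask)
{
	unsigned long level_end = (addr | ~addr_mask) + 1;

	return (level_end - addr) >> SKETCH_PAGE_SHIFT;
}
```

For a 1GB pud mapping and an address at the start of it, this yields 2^30 / 2^12 = 262144 pages, i.e. the 256K reduction in iterations quoted above.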

	Reviewed-by: Peter Xu <[email protected]>
	Reviewed-by: Mitchell Augustin <[email protected]>
	Tested-by: Mitchell Augustin <[email protected]>
	Reviewed-by: Jason Gunthorpe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Alex Williamson <[email protected]>
(cherry picked from commit 0fd0684)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/vfio/vfio_iommu_type1.c
PlaidCat added 7 commits July 15, 2025 00:01
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Alex Williamson <[email protected]>
commit c1d9dac
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/c1d9dac0.failed

The vfio-pci huge_fault handler doesn't make any attempt to insert a
mapping containing the faulting address, it only inserts mappings if the
faulting address and resulting pfn are aligned.  This works in a lot of
cases, particularly in conjunction with QEMU where DMA mappings linearly
fault the mmap.  However, there are configurations where we don't get
that linear faulting and pages are faulted on-demand.

The scenario reported in the bug below is such a case, where the physical
address width of the CPU is greater than that of the IOMMU, resulting in a
VM where guest firmware has mapped device MMIO beyond the address width of
the IOMMU.  In this configuration, the MMIO is faulted on demand and
tracing indicates that occasionally the faults generate a VM_FAULT_OOM.
Given the use case, this results in an "error: kvm run failed Bad address",
killing the VM.

The host is not under memory pressure in this test, therefore it's
suspected that VM_FAULT_OOM is actually the result of a NULL return from
__pte_offset_map_lock() in the get_locked_pte() path from insert_pfn().
This suggests a potential race inserting a pte concurrent to a pmd, and
maybe indicates some deficiency in the mm layer properly handling such a
case.

Nevertheless, Peter noted the inconsistency of vfio-pci's huge_fault
handler where our mapping granularity depends on the alignment of the
faulting address relative to the order rather than aligning the faulting
address to the order to more consistently insert huge mappings.  This
change not only uses the page tables more consistently and efficiently, but
as any fault to an aligned page results in the same mapping, the race
condition suspected in the VM_FAULT_OOM is avoided.
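
Aligning the faulting address to the order, rather than requiring it to already be aligned, might look like the following. This is a sketch only: fault_addr_aligned() and SKETCH_PAGE_SIZE are hypothetical names, and a 4KB page size is assumed.

```c
#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4KB pages */

/*
 * Hypothetical helper: round the faulting address down to the start of
 * the naturally aligned region of 2^order pages that contains it, so any
 * fault inside that region resolves to the same mapping.
 */
static inline unsigned long fault_addr_aligned(unsigned long addr,
					       unsigned int order)
{
	return addr & ~((SKETCH_PAGE_SIZE << order) - 1);
}
```

Two faults at different offsets within the same 2MB region (order 9) both round down to the same base address, which is why concurrent faults no longer race to insert mappings of different sizes.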

	Reported-by: Adolfo <[email protected]>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220057
Fixes: 09dfc8a ("vfio/pci: Fallback huge faults for unaligned pfn")
	Cc: [email protected]
	Tested-by: Adolfo <[email protected]>
Co-developed-by: Peter Xu <[email protected]>
	Signed-off-by: Peter Xu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Alex Williamson <[email protected]>
(cherry picked from commit c1d9dac)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/vfio/pci/vfio_pci_core.c
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Piotr Jaroszynski <[email protected]>
commit f7edb07

Update the __flush_tlb_range_op macro not to modify its parameters as
these are unexpected semantics. In practice, this fixes the call to
mmu_notifier_arch_invalidate_secondary_tlbs() in
__flush_tlb_range_nosync() to use the correct range instead of an empty
range with start=end. The empty range was (un)lucky as it results in
taking the invalidate-all path that doesn't cause correctness issues,
but can certainly result in suboptimal perf.
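
The class of bug is easy to reproduce outside the kernel. The sketch below is a simplified stand-in, not the arm64 __flush_tlb_range_op() macro: both macros and both helper functions are hypothetical, and the loop body only models "advance start until the range is consumed".

```c
/* Mutating variant: advances the caller's own variables. */
#define SKETCH_RANGE_OP_MUTATING(start, pages, stride)	\
do {							\
	while ((pages) > 0) {				\
		(start) += (stride);			\
		(pages)--;				\
	}						\
} while (0)

/* Non-mutating variant: works on private copies of its parameters. */
#define SKETCH_RANGE_OP_SAFE(start, pages, stride)	\
do {							\
	unsigned long s_ = (start);			\
	unsigned long p_ = (pages);			\
	while (p_ > 0) {				\
		s_ += (stride);				\
		p_--;					\
	}						\
	(void)s_;					\
} while (0)

/* Length of [start, end) as seen by code that runs after the macro. */
static inline unsigned long range_len_after_mutating(unsigned long start,
						     unsigned long pages,
						     unsigned long stride)
{
	unsigned long end = start + pages * stride;

	SKETCH_RANGE_OP_MUTATING(start, pages, stride);
	return end - start;	/* start has been advanced to end */
}

static inline unsigned long range_len_after_safe(unsigned long start,
						 unsigned long pages,
						 unsigned long stride)
{
	unsigned long end = start + pages * stride;

	SKETCH_RANGE_OP_SAFE(start, pages, stride);
	return end - start;	/* start is untouched */
}
```

After the mutating variant, any later consumer of start and end sees an empty range, which mirrors the start=end case described above.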

This has been broken since commit 6bbd42e ("mmu_notifiers: call
invalidate_range() when invalidating TLBs") when the call to the
notifiers was added to __flush_tlb_range(). It predates the addition of
the __flush_tlb_range_op() macro from commit 3608390 ("arm64: tlb:
Refactor the core flush algorithm of __flush_tlb_range") that made the
bug hard to spot.

Fixes: 6bbd42e ("mmu_notifiers: call invalidate_range() when invalidating TLBs")

	Signed-off-by: Piotr Jaroszynski <[email protected]>
	Cc: Catalin Marinas <[email protected]>
	Cc: Will Deacon <[email protected]>
	Cc: Robin Murphy <[email protected]>
	Cc: Alistair Popple <[email protected]>
	Cc: Raghavendra Rao Ananta <[email protected]>
	Cc: SeongJae Park <[email protected]>
	Cc: Jason Gunthorpe <[email protected]>
	Cc: John Hubbard <[email protected]>
	Cc: Nicolin Chen <[email protected]>
	Cc: [email protected]
	Cc: [email protected]
	Cc: [email protected]
	Cc: [email protected]
	Cc: [email protected]
	Reviewed-by: Catalin Marinas <[email protected]>
	Reviewed-by: Alistair Popple <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Will Deacon <[email protected]>
(cherry picked from commit f7edb07)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Chunjie Zhu <[email protected]>
commit 262b73e

The following Python script results in unexpected behaviour when run on
a CIFS filesystem against a Windows Server:

    import os

    # Create file
    fd = os.open('test', os.O_WRONLY|os.O_CREAT)
    os.write(fd, b'foo')
    os.close(fd)

    # Open and close the file to leave a pending deferred close
    fd = os.open('test', os.O_RDONLY|os.O_DIRECT)
    os.close(fd)

    # Try to open the file via a hard link
    os.link('test', 'new')
    newfd = os.open('new', os.O_RDONLY|os.O_DIRECT)

The final open returns EINVAL due to the server returning
STATUS_INVALID_PARAMETER. The root cause of this is that the client
caches lease keys per inode, but the spec requires them to be related to
the filename which causes problems when hard links are involved:

From MS-SMB2 section 3.3.5.9.11:

"The server MUST attempt to locate a Lease by performing a lookup in the
LeaseTable.LeaseList using the LeaseKey in the
SMB2_CREATE_REQUEST_LEASE_V2 as the lookup key. If a lease is found,
Lease.FileDeleteOnClose is FALSE, and Lease.Filename does not match the
file name for the incoming request, the request MUST be failed with
STATUS_INVALID_PARAMETER"

On the client side, we first check the context of the file open.  If it
meets the conditions above, we close all open files that belong to the
same inode, and then open the hard link file.

	Cc: [email protected]
	Signed-off-by: Chunjie Zhu <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 262b73e)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Paulo Alcantara <[email protected]>
commit b64af6b

Customer reported that one of their applications started failing to
open files with STATUS_INSUFFICIENT_RESOURCES due to NetApp server
hitting the maximum number of opens to same file that it would allow
for a single client connection.

It turned out the client was failing to reuse open handles with
deferred closes because matching ->f_flags directly without masking
off the O_CREAT|O_EXCL|O_TRUNC bits first broke the comparison, and the
client ended up with thousands of deferred closes to the same file.  Those
bits are already satisfied on the original open, so no need to check
them against existing open handles.
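
The flag comparison can be sketched in userspace as follows. This is an illustration only, not the cifs client code: handle_flags_match() and SKETCH_OPEN_ONLY_FLAGS are hypothetical names, and masking both sides is a simplification chosen for the example.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Bits that only matter at the original open, not for handle reuse. */
#define SKETCH_OPEN_ONLY_FLAGS (O_CREAT | O_EXCL | O_TRUNC)

/*
 * Hypothetical helper: decide whether an existing handle's flags are
 * compatible with a new open request, ignoring the open-only bits.
 */
static inline bool handle_flags_match(int handle_flags, int open_flags)
{
	return (handle_flags & ~SKETCH_OPEN_ONLY_FLAGS) ==
	       (open_flags & ~SKETCH_OPEN_ONLY_FLAGS);
}
```

Without the mask, a handle opened with O_WRONLY|O_APPEND never matches a later open that also passes O_CREAT, so each open creates a new handle instead of reusing the deferred one.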

Reproducer:

 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 #include <fcntl.h>
 #include <pthread.h>

 #define NR_THREADS      4
 #define NR_ITERATIONS   2500
 #define TEST_FILE       "/mnt/1/test/dir/foo"

 static char buf[64];

 static void *worker(void *arg)
 {
         int i, j;
         int fd;

         for (i = 0; i < NR_ITERATIONS; i++) {
                 fd = open(TEST_FILE, O_WRONLY|O_CREAT|O_APPEND, 0666);
                 for (j = 0; j < 16; j++)
                         write(fd, buf, sizeof(buf));
                 close(fd);
         }
         return NULL;
 }

 int main(int argc, char *argv[])
 {
         pthread_t t[NR_THREADS];
         int fd;
         int i;

         fd = open(TEST_FILE, O_WRONLY|O_CREAT|O_TRUNC, 0666);
         close(fd);
         memset(buf, 'a', sizeof(buf));
         for (i = 0; i < NR_THREADS; i++)
                 pthread_create(&t[i], NULL, worker, NULL);
         for (i = 0; i < NR_THREADS; i++)
                 pthread_join(t[i], NULL);
         return 0;
 }

Before patch:

$ mount.cifs //srv/share /mnt/1 -o ...
$ mkdir -p /mnt/1/test/dir
$ gcc repro.c && ./a.out
...
number of opens: 1391

After patch:

$ mount.cifs //srv/share /mnt/1 -o ...
$ mkdir -p /mnt/1/test/dir
$ gcc repro.c && ./a.out
...
number of opens: 1

	Cc: [email protected]
	Cc: David Howells <[email protected]>
	Cc: Jay Shin <[email protected]>
	Cc: Pierguido Lambri <[email protected]>
Fixes: b8ea3b1 ("smb: enable reuse of deferred file handles for write operations")
	Acked-by: Shyam Prasad N <[email protected]>
	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit b64af6b)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Srinivas Pandruvada <[email protected]>
commit ac4e04d

When turbo mode is unavailable on a Skylake-X system, executing the
command:

 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

results in an unchecked MSR access error:

 WRMSR to 0x199 (attempted to write 0x0000000100001300).

This issue was reproduced on an OEM (Original Equipment Manufacturer)
system and is not a common problem across all Skylake-X systems.

This error occurs because the MSR 0x199 Turbo Engage Bit (bit 32) is set
when turbo mode is disabled. The issue arises when intel_pstate fails to
detect that turbo mode is disabled. Here intel_pstate relies on
MSR_IA32_MISC_ENABLE bit 38 to determine the status of turbo mode.
However, on this system, bit 38 is not set even when turbo mode is
disabled.

According to the Intel Software Developer's Manual (SDM), the BIOS sets
this bit during platform initialization to enable or disable
opportunistic processor performance operations. Logically, this bit
should be set in such cases. However, the SDM also specifies that "OS
and applications must use CPUID leaf 06H to detect processors with
opportunistic processor performance operations enabled."

Therefore, in addition to checking MSR_IA32_MISC_ENABLE bit 38, verify
that CPUID.06H:EAX[1] is 0 to accurately determine if turbo mode is
disabled.
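
A minimal sketch of the combined check follows. It takes the raw register values as parameters instead of reading them, since reading MSRs needs privileged access; the function name and macro names are illustrative assumptions, not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR_IA32_MISC_ENABLE bit 38: turbo mode disable. */
#define MISC_ENABLE_TURBO_DISABLE (1ULL << 38)
/* CPUID.06H:EAX[1]: opportunistic performance operation enabled. */
#define CPUID6_EAX_TURBO          (1u << 1)

/*
 * Treat turbo as disabled if either the MSR bit says so or CPUID
 * leaf 06H does not report the feature as enabled.
 */
static bool turbo_is_disabled(uint64_t misc_enable, uint32_t cpuid6_eax)
{
	if (misc_enable & MISC_ENABLE_TURBO_DISABLE)
		return true;
	return !(cpuid6_eax & CPUID6_EAX_TURBO);
}
```

On the OEM system described above, bit 38 is clear while CPUID.06H:EAX[1] is 0, so a check based only on the MSR reports turbo as enabled while the combined check reports it as disabled.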

Fixes: 4521e1a ("cpufreq: intel_pstate: Reflect current no_turbo state correctly")
	Signed-off-by: Srinivas Pandruvada <[email protected]>
	Cc: All applicable <[email protected]>
	Signed-off-by: Rafael J. Wysocki <[email protected]>
(cherry picked from commit ac4e04d)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3557
cve CVE-2025-21991
Rebuild_History Non-Buildable kernel-5.14.0-570.26.1.el9_6
commit-author Florent Revest <[email protected]>
commit e3e8917

Currently, load_microcode_amd() iterates over all NUMA nodes, retrieves their
CPU masks and unconditionally accesses per-CPU data for the first CPU of each
mask.

According to Documentation/admin-guide/mm/numaperf.rst:

  "Some memory may share the same node as a CPU, and others are provided as
  memory only nodes."

Therefore, some node CPU masks may be empty and wouldn't have a "first CPU".

On a machine with far memory (and therefore CPU-less NUMA nodes):
- cpumask_of_node(nid) is 0
- cpumask_first(0) is CONFIG_NR_CPUS
- cpu_data(CONFIG_NR_CPUS) accesses the cpu_info per-CPU array at an
  index that is 1 out of bounds

This does not have any security implications since flashing microcode is
a privileged operation but I believe this has reliability implications by
potentially corrupting memory while flashing a microcode update.

When booting with CONFIG_UBSAN_BOUNDS=y on an AMD machine that flashes
a microcode update, I get the following splat:

  UBSAN: array-index-out-of-bounds in arch/x86/kernel/cpu/microcode/amd.c:X:Y
  index 512 is out of range for type 'unsigned long[512]'
  [...]
  Call Trace:
   dump_stack
   __ubsan_handle_out_of_bounds
   load_microcode_amd
   request_microcode_amd
   reload_store
   kernfs_fop_write_iter
   vfs_write
   ksys_write
   do_syscall_64
   entry_SYSCALL_64_after_hwframe

Change the loop to go over only NUMA nodes which have CPUs before determining
whether the first CPU on the respective node needs microcode update.
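
The out-of-bounds index and the fix can be modeled in user space with a small sketch. `mask_first()` and `visit_nodes()` are hypothetical stand-ins, a 64-bit integer plays the role of a CPU mask, and the sizes are arbitrary; the kernel change itself uses its own node iterator rather than this explicit check.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS  64
#define NR_NODES 4

/* Like cpumask_first(): lowest set bit, or NR_CPUS for an empty mask. */
static unsigned int mask_first(uint64_t mask)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1ULL << cpu))
			return cpu;
	return NR_CPUS;
}

/*
 * Touch per-CPU data for the first CPU of every node that has CPUs and
 * skip memory-only nodes. Returns the number of nodes visited.
 */
static unsigned int visit_nodes(const uint64_t node_mask[NR_NODES],
				int cpu_info[NR_CPUS])
{
	unsigned int visited = 0;

	for (unsigned int nid = 0; nid < NR_NODES; nid++) {
		unsigned int cpu = mask_first(node_mask[nid]);

		if (cpu >= NR_CPUS)
			continue; /* CPU-less node: there is no first CPU */

		assert(cpu < NR_CPUS);
		cpu_info[cpu]++;
		visited++;
	}
	return visited;
}
```

Without the `continue`, an empty node mask would yield index NR_CPUS, one element past the end of the array, which matches the "index 512 is out of range for type 'unsigned long[512]'" splat shown above.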

  [ bp: Massage commit message, fix typo. ]

Fixes: 7ff6edf ("x86/microcode/AMD: Fix mixed steppings support")
	Signed-off-by: Florent Revest <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit e3e8917)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..kernel-mainline: 309912
Number of commits in rpm: 41
Number of commits matched with upstream: 39 (95.12%)
Number of commits in upstream but not in rpm: 309873
Number of commits NOT found in upstream: 2 (4.88%)

Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.26.1.el9_6 for kernel-5.14.0-570.26.1.el9_6
Clean Cherry Picks: 14 (35.90%)
Empty Cherry Picks: 22 (56.41%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-5.14.0-570.26.1.el9_6/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual empty-commit failures are stored in the same directory.
The git message for each empty commit gives the path of the failed commit.
File names are the first 8 characters of the upstream SHA.

@thefossguy-ciq thefossguy-ciq left a comment

Prime's voice: FIVE PASSING TESTS A PR
🚤

@jdieter jdieter left a comment

LGTM

@PlaidCat PlaidCat merged commit 8cc6f28 into rocky9_6 Jul 16, 2025
4 checks passed
@PlaidCat PlaidCat deleted the rocky9_6_rebuild branch July 16, 2025 15:24

3 participants