Skip to content

Conversation

@stefpopa
Copy link
Contributor

This patch adds support for AD5676R (spi) and AD5696R (i2c)
family of DAC devices.

Signed-off-by: Stefan Popa [email protected]

Copy link

@lclausen-adi lclausen-adi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

These changes look good. But the patch does a few things all at once and I'd prefer to split them into separate patches. This will make it easier for upstream reviewers to follow the changes.

The patch does a few codestyle cleanups, these should be in its own patch.

The patch changes how the internal vref is handled, this should be in its own patch.

Now for adding support for the new devices. Adding support for a new SPI device is straight forward, so I'd put this in its own patch.

These first few patches would all be trivial and easy to review. Now the bulk of the patch is adding I2C support, so that should be in its own patch. This allows reviewers to focus on what is actually going on.

When you add I2C support the best approach is to make three modules. One baseline module that has all the chip-logic and then one module for SPI and one for I2C. A good example of this is ad5592r-base.c/ad5592r.c/ad5593r.c (Just done call the base file -base, just call it ad5686.c and the other files ad5686-spi.c and ad5686-i2c.c).

I know that there are a lot of examples in the IIO subsystem that have both SPI and I2C support in the same file and I know I'm responsible for a lot of those drivers, but it turned out that there are some problems with this approach. For example if I2C support for the kernel is built as a module and SPI support is built-in. The other issue is what do when registering one driver fails? Do you still register the other, or do you abort an return an error?

Also try to make the commit message a bit more verbose, try to mention what kind of DAC this is, where to find the datasheet and so on.

val <<= shift;
msg = cpu_to_be32((AD5686_CMD(cmd) | AD5686_ADDR(addr) | val) << 8);

ret = i2c_master_send(i2c, (char *)&msg, 3);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to use the transfer buffer instead of a on-stack variable, just like the SPI read write functions.

on-stack buffers are not always safe for I2C transfers in case for example the I2C driver tries to do DMA.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if doing the change to a buffer, i2c_master_send(...., ...., 3); to ``i2c_master_send(...., ...., len);and initlen = sizeof(buf)` ; that way you can also check if `ret == len`



static int ad5686_probe(struct spi_device *spi)
static int ad5686_probe(struct device *dev, enum ad5686_supported_device_ids chip_type,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd rename this to ad5686_probe_common() to make sure it's not assumed to be a probe function on it's own

st->chip_info = &ad5686_chip_info_tbl[chip_type];

if (voltage_uv)
/* If available, internal reference is on by default */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change can be it's own commit [ changing to the internal reference by default ]

indio_dev->channels = st->chip_info->channel;
indio_dev->num_channels = AD5686_DAC_CHANNELS;

ret = ad5686_spi_write(st, AD5686_CMD_INTERNAL_REFER_SETUP, 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curios: this change removes this write ; is this fine to just remove it ? or does this need to be replaced with a st->write() equivalent ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The on-chip reference is on at power-up by default. The driver uses the internal reference if it is supported. In this case it is not compulsory to do a write.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack ; thanks for clarification

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, what happens if somebody wants to use an external reference? Previously this was possible with this driver, now it no longer is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I will change this.

static const struct i2c_device_id ad5686_i2c_id[] = {
{"ad5671r", ID_AD5671R},
{"ad5675r", ID_AD5675R},
{"ad5694", ID_AD5694},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: align the IDs column

char data[3];
int ret;

ret = i2c_master_recv(i2c, data, 2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this reads 2 bytes, but data is 3 bytes ;
if 2 bytes are needed i'd reduce char data[2]

either way, i'd use sizeof(data) here


ret = i2c_master_send(i2c, (char *)&msg, 3);

return (ret < 0 ? ret : (ret != 3) ? -EIO : 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is usually not preferred when doing validation, as it's a bit hard to follow in some points of the day [after the 2nd half of the day usually]
i'd split this into a simpler form ; something like:

if (ret < 0)
   return ret;

return (ret != len) ? -EIO : 0;

{"ad5684r", ID_AD5684R},
{"ad5685r", ID_AD5685R},
{"ad5686", ID_AD5686},
{"ad5686r", ID_AD5686R},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: align the IDs column

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not do that, it means you have to re-indent the whole thing when you add a new ID with a different length. That causes a lot of noise in the changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine from me to not change :)

…pport

The AD5684R/AD5685R/AD5686R are a family of 4 channel DACs with 12-bit, 14-bit
and 16-bit precision respectively. The devices come either with a built-in
reference or no built-in reference.

The AD5672R/AD5676/AD5676R are similar, except that they have 8 channels
instead of 4.

Datasheets:
http://www.analog.com/media/en/technical-documentation/data-sheets/AD5672R_5676R.pdf
http://www.analog.com/media/en/technical-documentation/data-sheets/AD5686R_5685R_5684R.pdf

Signed-off-by: Stefan Popa <[email protected]>
@stefpopa stefpopa force-pushed the rpi-4.9.y-ad5686-support branch from 2d7db57 to 61ef8a9 Compare January 23, 2018 10:47
@stefpopa
Copy link
Contributor Author

stefpopa commented Jan 23, 2018

Respun based on review comments

@@ -0,0 +1,94 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally there should be a commit that deals with just the separation of the spi part into it's own file ;
then in a subsequent commit, the new chips can be added

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought about it, but the spi part separation makes sense only in the context of adding support for a family of devices which use the i2c interface.

},
};


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: this is whitespace change ; can go in first commit :)

Copy link

@lclausen-adi lclausen-adi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks very good, well done. Just a few small things inline.

module will be called ad5686.

config AD5686_I2C
tristate "Analog Devices AD5686 and similar multi-channel DACs"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now have two Kconfig entries with the same label. I'd add (SPI) and (I2C) at the end. Maybe also change the part number for the I2C entry, since the AD5686 is SPI only.

ret = i2c_master_recv(i2c, &st->data[0].d8[1], 2);
if (ret < 0)
return ret;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check whether it works when you combine the i2c_master_send and i2c_master_recv into one i2c_transfer transaction with a read and write part? Splitting this into two separate transactions has the disadvantage that the I2C bus is released in between and another I2C driver might schedule a transaction on the same bus in between the two. This can cause all kind of difficult to diagnose issues since this would be a very rare race condition

return ret;

return (((int)st->data[0].d8[1] << 8) | st->data[0].d8[2]);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could add a __be16 d16 field to the union and then use be16_to_cpu() here. And also pass the pointer to the d16 field as the I2C read buffer.


ret = ad5686_spi_write(st, AD5686_CMD_POWERDOWN_DAC, 0,
st->pwr_down_mask & st->pwr_down_mode, 0);
ret = st->write(st, AD5686_CMD_POWERDOWN_DAC, 0,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd wrap the st->write and st->read access in helper functions ad5686_write() and ad5686_read(). This provides a clean separation between implementation and function.

Copy link

@lclausen-adi lclausen-adi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me tell you what upstream will say to this:
"I can't take patches without any changelog text, sorry :("

I know this change is trivial, but you can always write one or two sentences about what has been done.

@stefpopa stefpopa force-pushed the rpi-4.9.y-ad5686-support branch from 61ef8a9 to 40a6a38 Compare January 23, 2018 15:56
@stefpopa
Copy link
Contributor Author

changelog v2 -> v3:

  • Changed the part number for the I2C entry from AD5686_I2C to AD5696_I2C and modified the label.
  • used one i2c_transfer transaction and added a _be16 d16 field to the union which is passed as a pointer to the I2C read buffer
  • Wrapped the st->write and st->read access in helper functions ad5686_write() and ad5686_read()

@lclausen-adi
Copy link

Looks good. But I think the last two patches got squashed together?

@commodo
Copy link
Contributor

commodo commented Jan 23, 2018

LGTM
LGTM

…696R support

The AD5694/AD5694R/AD5695R/AD5696/AD5696R are a family of 4 channel DACs with 12-bit,
14-bit and 16-bit precision respectively. The devices have either no built-in reference,
or built-in 2.5V reference.

The AD5671R/AD5675R are similar, except that they have 8 instead of 4 channels.

These devices are similar to AD5672R/AD5676/AD5676R/AD5684/AD5684R/AD5684/AD5685R/AD5686/AD5686R,
except that they use i2c instead of spi. In order to add support in the previously existing
driver, three modules were created. One baseline module that has all the chip-logic and then one
module for SPI and one for I2C.

Datasheets:
	http://www.analog.com/media/en/technical-documentation/data-sheets/AD5671R_5675R.pdf
	http://www.analog.com/media/en/technical-documentation/data-sheets/AD5696R_5695R_5694R.pdf

Signed-off-by: Stefan Popa <[email protected]>
@stefpopa stefpopa force-pushed the rpi-4.9.y-ad5686-support branch from 40a6a38 to 7001d74 Compare January 25, 2018 08:41
@stefpopa stefpopa merged commit 7bfe644 into rpi-4.9.y Jan 25, 2018
@stefpopa stefpopa deleted the rpi-4.9.y-ad5686-support branch January 25, 2018 09:07
commodo pushed a commit that referenced this pull request Aug 7, 2018
…ol-handlers'

Vladimir Zapolskiy says:

====================
ravb/sh_eth: fix sleep in atomic by reusing shared ethtool handlers

For ages trivial changes to RAVB and SuperH ethernet links by means of
standard 'ethtool' trigger a 'sleeping function called from invalid
context' bug, to visualize it on r8a7795 ULCB:

  % ethtool -r eth0
  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
  in_atomic(): 1, irqs_disabled(): 128, pid: 554, name: ethtool
  INFO: lockdep is turned off.
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>]           (null)
  hardirqs last disabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
  softirqs last  enabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
  softirqs last disabled at (0): [<0000000000000000>]           (null)
  CPU: 5 PID: 554 Comm: ethtool Not tainted 4.17.0-rc4-arm64-renesas+ #33
  Hardware name: Renesas H3ULCB board based on r8a7795 ES2.0+ (DT)
  Call trace:
   dump_backtrace+0x0/0x198
   show_stack+0x24/0x30
   dump_stack+0xb8/0xf4
   ___might_sleep+0x1c8/0x1f8
   __might_sleep+0x58/0x90
   __mutex_lock+0x50/0x890
   mutex_lock_nested+0x3c/0x50
   phy_start_aneg_priv+0x38/0x180
   phy_start_aneg+0x24/0x30
   ravb_nway_reset+0x3c/0x68
   dev_ethtool+0x3dc/0x2338
   dev_ioctl+0x19c/0x490
   sock_do_ioctl+0xe0/0x238
   sock_ioctl+0x254/0x460
   do_vfs_ioctl+0xb0/0x918
   ksys_ioctl+0x50/0x80
   sys_ioctl+0x34/0x48
   __sys_trace_return+0x0/0x4

The root cause is that an attempt to modify ECMR and GECMR registers
only when RX/TX function is disabled was too overcomplicated in its
original implementation, also processing of an optional Link Change
interrupt added even more complexity, as a result the implementation
was error prone.

The new locking scheme is confirmed to be correct by dumping driver
specific and generic PHY framework function calls with aid of ftrace
while running more or less advanced tests.

Please note that sh_eth patches from the series were built-tested only.

On purpose I do not add Fixes tags, the reused PHY handlers were added
way later than the fixed problems were firstly found in the drivers.

Changes from v1 to v2:
* the original patches are split to bugfixes and enhancements only,
  both v1 and v2 series are absolutely equal in total, thus I omit
  description of changes in individual patches,
* the latter implies that there should be no strict need for retesting,
  but because formally two series are different, I have to drop the tags
  given by Geert and Andrew, please send your tags again.
====================

Signed-off-by: David S. Miller <[email protected]>
commodo pushed a commit that referenced this pull request Aug 24, 2018
In the (unlikely) event fixup_permanent_addr() returns a failure,
addrconf_permanent_addr() calls ipv6_del_addr() without the
mandatory call to in6_ifa_hold(), leading to a refcount error,
spotted by syzkaller :

WARNING: CPU: 1 PID: 3142 at lib/refcount.c:227 refcount_dec+0x4c/0x50
lib/refcount.c:227
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 3142 Comm: ip Not tainted 4.14.0-rc4-next-20171009+ #33
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 panic+0x1e4/0x41c kernel/panic.c:181
 __warn+0x1c4/0x1e0 kernel/panic.c:544
 report_bug+0x211/0x2d0 lib/bug.c:183
 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
 do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
 do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
 invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:refcount_dec+0x4c/0x50 lib/refcount.c:227
RSP: 0018:ffff8801ca49e680 EFLAGS: 00010286
RAX: 000000000000002c RBX: ffff8801d07cfcdc RCX: 0000000000000000
RDX: 000000000000002c RSI: 1ffff10039493c90 RDI: ffffed0039493cc4
RBP: ffff8801ca49e688 R08: ffff8801ca49dd70 R09: 0000000000000000
R10: ffff8801ca49df58 R11: 0000000000000000 R12: 1ffff10039493cd9
R13: ffff8801ca49e6e8 R14: ffff8801ca49e7e8 R15: ffff8801d07cfcdc
 __in6_ifa_put include/net/addrconf.h:369 [inline]
 ipv6_del_addr+0x42b/0xb60 net/ipv6/addrconf.c:1208
 addrconf_permanent_addr net/ipv6/addrconf.c:3327 [inline]
 addrconf_notify+0x1c66/0x2190 net/ipv6/addrconf.c:3393
 notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
 __raw_notifier_call_chain kernel/notifier.c:394 [inline]
 raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
 call_netdevice_notifiers_info+0x32/0x60 net/core/dev.c:1697
 call_netdevice_notifiers net/core/dev.c:1715 [inline]
 __dev_notify_flags+0x15d/0x430 net/core/dev.c:6843
 dev_change_flags+0xf5/0x140 net/core/dev.c:6879
 do_setlink+0xa1b/0x38e0 net/core/rtnetlink.c:2113
 rtnl_newlink+0xf0d/0x1a40 net/core/rtnetlink.c:2661
 rtnetlink_rcv_msg+0x733/0x1090 net/core/rtnetlink.c:4301
 netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2408
 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4313
 netlink_unicast_kernel net/netlink/af_netlink.c:1273 [inline]
 netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1299
 netlink_sendmsg+0xa4a/0xe70 net/netlink/af_netlink.c:1862
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 ___sys_sendmsg+0x75b/0x8a0 net/socket.c:2049
 __sys_sendmsg+0xe5/0x210 net/socket.c:2083
 SYSC_sendmsg net/socket.c:2094 [inline]
 SyS_sendmsg+0x2d/0x50 net/socket.c:2090
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x7fa9174d3320
RSP: 002b:00007ffe302ae9e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffe302b2ae0 RCX: 00007fa9174d3320
RDX: 0000000000000000 RSI: 00007ffe302aea20 RDI: 0000000000000016
RBP: 0000000000000082 R08: 0000000000000000 R09: 000000000000000f
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe302b32a0
R13: 0000000000000000 R14: 00007ffe302b2ab8 R15: 00007ffe302b32b8

Fixes: f1705ec ("net: ipv6: Make address flushing on ifdown optional")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: David Ahern <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
commodo pushed a commit that referenced this pull request Sep 10, 2020
When multiple adapters are present in the system, pci hot-removing second
adapter leads to the following warning as both the adapters registered
thermal zone device with same thermal zone name/type.
Therefore, use unique thermal zone name during thermal zone device
initialization. Also mark thermal zone dev NULL once unregistered.

[  414.370143] ------------[ cut here ]------------
[  414.370944] sysfs group 'power' not found for kobject 'hwmon0'
[  414.371747] WARNING: CPU: 9 PID: 2661 at fs/sysfs/group.c:281
 sysfs_remove_group+0x76/0x80
[  414.382550] CPU: 9 PID: 2661 Comm: bash Not tainted 5.8.0-rc6+ #33
[  414.383593] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0a 06/23/2016
[  414.384669] RIP: 0010:sysfs_remove_group+0x76/0x80
[  414.385738] Code: 48 89 df 5b 5d 41 5c e9 d8 b5 ff ff 48 89 df e8 60 b0 ff ff
 eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 90 ae 13 bb e8 6a 27 d0 ff <0f> 0b 5b 5d
 41 5c c3 0f 1f 00 0f 1f 44 00 00 48 85 f6 74 31 41 54
[  414.388404] RSP: 0018:ffffa22bc080fcb0 EFLAGS: 00010286
[  414.389638] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  414.390829] RDX: 0000000000000001 RSI: ffff8ee2de3e9510 RDI: ffff8ee2de3e9510
[  414.392064] RBP: ffffffffbaef2ee0 R08: 0000000000000000 R09: 0000000000000000
[  414.393224] R10: 0000000000000000 R11: 000000002b30006c R12: ffff8ee260720008
[  414.394388] R13: ffff8ee25e0a40e8 R14: ffffa22bc080ff08 R15: ffff8ee2c3be5020
[  414.395661] FS:  00007fd2a7171740(0000) GS:ffff8ee2de200000(0000)
 knlGS:0000000000000000
[  414.396825] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  414.398011] CR2: 00007f178ffe5020 CR3: 000000084c5cc003 CR4: 00000000003606e0
[  414.399172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  414.400352] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  414.401473] Call Trace:
[  414.402685]  device_del+0x89/0x400
[  414.403819]  device_unregister+0x16/0x60
[  414.405024]  hwmon_device_unregister+0x44/0xa0
[  414.406112]  thermal_remove_hwmon_sysfs+0x196/0x200
[  414.407256]  thermal_zone_device_unregister+0x1b5/0x1f0
[  414.408415]  cxgb4_thermal_remove+0x3c/0x4f [cxgb4]
[  414.409668]  remove_one+0x212/0x290 [cxgb4]
[  414.410875]  pci_device_remove+0x36/0xb0
[  414.412004]  device_release_driver_internal+0xe2/0x1c0
[  414.413276]  pci_stop_bus_device+0x64/0x90
[  414.414433]  pci_stop_and_remove_bus_device_locked+0x16/0x30
[  414.415609]  remove_store+0x75/0x90
[  414.416790]  kernfs_fop_write+0x114/0x1b0
[  414.417930]  vfs_write+0xcf/0x210
[  414.419059]  ksys_write+0xa7/0xe0
[  414.420120]  do_syscall_64+0x4c/0xa0
[  414.421278]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  414.422335] RIP: 0033:0x7fd2a686afd0
[  414.423396] Code: Bad RIP value.
[  414.424549] RSP: 002b:00007fffc1446148 EFLAGS: 00000246 ORIG_RAX:
 0000000000000001
[  414.425638] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fd2a686afd0
[  414.426830] RDX: 0000000000000002 RSI: 00007fd2a7196000 RDI: 0000000000000001
[  414.427927] RBP: 00007fd2a7196000 R08: 000000000000000a R09: 00007fd2a7171740
[  414.428923] R10: 00007fd2a7171740 R11: 0000000000000246 R12: 00007fd2a6b43400
[  414.430082] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
[  414.431027] irq event stamp: 76300
[  414.435678] ---[ end trace 13865acb4d5ab00f ]---

Fixes: b187191 ("cxgb4: Add thermal zone support")
Signed-off-by: Potnuri Bharat Teja <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
nunojsa pushed a commit that referenced this pull request Jan 18, 2024
commit 23c2d49 upstream.

The kmemleak_*_phys() apis do not check the address for lowmem's min
boundary, while the caller may pass an address below lowmem, which will
trigger an oops:

  # echo scan > /sys/kernel/debug/kmemleak
  Unable to handle kernel paging request at virtual address ff5fffffffe00000
  Oops [#1]
  Modules linked in:
  CPU: 2 PID: 134 Comm: bash Not tainted 5.18.0-rc1-next-20220407 #33
  Hardware name: riscv-virtio,qemu (DT)
  epc : scan_block+0x74/0x15c
   ra : scan_block+0x72/0x15c
  epc : ffffffff801e5806 ra : ffffffff801e5804 sp : ff200000104abc30
   gp : ffffffff815cd4e8 tp : ff60000004cfa340 t0 : 0000000000000200
   t1 : 00aaaaaac23954cc t2 : 00000000000003ff s0 : ff200000104abc90
   s1 : ffffffff81b0ff28 a0 : 0000000000000000 a1 : ff5fffffffe01000
   a2 : ffffffff81b0ff28 a3 : 0000000000000002 a4 : 0000000000000001
   a5 : 0000000000000000 a6 : ff200000104abd7c a7 : 0000000000000005
   s2 : ff5fffffffe00ff9 s3 : ffffffff815cd998 s4 : ffffffff815d0e90
   s5 : ffffffff81b0ff28 s6 : 0000000000000020 s7 : ffffffff815d0eb0
   s8 : ffffffffffffffff s9 : ff5fffffffe00000 s10: ff5fffffffe01000
   s11: 0000000000000022 t3 : 00ffffffaa17db4c t4 : 000000000000000f
   t5 : 0000000000000001 t6 : 0000000000000000
  status: 0000000000000100 badaddr: ff5fffffffe00000 cause: 000000000000000d
    scan_gray_list+0x12e/0x1a6
    kmemleak_scan+0x2aa/0x57e
    kmemleak_write+0x32a/0x40c
    full_proxy_write+0x56/0x82
    vfs_write+0xa6/0x2a6
    ksys_write+0x6c/0xe2
    sys_write+0x22/0x2a
    ret_from_syscall+0x0/0x2

The callers may not quite know the actual address they pass(e.g. from
devicetree).  So the kmemleak_*_phys() apis should guarantee the address
they finally use is in lowmem range, so check the address for lowmem's
min boundary.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Patrick Wang <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
nunojsa pushed a commit that referenced this pull request Mar 10, 2025
Some of the platforms may connect the INT pin via inversion logic
effectively make the triggering to be active-low.
Remove explicit trigger flag to respect the settings from firmware.

Without this change even idling chip produces spurious interrupts
and kernel disables the line in the result:

  irq 33: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 UID: 0 PID: 125 Comm: irq/33-i2c-INT3 Not tainted 6.12.0-00236-g8b874ed11dae #64
  Hardware name: Intel Corp. QUARK/Galileo, BIOS 0x01000900 01/01/2014
  ...
  handlers:
  [<86e86bea>] irq_default_primary_handler threaded [<d153e44a>] cy8c95x0_irq_handler [pinctrl_cy8c95x0]
  Disabling IRQ #33

Fixes: e6cbbe4 ("pinctrl: Add Cypress cy8c95x0 support")
Signed-off-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
github-actions bot pushed a commit that referenced this pull request Jun 26, 2025
Without the change `perf `hangs up on charaster devices. On my system
it's enough to run system-wide sampler for a few seconds to get the
hangup:

    $ perf record -a -g --call-graph=dwarf
    $ perf report
    # hung

`strace` shows that hangup happens on reading on a character device
`/dev/dri/renderD128`

    $ strace -y -f -p 2780484
    strace: Process 2780484 attached
    pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached

It's call trace descends into `elfutils`:

    $ gdb -p 2780484
    (gdb) bt
    #0  0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0)
        at ../sysdeps/unix/sysv/linux/pread64.c:25
    #1  0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1
    #2  0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #3  0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #4  0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 ()
       from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #5  0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0)
        at util/dso.h:537
    #6  0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114
    #7  frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242
    #8  0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #9  0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0,
        thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127,
        best_effort=best_effort@entry=false) at util/thread.h:152
    #13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0,
        sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939
    #14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540,
        max_stack=127, symbols=true) at util/machine.c:2920
    #15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440,
        sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true)
        at util/machine.c:2970
    #16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440,
        sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198
    #17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0,
        evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127
    #18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127,
        arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255
    #19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540,
        evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334
    #20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0,
        file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367
    #21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245
    #22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324
    #23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224,
        file_path=0x105ffbf0 "perf.data") at util/session.c:1419
    #24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0,
    --Type <RET> for more, q to quit, c to continue without paging--
    quit
        prog=prog@entry=0x7fff9df81220) at util/session.c:2132
    #25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220)
        at util/session.c:2181
    #26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226
    #27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390
    #28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076
    #29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827
    #30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0)
        at perf.c:351
    #31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404
    #32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448
    #33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556

The hangup happens because nothing in` perf` or `elfutils` checks if a
mapped file is easily readable.

The change conservatively skips all non-regular files.

Signed-off-by: Sergei Trofimovich <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
github-actions bot pushed a commit that referenced this pull request Jul 22, 2025
Callers of wdev_chandef() must hold the wiphy mutex.

But the worker cfg80211_propagate_cac_done_wk() never takes the lock.
Which triggers the warning below with the mesh_peer_connected_dfs
test from hostapd and not (yet) released mac80211 code changes:

WARNING: CPU: 0 PID: 495 at net/wireless/chan.c:1552 wdev_chandef+0x60/0x165
Modules linked in:
CPU: 0 UID: 0 PID: 495 Comm: kworker/u4:2 Not tainted 6.14.0-rc5-wt-g03960e6f9d47 #33 13c287eeabfe1efea01c0bcc863723ab082e17cf
Workqueue: cfg80211 cfg80211_propagate_cac_done_wk
Stack:
 00000000 00000001 ffffff00 6093267c
 00000000 6002ec30 6d577c50 60037608
 00000000 67e8d108 6063717b 00000000
Call Trace:
 [<6002ec30>] ? _printk+0x0/0x98
 [<6003c2b3>] show_stack+0x10e/0x11a
 [<6002ec30>] ? _printk+0x0/0x98
 [<60037608>] dump_stack_lvl+0x71/0xb8
 [<6063717b>] ? wdev_chandef+0x60/0x165
 [<6003766d>] dump_stack+0x1e/0x20
 [<6005d1b7>] __warn+0x101/0x20f
 [<6005d3a8>] warn_slowpath_fmt+0xe3/0x15d
 [<600b0c5c>] ? mark_lock.part.0+0x0/0x4ec
 [<60751191>] ? __this_cpu_preempt_check+0x0/0x16
 [<600b11a2>] ? mark_held_locks+0x5a/0x6e
 [<6005d2c5>] ? warn_slowpath_fmt+0x0/0x15d
 [<60052e53>] ? unblock_signals+0x3a/0xe7
 [<60052f2d>] ? um_set_signals+0x2d/0x43
 [<60751191>] ? __this_cpu_preempt_check+0x0/0x16
 [<607508b2>] ? lock_is_held_type+0x207/0x21f
 [<6063717b>] wdev_chandef+0x60/0x165
 [<605f89b4>] regulatory_propagate_dfs_state+0x247/0x43f
 [<60052f00>] ? um_set_signals+0x0/0x43
 [<605e6bfd>] cfg80211_propagate_cac_done_wk+0x3a/0x4a
 [<6007e460>] process_scheduled_works+0x3bc/0x60e
 [<6007d0ec>] ? move_linked_works+0x4d/0x81
 [<6007d120>] ? assign_work+0x0/0xaa
 [<6007f81f>] worker_thread+0x220/0x2dc
 [<600786ef>] ? set_pf_worker+0x0/0x57
 [<60087c96>] ? to_kthread+0x0/0x43
 [<6008ab3c>] kthread+0x2d3/0x2e2
 [<6007f5ff>] ? worker_thread+0x0/0x2dc
 [<6006c05b>] ? calculate_sigpending+0x0/0x56
 [<6003b37d>] new_thread_handler+0x4a/0x64
irq event stamp: 614611
hardirqs last  enabled at (614621): [<00000000600bc96b>] __up_console_sem+0x82/0xaf
hardirqs last disabled at (614630): [<00000000600bc92c>] __up_console_sem+0x43/0xaf
softirqs last  enabled at (614268): [<00000000606c55c6>] __ieee80211_wake_queue+0x933/0x985
softirqs last disabled at (614266): [<00000000606c52d6>] __ieee80211_wake_queue+0x643/0x985

Fixes: 26ec17a ("cfg80211: Fix radar event during another phy CAC")
Signed-off-by: Alexander Wetzel <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
nunojsa pushed a commit that referenced this pull request Aug 5, 2025
[ Upstream commit eedf3e3 ]

ACPICA commit 1c28da2242783579d59767617121035dafba18c3

This was originally done in NetBSD:
NetBSD/src@b69d1ac
and is the correct alternative to the smattering of `memcpy`s I
previously contributed to this repository.

This also sidesteps the newly strict checks added in UBSAN:
llvm/llvm-project@7926744

Before this change we see the following UBSAN stack trace in Fuchsia:

  #0    0x000021afcfdeca5e in acpi_rs_get_address_common(struct acpi_resource*, union aml_resource*) ../../third_party/acpica/source/components/resources/rsaddr.c:329 <platform-bus-x86.so>+0x6aca5e
  #1.2  0x000021982bc4af3c in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x41f3c
  #1.1  0x000021982bc4af3c in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x41f3c
  #1    0x000021982bc4af3c in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:395 <libclang_rt.asan.so>+0x41f3c
  #2    0x000021982bc4bb6f in handletype_mismatch_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:137 <libclang_rt.asan.so>+0x42b6f
  #3    0x000021982bc4b723 in __ubsan_handle_type_mismatch_v1 compiler-rt/lib/ubsan/ubsan_handlers.cpp:142 <libclang_rt.asan.so>+0x42723
  #4    0x000021afcfdeca5e in acpi_rs_get_address_common(struct acpi_resource*, union aml_resource*) ../../third_party/acpica/source/components/resources/rsaddr.c:329 <platform-bus-x86.so>+0x6aca5e
  #5    0x000021afcfdf2089 in acpi_rs_convert_aml_to_resource(struct acpi_resource*, union aml_resource*, struct acpi_rsconvert_info*) ../../third_party/acpica/source/components/resources/rsmisc.c:355 <platform-bus-x86.so>+0x6b2089
  #6    0x000021afcfded169 in acpi_rs_convert_aml_to_resources(u8*, u32, u32, u8, void**) ../../third_party/acpica/source/components/resources/rslist.c:137 <platform-bus-x86.so>+0x6ad169
  #7    0x000021afcfe2d24a in acpi_ut_walk_aml_resources(struct acpi_walk_state*, u8*, acpi_size, acpi_walk_aml_callback, void**) ../../third_party/acpica/source/components/utilities/utresrc.c:237 <platform-bus-x86.so>+0x6ed24a
  #8    0x000021afcfde66b7 in acpi_rs_create_resource_list(union acpi_operand_object*, struct acpi_buffer*) ../../third_party/acpica/source/components/resources/rscreate.c:199 <platform-bus-x86.so>+0x6a66b7
  #9    0x000021afcfdf6979 in acpi_rs_get_method_data(acpi_handle, const char*, struct acpi_buffer*) ../../third_party/acpica/source/components/resources/rsutils.c:770 <platform-bus-x86.so>+0x6b6979
  #10   0x000021afcfdf708f in acpi_walk_resources(acpi_handle, char*, acpi_walk_resource_callback, void*) ../../third_party/acpica/source/components/resources/rsxface.c:731 <platform-bus-x86.so>+0x6b708f
  #11   0x000021afcfa95dcf in acpi::acpi_impl::walk_resources(acpi::acpi_impl*, acpi_handle, const char*, acpi::Acpi::resources_callable) ../../src/devices/board/lib/acpi/acpi-impl.cc:41 <platform-bus-x86.so>+0x355dcf
  #12   0x000021afcfaa8278 in acpi::device_builder::gather_resources(acpi::device_builder*, acpi::Acpi*, fidl::any_arena&, acpi::Manager*, acpi::device_builder::gather_resources_callback) ../../src/devices/board/lib/acpi/device-builder.cc:84 <platform-bus-x86.so>+0x368278
  #13   0x000021afcfbddb87 in acpi::Manager::configure_discovered_devices(acpi::Manager*) ../../src/devices/board/lib/acpi/manager.cc:75 <platform-bus-x86.so>+0x49db87
  #14   0x000021afcf99091d in publish_acpi_devices(acpi::Manager*, zx_device_t*, zx_device_t*) ../../src/devices/board/drivers/x86/acpi-nswalk.cc:95 <platform-bus-x86.so>+0x25091d
  #15   0x000021afcf9c1d4e in x86::X86::do_init(x86::X86*) ../../src/devices/board/drivers/x86/x86.cc:60 <platform-bus-x86.so>+0x281d4e
  #16   0x000021afcf9e33ad in λ(x86::X86::ddk_init::(anon class)*) ../../src/devices/board/drivers/x86/x86.cc:77 <platform-bus-x86.so>+0x2a33ad
  #17   0x000021afcf9e313e in fit::internal::target<(lambda at../../src/devices/board/drivers/x86/x86.cc:76:19), false, false, std::__2::allocator<std::byte>, void>::invoke(void*) ../../sdk/lib/fit/include/lib/fit/internal/function.h:183 <platform-bus-x86.so>+0x2a313e
  #18   0x000021afcfbab4c7 in fit::internal::function_base<16UL, false, void(), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<16UL, false, void (), std::__2::allocator<std::byte> >*) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <platform-bus-x86.so>+0x46b4c7
  #19   0x000021afcfbab342 in fit::function_impl<16UL, false, void(), std::__2::allocator<std::byte>>::operator()(const fit::function_impl<16UL, false, void (), std::__2::allocator<std::byte> >*) ../../sdk/lib/fit/include/lib/fit/function.h:315 <platform-bus-x86.so>+0x46b342
  #20   0x000021afcfcd98c3 in async::internal::retained_task::Handler(async_dispatcher_t*, async_task_t*, zx_status_t) ../../sdk/lib/async/task.cc:24 <platform-bus-x86.so>+0x5998c3
  #21   0x00002290f9924616 in λ(const driver_runtime::Dispatcher::post_task::(anon class)*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, zx_status_t) ../../src/devices/bin/driver_runtime/dispatcher.cc:789 <libdriver_runtime.so>+0x10a616
  #22   0x00002290f9924323 in fit::internal::target<(lambda at../../src/devices/bin/driver_runtime/dispatcher.cc:788:7), true, false, std::__2::allocator<std::byte>, void, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int>::invoke(void*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/internal/function.h:128 <libdriver_runtime.so>+0x10a323
  #23   0x00002290f9904b76 in fit::internal::function_base<24UL, true, void(std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<24UL, true, void (std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <libdriver_runtime.so>+0xeab76
  #24   0x00002290f9904831 in fit::callback_impl<24UL, true, void(std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int), std::__2::allocator<std::byte>>::operator()(fit::callback_impl<24UL, true, void (std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/function.h:471 <libdriver_runtime.so>+0xea831
  #25   0x00002290f98d5adc in driver_runtime::callback_request::Call(driver_runtime::callback_request*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, zx_status_t) ../../src/devices/bin/driver_runtime/callback_request.h:74 <libdriver_runtime.so>+0xbbadc
  #26   0x00002290f98e1e58 in driver_runtime::Dispatcher::dispatch_callback(driver_runtime::Dispatcher*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >) ../../src/devices/bin/driver_runtime/dispatcher.cc:1248 <libdriver_runtime.so>+0xc7e58
  #27   0x00002290f98e4159 in driver_runtime::Dispatcher::dispatch_callbacks(driver_runtime::Dispatcher*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.cc:1308 <libdriver_runtime.so>+0xca159
  #28   0x00002290f9918414 in λ(const driver_runtime::Dispatcher::create_with_adder::(anon class)*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.cc:353 <libdriver_runtime.so>+0xfe414
  #29   0x00002290f991812d in fit::internal::target<(lambda at../../src/devices/bin/driver_runtime/dispatcher.cc:351:7), true, false, std::__2::allocator<std::byte>, void, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>>::invoke(void*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/internal/function.h:128 <libdriver_runtime.so>+0xfe12d
  #30   0x00002290f9906fc7 in fit::internal::function_base<8UL, true, void(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<8UL, true, void (std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <libdriver_runtime.so>+0xecfc7
  #31   0x00002290f9906c66 in fit::function_impl<8UL, true, void(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte>>::operator()(const fit::function_impl<8UL, true, void (std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/function.h:315 <libdriver_runtime.so>+0xecc66
  #32   0x00002290f98e73d9 in driver_runtime::Dispatcher::event_waiter::invoke_callback(driver_runtime::Dispatcher::event_waiter*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.h:543 <libdriver_runtime.so>+0xcd3d9
  #33   0x00002290f98e700d in driver_runtime::Dispatcher::event_waiter::handle_event(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, async_dispatcher_t*, async::wait_base*, zx_status_t, zx_packet_signal_t const*) ../../src/devices/bin/driver_runtime/dispatcher.cc:1442 <libdriver_runtime.so>+0xcd00d
  #34   0x00002290f9918983 in async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>::handle_event(async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>*, async_dispatcher_t*, async::wait_base*, zx_status_t, zx_packet_signal_t const*) ../../src/devices/bin/driver_runtime/async_loop_owned_event_handler.h:59 <libdriver_runtime.so>+0xfe983
  #35   0x00002290f9918b9e in async::wait_method<async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>, &async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>::handle_event>::call_handler(async_dispatcher_t*, async_wait_t*, zx_status_t, zx_packet_signal_t const*) ../../sdk/lib/async/include/lib/async/cpp/wait.h:201 <libdriver_runtime.so>+0xfeb9e
  #36   0x00002290f99bf509 in async_loop_dispatch_wait(async_loop_t*, async_wait_t*, zx_status_t, zx_packet_signal_t const*) ../../sdk/lib/async-loop/loop.c:394 <libdriver_runtime.so>+0x1a5509
  #37   0x00002290f99b9958 in async_loop_run_once(async_loop_t*, zx_time_t) ../../sdk/lib/async-loop/loop.c:343 <libdriver_runtime.so>+0x19f958
  #38   0x00002290f99b9247 in async_loop_run(async_loop_t*, zx_time_t, _Bool) ../../sdk/lib/async-loop/loop.c:301 <libdriver_runtime.so>+0x19f247
  #39   0x00002290f99ba962 in async_loop_run_thread(void*) ../../sdk/lib/async-loop/loop.c:860 <libdriver_runtime.so>+0x1a0962
  #40   0x000041afd176ef30 in start_c11(void*) ../../zircon/third_party/ulib/musl/pthread/pthread_create.c:63 <libc.so>+0x84f30
  #41   0x000041afd18a448d in thread_trampoline(uintptr_t, uintptr_t) ../../zircon/system/ulib/runtime/thread.cc:100 <libc.so>+0x1ba48d

Link: acpica/acpica@1c28da22
Signed-off-by: Rafael J. Wysocki <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Tamir Duberstein <[email protected]>
[ rjw: Pick up the tag from Tamir ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 5, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 6, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 9, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 11, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 12, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 13, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 16, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 17, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 18, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
dlech pushed a commit that referenced this pull request Sep 23, 2025
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section.

While buddy allocations cannot cross memory section boundaries, hugetlb
and dax folios can.

So crossing a memory section means that "page++" could do the wrong thing.
Instead, nth_page() on these problematic configs always goes from
page->pfn, to the go from (++pfn)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped.

We refer to such problematic PFN ranges and "non-contiguous pages".  If we
only deal with "contiguous pages", there is not need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1 -> #5   : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6 -> #13  : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity+check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so cudos to them.


This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user to disable VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow to use SPARSEMEM without
SPARSEMEM_VMEMMAP.

This implies that the option to not use SPARSEMEM_VMEMMAP will now be gone
for loongarch, powerpc, riscv and sparc.  All architectures only enable
SPARSEMEM_VMEMMAP with 64bit support, so there should not really be a big
downside to using the VMEMMAP (quite the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section

(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly fails).

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Zi Yan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: SeongJae Park <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Cc: Alex Dubov <[email protected]>
Cc: Alex Willamson <[email protected]>
Cc: Bart van Assche <[email protected]>
Cc: Borislav Betkov <[email protected]>
Cc: Brendan Jackman <[email protected]>
Cc: Brett Creeley <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Lameter (Ampere) <[email protected]>
Cc: Damien Le Maol <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Doug Gilbert <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Inki Dae <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Lahtinen <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Lars Persson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Maxim Levitky <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Robin Murohy <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Shameerali Kolothum Thodi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleinxer <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Ulf Hansson <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yishai Hadas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
github-actions bot pushed a commit that referenced this pull request Oct 10, 2025
The hfsplus_strcasecmp() logic can trigger the issue:

[  117.317703][ T9855] ==================================================================
[  117.318353][ T9855] BUG: KASAN: slab-out-of-bounds in hfsplus_strcasecmp+0x1bc/0x490
[  117.318991][ T9855] Read of size 2 at addr ffff88802160f40c by task repro/9855
[  117.319577][ T9855]
[  117.319773][ T9855] CPU: 0 UID: 0 PID: 9855 Comm: repro Not tainted 6.17.0-rc6 #33 PREEMPT(full)
[  117.319780][ T9855] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  117.319783][ T9855] Call Trace:
[  117.319785][ T9855]  <TASK>
[  117.319788][ T9855]  dump_stack_lvl+0x1c1/0x2a0
[  117.319795][ T9855]  ? __virt_addr_valid+0x1c8/0x5c0
[  117.319803][ T9855]  ? __pfx_dump_stack_lvl+0x10/0x10
[  117.319808][ T9855]  ? rcu_is_watching+0x15/0xb0
[  117.319816][ T9855]  ? lock_release+0x4b/0x3e0
[  117.319821][ T9855]  ? __kasan_check_byte+0x12/0x40
[  117.319828][ T9855]  ? __virt_addr_valid+0x1c8/0x5c0
[  117.319835][ T9855]  ? __virt_addr_valid+0x4a5/0x5c0
[  117.319842][ T9855]  print_report+0x17e/0x7e0
[  117.319848][ T9855]  ? __virt_addr_valid+0x1c8/0x5c0
[  117.319855][ T9855]  ? __virt_addr_valid+0x4a5/0x5c0
[  117.319862][ T9855]  ? __phys_addr+0xd3/0x180
[  117.319869][ T9855]  ? hfsplus_strcasecmp+0x1bc/0x490
[  117.319876][ T9855]  kasan_report+0x147/0x180
[  117.319882][ T9855]  ? hfsplus_strcasecmp+0x1bc/0x490
[  117.319891][ T9855]  hfsplus_strcasecmp+0x1bc/0x490
[  117.319900][ T9855]  ? __pfx_hfsplus_cat_case_cmp_key+0x10/0x10
[  117.319906][ T9855]  hfs_find_rec_by_key+0xa9/0x1e0
[  117.319913][ T9855]  __hfsplus_brec_find+0x18e/0x470
[  117.319920][ T9855]  ? __pfx_hfsplus_bnode_find+0x10/0x10
[  117.319926][ T9855]  ? __pfx_hfs_find_rec_by_key+0x10/0x10
[  117.319933][ T9855]  ? __pfx___hfsplus_brec_find+0x10/0x10
[  117.319942][ T9855]  hfsplus_brec_find+0x28f/0x510
[  117.319949][ T9855]  ? __pfx_hfs_find_rec_by_key+0x10/0x10
[  117.319956][ T9855]  ? __pfx_hfsplus_brec_find+0x10/0x10
[  117.319963][ T9855]  ? __kmalloc_noprof+0x2a9/0x510
[  117.319969][ T9855]  ? hfsplus_find_init+0x8c/0x1d0
[  117.319976][ T9855]  hfsplus_brec_read+0x2b/0x120
[  117.319983][ T9855]  hfsplus_lookup+0x2aa/0x890
[  117.319990][ T9855]  ? __pfx_hfsplus_lookup+0x10/0x10
[  117.320003][ T9855]  ? d_alloc_parallel+0x2f0/0x15e0
[  117.320008][ T9855]  ? __lock_acquire+0xaec/0xd80
[  117.320013][ T9855]  ? __pfx_d_alloc_parallel+0x10/0x10
[  117.320019][ T9855]  ? __raw_spin_lock_init+0x45/0x100
[  117.320026][ T9855]  ? __init_waitqueue_head+0xa9/0x150
[  117.320034][ T9855]  __lookup_slow+0x297/0x3d0
[  117.320039][ T9855]  ? __pfx___lookup_slow+0x10/0x10
[  117.320045][ T9855]  ? down_read+0x1ad/0x2e0
[  117.320055][ T9855]  lookup_slow+0x53/0x70
[  117.320065][ T9855]  walk_component+0x2f0/0x430
[  117.320073][ T9855]  path_lookupat+0x169/0x440
[  117.320081][ T9855]  filename_lookup+0x212/0x590
[  117.320089][ T9855]  ? __pfx_filename_lookup+0x10/0x10
[  117.320098][ T9855]  ? strncpy_from_user+0x150/0x290
[  117.320105][ T9855]  ? getname_flags+0x1e5/0x540
[  117.320112][ T9855]  user_path_at+0x3a/0x60
[  117.320117][ T9855]  __x64_sys_umount+0xee/0x160
[  117.320123][ T9855]  ? __pfx___x64_sys_umount+0x10/0x10
[  117.320129][ T9855]  ? do_syscall_64+0xb7/0x3a0
[  117.320135][ T9855]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  117.320141][ T9855]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  117.320145][ T9855]  do_syscall_64+0xf3/0x3a0
[  117.320150][ T9855]  ? exc_page_fault+0x9f/0xf0
[  117.320154][ T9855]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  117.320158][ T9855] RIP: 0033:0x7f7dd7908b07
[  117.320163][ T9855] Code: 23 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 08
[  117.320167][ T9855] RSP: 002b:00007ffd5ebd9698 EFLAGS: 00000202 ORIG_RAX: 00000000000000a6
[  117.320172][ T9855] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7dd7908b07
[  117.320176][ T9855] RDX: 0000000000000009 RSI: 0000000000000009 RDI: 00007ffd5ebd9740
[  117.320179][ T9855] RBP: 00007ffd5ebda780 R08: 0000000000000005 R09: 00007ffd5ebd9530
[  117.320181][ T9855] R10: 00007f7dd799bfc0 R11: 0000000000000202 R12: 000055e2008b32d0
[  117.320184][ T9855] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  117.320189][ T9855]  </TASK>
[  117.320190][ T9855]
[  117.351311][ T9855] Allocated by task 9855:
[  117.351683][ T9855]  kasan_save_track+0x3e/0x80
[  117.352093][ T9855]  __kasan_kmalloc+0x8d/0xa0
[  117.352490][ T9855]  __kmalloc_noprof+0x288/0x510
[  117.352914][ T9855]  hfsplus_find_init+0x8c/0x1d0
[  117.353342][ T9855]  hfsplus_lookup+0x19c/0x890
[  117.353747][ T9855]  __lookup_slow+0x297/0x3d0
[  117.354148][ T9855]  lookup_slow+0x53/0x70
[  117.354514][ T9855]  walk_component+0x2f0/0x430
[  117.354921][ T9855]  path_lookupat+0x169/0x440
[  117.355325][ T9855]  filename_lookup+0x212/0x590
[  117.355740][ T9855]  user_path_at+0x3a/0x60
[  117.356115][ T9855]  __x64_sys_umount+0xee/0x160
[  117.356529][ T9855]  do_syscall_64+0xf3/0x3a0
[  117.356920][ T9855]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  117.357429][ T9855]
[  117.357636][ T9855] The buggy address belongs to the object at ffff88802160f000
[  117.357636][ T9855]  which belongs to the cache kmalloc-2k of size 2048
[  117.358827][ T9855] The buggy address is located 0 bytes to the right of
[  117.358827][ T9855]  allocated 1036-byte region [ffff88802160f000, ffff88802160f40c)
[  117.360061][ T9855]
[  117.360266][ T9855] The buggy address belongs to the physical page:
[  117.360813][ T9855] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x21608
[  117.361562][ T9855] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[  117.362285][ T9855] flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
[  117.362929][ T9855] page_type: f5(slab)
[  117.363282][ T9855] raw: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002
[  117.364015][ T9855] raw: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
[  117.364750][ T9855] head: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002
[  117.365491][ T9855] head: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
[  117.366232][ T9855] head: 00fff00000000003 ffffea0000858201 00000000ffffffff 00000000ffffffff
[  117.366968][ T9855] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
[  117.367711][ T9855] page dumped because: kasan: bad access detected
[  117.368259][ T9855] page_owner tracks the page as allocated
[  117.368745][ T9855] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN1
[  117.370541][ T9855]  post_alloc_hook+0x240/0x2a0
[  117.370954][ T9855]  get_page_from_freelist+0x2101/0x21e0
[  117.371435][ T9855]  __alloc_frozen_pages_noprof+0x274/0x380
[  117.371935][ T9855]  alloc_pages_mpol+0x241/0x4b0
[  117.372360][ T9855]  allocate_slab+0x8d/0x380
[  117.372752][ T9855]  ___slab_alloc+0xbe3/0x1400
[  117.373159][ T9855]  __kmalloc_cache_noprof+0x296/0x3d0
[  117.373621][ T9855]  nexthop_net_init+0x75/0x100
[  117.374038][ T9855]  ops_init+0x35c/0x5c0
[  117.374400][ T9855]  setup_net+0x10c/0x320
[  117.374768][ T9855]  copy_net_ns+0x31b/0x4d0
[  117.375156][ T9855]  create_new_namespaces+0x3f3/0x720
[  117.375613][ T9855]  unshare_nsproxy_namespaces+0x11c/0x170
[  117.376094][ T9855]  ksys_unshare+0x4ca/0x8d0
[  117.376477][ T9855]  __x64_sys_unshare+0x38/0x50
[  117.376879][ T9855]  do_syscall_64+0xf3/0x3a0
[  117.377265][ T9855] page last free pid 9110 tgid 9110 stack trace:
[  117.377795][ T9855]  __free_frozen_pages+0xbeb/0xd50
[  117.378229][ T9855]  __put_partials+0x152/0x1a0
[  117.378625][ T9855]  put_cpu_partial+0x17c/0x250
[  117.379026][ T9855]  __slab_free+0x2d4/0x3c0
[  117.379404][ T9855]  qlist_free_all+0x97/0x140
[  117.379790][ T9855]  kasan_quarantine_reduce+0x148/0x160
[  117.380250][ T9855]  __kasan_slab_alloc+0x22/0x80
[  117.380662][ T9855]  __kmalloc_noprof+0x232/0x510
[  117.381074][ T9855]  tomoyo_supervisor+0xc0a/0x1360
[  117.381498][ T9855]  tomoyo_env_perm+0x149/0x1e0
[  117.381903][ T9855]  tomoyo_find_next_domain+0x15ad/0x1b90
[  117.382378][ T9855]  tomoyo_bprm_check_security+0x11c/0x180
[  117.382859][ T9855]  security_bprm_check+0x89/0x280
[  117.383289][ T9855]  bprm_execve+0x8f1/0x14a0
[  117.383673][ T9855]  do_execveat_common+0x528/0x6b0
[  117.384103][ T9855]  __x64_sys_execve+0x94/0xb0
[  117.384500][ T9855]
[  117.384706][ T9855] Memory state around the buggy address:
[  117.385179][ T9855]  ffff88802160f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  117.385854][ T9855]  ffff88802160f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  117.386534][ T9855] >ffff88802160f400: 00 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  117.387204][ T9855]                       ^
[  117.387566][ T9855]  ffff88802160f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  117.388243][ T9855]  ffff88802160f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  117.388918][ T9855] ==================================================================

The issue takes place if the length field of struct hfsplus_unistr
is bigger than HFSPLUS_MAX_STRLEN. The patch simply checks
the length of comparing strings. And if the strings' length
is bigger than HFSPLUS_MAX_STRLEN, then it is corrected
to this value.

v2
The string length correction has been added for hfsplus_strcmp().

Reported-by: Jiaming Zhang <[email protected]>
Signed-off-by: Viacheslav Dubeyko <[email protected]>
cc: John Paul Adrian Glaubitz <[email protected]>
cc: Yangtao Li <[email protected]>
cc: [email protected]
cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Viacheslav Dubeyko <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants