Skip to content

Commit 27af8e2

Browse files
Andrea Righipavelmachek
Andrea Righi
authored andcommitted
leds: trigger: fix potential deadlock with libata
We have the following potential deadlock condition: ======================================================== WARNING: possible irq lock inversion dependency detected 5.10.0-rc2+ #25 Not tainted -------------------------------------------------------- swapper/3/0 just changed the state of lock: ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200 but this lock took another, HARDIRQ-READ-unsafe lock in the past: (&trig->leddev_list_lock){.+.?}-{2:2} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&trig->leddev_list_lock); local_irq_disable(); lock(&host->lock); lock(&trig->leddev_list_lock); <Interrupt> lock(&host->lock); *** DEADLOCK *** no locks held by swapper/3/0. the shortest dependencies between 2nd lock and 1st lock: -> (&trig->leddev_list_lock){.+.?}-{2:2} ops: 46 { HARDIRQ-ON-R at: lock_acquire+0x15f/0x420 _raw_read_lock+0x42/0x90 led_trigger_event+0x2b/0x70 rfkill_global_led_trigger_worker+0x94/0xb0 process_one_work+0x240/0x560 worker_thread+0x58/0x3d0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 IN-SOFTIRQ-R at: lock_acquire+0x15f/0x420 _raw_read_lock+0x42/0x90 led_trigger_event+0x2b/0x70 kbd_bh+0x9e/0xc0 tasklet_action_common.constprop.0+0xe9/0x100 tasklet_action+0x22/0x30 __do_softirq+0xcc/0x46d run_ksoftirqd+0x3f/0x70 smpboot_thread_fn+0x116/0x1f0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 SOFTIRQ-ON-R at: lock_acquire+0x15f/0x420 _raw_read_lock+0x42/0x90 led_trigger_event+0x2b/0x70 rfkill_global_led_trigger_worker+0x94/0xb0 process_one_work+0x240/0x560 worker_thread+0x58/0x3d0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 INITIAL READ USE at: lock_acquire+0x15f/0x420 _raw_read_lock+0x42/0x90 led_trigger_event+0x2b/0x70 rfkill_global_led_trigger_worker+0x94/0xb0 process_one_work+0x240/0x560 worker_thread+0x58/0x3d0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 } ... key at: [<ffffffff83da4c00>] __key.0+0x0/0x10 ... acquired at: _raw_read_lock+0x42/0x90 led_trigger_blink_oneshot+0x3b/0x90 ledtrig_disk_activity+0x3c/0xa0 ata_qc_complete+0x26/0x450 ata_do_link_abort+0xa3/0xe0 ata_port_freeze+0x2e/0x40 ata_hsm_qc_complete+0x94/0xa0 ata_sff_hsm_move+0x177/0x7a0 ata_sff_pio_task+0xc7/0x1b0 process_one_work+0x240/0x560 worker_thread+0x58/0x3d0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 -> (&host->lock){-...}-{2:2} ops: 69 { IN-HARDIRQ-W at: lock_acquire+0x15f/0x420 _raw_spin_lock_irqsave+0x52/0xa0 ata_bmdma_interrupt+0x27/0x200 __handle_irq_event_percpu+0xd5/0x2b0 handle_irq_event+0x57/0xb0 handle_edge_irq+0x8c/0x230 asm_call_irq_on_stack+0xf/0x20 common_interrupt+0x100/0x1c0 asm_common_interrupt+0x1e/0x40 native_safe_halt+0xe/0x10 arch_cpu_idle+0x15/0x20 default_idle_call+0x59/0x1c0 do_idle+0x22c/0x2c0 cpu_startup_entry+0x20/0x30 start_secondary+0x11d/0x150 secondary_startup_64_no_verify+0xa6/0xab INITIAL USE at: lock_acquire+0x15f/0x420 _raw_spin_lock_irqsave+0x52/0xa0 ata_dev_init+0x54/0xe0 ata_link_init+0x8b/0xd0 ata_port_alloc+0x1f1/0x210 ata_host_alloc+0xf1/0x130 ata_host_alloc_pinfo+0x14/0xb0 ata_pci_sff_prepare_host+0x41/0xa0 ata_pci_bmdma_prepare_host+0x14/0x30 piix_init_one+0x21f/0x600 local_pci_probe+0x48/0x80 pci_device_probe+0x105/0x1c0 really_probe+0x221/0x490 driver_probe_device+0xe9/0x160 device_driver_attach+0xb2/0xc0 __driver_attach+0x91/0x150 bus_for_each_dev+0x81/0xc0 driver_attach+0x1e/0x20 bus_add_driver+0x138/0x1f0 driver_register+0x91/0xf0 __pci_register_driver+0x73/0x80 piix_init+0x1e/0x2e do_one_initcall+0x5f/0x2d0 kernel_init_freeable+0x26f/0x2cf kernel_init+0xe/0x113 ret_from_fork+0x1f/0x30 } ... key at: [<ffffffff83d9fdc0>] __key.6+0x0/0x10 ... acquired at: __lock_acquire+0x9da/0x2370 lock_acquire+0x15f/0x420 _raw_spin_lock_irqsave+0x52/0xa0 ata_bmdma_interrupt+0x27/0x200 __handle_irq_event_percpu+0xd5/0x2b0 handle_irq_event+0x57/0xb0 handle_edge_irq+0x8c/0x230 asm_call_irq_on_stack+0xf/0x20 common_interrupt+0x100/0x1c0 asm_common_interrupt+0x1e/0x40 native_safe_halt+0xe/0x10 arch_cpu_idle+0x15/0x20 default_idle_call+0x59/0x1c0 do_idle+0x22c/0x2c0 cpu_startup_entry+0x20/0x30 start_secondary+0x11d/0x150 secondary_startup_64_no_verify+0xa6/0xab This lockdep splat is reported after: commit e918188 ("locking: More accurate annotations for read_lock()") To clarify: - read-locks are recursive only in interrupt context (when in_interrupt() returns true) - after acquiring host->lock in CPU1, another cpu (i.e. CPU2) may call write_lock(&trig->leddev_list_lock) that would be blocked by CPU0 that holds trig->leddev_list_lock in read-mode - when CPU1 (ata_ac_complete()) tries to read-lock trig->leddev_list_lock, it would be blocked by the write-lock waiter on CPU2 (because we are not in interrupt context, so the read-lock is not recursive) - at this point if an interrupt happens on CPU0 and ata_bmdma_interrupt() is executed it will try to acquire host->lock, that is held by CPU1, that is currently blocked by CPU2, so: * CPU0 blocked by CPU1 * CPU1 blocked by CPU2 * CPU2 blocked by CPU0 *** DEADLOCK *** The deadlock scenario is better represented by the following schema (thanks to Boqun Feng <[email protected]> for the schema and the detailed explanation of the deadlock condition): CPU 0: CPU 1: CPU 2: ----- ----- ----- led_trigger_event(): read_lock(&trig->leddev_list_lock); <workqueue> ata_hsm_qc_complete(): spin_lock_irqsave(&host->lock); write_lock(&trig->leddev_list_lock); ata_port_freeze(): ata_do_link_abort(): ata_qc_complete(): ledtrig_disk_activity(): led_trigger_blink_oneshot(): read_lock(&trig->leddev_list_lock); // ^ not in in_interrupt() context, so could get blocked by CPU 2 <interrupt> ata_bmdma_interrupt(): spin_lock_irqsave(&host->lock); Fix by using read_lock_irqsave/irqrestore() in led_trigger_event(), so that no interrupt can happen in between, preventing the deadlock condition. Apply the same change to led_trigger_blink_setup() as well, since the same deadlock scenario can also happen in power_supply_update_bat_leds() -> led_trigger_blink() -> led_trigger_blink_setup() (workqueue context), and potentially prevent other similar usages. Link: https://lore.kernel.org/lkml/20201101092614.GB3989@xps-13-7390/ Fixes: eb25cb9 ("leds: convert IDE trigger to common disk trigger") Signed-off-by: Andrea Righi <[email protected]> Signed-off-by: Pavel Machek <[email protected]>
1 parent 47854d2 commit 27af8e2

File tree

1 file changed

+6
-4
lines changed

1 file changed

+6
-4
lines changed

drivers/leds/led-triggers.c

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
378378
enum led_brightness brightness)
379379
{
380380
struct led_classdev *led_cdev;
381+
unsigned long flags;
381382

382383
if (!trig)
383384
return;
384385

385-
read_lock(&trig->leddev_list_lock);
386+
read_lock_irqsave(&trig->leddev_list_lock, flags);
386387
list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
387388
led_set_brightness(led_cdev, brightness);
388-
read_unlock(&trig->leddev_list_lock);
389+
read_unlock_irqrestore(&trig->leddev_list_lock, flags);
389390
}
390391
EXPORT_SYMBOL_GPL(led_trigger_event);
391392

@@ -396,19 +397,20 @@ static void led_trigger_blink_setup(struct led_trigger *trig,
396397
int invert)
397398
{
398399
struct led_classdev *led_cdev;
400+
unsigned long flags;
399401

400402
if (!trig)
401403
return;
402404

403-
read_lock(&trig->leddev_list_lock);
405+
read_lock_irqsave(&trig->leddev_list_lock, flags);
404406
list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list) {
405407
if (oneshot)
406408
led_blink_set_oneshot(led_cdev, delay_on, delay_off,
407409
invert);
408410
else
409411
led_blink_set(led_cdev, delay_on, delay_off);
410412
}
411-
read_unlock(&trig->leddev_list_lock);
413+
read_unlock_irqrestore(&trig->leddev_list_lock, flags);
412414
}
413415

414416
void led_trigger_blink(struct led_trigger *trig,

0 commit comments

Comments
 (0)