-
Notifications
You must be signed in to change notification settings - Fork 140
Suspend/Resume and S0ix implementation #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suspend/Resume and S0ix implementation #57
Conversation
|
@plbossart @lgirdwood This is the new series for suspend/resume implementation. Could you please help review? |
|
@plbossart @lgirdwood Update for 07/25: I have tested this PR with a 2 pipeline topology on the up squared board containing a playback pipeline and a dmic capture pipeline and it worked without any issues. |
c6264aa to
b7f25d3
Compare
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only got some minor formatting and indentation nitpicks that can be fixed later incrementally.
sound/soc/sof/topology.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline can be removed here and maybe other places.
sound/soc/sof/topology.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dito, newline between set and check ret.
sound/soc/sof/topology.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
sound/soc/sof/topology.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've noticed a lot of these where params are spread over three lines where they should fit int two. It makes it easier for me to read if they are on two (and closer to the conditional check).
plbossart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fix compilation issues, thanks!
sound/soc/sof/topology.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the second patch does not compile, please make sure each patch can be compiled to avoid breaking git bisect
/data/pbossart/ktest/sof-dev/sound/soc/sof/topology.c: In function ‘snd_sof_free_topology’:
/data/pbossart/ktest/sof-dev/sound/soc/sof/topology.c:2130:48: error: ‘struct snd_sof_dev’ has no member named ‘route_list’; did you mean ‘pcm_list’?
sound/soc/sof/topology.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/data/pbossart/ktest/sof-dev/sound/soc/sof/topology.c: In function ‘sof_widget_ready’:
/data/pbossart/ktest/sof-dev/sound/soc/sof/topology.c:1421:5: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if (ret < 0 || reply.rhdr.error < 0) {
sound/soc/sof/pm.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/data/pbossart/ktest/sof-dev/sound/soc/sof/pm.c: In function ‘sof_suspend_streams’:
/data/pbossart/ktest/sof-dev/sound/soc/sof/pm.c:177:9: error: ‘struct snd_sof_pcm’ has no member named ‘restore_stream’
spcm->restore_stream[dir] = 1;
^~
/data/pbossart/ktest/sof-dev/sound/soc/sof/pm.c:194:9: error: ‘struct snd_sof_pcm’ has no member named ‘restore_stream’
spcm->restore_stream[dir] = 1;
^~
|
@plbossart let me look into the compilation issue. I might have to re-order the commits. |
b7f25d3 to
30e05b1
Compare
|
@plbossart @lgirdwood I have addressed both of your comments in the latest update |
Add suspend/resume callbacks for APL. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
…ions Create an route_list in snd_sof_dev that will be used to store the dapm routes while parsing topology. This list will be used for restoring the pipeline connections during suspend/resume. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Save the ipc comp data so that the pipeline components can be restored during PM resume. This patch also handles deleting the DAPM routes before the pipeline widgets are unloaded and the associated data is freed. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Move the code to send ipc for initializing trace into a separate function that can also be called during suspend/resume. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
This will be called during resume to send ipc for pipeline completion. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Decrement device usage counter for both the sof device and the pci/acpi/spi device. Without this change the runtime_usage count never reaches 0 and the device will not suspend. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
set the kcontrol cmd which will be used to send the correct ipc command to restore volume control value during resume. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Allow runtime_pm to be enabled after probe is finished for pci/acpi/spi devices. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Modify the signature for load_firmware to add a flag indicating if this is the first time the DSP is being booted up. Based on the platform specific implementation, this flag can be used to decide whether the firmware can be booted from memory in self-retention or to be downloaded again. Note that this patch does not change the implementation of any platform specific load_firmware() method. It merely adds a flag that can be used to modify the implementation later if needed. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
This patch adds the changes required to save the pcm hw_params that will be used to restart streams during system resume. It also implements the flow for pcm resume trigger and handles the stop/pause_release triggersafter resume, There are 3 possible situations when the system resumes from sleep: 1. If the stream was running at suspend, the hw_params is restored and the stream started from the last know host dma position. 2. If the stream was paused at suspend and the user undoes pause after resume, the SNDRV_PCM_TRIGGER_RESUME does not get invoked for such streams. So these streams need to marked for hw_params to be restored at resume and started from the paused host dma position. 3. If the stream was paused at suspend and the user stops playback after resume, the trigger callback method should return without any further action because the stream has not been set up after resume anyway. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
This is the initial implementation for PM and runtime PM callbacks in the SOF driver. The suspend callback includes: suspend all pcm's stream that are running, send CTX_SAVE ipc, drop all ipc's, release trace dma and then power off the DSP. And the resume callback performs the following steps: load FW, run FW, re-initialize trace, restore pipeline, restore the kcontrol values and finally send the ctx restore ipc to the dsp. The streams that are suspended are resumed by the ALSA resume trigger. If the streams are paused during system suspend, they are marked explicitly so they can be restored during PAUSE_RELEASE. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Update the dai config with the pdm config information after parsing the pdm tokens. This will be needed to restore the dai config after suspend/resume. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Suspend/resume callbacks are currently set only for APL. This patch will prevent errors on other platforms for which the callbacks are not implemented as yet. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Previously, dai config was updated only for one DAI with the same name as the link name. But in the case of duplex pcm's, there will be two DAI's with the same name and both of them need to be updated. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Skip adding virtual routes to the route list. This will prevent sending the routes to the FW during system resume. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
30e05b1 to
6b799ab
Compare
|
@plbossart I have updated this pull request based on the findings on suspend/resume tests on chrome. Could I please request you to review the changes? The main changes are:
|
|
@plbossart, @ranj063 I have verified D0ix functionality with the mentioned 15 patches(6b799ab) |
|
@plbossart I'll merge this in upstream v2 when merged. |
|
@lgirdwood I think we should do a partial merge. There is an awful amount of code related to topology handling that could do in the first upstream batch (and would work without suspend/resume). It may be a good thing to split this set in two, one as groundwork and the second as suspend/resume/runtime_pm proper. |
|
This patch set has become too big to review in details, so I'll merge it and we'll have to deal with issues in follow-up patches. |
Hangbin Liu says:
====================
fix dev null pointer dereference when send packets larger than mtu in collect_md mode
When we send a packet larger than PMTU, we need to reply with
icmp_send(ICMP_FRAG_NEEDED) or icmpv6_send(ICMPV6_PKT_TOOBIG).
But with collect_md mode, kernel will crash while accessing the dst dev
as __metadata_dst_init() init dst->dev to NULL by default. Here is what
the code path looks like, for GRE:
- ip6gre_tunnel_xmit
- ip6gre_xmit_ipv4
- __gre6_xmit
- ip6_tnl_xmit
- if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE
- icmp_send
- net = dev_net(rt->dst.dev); <-- here
- ip6gre_xmit_ipv6
- __gre6_xmit
- ip6_tnl_xmit
- if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE
- icmpv6_send
...
- decode_session4
- oif = skb_dst(skb)->dev->ifindex; <-- here
- decode_session6
- oif = skb_dst(skb)->dev->ifindex; <-- here
We could not fix it in __metadata_dst_init() as there is no dev supplied.
Look in to the __icmp_send()/decode_session{4,6} code we could find the dst
dev is actually not needed. In __icmp_send(), we could get the net by skb->dev.
For decode_session{4,6}, as it was called by xfrm_decode_session_reverse()
in this scenario, the oif is not used by
fl4->flowi4_oif = reverse ? skb->skb_iif : oif;
The reproducer is easy:
ovs-vsctl add-br br0
ip link set br0 up
ovs-vsctl add-port br0 gre0 -- set interface gre0 type=gre options:remote_ip=$dst_addr
ip link set gre0 up
ip addr add ${local_gre6}/64 dev br0
ping6 $remote_gre6 -s 1500
The kernel will crash like
[40595.821651] BUG: kernel NULL pointer dereference, address: 0000000000000108
[40595.822411] #PF: supervisor read access in kernel mode
[40595.822949] #PF: error_code(0x0000) - not-present page
[40595.823492] PGD 0 P4D 0
[40595.823767] Oops: 0000 [#1] SMP PTI
[40595.824139] CPU: 0 PID: 2831 Comm: handler12 Not tainted 5.2.0 #57
[40595.824788] Hardware name: Red Hat KVM, BIOS 1.11.1-3.module+el8.1.0+2983+b2ae9c0a 04/01/2014
[40595.825680] RIP: 0010:__xfrm_decode_session+0x6b/0x930
[40595.826219] Code: b7 c0 00 00 00 b8 06 00 00 00 66 85 d2 0f b7 ca 48 0f 45 c1 44 0f b6 2c 06 48 8b 47 58 48 83 e0 fe 0f 84 f4 04 00 00 48 8b 00 <44> 8b 80 08 01 00 00 41 f6 c4 01 4c 89 e7
ba 58 00 00 00 0f 85 47
[40595.828155] RSP: 0018:ffffc90000a73438 EFLAGS: 00010286
[40595.828705] RAX: 0000000000000000 RBX: ffff8881329d7100 RCX: 0000000000000000
[40595.829450] RDX: 0000000000000000 RSI: ffff8881339e70ce RDI: ffff8881329d7100
[40595.830191] RBP: ffffc90000a73470 R08: 0000000000000000 R09: 000000000000000a
[40595.830936] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90000a73490
[40595.831682] R13: 000000000000002c R14: ffff888132ff1301 R15: ffff8881329d7100
[40595.832427] FS: 00007f5bfcfd6700(0000) GS:ffff88813ba00000(0000) knlGS:0000000000000000
[40595.833266] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40595.833883] CR2: 0000000000000108 CR3: 000000013a368000 CR4: 00000000000006f0
[40595.834633] Call Trace:
[40595.835392] ? rt6_multipath_hash+0x4c/0x390
[40595.835853] icmpv6_route_lookup+0xcb/0x1d0
[40595.836296] ? icmpv6_xrlim_allow+0x3e/0x140
[40595.836751] icmp6_send+0x537/0x840
[40595.837125] icmpv6_send+0x20/0x30
[40595.837494] tnl_update_pmtu.isra.27+0x19d/0x2a0 [ip_tunnel]
[40595.838088] ip_md_tunnel_xmit+0x1b6/0x510 [ip_tunnel]
[40595.838633] gre_tap_xmit+0x10c/0x160 [ip_gre]
[40595.839103] dev_hard_start_xmit+0x93/0x200
[40595.839551] sch_direct_xmit+0x101/0x2d0
[40595.839967] __dev_queue_xmit+0x69f/0x9c0
[40595.840399] do_execute_actions+0x1717/0x1910 [openvswitch]
[40595.840987] ? validate_set.isra.12+0x2f5/0x3d0 [openvswitch]
[40595.841596] ? reserve_sfa_size+0x31/0x130 [openvswitch]
[40595.842154] ? __ovs_nla_copy_actions+0x1b4/0xad0 [openvswitch]
[40595.842778] ? __kmalloc_reserve.isra.50+0x2e/0x80
[40595.843285] ? should_failslab+0xa/0x20
[40595.843696] ? __kmalloc+0x188/0x220
[40595.844078] ? __alloc_skb+0x97/0x270
[40595.844472] ovs_execute_actions+0x47/0x120 [openvswitch]
[40595.845041] ovs_packet_cmd_execute+0x27d/0x2b0 [openvswitch]
[40595.845648] genl_family_rcv_msg+0x3a8/0x430
[40595.846101] genl_rcv_msg+0x47/0x90
[40595.846476] ? __alloc_skb+0x83/0x270
[40595.846866] ? genl_family_rcv_msg+0x430/0x430
[40595.847335] netlink_rcv_skb+0xcb/0x100
[40595.847777] genl_rcv+0x24/0x40
[40595.848113] netlink_unicast+0x17f/0x230
[40595.848535] netlink_sendmsg+0x2ed/0x3e0
[40595.848951] sock_sendmsg+0x4f/0x60
[40595.849323] ___sys_sendmsg+0x2bd/0x2e0
[40595.849733] ? sock_poll+0x6f/0xb0
[40595.850098] ? ep_scan_ready_list.isra.14+0x20b/0x240
[40595.850634] ? _cond_resched+0x15/0x30
[40595.851032] ? ep_poll+0x11b/0x440
[40595.851401] ? _copy_to_user+0x22/0x30
[40595.851799] __sys_sendmsg+0x58/0xa0
[40595.852180] do_syscall_64+0x5b/0x190
[40595.852574] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[40595.853105] RIP: 0033:0x7f5c00038c7d
[40595.853489] Code: c7 20 00 00 75 10 b8 2e 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 8e f7 ff ff 48 89 04 24 b8 2e 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 d7 f7 ff ff 48 89
d0 48 83 c4 08 48 3d 01
[40595.855443] RSP: 002b:00007f5bfcf73c00 EFLAGS: 00003293 ORIG_RAX: 000000000000002e
[40595.856244] RAX: ffffffffffffffda RBX: 00007f5bfcf74a60 RCX: 00007f5c00038c7d
[40595.856990] RDX: 0000000000000000 RSI: 00007f5bfcf73c60 RDI: 0000000000000015
[40595.857736] RBP: 0000000000000004 R08: 0000000000000b7c R09: 0000000000000110
[40595.858613] R10: 0001000800050004 R11: 0000000000003293 R12: 000055c2d8329da0
[40595.859401] R13: 00007f5bfcf74120 R14: 0000000000000347 R15: 00007f5bfcf73c60
[40595.860185] Modules linked in: ip_gre ip_tunnel gre openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sunrpc bochs_drm ttm drm_kms_helper drm pcspkr joydev i2c_piix4 qemu_fw_cfg xfs libcrc32c virtio_net net_failover serio_raw failover ata_generic virtio_blk pata_acpi floppy
[40595.863155] CR2: 0000000000000108
[40595.863551] ---[ end trace 22209bbcacb4addd ]---
v4: Julian Anastasov remind skb->dev also could be NULL in icmp_send. We'd
better still use dst.dev and do a check to avoid crash.
v3: only replace pkg to packets in cover letter. So I didn't update the version
info in the follow up patches.
v2: fix it in __icmp_send() and decode_session{4,6} separately instead of
updating shared dst dev in {ip_md, ip6}_tunnel_xmit.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The debug_pagealloc functionality is useful to catch buggy page allocator users that cause e.g. use after free or double free. When page inconsistency is detected, debugging is often simpler by knowing the call stack of process that last allocated and freed the page. When page_owner is also enabled, we record the allocation stack trace, but not freeing. This patch therefore adds recording of freeing process stack trace to page owner info, if both page_owner and debug_pagealloc are configured and enabled. With only page_owner enabled, this info is not useful for the memory leak debugging use case. dump_page() is adjusted to print the info. An example result of calling __free_pages() twice may look like this (note the page last free stack trace): BUG: Bad page state in process bash pfn:13d8f8 page:ffffc31984f63e00 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x1affff800000000() raw: 01affff800000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000 page dumped because: nonzero _refcount page_owner tracks the page as freed page last allocated via order 0, migratetype Unmovable, gfp_mask 0xcc0(GFP_KERNEL) prep_new_page+0x143/0x150 get_page_from_freelist+0x289/0x380 __alloc_pages_nodemask+0x13c/0x2d0 khugepaged+0x6e/0xc10 kthread+0xf9/0x130 ret_from_fork+0x3a/0x50 page last free stack trace: free_pcp_prepare+0x134/0x1e0 free_unref_page+0x18/0x90 khugepaged+0x7b/0xc10 kthread+0xf9/0x130 ret_from_fork+0x3a/0x50 Modules linked in: CPU: 3 PID: 271 Comm: bash Not tainted 5.3.0-rc4-2.g07a1a73-default+ thesofproject#57 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x85/0xc0 bad_page.cold+0xba/0xbf rmqueue_pcplist.isra.0+0x6c5/0x6d0 rmqueue+0x2d/0x810 get_page_from_freelist+0x191/0x380 __alloc_pages_nodemask+0x13c/0x2d0 __get_free_pages+0xd/0x30 __pud_alloc+0x2c/0x110 copy_page_range+0x4f9/0x630 dup_mmap+0x362/0x480 dup_mm+0x68/0x110 copy_process+0x19e1/0x1b40 _do_fork+0x73/0x310 __x64_sys_clone+0x75/0x80 do_syscall_64+0x6e/0x1e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f10af854a10 ... Link: http://lkml.kernel.org/r/20190820131828.22684-5-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here's the KASAN report: BUG: KASAN: use-after-free in skcipher_crypt_done+0xe8/0x1a8 Read of size 1 at addr ffff00002304001c by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc1-00162-gfcb90d5 thesofproject#57 Hardware name: LS1046A RDB Board (DT) Call trace: dump_backtrace+0x0/0x260 show_stack+0x14/0x20 dump_stack+0xe8/0x144 print_address_description.isra.11+0x64/0x348 __kasan_report+0x11c/0x230 kasan_report+0xc/0x18 __asan_load1+0x5c/0x68 skcipher_crypt_done+0xe8/0x1a8 caam_jr_dequeue+0x390/0x608 tasklet_action_common.isra.13+0x1ec/0x230 tasklet_action+0x24/0x30 efi_header_end+0x1a4/0x370 irq_exit+0x114/0x128 __handle_domain_irq+0x80/0xe0 gic_handle_irq+0x50/0xa0 el1_irq+0xb8/0x180 _raw_spin_unlock_irq+0x2c/0x78 finish_task_switch+0xa4/0x2f8 __schedule+0x3a4/0x890 schedule_idle+0x28/0x50 do_idle+0x22c/0x338 cpu_startup_entry+0x24/0x40 rest_init+0xf8/0x10c arch_call_rest_init+0xc/0x14 start_kernel+0x774/0x7b4 Allocated by task 263: save_stack+0x24/0xb0 __kasan_kmalloc.isra.10+0xc4/0xe0 kasan_kmalloc+0xc/0x18 __kmalloc+0x178/0x2b8 skcipher_edesc_alloc+0x21c/0x1018 skcipher_encrypt+0x84/0x150 crypto_skcipher_encrypt+0x50/0x68 test_skcipher_vec_cfg+0x4d4/0xc10 test_skcipher_vec+0xf8/0x1d8 alg_test_skcipher+0xec/0x230 alg_test.part.44+0x114/0x4a0 alg_test+0x1c/0x60 cryptomgr_test+0x34/0x58 kthread+0x1b8/0x1c0 ret_from_fork+0x10/0x18 Freed by task 0: save_stack+0x24/0xb0 __kasan_slab_free+0x10c/0x188 kasan_slab_free+0x10/0x18 kfree+0x7c/0x298 skcipher_crypt_done+0xe0/0x1a8 caam_jr_dequeue+0x390/0x608 tasklet_action_common.isra.13+0x1ec/0x230 tasklet_action+0x24/0x30 efi_header_end+0x1a4/0x370 The buggy address belongs to the object at ffff000023040000 which belongs to the cache dma-kmalloc-512 of size 512 The buggy address is located 28 bytes inside of 512-byte region [ffff000023040000, ffff000023040200) The buggy address belongs to the page: page:fffffe00006c1000 refcount:1 mapcount:0 mapping:ffff00093200c400 index:0x0 compound_mapcount: 0 flags: 0xffff00000010200(slab|head) raw: 0ffff00000010200 dead000000000100 dead000000000122 ffff00093200c400 raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff00002303ff00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff00002303ff80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff000023040000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff000023040080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff000023040100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: ee38767 ("crypto: caam - support crypto_engine framework for SKCIPHER algorithms") Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
…le_activate In case if isi.nr_pages is 0, we are making sis->pages (which is unsigned int) a huge value in iomap_swapfile_activate() by assigning -1. This could cause a kernel crash in kernel v4.18 (with below signature). Or could lead to unknown issues on latest kernel if the fake big swap gets used. Fix this issue by returning -EINVAL in case of nr_pages is 0, since it is anyway a invalid swapfile. Looks like this issue will be hit when we have pagesize < blocksize type of configuration. I was able to hit the issue in case of a tiny swap file with below test script. https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/scripts/swap-issue.sh kernel crash analysis on v4.18 ============================== On v4.18 kernel, it causes a kernel panic, since sis->pages becomes a huge value and isi.nr_extents is 0. When 0 is returned it is considered as a swapfile over NFS and SWP_FILE is set (sis->flags |= SWP_FILE). Then when swapoff was getting called it was calling a_ops->swap_deactivate() if (sis->flags & SWP_FILE) is true. Since a_ops->swap_deactivate() is NULL in case of XFS, it causes below panic. Panic signature on v4.18 kernel: ======================================= root@qemu:/home/qemu# [ 8291.723351] XFS (loop2): Unmounting Filesystem [ 8292.123104] XFS (loop2): Mounting V5 Filesystem [ 8292.132451] XFS (loop2): Ending clean mount [ 8292.263362] Adding 4294967232k swap on /mnt1/test/swapfile. Priority:-2 extents:1 across:274877906880k [ 8292.277834] Unable to handle kernel paging request for instruction fetch [ 8292.278677] Faulting instruction address: 0x00000000 cpu 0x19: Vector: 400 (Instruction Access) at [c0000009dd5b7ad0] pc: 0000000000000000 lr: c0000000003eb9dc: destroy_swap_extents+0xfc/0x120 sp: c0000009dd5b7d50 msr: 8000000040009033 current = 0xc0000009b6710080 paca = 0xc00000003ffcb280 irqmask: 0x03 irq_happened: 0x01 pid = 5604, comm = swapoff Linux version 4.18.0 (riteshh@xxxxxxx) (gcc version 8.4.0 (Ubuntu 8.4.0-1ubuntu1~18.04)) #57 SMP Wed Mar 3 01:33:04 CST 2021 enter ? for help [link register ] c0000000003eb9dc destroy_swap_extents+0xfc/0x120 [c0000009dd5b7d50] c0000000025a7058 proc_poll_event+0x0/0x4 (unreliable) [c0000009dd5b7da0] c0000000003f0498 sys_swapoff+0x3f8/0x910 [c0000009dd5b7e30] c00000000000bbe4 system_call+0x5c/0x70 Exception: c01 (System Call) at 00007ffff7d208d8 Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> [djwong: rework the comment to provide more details] Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
Since priv->rx_mapping[i] is maped in moxart_mac_open(), we should unmap it from moxart_mac_stop(). Fixes 2 warnings. 1. During error unwinding in moxart_mac_probe(): "goto init_fail;", then moxart_mac_free_memory() calls dma_unmap_single() with priv->rx_mapping[i] pointers zeroed. WARNING: CPU: 0 PID: 1 at kernel/dma/debug.c:963 check_unmap+0x704/0x980 DMA-API: moxart-ethernet 92000000.mac: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=1600 bytes] CPU: 0 PID: 1 Comm: swapper Not tainted 5.19.0+ thesofproject#60 Hardware name: Generic DT based system unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x34/0x44 dump_stack_lvl from __warn+0xbc/0x1f0 __warn from warn_slowpath_fmt+0x94/0xc8 warn_slowpath_fmt from check_unmap+0x704/0x980 check_unmap from debug_dma_unmap_page+0x8c/0x9c debug_dma_unmap_page from moxart_mac_free_memory+0x3c/0xa8 moxart_mac_free_memory from moxart_mac_probe+0x190/0x218 moxart_mac_probe from platform_probe+0x48/0x88 platform_probe from really_probe+0xc0/0x2e4 2. After commands: ip link set dev eth0 down ip link set dev eth0 up WARNING: CPU: 0 PID: 55 at kernel/dma/debug.c:570 add_dma_entry+0x204/0x2ec DMA-API: moxart-ethernet 92000000.mac: cacheline tracking EEXIST, overlapping mappings aren't supported CPU: 0 PID: 55 Comm: ip Not tainted 5.19.0+ thesofproject#57 Hardware name: Generic DT based system unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x34/0x44 dump_stack_lvl from __warn+0xbc/0x1f0 __warn from warn_slowpath_fmt+0x94/0xc8 warn_slowpath_fmt from add_dma_entry+0x204/0x2ec add_dma_entry from dma_map_page_attrs+0x110/0x328 dma_map_page_attrs from moxart_mac_open+0x134/0x320 moxart_mac_open from __dev_open+0x11c/0x1ec __dev_open from __dev_change_flags+0x194/0x22c __dev_change_flags from dev_change_flags+0x14/0x44 dev_change_flags from devinet_ioctl+0x6d4/0x93c devinet_ioctl from inet_ioctl+0x1ac/0x25c v1 -> v2: Extraneous change removed. Fixes: 6c821bd ("net: Add MOXA ART SoCs ethernet driver") Signed-off-by: Sergei Antonov <saproj@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220819110519.1230877-1-saproj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
…) to avoid crash [ Upstream commit 68b99e9 ] When CPU 0 is offline and intel_powerclamp is used to inject idle, it generates kernel BUG: BUG: using smp_processor_id() in preemptible [00000000] code: bash/15687 caller is debug_smp_processor_id+0x17/0x20 CPU: 4 PID: 15687 Comm: bash Not tainted 5.19.0-rc7+ thesofproject#57 Call Trace: <TASK> dump_stack_lvl+0x49/0x63 dump_stack+0x10/0x16 check_preemption_disabled+0xdd/0xe0 debug_smp_processor_id+0x17/0x20 powerclamp_set_cur_state+0x7f/0xf9 [intel_powerclamp] ... ... Here CPU 0 is the control CPU by default and changed to the current CPU, if CPU 0 offlined. This check has to be performed under cpus_read_lock(), hence the above warning. Use get_cpu() instead of smp_processor_id() to avoid this BUG. Suggested-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patchset deals with implementation the suspend/resume flow for PM in the SOF driver.
The main differences from the previous pull request are as follows:
runtime_pm for these devices.
Still only tested on up squared board with a single pipeline. Multiple pipelines testing is WIP.