-
Notifications
You must be signed in to change notification settings - Fork 149
Fix issues with possible security implications (Part 1) #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
NewEraCracker
wants to merge
4
commits into
OnePlusOSS:oneplus3/6.0.1
from
NewEraCracker:oneplus3/6.0.1-security
Closed
Fix issues with possible security implications (Part 1) #11
NewEraCracker
wants to merge
4
commits into
OnePlusOSS:oneplus3/6.0.1
from
NewEraCracker:oneplus3/6.0.1-security
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
commit b4a1b4f upstream. This fixes CVE-2015-7550. There's a race between keyctl_read() and keyctl_revoke(). If the revoke happens between keyctl_read() checking the validity of a key and the key's semaphore being taken, then the key type read method will see a revoked key. This causes a problem for the user-defined key type because it assumes in its read method that there will always be a payload in a non-revoked key and doesn't check for a NULL pointer. Fix this by making keyctl_read() check the validity of a key after taking semaphore instead of before. I think the bug was introduced with the original keyrings code. This was discovered by a multithreaded test program generated by syzkaller (http://github.com/google/syzkaller). Here's a cleaned up version: #include <sys/types.h> #include <keyutils.h> #include <pthread.h> void *thr0(void *arg) { key_serial_t key = (unsigned long)arg; keyctl_revoke(key); return 0; } void *thr1(void *arg) { key_serial_t key = (unsigned long)arg; char buffer[16]; keyctl_read(key, buffer, 16); return 0; } int main() { key_serial_t key = add_key("user", "%", "foo", 3, KEY_SPEC_USER_KEYRING); pthread_t th[5]; pthread_create(&th[0], 0, thr0, (void *)(unsigned long)key); pthread_create(&th[1], 0, thr1, (void *)(unsigned long)key); pthread_create(&th[2], 0, thr0, (void *)(unsigned long)key); pthread_create(&th[3], 0, thr1, (void *)(unsigned long)key); pthread_join(th[0], 0); pthread_join(th[1], 0); pthread_join(th[2], 0); pthread_join(th[3], 0); return 0; } Build as: cc -o keyctl-race keyctl-race.c -lkeyutils -lpthread Run as: while keyctl-race; do :; done as it may need several iterations to crash the kernel. The crash can be summarised as: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff81279b08>] user_read+0x56/0xa3 ... Call Trace: [<ffffffff81276aa9>] keyctl_read_key+0xb6/0xd7 [<ffffffff81277815>] SyS_keyctl+0x83/0xe0 [<ffffffff815dbb97>] entry_SYSCALL_64_fastpath+0x12/0x6f Change-Id: I7650e04b254ba73cf97d34dfb1410cb30f64b209 Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: David Howells <[email protected]> Tested-by: Dmitry Vyukov <[email protected]> Signed-off-by: James Morris <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: engstk <[email protected]>
[ Upstream commit 38327424b40bcebe2de92d07312c89360ac9229a ]
If __key_link_begin() failed then "edit" would be uninitialized. I've
added a check to fix that.
This allows a random user to crash the kernel, though it's quite
difficult to achieve. There are three ways it can be done as the user
would have to cause an error to occur in __key_link():
(1) Cause the kernel to run out of memory. In practice, this is difficult
to achieve without ENOMEM cropping up elsewhere and aborting the
attempt.
(2) Revoke the destination keyring between the keyring ID being looked up
and it being tested for revocation. In practice, this is difficult to
time correctly because the KEYCTL_REJECT function can only be used
from the request-key upcall process. Further, users can only make use
of what's in /sbin/request-key.conf, though this does including a
rejection debugging test - which means that the destination keyring
has to be the caller's session keyring in practice.
(3) Have just enough key quota available to create a key, a new session
keyring for the upcall and a link in the session keyring, but not then
sufficient quota to create a link in the nominated destination keyring
so that it fails with EDQUOT.
The bug can be triggered using option (3) above using something like the
following:
echo 80 >/proc/sys/kernel/keys/root_maxbytes
keyctl request2 user debug:fred negate @t
The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system. Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.
Assuming the failure occurs, something like the following will be seen:
kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
------------[ cut here ]------------
kernel BUG at ../mm/slab.c:2821!
...
RIP: 0010:[<ffffffff811600f9>] kfree_debugcheck+0x20/0x25
RSP: 0018:ffff8804014a7de8 EFLAGS: 00010092
RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202
R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001
...
Call Trace:
kfree+0xde/0x1bc
assoc_array_cancel_edit+0x1f/0x36
__key_link_end+0x55/0x63
key_reject_and_link+0x124/0x155
keyctl_reject_key+0xb6/0xe0
keyctl_negate_key+0x10/0x12
SyS_keyctl+0x9f/0xe7
do_syscall_64+0x63/0x13a
entry_SYSCALL64_slow_path+0x25/0x25
Fixes: f70e2e0 ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: engstk <[email protected]>
Fix a short sprintf buffer in proc_keys_show(). If the gcc stack protector is turned on, this can cause a panic due to stack corruption. The problem is that xbuf[] is not big enough to hold a 64-bit timeout rendered as weeks: (gdb) p 0xffffffffffffffffULL/(60*60*24*7) $2 = 30500568904943 That's 14 chars plus NUL, not 11 chars plus NUL. Expand the buffer to 16 chars. I think the unpatched code apparently works if the stack-protector is not enabled because on a 32-bit machine the buffer won't be overflowed and on a 64-bit machine there's a 64-bit aligned pointer at one side and an int that isn't checked again on the other side. The panic incurred looks something like: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 OnePlusOSS#1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6 ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679 Call Trace: [<ffffffff813d941f>] dump_stack+0x63/0x84 [<ffffffff811b2cb6>] panic+0xde/0x22a [<ffffffff81352ebe>] ? proc_keys_show+0x3ce/0x3d0 [<ffffffff8109f7f9>] __stack_chk_fail+0x19/0x30 [<ffffffff81352ebe>] proc_keys_show+0x3ce/0x3d0 [<ffffffff81350410>] ? key_validate+0x50/0x50 [<ffffffff8134db30>] ? key_default_cmp+0x20/0x20 [<ffffffff8126b31c>] seq_read+0x2cc/0x390 [<ffffffff812b6b12>] proc_reg_read+0x42/0x70 [<ffffffff81244fc7>] __vfs_read+0x37/0x150 [<ffffffff81357020>] ? security_file_permission+0xa0/0xc0 [<ffffffff81246156>] vfs_read+0x96/0x130 [<ffffffff81247635>] SyS_read+0x55/0xc0 [<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4 Change-Id: I0787d5a38c730ecb75d3c08f28f0ab36295d59e7 Reported-by: Ondrej Kozina <[email protected]> Signed-off-by: David Howells <[email protected]> Tested-by: Ondrej Kozina <[email protected]> Signed-off-by: engstk <[email protected]>
Signed-off-by: Jann Horn <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> (cherry picked from commit b7f76ea) Test: Builds. Change-Id: I6a45b878e7f2aeb9d926630cc724dc4ada79b4d1 Signed-off-by: Jorge Lucangeli Obes <[email protected]> Signed-off-by: engstk <[email protected]>
Author
|
This issues affect the stock kernel as well as most 3rd party kernels that I've audited. It is important those fixes are applied as soon a possible and updates are made available. |
NewEraCracker
pushed a commit
to NewEraCracker/android_kernel_oneplus_msm8996
that referenced
this pull request
Feb 26, 2017
commit f0d6bd1 ("usb: dwc3-msm: Add dynamic pm_qos, clock and bus voting in host mode") adding support for bus voting with PM_QOS_LATENCY. With this change driver can call pm_qos_update_request or pm_qos_remove_requests without adding the request and printing warnings below. pm_qos_update_request() called for unknown object Modules linked in: CPU: 2 PID: 406 Comm: kworker/2:2 Tainted: G W 3.18.31-gfbfce40-dirty OnePlusOSS#11 Workqueue: events dwc3_msm_otg_sm_work Call trace: [<ffffffc0000898a0>] dump_backtrace+0x0/0x270 [<ffffffc000089b24>] show_stack+0x14/0x1c [<ffffffc000d81d90>] dump_stack+0x9c/0xd4 [<ffffffc0000a2cf8>] warn_slowpath_common+0x8c/0xb0 [<ffffffc0000a2d6c>] warn_slowpath_fmt+0x50/0x58 [<ffffffc0000eb32c>] pm_qos_update_request+0x34/0x5c [<ffffffc0006a90ec>] dwc3_msm_perf_vote_update+0x19c/0x2b4 [<ffffffc0006abd90>] dwc3_otg_start_peripheral+0x1c0/0x204 [<ffffffc0006ad984>] dwc3_msm_otg_sm_work+0x590/0x878 [<ffffffc0000b8eb4>] process_one_work+0x260/0x450 [<ffffffc0000b9644>] worker_thread+0x2fc/0x418 [<ffffffc0000bd724>] kthread+0xe4/0xec ---[ end trace 9998738a22dac7cc ]--- Hence fix these warnings by adding proper check in driver for calling pm_qos_update_request/pm_qos_remove_request. Change-Id: I95cc2e68ce497cd2d11a4fbf783868076efb7359 Signed-off-by: Chandana Kishori Chiluveru <[email protected]>
Martinusbe
referenced
this pull request
in GZR-Kernels/kernel_oneplus_msm8996
Mar 28, 2017
Currently CONFIG_TIMER_STATS exposes process information across namespaces:
kernel/time/timer_list.c print_timer():
SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);
/proc/timer_list:
#11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570
Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.
Change-Id: Ice26d74094d3ad563808342c1604ad444234844b
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Acked-by: John Stultz <[email protected]>
Cc: Nicolas Pitre <[email protected]>
Cc: [email protected]
Cc: Lai Jiangshan <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Xing Gao <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Jessica Frazelle <[email protected]>
Cc: [email protected]
Cc: Nicolas Iooss <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Richard Cochran <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Olof Johansson <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: Arjan van de Ven <[email protected]>
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
Signed-off-by: Thomas Gleixner <[email protected]>
Martinusbe
referenced
this pull request
in GZR-Kernels/kernel_oneplus_msm8996
Mar 29, 2017
Currently CONFIG_TIMER_STATS exposes process information across namespaces:
kernel/time/timer_list.c print_timer():
SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);
/proc/timer_list:
#11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570
Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.
Change-Id: Ice26d74094d3ad563808342c1604ad444234844b
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Acked-by: John Stultz <[email protected]>
Cc: Nicolas Pitre <[email protected]>
Cc: [email protected]
Cc: Lai Jiangshan <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Xing Gao <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Jessica Frazelle <[email protected]>
Cc: [email protected]
Cc: Nicolas Iooss <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Richard Cochran <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Olof Johansson <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: Arjan van de Ven <[email protected]>
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
Signed-off-by: Thomas Gleixner <[email protected]>
dustin84
pushed a commit
to dustin84/Oneplus3T
that referenced
this pull request
Apr 16, 2017
Currently CONFIG_TIMER_STATS exposes process information across namespaces:
kernel/time/timer_list.c print_timer():
SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);
/proc/timer_list:
OnePlusOSS#11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570
Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.
Change-Id: Ice26d74094d3ad563808342c1604ad444234844b
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Acked-by: John Stultz <[email protected]>
Cc: Nicolas Pitre <[email protected]>
Cc: [email protected]
Cc: Lai Jiangshan <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Xing Gao <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Jessica Frazelle <[email protected]>
Cc: [email protected]
Cc: Nicolas Iooss <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Richard Cochran <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Olof Johansson <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: Arjan van de Ven <[email protected]>
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: engstk <[email protected]>
NewEraCracker
pushed a commit
to NewEraCracker/android_kernel_oneplus_msm8996
that referenced
this pull request
Apr 20, 2017
commit 45caeaa5ac0b4b11784ac6f932c0ad4c6b67cda0 upstream. As Eric Dumazet pointed out this also needs to be fixed in IPv6. v2: Contains the IPv6 tcp/Ipv6 dccp patches as well. We have seen a few incidents lately where a dst_enty has been freed with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that dst_entry. If the conditions/timings are right a crash then ensues when the freed dst_entry is referenced later on. A Common crashing back trace is: OnePlusOSS#8 [] page_fault at ffffffff8163e648 [exception RIP: __tcp_ack_snd_check+74] . . OnePlusOSS#9 [] tcp_rcv_established at ffffffff81580b64 OnePlusOSS#10 [] tcp_v4_do_rcv at ffffffff8158b54a OnePlusOSS#11 [] tcp_v4_rcv at ffffffff8158cd02 OnePlusOSS#12 [] ip_local_deliver_finish at ffffffff815668f4 OnePlusOSS#13 [] ip_local_deliver at ffffffff81566bd9 OnePlusOSS#14 [] ip_rcv_finish at ffffffff8156656d OnePlusOSS#15 [] ip_rcv at ffffffff81566f06 OnePlusOSS#16 [] __netif_receive_skb_core at ffffffff8152b3a2 OnePlusOSS#17 [] __netif_receive_skb at ffffffff8152b608 OnePlusOSS#18 [] netif_receive_skb at ffffffff8152b690 OnePlusOSS#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3] OnePlusOSS#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3] OnePlusOSS#21 [] net_rx_action at ffffffff8152bac2 OnePlusOSS#22 [] __do_softirq at ffffffff81084b4f OnePlusOSS#23 [] call_softirq at ffffffff8164845c OnePlusOSS#24 [] do_softirq at ffffffff81016fc5 OnePlusOSS#25 [] irq_exit at ffffffff81084ee5 OnePlusOSS#26 [] do_IRQ at ffffffff81648ff8 Of course it may happen with other NIC drivers as well. It's found the freed dst_entry here: 224 static bool tcp_in_quickack_mode(struct sock *sk)↩ 225 {↩ 226 ▹ const struct inet_connection_sock *icsk = inet_csk(sk);↩ 227 ▹ const struct dst_entry *dst = __sk_dst_get(sk);↩ 228 ↩ 229 ▹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩ 230 ▹ ▹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩ 231 }↩ But there are other backtraces attributed to the same freed dst_entry in netfilter code as well. All the vmcores showed 2 significant clues: - Remote hosts behind the default gateway had always been redirected to a different gateway. A rtable/dst_entry will be added for that host. Making more dst_entrys with lower reference counts. Making this more probable. - All vmcores showed a postitive LockDroppedIcmps value, e.g: LockDroppedIcmps 267 A closer look at the tcp_v4_err() handler revealed that do_redirect() will run regardless of whether user space has the socket locked. This can result in a race condition where the same dst_entry cached in sk->sk_dst_entry can be decremented twice for the same socket via: do_redirect()->__sk_dst_check()-> dst_release(). Which leads to the dst_entry being prematurely freed with another socket pointing to it via sk->sk_dst_cache and a subsequent crash. To fix this skip do_redirect() if usespace has the socket locked. Instead let the redirect take place later when user space does not have the socket locked. The dccp/IPv6 code is very similar in this respect, so fixing it there too. As Eric Garver pointed out the following commit now invalidates routes. Which can set the dst->obsolete flag so that ipv4_dst_check() returns null and triggers the dst_release(). Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.") Cc: Eric Garver <[email protected]> Cc: Hannes Sowa <[email protected]> Signed-off-by: Jon Maxwell <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
NewEraCracker
pushed a commit
to NewEraCracker/android_kernel_oneplus_msm8996
that referenced
this pull request
Jul 13, 2017
commit 4dfce57db6354603641132fac3c887614e3ebe81 upstream.
There have been several reports over the years of NULL pointer
dereferences in xfs_trans_log_inode during xfs_fsr processes,
when the process is doing an fput and tearing down extents
on the temporary inode, something like:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
PID: 29439 TASK: ffff880550584fa0 CPU: 6 COMMAND: "xfs_fsr"
[exception RIP: xfs_trans_log_inode+0x10]
OnePlusOSS#9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs]
OnePlusOSS#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs]
OnePlusOSS#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs]
OnePlusOSS#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs]
OnePlusOSS#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs]
OnePlusOSS#14 [ffff8800a57bbe00] evict at ffffffff811e1b67
OnePlusOSS#15 [ffff8800a57bbe28] iput at ffffffff811e23a5
OnePlusOSS#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8
OnePlusOSS#17 [ffff8800a57bbe88] dput at ffffffff811dd06c
OnePlusOSS#18 [ffff8800a57bbea8] __fput at ffffffff811c823b
OnePlusOSS#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e
OnePlusOSS#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27
OnePlusOSS#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c
OnePlusOSS#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d
As it turns out, this is because the i_itemp pointer, along
with the d_ops pointer, has been overwritten with zeros
when we tear down the extents during truncate. When the in-core
inode fork on the temporary inode used by xfs_fsr was originally
set up during the extent swap, we mistakenly looked at di_nextents
to determine whether all extents fit inline, but this misses extents
generated by speculative preallocation; we should be using if_bytes
instead.
This mistake corrupts the in-memory inode, and code in
xfs_iext_remove_inline eventually gets bad inputs, causing
it to memmove and memset incorrect ranges; this became apparent
because the two values in ifp->if_u2.if_inline_ext[1] contained
what should have been in d_ops and i_itemp; they were memmoved due
to incorrect array indexing and then the original locations
were zeroed with memset, again due to an array overrun.
Fix this by properly using i_df.if_bytes to determine the number
of extents, not di_nextents.
Thanks to dchinner for looking at this with me and spotting the
root cause.
[nborisov: backported to 4.4]
Signed-off-by: Eric Sandeen <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Martinusbe
referenced
this pull request
in GZR-Kernels/kernel_oneplus_msm8996
Jul 24, 2017
commit 45caeaa5ac0b4b11784ac6f932c0ad4c6b67cda0 upstream. As Eric Dumazet pointed out this also needs to be fixed in IPv6. v2: Contains the IPv6 tcp/Ipv6 dccp patches as well. We have seen a few incidents lately where a dst_enty has been freed with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that dst_entry. If the conditions/timings are right a crash then ensues when the freed dst_entry is referenced later on. A Common crashing back trace is: #8 [] page_fault at ffffffff8163e648 [exception RIP: __tcp_ack_snd_check+74] . . #9 [] tcp_rcv_established at ffffffff81580b64 #10 [] tcp_v4_do_rcv at ffffffff8158b54a #11 [] tcp_v4_rcv at ffffffff8158cd02 #12 [] ip_local_deliver_finish at ffffffff815668f4 #13 [] ip_local_deliver at ffffffff81566bd9 #14 [] ip_rcv_finish at ffffffff8156656d #15 [] ip_rcv at ffffffff81566f06 #16 [] __netif_receive_skb_core at ffffffff8152b3a2 #17 [] __netif_receive_skb at ffffffff8152b608 #18 [] netif_receive_skb at ffffffff8152b690 #19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3] #20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3] #21 [] net_rx_action at ffffffff8152bac2 #22 [] __do_softirq at ffffffff81084b4f #23 [] call_softirq at ffffffff8164845c #24 [] do_softirq at ffffffff81016fc5 #25 [] irq_exit at ffffffff81084ee5 #26 [] do_IRQ at ffffffff81648ff8 Of course it may happen with other NIC drivers as well. It's found the freed dst_entry here: 224 static bool tcp_in_quickack_mode(struct sock *sk)↩ 225 {↩ 226 ▹ const struct inet_connection_sock *icsk = inet_csk(sk);↩ 227 ▹ const struct dst_entry *dst = __sk_dst_get(sk);↩ 228 ↩ 229 ▹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩ 230 ▹ ▹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩ 231 }↩ But there are other backtraces attributed to the same freed dst_entry in netfilter code as well. All the vmcores showed 2 significant clues: - Remote hosts behind the default gateway had always been redirected to a different gateway. A rtable/dst_entry will be added for that host. Making more dst_entrys with lower reference counts. Making this more probable. - All vmcores showed a postitive LockDroppedIcmps value, e.g: LockDroppedIcmps 267 A closer look at the tcp_v4_err() handler revealed that do_redirect() will run regardless of whether user space has the socket locked. This can result in a race condition where the same dst_entry cached in sk->sk_dst_entry can be decremented twice for the same socket via: do_redirect()->__sk_dst_check()-> dst_release(). Which leads to the dst_entry being prematurely freed with another socket pointing to it via sk->sk_dst_cache and a subsequent crash. To fix this skip do_redirect() if usespace has the socket locked. Instead let the redirect take place later when user space does not have the socket locked. The dccp/IPv6 code is very similar in this respect, so fixing it there too. As Eric Garver pointed out the following commit now invalidates routes. Which can set the dst->obsolete flag so that ipv4_dst_check() returns null and triggers the dst_release(). Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.") Cc: Eric Garver <[email protected]> Cc: Hannes Sowa <[email protected]> Signed-off-by: Jon Maxwell <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Martinusbe <[email protected]>
Martinusbe
referenced
this pull request
in GZR-Kernels/kernel_oneplus_msm8996
Jul 24, 2017
commit 4dfce57db6354603641132fac3c887614e3ebe81 upstream.
There have been several reports over the years of NULL pointer
dereferences in xfs_trans_log_inode during xfs_fsr processes,
when the process is doing an fput and tearing down extents
on the temporary inode, something like:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
PID: 29439 TASK: ffff880550584fa0 CPU: 6 COMMAND: "xfs_fsr"
[exception RIP: xfs_trans_log_inode+0x10]
#9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs]
#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs]
#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs]
#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs]
#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs]
#14 [ffff8800a57bbe00] evict at ffffffff811e1b67
#15 [ffff8800a57bbe28] iput at ffffffff811e23a5
#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8
#17 [ffff8800a57bbe88] dput at ffffffff811dd06c
#18 [ffff8800a57bbea8] __fput at ffffffff811c823b
#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e
#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27
#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c
#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d
As it turns out, this is because the i_itemp pointer, along
with the d_ops pointer, has been overwritten with zeros
when we tear down the extents during truncate. When the in-core
inode fork on the temporary inode used by xfs_fsr was originally
set up during the extent swap, we mistakenly looked at di_nextents
to determine whether all extents fit inline, but this misses extents
generated by speculative preallocation; we should be using if_bytes
instead.
This mistake corrupts the in-memory inode, and code in
xfs_iext_remove_inline eventually gets bad inputs, causing
it to memmove and memset incorrect ranges; this became apparent
because the two values in ifp->if_u2.if_inline_ext[1] contained
what should have been in d_ops and i_itemp; they were memmoved due
to incorrect array indexing and then the original locations
were zeroed with memset, again due to an array overrun.
Fix this by properly using i_df.if_bytes to determine the number
of extents, not di_nextents.
Thanks to dchinner for looking at this with me and spotting the
root cause.
[nborisov: backported to 4.4]
Signed-off-by: Eric Sandeen <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Martinusbe <[email protected]>
nathanchance
pushed a commit
to android-linux-stable/op3
that referenced
this pull request
Apr 10, 2018
[ Upstream commit 5f16a046f8e144c294ef98cd29d9458b5f8273e5 ] FUTEX_OP_OPARG_SHIFT instructs the futex code to treat the 12-bit oparg field as a shift value, potentially leading to a left shift value that is negative or with an absolute value that is significantly larger then the size of the type. UBSAN chokes with: ================================================================================ UBSAN: Undefined behaviour in ./arch/arm64/include/asm/futex.h:60:13 shift exponent -1 is negative CPU: 1 PID: 1449 Comm: syz-executor0 Not tainted 4.11.0-rc4-00005-g977eb52-dirty OnePlusOSS#11 Hardware name: linux,dummy-virt (DT) Call trace: [<ffff200008094778>] dump_backtrace+0x0/0x538 arch/arm64/kernel/traps.c:73 [<ffff200008094cd0>] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:228 [<ffff200008c194a8>] __dump_stack lib/dump_stack.c:16 [inline] [<ffff200008c194a8>] dump_stack+0x120/0x188 lib/dump_stack.c:52 [<ffff200008cc24b8>] ubsan_epilogue+0x18/0x98 lib/ubsan.c:164 [<ffff200008cc3098>] __ubsan_handle_shift_out_of_bounds+0x250/0x294 lib/ubsan.c:421 [<ffff20000832002c>] futex_atomic_op_inuser arch/arm64/include/asm/futex.h:60 [inline] [<ffff20000832002c>] futex_wake_op kernel/futex.c:1489 [inline] [<ffff20000832002c>] do_futex+0x137c/0x1740 kernel/futex.c:3231 [<ffff200008320504>] SYSC_futex kernel/futex.c:3281 [inline] [<ffff200008320504>] SyS_futex+0x114/0x268 kernel/futex.c:3249 [<ffff200008084770>] el0_svc_naked+0x24/0x28 ================================================================================ syz-executor1 uses obsolete (PF_INET,SOCK_PACKET) sock: process `syz-executor0' is using obsolete setsockopt SO_BSDCOMPAT This patch attempts to fix some of this by: * Making encoded_op an unsigned type, so we can shift it left even if the top bit is set. * Casting to signed prior to shifting right when extracting oparg and cmparg * Consider only the bottom 5 bits of oparg when using it as a left-shift value. Whilst I think this catches all of the issues, I'd much prefer to remove this stuff, as I think it's unused and the bugs are copy-pasted between a bunch of architectures. Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
nathanchance
pushed a commit
to android-linux-stable/op3
that referenced
this pull request
Apr 13, 2018
[ Upstream commit 5f16a046f8e144c294ef98cd29d9458b5f8273e5 ] FUTEX_OP_OPARG_SHIFT instructs the futex code to treat the 12-bit oparg field as a shift value, potentially leading to a left shift value that is negative or with an absolute value that is significantly larger then the size of the type. UBSAN chokes with: ================================================================================ UBSAN: Undefined behaviour in ./arch/arm64/include/asm/futex.h:60:13 shift exponent -1 is negative CPU: 1 PID: 1449 Comm: syz-executor0 Not tainted 4.11.0-rc4-00005-g977eb52-dirty OnePlusOSS#11 Hardware name: linux,dummy-virt (DT) Call trace: [<ffff200008094778>] dump_backtrace+0x0/0x538 arch/arm64/kernel/traps.c:73 [<ffff200008094cd0>] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:228 [<ffff200008c194a8>] __dump_stack lib/dump_stack.c:16 [inline] [<ffff200008c194a8>] dump_stack+0x120/0x188 lib/dump_stack.c:52 [<ffff200008cc24b8>] ubsan_epilogue+0x18/0x98 lib/ubsan.c:164 [<ffff200008cc3098>] __ubsan_handle_shift_out_of_bounds+0x250/0x294 lib/ubsan.c:421 [<ffff20000832002c>] futex_atomic_op_inuser arch/arm64/include/asm/futex.h:60 [inline] [<ffff20000832002c>] futex_wake_op kernel/futex.c:1489 [inline] [<ffff20000832002c>] do_futex+0x137c/0x1740 kernel/futex.c:3231 [<ffff200008320504>] SYSC_futex kernel/futex.c:3281 [inline] [<ffff200008320504>] SyS_futex+0x114/0x268 kernel/futex.c:3249 [<ffff200008084770>] el0_svc_naked+0x24/0x28 ================================================================================ syz-executor1 uses obsolete (PF_INET,SOCK_PACKET) sock: process `syz-executor0' is using obsolete setsockopt SO_BSDCOMPAT This patch attempts to fix some of this by: * Making encoded_op an unsigned type, so we can shift it left even if the top bit is set. * Casting to signed prior to shifting right when extracting oparg and cmparg * Consider only the bottom 5 bits of oparg when using it as a left-shift value. Whilst I think this catches all of the issues, I'd much prefer to remove this stuff, as I think it's unused and the bugs are copy-pasted between a bunch of architectures. Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Slim80
pushed a commit
to Slim80/android_kernel_oneplus_msm8996
that referenced
this pull request
Apr 16, 2018
[ Upstream commit 5f16a046f8e144c294ef98cd29d9458b5f8273e5 ] FUTEX_OP_OPARG_SHIFT instructs the futex code to treat the 12-bit oparg field as a shift value, potentially leading to a left shift value that is negative or with an absolute value that is significantly larger then the size of the type. UBSAN chokes with: ================================================================================ UBSAN: Undefined behaviour in ./arch/arm64/include/asm/futex.h:60:13 shift exponent -1 is negative CPU: 1 PID: 1449 Comm: syz-executor0 Not tainted 4.11.0-rc4-00005-g977eb52-dirty OnePlusOSS#11 Hardware name: linux,dummy-virt (DT) Call trace: [<ffff200008094778>] dump_backtrace+0x0/0x538 arch/arm64/kernel/traps.c:73 [<ffff200008094cd0>] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:228 [<ffff200008c194a8>] __dump_stack lib/dump_stack.c:16 [inline] [<ffff200008c194a8>] dump_stack+0x120/0x188 lib/dump_stack.c:52 [<ffff200008cc24b8>] ubsan_epilogue+0x18/0x98 lib/ubsan.c:164 [<ffff200008cc3098>] __ubsan_handle_shift_out_of_bounds+0x250/0x294 lib/ubsan.c:421 [<ffff20000832002c>] futex_atomic_op_inuser arch/arm64/include/asm/futex.h:60 [inline] [<ffff20000832002c>] futex_wake_op kernel/futex.c:1489 [inline] [<ffff20000832002c>] do_futex+0x137c/0x1740 kernel/futex.c:3231 [<ffff200008320504>] SYSC_futex kernel/futex.c:3281 [inline] [<ffff200008320504>] SyS_futex+0x114/0x268 kernel/futex.c:3249 [<ffff200008084770>] el0_svc_naked+0x24/0x28 ================================================================================ syz-executor1 uses obsolete (PF_INET,SOCK_PACKET) sock: process `syz-executor0' is using obsolete setsockopt SO_BSDCOMPAT This patch attempts to fix some of this by: * Making encoded_op an unsigned type, so we can shift it left even if the top bit is set. * Casting to signed prior to shifting right when extracting oparg and cmparg * Consider only the bottom 5 bits of oparg when using it as a left-shift value. Whilst I think this catches all of the issues, I'd much prefer to remove this stuff, as I think it's unused and the bugs are copy-pasted between a bunch of architectures. Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
nathanchance
pushed a commit
to android-linux-stable/op3
that referenced
this pull request
May 30, 2018
[ Upstream commit 2bbea6e117357d17842114c65e9a9cf2d13ae8a3 ] when mounting an ISO filesystem sometimes (very rarely) the system hangs because of a race condition between two tasks. PID: 6766 TASK: ffff88007b2a6dd0 CPU: 0 COMMAND: "mount" #0 [ffff880078447ae0] __schedule at ffffffff8168d605 #1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49 #2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995 OnePlusOSS#3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef OnePlusOSS#4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod] OnePlusOSS#5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50 OnePlusOSS#6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3 OnePlusOSS#7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs] OnePlusOSS#8 [ffff880078447da8] mount_bdev at ffffffff81202570 OnePlusOSS#9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs] OnePlusOSS#10 [ffff880078447e28] mount_fs at ffffffff81202d09 OnePlusOSS#11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f OnePlusOSS#12 [ffff880078447ea8] do_mount at ffffffff81220fee OnePlusOSS#13 [ffff880078447f28] sys_mount at ffffffff812218d6 OnePlusOSS#14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49 RIP: 00007fd9ea914e9a RSP: 00007ffd5d9bf648 RFLAGS: 00010246 RAX: 00000000000000a5 RBX: ffffffff81698c49 RCX: 0000000000000010 RDX: 00007fd9ec2bc210 RSI: 00007fd9ec2bc290 RDI: 00007fd9ec2bcf30 RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000010 R10: 00000000c0ed0001 R11: 0000000000000206 R12: 00007fd9ec2bc040 R13: 00007fd9eb6b2380 R14: 00007fd9ec2bc210 R15: 00007fd9ec2bcf30 ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b This task was trying to mount the cdrom. It allocated and configured a super_block struct and owned the write-lock for the super_block->s_umount rwsem. While exclusively owning the s_umount lock, it called sr_block_ioctl and waited to acquire the global sr_mutex lock. PID: 6785 TASK: ffff880078720fb0 CPU: 0 COMMAND: "systemd-udevd" #0 [ffff880078417898] __schedule at ffffffff8168d605 #1 [ffff880078417900] schedule at ffffffff8168dc59 #2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605 OnePlusOSS#3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838 OnePlusOSS#4 [ffff8800784179d0] down_read at ffffffff8168cde0 OnePlusOSS#5 [ffff8800784179e8] get_super at ffffffff81201cc7 OnePlusOSS#6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de OnePlusOSS#7 [ffff880078417a40] flush_disk at ffffffff8123a94b OnePlusOSS#8 [ffff880078417a88] check_disk_change at ffffffff8123ab50 OnePlusOSS#9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom] OnePlusOSS#10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod] OnePlusOSS#11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86 OnePlusOSS#12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65 OnePlusOSS#13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b OnePlusOSS#14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7 OnePlusOSS#15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf OnePlusOSS#16 [ffff880078417d00] do_last at ffffffff8120d53d OnePlusOSS#17 [ffff880078417db0] path_openat at ffffffff8120e6b2 OnePlusOSS#18 [ffff880078417e48] do_filp_open at ffffffff8121082b OnePlusOSS#19 [ffff880078417f18] do_sys_open at ffffffff811fdd33 OnePlusOSS#20 [ffff880078417f70] sys_open at ffffffff811fde4e OnePlusOSS#21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49 RIP: 00007f29438b0c20 RSP: 00007ffc76624b78 RFLAGS: 00010246 RAX: 0000000000000002 RBX: ffffffff81698c49 RCX: 0000000000000000 RDX: 00007f2944a5fa70 RSI: 00000000000a0800 RDI: 00007f2944a5fa70 RBP: 00007f2944a5f540 R8: 0000000000000000 R9: 0000000000000020 R10: 00007f2943614c40 R11: 0000000000000246 R12: ffffffff811fde4e R13: ffff880078417f78 R14: 000000000000000c R15: 00007f2944a4b010 ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b This task tried to open the cdrom device, the sr_block_open function acquired the global sr_mutex lock. The call to check_disk_change() then saw an event flag indicating a possible media change and tried to flush any cached data for the device. As part of the flush, it tried to acquire the super_block->s_umount lock associated with the cdrom device. This was the same super_block as created and locked by the previous task. The first task acquires the s_umount lock and then the sr_mutex_lock; the second task acquires the sr_mutex_lock and then the s_umount lock. This patch fixes the issue by moving check_disk_change() out of cdrom_open() and let the caller take care of it. Signed-off-by: Maurizio Lombardi <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ppajda
pushed a commit
to ppajda/android_kernel_oneplus_msm8996
that referenced
this pull request
Sep 21, 2018
…ilable If big.LITTLE driver is initialized even when MCPM is unavailable, we get the below warning the first time cpu tries to enter deeper C-states. ------------[ cut here ]------------ WARNING: CPU: 4 PID: 0 at kernel/arch/arm/common/mcpm_entry.c:130 mcpm_cpu_suspend+0x6d/0x74() Modules linked in: CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.19.0-rc3-00007-gaf5a2cb1ad5c-dirty OnePlusOSS#11 Hardware name: ARM-Versatile Express [<c0013fa5>] (unwind_backtrace) from [<c001084d>] (show_stack+0x11/0x14) [<c001084d>] (show_stack) from [<c04fe7f1>] (dump_stack+0x6d/0x78) [<c04fe7f1>] (dump_stack) from [<c0020645>] (warn_slowpath_common+0x69/0x90) [<c0020645>] (warn_slowpath_common) from [<c00206db>] (warn_slowpath_null+0x17/0x1c) [<c00206db>] (warn_slowpath_null) from [<c001cbdd>] (mcpm_cpu_suspend+0x6d/0x74) [<c001cbdd>] (mcpm_cpu_suspend) from [<c03c6919>] (bl_powerdown_finisher+0x21/0x24) [<c03c6919>] (bl_powerdown_finisher) from [<c001218d>] (cpu_suspend_abort+0x1/0x14) [<c001218d>] (cpu_suspend_abort) from [<00000000>] ( (null)) ---[ end trace d098e3fd00000008 ]--- This patch fixes the issue by checking for the availability of MCPM before initializing the big.LITTLE cpuidle driver Signed-off-by: Sudeep Holla <[email protected]> Acked-by: Lorenzo Pieralisi <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Signed-off-by: crian <[email protected]> Signed-off-by: crian <[email protected]> Signed-off-by: crian <[email protected]>
nathanchance
pushed a commit
to android-linux-stable/op3
that referenced
this pull request
Dec 17, 2018
[ Upstream commit c5a94f434c82529afda290df3235e4d85873c5b4 ] It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 OnePlusOSS#3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e OnePlusOSS#4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] OnePlusOSS#5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] OnePlusOSS#6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] OnePlusOSS#7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] OnePlusOSS#8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 OnePlusOSS#9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] OnePlusOSS#10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 OnePlusOSS#11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa OnePlusOSS#12 [ffff881ff43eff18] sys_read at ffffffff811fda62 OnePlusOSS#13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Slim80
pushed a commit
to Slim80/android_kernel_oneplus_msm8996
that referenced
this pull request
Dec 26, 2018
[ Upstream commit c5a94f434c82529afda290df3235e4d85873c5b4 ] It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 OnePlusOSS#1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed OnePlusOSS#2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 OnePlusOSS#3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e OnePlusOSS#4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] OnePlusOSS#5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] OnePlusOSS#6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] OnePlusOSS#7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] OnePlusOSS#8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 OnePlusOSS#9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] OnePlusOSS#10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 OnePlusOSS#11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa OnePlusOSS#12 [ffff881ff43eff18] sys_read at ffffffff811fda62 OnePlusOSS#13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
nathanchance
pushed a commit
to android-linux-stable/op3
that referenced
this pull request
Mar 23, 2019
[ Upstream commit 163d1c3d6f17556ed3c340d3789ea93be95d6c28 ] Back in 2013 Hannes took care of most of such leaks in commit bceaa90 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") But the bug in l2tp_ip6_recvmsg() has not been fixed. syzbot report : BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 CPU: 1 PID: 10996 Comm: syz-executor362 Not tainted 5.0.0+ OnePlusOSS#11 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:600 kmsan_internal_check_memory+0x9f4/0xb10 mm/kmsan/kmsan.c:694 kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601 _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 copy_to_user include/linux/uaccess.h:174 [inline] move_addr_to_user+0x311/0x570 net/socket.c:227 ___sys_recvmsg+0xb65/0x1310 net/socket.c:2283 do_recvmmsg+0x646/0x10c0 net/socket.c:2390 __sys_recvmmsg net/socket.c:2469 [inline] __do_sys_recvmmsg net/socket.c:2492 [inline] __se_sys_recvmmsg+0x1d1/0x350 net/socket.c:2485 __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2485 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x445819 Code: e8 6c b6 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 12 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f64453eddb8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00000000006dac28 RCX: 0000000000445819 RDX: 0000000000000005 RSI: 0000000020002f80 RDI: 0000000000000003 RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dac2c R13: 00007ffeba8f87af R14: 00007f64453ee9c0 R15: 20c49ba5e353f7cf Local variable description: ----addr@___sys_recvmsg Variable was created at: ___sys_recvmsg+0xf6/0x1310 net/socket.c:2244 do_recvmmsg+0x646/0x10c0 net/socket.c:2390 Bytes 0-31 of 32 are uninitialized Memory access of size 32 starts at ffff8880ae62fbb0 Data copied to user address 0000000020000000 Fixes: a32e0ee ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
nathanchance
pushed a commit
to android-linux-stable/op3
that referenced
this pull request
Apr 27, 2019
[ Upstream commit d982b33133284fa7efa0e52ae06b88f9be3ea764 ]
=================================================================
==20875==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 1160 byte(s) in 1 object(s) allocated from:
#0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
#1 0x55bd50005599 in zalloc util/util.h:23
#2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327
OnePlusOSS#3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216
OnePlusOSS#4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69
OnePlusOSS#5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358
OnePlusOSS#6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388
OnePlusOSS#7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583
OnePlusOSS#8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722
OnePlusOSS#9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
OnePlusOSS#10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
OnePlusOSS#11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
OnePlusOSS#12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520
OnePlusOSS#13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Indirect leak of 19 byte(s) in 1 object(s) allocated from:
#0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30)
#1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f)
Signed-off-by: Changbin Du <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Fixes: 6a6cd11 ("perf test: Add test for the sched tracepoint format fields")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ppajda
pushed a commit
to ppajda/android_kernel_oneplus_msm8996
that referenced
this pull request
May 29, 2019
…ilable If big.LITTLE driver is initialized even when MCPM is unavailable, we get the below warning the first time cpu tries to enter deeper C-states. ------------[ cut here ]------------ WARNING: CPU: 4 PID: 0 at kernel/arch/arm/common/mcpm_entry.c:130 mcpm_cpu_suspend+0x6d/0x74() Modules linked in: CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.19.0-rc3-00007-gaf5a2cb1ad5c-dirty OnePlusOSS#11 Hardware name: ARM-Versatile Express [<c0013fa5>] (unwind_backtrace) from [<c001084d>] (show_stack+0x11/0x14) [<c001084d>] (show_stack) from [<c04fe7f1>] (dump_stack+0x6d/0x78) [<c04fe7f1>] (dump_stack) from [<c0020645>] (warn_slowpath_common+0x69/0x90) [<c0020645>] (warn_slowpath_common) from [<c00206db>] (warn_slowpath_null+0x17/0x1c) [<c00206db>] (warn_slowpath_null) from [<c001cbdd>] (mcpm_cpu_suspend+0x6d/0x74) [<c001cbdd>] (mcpm_cpu_suspend) from [<c03c6919>] (bl_powerdown_finisher+0x21/0x24) [<c03c6919>] (bl_powerdown_finisher) from [<c001218d>] (cpu_suspend_abort+0x1/0x14) [<c001218d>] (cpu_suspend_abort) from [<00000000>] ( (null)) ---[ end trace d098e3fd00000008 ]--- This patch fixes the issue by checking for the availability of MCPM before initializing the big.LITTLE cpuidle driver Signed-off-by: Sudeep Holla <[email protected]> Acked-by: Lorenzo Pieralisi <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Signed-off-by: crian <[email protected]> Signed-off-by: crian <[email protected]> Signed-off-by: crian <[email protected]>
ppajda
pushed a commit
to ppajda/android_kernel_oneplus_msm8996
that referenced
this pull request
Sep 8, 2019
commit cf3591ef832915892f2499b7e54b51d4c578b28c upstream.
Revert the commit bd293d071ffe65e645b4d8104f9d8fe15ea13862. The proper
fix has been made available with commit d0a255e795ab ("loop: set
PF_MEMALLOC_NOIO for the worker thread").
Note that the fix offered by commit bd293d071ffe doesn't really prevent
the deadlock from occuring - if we look at the stacktrace reported by
Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex -
i.e. it has already successfully taken the mutex. Changing the mutex
from mutex_lock to mutex_trylock won't help with deadlocks that happen
afterwards.
PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0"
#0 [ffff8813dedfb938] __schedule at ffffffff8173f405
OnePlusOSS#1 [ffff8813dedfb990] schedule at ffffffff8173fa27
OnePlusOSS#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
OnePlusOSS#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
OnePlusOSS#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
OnePlusOSS#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
OnePlusOSS#6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
OnePlusOSS#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
OnePlusOSS#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
OnePlusOSS#9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
OnePlusOSS#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
OnePlusOSS#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
OnePlusOSS#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
OnePlusOSS#13 [ffff8813dedfbec0] kthread at ffffffff810a8428
OnePlusOSS#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242
Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]
Fixes: bd293d071ffe ("dm bufio: fix deadlock with loop device")
Depends-on: d0a255e795ab ("loop: set PF_MEMALLOC_NOIO for the worker thread")
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Change-Id: I6cd6b254cca8cb8e72c197d44b91790c1a799554
ppajda
pushed a commit
to ppajda/android_kernel_oneplus_msm8996
that referenced
this pull request
Apr 22, 2020
commit ca4463bf8438b403596edd0ec961ca0d4fbe0220 upstream.
The VT_DISALLOCATE ioctl can free a virtual console while tty_release()
is still running, causing a use-after-free in con_shutdown(). This
occurs because VT_DISALLOCATE considers a virtual console's
'struct vc_data' to be unused as soon as the corresponding tty's
refcount hits 0. But actually it may be still being closed.
Fix this by making vc_data be reference-counted via the embedded
'struct tty_port'. A newly allocated virtual console has refcount 1.
Opening it for the first time increments the refcount to 2. Closing it
for the last time decrements the refcount (in tty_operations::cleanup()
so that it happens late enough), as does VT_DISALLOCATE.
Reproducer:
#include <fcntl.h>
#include <linux/vt.h>
#include <sys/ioctl.h>
#include <unistd.h>
int main()
{
if (fork()) {
for (;;)
close(open("/dev/tty5", O_RDWR));
} else {
int fd = open("/dev/tty10", O_RDWR);
for (;;)
ioctl(fd, VT_DISALLOCATE, 5);
}
}
KASAN report:
BUG: KASAN: use-after-free in con_shutdown+0x76/0x80 drivers/tty/vt/vt.c:3278
Write of size 8 at addr ffff88806a4ec108 by task syz_vt/129
CPU: 0 PID: 129 Comm: syz_vt Not tainted 5.6.0-rc2 OnePlusOSS#11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191223_100556-anatol 04/01/2014
Call Trace:
[...]
con_shutdown+0x76/0x80 drivers/tty/vt/vt.c:3278
release_tty+0xa8/0x410 drivers/tty/tty_io.c:1514
tty_release_struct+0x34/0x50 drivers/tty/tty_io.c:1629
tty_release+0x984/0xed0 drivers/tty/tty_io.c:1789
[...]
Allocated by task 129:
[...]
kzalloc include/linux/slab.h:669 [inline]
vc_allocate drivers/tty/vt/vt.c:1085 [inline]
vc_allocate+0x1ac/0x680 drivers/tty/vt/vt.c:1066
con_install+0x4d/0x3f0 drivers/tty/vt/vt.c:3229
tty_driver_install_tty drivers/tty/tty_io.c:1228 [inline]
tty_init_dev+0x94/0x350 drivers/tty/tty_io.c:1341
tty_open_by_driver drivers/tty/tty_io.c:1987 [inline]
tty_open+0x3ca/0xb30 drivers/tty/tty_io.c:2035
[...]
Freed by task 130:
[...]
kfree+0xbf/0x1e0 mm/slab.c:3757
vt_disallocate drivers/tty/vt/vt_ioctl.c:300 [inline]
vt_ioctl+0x16dc/0x1e30 drivers/tty/vt/vt_ioctl.c:818
tty_ioctl+0x9db/0x11b0 drivers/tty/tty_io.c:2660
[...]
Fixes: 4001d7b ("vt: push down the tty lock so we can see what is left to tackle")
Cc: <[email protected]> # v3.4+
Reported-by: [email protected]
Acked-by: Jiri Slaby <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Change-Id: Idcba632315100a195b42d3178aa7393f2fda9197
ppajda
pushed a commit
to ppajda/android_kernel_oneplus_msm8996
that referenced
this pull request
Jun 8, 2020
commit 5480e299b5ae57956af01d4839c9fc88a465eeab upstream. Some time ago the block layer was modified such that timeout handlers are called from thread context instead of interrupt context. Make it safe to run the iSCSI timeout handler in thread context. This patch fixes the following lockdep complaint: ================================ WARNING: inconsistent lock state 5.5.1-dbg+ OnePlusOSS#11 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. kworker/7:1H/206 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff88802d9827e8 (&(&session->frwd_lock)->rlock){+.?.}, at: iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi] {IN-SOFTIRQ-W} state was registered at: lock_acquire+0x106/0x240 _raw_spin_lock+0x38/0x50 iscsi_check_transport_timeouts+0x3e/0x210 [libiscsi] call_timer_fn+0x132/0x470 __run_timers.part.0+0x39f/0x5b0 run_timer_softirq+0x63/0xc0 __do_softirq+0x12d/0x5fd irq_exit+0xb3/0x110 smp_apic_timer_interrupt+0x131/0x3d0 apic_timer_interrupt+0xf/0x20 default_idle+0x31/0x230 arch_cpu_idle+0x13/0x20 default_idle_call+0x53/0x60 do_idle+0x38a/0x3f0 cpu_startup_entry+0x24/0x30 start_secondary+0x222/0x290 secondary_startup_64+0xa4/0xb0 irq event stamp: 1383705 hardirqs last enabled at (1383705): [<ffffffff81aace5c>] _raw_spin_unlock_irq+0x2c/0x50 hardirqs last disabled at (1383704): [<ffffffff81aacb98>] _raw_spin_lock_irq+0x18/0x50 softirqs last enabled at (1383690): [<ffffffffa0e2efea>] iscsi_queuecommand+0x76a/0xa20 [libiscsi] softirqs last disabled at (1383682): [<ffffffffa0e2e998>] iscsi_queuecommand+0x118/0xa20 [libiscsi] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&session->frwd_lock)->rlock); <Interrupt> lock(&(&session->frwd_lock)->rlock); *** DEADLOCK *** 2 locks held by kworker/7:1H/206: #0: ffff8880d57bf928 ((wq_completion)kblockd){+.+.}, at: process_one_work+0x472/0xab0 OnePlusOSS#1: ffff88802b9c7de8 ((work_completion)(&q->timeout_work)){+.+.}, at: process_one_work+0x476/0xab0 stack backtrace: CPU: 7 PID: 206 Comm: kworker/7:1H Not tainted 5.5.1-dbg+ OnePlusOSS#11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: kblockd blk_mq_timeout_work Call Trace: dump_stack+0xa5/0xe6 print_usage_bug.cold+0x232/0x23b mark_lock+0x8dc/0xa70 __lock_acquire+0xcea/0x2af0 lock_acquire+0x106/0x240 _raw_spin_lock+0x38/0x50 iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi] scsi_times_out+0xf4/0x440 [scsi_mod] scsi_timeout+0x1d/0x20 [scsi_mod] blk_mq_check_expired+0x365/0x3a0 bt_iter+0xd6/0xf0 blk_mq_queue_tag_busy_iter+0x3de/0x650 blk_mq_timeout_work+0x1af/0x380 process_one_work+0x56d/0xab0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 Fixes: 287922eb0b18 ("block: defer timeouts to a workqueue") Cc: Christoph Hellwig <[email protected]> Cc: Keith Busch <[email protected]> Cc: Lee Duncan <[email protected]> Cc: Chris Leech <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Lee Duncan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Cc: Henri Rosten <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Lee Jones <[email protected]> Change-Id: Ibc2cadfbc65ed1210c1dd9a9c7ee8bf65f2d137a
ppajda
pushed a commit
to ppajda/android_kernel_oneplus_msm8996
that referenced
this pull request
Jun 14, 2020
commit 816f318b2364262a51024096da7ca3b84e78e3b5 upstream. When a seq-virmidi driver is initialized, it registers a rawmidi instance with its callback to create an associated seq kernel client. Currently it's done throughly in rawmidi's register_mutex context. Recently it was found that this may lead to a deadlock another rawmidi device that is being attached with the sequencer is accessed, as both open with the same register_mutex. This was actually triggered by syzkaller, as Dmitry Vyukov reported: ====================================================== [ INFO: possible circular locking dependency detected ] 4.8.0-rc1+ OnePlusOSS#11 Not tainted ------------------------------------------------------- syz-executor/7154 is trying to acquire lock: (register_mutex#5){+.+.+.}, at: [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341 but task is already holding lock: (&grp->list_mutex){++++.+}, at: [<ffffffff850138bb>] check_and_subscribe_port+0x5b/0x5c0 sound/core/seq/seq_ports.c:495 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> OnePlusOSS#1 (&grp->list_mutex){++++.+}: [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746 [<ffffffff863f6199>] down_read+0x49/0xc0 kernel/locking/rwsem.c:22 [< inline >] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:681 [<ffffffff85005c5e>] snd_seq_deliver_event+0x35e/0x890 sound/core/seq/seq_clientmgr.c:822 [<ffffffff85006e96>] > snd_seq_kernel_client_dispatch+0x126/0x170 sound/core/seq/seq_clientmgr.c:2418 [<ffffffff85012c52>] snd_seq_system_broadcast+0xb2/0xf0 sound/core/seq/seq_system.c:101 [<ffffffff84fff70a>] snd_seq_create_kernel_client+0x24a/0x330 sound/core/seq/seq_clientmgr.c:2297 [< inline >] snd_virmidi_dev_attach_seq sound/core/seq/seq_virmidi.c:383 [<ffffffff8502d29f>] snd_virmidi_dev_register+0x29f/0x750 sound/core/seq/seq_virmidi.c:450 [<ffffffff84fd208c>] snd_rawmidi_dev_register+0x30c/0xd40 sound/core/rawmidi.c:1645 [<ffffffff84f816d3>] __snd_device_register.part.0+0x63/0xc0 sound/core/device.c:164 [< inline >] __snd_device_register sound/core/device.c:162 [<ffffffff84f8235d>] snd_device_register_all+0xad/0x110 sound/core/device.c:212 [<ffffffff84f7546f>] snd_card_register+0xef/0x6c0 sound/core/init.c:749 [<ffffffff85040b7f>] snd_virmidi_probe+0x3ef/0x590 sound/drivers/virmidi.c:123 [<ffffffff833ebf7b>] platform_drv_probe+0x8b/0x170 drivers/base/platform.c:564 ...... -> #0 (register_mutex#5){+.+.+.}: [< inline >] check_prev_add kernel/locking/lockdep.c:1829 [< inline >] check_prevs_add kernel/locking/lockdep.c:1939 [< inline >] validate_chain kernel/locking/lockdep.c:2266 [<ffffffff814791f4>] __lock_acquire+0x4d44/0x4d80 kernel/locking/lockdep.c:3335 [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746 [< inline >] __mutex_lock_common kernel/locking/mutex.c:521 [<ffffffff863f0ef1>] mutex_lock_nested+0xb1/0xa20 kernel/locking/mutex.c:621 [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341 [<ffffffff8502e7c7>] midisynth_subscribe+0xf7/0x350 sound/core/seq/seq_midi.c:188 [< inline >] subscribe_port sound/core/seq/seq_ports.c:427 [<ffffffff85013cc7>] check_and_subscribe_port+0x467/0x5c0 sound/core/seq/seq_ports.c:510 [<ffffffff85015da9>] snd_seq_port_connect+0x2c9/0x500 sound/core/seq/seq_ports.c:579 [<ffffffff850079b8>] snd_seq_ioctl_subscribe_port+0x1d8/0x2b0 sound/core/seq/seq_clientmgr.c:1480 [<ffffffff84ffe9e4>] snd_seq_do_ioctl+0x184/0x1e0 sound/core/seq/seq_clientmgr.c:2225 [<ffffffff84ffeae8>] snd_seq_kernel_client_ctl+0xa8/0x110 sound/core/seq/seq_clientmgr.c:2440 [<ffffffff85027664>] snd_seq_oss_midi_open+0x3b4/0x610 sound/core/seq/oss/seq_oss_midi.c:375 [<ffffffff85023d67>] snd_seq_oss_synth_setup_midi+0x107/0x4c0 sound/core/seq/oss/seq_oss_synth.c:281 [<ffffffff8501b0a8>] snd_seq_oss_open+0x748/0x8d0 sound/core/seq/oss/seq_oss_init.c:274 [<ffffffff85019d8a>] odev_open+0x6a/0x90 sound/core/seq/oss/seq_oss.c:138 [<ffffffff84f7040f>] soundcore_open+0x30f/0x640 sound/sound_core.c:639 ...... other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&grp->list_mutex); lock(register_mutex#5); lock(&grp->list_mutex); lock(register_mutex#5); *** DEADLOCK *** ====================================================== The fix is to simply move the registration parts in snd_rawmidi_dev_register() to the outside of the register_mutex lock. The lock is needed only to manage the linked list, and it's not necessarily to cover the whole initialization process. Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Lee Jones <[email protected]> Change-Id: I1f0633d56236b763edc482abf81106e626f6a28a
ppajda
pushed a commit
to ppajda/android_kernel_oneplus_msm8996
that referenced
this pull request
Jul 4, 2020
[ Upstream commit 18f855e574d9799a0e7489f8ae6fd8447d0dd74a ] Stefano reported a crash with using SQPOLL with io_uring: BUG: kernel NULL pointer dereference, address: 00000000000003b0 CPU: 2 PID: 1307 Comm: io_uring-sq Not tainted 5.7.0-rc7 OnePlusOSS#11 RIP: 0010:task_numa_work+0x4f/0x2c0 Call Trace: task_work_run+0x68/0xa0 io_sq_thread+0x252/0x3d0 kthread+0xf9/0x130 ret_from_fork+0x35/0x40 which is task_numa_work() oopsing on current->mm being NULL. The task work is queued by task_tick_numa(), which checks if current->mm is NULL at the time of the call. But this state isn't necessarily persistent, if the kthread is using use_mm() to temporarily adopt the mm of a task. Change the task_tick_numa() check to exclude kernel threads in general, as it doesn't make sense to attempt ot balance for kthreads anyway. Reported-by: Stefano Garzarella <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Lee Jones <[email protected]> Change-Id: Iea013623ba2e5be89c503f27c94e07f91433f5e9
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Current branch is impacted by CVE-2015-7550 and other issues. Should be fixed.