-
Notifications
You must be signed in to change notification settings - Fork 18k
runtime: stack trace shows confusing call sequence during panic #13857
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
CC @aclements |
The confusing call sequence is because it's in the process of unwinding defers. It looks like TestEntityIssuesBaseEntityQuerySequentiallyForPromulgatedURL pushed (*Cache).Close on the defer stack, so the panic in Unlock, in a sense, did call (*Cache).Close, though I agree that the presentation of this could perhaps be less confusing. Close wound up waiting on a waitgroup that never finished (and perhaps never will?), so you got stuck in the middle of unwinding the defers. Since, in principle, it was still possible to recover this panic at this point, it couldn't print out the panic. So, given this, I'm curious if you have thoughts on what you would like to see in the stack trace that would make this clearer. |
I'd expect to see some mention of "panic" in the stack trace. I'd expect to see it As it turned out, this particular case happened when there were two nested Here's a simpler piece of code that illustrates the same issue (http://play.golang.org/p/r8F94naBit):
Something along this kind of line would be preferable (without any
Of course it would be even better if there was some indication of |
It should be easy to show the panics in the traceback. In fact, you will see them if you run with GOTRACEBACK=system. They just don't show by default because "panic" doesn't actually exist and the runtime function that implements it gets elided because it's a runtime function. We could just special case traceback to show the runtime.gopanic frame and probably rewrite it to "runtime.panic" (or maybe even "panic") to make it really clear. Giving more information about what frames belong to the defer and where that defer came from is a bit trickier, but I think still doable. The defer is still on the G's defer list while it's running, so most of the information is there, including the exact PC where the defer was created. We might have to record not just that the defer has started, but what SP it's running at to give traceback enough bread crumbs. /cc @randall77 for any thoughts. |
If it's a regression, should we fix it in 1.6 too?
|
I don't think this is important enough to hold up 1.6.0, but we could cherry-pick it for 1.6.1. |
@rogpeppe, any thoughts on how you would like deferrals to appear in the traceback? Adding the panic calls in is straightforward, but I'm not sure what to show for deferrals (if anything). Already any call from a panic frame is a deferral. Is there more information you'd like to see? We could simply mark those as deferred calls so users don't need to know that that's what it means when panic "calls" a function. We could add a line that says where they were deferred from, perhaps printing the function name so it's easier to match up with the later frame? |
I don't think it meets the requirements for point releases though. Isn't
that we just need to adjust a few string comparisons in runtime?
|
I didn't mean to suggest we would do 1.6.1 just for this. But, yes, it is just a few string comparisons that need to be fixed. I've posted my CL at https://go-review.googlesource.com/19492 and will let @rsc decide if we should go ahead and push it for 1.6. |
CL https://golang.org/cl/19492 mentions this issue. |
We used to include panic calls in tracebacks; however, when runtime.panic was renamed to runtime.gopanic in the conversion of the runtime to Go, we missed the special case in showframe that includes panic calls even though they're in package runtime. Fix the function name check in showframe (and, while we're here, fix the other check for "runtime.panic" in runtime/pprof). Since the "runtime.gopanic" name doesn't match what users call panic and hence isn't very user-friendly, make traceback rewrite it to just "panic". Updates #5832, #13857. Fixes #14315. Change-Id: I8059621b41ec043e63d5cfb4cbee479f47f64973 Reviewed-on: https://go-review.googlesource.com/19492 Run-TryBot: Austin Clements <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Russ Cox <[email protected]>
@aclements It would be useful to show where the defer call is made, otherwise it can be hard to find that out. ("deferred from xxxxx" where xxxxx would indicate calling function and source location of defer statement? ) |
@rogpeppe, I thought about that, but it's not obvious to me why that's particularly useful. If the defer is for a closure constructed at the defer point, the source location is only going to be a few lines from what it already prints in the traceback. But perhaps it's more useful when you defer something other than a closure? It looks like that's the case in your original traceback. |
@aclements Is there anything left to do as part of this issue? Do you want to add more information about the location of the defer call? |
@quentinmit, the important part (showing panic calls) is fixed. The defer part is a "nice to have" and may break things that try to parse the backtrace format, so maybe bump this to Go1.8Early? |
@rogpeppe's program now prints:
I think that's clear enough. |
I had a program that seemed to be deadlocking, and it showed the
following confusing stack trace:
It seems fairly clear that sync.Mutex.Unlock should not be calling entitycache.Cache.Close directly.
It seems we're unlocking an already-unlocked mutex (a bug in the code), so sync.Mutex is panicking, but shouldn't this fact be visible in the stack trace?
To reproduce, there's a branch in github.com/rogpeppe/charmstore called stacktrace-oddity; it should be checked out into gopkg.in/juju/charmstore.v5-unstable and its prereqs go-gotten. Running go test in internal/entitycache reproduces the issue reasonably often (reproduced under Go 1.4.1 and Go 1.6 beta). ^\ to see the stack trace when it's deadlocked (go 1.4 panics with a deadlock).
The text was updated successfully, but these errors were encountered: