runtime: debug.Stack() and runtime.Callers() PCs differ on panic trigger site #34123
Comments
Marking as release-blocker since this sounds serious. But it's going to be hard for us to investigate without a reproduction case. |
@ianlancetaylor I edited the issue with the container's go env and uname values. In the meantime, I can add some more info I found while trying to diagnose this that may (or may not) be helpful:
Here are the different stacktraces built on the same recovery middleware for comparison (some package paths omitted for policy reasons):
|
A couple things you could do to help us debug:
|
@randall77 I have some blockers on our side that keep me from testing 1.13 at work, but I'll be back with these things on 1.12 |
@randall77 here are the things you requested. These were taken on 1.12.9, with changes to the extern, mprof and traceback files to print the program counters: dumps.zip
@ianlancetaylor while I was investigating this and looking into the runtime, I found some more details that might help with diagnosing this and even with coming up with a portable repro
|
Ok, I think I can piece together what's happening here. The problem is that when an instruction faults, we record in the results of |
Change https://golang.org/cl/196962 mentions this issue: |
Thanks for the quick response. In a couple of days you managed to fix this after all the time I spent trying to pin it down, amazing! I haven't had time to set up my docker image to build the revision locally, but I hope I can do so soon. Is the milestone confirmed to be 1.14 only, or is there a chance of a backport to 1.13/1.12? |
Interestingly, on Go 1.13.1 I can't reproduce @randall77's repro from the CL, but I can reproduce it on tip/master:
$ go run main.go
devel +740d2c8c22 Thu Sep 26 18:47:41 2019 +0000
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: expected line 24, got line 23
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x109769e]
goroutine 1 [running]:
main.main.func1()
/Users/emmanuelodeke/Desktop/openSrc/bugs/golang/34123/main.go:36 +0x238
panic(0x10b00e0, 0x116cae0)
/Users/emmanuelodeke/go/src/go.googlesource.com/go/src/runtime/panic.go:679 +0x1b2
main.f()
/Users/emmanuelodeke/Desktop/openSrc/bugs/golang/34123/main.go:24 +0xe
main.main()
/Users/emmanuelodeke/Desktop/openSrc/bugs/golang/34123/main.go:43 +0x74
exit status 2
and on Go 1.12.10:
$ go run main.go
go1.12.10
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: expected line 24, got line 23
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x109480e]
goroutine 1 [running]:
main.main.func1()
/Users/emmanuelodeke/Desktop/openSrc/bugs/golang/34123/main.go:36 +0x29e
panic(0x10ada00, 0x116e480)
/Users/emmanuelodeke/go/src/go.googlesource.com/go/src/runtime/panic.go:522 +0x1b5
main.f()
/Users/emmanuelodeke/Desktop/openSrc/bugs/golang/34123/main.go:24 +0xe
main.main()
/Users/emmanuelodeke/Desktop/openSrc/bugs/golang/34123/main.go:43 +0x67
exit status 2
So that's going to be an interesting backport, but perhaps there is also something extra at play. |
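The "expected line 24, got line 23" check in the output above can be approximated with a small program. This is only a sketch under stated assumptions, not the repro from the CL: `faultyDeref`, `callersLine`, `reportedFaultLine` and `faultLine` are hypothetical names, and the expected line is computed with `runtime.Caller(0)` instead of being hardcoded. It triggers a nil pointer dereference, recovers, and asks `runtime.Callers` plus `runtime.CallersFrames` which line they attribute to the faulting function.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// faultLine records the source line of the nil dereference in faultyDeref.
var faultLine int

// faultyDeref (hypothetical) dereferences p; with p == nil the fault
// happens on the return statement.
//
//go:noinline
func faultyDeref(p *int) int {
	_, _, line, _ := runtime.Caller(0) // line of this call
	faultLine = line + 2               // the dereference is two lines below
	return *p
}

// callersLine walks the runtime.Callers output and returns the line
// reported for the first frame whose function name ends in suffix,
// or 0 if no such frame is found.
func callersLine(suffix string) int {
	pcs := make([]uintptr, 64)
	n := runtime.Callers(0, pcs)
	frames := runtime.CallersFrames(pcs[:n])
	for {
		fr, more := frames.Next()
		if strings.HasSuffix(fr.Function, suffix) {
			return fr.Line
		}
		if !more {
			return 0
		}
	}
}

// reportedFaultLine triggers the fault and returns the line that
// runtime.Callers attributes to faultyDeref while recovering.
func reportedFaultLine() (got int) {
	defer func() {
		if recover() != nil {
			got = callersLine(".faultyDeref")
		}
	}()
	faultyDeref(nil)
	return 0
}

func main() {
	got := reportedFaultLine()
	fmt.Printf("expected line %d, got line %d\n", faultLine, got)
}
```

On a toolchain that includes the fix, the two numbers would be expected to match; a mismatch would be the symptom discussed in this thread.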
Thanks for the explanation @randall77 in https://go-review.googlesource.com/c/go/+/196962/2#message-1b08a694cb1ee65964a69e436361f75eb5c8fe67 |
@randall77 I'm still able to repro this on my local container using gotip:
go version devel +bf49905 Wed Nov 13 15:52:21 2019 +0000 linux/amd64
I can go back and get a new set of dumps like you requested before if needed |
Luis, please do so :)
|
Yes, a new set of dumps would be helpful. |
@odeke-em @randall77 never mind, the fix works on all fronts, thank you for the hard work. It took me a while to figure out, but the issue is with gotip: it uses the toolchain compiled from nightly but not its GOROOT, so I had to run |
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Can't tell for the time being; the only repro we have is on a web app and we still don't have docker images with Go 1.13/1.12.7 on our end
What operating system and processor architecture are you using (go env)?
What did you do?
Quoting my original issue on newrelic/go-agent#100:
I'm working on a custom gin recover middleware that prints the offending stacktrace whenever a panic goes uncaught. In this scenario, we noticed a code path in the application for which the line number on the printed stacktrace is wrong, returning line 59 instead of line 61 (it took me a long time to realize that something was wrong with the stacktrace itself and not the reported line):
We fixed this by writing an implementation based on debug.Stack() but, surprisingly, after deploying the fix to verify it, I'm observing that the internal stacktrace used by txn.NoticeError has the same inaccuracy that our original middleware implementation had:
After some research, the main culprit seems to be a wrong PC being used when retrieving the stack frames, but I'm still not certain about the concrete cause. I changed the stacktrace generator code that manually generated the frames via newrelic/go-agent#101 but I keep noticing the same issue, so I guess it boils down to a difference between how runtime.Stack and runtime.Callers manipulate the systemstack.
The scenario I've been showing is on a specific branch of a web app we use at work (so I'm unable to share it here), and unfortunately my attempts to create a local repro case have so far been to no avail.