runtime: "runtime: bad pointer in frame runtime.sellock" crash on some builders #15936
Comments
CC @aclements @RLH
We've seen a few other "bad pointer in frame" failures:
I don't believe the plan9-amd64-9front failure since basically every test crashed on that run, but the others look entirely plausible. This may suggest another bug in the liveness/stack map information. Probably the right next step is to build at one of the failing commits/configurations and look at the machine code and stack map around the call with the bad frame.
I looked at the most recent link above, https://build.golang.org/log/df89109d252dc12ecd0fa587cc00c11cdb0af1c5 . It doesn't seem like a liveness problem. The crash is when
This stack trace shows that the pointer value is already bad (I don't know if it is really
I don't know what is going on but I don't see how there could be any liveness confusion here.
I wrote a CL to clobber all dead pointer fields immediately after all call sites. It didn't find anything interesting, unfortunately.
The fact that this seems to happen on arm64 (the single amd64 case is from a long time ago, and is not in the select code) makes me wonder if something in the channel/select stack transfer code is not following the weak arm64 memory model.
Actually, I did find something, the bug in #16016. I just didn't realize it was a bug. I'm not sure whether that bug is the cause for any of these failures, but it is possible.
I've kicked off three tests on my pine64 linux/arm64 machine, based on the failure logs:
24 hours later, I see no failures. I'm going to try running all.bash in a loop for 24 hours instead.
It may be worth running a stress test on the commit before e3f1c66, which fixed #16016. If we can reproduce it there, we'll know that was probably the cause and that it's fixed.
Good idea. I've started the same three stress tests using 1bdf1c3.
No failures after a day at 1bdf1c3, either. Unfortunately, I'll be away from my hardware for the next two weeks.
Looks like we might punt this to 1.8.
@josharian Have you found any other instances of this? Otherwise I think we should probably close this as not reproducible.
There haven't been any instances of this on the dashboard since June 2nd (#15936 (comment)). I think it's safe to close this.
CL https://golang.org/cl/23924 mentions this issue.
The experiment "clobberdead" clobbers all pointer fields that the compiler thinks are dead, just before and after every safepoint. Useful for debugging the generation of live pointer bitmaps.

Helped find the following issues:

Update #15936
Update #16026
Update #16095
Update #18860

Change-Id: Id1d12f86845e3d93bae903d968b1eac61fc461f9
Reviewed-on: https://go-review.googlesource.com/23924
Run-TryBot: Keith Randall <[email protected]>
Reviewed-by: Matthew Dempsky <[email protected]>
Reviewed-by: Cherry Zhang <[email protected]>
While working on another builder crash, I noticed this once:
https://build.golang.org/log/e3cc1e2bd6c5d4e0c3726213974c3d0a1af12f5a
Filing this issue in case the crash is interesting in itself, or in case it happens again on the same or another builder.