runtime: fatal error: throwOnGCWork (and general builder instability) #29124
CC @aclements
Dup of #27993. (Definitely a release-blocker. I'm actively working on this one.)
Dup of #25519. Only affects debug call injection, so not high-priority. This isn't technically Android-specific, though it seems to happen a lot there, so maybe we should just disable that test on Android.
Stuck doing a syscall.Open?
I believe the vet process was using ~100% CPU when I sent it SIGQUIT, so it was probably in an infinite loop somehow.
Change https://golang.org/cl/154112 mentions this issue:
This fixes a few different issues that led to hangs and general flakiness in the TestDebugCall* tests.

1. This fixes missing wake-ups in two error paths of the SIGTRAP signal handler. If the goroutine was in an unknown state, or if there was an unknown debug call status, we currently don't wake the injection coordinator. These are terminal states, so this resulted in a hang.

2. This adds a retry if the target goroutine is in a transient state that prevents us from injecting a call. The most common failure mode here is that the target goroutine is in _Grunnable, but this was previously masked because it deadlocked the test.

3. Related to 2, this switches the "ready" signal from the target goroutine from a blocking channel send to a non-blocking channel send. This makes it much less likely that we'll catch this goroutine while it's in the runtime performing that send.

4. This increases GOMAXPROCS from 2 to 8 during these tests. With the current setting of 2, we can have at most the non-preemptible goroutine we're injecting a call in to and the goroutine that's trying to make it exit. If anything else comes along, it can deadlock. One particular case I observed was in TestDebugCallGC, where runtime.GC() returns before the forEachP that prepares sweeping on all goroutines has finished. When this happens, the forEachP blocks on the non-preemptible loop, which means we now have at least three goroutines that need to run.

Fixes #25519.
Updates #29124.

Change-Id: I7bc41dc0b865b7d0bb379cb654f9a1218bc37428
Reviewed-on: https://go-review.googlesource.com/c/154112
Run-TryBot: Austin Clements <[email protected]>
Reviewed-by: Michael Knyszek <[email protected]>
Seen on several builders:
linux/amd64: https://build.golang.org/log/2ce33c8fa9981d37f3d5a81e384285c0be6df37a
android/arm: https://build.golang.org/log/34a763d60d9908f39856dd7380948081bef8fff9 (although it looks like the error is from the host running vet)
I noticed the errors because the Android builder has started to time out a lot:
https://build.golang.org/log/8b9d9127e958601e4a0a3aa71243463d814bd92e
https://build.golang.org/log/8aa5d8614243f1aea5354ba3598e66143de3249c
https://build.golang.org/log/80435e9f9573e9eec360dcb0a69772ca4ccf991a
https://build.golang.org/log/24e04fb21e1383cd002c3a575398f38cf15c7532
For this run I had to manually send a SIGQUIT signal to a vet process that hung for hours:
https://build.golang.org/log/994e945a2a0f4e4ac2efdbabf9c7fedcf402553e
Timeouts also happen on several builders unrelated to mobile:
linux/amd64: https://build.golang.org/log/f1d308e0e68a514f98c242fce22f83b790cc8304
linux/amd64: https://build.golang.org/log/bebe0b5a38b0218a6814e60769645435e0a48a90
darwin/amd64: https://build.golang.org/log/75959bdf3602be17a3247ce5f1735a323392794d