runtime: redesign GC synchronization #12041
CL https://golang.org/cl/15890 mentions this issue.

CL https://golang.org/cl/16297 mentions this issue.
GC assists must block until the assist can be satisfied (either through stealing credit or doing work) or the GC cycle ends. Currently, this is implemented as a retry loop with a 100 µs delay. This obviously isn't ideal, as it wastes CPU and delays mutator execution. It also has the somewhat peculiar downside that sleeping a G requires allocation, and this requires working around recursive allocation.

Replace this timed delay with a proper scheduling queue. When an assist can't be satisfied immediately, it adds the allocating G to a queue and parks it. Any time background scan credit is flushed, it consults this queue, directly satisfies the debt of queued assists, and wakes up satisfied assists before flushing any remaining credit to the background credit pool.

No effect on the go1 benchmarks. Slightly speeds up the garbage benchmark.

name              old time/op  new time/op  delta
XBenchGarbage-12  5.81ms ± 1%  5.72ms ± 4%  -1.65%  (p=0.011 n=20+20)

Updates #12041.

Change-Id: I8ee3b6274dd097b12b10a8030796a958a4b0e7b7
Reviewed-on: https://go-review.googlesource.com/15890
Reviewed-by: Rick Hudson <[email protected]>
Run-TryBot: Austin Clements <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
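To make the queue-and-park mechanism concrete, here is a minimal, self-contained Go sketch of the idea in this commit message. It is not the runtime code: creditPool, park, and flush are illustrative names, the real implementation parks Gs through the scheduler rather than channels, and the bookkeeping of the real assist/credit protocol is far more involved.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// assist models a goroutine blocked until its scan-work debt is paid.
type assist struct {
	debt int64         // remaining work owed
	done chan struct{} // closed when the debt is satisfied
}

// creditPool models banked background scan credit plus a queue of
// parked assists, protected by one lock.
type creditPool struct {
	mu     sync.Mutex
	credit int64
	queue  []*assist
}

// park first tries to steal banked credit; if that doesn't cover the
// debt, it enqueues the assist and blocks until flush pays it off.
func (p *creditPool) park(debt int64) {
	a := &assist{debt: debt, done: make(chan struct{})}
	p.mu.Lock()
	steal := min64(p.credit, a.debt)
	p.credit -= steal
	a.debt -= steal
	if a.debt == 0 {
		p.mu.Unlock()
		return // satisfied without blocking
	}
	p.queue = append(p.queue, a)
	p.mu.Unlock()
	<-a.done // parked: no spinning, no timed retry loop
}

// flush pays queued assists first, waking any whose debt reaches
// zero, and only banks the remainder in the background pool.
func (p *creditPool) flush(credit int64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for len(p.queue) > 0 && credit > 0 {
		a := p.queue[0]
		pay := min64(credit, a.debt)
		a.debt -= pay
		credit -= pay
		if a.debt == 0 {
			close(a.done) // wake the satisfied assist
			p.queue = p.queue[1:]
		}
	}
	p.credit += credit
}

func min64(a, b int64) int64 {
	if a < b {
		return a
	}
	return b
}

func main() {
	p := &creditPool{}
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			p.park(100)
			fmt.Printf("assist %d satisfied\n", id)
		}(i)
	}
	// A background "worker" flushing credit as it earns it.
	for i := 0; i < 10; i++ {
		p.flush(50)
		time.Sleep(time.Millisecond)
	}
	wg.Wait()
}
```

The key property is the one the commit message calls out: flushed credit pays parked assists before it is banked, so a blocked assist wakes as soon as enough background work has been done, rather than after a fixed 100 µs delay.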
Currently dedicated mark workers participate in the getfull barrier during concurrent mark. However, the getfull barrier wasn't designed for concurrent work and this causes no end of headaches.

In the concurrent setting, participants come and go. This makes mark completion susceptible to live-lock: since dedicated workers are only periodically polling for completion, it's possible for the program to be in some transient worker each time one of the dedicated workers wakes up to check if it can exit the getfull barrier. It also complicates reasoning about the system because dedicated workers participate directly in the getfull barrier, but transient workers must instead use trygetfull because they have exit conditions that aren't captured by getfull (e.g., fractional workers exit when preempted). The complexity of implementing these exit conditions contributed to #11677. Furthermore, the getfull barrier is inefficient because we could be running user code instead of spinning on a P. In effect, we're dedicating 25% of the CPU to marking even if that means we have to spin to make that 25%. It also causes issues on Windows because we can't actually sleep for 100 µs (#8687).

Fix this by making dedicated workers no longer participate in the getfull barrier. Instead, dedicated workers simply return to the scheduler when they fail to get more work, regardless of what other workers are doing, and the scheduler only starts new dedicated workers if there's work available. Everything that needs to be handled by this barrier is already handled by detection of mark completion.

This makes the system much more symmetric because all workers and assists now use trygetfull during concurrent mark. It also loosens the 25% CPU target so that we can give some of that 25% back to user code if there isn't enough work to keep the mark worker busy. And it eliminates the problematic 100 µs sleep on Windows during concurrent mark (though not during mark termination).

The downside of this is that if we hit a bottleneck in the heap graph that then expands back out, the system may shut down dedicated workers and take a while to start them back up. We'll address this in the next commit.

Updates #12041 and #8687.

No effect on the go1 benchmarks. This slows down the garbage benchmark by 9%, but we'll more than make it up in the next commit.

name              old time/op  new time/op  delta
XBenchGarbage-12  5.80ms ± 2%  6.32ms ± 4%  +9.03%  (p=0.000 n=20+20)

Change-Id: I65100a9ba005a8b5cf97940798918672ea9dd09b
Reviewed-on: https://go-review.googlesource.com/16297
Reviewed-by: Rick Hudson <[email protected]>
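A hedged sketch of the behavioral change for dedicated workers, using assumed toy names (workQueue, tryGetFull, scan) rather than the runtime's actual identifiers; the scheduler logic that restarts a worker when work reappears is elided.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// workQueue is a toy stand-in for the runtime's global work buffers.
type workQueue struct {
	mu   sync.Mutex
	work []int
}

// tryGetFull is the non-blocking fetch: it returns immediately
// whether or not work is available (cf. the runtime's trygetfull).
func (q *workQueue) tryGetFull() (int, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.work) == 0 {
		return 0, false
	}
	w := q.work[len(q.work)-1]
	q.work = q.work[:len(q.work)-1]
	return w, true
}

// Before: a dedicated worker spun in a timed polling barrier,
// holding its P even when the queue was empty.
func dedicatedWorkerOld(q *workQueue, markDone func() bool) {
	for !markDone() {
		if w, ok := q.tryGetFull(); ok {
			scan(w)
			continue
		}
		time.Sleep(100 * time.Microsecond) // burn the P waiting for work
	}
}

// After: the worker drains work non-blockingly and returns to the
// scheduler the moment it runs dry; the scheduler only restarts a
// dedicated worker when work is available again.
func dedicatedWorkerNew(q *workQueue) {
	for {
		w, ok := q.tryGetFull()
		if !ok {
			return // give the P back to user code
		}
		scan(w)
	}
}

func scan(w int) { fmt.Println("scanned", w) }

func main() {
	q := &workQueue{work: []int{1, 2, 3}}
	dedicatedWorkerNew(q) // drains and returns instead of spinning
}
```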
Is this done?
I believe so; Austin did the work, I just reviewed it.
There is still a timed polling loop in getfull (point 5), but it only happens during STW mark termination and doesn't seem to be causing problems at this point (it may extend mark termination time, but I don't have concrete evidence for that). I believe every other point is resolved, so I'll close this issue.
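For contrast with the non-blocking path, here is a sketch of the getfull-style blocking fetch that remains in STW mark termination, reusing the toy workQueue from the worker sketch above. The done predicate is an assumption standing in for the runtime's completion detection, which is considerably more involved.

```go
// getFull blocks until work appears or done fires, sleeping 100µs
// per retry. This timed polling loop is what remains in STW mark
// termination; the concurrent paths now use tryGetFull instead.
func (q *workQueue) getFull(done func() bool) (int, bool) {
	for {
		if w, ok := q.tryGetFull(); ok {
			return w, true
		}
		if done() {
			return 0, false
		}
		time.Sleep(100 * time.Microsecond)
	}
}
```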
The barrier mechanism for transitioning from concurrent mark to mark termination has grown to be incredibly complex and fragile. It evolved incrementally from the getfull barrier used by the 1.4 STW GC, but while this barrier was acceptable for STW GC (and for STW mark termination in concurrent GC), its use during concurrent marking has well exceeded its original design criteria.
There are several difficulties:
For all of these reasons, I think we need to redesign the GC barrier for 1.6.
This isn't a concrete proposal, but there are some properties I think the synchronization should have:
Most likely the solution to this is deeply tied to the solution to #11970.
@RLH @rsc