Commit 1678b2c

runtime: implement STW GC in terms of concurrent GC
Currently, STW GC works very differently from concurrent GC. The largest difference is that in concurrent GC, all marking work is done by background mark workers during the mark phase, while in STW GC, all marking work is done by gchelper during the mark termination phase.

This is a consequence of the evolution of Go's GC from a STW GC by incrementally moving work from STW mark termination into concurrent mark. However, at this point, the STW code paths exist only as a debugging mode. Having separate code paths for this increases the maintenance burden and complexity of the garbage collector. At the same time, these code paths aren't tested nearly as well, making it far more likely that they will bit-rot.

This CL reverses the relationship between STW GC and concurrent GC by re-implementing STW GC in terms of concurrent GC. This builds on the new scheduler support for disabling user goroutine scheduling. During sweep termination, it disables user scheduling, so when the GC starts the world again for concurrent mark, it's really only "concurrent" with itself.

There are several code paths that were specific to STW GC that are now vestigial. We'll remove these in follow-up CLs.

Updates #26903.

Change-Id: Ia3883d2fcf7ab1d89bdc9c8ee54bf9bffb32c096
Reviewed-on: https://go-review.googlesource.com/c/134780
Run-TryBot: Austin Clements <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Rick Hudson <[email protected]>
1 parent 6e9fb11 commit 1678b2c

1 file changed, 67 additions and 59 deletions

src/runtime/mgc.go

Lines changed: 67 additions & 59 deletions
@@ -455,6 +455,12 @@ func (c *gcControllerState) startCycle() {
 		c.fractionalUtilizationGoal = 0
 	}

+	// In STW mode, we just want dedicated workers.
+	if debug.gcstoptheworld > 0 {
+		c.dedicatedMarkWorkersNeeded = int64(gomaxprocs)
+		c.fractionalUtilizationGoal = 0
+	}
+
 	// Clear per-P state
 	for _, p := range allp {
 		p.gcAssistTime = 0
@@ -1264,9 +1270,7 @@ func gcStart(trigger gcTrigger) {
 		traceGCStart()
 	}

-	if mode == gcBackgroundMode {
-		gcBgMarkStartWorkers()
-	}
+	gcBgMarkStartWorkers()

 	gcResetMarkState()

@@ -1296,65 +1300,65 @@ func gcStart(trigger gcTrigger) {
 	clearpools()

 	work.cycles++
-	if mode == gcBackgroundMode { // Do as much work concurrently as possible
-		gcController.startCycle()
-		work.heapGoal = memstats.next_gc

-		// Enter concurrent mark phase and enable
-		// write barriers.
-		//
-		// Because the world is stopped, all Ps will
-		// observe that write barriers are enabled by
-		// the time we start the world and begin
-		// scanning.
-		//
-		// Write barriers must be enabled before assists are
-		// enabled because they must be enabled before
-		// any non-leaf heap objects are marked. Since
-		// allocations are blocked until assists can
-		// happen, we want enable assists as early as
-		// possible.
-		setGCPhase(_GCmark)
-
-		gcBgMarkPrepare() // Must happen before assist enable.
-		gcMarkRootPrepare()
-
-		// Mark all active tinyalloc blocks. Since we're
-		// allocating from these, they need to be black like
-		// other allocations. The alternative is to blacken
-		// the tiny block on every allocation from it, which
-		// would slow down the tiny allocator.
-		gcMarkTinyAllocs()
-
-		// At this point all Ps have enabled the write
-		// barrier, thus maintaining the no white to
-		// black invariant. Enable mutator assists to
-		// put back-pressure on fast allocating
-		// mutators.
-		atomic.Store(&gcBlackenEnabled, 1)
-
-		// Assists and workers can start the moment we start
-		// the world.
-		gcController.markStartTime = now
-
-		// Concurrent mark.
-		systemstack(func() {
-			now = startTheWorldWithSema(trace.enabled)
-		})
+	gcController.startCycle()
+	work.heapGoal = memstats.next_gc
+
+	// In STW mode, disable scheduling of user Gs. This may also
+	// disable scheduling of this goroutine, so it may block as
+	// soon as we start the world again.
+	if mode != gcBackgroundMode {
+		schedEnableUser(false)
+	}
+
+	// Enter concurrent mark phase and enable
+	// write barriers.
+	//
+	// Because the world is stopped, all Ps will
+	// observe that write barriers are enabled by
+	// the time we start the world and begin
+	// scanning.
+	//
+	// Write barriers must be enabled before assists are
+	// enabled because they must be enabled before
+	// any non-leaf heap objects are marked. Since
+	// allocations are blocked until assists can
+	// happen, we want enable assists as early as
+	// possible.
+	setGCPhase(_GCmark)
+
+	gcBgMarkPrepare() // Must happen before assist enable.
+	gcMarkRootPrepare()
+
+	// Mark all active tinyalloc blocks. Since we're
+	// allocating from these, they need to be black like
+	// other allocations. The alternative is to blacken
+	// the tiny block on every allocation from it, which
+	// would slow down the tiny allocator.
+	gcMarkTinyAllocs()
+
+	// At this point all Ps have enabled the write
+	// barrier, thus maintaining the no white to
+	// black invariant. Enable mutator assists to
+	// put back-pressure on fast allocating
+	// mutators.
+	atomic.Store(&gcBlackenEnabled, 1)
+
+	// Assists and workers can start the moment we start
+	// the world.
+	gcController.markStartTime = now
+
+	// Concurrent mark.
+	systemstack(func() {
+		now = startTheWorldWithSema(trace.enabled)
 		work.pauseNS += now - work.pauseStart
 		work.tMark = now
-	} else {
-		if trace.enabled {
-			// Switch to mark termination STW.
-			traceGCSTWDone()
-			traceGCSTWStart(0)
-		}
-		t := nanotime()
-		work.tMark, work.tMarkTerm = t, t
-		work.heapGoal = work.heap0
-
-		// Perform mark termination. This will restart the world.
-		gcMarkTermination(memstats.triggerRatio)
+	})
+	// In STW mode, we could block the instant systemstack
+	// returns, so don't do anything important here. Make sure we
+	// block rather than returning to user code.
+	if mode != gcBackgroundMode {
+		Gosched()
 	}

 	semrelease(&work.startSema)
@@ -1468,6 +1472,10 @@ top:
 	// world again.
 	semrelease(&work.markDoneSema)

+	// In STW mode, re-enable user goroutines. These will be
+	// queued to run after we start the world.
+	schedEnableUser(true)
+
 	// endCycle depends on all gcWork cache stats being flushed.
 	// The termination algorithm above ensured that up to
 	// allocations since the ragged barrier.
