@@ -208,70 +208,51 @@ limit, I propose the following calculation:
- Where ![`O_M`](48409/inl3.png) is (per `runtime/metrics` memory names)
-
- ```
- /memory/classes/metadata/mcache/free:bytes +
- /memory/classes/metadata/mcache/inuse:bytes +
- /memory/classes/metadata/mspan/free:bytes +
- /memory/classes/metadata/mspan/inuse:bytes +
- /memory/classes/metadata/other:bytes +
- /memory/classes/os-stacks:bytes +
- /memory/classes/other:bytes +
- /memory/classes/profiling/buckets:bytes
- ```
-
- and ![`O_I`](48409/inl4.png) is the maximum of
- `/memory/classes/heap/unused:bytes + /memory/classes/heap/free:bytes` over the
- last GC cycle.
-
- These terms (called ![`O`](48409/inl5.png), for "overheads") account for all
- memory that is not accounted for by the GC pacer (from the [new pacer
- proposal](https://github.com/golang/proposal/blob/329650d4723a558c2b76b81b4995fc5c267e6bc1/design/44167-gc-pacer-redesign.md#heap-goal)).
+ `T` is the total amount of memory mapped by the Go runtime.
+ `F` is the amount of free and unscavenged memory the Go
+ runtime is holding.
+ `A` is the number of bytes in allocated heap objects at the
+ time `\hat{L}` is computed.
+
+ The second term, `T - F - A`, represents the sum of
+ non-heap overheads.
+ Free and unscavenged memory is specifically excluded because this is memory that
+ the runtime might use in the near future, and the scavenger is specifically
+ instructed to leave the memory up to the heap goal unscavenged.
+ Failing to exclude free and unscavenged memory could lead to a very poor
+ accounting of non-heap overheads.
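
A rough sketch of how that overhead term can be estimated today with `runtime/metrics`, assuming `T`, `F`, and `A` correspond to `/memory/classes/total:bytes`, `/memory/classes/heap/free:bytes`, and `/memory/classes/heap/objects:bytes` (that mapping is my assumption for illustration, not something the proposal pins down):

```go
// Rough estimate of the non-heap overhead term (T - F - A) using
// runtime/metrics. The metric mapping is an assumption made for
// illustration only.
package main

import (
	"fmt"
	"runtime/metrics"
)

func main() {
	samples := []metrics.Sample{
		{Name: "/memory/classes/total:bytes"},        // T: total memory mapped by the runtime
		{Name: "/memory/classes/heap/free:bytes"},    // F: free and unscavenged heap memory
		{Name: "/memory/classes/heap/objects:bytes"}, // A: bytes in allocated heap objects
	}
	metrics.Read(samples)

	t := samples[0].Value.Uint64()
	f := samples[1].Value.Uint64()
	a := samples[2].Value.Uint64()

	// Everything the runtime has mapped that is neither free-and-unscavenged
	// memory nor live heap objects.
	fmt.Printf("estimated non-heap overheads: %d bytes\n", t-f-a)
}
```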
With `\hat{L}` fully defined, our heap goal for cycle
- ![`n`](48409/inl6.png) (![`N_n`](48409/inl7.png)) is a straightforward extension
+ `n` (`N_n`) is a straightforward extension
of the existing one.
Where
- * ![`M_n`](48409/inl8.png) is equal to bytes marked at the end of GC n's mark
+ * `M_n` is equal to bytes marked at the end of GC n's mark
phase
- * ![`S_n`](48409/inl9.png) is equal to stack bytes at the beginning of GC n's
+ * `S_n` is equal to stack bytes at the beginning of GC n's
mark phase
- * ![`G_n`](48409/inl10.png) is equal to bytes of globals at the beginning of GC
+ * `G_n` is equal to bytes of globals at the beginning of GC
n's mark phase
- * ![`\gamma`](48409/inl11.png) is equal to
-   ![`1+\frac{GOGC}{100}`](48409/inl12.png)
+ * `\gamma` is equal to
+   `1+\frac{GOGC}{100}`

then
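
The heap-goal formula itself is given as a figure; a plausible reconstruction from the definitions above (an assumption on my part, not a verbatim quote of the proposal) is the existing GOGC-based goal clamped at `\hat{L}`:

```latex
N_n = \min\left(\gamma \cdot (M_n + S_n + G_n),\ \hat{L}\right)
```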

- Over the course of a GC cycle ![`O_M`](48409/inl3.png) remains stable because it
- increases monotonically.
- There's only one situation where ![`O_M`](48409/inl3.png) can grow tremendously
- (relative to active heap objects) in a short period of time (< 1 GC cycle), and
- that's when `GOMAXPROCS` increases.
- So, I also propose recomputing this value at that time.
-
- Meanwhile ![`O_I`](48409/inl4.png) stays relatively stable (and doesn't have a
- sawtooth pattern, as one might expect from a sum of idle heap memory) because
- object sweeping occurs incrementally, specifically proportionally to how fast
- the application is allocating.
- Furthermore, this value is guaranteed to stay relatively stable across a single
- GC cycle, because the total size of the heap for one GC cycle is bounded by the
- heap goal.
- Taking the highwater mark of this value places a conservative upper bound on the
- total impact of this memory, so the heap goal stays safe from major changes.
-
- One concern with the above definition of ![`\hat{L}`](48409/inl1.png) is that it
- is fragile to changes to the Go GC.
- In the past, seemingly unrelated changes to the Go runtime have impacted the
- GC's pacer, usually due to an unforeseen influence on the accounting that the
- pacer relies on.
- To minimize the impact of these accidents on the conversion function, I propose
- centralizing and categorizing all the variables used in accounting, and writing
- tests to ensure that expected properties of the accounting remain intact.
+ Over the course of a GC cycle, non-heap overheads remain stable because they
+ mostly increase monotonically.
+ However, the GC needs to be responsive to any change in non-heap overheads.
+ Therefore, I propose a more heavyweight recomputation of the heap goal every
+ time it's needed, as opposed to computing it only once per cycle.
+ This also means the GC trigger point needs to be dynamically recomputable.
+ This check will create additional overheads, but they're likely to be low, as
+ the GC's internal statistics are updated only on slow paths.
+
+ The nice thing about this definition of `\hat{L}` is that
+ it's fairly robust to changes to the Go GC, since total mapped memory, free and
+ unscavenged memory, and bytes allocated in objects are fairly fundamental
+ properties (especially to any tracing GC design).
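
A hypothetical sketch of the on-demand recomputation described above, with invented names (this is not the runtime's actual code):

```go
// Hypothetical sketch: the heap goal as a function of current statistics,
// recomputed whenever it is needed rather than cached once per GC cycle.
// All identifiers are invented for illustration.
package pacer

type controller struct {
	memoryLimit uint64 // the proposed soft memory limit, in bytes
	mappedBytes uint64 // T: total memory mapped by the runtime
	freeBytes   uint64 // F: free and unscavenged memory
	objectBytes uint64 // A: bytes in allocated heap objects
	gogcGoal    uint64 // the usual GOGC-based goal for this cycle
}

// heapGoal reflects a change in non-heap overheads (T - F - A) immediately,
// instead of waiting for the start of the next cycle. The GC trigger point
// would be re-derived from this value in the same on-demand fashion.
func (c *controller) heapGoal() uint64 {
	overheads := c.mappedBytes - c.freeBytes - c.objectBytes
	limitGoal := c.memoryLimit - overheads // the memory limit converted to a heap goal
	if limitGoal < c.gogcGoal {
		return limitGoal
	}
	return c.gogcGoal
}
```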
#### Death spirals
@@ -322,7 +303,7 @@ large enough to accommodate worst-case pause times but not too large such that a
more than about a second.
1 CPU-second per `GOMAXPROCS` seems like a reasonable place to start.
- Unfortunately, 50% is not a reasonable choice for small values of `GOGC`.
+ Unfortunately, 50% is not always a reasonable choice for small values of `GOGC`.
Consider an application running with `GOGC=10`: an overall 50% GC CPU
utilization limit for `GOGC=10` is likely going to be always active, leading to
significant overshoot.
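
A back-of-the-envelope argument for why (my sketch, with `c` standing for mark cost per live-heap byte and `H` for the live heap size): each cycle the GC marks roughly the whole live heap while the application is only allowed to allocate about `GOGC/100 * H` new bytes, so

```latex
\frac{\text{GC CPU}}{\text{mutator CPU}} \;\propto\; \frac{c \cdot H}{\frac{GOGC}{100} \cdot H} \;=\; \frac{100\,c}{GOGC}
```

Dropping from `GOGC=100` to `GOGC=10` therefore multiplies that ratio by about ten, so a workload whose GC work is a small share of CPU at the default setting can sit near or above a 50% share at `GOGC=10`.
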
@@ -359,22 +340,13 @@ use approaches the limit.
I propose it does so using a proportional-integral controller whose input is the
difference between the memory limit and the memory used by Go, and whose output
is the CPU utilization target of the background scavenger.
- The output will be clamped at a minimum of 1% and a maximum of 10% overall CPU
- utilization.
- Note that the 10% is chosen arbitrarily; in general, returning memory to the
- platform is nowhere near as costly as the GC, but the number must be chosen such
- that the mutator still has plenty of room to make progress (thus, I assert that
- 40% of CPU time is enough).
- In order to make the scavenger scale to overall CPU utilization effectively, it
- requires some improvements to avoid the aforementioned locking issues it deals
- with today.
-
- Any CPU time spent in the scavenger should also be accounted for in the leaky
- bucket algorithm described in the [Death spirals](#death-spirals) section as GC
- time, however I don't think it should be throttled in the same way.
- The intuition behind that is that returning memory to the platform is generally
- going to be more immediately fruitful than spending more time in garbage
- collection.
+ This will make the background scavenger more reliable.
+
+ However, the background scavenger likely won't return memory to the OS promptly
+ enough for the memory limit, so in addition, I propose having span allocations
+ eagerly return memory to the OS to stay under the limit.
+ The time a goroutine spends on this eager return will also count toward the 50%
+ GC CPU limit described in the [Death spirals](#death-spirals) section.
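
A generic sketch of such a proportional-integral controller (the gains, clamp bounds, and names are assumptions for illustration, not values from the proposal or the runtime):

```go
package scavenge

// piController maps an error term to an output via proportional and
// integral gains, clamping the result to a configured range.
type piController struct {
	kp, ki   float64 // proportional and integral gains (assumed)
	integral float64 // accumulated integral term
	min, max float64 // clamp on the output
}

// next advances the controller by one step of length dt (seconds). In the
// wiring described above, input would be the difference between the memory
// limit and the memory used by Go, and the return value would be treated as
// the background scavenger's CPU utilization target.
func (c *piController) next(input, dt float64) float64 {
	c.integral += input * dt
	out := c.kp*input + c.ki*c.integral
	// Clamp so the scavenger neither switches off entirely nor starves the
	// rest of the program of CPU time.
	if out < c.min {
		out = c.min
	} else if out > c.max {
		out = c.max
	}
	return out
}
```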
#### Alternative approaches considered
@@ -418,38 +390,13 @@ go beyond the spans already in-use.
##### Returning memory to the platform
- A potential issue with the proposed design is that because the scavenger is
- running in the background, it may not react readily to spikes in memory use that
- exceed the limit.
-
- In contrast, [TCMalloc](#tcmalloc) searches for memory to return eagerly, if an
- allocation were to exceed the limit.
- In the Go 1.13 cycle, I attempted a similar policy when first implementing the
- scavenger, and found that it could cause unacceptable tail latency increases in
- some applications.
- While that policy certainly tried to return memory back to the platform
- significantly more often than it would be in this case, it still has a couple of
- downsides:
- 1. It introduces latency.
-    The background scavenger can be more efficiently time-sliced in between other
-    work, so it generally should only impact throughput.
- 1. It's much more complicated to bound the total amount of time spent searching
-    for and returning memory to the platform during an allocation.
-
- The key insight as to why this policy works just fine for TCMalloc and won't
- work for Go comes from a fundamental difference in design.
- Manual memory allocators are typically designed to have a LIFO-style memory
- reuse pattern.
- Once an allocation is freed, it is immediately available for reallocation.
- In contrast, most efficient tracing garbage collection algorithms require a
- FIFO-style memory reuse pattern, since allocations are freed in bulk.
- The result is that the page allocator in a garbage-collected memory allocator is
- accessed far more frequently than in a manual memory allocator, so this path will
- be hit a lot harder.
-
- For the purposes of this design, I don't believe the benefits of eager return
- outweigh the costs, and I do believe that the proposed design is good enough for
- most cases.
+ If returning memory to the OS eagerly becomes a significant performance issue, a
+ reasonable alternative could be to crank up the background scavenger's CPU usage
+ in response to growing memory pressure.
+ This needs more thought, but given that it would now be controlled by a
+ controller, its CPU usage would be more reliable, and this is an option we can
+ keep in mind.
+ One benefit of this option is that it may impact latency less prominently.
### Documentation
@@ -513,6 +460,11 @@ decides to shrink the heap space used; more recent implementations (e.g. G1) do
so more rarely, except when [the application is
idle](https://openjdk.java.net/jeps/346).
+ Some JVMs are "container aware" and read the memory limits of their containers
+ to stay under the limit.
+ This behavior is closer to what is proposed in this document, but I do not
+ believe the memory limit is directly configurable like the one proposed here.
+
### SetMaxHeap
For nearly 4 years, the Go project has been trialing an experimental API in the