runtime: wbuf allocation increased significantly from 1.5 to 1.6 #15319
Comments
I think you mean per goroutine. Your program has only one channel total.
The channel may actually be important here. On Linux, it went from 2,590 bytes in 1.5.2 to 4,709 bytes in 1.6. However, if you replace …
Interesting. 1.6 is not growing the stack, so that's not where the extra memory is coming from.
The extra memory is all in mstats.GCSys, which went from 33,596,416 in 1.5.2 to 212,801,536 in 1.6. The majority of that is almost certainly in workbufs. I'm not surprised there are a fair number of workbufs, since it's going to pick up all of the sudogs created by the blocked goroutines during stack scanning. However, I don't know why it would have increased so much since 1.5.2. /cc @RLH
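For anyone following along, GCSys can be read directly from runtime.MemStats; a minimal sketch of how to observe it (standard runtime API, not code from this thread):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	// GCSys counts memory obtained from the OS for GC metadata,
	// which includes the work buffers discussed above. StackSys
	// covers goroutine stacks, useful for ruling out stack growth.
	fmt.Printf("GCSys = %d bytes, StackSys = %d bytes\n", ms.GCSys, ms.StackSys)
}
```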
You are right. I had channels on the brain. I meant per goroutine.
Thank you for fixing that, @bradfitz
This sounds pretty bad :(
@valyala, I confirmed (in my later comments) that it's not in fact stack growth causing this. The goroutines are still running on their initial stack allocation. What's causing the increased memory usage is that GC is allocating more internal memory (most likely work buffers), though I haven't tracked down why yet.
This makes sense to some extent: commit 1870572 increased the size of the workbuf by 16x, but that was supposed to mean we had ~16x fewer of them. Commit b6c0934 then halved the workbuf size (since it started caching two of them). Instead, in this benchmark, we have almost the same number of workbufs. The next step is to figure out why they aren't being reused like they're supposed to be.
This is happening because of the dispose in scanstack. Because of the runtime.GC calls, all stacks are being scanned during mark termination, which causes every scanstack to dispose its buffer. Even though there are only a few pointers in the buffer when it's disposed here, it goes to the "full" queue. Since all of the stack scans happen before we start draining mark work during mark termination, the number of work buffers is proportional to the number of stacks, rather than the number of pointers. In fact, the math works out almost exactly: 213 MB / 2048 bytes/workbuf = 1.09e5 workbufs ≈ 1e5 goroutines.
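Spelling out that arithmetic with the numbers quoted earlier in the thread (my own quick check, not part of the original comment):

```go
package main

import "fmt"

func main() {
	const (
		gcSys       = 212801536 // mstats.GCSys on 1.6, from the measurement above
		workbufSize = 2048      // bytes per workbuf after commit b6c0934
	)
	// ≈1.04e5 workbufs, i.e. on the order of one per goroutine
	// for the ~1e5 blocked goroutines in the test program.
	fmt.Println("workbufs:", gcSys/workbufSize)
}
```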
I have a fairly simple fix that reduces this test down to 10 MB of workbufs. I'll test and benchmark it more thoroughly tomorrow and send a CL.
CL https://golang.org/cl/23391 mentions this issue.
Please answer these questions before submitting your issue. Thanks!
1. What version of Go are you using (go version)?
1.6, 1.6.1, 1.5.2
2. What operating system and processor architecture are you using (go env)?
set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Development\Projects\go
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GO15VENDOREXPERIMENT=
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
set CXX=g++
set CGO_ENABLED=1
3. What did you do?
If possible, provide a recipe for reproducing the error. A complete runnable program is good. A link on play.golang.org is best.
Ran the following program on Windows using Go 1.6.1 and found goroutines using ~11 kB of memory each. Decided to run the test again using Go 1.5.2 to see if there was a difference in the amount of memory used per goroutine.
https://play.golang.org/p/bgP8fs5O7q
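In case the playground link goes stale: the program is roughly of this shape (a sketch reconstructed from details in the thread, i.e. ~1e5 goroutines blocked on a single shared channel with memory stats read before and after; the exact code may differ):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	const n = 100000

	var before runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	ch := make(chan struct{}) // a single channel shared by all goroutines
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			wg.Done()
			<-ch // block indefinitely on the shared channel
		}()
	}
	wg.Wait() // every goroutine has started and is about to block

	var after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&after)
	fmt.Printf("approx bytes per goroutine: %d\n", (after.Sys-before.Sys)/n)
}
```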
4. What did you expect to see?
I had expected to see goroutines using approximately the same memory in different versions of Go.
5. What did you see instead?
Goroutines were using ~11 kB of memory in 1.6.1 and ~9.5 kB in 1.5.2.