What version of Go are you using (`go version`)?

```
$ go version
go version go1.15.3 linux/amd64
```
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (`go env`)?

```
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/niaow/.cache/go-build"
GOENV="/home/niaow/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/niaow/go/pkg/mod"
GONOPROXY="github.com/molecula"
GONOSUMDB="github.com/molecula"
GOOS="linux"
GOPATH="/home/niaow/go"
GOPRIVATE="github.com/molecula"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build497931594=/tmp/go-build -gno-record-gcc-switches"
```
What did you do?
Passed a single context with cancellation to a bunch of goroutines. These goroutines had a cold-path compute task, interlaced with calls to `context.Err()` to detect cancellation.
The loop looks something like:

```go
var out []Thing
for iterator.Next() {
	if err := ctx.Err(); err != nil {
		// caller doesn't need a result anymore.
		return nil, err
	}
	// Fetch thing from iterator, apply some filtering functions, and append it to out.
}
```
What did you expect to see?
A bit of a slowdown from the context check, maybe?
What did you see instead?
Slightly over 50% of the CPU time was spent in `runtime.findrunnable`. The `cancelCtx` struct uses a `sync.Mutex`, and due to extreme lock contention (64 CPU threads spamming it), this was triggering `lockSlow`. From poking at pprof, it appears that about 86% of CPU time was spent in functions related to this lock acquire.
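
For illustration, a minimal standalone reproducer of this pattern might look like the following (hypothetical; the goroutine and iteration counts are arbitrary, not taken from the original workload):

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	var wg sync.WaitGroup
	start := time.Now()
	// One busy goroutine per CPU thread, each polling ctx.Err() on every
	// iteration of a tight loop, as in the report above.
	for i := 0; i < runtime.GOMAXPROCS(0); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1_000_000; j++ {
				if err := ctx.Err(); err != nil {
					return
				}
			}
		}()
	}
	wg.Wait()
	fmt.Println("elapsed:", time.Since(start))
}
```

Profiling something like this on a many-core machine should show the same contention on the `cancelCtx` mutex.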
I was able to work around this by adding a counter and checking the context less frequently (a sketch is below). However, I do not think this is an intended performance degradation path. Theoretically this could be made more efficient with `sync/atomic`, although I think a `sync.RWMutex` would still be more than sufficient.
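
For reference, the workaround is just a strided check; a sketch of the loop with it applied (the stride of 1024 is arbitrary, not tuned):

```go
var out []Thing
const checkEvery = 1024 // arbitrary stride; larger means fewer lock acquires
iters := 0
for iterator.Next() {
	iters++
	// Only poll for cancellation once every checkEvery iterations,
	// instead of on every pass through the loop.
	if iters%checkEvery == 0 {
		if err := ctx.Err(); err != nil {
			// caller doesn't need a result anymore.
			return nil, err
		}
	}
	// Fetch thing from iterator, apply some filtering functions, and append it to out.
}
```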