Skip to content

Unexpected pprof heap output means buggy pprof or misunderstanding in how pprof heap output works #49152

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
simonhf opened this issue Oct 25, 2021 · 3 comments

Comments

@simonhf
Copy link

simonhf commented Oct 25, 2021

What version of Go are you using (go version)?

$ go version
go version go1.15.9 linux/amd64

Does this issue reproduce with the latest release?

Have not tried.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN="/home/simonhf/go/bin"
GOCACHE="/home/simonhf/.cache/go-build"
GOENV="/home/simonhf/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/simonhf/go/pkg/mod"
GOOS="linux"
GOPATH="/home/simonhf/go"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.15"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.15/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build734185766=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I have not tried to reproduce this in a smaller standalone program. The pprof output shown below is the product of a larger Golang program running for several days. I posted it here mainly to sanity check my understanding of how pprof works, because based on my current understanding pprof seems to not work as expected, i.e. is it buggy?

What did you expect to see?

Expected to see top inuse_space item github.com/hashicorp/golang-lru/simplelru.(*LRU).Add in pprof heap output for all pprof heap modes including alloc_space and alloc_objects, and not just inuse_space and inuse_objects.

What did you see instead?

I've been using pprof to examine heap usage, and am getting confused by the results, making me think that there might be a bug in pprof? Otherwise, how to explain the results below?

So after collecting the file profiler/heap-2021-10-18T15:52:15Z output by pprof then I run the pprof command top9999 on the file using the 4 different modes; inuse_space, inuse_objects, alloc_space, and alloc_objects:

$ echo "profiler/heap-2021-10-18T15:52:15Z" | perl -lane '$pprof_file=$_; foreach $report(qw(inuse_space inuse_objects alloc_space alloc_objects)){ $command = qq[echo "top9999" | go tool pprof -$report $pprof_file 2>&1 | tee $pprof_file.top9999.$report]; printf qq[- running: %s\n], $command; `$command`; }'
- running: echo "top9999" | go tool pprof -inuse_space profiler/heap-2021-10-18T15:52:15Z 2>&1 | tee profiler/heap-2021-10-18T15:52:15Z.top9999.inuse_space
- running: echo "top9999" | go tool pprof -inuse_objects profiler/heap-2021-10-18T15:52:15Z 2>&1 | tee profiler/heap-2021-10-18T15:52:15Z.top9999.inuse_objects
- running: echo "top9999" | go tool pprof -alloc_space profiler/heap-2021-10-18T15:52:15Z 2>&1 | tee profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_space
- running: echo "top9999" | go tool pprof -alloc_objects profiler/heap-2021-10-18T15:52:15Z 2>&1 | tee profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_objects

This gives 4 different text files containing the output, and here are their lengths in bytes:

$ wc -c profiler/*.top9999* profiler/heap-2021-10-18T15:52:15Z
   29031 profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_objects
   30011 profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_space
    5680 profiler/heap-2021-10-18T15:52:15Z.top9999.inuse_objects
    9594 profiler/heap-2021-10-18T15:52:15Z.top9999.inuse_space
  308702 profiler/heap-2021-10-18T15:52:15Z

The output files are quite big, but here are the first lines of each of them:

$ head profiler/*.top9999*
==> profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_objects <==
File: app
Build ID: c70f54797ef88ff7ee99f6f39ce39b1b4c3ab616
Type: alloc_objects
Time: Oct 18, 2021 at 8:52am (PDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) Showing nodes accounting for 6475315640, 82.21% of 7876264512 total
Dropped 1695 nodes (cum <= 39381322)
      flat  flat%   sum%        cum   cum%
 804291951 10.21% 10.21%  804291951 10.21%  github.com/onflow/flow-go/module/metrics.(*CacheCollector).CacheHit
 487455161  6.19% 16.40%  487455161  6.19%  reflect.Value.Slice

==> profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_space <==
File: app
Build ID: c70f54797ef88ff7ee99f6f39ce39b1b4c3ab616
Type: alloc_space
Time: Oct 18, 2021 at 8:52am (PDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) Showing nodes accounting for 565080.18MB, 82.15% of 687902.88MB total
Dropped 1691 nodes (cum <= 3439.51MB)
      flat  flat%   sum%        cum   cum%
129031.23MB 18.76% 18.76% 129031.23MB 18.76%  github.com/onflow/flow-go/module/metrics.(*CacheCollector).CacheHit
50031.10MB  7.27% 26.03% 50256.62MB  7.31%  github.com/dgraph-io/badger/v2.(*DB).newTransaction

==> profiler/heap-2021-10-18T15:52:15Z.top9999.inuse_objects <==
File: app
Build ID: c70f54797ef88ff7ee99f6f39ce39b1b4c3ab616
Type: inuse_objects
Time: Oct 18, 2021 at 8:52am (PDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) Showing nodes accounting for 23144863, 97.34% of 23778530 total
Dropped 340 nodes (cum <= 118892)
      flat  flat%   sum%        cum   cum%
   9224542 38.79% 38.79%   17635313 74.16%  github.com/onflow/flow-go/network/p2p.(*RcvCache).add
   4407430 18.54% 57.33%    8438078 35.49%  github.com/hashicorp/golang-lru/simplelru.(*LRU).Add

==> profiler/heap-2021-10-18T15:52:15Z.top9999.inuse_space <==
File: app
Build ID: c70f54797ef88ff7ee99f6f39ce39b1b4c3ab616
Type: inuse_space
Time: Oct 18, 2021 at 8:52am (PDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) Showing nodes accounting for 1516.26MB, 97.29% of 1558.44MB total
Dropped 304 nodes (cum <= 7.79MB)
      flat  flat%   sum%        cum   cum%
  355.50MB 22.81% 22.81%   540.01MB 34.65%  github.com/hashicorp/golang-lru/simplelru.(*LRU).Add
  351.51MB 22.56% 45.37%   890.53MB 57.14%  github.com/onflow/flow-go/network/p2p.(*RcvCache).add

I'm looking at github.com/hashicorp/golang-lru/simplelru.(*LRU).Add which is 'top' for inuse_space (355 MB out of 1,516 MB) and 2nd top for inuse_objects (9,224,542 out of 23,144,863 objects).

I'm assuming inuse_... means heap space and objects 'in use' and not yet garbage collected. And I'm assuming that alloc_... means heap space and objects 'ever allocated'. For example, if an object is 1MB in size and ever allocated 10 times, but garbage collected 8 times, then for that object I'm expecting inuse_space to be 2MB, inuse_objects to be 2, alloc_space to be 10MB, and alloc_objects to be 10. Correct?

With this in mind then I go looking for github.com/hashicorp/golang-lru/simplelru.(*LRU).Add in the alloc_space and alloc_objects output, but unexpectedly don't find it:

$ cat profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_space | egrep "simplelru"
$ cat profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_objects | egrep "simplelru"
$

This is lowest line where i would expected to find it in each file (shortened by ... for brevity). The two bottom lines show the surrounding next highest and lowest counts, e.g. if the inuse_space for github.com/hashicorp/golang-lru/simplelru.(*LRU).Add was the result of allocations with no garbage collection, then I would expect to see the 355MB for it sandwiched between the 409MB and 348MB items below... however, it's not there and its not anywhere else to be found in the alloc_space output. Why not? Is there something wrong with my assumptions about how this works, or could there be a bug in pprof?

$ cat profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_space
File: app
Build ID: c70f54797ef88ff7ee99f6f39ce39b1b4c3ab616
Type: alloc_space
Time: Oct 18, 2021 at 8:52am (PDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) Showing nodes accounting for 565080.18MB, 82.15% of 687902.88MB total
Dropped 1691 nodes (cum <= 3439.51MB)
      flat  flat%   sum%        cum   cum%
129031.23MB 18.76% 18.76% 129031.23MB 18.76%  github.com/onflow/flow-go/module/metrics.(*CacheCollector).CacheHit
50031.10MB  7.27% 26.03% 50256.62MB  7.31%  github.com/dgraph-io/badger/v2.(*DB).newTransaction
38181.27MB  5.55% 31.58% 38181.27MB  5.55%  github.com/libp2p/go-libp2p-pubsub/pb.(*Message).Unmarshal
...
  409.05MB 0.059% 81.47% 72876.40MB 10.59%  github.com/onflow/flow-go/state/protocol/inmem.FromEpoch
  348.53MB 0.051% 81.53%  3908.18MB  0.57%  github.com/onflow/flow-go/network/p2p.(*Network).processNetworkMessage
...

$ cat profiler/heap-2021-10-18T15:52:15Z.top9999.alloc_objects
File: app
Build ID: c70f54797ef88ff7ee99f6f39ce39b1b4c3ab616
Type: alloc_objects
Time: Oct 18, 2021 at 8:52am (PDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) Showing nodes accounting for 6475315640, 82.21% of 7876264512 total
Dropped 1695 nodes (cum <= 39381322)
      flat  flat%   sum%        cum   cum%
 804291951 10.21% 10.21%  804291951 10.21%  github.com/onflow/flow-go/module/metrics.(*CacheCollector).CacheHit
 487455161  6.19% 16.40%  487455161  6.19%  reflect.Value.Slice
...
   9650467  0.12% 80.93%  390536920  4.96%  github.com/libp2p/go-libp2p-core/crypto.PublicKeyToProto
   9012016  0.11% 81.04%  531889260  6.75%  github.com/onflow/flow-go/network/validator/pubsub.TopicValidator.func1
...
@seankhliao
Copy link
Member

Unlike many projects, the Go project does not use GitHub Issues for general discussion or asking questions. GitHub Issues are used for tracking bugs and proposals only.

For questions please refer to https://github.com/golang/go/wiki/Questions

@simonhf
Copy link
Author

simonhf commented Oct 26, 2021

@seankhliao Sorry for the misunderstanding and thanks for the link. FYI I moved the question here [1] to the Golang forum. If they think it's a bug in pprof then I'll let you know, and presumably then it would be okay to post it here (again)?

[1] https://forum.golangbridge.org/t/unexpected-pprof-heap-output-means-buggy-pprof-or-misunderstanding-in-how-pprof-heap-output-works/25142

@seankhliao
Copy link
Member

Yes, and preferably with an actual reproducer

@golang golang locked and limited conversation to collaborators Oct 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants