gopprof

"go test -cpuprofile" does not generate a full trace


Issue

I have a go package, with a test suite.

When I run the test suite for this package, the total runtime is ~ 7 seconds :

$ go test ./mydbpackage/ -count 1
ok      mymodule/mydbpackage    7.253s

However, when I add a -cpuprofile=cpu.out option, the sampling does not cover the whole run :

$ go test ./mydbpackage/ -count 1 -cpuprofile=cpu.out
ok      mymodule/mydbpackage    7.029s

$ go tool pprof -text -cum cpu.out
File: mydbpackage.test
Type: cpu
Time: Aug 6, 2020 at 9:42am (CEST)
Duration: 5.22s, Total samples = 780ms (14.95%)     # <--- depending on the runs, I get 400ms to 1s
Showing nodes accounting for 780ms, 100% of 780ms total
      flat  flat%   sum%        cum   cum%
         0     0%     0%      440ms 56.41%  testing.tRunner
      10ms  1.28%  1.28%      220ms 28.21%  database/sql.withLock
      10ms  1.28%  2.56%      180ms 23.08%  runtime.findrunnable
         0     0%  2.56%      180ms 23.08%  runtime.mcall
      ...

Looking at the collected samples :

# sample from another run :
$ go tool pprof -traces cpu.out | grep "ms "  # get the first line of each sample
      10ms   runtime.nanotime
      10ms   fmt.(*readRune).ReadRune
      30ms   syscall.Syscall
      10ms   runtime.scanobject
      10ms   runtime.gentraceback
      ...
# 98 samples collected, for a total sum of 1.12s

The issue I see is : for some reason, the sampling profiler stops collecting samples, or is blocked/slowed down at some point.

Context

go version is 1.14.6, platform is linux/amd64

$ go version
go version go1.14.6 linux/amd64

This package contains code that interact with a database, and the tests are run against a live postgresql server.

One thing I tried : t.Skip() internally calls runtime.Goexit(), so I replaced calls to t.Skip and variants with a simple return ; but it didn't change the outcome.

Question

Why aren't more samples collected ? I there some known pattern that blocks/slows down the sampler, or terminates the sampler earlier than it should ?


Solution

  • @Volker guided me to the answer in his comments :
    -cpuprofile creates a profile in which only goroutines actively using the CPU are sampled.

    In my use case : my go code spends a lot of time waiting for the answers of the postgresql server.

    Generating a trace using go test -trace=trace.out, and then extracting a network blocking profile using go tool trace -pprof=net trace.out > network.out yielded much more relevant information.

    For reference, on top of opening the complete trace using go tool trace trace.out, here are the values you can pass to -pprof= :

    from go tool trace docs :

    • net: network blocking profile
    • sync: synchronization blocking profile
    • syscall: syscall blocking profile
    • sched: scheduler latency profile