Volunteer (and often reluctant) toolspeed cop here. (Sorry, @dr2chase.)
CL 102435 has a non-trivial impact on compilation speed, memory usage, and binary size:
name        old time/op       new time/op       delta
Template    176ms ± 2%        181ms ± 3%        +2.61%   (p=0.000 n=45+50)
Unicode     87.5ms ± 5%       87.9ms ± 4%       ~        (p=0.147 n=48+49)
GoTypes     557ms ± 4%        569ms ± 2%        +2.18%   (p=0.000 n=42+44)
Compiler    2.65s ± 3%        2.70s ± 3%        +1.82%   (p=0.000 n=49+49)
SSA         7.16s ± 2%        7.37s ± 2%        +3.00%   (p=0.000 n=48+47)
Flate       118ms ± 2%        123ms ± 3%        +4.05%   (p=0.000 n=48+49)
GoParser    138ms ± 3%        143ms ± 2%        +3.28%   (p=0.000 n=49+47)
Reflect     360ms ± 3%        367ms ± 3%        +1.76%   (p=0.000 n=48+48)
Tar         157ms ± 4%        160ms ± 3%        +2.31%   (p=0.000 n=50+49)
XML         201ms ± 4%        207ms ± 3%        +2.79%   (p=0.000 n=48+49)
[Geo mean]  353ms             362ms             +2.42%

name        old user-time/op  new user-time/op  delta
Template    215ms ± 3%        219ms ± 3%        +1.67%   (p=0.000 n=48+49)
Unicode     110ms ± 5%        110ms ± 3%        ~        (p=0.051 n=48+46)
GoTypes     741ms ± 4%        749ms ± 3%        +1.05%   (p=0.000 n=47+46)
Compiler    3.60s ± 4%        3.63s ± 2%        +0.84%   (p=0.002 n=44+49)
SSA         10.3s ± 4%        10.5s ± 2%        +2.13%   (p=0.000 n=44+46)
Flate       138ms ± 3%        143ms ± 3%        +3.28%   (p=0.000 n=48+46)
GoParser    159ms ± 3%        175ms ± 4%        +9.82%   (p=0.000 n=50+47)
Reflect     464ms ± 2%        466ms ± 3%        +0.47%   (p=0.020 n=47+49)
Tar         195ms ± 4%        198ms ± 3%        +1.40%   (p=0.000 n=50+46)
XML         241ms ± 9%        258ms ± 3%        +7.04%   (p=0.000 n=50+48)
[Geo mean]  446ms             458ms             +2.79%

name        old alloc/op      new alloc/op      delta
Template    35.1MB ± 0%       36.8MB ± 0%       +4.91%   (p=0.008 n=5+5)
Unicode     29.3MB ± 0%       29.8MB ± 0%       +1.59%   (p=0.008 n=5+5)
GoTypes     115MB ± 0%        121MB ± 0%        +5.15%   (p=0.008 n=5+5)
Compiler    521MB ± 0%        560MB ± 0%        +7.48%   (p=0.008 n=5+5)
SSA         1.71GB ± 0%       1.91GB ± 0%       +11.69%  (p=0.008 n=5+5)
Flate       24.2MB ± 0%       25.4MB ± 0%       +4.91%   (p=0.008 n=5+5)
GoParser    28.1MB ± 0%       29.5MB ± 0%       +4.87%   (p=0.008 n=5+5)
Reflect     78.7MB ± 0%       82.4MB ± 0%       +4.65%   (p=0.008 n=5+5)
Tar         34.5MB ± 0%       36.1MB ± 0%       +4.62%   (p=0.008 n=5+5)
XML         43.3MB ± 0%       45.5MB ± 0%       +5.27%   (p=0.008 n=5+5)
[Geo mean]  78.1MB            82.4MB            +5.48%

name        old allocs/op     new allocs/op     delta
Template    328k ± 0%         336k ± 0%         +2.59%   (p=0.008 n=5+5)
Unicode     336k ± 0%         338k ± 0%         +0.37%   (p=0.008 n=5+5)
GoTypes     1.14M ± 0%        1.17M ± 0%        +2.29%   (p=0.008 n=5+5)
Compiler    4.77M ± 0%        4.88M ± 0%        +2.23%   (p=0.008 n=5+5)
SSA         13.7M ± 0%        14.0M ± 0%        +2.49%   (p=0.008 n=5+5)
Flate       220k ± 0%         226k ± 0%         +2.71%   (p=0.008 n=5+5)
GoParser    273k ± 0%         280k ± 0%         +2.29%   (p=0.008 n=5+5)
Reflect     940k ± 0%         971k ± 0%         +3.32%   (p=0.008 n=5+5)
Tar         321k ± 0%         330k ± 0%         +2.65%   (p=0.008 n=5+5)
XML         383k ± 0%         390k ± 0%         +1.94%   (p=0.008 n=5+5)
[Geo mean]  751k              768k              +2.28%

name        old text-bytes    new text-bytes    delta
HelloSize   672kB ± 0%        680kB ± 0%        +1.22%   (p=0.000 n=50+50)

name        old data-bytes    new data-bytes    delta
HelloSize   134kB ± 0%        134kB ± 0%        ~        (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize   1.43MB ± 0%       1.49MB ± 0%       +4.00%   (p=0.000 n=50+50)
Reading the CL description, I can't quite tell what benefit to the user offsets that cost.
I'd like to revisit whether the compilation speed, memory usage, and binary size impacts can be mitigated before Go 1.11 is released, and I would personally like to better understand what benefits the CL brings.
Also, as an aside, I am slightly concerned about this bit from the CL description:
The code in fuse.go that glued together the value slices
of two blocks produced a result that depended on the
former capacities (not lengths) of the two slices. This
causes differences in the 386 bootstrap, and also can
sometimes put values into an order that does a worse job
of preserving statement boundaries when values are removed.
Value order should not matter and cannot be relied upon. (@randall77 has an old, outstanding CL to randomize Value order to enforce that.) It is also unclear whether the fuse.go changes need to be permanent.
I have very little dedicated laptop time in the immediate future due to personal life stuff, but I wanted to flag this right away. I'll look carefully at the CL and the impact as soon as I can.