-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Closed
Labels
FrozenDueToAgeNeedsFixThe path to resolution is known, but the work has not been done.The path to resolution is known, but the work has not been done.
Milestone
Description
There seems to be a large regression in performance when function pointer calls where changed to use the link register instead of the count register. Based on when this change went in, this likely affects all versions back to go1.14. The impact can be mitigated by setting the appropriate branch hint for this usage e.g
The regex benchmarks are a pretty good example of how much this
hint can help generic ppc64le code on a power9 machine:
name old time/op new time/op delta
Find 606ns ± 0% 447ns ± 0% -26.27%
FindAllNoMatches 309ns ± 0% 205ns ± 0% -33.72%
FindString 609ns ± 0% 451ns ± 0% -26.04%
FindSubmatch 734ns ± 0% 594ns ± 0% -19.07%
FindStringSubmatch 706ns ± 0% 574ns ± 0% -18.83%
Literal 177ns ± 0% 136ns ± 0% -22.89%
NotLiteral 4.69µs ± 0% 2.34µs ± 0% -50.14%
MatchClass 6.05µs ± 0% 3.26µs ± 0% -46.08%
MatchClass_InRange 5.93µs ± 0% 3.15µs ± 0% -46.86%
ReplaceAll 3.15µs ± 0% 2.18µs ± 0% -30.77%
AnchoredLiteralShortNonMatch 156ns ± 0% 109ns ± 0% -30.61%
AnchoredLiteralLongNonMatch 192ns ± 0% 136ns ± 0% -29.34%
AnchoredShortMatch 268ns ± 0% 209ns ± 0% -22.00%
AnchoredLongMatch 472ns ± 0% 357ns ± 0% -24.30%
OnePassShortA 1.16µs ± 0% 0.87µs ± 0% -25.03%
NotOnePassShortA 1.34µs ± 0% 1.20µs ± 0% -10.63%
OnePassShortB 940ns ± 0% 655ns ± 0% -30.29%
NotOnePassShortB 873ns ± 0% 703ns ± 0% -19.52%
OnePassLongPrefix 258ns ± 0% 155ns ± 0% -40.13%
OnePassLongNotPrefix 943ns ± 0% 529ns ± 0% -43.89%
MatchParallelShared 591ns ± 0% 436ns ± 0% -26.31%
MatchParallelCopied 596ns ± 0% 435ns ± 0% -27.10%
QuoteMetaAll 186ns ± 0% 186ns ± 0% -0.16%
QuoteMetaNone 55.9ns ± 0% 55.9ns ± 0% +0.02%
Compile/Onepass 9.64µs ± 0% 9.26µs ± 0% -3.97%
Compile/Medium 21.7µs ± 0% 20.6µs ± 0% -4.90%
Compile/Hard 174µs ± 0% 174µs ± 0% +0.07%
Match/Easy0/16 7.35ns ± 0% 7.34ns ± 0% -0.11%
Match/Easy0/32 116ns ± 0% 97ns ± 0% -16.27%
Match/Easy0/1K 592ns ± 0% 562ns ± 0% -5.04%
Match/Easy0/32K 12.6µs ± 0% 12.5µs ± 0% -0.64%
Match/Easy0/1M 556µs ± 0% 556µs ± 0% -0.00%
Match/Easy0/32M 17.7ms ± 0% 17.7ms ± 0% +0.05%
Match/Easy0i/16 7.34ns ± 0% 7.35ns ± 0% +0.10%
Match/Easy0i/32 2.82µs ± 0% 1.64µs ± 0% -41.71%
Match/Easy0i/1K 83.2µs ± 0% 48.2µs ± 0% -42.06%
Match/Easy0i/32K 2.13ms ± 0% 1.80ms ± 0% -15.34%
Match/Easy0i/1M 68.1ms ± 0% 57.6ms ± 0% -15.31%
Match/Easy0i/32M 2.18s ± 0% 1.80s ± 0% -17.52%
Match/Easy1/16 7.36ns ± 0% 7.34ns ± 0% -0.24%
Match/Easy1/32 118ns ± 0% 96ns ± 0% -18.72%
Match/Easy1/1K 2.46µs ± 0% 1.58µs ± 0% -35.65%
Match/Easy1/32K 80.2µs ± 0% 54.6µs ± 0% -31.92%
Match/Easy1/1M 2.75ms ± 0% 1.88ms ± 0% -31.66%
Match/Easy1/32M 87.5ms ± 0% 59.8ms ± 0% -31.62%
Match/Medium/16 7.34ns ± 0% 7.34ns ± 0% +0.01%
Match/Medium/32 2.60µs ± 0% 1.50µs ± 0% -42.61%
Match/Medium/1K 78.1µs ± 0% 43.7µs ± 0% -44.06%
Match/Medium/32K 2.08ms ± 0% 1.52ms ± 0% -27.11%
Match/Medium/1M 66.5ms ± 0% 48.6ms ± 0% -26.96%
Match/Medium/32M 2.14s ± 0% 1.60s ± 0% -25.18%
Match/Hard/16 7.35ns ± 0% 7.35ns ± 0% +0.03%
Match/Hard/32 3.58µs ± 0% 2.44µs ± 0% -31.82%
Match/Hard/1K 108µs ± 0% 75µs ± 0% -31.04%
Match/Hard/32K 2.79ms ± 0% 2.25ms ± 0% -19.30%
Match/Hard/1M 89.4ms ± 0% 72.2ms ± 0% -19.26%
Match/Hard/32M 2.91s ± 0% 2.37s ± 0% -18.60%
Match/Hard1/16 11.1µs ± 0% 8.3µs ± 0% -25.07%
Match/Hard1/32 21.4µs ± 0% 16.1µs ± 0% -24.85%
Match/Hard1/1K 658µs ± 0% 498µs ± 0% -24.27%
Match/Hard1/32K 12.2ms ± 0% 11.7ms ± 0% -4.60%
Match/Hard1/1M 391ms ± 0% 374ms ± 0% -4.40%
Match/Hard1/32M 12.6s ± 0% 12.0s ± 0% -4.68%
Match_onepass_regex/16 870ns ± 0% 611ns ± 0% -29.79%
Match_onepass_regex/32 1.58µs ± 0% 1.08µs ± 0% -31.48%
Match_onepass_regex/1K 45.7µs ± 0% 30.3µs ± 0% -33.58%
Match_onepass_regex/32K 1.45ms ± 0% 0.97ms ± 0% -33.20%
Match_onepass_regex/1M 46.2ms ± 0% 30.9ms ± 0% -33.01%
Match_onepass_regex/32M 1.46s ± 0% 0.99s ± 0% -32.02%
name old alloc/op new alloc/op delta
Find 0.00B 0.00B 0.00%
FindAllNoMatches 0.00B 0.00B 0.00%
FindString 0.00B 0.00B 0.00%
FindSubmatch 48.0B ± 0% 48.0B ± 0% 0.00%
FindStringSubmatch 32.0B ± 0% 32.0B ± 0% 0.00%
Compile/Onepass 4.02kB ± 0% 4.02kB ± 0% 0.00%
Compile/Medium 9.39kB ± 0% 9.39kB ± 0% 0.00%
Compile/Hard 84.7kB ± 0% 84.7kB ± 0% 0.00%
Match_onepass_regex/16 0.00B 0.00B 0.00%
Match_onepass_regex/32 0.00B 0.00B 0.00%
Match_onepass_regex/1K 0.00B 0.00B 0.00%
Match_onepass_regex/32K 0.00B 0.00B 0.00%
Match_onepass_regex/1M 5.00B ± 0% 3.00B ± 0% -40.00%
Match_onepass_regex/32M 136B ± 0% 68B ± 0% -50.00%
name old allocs/op new allocs/op delta
Find 0.00 0.00 0.00%
FindAllNoMatches 0.00 0.00 0.00%
FindString 0.00 0.00 0.00%
FindSubmatch 1.00 ± 0% 1.00 ± 0% 0.00%
FindStringSubmatch 1.00 ± 0% 1.00 ± 0% 0.00%
Compile/Onepass 52.0 ± 0% 52.0 ± 0% 0.00%
Compile/Medium 112 ± 0% 112 ± 0% 0.00%
Compile/Hard 424 ± 0% 424 ± 0% 0.00%
Match_onepass_regex/16 0.00 0.00 0.00%
Match_onepass_regex/32 0.00 0.00 0.00%
Match_onepass_regex/1K 0.00 0.00 0.00%
Match_onepass_regex/32K 0.00 0.00 0.00%
Match_onepass_regex/1M 0.00 0.00 0.00%
Match_onepass_regex/32M 2.00 ± 0% 1.00 ± 0% -50.00%
name old speed new speed delta
QuoteMetaAll 75.2MB/s ± 0% 75.3MB/s ± 0% +0.15%
QuoteMetaNone 465MB/s ± 0% 465MB/s ± 0% -0.02%
Match/Easy0/16 2.18GB/s ± 0% 2.18GB/s ± 0% +0.10%
Match/Easy0/32 276MB/s ± 0% 330MB/s ± 0% +19.46%
Match/Easy0/1K 1.73GB/s ± 0% 1.82GB/s ± 0% +5.29%
Match/Easy0/32K 2.60GB/s ± 0% 2.62GB/s ± 0% +0.64%
Match/Easy0/1M 1.89GB/s ± 0% 1.89GB/s ± 0% +0.00%
Match/Easy0/32M 1.89GB/s ± 0% 1.89GB/s ± 0% -0.05%
Match/Easy0i/16 2.18GB/s ± 0% 2.18GB/s ± 0% -0.10%
Match/Easy0i/32 11.4MB/s ± 0% 19.5MB/s ± 0% +71.48%
Match/Easy0i/1K 12.3MB/s ± 0% 21.2MB/s ± 0% +72.62%
Match/Easy0i/32K 15.4MB/s ± 0% 18.2MB/s ± 0% +18.12%
Match/Easy0i/1M 15.4MB/s ± 0% 18.2MB/s ± 0% +18.12%
Match/Easy0i/32M 15.4MB/s ± 0% 18.6MB/s ± 0% +21.21%
Match/Easy1/16 2.17GB/s ± 0% 2.18GB/s ± 0% +0.24%
Match/Easy1/32 271MB/s ± 0% 333MB/s ± 0% +23.07%
Match/Easy1/1K 417MB/s ± 0% 648MB/s ± 0% +55.38%
Match/Easy1/32K 409MB/s ± 0% 600MB/s ± 0% +46.88%
Match/Easy1/1M 381MB/s ± 0% 558MB/s ± 0% +46.33%
Match/Easy1/32M 383MB/s ± 0% 561MB/s ± 0% +46.25%
Match/Medium/16 2.18GB/s ± 0% 2.18GB/s ± 0% -0.01%
Match/Medium/32 12.3MB/s ± 0% 21.4MB/s ± 0% +74.13%
Match/Medium/1K 13.1MB/s ± 0% 23.4MB/s ± 0% +78.73%
Match/Medium/32K 15.7MB/s ± 0% 21.6MB/s ± 0% +37.23%
Match/Medium/1M 15.8MB/s ± 0% 21.6MB/s ± 0% +36.93%
Match/Medium/32M 15.7MB/s ± 0% 21.0MB/s ± 0% +33.67%
Match/Hard/16 2.18GB/s ± 0% 2.18GB/s ± 0% -0.03%
Match/Hard/32 8.93MB/s ± 0% 13.10MB/s ± 0% +46.70%
Match/Hard/1K 9.48MB/s ± 0% 13.74MB/s ± 0% +44.94%
Match/Hard/32K 11.7MB/s ± 0% 14.5MB/s ± 0% +23.87%
Match/Hard/1M 11.7MB/s ± 0% 14.5MB/s ± 0% +23.87%
Match/Hard/32M 11.6MB/s ± 0% 14.2MB/s ± 0% +22.86%
Match/Hard1/16 1.44MB/s ± 0% 1.93MB/s ± 0% +34.03%
Match/Hard1/32 1.49MB/s ± 0% 1.99MB/s ± 0% +33.56%
Match/Hard1/1K 1.56MB/s ± 0% 2.05MB/s ± 0% +31.41%
Match/Hard1/32K 2.68MB/s ± 0% 2.80MB/s ± 0% +4.48%
Match/Hard1/1M 2.68MB/s ± 0% 2.80MB/s ± 0% +4.48%
Match/Hard1/32M 2.66MB/s ± 0% 2.79MB/s ± 0% +4.89%
Match_onepass_regex/16 18.4MB/s ± 0% 26.2MB/s ± 0% +42.41%
Match_onepass_regex/32 20.2MB/s ± 0% 29.5MB/s ± 0% +45.92%
Match_onepass_regex/1K 22.4MB/s ± 0% 33.8MB/s ± 0% +50.54%
Match_onepass_regex/32K 22.6MB/s ± 0% 33.9MB/s ± 0% +49.67%
Match_onepass_regex/1M 22.7MB/s ± 0% 33.9MB/s ± 0% +49.27%
Match_onepass_regex/32M 23.0MB/s ± 0% 33.9MB/s ± 0% +47.14%
Metadata
Metadata
Assignees
Labels
FrozenDueToAgeNeedsFixThe path to resolution is known, but the work has not been done.The path to resolution is known, but the work has not been done.