runtime: zero upper bit of Y registers in asyncPreempt on darwin/amd64
Apparently, the signal handling code path in the darwin kernel
leaves the upper bits of Y registers in a dirty state, which causes
many SSE operations (128-bit and narrower) to become much slower.
Clear the upper bits to get to a clean state.
We do it at the entry of asyncPreempt, which runs immediately
after exiting the kernel's signal handling code if we actually
injected a call. It does not cover other exits where we
don't inject a call, e.g. failed preemption, profiling signal, or
other async signals. But it does cover an important use case of
async signals, preempting a tight numerical loop, which we
introduced in this cycle.
Running the benchmark in issue #37174:
name    old time/op  new time/op  delta
Fast-8  90.0ns ± 1%  46.8ns ± 3%  -47.97%  (p=0.000 n=10+10)
Slow-8   188ns ± 5%    49ns ± 1%  -73.82%  (p=0.000 n=10+9)
There is no more slowdown due to preemption signals.
For #37174.
Change-Id: I8b83d083fade1cabbda09b4bc25ccbadafaf7605
Reviewed-on: https://go-review.googlesource.com/c/go/+/219131
Run-TryBot: Cherry Zhang <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Keith Randall <[email protected]>