Commit 2e02768

runtime: optimize zeroing of registers in secret_amd64.s
Use VPXORQ instead of VMOVAPD because VPXORQ, when written as a zeroing idiom (all three operands naming the same register), is handled directly by the register renamer. Also change the KXORQs so that each one operates on a single register, which makes the intent obvious and lets all of them potentially execute in parallel.
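
The KXORQ part of this change can be sketched with a toy model. The Go program below is only an illustration, not code from the runtime: the `xor3` type and the `run`, `readers`, `oldProg`, and `newProg` helpers are hypothetical names, and the model tracks only which registers an instruction names as operands, not what real hardware does. It shows that both sequences leave k0-k7 zeroed, while the old one has every instruction reading K0 and the new one has each instruction touching only its own register.

```go
package main

import "fmt"

// xor3 models one three-operand XOR in Go assembler operand order:
// the last operand is the destination, so dst = src1 ^ src2.
type xor3 struct{ src1, src2, dst int }

// run applies prog to a copy of the eight mask registers and returns the result.
func run(k [8]uint64, prog []xor3) [8]uint64 {
	for _, in := range prog {
		k[in.dst] = k[in.src1] ^ k[in.src2]
	}
	return k
}

// readers counts how many instructions in prog name register r as a source.
func readers(prog []xor3, r int) int {
	n := 0
	for _, in := range prog {
		if in.src1 == r || in.src2 == r {
			n++
		}
	}
	return n
}

// oldProg mirrors the removed lines: KXORQ K0, K0, Kn.
func oldProg() []xor3 {
	p := make([]xor3, 0, 8)
	for n := 0; n < 8; n++ {
		p = append(p, xor3{0, 0, n})
	}
	return p
}

// newProg mirrors the added lines: KXORQ Kn, Kn, Kn.
func newProg() []xor3 {
	p := make([]xor3, 0, 8)
	for n := 0; n < 8; n++ {
		p = append(p, xor3{n, n, n})
	}
	return p
}

func main() {
	// Arbitrary non-zero starting values standing in for leftover secrets.
	start := [8]uint64{1, 2, 3, 4, 5, 6, 7, 8}

	// Both sequences produce the same all-zero register file.
	fmt.Println(run(start, oldProg()) == run(start, newProg()))

	// Number of instructions that read K0 in each sequence.
	fmt.Println(readers(oldProg(), 0), readers(newProg(), 0))
}
```

In this model the old sequence has eight instructions that name K0 as a source and the new sequence has one, which is the sense in which the rewritten instructions are independent of each other.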
1 parent 1b291b7 commit 2e02768

1 file changed: +31 -24 lines changed

src/runtime/secret_amd64.s

Lines changed: 31 additions & 24 deletions
@@ -71,33 +71,40 @@ avx:
 	JNE noavx512
 
 	// Zero X16-X31
-	// Note that VZEROALL above already cleared Z0-Z15.
-	VMOVAPD Z0, Z16
-	VMOVAPD Z0, Z17
-	VMOVAPD Z0, Z18
-	VMOVAPD Z0, Z19
-	VMOVAPD Z0, Z20
-	VMOVAPD Z0, Z21
-	VMOVAPD Z0, Z22
-	VMOVAPD Z0, Z23
-	VMOVAPD Z0, Z24
-	VMOVAPD Z0, Z25
-	VMOVAPD Z0, Z26
-	VMOVAPD Z0, Z27
-	VMOVAPD Z0, Z28
-	VMOVAPD Z0, Z29
-	VMOVAPD Z0, Z30
-	VMOVAPD Z0, Z31
+	// VPXORQ r, r, r is a zeroing idiom according to section
+	// 3.5.1.7 "Clearing Registers and Dependency Breaking Idioms" in
+	// "Intel® 64 and IA-32 Architectures Optimization Reference Manual: Volume 1"
+	// (April 2024)
+	VPXORQ Z16, Z16, Z16
+	VPXORQ Z17, Z17, Z17
+	VPXORQ Z18, Z18, Z18
+	VPXORQ Z19, Z19, Z19
+	VPXORQ Z20, Z20, Z20
+	VPXORQ Z21, Z21, Z21
+	VPXORQ Z22, Z22, Z22
+	VPXORQ Z23, Z23, Z23
+	VPXORQ Z24, Z24, Z24
+	VPXORQ Z25, Z25, Z25
+	VPXORQ Z26, Z26, Z26
+	VPXORQ Z27, Z27, Z27
+	VPXORQ Z28, Z28, Z28
+	VPXORQ Z29, Z29, Z29
+	VPXORQ Z30, Z30, Z30
+	VPXORQ Z31, Z31, Z31
 
 	// Zero k0-k7
+	// While these are not categorized as zeroing idioms, having them
+	// operate on a single register per instruction makes it easy to
+	// understand what each instruction does.
+	// Note: for wider compatibility these could equally also be KXORW.
 	KXORQ K0, K0, K0
-	KXORQ K0, K0, K1
-	KXORQ K0, K0, K2
-	KXORQ K0, K0, K3
-	KXORQ K0, K0, K4
-	KXORQ K0, K0, K5
-	KXORQ K0, K0, K6
-	KXORQ K0, K0, K7
+	KXORQ K1, K1, K1
+	KXORQ K2, K2, K2
+	KXORQ K3, K3, K3
+	KXORQ K4, K4, K4
+	KXORQ K5, K5, K5
+	KXORQ K6, K6, K6
+	KXORQ K7, K7, K7
 
 noavx512:
 	// misc registers
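
The note in the diff that KXORW could be used instead of KXORQ can be checked with a small model. The sketch below assumes the documented behaviour of KXORW (XOR of the low 16 bits, with the upper bits of the destination mask register cleared); `kxorq` and `kxorw` are hypothetical Go functions written for this illustration, not runtime code. As far as the instruction set references describe it, KXORW needs only AVX512F while KXORQ needs AVX512BW, which is presumably the "wider compatibility" the comment refers to.

```go
package main

import "fmt"

// kxorq models KXORQ: a full 64-bit XOR of two mask registers.
func kxorq(a, b uint64) uint64 { return a ^ b }

// kxorw models KXORW: XOR of the low 16 bits of the sources, with
// bits 16-63 of the destination cleared (assumed from the ISA reference).
func kxorw(a, b uint64) uint64 { return uint64(uint16(a) ^ uint16(b)) }

func main() {
	// An arbitrary value with bits set in every 16-bit lane.
	k := uint64(0xDEADBEEFCAFEF00D)

	// XORing a register with itself clears all 64 bits in both models.
	fmt.Println(kxorq(k, k), kxorw(k, k))
}
```

Under this model either form zeroes the full mask register when source and destination are the same, so the choice between them affects only which CPU feature the instruction requires.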
