This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Commit 7f4d54d

zeux authored and dtig committed
Add v8x16.shuffle1 instruction (#71)
This change adds a variable shuffle instruction to the SIMD proposal. When indices are out of range, the result is specified as 0 for each lane. This matches hardware behavior on ARM and RISC-V architectures.

On x86_64 and MIPS, the hardware provides instructions that select 0 when the high bit is set to 1 (x86_64) or when either of the two high bits is set to 1 (MIPS). On these architectures, the backend is expected to emit a pair of instructions, a saturating add (saturate(x + (128 - 16)) for x86_64) followed by a permute, to emulate the proposed behavior.

To distinguish variable shuffles from immediate shuffles, the existing v8x16.shuffle instruction is renamed to v8x16.shuffle2_imm to make explicit that it shuffles two vectors with an immediate argument. This naming scheme allows for adding variants like v8x16.shuffle2 and v8x16.shuffle1_imm in the future.

Fixes #68. Contributes to #24. Fixes #11.
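
The x86_64 emulation described in the commit message can be modeled in plain Python to check that the bias of `128 - 16` behaves as intended. This is only a sketch: `pshufb_model` is a hypothetical stand-in for a byte permute that writes 0 when the index's high bit is set and otherwise uses the low four bits, and the saturating add is assumed to be unsigned. Neither helper comes from the proposal itself.

```python
LANES = 16  # v8x16: sixteen 8-bit lanes


def shuffle1_reference(a, s):
    """Proposed semantics: indices outside [0, 15] produce a 0 lane."""
    return [a[i] if i < LANES else 0 for i in s]


def saturating_add_u8(x, y):
    """Unsigned 8-bit saturating add (assumed flavor of the 'saturate' step)."""
    return min(x + y, 255)


def pshufb_model(a, s):
    """Hypothetical model of an x86-style byte permute:
    a set high bit selects 0, otherwise the low 4 bits index into a."""
    return [0 if (i & 0x80) else a[i & 0x0F] for i in s]


def shuffle1_x86_emulation(a, s):
    """Bias every index by 128 - 16 so that any index >= 16 ends up
    with its high bit set, then permute."""
    biased = [saturating_add_u8(i, 128 - 16) for i in s]
    return pshufb_model(a, biased)


if __name__ == "__main__":
    a = list(range(100, 116))
    # Walk every possible byte index value, 16 at a time.
    for base in range(0, 256, 16):
        s = list(range(base, base + 16))
        assert shuffle1_x86_emulation(a, s) == shuffle1_reference(a, s)
    print("emulation matches the reference for all 256 index values")
```

In-range indices 0..15 map to 112..127, which keeps the high bit clear and leaves the low four bits unchanged. Any index of 16 or more lands at 128 or above (saturating at 255), so the permute writes 0.
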
1 parent a289c58 commit 7f4d54d

3 files changed, +27 -7 lines changed


proposals/simd/BinarySIMD.md

Lines changed: 3 additions & 2 deletions
@@ -23,14 +23,13 @@ instr ::= ...
 ```
 
 Some SIMD instructions have additional immediate operands following `simdop`.
-The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
+The `v8x16.shuffle2_imm` instruction has 16 bytes after `simdop`.
 
 | Instruction | `simdop` | Immediate operands |
 | --------------------------|---------:|--------------------|
 | `v128.load` | `0x00`| m:memarg |
 | `v128.store` | `0x01`| m:memarg |
 | `v128.const` | `0x02`| i:ImmByte[16] |
-| `v8x16.shuffle` | `0x03`| s:LaneIdx32[16] |
 | `i8x16.splat` | `0x04`| - |
 | `i8x16.extract_lane_s` | `0x05`| i:LaneIdx16 |
 | `i8x16.extract_lane_u` | `0x06`| i:LaneIdx16 |
@@ -167,3 +166,5 @@ The `v8x16.shuffle` instruction has 16 bytes after `simdop`.
 | `f32x4.convert_u/i32x4` | `0xb0`| - |
 | `f64x2.convert_s/i64x2` | `0xb1`| - |
 | `f64x2.convert_u/i64x2` | `0xb2`| - |
+| `v8x16.shuffle1` | `0xc0`| - |
+| `v8x16.shuffle2_imm` | `0xc1`| s:LaneIdx32[16] |
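
The two new table rows can be turned into a small encoder sketch. The diff only shows the `simdop` values and the immediate layout. The `0xFD` prefix byte and the LEB128 encoding of `simdop` are assumptions about the rest of BinarySIMD.md that this diff does not show, and the function names are hypothetical.

```python
SIMD_PREFIX = 0xFD  # assumption: SIMD prefix byte, not shown in this diff

SIMDOP_SHUFFLE1 = 0xC0      # from the table: no immediates
SIMDOP_SHUFFLE2_IMM = 0xC1  # from the table: s:LaneIdx32[16]


def leb128_u32(value):
    """Unsigned LEB128 encoding (assumed encoding for simdop)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_shuffle1():
    """v8x16.shuffle1 takes both operands from the stack."""
    return bytes([SIMD_PREFIX]) + leb128_u32(SIMDOP_SHUFFLE1)


def encode_shuffle2_imm(lanes):
    """v8x16.shuffle2_imm is followed by 16 lane-index bytes, each in [0, 31]."""
    lanes = list(lanes)
    if len(lanes) != 16:
        raise ValueError("expected exactly 16 lane indices")
    if any(not 0 <= i < 32 for i in lanes):
        raise ValueError("lane indices must be in [0, 31]")
    return bytes([SIMD_PREFIX]) + leb128_u32(SIMDOP_SHUFFLE2_IMM) + bytes(lanes)


if __name__ == "__main__":
    print(encode_shuffle1().hex())
    print(encode_shuffle2_imm(range(16)).hex())
```

Under these assumptions both opcodes exceed `0x7f`, so each `simdop` occupies two LEB128 bytes.
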

proposals/simd/SIMD.md

Lines changed: 22 additions & 3 deletions
@@ -284,8 +284,8 @@ def S.replace_lane(a, i, x):
 The input lane value, `x`, is interpreted the same way as for the splat
 instructions. For the `i8` and `i16` lanes, the high bits of `x` are ignored.
 
-### Shuffle lanes
-* `v8x16.shuffle(a: v128, b: v128, imm: ImmLaneIdx32[16]) -> v128`
+### Shuffling using immediate indices
+* `v8x16.shuffle2_imm(a: v128, b: v128, imm: ImmLaneIdx32[16]) -> v128`
 
 Returns a new vector with lanes selected from the lanes of the two input vectors
 `a` and `b` specified in the 16 byte wide immediate mode operand `imm`. This
@@ -294,7 +294,7 @@ return. The indices `i` in range `[0, 15]` select the `i`-th element of `a`. The
 indices in range `[16, 31]` select the `i - 16`-th element of `b`.
 
 ```python
-def S.shuffle(a, b, s):
+def S.shuffle2_imm(a, b, s):
     result = S.New()
     for i in range(S.Lanes):
         if s[i] < S.lanes:
@@ -304,6 +304,25 @@ def S.shuffle(a, b, s):
     return result
 ```
 
+### Shuffling using variable indices
+* `v8x16.shuffle1(a: v128, s: v128) -> v128`
+
+Returns a new vector with lanes selected from the lanes of the first input
+vector `a` specified in the second input vector `s`. The indices `i` in range
+`[0, 15]` select the `i`-th element of `a`. For indices outside of the range
+the resulting lane is 0.
+
+```python
+def S.shuffle1(a, s):
+    result = S.New()
+    for i in range(S.Lanes):
+        if s[i] < S.lanes:
+            result[i] = a[s[i]]
+        else:
+            result[i] = 0
+    return result
+```
+
 ## Integer arithmetic
 
 Wrapping integer arithmetic discards the high bits of the result.
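
The spec pseudocode in this diff (`def S.shuffle1`, `def S.shuffle2_imm`) is not valid Python as written. A runnable approximation, assuming a `v128` is represented as a plain list of sixteen byte values, might look like the following. The list representation and the range check on immediates are choices made for this sketch, not part of the proposal text.

```python
LANES = 16  # v8x16: sixteen 8-bit lanes


def shuffle2_imm(a, b, imm):
    """Select lanes from a (indices 0..15) or b (indices 16..31).
    The indices are immediates, so out-of-range values are rejected here."""
    if len(imm) != LANES or any(not 0 <= i < 2 * LANES for i in imm):
        raise ValueError("imm must hold 16 indices in [0, 31]")
    return [a[i] if i < LANES else b[i - LANES] for i in imm]


def shuffle1(a, s):
    """Select lanes from a using runtime indices in s.
    Any index outside [0, 15] yields a 0 lane."""
    return [a[i] if i < LANES else 0 for i in s]


if __name__ == "__main__":
    a = list(range(0x10, 0x20))
    b = list(range(0xA0, 0xB0))

    # Byte reversal with runtime indices.
    reverse = list(range(15, -1, -1))
    assert shuffle1(a, reverse) == a[::-1]

    # Out-of-range runtime indices zero the lane.
    assert shuffle1(a, [0, 16, 255] + [1] * 13)[:3] == [0x10, 0, 0]

    # Interleave the low halves of a and b with immediate indices.
    interleave = [lane for i in range(8) for lane in (i, 16 + i)]
    print([hex(x) for x in shuffle2_imm(a, b, interleave)])
```

One consequence of the zeroing rule: for indices below 32, `shuffle1(a, s)` gives the same result as `shuffle2_imm(a, zeros, s)` where `zeros` is an all-zero vector.
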

proposals/simd/TextSIMD.md

Lines changed: 2 additions & 2 deletions
@@ -20,8 +20,8 @@ The canonical text format used for printing `v128.const` instructions is
 v128.const i32x4 0xNNNNNNNN 0xNNNNNNNN 0xNNNNNNNN 0xNNNNNNNN
 ```
 
-### v8x16.shuffle
+### v8x16.shuffle2_imm
 
 ```
-v8x16.shuffle i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5
+v8x16.shuffle2_imm i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5 i5
 ```
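
The text format above lists sixteen `i5` operands, which is read here as sixteen 5-bit lane indices. A minimal sketch of a printer for the renamed instruction follows. The function name is hypothetical, and rendering each index as a decimal literal is an assumption, since the diff does not say which literal forms are canonical.

```python
def format_shuffle2_imm(lanes):
    """Render v8x16.shuffle2_imm with its 16 immediate lane indices.
    Each index must fit in 5 bits (0..31)."""
    lanes = list(lanes)
    if len(lanes) != 16:
        raise ValueError("expected exactly 16 lane indices")
    if any(not 0 <= i < 32 for i in lanes):
        raise ValueError("each lane index must fit in 5 bits (0..31)")
    return "v8x16.shuffle2_imm " + " ".join(str(i) for i in lanes)


if __name__ == "__main__":
    # Identity shuffle: every lane taken from the first operand.
    print(format_shuffle2_imm(range(16)))
    # prints: v8x16.shuffle2_imm 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
```
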
