Commit b0e9b00
[NVPTX] Make nvptx mma instructions convergent. (#96521)
We are running into the NVPTX backend generating wrong code for the following input:
```
%0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
if laneid == 0:
  ret
else:
  store %0
```
The backend reorders the instructions (as an effect of the `MachineSink` pass)
to
```
if laneid == 0:
  ret
else:
  %0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
  store %0
```
This is incorrect because `mma` is a warp-level instruction: all threads
in the warp need to sync before performing the operation, so it must not
be guarded by a condition on a specific thread id. In terms of warp-level
sync it should be treated like the shuffle instruction `shfl`, which is
already marked as `isConvergent = true`.
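To illustrate why the reordering matters, here is a minimal Python sketch. It is not the backend's code: `warp_collective_sum` is a hypothetical stand-in for a warp-wide operation such as `mma`, and the sketch assumes that lanes which have already returned simply do not contribute. How real `mma` behaves with missing lanes is not modeled; the point is only that sinking the operation below the branch changes which lanes take part.

```python
WARP_SIZE = 32  # lanes per warp


def warp_collective_sum(values, active):
    """Toy warp-wide op (hypothetical stand-in for `mma`).

    Every active lane receives the sum over all active lanes.
    Inactive lanes contribute nothing and receive None.
    """
    total = sum(v for v, on in zip(values, active) if on)
    return [total if on else None for on in active]


def original_order(values):
    """Collective op first, then `if laneid == 0: ret else: store`."""
    all_lanes = [True] * WARP_SIZE
    result = warp_collective_sum(values, all_lanes)
    # Lane 0 returns without storing; every other lane stores its result.
    return {lane: result[lane] for lane in range(WARP_SIZE) if lane != 0}


def sunk_order(values):
    """Branch first; the collective op was sunk into the `else` block."""
    # Lane 0 has already returned, so it no longer participates.
    active = [lane != 0 for lane in range(WARP_SIZE)]
    result = warp_collective_sum(values, active)
    return {lane: result[lane] for lane in range(WARP_SIZE) if lane != 0}


values = list(range(1, WARP_SIZE + 1))  # lane i holds the value i + 1
stored_before = original_order(values)
stored_after = sunk_order(values)
```

In this toy model the stored values differ between the two orderings because lane 0's input is missing after the sink. A convergent operation must not be made control-dependent on additional values, which is what rules out this sinking.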
Apply `isConvergent = true` to `mma` instructions.
1 parent 7ea63b9 commit b0e9b00
2 files changed, +30 -0 lines changed
- llvm
  - lib/Target/NVPTX
  - test/CodeGen/NVPTX
[Diff, llvm/lib/Target/NVPTX: 4 additions & 0 deletions; one line added at each of new lines 6728, 6750, 6780, and 6800.]
[Diff, llvm/test/CodeGen/NVPTX: new file, lines 1 to 26; 26 additions & 0 deletions.]