Commit c01a65f
committed
[AArch64] Improve codegen of vectorised early exit loops
Once PR llvm#112138 lands we are able to start vectorising more loops
that have uncountable early exits. The typical loop structure
looks like this:
vector.body:
...
%pred = icmp eq <2 x ptr> %wide.load, %broadcast.splat
...
%or.reduc = tail call i1 @llvm.vector.reduce.or.v2i1(<2 x i1> %pred)
%iv.cmp = icmp eq i64 %index.next, 4
%exit.cond = or i1 %or.reduc, %iv.cmp
br i1 %exit.cond, label %middle.split, label %vector.body
middle.split:
br i1 %or.reduc, label %found, label %notfound
found:
ret i64 1
notfound:
ret i64 0
The problem with this is that %or.reduc is kept live after the loop,
and since this is a boolean it typically requires making a copy of
the condition code register. For AArch64 this requires an additional
cset instruction, which is quite expensive for a typical find loop
that only contains 6 or 7 instructions.
This patch attempts to improve the codegen by sinking the reduction
out of the loop to the location of it's user. It's a lot cheaper to
keep the predicate alive if the type is legal and has lots of
registers for it. There is a potential downside in that a little
more work is required after the loop, but I believe this is worth
it since we are likely to spend most of our time in the loop.1 parent daaf62a commit c01a65f
File tree
2 files changed
+35
-12
lines changed- llvm
- lib/Target/AArch64
- test/CodeGen/AArch64
2 files changed
+35
-12
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
5092 | 5092 | | |
5093 | 5093 | | |
5094 | 5094 | | |
5095 | | - | |
| 5095 | + | |
| 5096 | + | |
| 5097 | + | |
| 5098 | + | |
| 5099 | + | |
| 5100 | + | |
5096 | 5101 | | |
5097 | 5102 | | |
5098 | 5103 | | |
5099 | 5104 | | |
| 5105 | + | |
5100 | 5106 | | |
5101 | 5107 | | |
5102 | 5108 | | |
5103 | 5109 | | |
5104 | 5110 | | |
5105 | 5111 | | |
5106 | 5112 | | |
| 5113 | + | |
| 5114 | + | |
| 5115 | + | |
| 5116 | + | |
| 5117 | + | |
| 5118 | + | |
| 5119 | + | |
| 5120 | + | |
| 5121 | + | |
| 5122 | + | |
| 5123 | + | |
| 5124 | + | |
| 5125 | + | |
| 5126 | + | |
| 5127 | + | |
| 5128 | + | |
| 5129 | + | |
5107 | 5130 | | |
5108 | 5131 | | |
5109 | 5132 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
95 | 95 | | |
96 | 96 | | |
97 | 97 | | |
98 | | - | |
| 98 | + | |
99 | 99 | | |
100 | | - | |
101 | | - | |
102 | | - | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
103 | 103 | | |
104 | 104 | | |
105 | | - | |
| 105 | + | |
106 | 106 | | |
107 | | - | |
108 | 107 | | |
109 | 108 | | |
110 | 109 | | |
111 | | - | |
112 | | - | |
| 110 | + | |
| 111 | + | |
113 | 112 | | |
114 | 113 | | |
115 | | - | |
| 114 | + | |
| 115 | + | |
116 | 116 | | |
117 | 117 | | |
118 | 118 | | |
| |||
147 | 147 | | |
148 | 148 | | |
149 | 149 | | |
150 | | - | |
151 | 150 | | |
152 | 151 | | |
153 | 152 | | |
154 | 153 | | |
155 | 154 | | |
156 | 155 | | |
157 | 156 | | |
158 | | - | |
| 157 | + | |
| 158 | + | |
159 | 159 | | |
160 | 160 | | |
161 | 161 | | |
| |||
0 commit comments