Commit b01e834
aarch64: Optimize SVE encode functions to use peak-performance vector combinations

Update both ec_encode_data_sve() and ec_encode_data_sve2() to use optimal
4 and 5 vector combinations based on benchmark results showing these
achieve the highest performance.

Key optimizations:
- Loop over 4-vector operations when rows > 7 (peak performance)
- Use 4+3 combination for 7 vectors instead of single 7-vector call
- Use 4+2 combination for 6 vectors instead of single 6-vector call
- Keep 5-vector for 5 vectors (second-best performance)
- Applies to both SVE and SVE2 variants for consistent optimization

This leverages the benchmark findings that 4 and 5 vector operations
achieve 40+ GB/s performance, significantly better than 6-7 vector
operations which drop to 30-36 GB/s.

Signed-off-by: Jonathan Swinney <[email protected]>

1 parent a00e9db commit b01e834
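
The row-splitting rules listed in the commit message can be sketched as a small planning function. This is only an illustration of the described strategy, not the code from the commit: `plan_chunks` and its parameters are hypothetical names, and the handling of 1 to 4 remaining rows (one call of exactly that width) is an assumption, since the message does not spell it out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the commit): split `rows` output
 * vectors into the per-call widths described in the commit message.
 * Writes the widths to `chunks` and returns how many were written.
 * If `max_chunks` is too small, the plan is silently truncated.
 */
static size_t plan_chunks(int rows, int *chunks, size_t max_chunks)
{
    size_t n = 0;

    assert(chunks != NULL);

    /* Peel off 4-wide calls while more than 7 rows remain. */
    while (rows > 7 && n < max_chunks) {
        chunks[n++] = 4;
        rows -= 4;
    }

    if ((rows == 6 || rows == 7) && n + 1 < max_chunks) {
        /* 7 -> 4+3 and 6 -> 4+2, instead of one 6- or 7-wide call. */
        chunks[n++] = 4;
        chunks[n++] = rows - 4;
    } else if (rows >= 1 && rows <= 5 && n < max_chunks) {
        /* 5 stays a single 5-wide call; 1..4 are assumed to be
         * handled by one call of that width. */
        chunks[n++] = rows;
    }
    return n;
}
```

Under these assumptions, 10 rows would be planned as 4, 4, 2 and 9 rows as 4, 5, so no call is ever wider than 5 vectors.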

1 file changed, +26 -66 lines changed

@@ -214,47 +214,27 @@ (13 lines added, 33 removed; line contents not present)
@@ -285,47 +265,27 @@ (13 lines added, 33 removed; line contents not present)