Commit d894a21

[SYCL][Fusion][Doc] Document reductions support (#12641)
Document reduction strategies supported by kernel fusion and how users should use reductions in their code.

Signed-off-by: Victor Perez <[email protected]>
1 parent eead989 commit d894a21

1 file changed (+36, -1 lines changed)

sycl/doc/design/KernelFusionJIT.md

Lines changed: 36 additions & 1 deletion
@@ -313,12 +313,47 @@ computing the private memory size. As range rounding only applies to basic
 kernels (parametrized by a `sycl::range`), local internalization is not affected
 by the range rounding transformation.
 
+### Reductions
+
+Kernel fusion of reductions is partially supported. In order to preserve the
+legality of the fused kernel, i.e., the fact that the fused kernel must perform
+the same work as the graph of kernels to be fused, only the fusion of the
+following reduction strategies is supported at the time of writing:
+
+- `group_reduce_and_last_wg_detection`
+- `local_atomic_and_atomic_cross_wg`
+- `range_basic`
+- `group_reduce_and_atomic_cross_wg`
+- `local_mem_tree_and_atomic_cross_wg`
+
+Other strategies require implicit inter-work-group synchronization, which is
+not supported in kernel fusion.
+
+Users may encounter errors, e.g., fusion being aborted or incorrect results due
+to race conditions or any other cause, when using the `sycl::reduction`
+interface. The SYCL runtime chooses different algorithms depending on the
+reduction operator, data type and hardware capabilities, so strategy selection
+is not possible through the regular interface. In this case, users can instead
+use `sycl::detail::reduction_parallel_for`, forcing a supported fusion
+strategy. The reductions implementation in
+[`sycl/reduction.hpp`](../../include/sycl/reduction.hpp) might give users an
+insight into which kind of reductions to use for their purposes:
+
+```c++
+q.submit([&](sycl::handler &cgh) {
+  sycl::accessor in(dataBuf, cgh, sycl::read_only);
+  // sycl::reduction is a function, so bind its result with auto
+  auto sum = sycl::reduction(sumBuf, cgh, sycl::plus<>{});
+  // Force supported 'group_reduce_and_last_wg_detection' strategy
+  sycl::detail::reduction_parallel_for<sycl::detail::auto_name,
+      sycl::detail::strategy::group_reduce_and_last_wg_detection>(...);
+});
+```
+
 ### Unsupported SYCL constructs
 
 The following SYCL API constructs are currently not officially supported for
 kernel fusion and should be considered untested/unsupported:
 
-- Reductions
 - `sycl::stream`
 - Specialization constants and `sycl::kernel_handler`
 - Images (`sycl::unsampled_image` and `sycl::sampled_image`)
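
As an illustration of the Reductions text added by this commit, here is a minimal host-side sketch of the pattern that the name `group_reduce_and_last_wg_detection` suggests. It is plain C++ with `std::thread` standing in for work-groups, not SYCL, and it is an assumption about the strategy's general shape rather than a copy of the code in `sycl/reduction.hpp`. The function name `reduce_with_last_group_detection` and all variable names are hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical host-side model: each std::thread plays the role of one
// work-group. This is NOT the DPC++ implementation, only a sketch of the idea.
int reduce_with_last_group_detection(const std::vector<int> &data,
                                     std::size_t num_groups) {
  if (num_groups == 0)
    return 0;

  std::vector<int> partials(num_groups, 0);
  std::atomic<std::size_t> groups_finished{0};
  int result = 0;

  auto work_group = [&](std::size_t group_id) {
    // Step 1: reduce this group's contiguous slice of the input.
    const std::size_t chunk = (data.size() + num_groups - 1) / num_groups;
    const std::size_t begin = std::min(group_id * chunk, data.size());
    const std::size_t end = std::min(begin + chunk, data.size());
    partials[group_id] =
        std::accumulate(data.begin() + static_cast<std::ptrdiff_t>(begin),
                        data.begin() + static_cast<std::ptrdiff_t>(end), 0);

    // Step 2: announce completion. The acquire-release read-modify-write
    // makes partial results published by earlier groups visible to later ones.
    const std::size_t finished_before =
        groups_finished.fetch_add(1, std::memory_order_acq_rel);

    // Step 3: only the group that finishes last combines the partial results.
    if (finished_before + 1 == num_groups)
      result = std::accumulate(partials.begin(), partials.end(), 0);
  };

  std::vector<std::thread> groups;
  for (std::size_t g = 0; g < num_groups; ++g)
    groups.emplace_back(work_group, g);
  for (auto &t : groups)
    t.join();
  return result;
}
```

In this sketch the only communication between groups is a single atomic counter, so no group ever waits for another one to reach a given point. That is the property the Reductions text relies on when it separates the supported strategies from those that need implicit inter-work-group synchronization.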
