Any/all reductions without atomic aspect #1563

@antonwolfy

As far as I can see, #1557 added a check_atomic_support check to the reduction code of the pybind11 extension in dpctl.
Does this mean (and was it intended) that the any and all reductions now require atomics support for the given execution SYCL queue and USM type, and that there are no alternative kernels, as there were previously?

This question came up when I ran the existing dpnp tests on WSL and encountered failures with the latest dpctl due to the following:

import dpctl, dpctl.tensor as dpt

a = dpt.ones(10, usm_type="shared")
dpt.any(a)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[3], line 1
----> 1 dpt.any(a)

File ~/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/tensor/_utility_functions.py:126, in any(x, axis, keepdims)
    101 def any(x, axis=None, keepdims=False):
    102     """any(x, axis=None, keepdims=False)
    103
    104     Tests whether any input array elements evaluate to True along a given axis.
   (...)
    124             containing the results of the logical OR reduction.
    125     """
--> 126     return _boolean_reduction(x, axis, keepdims, tri._any)

File ~/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/tensor/_utility_functions.py:45, in _boolean_reduction(x, axis, keepdims, func)
     38 wait_list = []
     39 res_tmp = dpt.empty(
     40     res_shape,
     41     dtype=dpt.int32,
     42     usm_type=res_usm_type,
     43     sycl_queue=exec_q,
     44 )
---> 45 hev0, ev0 = func(
     46     src=x_tmp,
     47     trailing_dims_to_reduce=red_nd,
     48     dst=res_tmp,
     49     sycl_queue=exec_q,
     50 )
     51 wait_list.append(hev0)
     53 # copy to boolean result array

ValueError: This reduction is not supported for this device and usm_type.

dpctl.__version__
# Out: '0.17.0dev0+20.g16f23f7e04'

The reduction fails only on the GPU device (with both the OpenCL and Level Zero backends), and only for the "shared" and "host" USM types.
It works on the CPU device, and on any device with the "device" USM type.
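The device and USM type combinations described above could be checked systematically with a small probe. This is only a sketch: probe_matrix, run_any, and probe_dpctl are hypothetical helper names introduced here, and the sketch assumes that dpt.ones accepts a filter string through its device keyword and that the failure surfaces as the ValueError shown in the traceback.

```python
def probe_matrix(devices, usm_types, run):
    """Call run(device, usm_type) for every pair and record the outcome.

    Returns a dict mapping (device, usm_type) to "ok" or to the text of the
    ValueError that was raised.
    """
    results = {}
    for dev in devices:
        for usm in usm_types:
            try:
                run(dev, usm)
                results[(dev, usm)] = "ok"
            except ValueError as exc:
                results[(dev, usm)] = f"ValueError: {exc}"
    return results


def run_any(filter_string, usm_type):
    """Reproduce the failing call on one device / USM type pair."""
    import dpctl.tensor as dpt  # imported lazily so probe_matrix works without dpctl

    a = dpt.ones(10, usm_type=usm_type, device=filter_string)
    dpt.any(a)


def probe_dpctl():
    """Print the outcome for every root device reported by dpctl."""
    import dpctl

    devices = [d.filter_string for d in dpctl.get_devices()]
    outcome = probe_matrix(devices, ("device", "shared", "host"), run_any)
    for (dev, usm), status in outcome.items():
        print(f"{dev:20s} {usm:8s} {status}")
```

Calling probe_dpctl() on the affected machine should print one line per device and USM type, which makes it easy to compare against the behaviour reported here.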

The device info:

$ python -m dpctl -f
Platform  0 ::
    Name        Intel(R) OpenCL
    Version     OpenCL 3.0 LINUX
    Vendor      Intel(R) Corporation
    Backend     opencl
    Num Devices 1
      # 0
        Name                11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
        Version             2024.17.2.0.13_160000
        Filter string       opencl:cpu:0
Platform  1 ::
    Name        Intel(R) OpenCL Graphics
    Version     OpenCL 3.0
    Vendor      Intel(R) Corporation
    Backend     opencl
    Num Devices 1
      # 0
        Name                Intel(R) Graphics [0x9a49]
        Version             23.52.28202.16
        Filter string       opencl:gpu:0
Platform  2 ::
    Name        Intel(R) Level-Zero
    Version     1.3
    Vendor      Intel(R) Corporation
    Backend     ext_oneapi_level_zero
    Num Devices 1
      # 0
        Name                Intel(R) Graphics [0x9a49]
        Version             1.3.28202
        Filter string       level_zero:gpu:0
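
To complement the device listing, the atomics-related aspects of each device could be dumped as well. This is a sketch under assumptions: I have not confirmed which aspects check_atomic_support actually consults, so the attribute names below are simply the USM-atomics and 64-bit-atomics properties that dpctl.SyclDevice exposes as far as I know. atomic_aspects and report are hypothetical helper names.

```python
# Assumed dpctl.SyclDevice property names; whether check_atomic_support
# looks at exactly these aspects is a guess, not taken from #1557.
ASPECTS = (
    "has_aspect_usm_atomic_shared_allocations",
    "has_aspect_usm_atomic_host_allocations",
    "has_aspect_atomic64",
)


def atomic_aspects(device):
    """Map each aspect name to its value, or None if the attribute is missing."""
    return {name: getattr(device, name, None) for name in ASPECTS}


def report():
    """Print the atomics-related aspects of every root device."""
    import dpctl  # imported lazily so atomic_aspects works without dpctl

    for dev in dpctl.get_devices():
        print(dev.filter_string, atomic_aspects(dev))
```

If the GPU reports False for the shared and host allocation aspects while the CPU reports True, that would be consistent with the failures above, though it would not by itself prove that the new check is the cause.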
