Skip to content

dpnp.add() doesn't work properly with a scalar #1288

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 11, 2023

Conversation

antonwolfy
Copy link
Contributor

@antonwolfy antonwolfy commented Feb 3, 2023

Since the change implemented in #1201 and until now, implicit operation + of array with a scalar or explicit one through dpnp.add call leads to raising of the not-implemented exception in dpnp. This is because dpnp was falling back on NumPy implementation in that use case, but has to call an internal kernel function from the backend.
The issue was reported in #1273.

The PR proposes to copy a scalar data from host memory into USM allocated memory, which means to create zero-dimension dpnp array from the scalar. This way the add of array with a scalar will be handled in the same way as addition of two arrays, which already works fine in dpnp.

Note that if some input argument is a scalar and another one is an array, memory for the scalar will be allocated on the same SYCL device and with the same USM type as the array has. It's required to proceed with further computing and to be compliant with compute follows data paradigm (all input data has to reside on the same device).

Also, minor fixes in dpnp.multiply implementation were included in the PR:

  • use oneapi::mkl::vm::mul() also for complex types
  • fallback on numpy or raise an exception if both inputs are scalars
  • control USM type of output array to be compliant with compute follows data paradigm

A common macro MACRO_2ARG_3TYPES_OP used in all elementwise arithmetical operation was refactor to get rid of excess memory copy in shared memory.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • If this PR is a work in progress, are you filing the PR as a draft?

@antonwolfy antonwolfy self-assigned this Feb 3, 2023
@antonwolfy antonwolfy force-pushed the add_by_scalar branch 3 times, most recently from f7a0f23 to c49ab5b Compare February 9, 2023 10:15
Copy link
Contributor

@oleksandr-pavlyk oleksandr-pavlyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Built, ran few examples. Looks good! Thank you @antonwolfy

@antonwolfy antonwolfy merged commit 863f587 into IntelPython:master Feb 11, 2023
@antonwolfy antonwolfy deleted the add_by_scalar branch February 11, 2023 00:41
antonwolfy added a commit that referenced this pull request Feb 23, 2023
* Complete support of python 3.10 in external CI (#1269)

* Set minimum required versions & fix debug building (#1270)

* Set minimum required versions & fix debug building

* Fix typo

* Add support of NumPy 1.24 (#1276)

* Set minimum required versions & fix debug building

* Add support of numpy 1.24

* Get rid of 'has_aspect_host' property in tests (#1274)

* Set minimum required versions & fix debug building

* Get rid of 'has_aspect_host' property in tests

* Update tests/test_sycl_queue.py

Co-authored-by: Oleksandr Pavlyk <[email protected]>

Co-authored-by: Oleksandr Pavlyk <[email protected]>

* Add support of dpnp.less_equal() (#1275)

* Set minimum required versions & fix debug building

* Add support of dpnp.less_equal()

* Test no broadcast together with input shapes

* Add support of comparison operations (#1278)

* Use eye() function from dpctl.tensor. (#1271)

* Use eye() function from dpctl.tensor.

* Add missed order in test for eye() function.

* Updated copyright year. Added parameter like for eye() function.

* Removed input argumet additional kwards for eye() function.

* Get rid of unsupported types in array creation tests (#1283)

* Add support of logical comparison operations (#1280)

* Add device and sycl_queue keyword arguments to random calls (#1277)

* Set minimum required versions & fix debug building

* Add device and sycl_queue keyword arguments to random calls

* Add device and sycl_queue to dpnp.random.seed() & use random values if seed is None

* Update dpnp/random/dpnp_iface_random.py

Co-authored-by: Oleksandr Pavlyk <[email protected]>

---------

Co-authored-by: Oleksandr Pavlyk <[email protected]>

* add __repr__

* add __str__

* reviewer's comments

* Fixed gh-1272 (#1287)

* Support high=None in dpnp.randint() (#1284)

* linter changes applied

* Add operation __index__ and __complex__ (#1285)

* Add operation __index__ and __complex__

* Add tests

* Update tests with proper call of dpctl.SyclQueue() (#1290)

* Update minimum required versions of dependent components. (#1289)

* dpnp.add() doesn't work properly with a scalar (#1288)

* dpnp.add() doesn't work properly with a scalar

* get rid of dpctl.SyclQueue() call in tests with unsupported device keyword

* Add a fix for crash on CPU device

* USM type in operations with a scalar

* Porting fix for crash to logic kernel

* Add dlpack support with tests and docstrings

* Add a test for dlpack with dpt

* Fix remarks, add _create_from_usm_ndarray func and move tests to test_sycl_queue

* Use tril() and triu() function from dpctl.tensor (#1286)

* Use tril() function from dpctl.tensor

* Use triu() function from dpctl.tensor

* Changed tests for tril() and triu() functions.

* Skip tests for tril() and triu() functions with usm_type.

* dpnp.subtract() doesn't work properly with a scalar (#1292)

* dpnp.add() doesn't work properly with a scalar

* dpnp.subtract() doesn't work properly with a scalar

* USM type in operations with a scalar

* Rollback excluded 'floor_divide' tests from skip scope

* Explicit vector operations instead of saturation functions

* Use std::int32_t and std::int64_t types

* Tune tail's loop of kernel for the vector op

* dpnp.divide() doesn't work properly with a scalar (#1295)

* dpnp.add() doesn't work properly with a scalar

* dpnp.subtract() doesn't work properly with a scalar

* dpnp.divide() doesn't work properly with a scalar

* dpnp.divide() doesn't work properly with a scalar

* Use std::int32_t and std::int64_t types

* Disable floating-point optimizations that assume arguments and results are not NaNs or +-Inf

* Fix issue with divide on Iris Xe

* Updae example3 building from debug build script (#1298)

* Remove temporary solution accepting CFD with equal SYCL context instead of queue (#1303)

* Intel LLVM is to use conda's gcc toolchain, sysroot and target libraries (#1306)

* Tests are crashing if no default device (#1311)

* Setting version to 0.11.1 (#1308)

---------

Co-authored-by: Oleksandr Pavlyk <[email protected]>
Co-authored-by: Natalia Polina <[email protected]>
Co-authored-by: Vahid Tavanashad <[email protected]>
Co-authored-by: Vladislav Perevezentsev <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants