Skip to content

Conversation

ndgrigorian
Copy link
Collaborator

@ndgrigorian ndgrigorian commented Feb 22, 2024

This pull request further modularizes reduction code and fully moves boolean reductions (any and all) functionality into reductions.hpp.

Sequential reductions now have a submit_sequential_reduction wrapper, reducing repetitious kernel calls.

Boolean reductions now use the same basic kernels as other reductions.

Additionally, reduction type dispatching has been moved into individual reduction files, i.e., reductions/max.cpp now contains the dispatch factories, and so on for all reductions. This significantly reduces the amount of code in kernels/reductions.hpp.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you opening the PR as a draft?

Copy link

github-actions bot commented Feb 22, 2024

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

@coveralls
Copy link
Collaborator

coveralls commented Feb 22, 2024

Coverage Status

coverage: 91.148%. remained the same
when pulling 713a177 on further-reduction-refactoring
into 7ab3731 on master.

Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_21 ran successfully.
Passed: 907
Failed: 0
Skipped: 88

Use `sycl_utils::Is[Op]` functions rather than checking if a reduction operator is a sycl operator
Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_22 ran successfully.
Passed: 904
Failed: 3
Skipped: 88

@oleksandr-pavlyk
Copy link
Contributor

New test failure:

_________________________ test_max_min_nan_propagation _________________________

    def test_max_min_nan_propagation():
        get_queue_or_skip()
    
        # float, finites
        x = dpt.arange(4, dtype="f4")
        x[0] = dpt.nan
>       assert dpt.isnan(dpt.max(x))
E       AssertionError: assert usm_ndarray(False)
E        +  where usm_ndarray(False) = <UnaryElementwiseFunc 'isnan'>(usm_ndarray(-inf, dtype=float32))
E        +    where <UnaryElementwiseFunc 'isnan'> = dpt.isnan
E        +    and   usm_ndarray(-inf, dtype=float32) = <function max at 0x7fbe36c69580>(usm_ndarray([nan,  1.,  2.,  3.], dtype=float32))
E        +      where <function max at 0x7fbe36c69580> = dpt.max

/usr/share/miniconda/envs/test_dpctl/lib/python3.11/site-packages/dpctl/tests/test_usm_ndarray_reductions.py:140: AssertionError

@ndgrigorian ndgrigorian force-pushed the further-reduction-refactoring branch from 2b43713 to e4a9344 Compare February 22, 2024 19:09
@ndgrigorian
Copy link
Collaborator Author

New test failure:

_________________________ test_max_min_nan_propagation _________________________

    def test_max_min_nan_propagation():
        get_queue_or_skip()
    
        # float, finites
        x = dpt.arange(4, dtype="f4")
        x[0] = dpt.nan
>       assert dpt.isnan(dpt.max(x))
E       AssertionError: assert usm_ndarray(False)
E        +  where usm_ndarray(False) = <UnaryElementwiseFunc 'isnan'>(usm_ndarray(-inf, dtype=float32))
E        +    where <UnaryElementwiseFunc 'isnan'> = dpt.isnan
E        +    and   usm_ndarray(-inf, dtype=float32) = <function max at 0x7fbe36c69580>(usm_ndarray([nan,  1.,  2.,  3.], dtype=float32))
E        +      where <function max at 0x7fbe36c69580> = dpt.max

/usr/share/miniconda/envs/test_dpctl/lib/python3.11/site-packages/dpctl/tests/test_usm_ndarray_reductions.py:140: AssertionError

This was caused by the atomic fetch_min/max operations being called for reductions which used our custom implementations.

To fix this, I added more utilities for checking if an operator is specifically a sycl op. That way if a work-around is needed for reduce_over_group in the future, but we are still using the sycl op, we can leverage atomics better.

These allow atomic reductions to use custom_reduce_over_group with the faster `fetch_[op]` atomic operations
@ndgrigorian ndgrigorian force-pushed the further-reduction-refactoring branch from e4a9344 to 713a177 Compare February 22, 2024 19:27
Copy link

Array API standard conformance tests for dpctl=0.17.0dev0=py310h15de555_23 ran successfully.
Passed: 906
Failed: 1
Skipped: 88

Copy link
Contributor

@oleksandr-pavlyk oleksandr-pavlyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT! Thank you @ndgrigorian !

@ndgrigorian ndgrigorian merged commit 577b98d into master Feb 22, 2024
@ndgrigorian ndgrigorian deleted the further-reduction-refactoring branch February 29, 2024 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants