
[DOCS] Clarify that cooperative is for device, parallel for host #522


Merged: 4 commits into NVIDIA:main on Mar 19, 2025

Conversation

gmarkall (Contributor) commented:

I'm not sure if this helps move things in the right direction towards a clear and succinct summary of these two modules, but I'm interested in others' thoughts.


copy-pr-bot bot commented Mar 17, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.


leofang previously approved these changes on Mar 17, 2025.
leofang requested review from rwgk and kkraus14 on Mar 17, 2025 at 16:47.
README.md (outdated)

Old:
* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): Pythonic exposure of CUB cooperative algorithms
* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): Pythonic exposure of Thrust parallel algorithms

New:
* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): Exposure of CUB cooperative algorithms for use in Python kernels
* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): Pythonic exposure of Thrust parallel algorithms launched from the host
Collaborator commented:

Question, based on this observation:

$ cd cccl/c/parallel
$ git grep -i '#include.*cub' | wc -l
44
$ git grep -i '#include.*thrust' | wc -l
0

I.e., currently thrust headers aren't used at all in cuda.parallel.

Could it be better to write "Pythonic exposure of CCCL parallel algorithms launched from the host"?

Contributor (Ashwin) commented:

How about being a bit more explicit here?

  • cuda.parallel: A Python package offering highly efficient and customizable implementations of parallel algorithms like sort, reduce, scan, transform, etc.,

  • cuda.cooperative: A Python package providing CUB's reusable block-wide and warp-wide primitives for use within Numba CUDA kernels.

I like putting cuda.parallel first as it's probably a more recognizable feature set compared to cuda.cooperative.
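
To make the kernel-side half of that wording concrete, here is a rough sketch of cuda.cooperative used inside a Numba CUDA kernel. The exact experimental API (the block.sum entry point, the .files link list, and whether temporary storage must be passed explicitly) is assumed here and may differ from the released package.

```python
# Illustrative sketch only: module paths and signatures are assumptions.
import numpy as np
from numba import cuda, int32
from cuda.cooperative.experimental import block

threads_per_block = 128

# Specialize a CUB block-wide reduction for int32 and this block size.
block_sum = block.sum(int32, threads_per_block)

@cuda.jit(link=block_sum.files)  # link the generated CUB wrapper into the kernel
def sum_kernel(data, result):
    # The block-wide primitive runs *inside* the kernel: each thread contributes one element.
    total = block_sum(data[cuda.threadIdx.x])
    if cuda.threadIdx.x == 0:
        result[0] = total

d_data = cuda.to_device(np.ones(threads_per_block, dtype=np.int32))
d_result = cuda.device_array(1, dtype=np.int32)
sum_kernel[1, threads_per_block](d_data, d_result)
print(d_result.copy_to_host()[0])  # expected: 128
```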

Collaborator commented:

I like Ashwin's version. Maybe replace "offering" with "for easy access to"?

Contributor commented:

I don't mind "for easy access to".

Member commented:

> currently thrust headers aren't used at all in cuda.parallel

An accurate reflection of implementation detail in the package name was not the intent. On the GPU side everything is actually CUB, and Thrust is essentially a set of C++-friendly wrappers on top of it. The intent was to capture the spirit that we get to answer the question "I need a sort that can be launched from the host, like thrust::sort" with "no problem, use cuda.parallel.sort."
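
To make the host-launched half equally concrete, a rough sketch of the cuda.parallel pattern follows. reduce_into is used because it was a documented entry point (sort above is only the motivating example), and the module path, call signature, and two-phase temporary-storage call are assumptions that may differ from the released package.

```python
# Illustrative sketch only: module path and call signature are assumptions.
import cupy as cp
import numpy as np
from cuda.parallel.experimental import algorithms

def add_op(a, b):
    return a + b

dtype = np.int32
h_init = np.asarray([0], dtype=dtype)            # initial value for the reduction
d_input = cp.asarray([1, 2, 3, 4], dtype=dtype)
d_output = cp.empty(1, dtype=dtype)

# Build the reduction once; everything below is launched from the host,
# with no user-authored kernel.
reduce_into = algorithms.reduce_into(d_output, d_input, add_op, h_init)

# CUB-style two-phase call: query temporary storage, allocate it, then run.
temp_bytes = reduce_into(None, d_input, d_output, len(d_input), h_init)
d_temp = cp.empty(temp_bytes, dtype=np.uint8)
reduce_into(d_temp, d_input, d_output, len(d_input), h_init)

print(d_output.get()[0])  # expected: 10
```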

rwgk (Collaborator) left a comment:


Looks like a big improvement!

Comment on lines 9 to 10
- `cuda.cooperative`_: Exposure of CUB cooperative algorithms for use in Python kernels
- `cuda.parallel`_: Pythonic exposure of Thrust parallel algorithms launched from the host
Collaborator commented:

Should we adopt the new README.md wording here, too?

gmarkall (author) commented:

Adopted.

README.md (outdated)

Old:
* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): Pythonic exposure of CUB cooperative algorithms
* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): Pythonic exposure of Thrust parallel algorithms

New:
* [cuda.cooperative](https://nvidia.github.io/cccl/cuda_cooperative/): A Python package providing CUB's reusable block-wide and warp-wide primitives for use within Numba CUDA kernels.
* [cuda.parallel](https://nvidia.github.io/cccl/cuda_parallel/): A Python package for easy access to highly efficient and customizable parallel algorithms, like `sort`, `scan`, `reduce`, `transform`, etc.
Collaborator commented:

No dot at end (for self-consistency within this file)?

gmarkall (author) commented:

Removed.

leofang (Member) commented on Mar 17, 2025:

Since we are touching these two files, I am thinking of also adding numba-cuda to the list. Thoughts?

rwgk (Collaborator) commented on Mar 19, 2025:

/ok to test

rwgk enabled auto-merge (squash) on Mar 19, 2025 at 14:51.


rwgk merged commit 4d0b5a6 into NVIDIA:main on Mar 19, 2025; all 74 checks passed.

leofang added the "documentation" (Improvements or additions to documentation) and "P1" (Medium priority - Should do) labels on Mar 24, 2025.
leofang added this to the cuda-python 12.9.0 & 11.8.7 milestone on Mar 24, 2025.