XXXXRT666
Contributor

@XXXXRT666 XXXXRT666 commented Oct 6, 2025

Description

The example kernel in the documentation uses C[tid] = A[tid] + B[tid] inside a loop; it should instead use C[i] = A[i] + B[i] so that strided indexing across threads is handled correctly.
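
To illustrate why the index matters, here is a minimal pure-Python simulation of a grid-stride loop. It is only a sketch, not the CUDA kernel from the docs: the function name `vector_add`, the `buggy` flag, and the assumption of a 1D launch with `num_threads` total threads are all hypothetical and chosen for illustration.

```python
# Pure-Python simulation of a grid-stride loop (illustrative sketch only;
# this is not the actual kernel from the documentation).
# Assumption: a 1D launch with `num_threads` total threads, where each thread
# starts at its global index `tid` and advances by `num_threads` per iteration.

def vector_add(a, b, num_threads, buggy=False):
    n = len(a)
    c = [None] * n  # None marks elements that no thread ever wrote
    for tid in range(num_threads):            # one pass per simulated thread
        for i in range(tid, n, num_threads):  # the grid-stride loop
            if buggy:
                c[tid] = a[tid] + b[tid]      # ignores the loop index i
            else:
                c[i] = a[i] + b[i]            # uses the loop index i
    return c

a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]

fixed = vector_add(a, b, num_threads=4)
broken = vector_add(a, b, num_threads=4, buggy=True)

print(fixed)
print(broken)
```

In this simulation, when `num_threads` is at least `len(a)` each thread runs at most one iteration with `i == tid`, so both variants give the same result. The difference only shows up once there are fewer threads than elements, which may be why the problem is easy to miss.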

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Contributor

copy-pr-bot bot commented Oct 6, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@leofang
Member

leofang commented Oct 6, 2025

Thanks for catching the issue @XXXXRT666, LGTM! It seems we only caught the issue partially in #582. FWIW this bug also exists in the sample https://github.com/NVIDIA/cuda-python/blob/main/cuda_core/examples/vector_add.py on which the doc is based, would you like to fix it as well?

@leofang leofang added the bug (Something isn't working), P0 (High priority - Must do!), cuda.core (Everything related to the cuda.core module), and example (Improvements or additions to code examples) labels Oct 6, 2025
@leofang leofang added this to the cuda.core beta 7 milestone Oct 6, 2025
@leofang
Member

leofang commented Oct 6, 2025

Also please kindly add a release note entry to cuda_core/docs/source/release/0.X.Y-notes.rst 🙂

@XXXXRT666
Contributor Author

> FWIW this bug also exists in the sample https://github.com/NVIDIA/cuda-python/blob/main/cuda_core/examples/vector_add.py on which the doc is based, would you like to fix it as well?

I think I have fixed all the C[tid] = A[tid] + B[tid] problems now. Thanks!

> Also please kindly add a release note entry to cuda_core/docs/source/release/0.X.Y-notes.rst 🙂

Done

leofang
leofang previously approved these changes Oct 6, 2025
@leofang leofang added the documentation (Improvements or additions to documentation) label Oct 6, 2025
@leofang
Member

leofang commented Oct 6, 2025

/ok to test c7b9d6e

@leofang
Member

leofang commented Oct 6, 2025

pre-commit.ci autofix

@leofang
Member

leofang commented Oct 6, 2025

/ok to test efcdc6b


github-actions bot commented Oct 6, 2025

@kkraus14
Collaborator

kkraus14 commented Oct 6, 2025

@XXXXRT666 thanks for your contribution! Could you please push a commit that is signed off (https://git-scm.com/docs/git-commit#Documentation/git-commit.txt---signoff), which indicates your agreement to our DCO (https://github.com/NVIDIA/cuda-python/blob/main/CONTRIBUTING.md#developer-certificate-of-origin-dco)?

This is required before we're able to merge your pull request.

Signed-off-by: XXXXRT666 <[email protected]>
@XXXXRT666
Contributor Author

Sure, I’ve added a signed-off commit, does this look OK?

@kkraus14
Collaborator

kkraus14 commented Oct 6, 2025

> Sure, I’ve added a signed-off commit, does this look OK?

Perfect, thank you!

@kkraus14
Collaborator

kkraus14 commented Oct 6, 2025

/ok to test f397f57

@kkraus14 kkraus14 enabled auto-merge (squash) October 6, 2025 16:24
@kkraus14 kkraus14 merged commit 9b7821a into NVIDIA:main Oct 6, 2025
53 of 61 checks passed

Labels

bug (Something isn't working), cuda.core (Everything related to the cuda.core module), documentation (Improvements or additions to documentation), example (Improvements or additions to code examples), P0 (High priority - Must do!)

3 participants