[Docs]: Fix incorrect index usage in vector_add example in documentation #1086
Thanks for catching the issue @XXXXRT666, LGTM! It seems we only caught the issue partially in #582. FWIW this bug also exists in the sample https://github.com/NVIDIA/cuda-python/blob/main/cuda_core/examples/vector_add.py on which the doc is based, would you like to fix it as well?
Also please kindly add a release note entry to
I think I have fixed all the
Done
/ok to test c7b9d6e
pre-commit.ci autofix
/ok to test efcdc6b
@XXXXRT666 thanks for your contribution! Could you please push a commit that is signed-off (https://git-scm.com/docs/git-commit#Documentation/git-commit.txt---signoff), which indicates your agreement to our DCO (https://github.com/NVIDIA/cuda-python/blob/main/CONTRIBUTING.md#developer-certificate-of-origin-dco)? This is required before we're able to merge your pull request.
Signed-off-by: XXXXRT666 <[email protected]>
Sure, I’ve added a signed-off commit, does this look OK?
Perfect, thank you!
/ok to test f397f57
Description
The example kernel in the documentation uses `C[tid] = A[tid] + B[tid]` inside a loop; this should instead be `C[i] = A[i] + B[i]`, to correctly handle strided indexing across threads.
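To illustrate why the index matters, here is a minimal pure-Python sketch that emulates a grid-stride loop serially. It is not the `cuda.core` sample itself: `vector_add_sim`, `n_threads` (standing in for `gridDim.x * blockDim.x`), and `use_loop_index` are hypothetical names used only for this illustration.

```python
def vector_add_sim(a, b, n_threads, use_loop_index):
    """Serially emulate a grid-stride vector add.

    n_threads is an assumed stand-in for gridDim.x * blockDim.x.
    use_loop_index=True indexes with the loop variable i (the fix);
    use_loop_index=False indexes with tid (the behavior reported as a bug).
    """
    n = len(a)
    c = [None] * n  # None marks elements that were never written
    for tid in range(n_threads):            # one iteration per emulated thread
        for i in range(tid, n, n_threads):  # grid-stride loop for this thread
            idx = i if use_loop_index else tid
            c[idx] = a[idx] + b[idx]
    return c


a = list(range(8))
b = [10 * x for x in a]

# Fewer emulated threads (4) than elements (8), so each thread must stride.
buggy = vector_add_sim(a, b, n_threads=4, use_loop_index=False)
fixed = vector_add_sim(a, b, n_threads=4, use_loop_index=True)

print(buggy)
print(fixed)
```

When there are at least as many threads as elements, both variants give the same result, which is probably why the issue is easy to miss; the difference only shows up once each thread has to cover more than one element.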