Test against CUDA wheels #368

Merged
merged 4 commits into from
Jan 12, 2025

Conversation

leofang (Member) commented Jan 9, 2025

Close #367.

For details, see #368 (comment) below.

leofang self-assigned this Jan 9, 2025

copy-pr-bot (bot, Contributor) commented Jan 9, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang added this to the cuda-python 12-next, 11-next milestone Jan 9, 2025
leofang added the CI/CD (CI/CD infrastructure), cuda.bindings (Everything related to the cuda.bindings module), and P0 (High priority - Must do!) labels and removed the cuda.bindings label Jan 9, 2025
leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang (Member, Author) commented Jan 9, 2025

Note: the 11.8 + CTK wheels tests fail because the [all] support has not yet been backported (#363 (comment)).

leofang (Member, Author) commented Jan 9, 2025

This PR is ready for review. Most of the commits come from #357, so that PR should be merged first.

After that, this PR will only have 2 commits:

  • 5015410: makes the test job a standalone reusable workflow, to keep the CI file from growing longer (more jobs will be added to it later)
  • 7f36568: adds a new local_ctk axis to the test matrix that chooses between a local CTK (provided by the mini ctk action) and CTK wheels
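
The effect of adding a local_ctk axis to a test matrix can be sketched as a plain cross product over the axes. This is a minimal Python illustration, not the workflow definition from this PR: only the axis name local_ctk comes from the comment above, while the other axis names, all axis values, and the "1"/"0" encoding are hypothetical placeholders.

```python
from itertools import product

# Hypothetical axis names and values for illustration only; just the
# `local_ctk` axis name is taken from the PR discussion.
AXES = {
    "python_version": ["3.9", "3.13"],
    "cuda_version": ["11.8", "12"],
    # Assumed encoding: "1" = locally installed CTK, "0" = CTK wheels.
    "local_ctk": ["1", "0"],
}


def expand_matrix(axes):
    """Return one dict per combination of axis values (a full cross product)."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]


jobs = expand_matrix(AXES)
for job in jobs:
    source = "local CTK" if job["local_ctk"] == "1" else "CTK wheels"
    print(f"py{job['python_version']} / CUDA {job['cuda_version']} / {source}")
```

Under these assumptions, a two-valued axis doubles the number of test jobs, since every existing configuration runs once per CTK source.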

leofang changed the title from "WIP: Add wheel-based test workflow" to "Test against CUDA wheels" Jan 9, 2025
leofang added the to-be-backported (Trigger the bot to raise a backport PR upon merge) label Jan 9, 2025
leofang marked this pull request as ready for review January 9, 2025 21:39
leofang (Member, Author) commented Jan 9, 2025

/ok to test

leofang (Member, Author) commented Jan 10, 2025

@vzhurba01 sorry for the noise elsewhere. Could you check the cuda.core test failure? I confirmed that the CI fetches the correct wheel from the 11.8.x branch (run_id 12695861564, see log) and that the nvrtc wheel is correctly installed, but we still hit dlopen or nvrtcVersion not found errors...
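
The two failure modes mentioned above (the library fails to load, or it loads but the symbol is missing) can be told apart with a small probe. The following is a rough sketch using Python's ctypes, not the check that cuda.core or the CI performs; the helper name probe_library is hypothetical and the library file name passed to it is an assumption that depends on the installed NVRTC version and platform.

```python
import ctypes


def probe_library(libname, symbol):
    """Try to dlopen `libname` and look up `symbol`; return a short status string."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError as exc:
        # Raised when the dynamic loader cannot find or load the library.
        return f"dlopen failed: {exc}"
    # ctypes resolves symbols lazily on attribute access and raises
    # AttributeError if the loaded library does not export the name.
    try:
        getattr(lib, symbol)
    except AttributeError:
        return f"loaded {libname}, but symbol {symbol} not found"
    return f"loaded {libname}, symbol {symbol} found"


# The file name below is an assumption for illustration; adjust it to the
# NVRTC library actually present in the environment.
print(probe_library("libnvrtc.so.11.2", "nvrtcVersion"))
```

A "dlopen failed" result points at the loader search path, whereas a missing symbol points at the wrong library being picked up.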

leofang (Member, Author) commented Jan 10, 2025

I noticed that, unlike in the main branch, the linker flag --disable-new-dtags for the RPATH hack was not showing up in the 11.8.x branch...
https://github.com/NVIDIA/cuda-python/actions/runs/12695861564/job/35388691369#step:9:844
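
Besides reading the build log, one way to check whether the flag took effect is to inspect the dynamic section of a built extension: GNU ld records the search path as RPATH when linking with --disable-new-dtags and as RUNPATH when new dtags are enabled. The sketch below only parses text in the shape printed by `readelf -d`; the helper name search_path_entries is hypothetical, and the sample text is made up for illustration rather than taken from the CI run.

```python
import re

# Matches `readelf -d` lines such as:
#  0x000000000000000f (RPATH)   Library rpath: [$ORIGIN/../lib]
_ENTRY = re.compile(r"\((RPATH|RUNPATH)\)\s+Library r(?:un)?path:\s*\[(.*?)\]")


def search_path_entries(readelf_dynamic_output):
    """Return a list of (tag, path) pairs found in `readelf -d` output."""
    return _ENTRY.findall(readelf_dynamic_output)


# Made-up sample in the shape of `readelf -d` output, not a real log.
sample = """
 0x0000000000000001 (NEEDED)             Shared library: [libexample.so.1]
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/../lib]
"""

entries = search_path_entries(sample)
print(entries)
```

An extension that reports RUNPATH instead of RPATH would suggest the flag was not passed to the linker.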

leofang (Member, Author) commented Jan 10, 2025

vzhurba01 (Collaborator) commented Jan 10, 2025

A backport of #355 should resolve that bug.

Edit: #376 was also needed.

vzhurba01 (Collaborator) commented

/ok to test

leofang closed this Jan 10, 2025
leofang reopened this Jan 10, 2025

leofang (Member, Author) commented Jan 10, 2025

/ok to test

leofang (Member, Author) commented Jan 10, 2025

#376 fixed it! This is ready now.

leofang: This comment was marked as outdated.

leofang (Member, Author) commented Jan 12, 2025

/ok to test

leofang (Member, Author) commented Jan 12, 2025

Since this touches only CI, not the main source code, and all jobs have passed, let me admin-merge it to get the CI infra work completed. The backport PR is #378.

leofang merged commit 57a710a into NVIDIA:main Jan 12, 2025
78 checks passed
leofang deleted the wheel_test branch January 12, 2025 19:26

Backport failed because this pull request contains merge commits. You can either backport this pull request manually, or configure the action to skip merge commits.
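
Whether a branch contains merge commits can be checked before attempting a backport. The sketch below is a hypothetical helper, not what the backport action does internally: it assumes the input is text in the form produced by `git log --format='%H %P'` (commit hash followed by its parent hashes) and treats any commit with more than one parent as a merge commit. The hashes in the sample are shortened and made up.

```python
def find_merge_commits(log_text):
    """Return hashes of commits that have more than one parent.

    `log_text` is expected to hold one commit per line in the form produced
    by `git log --format='%H %P'`: the commit hash followed by its parents.
    """
    merges = []
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) > 2:  # hash plus at least two parents
            merges.append(fields[0])
    return merges


# Shortened, made-up hashes for illustration only.
sample_log = """\
d4 c3 b2
c3 a1
b2 a1
a1
"""
print(find_merge_commits(sample_log))
```

Git can also list such commits directly with `git log --merges` over the relevant commit range.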

Labels: CI/CD (CI/CD infrastructure), P0 (High priority - Must do!), to-be-backported (Trigger the bot to raise a backport PR upon merge)

Linked issue (may be closed by merging this pull request): CI: Add a new test axis to test against CUDA wheels

2 participants