CI: Windows GPU runners do not stop on error #483


Open
leofang opened this issue Mar 3, 2025 · 1 comment
Labels: bug (Something isn't working), CI/CD (CI/CD infrastructure), P0 (High priority - Must do!)

Comments

@leofang
Member

leofang commented Mar 3, 2025

In this CI run we hit a bizarre "NVRTC not found" error at test time, even though NVRTC should have been installed before the tests ran. It turns out that PowerShell swallows any pip install failure (this happens because of #482), so the dependencies (including NVRTC) were never installed successfully:
https://github.com/NVIDIA/cuda-python/actions/runs/13623016730/job/38075976144?pr=423#step:18:39

It looks like we hit a known runner issue that was closed without a proper fix: actions/runner-images#6668. The recommendation there was to switch to the bash shell. I'd love to do that too, since it would let us stop maintaining two versions of the workflows, but it is not possible on GH-hosted Windows GPU runners.
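
To illustrate the failure mode, here is a minimal sketch. It assumes a Windows host, and `cmd /c "exit 1"` is only a stand-in for the failing pip install; none of this is taken from the actual workflow. With `$ErrorActionPreference = 'Stop'` alone, a native command that exits non-zero does not stop the script:

```powershell
# 'Stop' makes cmdlet errors terminating, but by itself it does not react
# to the exit codes of native executables such as pip.
$ErrorActionPreference = 'Stop'

# Stand-in for a failing `pip install`: a native command that exits with code 1.
cmd /c "exit 1"

# Execution still reaches this line; the failure is only visible in $LASTEXITCODE.
Write-Output "still running, LASTEXITCODE = $LASTEXITCODE"
```

As far as I understand, the step status then depends only on the exit code of the last native command in the script, so an earlier failed install can go unnoticed until test time.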

@leofang added the bug, CI/CD, and P0 labels Mar 3, 2025
@leofang changed the title from "CI: Windows runners do not stop on error" to "CI: Windows GPU runners do not stop on error" Mar 3, 2025
@leofang
Member Author

leofang commented Mar 3, 2025

Not sure if setting this at the beginning of a workflow would help:

# Stop the script when a cmdlet or a native command fails
$ErrorActionPreference = 'Stop'
$PSNativeCommandUseErrorActionPreference = $true
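
One caveat, as far as I know: `$PSNativeCommandUseErrorActionPreference` only takes effect on PowerShell 7.4 and later (it was an experimental feature in 7.3). On older hosts the assignment has no effect. A version-independent fallback could look like the sketch below. `Invoke-Native` is a hypothetical helper name, not something that exists in the workflows, and the requirements file in the usage comment is a placeholder.

```powershell
# Hypothetical helper: run a native command and throw if it exits non-zero.
# It checks $LASTEXITCODE itself, so it does not rely on the PowerShell version.
function Invoke-Native {
    param(
        [Parameter(Mandatory)]
        [scriptblock] $Command
    )

    # Reset first so a stale exit code from an earlier command is not misread.
    $global:LASTEXITCODE = 0

    & $Command

    if ($LASTEXITCODE -ne 0) {
        throw "Native command failed with exit code ${LASTEXITCODE}: $Command"
    }
}

# Example usage (placeholder file name):
#   Invoke-Native { pip install -r requirements.txt }
```

Wrapping each install command this way turns a non-zero exit code into a terminating error, so the step fails at the install instead of later at test time.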
