Contributor

@cpcloud cpcloud commented Oct 22, 2025

This PR should address the issue of cancelled-due-to-failure-but-still-times-out shenanigans present in GitHub Actions. Details and links are in the comment.

Contributor

copy-pr-bot bot commented Oct 22, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

Contributor Author

cpcloud commented Oct 22, 2025

/ok to test

@cpcloud cpcloud force-pushed the run-always-but-fail-on-cancelled branch from c8e96c3 to 0f834c2 Compare October 22, 2025 15:27
Contributor Author

cpcloud commented Oct 22, 2025

/ok to test

Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This PR attempts to fix a GitHub Actions workflow issue where cancelled jobs (triggered by critical failures in dependencies) would cause timeouts but paradoxically allow builds to pass. The change modifies the final status check job in ci.yml to: (1) always run regardless of dependency status (via if: always()), (2) explicitly check whether any job it depends on was cancelled, and (3) fail the build if a cancellation is detected. The workflow uses a final gate job pattern in which the checks job depends on test-linux-64, test-linux-aarch64, test-windows, and doc. This pattern is common in repos that need a single mergeable status check across multiple parallel job matrices.

PR Description Notes:

  • The PR description references "Details and links are in the comment" but no such comment is visible in the provided metadata.

Critical Issues

Syntax Error: Bash cannot evaluate GitHub Actions expressions directly

Lines 222-225 contain a fatal flaw—the code attempts to use GitHub Actions template syntax ${{ ... }} inside a bash if statement:

if ${{ needs.test-linux-64.result == 'cancelled' || ... }}; then

This will fail because:

  1. GitHub Actions expressions are evaluated during workflow compilation, before the runner executes bash
  2. The bash interpreter will receive malformed syntax like if true || false; then (literal boolean tokens, not valid bash)
  3. The correct pattern is to interpolate each expression as a string and use bash string comparison:
if [[ "${{ needs.test-linux-64.result }}" == "cancelled" ]] || \
   [[ "${{ needs.test-linux-aarch64.result }}" == "cancelled" ]] || \
   [[ "${{ needs.test-windows.result }}" == "cancelled" ]] || \
   [[ "${{ needs.doc.result }}" == "cancelled" ]]; then
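
The string-comparison pattern above can be tried outside a workflow. The following is a minimal sketch, not the code from this PR: the function names any_cancelled and check_gate are hypothetical, and the four positional arguments stand in for the values that GitHub Actions would substitute for each "${{ needs.<job>.result }}" before bash runs the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 (true) if any of the four job results
# is the literal string "cancelled". The arguments are stand-ins for the
# interpolated values of needs.<job>.result.
any_cancelled() {
  local linux64="$1" aarch64="$2" windows="$3" doc="$4"
  if [[ "$linux64" == "cancelled" ]] || \
     [[ "$aarch64" == "cancelled" ]] || \
     [[ "$windows" == "cancelled" ]] || \
     [[ "$doc" == "cancelled" ]]; then
    return 0
  fi
  return 1
}

# Hypothetical gate: fails (non-zero status) when a dependency was cancelled,
# which is what would make the status check itself fail.
check_gate() {
  if any_cancelled "$@"; then
    echo "a dependency was cancelled" >&2
    return 1
  fi
  echo "no dependency was cancelled"
}
```

For example, `check_gate success cancelled success success` returns a non-zero status, while `check_gate success success success success` returns zero. This sketch only looks for cancelled results; how failure or skipped results are treated is left to the rest of the workflow.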

Alternative approach: Use the if: condition at the step level with pure GitHub Actions expression syntax (no bash):

- name: Fail if any dependency was cancelled
  if: |
    needs.test-linux-64.result == 'cancelled' ||
    needs.test-linux-aarch64.result == 'cancelled' ||
    needs.test-windows.result == 'cancelled' ||
    needs.doc.result == 'cancelled'
  run: exit 1

This will prevent merges when dependencies are cancelled, which is the intended behavior.

Confidence Score

1 out of 5 — The PR will not achieve its goal due to the syntax error. The logic is sound, but the implementation is broken and must be corrected before merge.

1 file reviewed, no comments

@cpcloud cpcloud force-pushed the run-always-but-fail-on-cancelled branch from 0f834c2 to 83a09d3 Compare October 22, 2025 15:28
Contributor Author

cpcloud commented Oct 22, 2025

/ok to test

@cpcloud cpcloud force-pushed the run-always-but-fail-on-cancelled branch from 83a09d3 to 0b1b1e0 Compare October 22, 2025 15:28
Contributor Author

cpcloud commented Oct 22, 2025

/ok to test

Comment on lines +205 to +214
run: |
# if any dependencies were cancelled, that's a failure
#
# see https://docs.github.com/en/actions/reference/workflows-and-actions/expressions#always
# and https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/collaborating-on-repositories-with-code-quality-features/troubleshooting-required-status-checks#handling-skipped-but-required-checks
# for why this cannot be encoded in the job-level `if:` field
#
# TL; DR: `$REASONS`
#
# The intersection of skipped-as-success and required status checks

Ah yes, I feel every proper GHA-based status check workflow needs a comment expressing the dismay and astonishment that came from this discovery. See also: https://github.com/NVIDIA/cccl/blob/main/.github/workflows/ci-workflow-pull-request.yml#L259-L263

🙂

jrhemstad previously approved these changes Oct 22, 2025
Contributor Author

cpcloud commented Oct 22, 2025

Seems like cancellation is working: https://github.com/NVIDIA/cuda-python/actions/runs/18721400957/job/53395760809?pr=1174

I manually cancelled and that caused the status check to fail.

Contributor Author

cpcloud commented Oct 22, 2025

/ok to test

Contributor Author

cpcloud commented Oct 22, 2025

/ok to test

Contributor Author

cpcloud commented Oct 22, 2025

/ok to test

@cpcloud cpcloud force-pushed the run-always-but-fail-on-cancelled branch from 5bf077f to 41d1ce7 Compare October 22, 2025 20:12
Contributor Author

cpcloud commented Oct 22, 2025

/ok to test

@cpcloud cpcloud force-pushed the run-always-but-fail-on-cancelled branch from 41d1ce7 to 9088268 Compare October 22, 2025 20:12
@cpcloud cpcloud requested review from leofang and rparolin October 22, 2025 20:12
@cpcloud cpcloud changed the title from "ci: run status check always but considered cancellations failures" to "ci: run status check always but consider cancellations failures" Oct 22, 2025
Contributor

@mdboom mdboom left a comment

🙃

@cpcloud cpcloud enabled auto-merge (squash) October 22, 2025 21:11
@cpcloud cpcloud merged commit 3cb1eb6 into NVIDIA:main Oct 22, 2025
71 checks passed
@leofang leofang added this to the cuda-python 13-next, 12-next milestone Oct 23, 2025
@leofang leofang added the bug (Something isn't working) and P0 (High priority - Must do!) labels Oct 23, 2025
@leofang leofang added the CI/CD (CI/CD infrastructure) label Oct 23, 2025
github-actions bot pushed a commit that referenced this pull request Nov 10, 2025
Removed preview folders for the following PRs:
- PR #1021
- PR #1034
- PR #1052
- PR #1059
- PR #1069
- PR #1086
- PR #1090
- PR #1096
- PR #1102
- PR #1103
- PR #1106
- PR #1107
- PR #1117
- PR #1133
- PR #1140
- PR #1166
- PR #1174
- PR #1185
- PR #1188
- PR #1191
... and 41 more