-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[ci] Handle the case where all reported tests pass but the build is still a failure #120264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…till a failure In this build: https://buildkite.com/llvm-project/github-pull-requests/builds/126961 The builds actually failed, probably because prerequisite of a test suite failed to build. However they still ran other tests and all those passed. This meant that the test reports were green even though the build was red. On some level this is technically correct, but's very misleading in practice. So I've also passed the build script's return code, as it was when we entered the on exit handler, to the generator. So that when this happens again, the report will not list any test failures but will draw the viewer's attention to the overall failure. There will be a link in the report to the build's log file, so the next step to investigate is clear. It would be nice to say "tests failed and there was some other build error", but we cannot tell what the non-zero return code was caused by. Could be either.
This is all done within the "if we have a buildkite agent", so it should not disrupt work on the new system. It's getting a bit complex, but (famous last words) I think this is the last combination it needs to handle, as shown in the table. |
Pretty sure I'm not the right reviewer here... Maybe @lnihlen is? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working this up. I'd like @boomanaiden154 to verify this keeps the new infra happy too, and had a question about fragility in one of the tests, but otherwise LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable enough to me. Sorry for the long review delay on my end.
…till a failure (llvm#120264) In this build: https://buildkite.com/llvm-project/github-pull-requests/builds/126961 The builds actually failed, probably because prerequisite of a test suite failed to build. However they still ran other tests and all those passed. This meant that the test reports were green even though the build was red. On some level this is technically correct, but it is very misleading in practice. So I've also passed the build script's return code, as it was when we entered the on exit handler, to the generator, so that when this happens again, the report will draw the viewer's attention to the overall failure. There will be a link in the report to the build's log file, so the next step to investigate is clear. It would be nice to say "tests failed and there was some other build error", but we cannot tell what the non-zero return code was caused by. Could be either. The script handles the following situations now: | Have Result Files? | Tests reported failed? | Return code | Report | |--------------------|------------------------|-------------|-----------------------------------------------------------------------------| | Yes | No | 0 | Success style report. | | Yes | Yes | 0 | Shouldn't happen, but if it did, failure style report showing the failures. | | Yes | No | 1 | Failure style report, showing no failures but noting that the build failed. | | Yes | Yes | 1 | Failure style report, showing the test failures. | | No | ? | 0 | No test report, success shown in the normal build display. | | No | ? | 1 | No test report, failure shown in the normal build display. |
In this build:
https://buildkite.com/llvm-project/github-pull-requests/builds/126961
The builds actually failed, probably because prerequisite of a test suite failed to build.
However they still ran other tests and all those passed. This meant that the test reports were green even though the build was red. On some level this is technically correct, but it is very misleading in practice.
So I've also passed the build script's return code, as it was when we entered the on exit handler, to the generator, so that when this happens again, the report will draw the viewer's attention to the overall failure. There will be a link in the report to the build's log file, so the next step to investigate is clear.
It would be nice to say "tests failed and there was some other build error", but we cannot tell what the non-zero return code was caused by. Could be either.
The script handles the following situations now: