[Tests] Run slow matrix sequentially #3500

pcuenca · 2023-05-21T18:48:16Z

This is just a suspicion, feel free to close.

Both "Slow PyTorch CUDA tests on Ubuntu" and "Slow ONNXRuntime CUDA tests on Ubuntu" use the same runner (docker-gpu), and only one machine is configured to run those tests. My understanding is that the CI environment runs the jobs of a matrix in parallel by default, which could be the reason for the weird OOM issues.

If this is actually the case, we could maybe reorganize the tests differently so that the "Slow Flax TPU tests", which use a different runner, run in parallel with any of these.
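For illustration, a minimal sketch of how a GitHub Actions matrix can be forced to run one entry at a time. This is not the actual diff of this PR: the job name, the matrix key `config`, and the steps are hypothetical, and only the two test names and the docker-gpu runner label come from the description above. The relevant setting is `strategy.max-parallel`, which caps how many matrix jobs run simultaneously.

```yaml
# Hypothetical workflow fragment; job name, matrix keys, and steps are illustrative only.
jobs:
  run_slow_tests:
    strategy:
      fail-fast: false
      max-parallel: 1  # run matrix entries sequentially instead of in parallel
      matrix:
        config:
          - name: Slow PyTorch CUDA tests on Ubuntu
            runner: docker-gpu
          - name: Slow ONNXRuntime CUDA tests on Ubuntu
            runner: docker-gpu
    name: ${{ matrix.config.name }}
    runs-on: ${{ matrix.config.runner }}
    steps:
      - uses: actions/checkout@v3
      - run: echo "run the slow test suite here"  # placeholder for the real test command
```

With `max-parallel: 1`, every entry in the matrix waits for the previous one, so entries that use a different runner would have to live in a separate job to keep running in parallel with these.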

HuggingFaceDocBuilderDev · 2023-05-21T18:54:25Z

The documentation is not available anymore as the PR was closed or merged.

pcuenca · 2023-05-21T23:04:07Z

This doesn't seem to be the cause. When running the tests inside the Docker container, it looks like a portion of memory is not freed and accumulates. Running outside the container, I don't see the same problem, but compile doesn't work: I get the error RuntimeError: Triton Error [CUDA]: device kernel image is invalid. This happens with the latest PyTorch (2.0.1).

pcuenca · 2023-05-22T05:40:03Z

When running the tests with -k "not Flax and not Onnx and not compile", I don't see the OOM errors.

patrickvonplaten · 2023-05-22T15:22:18Z

@pcuenca feel free to merge if it helps

[tests] Run slow matrix sequentially.


3cd155d

patrickvonplaten merged commit fdec231 into main Jun 7, 2023

patrickvonplaten deleted the sequential-test-matrix branch June 7, 2023 10:01

AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024

[Tests] Run slow matrix sequentially (huggingface#3500)

3a2d274

