Revert "Install pre-built xformers-0.0.32.post2 built with pt-2.9.0" #27714
Conversation
Code Review
This pull request reverts a previous change that used a pre-built xformers wheel, opting instead to build xformers from source within the Docker image. This is done by adding a RUN command to the Dockerfile for compilation and removing the xformers dependency from requirements/cuda.txt. However, the implementation has a critical flaw: the list of CUDA architectures for the xformers build is too restrictive. My review provides a specific suggestion to expand this list to ensure broader GPU compatibility (notably for Volta GPUs like V100) and improve performance on modern GPUs by including them for ahead-of-time compilation.
```dockerfile
# TODO (huydhn): Remove this once xformers is released for 2.9.0
RUN --mount=type=cache,target=/root/.cache/uv bash - <<'BASH'
    . /etc/environment
    export TORCH_CUDA_ARCH_LIST='7.5 8.0+PTX 9.0a'
```
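For context on what these entries mean, here is a hedged sketch (the function name and exact flag format are illustrative, not vLLM or PyTorch code) of how each `TORCH_CUDA_ARCH_LIST` entry maps to nvcc code generation: native SASS per listed architecture, plus embedded forward-compatible PTX when a `+PTX` suffix is present.

```python
def gencode_flags(arch_list: str) -> list[str]:
    """Illustrative: expand a TORCH_CUDA_ARCH_LIST value into nvcc
    -gencode flags. Each entry yields native SASS (code=sm_XY); a
    '+PTX' suffix additionally embeds PTX (code=compute_XY) so newer,
    unlisted GPUs can JIT-compile the kernels at runtime."""
    flags = []
    for entry in arch_list.split():
        base = entry.removesuffix("+PTX")
        num = base.replace(".", "")  # '8.0' -> '80', '9.0a' -> '90a'
        flags.append(f"-gencode arch=compute_{num},code=sm_{num}")
        if entry.endswith("+PTX"):
            flags.append(f"-gencode arch=compute_{num},code=compute_{num}")
    return flags

print(gencode_flags("7.5 8.0+PTX 9.0a"))
```

Note that in this list only the `8.0` entry carries PTX, so everything newer than `8.0` that is not explicitly listed must fall back to JIT compilation.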
The hardcoded `TORCH_CUDA_ARCH_LIST` is overly restrictive and may cause compatibility and performance issues for users of the Docker image.

- **Dropped architectures:** This list removes support for Volta (`7.0`), which is used by V100 GPUs. This is a significant regression, as V100s are still widely used in cloud environments and research.
- **Performance on modern GPUs:** It relies on just-in-time (JIT) compilation from PTX for modern architectures like Ada Lovelace (`8.9`) and Hopper (`9.0`), as they are not explicitly listed. This can lead to significant startup delays when vLLM is first run on these GPUs.
- **Inconsistency:** The default `torch_cuda_arch_list` defined as a build argument earlier in this Dockerfile (line 144) is much more comprehensive. While that argument is not available in this build stage, its value serves as a good reference for what architectures are generally supported.

To ensure broad compatibility and optimal performance, I recommend using a more inclusive list of architectures. This suggested list restores Volta support, provides ahead-of-time (AOT) compilation for common modern GPUs, and maintains forward compatibility for future architectures via PTX.

```dockerfile
export TORCH_CUDA_ARCH_LIST='7.0 7.5 8.0 8.9 9.0a+PTX'
```
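To make the AOT-versus-JIT distinction concrete, here is a simplified, hypothetical sketch (not vLLM or PyTorch code; the real nvcc/PyTorch behavior has more nuance, e.g. the `a` suffix emits arch-specific SASS) that classifies how a GPU of a given compute capability is served by an arch list:

```python
def coverage(arch_list: str, cc: str) -> str:
    """Classify how a GPU with compute capability `cc` is served by a
    TORCH_CUDA_ARCH_LIST value: 'AOT' (native SASS was compiled for it),
    'JIT' (only forward-compatible PTX, compiled on first run), or
    'none'. Simplified model for illustration only."""
    ptx_bases = []
    for entry in arch_list.split():
        has_ptx = entry.endswith("+PTX")
        base = entry.removesuffix("+PTX").rstrip("a")
        if base == cc:
            return "AOT"  # an exact SASS target exists for this GPU
        if has_ptx:
            ptx_bases.append(float(base))
    # PTX embedded for an older arch can be JIT-compiled on a newer GPU
    return "JIT" if any(b <= float(cc) for b in ptx_bases) else "none"

# With the original list, a V100 (7.0) gets nothing and an 8.9 GPU pays
# a JIT cost; the suggested list serves both ahead of time.
print(coverage("7.5 8.0+PTX 9.0a", "7.0"))          # → none
print(coverage("7.5 8.0+PTX 9.0a", "8.9"))          # → JIT
print(coverage("7.0 7.5 8.0 8.9 9.0a+PTX", "8.9"))  # → AOT
```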
…llm-project#27714) Signed-off-by: Bhagyashri <[email protected]>
…vllm-project#27714) This reverts commit 9007bf5. Signed-off-by: Huy Do <[email protected]>
Reverts #27598, which broke the CUDA 13 build: https://buildkite.com/vllm/release/builds/9637/steps/canvas?sid=019a2dfc-911c-4783-b421-9d3acc153e1b