
Conversation

@huydhn
Contributor

@huydhn huydhn commented Oct 27, 2025

Purpose

Instead of waiting for xformers to release a new version for PyTorch 2.9.0, I have built 0.0.32.post2 locally and made the wheel available.

For more context, we don't want to wait for the xformers package for 2.9 to become available, so I opted to build it from source. That works for CI, but it has several issues: (1) it increases build time, and (2) xformers is then not listed as a dependency in cuda.txt. Installing a pre-built wheel helps in the meantime, until a new xformers release is out; a sketch of the idea follows.
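A minimal sketch of the idea, assuming the cu129 wheel that shows up later in this thread (the git ref and exact wheel URL are illustrative, not the lines merged in this PR):

    # Before: build xformers from source on every CI run (slow).
    uv pip install 'git+https://github.com/facebookresearch/xformers@<ref>'  # <ref> is illustrative
    # After: install a pre-built wheel instead.
    uv pip install \
        'https://download.pytorch.org/whl/cu129/xformers-0.0.33%2B5d4b92a5.d20251026-cp39-abi3-linux_x86_64.whl'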

@ywang96

@mergify mergify bot added the ci/build label Oct 27, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to install a prebuilt version of xformers-0.0.32.post2 built with PyTorch 2.9.0, instead of waiting for an official release. The changes involve removing the temporary installation of xformers from the Dockerfile and updating the requirements file to include the prebuilt wheel URL. I have identified a critical issue related to the hardcoding of the xformers version in the requirements file.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 361 to 363
    && uv pip install --system dist/*.whl --verbose \
        --extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.')


P1: Ensure runtime image still installs xformers

The Docker runtime stage no longer installs xformers after the manual uv pip install git+…[email protected] block was removed. The new wheel URL was added to requirements/cuda.txt, but that requirements file is consumed only in the earlier build stage (COPY requirements/cuda.txt … followed by uv pip install --python /opt/venv/bin/python3 -r requirements/cuda.txt). The vllm-base stage (lines shown) now installs only the built vLLM wheel and FlashInfer, so the final images vllm-base, vllm-openai, etc. ship without xformers. Any runtime paths that import xformers (memory-efficient attention, sliding window, etc.) will fail with ModuleNotFoundError. A separate uv pip install for the new wheel needs to run in the runtime stage as before.
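A hedged sketch of the fix this asks for, restoring an explicit install in the runtime stage (the wheel URL is illustrative, taken from the install logs later in this thread):

    # Run in the vllm-base (runtime) stage so the final images ship with xformers:
    uv pip install --system \
        'https://download.pytorch.org/whl/cu129/xformers-0.0.33%2B5d4b92a5.d20251026-cp39-abi3-linux_x86_64.whl'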


@huydhn huydhn changed the title from "Install prebuilt xformers-0.0.32.post2 built with pt-2.9.0" to "Install pre-built xformers-0.0.32.post2 built with pt-2.9.0" Oct 27, 2025
@ywang96 ywang96 added this to the v0.11.1 milestone Oct 27, 2025
@simon-mo
Collaborator

AFAIK this doesn't work when we distribute this as a wheel on PyPI and users run uv pip install vllm.

@ywang96
Member

ywang96 commented Oct 27, 2025

Hmm, I actually also ran into this error when installing this branch:

  × Failed to build `vllm @ file:///home/coder/devspaces/vllm`
  ├─▶ The build backend returned an error
  ╰─▶ Call to `setuptools.build_meta.build_editable` failed (exit status: 1)

      [stdout]
      Downloading wheel from
      https://wheels.vllm.ai/69f064062ba78a0ac44962f55a46a9d79cfb9ce0/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
      to /tmp/vllm-wheelsxbz6upvq/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
      [extract] vllm/_C.abi3.so
      [extract] vllm/_flashmla_C.abi3.so
      [extract] vllm/_flashmla_extension_C.abi3.so
      [extract] vllm/_moe_C.abi3.so
      [extract] vllm/cumem_allocator.abi3.so
      [extract] vllm/vllm_flash_attn/_vllm_fa2_C.abi3.so
      [extract] vllm/vllm_flash_attn/_vllm_fa3_C.abi3.so
      [extract] vllm/vllm_flash_attn/__init__.py
      [extract] vllm/vllm_flash_attn/flash_attn_interface.py
      [extract] vllm/vllm_flash_attn/layers/__init__.py
      [extract] vllm/vllm_flash_attn/layers/rotary.py
      [extract] vllm/vllm_flash_attn/ops/__init__.py
      [extract] vllm/vllm_flash_attn/ops/triton/__init__.py
      [extract] vllm/vllm_flash_attn/ops/triton/rotary.py
      Removing temporary directory /tmp/vllm-wheelsxbz6upvq

      [stderr]
      /home/coder/.cache/uv/builds-v0/.tmpr1reEQ/lib/python3.12/site-packages/torch/_subclasses/functional_tensor.py:279:
      UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at
      /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
        cpu = _conversion_method_template(device=torch.device("cpu"))
      /home/coder/.cache/uv/builds-v0/.tmpr1reEQ/lib/python3.12/site-packages/setuptools_scm/_integration/version_inference.py:51:
      UserWarning: version of None already set
        warnings.warn(self.message)
      error in setup command: 'install_requires' must be a string or iterable of strings containing valid
      project/version requirement specifiers; Expected end or semicolon (after name and no valid version
      specifier)
          https://download.pytorch.org/whl/cu129/xformers-0.0.33%2B5d4b92a5.d20251026-cp39-abi3-linux_x86_64.whl;
      platform_system == 'Linux' and platform_machine == 'x86_64'
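For what it's worth, the failing entry is a bare wheel URL plus a marker, which setuptools rejects as an install_requires specifier; valid PEP 508 shapes look roughly like this (illustrative, not the exact fix that was merged):

    # Direct reference form: name @ URL ; marker
    xformers @ https://download.pytorch.org/whl/cu129/xformers-0.0.33%2B5d4b92a5.d20251026-cp39-abi3-linux_x86_64.whl ; platform_system == 'Linux' and platform_machine == 'x86_64'
    # Or a plain version pin, resolved against an index:
    xformers==0.0.33+5d4b92a5.d20251026; platform_system == 'Linux' and platform_machine == 'x86_64'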

@huydhn
Contributor Author

huydhn commented Oct 27, 2025

Hmm, I actually also ran into this error when installing this branch

uv pip install -r github/vllm/requirements/cuda.txt looks OK to me; maybe updating the uv or setuptools version would help. I have uv==0.8.15 and setuptools==80.9.0 on my end.
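For reference, a quick way to check the versions locally (the output comments reflect my setup):

    uv --version                                                   # uv 0.8.15
    python -c 'import setuptools; print(setuptools.__version__)'   # 80.9.0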

@huydhn
Contributor Author

huydhn commented Oct 27, 2025

AFAIK this doesn't work when we distribute this as a wheel on PyPI and users run uv pip install vllm.

Unfortunately, yes; this only works for CI and the Docker release. I don't know of a way to address this without xformers publishing a new package on PyPI for 2.9.

Let's wait for CI, and I can check what xformers version is set in the vLLM wheel metadata in this case.

@ywang96 ywang96 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 27, 2025
@ywang96
Member

ywang96 commented Oct 27, 2025

uv pip install -r github/vllm/requirements/cuda.txt looks OK to me; maybe updating the uv or setuptools version would help. I have uv==0.8.15 and setuptools==80.9.0 on my end.

I was trying to install it in editable mode, but yeah, I turned on the ready label so you can inspect.

@huydhn
Contributor Author

huydhn commented Oct 28, 2025

I was trying to install it in editable mode, but yeah, I turned on the ready label so you can inspect.

I had a mental lapse, lol; xformers==0.0.33+5d4b92a5.d20251026 is the right way to get the package. A sketch of the working form is below.
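A sketch of that form, assuming the cu129 PyTorch index (the index URL matches the wheel seen earlier in this thread):

    # Pin by version and let the resolver find the wheel on the PyTorch index:
    uv pip install 'xformers==0.0.33+5d4b92a5.d20251026' \
        --extra-index-url https://download.pytorch.org/whl/cu129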

@simon-mo simon-mo enabled auto-merge (squash) October 28, 2025 22:26
@simon-mo
Collaborator

simon-mo commented Oct 29, 2025

@simon-mo simon-mo disabled auto-merge October 29, 2025 02:39
@simon-mo simon-mo merged commit f257544 into vllm-project:main Oct 29, 2025
84 of 89 checks passed
@simon-mo
Collaborator

@ZJY0516
Contributor

ZJY0516 commented Oct 29, 2025

I encountered the following error while attempting to install the latest main branch code.

VLLM_USE_PRECOMPILED=1 uv pip install -e .
  × No solution found when resolving dependencies:
  ╰─▶ Because there is no version of xformers{platform_machine == 'x86_64' and sys_platform ==
      'linux'}==0.0.33+5d4b92a5.d20251026 and vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled depends on
      xformers{platform_machine == 'x86_64' and sys_platform == 'linux'}==0.0.33+5d4b92a5.d20251026, we can conclude that
      vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled cannot be used.
      And because only vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled is available and you require vllm, we can
      conclude that your requirements are unsatisfiable.

@huydhn
Contributor Author

huydhn commented Oct 29, 2025

True, I just realized that there are only:

Let me find out where that cu130 wheel is, get it ready, and then we can retry the cu130 build.

@huydhn
Contributor Author

huydhn commented Oct 29, 2025

I encountered the following error while attempting to install the latest main branch code.

VLLM_USE_PRECOMPILED=1 uv pip install -e .
  × No solution found when resolving dependencies:
  ╰─▶ Because there is no version of xformers{platform_machine == 'x86_64' and sys_platform ==
      'linux'}==0.0.33+5d4b92a5.d20251026 and vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled depends on
      xformers{platform_machine == 'x86_64' and sys_platform == 'linux'}==0.0.33+5d4b92a5.d20251026, we can conclude that
      vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled cannot be used.
      And because only vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled is available and you require vllm, we can
      conclude that your requirements are unsatisfiable.

Could you try VLLM_USE_PRECOMPILED=1 uv pip install -e . --extra-index-url https://download.pytorch.org/whl/cu129 in the meantime (formatted for copy-paste below)? We're still waiting for xformers to publish an official wheel for 2.9.0 on PyPI.
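That is:

    VLLM_USE_PRECOMPILED=1 uv pip install -e . \
        --extra-index-url https://download.pytorch.org/whl/cu129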

@simon-mo
Collaborator

@huydhn I'm going to revert this to unbreak main 😢

bhagyashrigai pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Oct 29, 2025
ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request Nov 7, 2025
ZhengHongming888 pushed a commit to ZhengHongming888/vllm that referenced this pull request Nov 8, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025