
Conversation

@huydhn
Contributor

@huydhn huydhn commented Oct 27, 2025

Purpose

Instead of waiting for xformers to release a new version for PyTorch 2.9.0, I have built 0.0.32.post2 locally and made the wheel available.

For more context, we don't want to wait for the xformers package for 2.9 to become available, so I opted to build it from source. That works for CI, but it has several issues: (1) it increases build time, and (2) xformers is then not listed as a dependency in cuda.txt. Installing a pre-built wheel helps in the meantime, until a new xformers release is out; a sketch of the idea follows.
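A minimal sketch of the idea, assuming the cu129 wheel that shows up later in this thread (the git ref and exact wheel URL are illustrative, not the lines merged in this PR):

    # Before: build xformers from source on every CI run (slow).
    uv pip install 'git+https://github.com/facebookresearch/xformers@<ref>'  # <ref> is illustrative
    # After: install a pre-built wheel instead.
    uv pip install \
        'https://download.pytorch.org/whl/cu129/xformers-0.0.33%2B5d4b92a5.d20251026-cp39-abi3-linux_x86_64.whl'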

@ywang96

@mergify mergify bot added the ci/build label Oct 27, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to install a prebuilt version of xformers-0.0.32.post2 built with PyTorch 2.9.0, instead of waiting for an official release. The changes involve removing the temporary installation of xformers from the Dockerfile and updating the requirements file to include the prebuilt wheel URL. I have identified a critical issue related to the hardcoding of the xformers version in the requirements file.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 361 to 363
    && uv pip install --system dist/*.whl --verbose \
        --extra-index-url ${PYTORCH_CUDA_INDEX_BASE_URL}/cu$(echo $CUDA_VERSION | cut -d. -f1,2 | tr -d '.')


P1: Ensure runtime image still installs xformers

The Docker runtime stage no longer installs xformers after the manual uv pip install git+…[email protected] block was removed. The new wheel URL was added to requirements/cuda.txt, but that requirements file is consumed only in the earlier build stage (COPY requirements/cuda.txt … followed by uv pip install --python /opt/venv/bin/python3 -r requirements/cuda.txt). The vllm-base stage (lines shown) now installs only the built vLLM wheel and FlashInfer, so the final images vllm-base, vllm-openai, etc. ship without xformers. Any runtime paths that import xformers (memory-efficient attention, sliding window, etc.) will fail with ModuleNotFoundError. A separate uv pip install for the new wheel needs to run in the runtime stage as before.
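A hedged sketch of the fix this asks for, restoring an explicit install in the runtime stage (the wheel URL is illustrative, taken from the install logs later in this thread):

    # Run in the vllm-base (runtime) stage so the final images ship with xformers:
    uv pip install --system \
        'https://download.pytorch.org/whl/cu129/xformers-0.0.33%2B5d4b92a5.d20251026-cp39-abi3-linux_x86_64.whl'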


@huydhn huydhn changed the title from "Install prebuilt xformers-0.0.32.post2 built with pt-2.9.0" to "Install pre-built xformers-0.0.32.post2 built with pt-2.9.0" Oct 27, 2025
@ywang96 ywang96 added this to the v0.11.1 milestone Oct 27, 2025
@simon-mo
Collaborator

AFAIK this doesn't work when we distribute this as a wheel on PyPI and users run uv pip install vllm.

@ywang96
Member

ywang96 commented Oct 27, 2025

Hmm, I actually also ran into this error when installing this branch:

  × Failed to build `vllm @ file:///home/coder/devspaces/vllm`
  ├─▶ The build backend returned an error
  ╰─▶ Call to `setuptools.build_meta.build_editable` failed (exit status: 1)

      [stdout]
      Downloading wheel from
      https://wheels.vllm.ai/69f064062ba78a0ac44962f55a46a9d79cfb9ce0/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
      to /tmp/vllm-wheelsxbz6upvq/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
      [extract] vllm/_C.abi3.so
      [extract] vllm/_flashmla_C.abi3.so
      [extract] vllm/_flashmla_extension_C.abi3.so
      [extract] vllm/_moe_C.abi3.so
      [extract] vllm/cumem_allocator.abi3.so
      [extract] vllm/vllm_flash_attn/_vllm_fa2_C.abi3.so
      [extract] vllm/vllm_flash_attn/_vllm_fa3_C.abi3.so
      [extract] vllm/vllm_flash_attn/__init__.py
      [extract] vllm/vllm_flash_attn/flash_attn_interface.py
      [extract] vllm/vllm_flash_attn/layers/__init__.py
      [extract] vllm/vllm_flash_attn/layers/rotary.py
      [extract] vllm/vllm_flash_attn/ops/__init__.py
      [extract] vllm/vllm_flash_attn/ops/triton/__init__.py
      [extract] vllm/vllm_flash_attn/ops/triton/rotary.py
      Removing temporary directory /tmp/vllm-wheelsxbz6upvq

      [stderr]
      /home/coder/.cache/uv/builds-v0/.tmpr1reEQ/lib/python3.12/site-packages/torch/_subclasses/functional_tensor.py:279:
      UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at
      /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
        cpu = _conversion_method_template(device=torch.device("cpu"))
      /home/coder/.cache/uv/builds-v0/.tmpr1reEQ/lib/python3.12/site-packages/setuptools_scm/_integration/version_inference.py:51:
      UserWarning: version of None already set
        warnings.warn(self.message)
      error in setup command: 'install_requires' must be a string or iterable of strings containing valid
      project/version requirement specifiers; Expected end or semicolon (after name and no valid version
      specifier)
          https://download.pytorch.org/whl/cu129/xformers-0.0.33%2B5d4b92a5.d20251026-cp39-abi3-linux_x86_64.whl;
      platform_system == 'Linux' and platform_machine == 'x86_64'
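For what it's worth, the failing entry is a bare wheel URL plus a marker, which setuptools rejects as an install_requires specifier; valid PEP 508 shapes look roughly like this (illustrative, not the exact fix that was merged):

    # Direct reference form: name @ URL ; marker
    xformers @ https://download.pytorch.org/whl/cu129/xformers-0.0.33%2B5d4b92a5.d20251026-cp39-abi3-linux_x86_64.whl ; platform_system == 'Linux' and platform_machine == 'x86_64'
    # Or a plain version pin, resolved against an index:
    xformers==0.0.33+5d4b92a5.d20251026; platform_system == 'Linux' and platform_machine == 'x86_64'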

@huydhn
Contributor Author

huydhn commented Oct 27, 2025

Hmm, I actually also ran into this error when installing this branch

uv pip install -r github/vllm/requirements/cuda.txt looks OK to me; maybe updating the uv or setuptools version would help. I have uv==0.8.15 and setuptools==80.9.0 on my end.
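For reference, a quick way to check the versions locally (the output comments reflect my setup):

    uv --version                                                   # uv 0.8.15
    python -c 'import setuptools; print(setuptools.__version__)'   # 80.9.0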

@huydhn
Contributor Author

huydhn commented Oct 27, 2025

AFAIK this doesn't work when we distribute this as a wheel on PyPI and users run uv pip install vllm.

Unfortunately, yes; this only works for CI and the Docker release. I don't know of a way to address this without xformers publishing a new package on PyPI for 2.9.

Let's wait for CI, and I can check what xformers version is set in the vLLM wheel metadata in this case.

@ywang96 ywang96 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 27, 2025
@ywang96
Member

ywang96 commented Oct 27, 2025

uv pip install -r github/vllm/requirements/cuda.txt looks OK to me; maybe updating the uv or setuptools version would help. I have uv==0.8.15 and setuptools==80.9.0 on my end.

I was trying to install it in editable mode, but yeah, I turned on the ready label so you can inspect.

@huydhn
Contributor Author

huydhn commented Oct 28, 2025

I was trying to install it in editable mode, but yeah, I turned on the ready label so you can inspect.

I had a mental lapse, lol; xformers==0.0.33+5d4b92a5.d20251026 is the right way to get the package. A sketch of the working form is below.
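A sketch of that form, assuming the cu129 PyTorch index (the index URL matches the wheel seen earlier in this thread):

    # Pin by version and let the resolver find the wheel on the PyTorch index:
    uv pip install 'xformers==0.0.33+5d4b92a5.d20251026' \
        --extra-index-url https://download.pytorch.org/whl/cu129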

@simon-mo simon-mo enabled auto-merge (squash) October 28, 2025 22:26
@simon-mo
Collaborator

simon-mo commented Oct 29, 2025

@simon-mo simon-mo disabled auto-merge October 29, 2025 02:39
@simon-mo simon-mo merged commit f257544 into vllm-project:main Oct 29, 2025
84 of 89 checks passed
@simon-mo
Collaborator

@ZJY0516
Contributor

ZJY0516 commented Oct 29, 2025

I encountered the following error while attempting to install the latest main branch code.

VLLM_USE_PRECOMPILED=1 uv pip install -e .
  × No solution found when resolving dependencies:
  ╰─▶ Because there is no version of xformers{platform_machine == 'x86_64' and sys_platform ==
      'linux'}==0.0.33+5d4b92a5.d20251026 and vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled depends on
      xformers{platform_machine == 'x86_64' and sys_platform == 'linux'}==0.0.33+5d4b92a5.d20251026, we can conclude that
      vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled cannot be used.
      And because only vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled is available and you require vllm, we can
      conclude that your requirements are unsatisfiable.

@huydhn
Contributor Author

huydhn commented Oct 29, 2025

True, I just realized that there are only:

Let me find out where that cu130 wheel is, get it ready, and then we can retry the cu130 build.

@huydhn
Contributor Author

huydhn commented Oct 29, 2025

I encountered the following error while attempting to install the latest main branch code.

VLLM_USE_PRECOMPILED=1 uv pip install -e .
  × No solution found when resolving dependencies:
  ╰─▶ Because there is no version of xformers{platform_machine == 'x86_64' and sys_platform ==
      'linux'}==0.0.33+5d4b92a5.d20251026 and vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled depends on
      xformers{platform_machine == 'x86_64' and sys_platform == 'linux'}==0.0.33+5d4b92a5.d20251026, we can conclude that
      vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled cannot be used.
      And because only vllm==0.11.1rc4.dev74+gf25754470.d20251029.precompiled is available and you require vllm, we can
      conclude that your requirements are unsatisfiable.

Could you try VLLM_USE_PRECOMPILED=1 uv pip install -e . --extra-index-url https://download.pytorch.org/whl/cu129 in the meantime (formatted for copy-paste below)? We're still waiting for xformers to publish an official wheel for 2.9.0 on PyPI.
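That is:

    VLLM_USE_PRECOMPILED=1 uv pip install -e . \
        --extra-index-url https://download.pytorch.org/whl/cu129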

@simon-mo
Collaborator

@huydhn I'm going to revert this to unbreak main 😢

bhagyashrigai pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Oct 29, 2025
ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request Nov 7, 2025
ZhengHongming888 pushed a commit to ZhengHongming888/vllm that referenced this pull request Nov 8, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025