Skip to content

pip installation error - ERROR: Failed building wheel for vllm #799

@dxlong2000

Description

@dxlong2000

Dear the team,

Thank you for your great work. I have tried to install vllm on my server Linux environment. I got an unexpected error. Could you please advise soon? Thanks!

ENV:

Pytorch: pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
Python: 3.8.17
CUDA: 12

ERROR:

pip install vllm
Collecting vllm
  Using cached vllm-0.1.3.tar.gz (102 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting ninja (from vllm)
  Using cached ninja-1.11.1-py2.py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (145 kB)
Collecting psutil (from vllm)
  Using cached psutil-5.9.5-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (282 kB)
Collecting ray>=2.5.1 (from vllm)
  Obtaining dependency information for ray>=2.5.1 from https://files.pythonhosted.org/packages/2a/9c/4ab1fe33db75eab17d6ef2822c3d418ba47a1a487653b24e5de694410aa4/ray-2.6.3-cp38-cp38-manylinux2014_x86_64.whl.metadata
  Using cached ray-2.6.3-cp38-cp38-manylinux2014_x86_64.whl.metadata (12 kB)
Collecting sentencepiece (from vllm)
  Using cached sentencepiece-0.1.99-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting numpy (from vllm)
  Obtaining dependency information for numpy from https://files.pythonhosted.org/packages/98/5d/5738903efe0ecb73e51eb44feafba32bdba2081263d40c5043568ff60faf/numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.6 kB)
Collecting torch>=2.0.0 (from vllm)
  Using cached torch-2.0.1-cp38-cp38-manylinux1_x86_64.whl (619.9 MB)
Collecting transformers>=4.31.0 (from vllm)
  Obtaining dependency information for transformers>=4.31.0 from https://files.pythonhosted.org/packages/21/02/ae8e595f45b6c8edee07913892b3b41f5f5f273962ad98851dc6a564bbb9/transformers-4.31.0-py3-none-any.whl.metadata
  Using cached transformers-4.31.0-py3-none-any.whl.metadata (116 kB)
Collecting xformers>=0.0.19 (from vllm)
  Obtaining dependency information for xformers>=0.0.19 from https://files.pythonhosted.org/packages/ce/4a/3b0368fad4ff89ab25fe8276512dce160bbfe33b7a7e43c2502f08b175d6/xformers-0.0.21-cp38-cp38-manylinux2014_x86_64.whl.metadata
  Using cached xformers-0.0.21-cp38-cp38-manylinux2014_x86_64.whl.metadata (1.0 kB)
Collecting fastapi (from vllm)
  Obtaining dependency information for fastapi from https://files.pythonhosted.org/packages/09/ae/8378894f9fbdf0297cdffdc79496ccd779166d675fec47cad8d2ca782739/fastapi-0.101.1-py3-none-any.whl.metadata
  Using cached fastapi-0.101.1-py3-none-any.whl.metadata (23 kB)
Collecting uvicorn (from vllm)
  Obtaining dependency information for uvicorn from https://files.pythonhosted.org/packages/79/96/b0882a1c3f7ef3dd86879e041212ae5b62b4bd352320889231cc735a8e8f/uvicorn-0.23.2-py3-none-any.whl.metadata
  Using cached uvicorn-0.23.2-py3-none-any.whl.metadata (6.2 kB)
Collecting pydantic<2 (from vllm)
  Obtaining dependency information for pydantic<2 from https://files.pythonhosted.org/packages/5d/68/7a0c5f8b854d3fad9cd82a6312205025597481e46b4ec36f6dea4f1fb93b/pydantic-1.10.12-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached pydantic-1.10.12-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (149 kB)
Collecting typing-extensions>=4.2.0 (from pydantic<2->vllm)
  Obtaining dependency information for typing-extensions>=4.2.0 from https://files.pythonhosted.org/packages/ec/6b/63cc3df74987c36fe26157ee12e09e8f9db4de771e0f3404263117e75b95/typing_extensions-4.7.1-py3-none-any.whl.metadata
  Using cached typing_extensions-4.7.1-py3-none-any.whl.metadata (3.1 kB)
Collecting click>=7.0 (from ray>=2.5.1->vllm)
  Obtaining dependency information for click>=7.0 from https://files.pythonhosted.org/packages/00/2e/d53fa4befbf2cfa713304affc7ca780ce4fc1fd8710527771b58311a3229/click-8.1.7-py3-none-any.whl.metadata
  Using cached click-8.1.7-py3-none-any.whl.metadata (3.0 kB)
Collecting filelock (from ray>=2.5.1->vllm)
  Obtaining dependency information for filelock from https://files.pythonhosted.org/packages/00/45/ec3407adf6f6b5bf867a4462b2b0af27597a26bd3cd6e2534cb6ab029938/filelock-3.12.2-py3-none-any.whl.metadata
  Using cached filelock-3.12.2-py3-none-any.whl.metadata (2.7 kB)
Requirement already satisfied: jsonschema in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from ray>=2.5.1->vllm) (4.19.0)
Collecting msgpack<2.0.0,>=1.0.0 (from ray>=2.5.1->vllm)
  Using cached msgpack-1.0.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (322 kB)
Collecting packaging (from ray>=2.5.1->vllm)
  Using cached packaging-23.1-py3-none-any.whl (48 kB)
Collecting protobuf!=3.19.5,>=3.15.3 (from ray>=2.5.1->vllm)
  Obtaining dependency information for protobuf!=3.19.5,>=3.15.3 from https://files.pythonhosted.org/packages/4c/87/59648989ad7f5ba6fe3c7f8abc555183f28559b6f6cd14ad17a3f0d3094f/protobuf-4.24.1-cp37-abi3-manylinux2014_x86_64.whl.metadata
  Using cached protobuf-4.24.1-cp37-abi3-manylinux2014_x86_64.whl.metadata (540 bytes)
Collecting pyyaml (from ray>=2.5.1->vllm)
  Obtaining dependency information for pyyaml from https://files.pythonhosted.org/packages/c8/6b/6600ac24725c7388255b2f5add93f91e58a5d7efaf4af244fdbcc11a541b/PyYAML-6.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached PyYAML-6.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting aiosignal (from ray>=2.5.1->vllm)
  Using cached aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Collecting frozenlist (from ray>=2.5.1->vllm)
  Obtaining dependency information for frozenlist from https://files.pythonhosted.org/packages/0b/36/c276486f89bee9098332710af2207344f360c6c6f104a4931a7566039b1d/frozenlist-1.4.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached frozenlist-1.4.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.2 kB)
Collecting requests (from ray>=2.5.1->vllm)
  Obtaining dependency information for requests from https://files.pythonhosted.org/packages/70/8e/0e2d847013cb52cd35b38c009bb167a1a26b2ce6cd6965bf26b47bc0bf44/requests-2.31.0-py3-none-any.whl.metadata
  Using cached requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting grpcio>=1.32.0 (from ray>=2.5.1->vllm)
  Obtaining dependency information for grpcio>=1.32.0 from https://files.pythonhosted.org/packages/68/a8/7052e6a5c27159f080bb70fb8d8302c0e4bea148fb430acb57f83a8f2733/grpcio-1.57.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached grpcio-1.57.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.0 kB)
Collecting sympy (from torch>=2.0.0->vllm)
  Using cached sympy-1.12-py3-none-any.whl (5.7 MB)
Collecting networkx (from torch>=2.0.0->vllm)
  Using cached networkx-3.1-py3-none-any.whl (2.1 MB)
Collecting jinja2 (from torch>=2.0.0->vllm)
  Using cached Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting nvidia-cuda-nvrtc-cu11==11.7.99 (from torch>=2.0.0->vllm)
  Using cached nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
Collecting nvidia-cuda-runtime-cu11==11.7.99 (from torch>=2.0.0->vllm)
  Using cached nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
Collecting nvidia-cuda-cupti-cu11==11.7.101 (from torch>=2.0.0->vllm)
  Using cached nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
Collecting nvidia-cudnn-cu11==8.5.0.96 (from torch>=2.0.0->vllm)
  Using cached nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
Collecting nvidia-cublas-cu11==11.10.3.66 (from torch>=2.0.0->vllm)
  Using cached nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
Collecting nvidia-cufft-cu11==10.9.0.58 (from torch>=2.0.0->vllm)
  Using cached nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
Collecting nvidia-curand-cu11==10.2.10.91 (from torch>=2.0.0->vllm)
  Using cached nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
Collecting nvidia-cusolver-cu11==11.4.0.1 (from torch>=2.0.0->vllm)
  Using cached nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)
Collecting nvidia-cusparse-cu11==11.7.4.91 (from torch>=2.0.0->vllm)
  Using cached nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)
Collecting nvidia-nccl-cu11==2.14.3 (from torch>=2.0.0->vllm)
  Using cached nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)
Collecting nvidia-nvtx-cu11==11.7.91 (from torch>=2.0.0->vllm)
  Using cached nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)
Collecting triton==2.0.0 (from torch>=2.0.0->vllm)
  Using cached triton-2.0.0-1-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.2 MB)
Requirement already satisfied: setuptools in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=2.0.0->vllm) (68.0.0)
Requirement already satisfied: wheel in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=2.0.0->vllm) (0.38.4)
Collecting cmake (from triton==2.0.0->torch>=2.0.0->vllm)
  Obtaining dependency information for cmake from https://files.pythonhosted.org/packages/2e/51/3a4672a819b4532a378bfefad8f886cfe71057556e0d4eefb64523fd370a/cmake-3.27.2-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata
  Using cached cmake-3.27.2-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.7 kB)
Collecting lit (from triton==2.0.0->torch>=2.0.0->vllm)
  Using cached lit-16.0.6-py3-none-any.whl
Collecting huggingface-hub<1.0,>=0.14.1 (from transformers>=4.31.0->vllm)
  Obtaining dependency information for huggingface-hub<1.0,>=0.14.1 from https://files.pythonhosted.org/packages/7f/c4/adcbe9a696c135578cabcbdd7331332daad4d49b7c43688bc2d36b3a47d2/huggingface_hub-0.16.4-py3-none-any.whl.metadata
  Using cached huggingface_hub-0.16.4-py3-none-any.whl.metadata (12 kB)
Collecting regex!=2019.12.17 (from transformers>=4.31.0->vllm)
  Obtaining dependency information for regex!=2019.12.17 from https://files.pythonhosted.org/packages/1f/5c/374ac3fa3c7ed9a967ad273a5e841897ef6b10aa6aad938ff10717a3e2a3/regex-2023.8.8-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached regex-2023.8.8-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers>=4.31.0->vllm)
  Using cached tokenizers-0.13.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Collecting safetensors>=0.3.1 (from transformers>=4.31.0->vllm)
  Obtaining dependency information for safetensors>=0.3.1 from https://files.pythonhosted.org/packages/4d/81/9b6ee8bd7faf7ae79afafde28b4e8abbcb897c9aa089d51eb5d0a1f3ffcd/safetensors-0.3.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached safetensors-0.3.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.5 kB)
Collecting tqdm>=4.27 (from transformers>=4.31.0->vllm)
  Obtaining dependency information for tqdm>=4.27 from https://files.pythonhosted.org/packages/00/e5/f12a80907d0884e6dff9c16d0c0114d81b8cd07dc3ae54c5e962cc83037e/tqdm-4.66.1-py3-none-any.whl.metadata
  Using cached tqdm-4.66.1-py3-none-any.whl.metadata (57 kB)
Collecting starlette<0.28.0,>=0.27.0 (from fastapi->vllm)
  Obtaining dependency information for starlette<0.28.0,>=0.27.0 from https://files.pythonhosted.org/packages/58/f8/e2cca22387965584a409795913b774235752be4176d276714e15e1a58884/starlette-0.27.0-py3-none-any.whl.metadata
  Using cached starlette-0.27.0-py3-none-any.whl.metadata (5.8 kB)
Collecting h11>=0.8 (from uvicorn->vllm)
  Using cached h11-0.14.0-py3-none-any.whl (58 kB)
Collecting fsspec (from huggingface-hub<1.0,>=0.14.1->transformers>=4.31.0->vllm)
  Obtaining dependency information for fsspec from https://files.pythonhosted.org/packages/e3/bd/4c0a4619494188a9db5d77e2100ab7d544a42e76b2447869d8e124e981d8/fsspec-2023.6.0-py3-none-any.whl.metadata
  Using cached fsspec-2023.6.0-py3-none-any.whl.metadata (6.7 kB)
Collecting anyio<5,>=3.4.0 (from starlette<0.28.0,>=0.27.0->fastapi->vllm)
  Obtaining dependency information for anyio<5,>=3.4.0 from https://files.pythonhosted.org/packages/19/24/44299477fe7dcc9cb58d0a57d5a7588d6af2ff403fdd2d47a246c91a3246/anyio-3.7.1-py3-none-any.whl.metadata
  Using cached anyio-3.7.1-py3-none-any.whl.metadata (4.7 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch>=2.0.0->vllm)
  Obtaining dependency information for MarkupSafe>=2.0 from https://files.pythonhosted.org/packages/de/e2/32c14301bb023986dff527a49325b6259cab4ebb4633f69de54af312fc45/MarkupSafe-2.1.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached MarkupSafe-2.1.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
Requirement already satisfied: attrs>=22.2.0 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (23.1.0)
Requirement already satisfied: importlib-resources>=1.4.0 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (6.0.1)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (2023.7.1)
Requirement already satisfied: pkgutil-resolve-name>=1.3.10 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (1.3.10)
Requirement already satisfied: referencing>=0.28.4 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (0.30.2)
Requirement already satisfied: rpds-py>=0.7.1 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (0.9.2)
Collecting charset-normalizer<4,>=2 (from requests->ray>=2.5.1->vllm)
  Obtaining dependency information for charset-normalizer<4,>=2 from https://files.pythonhosted.org/packages/cb/e7/5e43745003bf1f90668c7be23fc5952b3a2b9c2558f16749411c18039b36/charset_normalizer-3.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Using cached charset_normalizer-3.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (31 kB)
Collecting idna<4,>=2.5 (from requests->ray>=2.5.1->vllm)
  Using cached idna-3.4-py3-none-any.whl (61 kB)
Collecting urllib3<3,>=1.21.1 (from requests->ray>=2.5.1->vllm)
  Obtaining dependency information for urllib3<3,>=1.21.1 from https://files.pythonhosted.org/packages/9b/81/62fd61001fa4b9d0df6e31d47ff49cfa9de4af03adecf339c7bc30656b37/urllib3-2.0.4-py3-none-any.whl.metadata
  Using cached urllib3-2.0.4-py3-none-any.whl.metadata (6.6 kB)
Collecting certifi>=2017.4.17 (from requests->ray>=2.5.1->vllm)
  Obtaining dependency information for certifi>=2017.4.17 from https://files.pythonhosted.org/packages/4c/dd/2234eab22353ffc7d94e8d13177aaa050113286e93e7b40eae01fbf7c3d9/certifi-2023.7.22-py3-none-any.whl.metadata
  Using cached certifi-2023.7.22-py3-none-any.whl.metadata (2.2 kB)
Collecting mpmath>=0.19 (from sympy->torch>=2.0.0->vllm)
  Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)
Collecting sniffio>=1.1 (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm)
  Using cached sniffio-1.3.0-py3-none-any.whl (10 kB)
Collecting exceptiongroup (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm)
  Obtaining dependency information for exceptiongroup from https://files.pythonhosted.org/packages/ad/83/b71e58666f156a39fb29417e4c8ca4bc7400c0dd4ed9e8842ab54dc8c344/exceptiongroup-1.1.3-py3-none-any.whl.metadata
  Using cached exceptiongroup-1.1.3-py3-none-any.whl.metadata (6.1 kB)
Requirement already satisfied: zipp>=3.1.0 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from importlib-resources>=1.4.0->jsonschema->ray>=2.5.1->vllm) (3.16.2)
Using cached pydantic-1.10.12-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
Using cached ray-2.6.3-cp38-cp38-manylinux2014_x86_64.whl (57.0 MB)
Using cached numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
Using cached transformers-4.31.0-py3-none-any.whl (7.4 MB)
Using cached xformers-0.0.21-cp38-cp38-manylinux2014_x86_64.whl (167.0 MB)
Using cached fastapi-0.101.1-py3-none-any.whl (65 kB)
Using cached uvicorn-0.23.2-py3-none-any.whl (59 kB)
Using cached click-8.1.7-py3-none-any.whl (97 kB)
Using cached grpcio-1.57.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.3 MB)
Using cached huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
Using cached protobuf-4.24.1-cp37-abi3-manylinux2014_x86_64.whl (311 kB)
Using cached PyYAML-6.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (736 kB)
Using cached regex-2023.8.8-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (774 kB)
Using cached safetensors-0.3.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Using cached starlette-0.27.0-py3-none-any.whl (66 kB)
Using cached tqdm-4.66.1-py3-none-any.whl (78 kB)
Using cached typing_extensions-4.7.1-py3-none-any.whl (33 kB)
Using cached frozenlist-1.4.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (220 kB)
Using cached filelock-3.12.2-py3-none-any.whl (10 kB)
Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Using cached anyio-3.7.1-py3-none-any.whl (80 kB)
Using cached certifi-2023.7.22-py3-none-any.whl (158 kB)
Using cached charset_normalizer-3.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199 kB)
Using cached MarkupSafe-2.1.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Using cached urllib3-2.0.4-py3-none-any.whl (123 kB)
Using cached cmake-3.27.2-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.1 MB)
Using cached fsspec-2023.6.0-py3-none-any.whl (163 kB)
Using cached exceptiongroup-1.1.3-py3-none-any.whl (14 kB)
Building wheels for collected packages: vllm
  Building wheel for vllm (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for vllm (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [134 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-38
      creating build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/outputs.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/config.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/sequence.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/logger.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/block.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/utils.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/sampling_params.py -> build/lib.linux-x86_64-cpython-38/vllm
      creating build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/arg_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/ray_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/async_llm_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/llm_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/weight_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/model_loader.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/input_metadata.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      creating build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
      copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
      copying vllm/transformers_utils/config.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
      copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
      creating build/lib.linux-x86_64-cpython-38/vllm/entrypoints
      copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
      copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
      copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
      creating build/lib.linux-x86_64-cpython-38/vllm/worker
      copying vllm/worker/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
      copying vllm/worker/cache_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
      copying vllm/worker/worker.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
      creating build/lib.linux-x86_64-cpython-38/vllm/core
      copying vllm/core/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/core
      copying vllm/core/scheduler.py -> build/lib.linux-x86_64-cpython-38/vllm/core
      copying vllm/core/policy.py -> build/lib.linux-x86_64-cpython-38/vllm/core
      copying vllm/core/block_manager.py -> build/lib.linux-x86_64-cpython-38/vllm/core
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/attention.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/parallel_state.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/bloom.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/baichuan.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/layers.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/random.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/mappings.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      creating build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/baichuan.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      creating build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
      running build_ext
      Traceback (most recent call last):
        File "/home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 434, in build_wheel
          return self._build_with_temp_dir(
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 419, in _build_with_temp_dir
          self.run_setup()
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 341, in run_setup
          exec(code, locals())
        File "<string>", line 145, in <module>
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
          return distutils.core.setup(**attrs)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1233, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 349, in run
          self.run_command("build")
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1233, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
          self.run_command(cmd_name)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1233, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 88, in run
          _build_ext.run(self)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
          self.build_extensions()
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 499, in build_extensions
          _check_cuda_version(compiler_name, compiler_version)
        File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 387, in _check_cuda_version
          raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
      RuntimeError:
      The detected CUDA version (12.0) mismatches the version that was used to compile
      PyTorch (11.7). Please make sure to use the same CUDA versions.
      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for vllm
Failed to build vllm
ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions