Closed
Description
Dear team,
Thank you for your great work. I tried to install vLLM with pip on my Linux server and the wheel build failed with a CUDA version mismatch error. Could you please advise? Thanks!
ENV:
PyTorch: pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
Python: 3.8.17
CUDA: 12
ERROR:
pip install vllm
Collecting vllm
Using cached vllm-0.1.3.tar.gz (102 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting ninja (from vllm)
Using cached ninja-1.11.1-py2.py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (145 kB)
Collecting psutil (from vllm)
Using cached psutil-5.9.5-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (282 kB)
Collecting ray>=2.5.1 (from vllm)
Obtaining dependency information for ray>=2.5.1 from https://files.pythonhosted.org/packages/2a/9c/4ab1fe33db75eab17d6ef2822c3d418ba47a1a487653b24e5de694410aa4/ray-2.6.3-cp38-cp38-manylinux2014_x86_64.whl.metadata
Using cached ray-2.6.3-cp38-cp38-manylinux2014_x86_64.whl.metadata (12 kB)
Collecting sentencepiece (from vllm)
Using cached sentencepiece-0.1.99-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting numpy (from vllm)
Obtaining dependency information for numpy from https://files.pythonhosted.org/packages/98/5d/5738903efe0ecb73e51eb44feafba32bdba2081263d40c5043568ff60faf/numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.6 kB)
Collecting torch>=2.0.0 (from vllm)
Using cached torch-2.0.1-cp38-cp38-manylinux1_x86_64.whl (619.9 MB)
Collecting transformers>=4.31.0 (from vllm)
Obtaining dependency information for transformers>=4.31.0 from https://files.pythonhosted.org/packages/21/02/ae8e595f45b6c8edee07913892b3b41f5f5f273962ad98851dc6a564bbb9/transformers-4.31.0-py3-none-any.whl.metadata
Using cached transformers-4.31.0-py3-none-any.whl.metadata (116 kB)
Collecting xformers>=0.0.19 (from vllm)
Obtaining dependency information for xformers>=0.0.19 from https://files.pythonhosted.org/packages/ce/4a/3b0368fad4ff89ab25fe8276512dce160bbfe33b7a7e43c2502f08b175d6/xformers-0.0.21-cp38-cp38-manylinux2014_x86_64.whl.metadata
Using cached xformers-0.0.21-cp38-cp38-manylinux2014_x86_64.whl.metadata (1.0 kB)
Collecting fastapi (from vllm)
Obtaining dependency information for fastapi from https://files.pythonhosted.org/packages/09/ae/8378894f9fbdf0297cdffdc79496ccd779166d675fec47cad8d2ca782739/fastapi-0.101.1-py3-none-any.whl.metadata
Using cached fastapi-0.101.1-py3-none-any.whl.metadata (23 kB)
Collecting uvicorn (from vllm)
Obtaining dependency information for uvicorn from https://files.pythonhosted.org/packages/79/96/b0882a1c3f7ef3dd86879e041212ae5b62b4bd352320889231cc735a8e8f/uvicorn-0.23.2-py3-none-any.whl.metadata
Using cached uvicorn-0.23.2-py3-none-any.whl.metadata (6.2 kB)
Collecting pydantic<2 (from vllm)
Obtaining dependency information for pydantic<2 from https://files.pythonhosted.org/packages/5d/68/7a0c5f8b854d3fad9cd82a6312205025597481e46b4ec36f6dea4f1fb93b/pydantic-1.10.12-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached pydantic-1.10.12-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (149 kB)
Collecting typing-extensions>=4.2.0 (from pydantic<2->vllm)
Obtaining dependency information for typing-extensions>=4.2.0 from https://files.pythonhosted.org/packages/ec/6b/63cc3df74987c36fe26157ee12e09e8f9db4de771e0f3404263117e75b95/typing_extensions-4.7.1-py3-none-any.whl.metadata
Using cached typing_extensions-4.7.1-py3-none-any.whl.metadata (3.1 kB)
Collecting click>=7.0 (from ray>=2.5.1->vllm)
Obtaining dependency information for click>=7.0 from https://files.pythonhosted.org/packages/00/2e/d53fa4befbf2cfa713304affc7ca780ce4fc1fd8710527771b58311a3229/click-8.1.7-py3-none-any.whl.metadata
Using cached click-8.1.7-py3-none-any.whl.metadata (3.0 kB)
Collecting filelock (from ray>=2.5.1->vllm)
Obtaining dependency information for filelock from https://files.pythonhosted.org/packages/00/45/ec3407adf6f6b5bf867a4462b2b0af27597a26bd3cd6e2534cb6ab029938/filelock-3.12.2-py3-none-any.whl.metadata
Using cached filelock-3.12.2-py3-none-any.whl.metadata (2.7 kB)
Requirement already satisfied: jsonschema in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from ray>=2.5.1->vllm) (4.19.0)
Collecting msgpack<2.0.0,>=1.0.0 (from ray>=2.5.1->vllm)
Using cached msgpack-1.0.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (322 kB)
Collecting packaging (from ray>=2.5.1->vllm)
Using cached packaging-23.1-py3-none-any.whl (48 kB)
Collecting protobuf!=3.19.5,>=3.15.3 (from ray>=2.5.1->vllm)
Obtaining dependency information for protobuf!=3.19.5,>=3.15.3 from https://files.pythonhosted.org/packages/4c/87/59648989ad7f5ba6fe3c7f8abc555183f28559b6f6cd14ad17a3f0d3094f/protobuf-4.24.1-cp37-abi3-manylinux2014_x86_64.whl.metadata
Using cached protobuf-4.24.1-cp37-abi3-manylinux2014_x86_64.whl.metadata (540 bytes)
Collecting pyyaml (from ray>=2.5.1->vllm)
Obtaining dependency information for pyyaml from https://files.pythonhosted.org/packages/c8/6b/6600ac24725c7388255b2f5add93f91e58a5d7efaf4af244fdbcc11a541b/PyYAML-6.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached PyYAML-6.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting aiosignal (from ray>=2.5.1->vllm)
Using cached aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Collecting frozenlist (from ray>=2.5.1->vllm)
Obtaining dependency information for frozenlist from https://files.pythonhosted.org/packages/0b/36/c276486f89bee9098332710af2207344f360c6c6f104a4931a7566039b1d/frozenlist-1.4.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached frozenlist-1.4.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.2 kB)
Collecting requests (from ray>=2.5.1->vllm)
Obtaining dependency information for requests from https://files.pythonhosted.org/packages/70/8e/0e2d847013cb52cd35b38c009bb167a1a26b2ce6cd6965bf26b47bc0bf44/requests-2.31.0-py3-none-any.whl.metadata
Using cached requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting grpcio>=1.32.0 (from ray>=2.5.1->vllm)
Obtaining dependency information for grpcio>=1.32.0 from https://files.pythonhosted.org/packages/68/a8/7052e6a5c27159f080bb70fb8d8302c0e4bea148fb430acb57f83a8f2733/grpcio-1.57.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached grpcio-1.57.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.0 kB)
Collecting sympy (from torch>=2.0.0->vllm)
Using cached sympy-1.12-py3-none-any.whl (5.7 MB)
Collecting networkx (from torch>=2.0.0->vllm)
Using cached networkx-3.1-py3-none-any.whl (2.1 MB)
Collecting jinja2 (from torch>=2.0.0->vllm)
Using cached Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting nvidia-cuda-nvrtc-cu11==11.7.99 (from torch>=2.0.0->vllm)
Using cached nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
Collecting nvidia-cuda-runtime-cu11==11.7.99 (from torch>=2.0.0->vllm)
Using cached nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
Collecting nvidia-cuda-cupti-cu11==11.7.101 (from torch>=2.0.0->vllm)
Using cached nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
Collecting nvidia-cudnn-cu11==8.5.0.96 (from torch>=2.0.0->vllm)
Using cached nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
Collecting nvidia-cublas-cu11==11.10.3.66 (from torch>=2.0.0->vllm)
Using cached nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
Collecting nvidia-cufft-cu11==10.9.0.58 (from torch>=2.0.0->vllm)
Using cached nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
Collecting nvidia-curand-cu11==10.2.10.91 (from torch>=2.0.0->vllm)
Using cached nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
Collecting nvidia-cusolver-cu11==11.4.0.1 (from torch>=2.0.0->vllm)
Using cached nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)
Collecting nvidia-cusparse-cu11==11.7.4.91 (from torch>=2.0.0->vllm)
Using cached nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)
Collecting nvidia-nccl-cu11==2.14.3 (from torch>=2.0.0->vllm)
Using cached nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)
Collecting nvidia-nvtx-cu11==11.7.91 (from torch>=2.0.0->vllm)
Using cached nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)
Collecting triton==2.0.0 (from torch>=2.0.0->vllm)
Using cached triton-2.0.0-1-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.2 MB)
Requirement already satisfied: setuptools in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=2.0.0->vllm) (68.0.0)
Requirement already satisfied: wheel in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=2.0.0->vllm) (0.38.4)
Collecting cmake (from triton==2.0.0->torch>=2.0.0->vllm)
Obtaining dependency information for cmake from https://files.pythonhosted.org/packages/2e/51/3a4672a819b4532a378bfefad8f886cfe71057556e0d4eefb64523fd370a/cmake-3.27.2-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata
Using cached cmake-3.27.2-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.7 kB)
Collecting lit (from triton==2.0.0->torch>=2.0.0->vllm)
Using cached lit-16.0.6-py3-none-any.whl
Collecting huggingface-hub<1.0,>=0.14.1 (from transformers>=4.31.0->vllm)
Obtaining dependency information for huggingface-hub<1.0,>=0.14.1 from https://files.pythonhosted.org/packages/7f/c4/adcbe9a696c135578cabcbdd7331332daad4d49b7c43688bc2d36b3a47d2/huggingface_hub-0.16.4-py3-none-any.whl.metadata
Using cached huggingface_hub-0.16.4-py3-none-any.whl.metadata (12 kB)
Collecting regex!=2019.12.17 (from transformers>=4.31.0->vllm)
Obtaining dependency information for regex!=2019.12.17 from https://files.pythonhosted.org/packages/1f/5c/374ac3fa3c7ed9a967ad273a5e841897ef6b10aa6aad938ff10717a3e2a3/regex-2023.8.8-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached regex-2023.8.8-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (40 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers>=4.31.0->vllm)
Using cached tokenizers-0.13.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Collecting safetensors>=0.3.1 (from transformers>=4.31.0->vllm)
Obtaining dependency information for safetensors>=0.3.1 from https://files.pythonhosted.org/packages/4d/81/9b6ee8bd7faf7ae79afafde28b4e8abbcb897c9aa089d51eb5d0a1f3ffcd/safetensors-0.3.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached safetensors-0.3.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.5 kB)
Collecting tqdm>=4.27 (from transformers>=4.31.0->vllm)
Obtaining dependency information for tqdm>=4.27 from https://files.pythonhosted.org/packages/00/e5/f12a80907d0884e6dff9c16d0c0114d81b8cd07dc3ae54c5e962cc83037e/tqdm-4.66.1-py3-none-any.whl.metadata
Using cached tqdm-4.66.1-py3-none-any.whl.metadata (57 kB)
Collecting starlette<0.28.0,>=0.27.0 (from fastapi->vllm)
Obtaining dependency information for starlette<0.28.0,>=0.27.0 from https://files.pythonhosted.org/packages/58/f8/e2cca22387965584a409795913b774235752be4176d276714e15e1a58884/starlette-0.27.0-py3-none-any.whl.metadata
Using cached starlette-0.27.0-py3-none-any.whl.metadata (5.8 kB)
Collecting h11>=0.8 (from uvicorn->vllm)
Using cached h11-0.14.0-py3-none-any.whl (58 kB)
Collecting fsspec (from huggingface-hub<1.0,>=0.14.1->transformers>=4.31.0->vllm)
Obtaining dependency information for fsspec from https://files.pythonhosted.org/packages/e3/bd/4c0a4619494188a9db5d77e2100ab7d544a42e76b2447869d8e124e981d8/fsspec-2023.6.0-py3-none-any.whl.metadata
Using cached fsspec-2023.6.0-py3-none-any.whl.metadata (6.7 kB)
Collecting anyio<5,>=3.4.0 (from starlette<0.28.0,>=0.27.0->fastapi->vllm)
Obtaining dependency information for anyio<5,>=3.4.0 from https://files.pythonhosted.org/packages/19/24/44299477fe7dcc9cb58d0a57d5a7588d6af2ff403fdd2d47a246c91a3246/anyio-3.7.1-py3-none-any.whl.metadata
Using cached anyio-3.7.1-py3-none-any.whl.metadata (4.7 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch>=2.0.0->vllm)
Obtaining dependency information for MarkupSafe>=2.0 from https://files.pythonhosted.org/packages/de/e2/32c14301bb023986dff527a49325b6259cab4ebb4633f69de54af312fc45/MarkupSafe-2.1.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached MarkupSafe-2.1.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
Requirement already satisfied: attrs>=22.2.0 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (23.1.0)
Requirement already satisfied: importlib-resources>=1.4.0 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (6.0.1)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (2023.7.1)
Requirement already satisfied: pkgutil-resolve-name>=1.3.10 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (1.3.10)
Requirement already satisfied: referencing>=0.28.4 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (0.30.2)
Requirement already satisfied: rpds-py>=0.7.1 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from jsonschema->ray>=2.5.1->vllm) (0.9.2)
Collecting charset-normalizer<4,>=2 (from requests->ray>=2.5.1->vllm)
Obtaining dependency information for charset-normalizer<4,>=2 from https://files.pythonhosted.org/packages/cb/e7/5e43745003bf1f90668c7be23fc5952b3a2b9c2558f16749411c18039b36/charset_normalizer-3.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
Using cached charset_normalizer-3.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (31 kB)
Collecting idna<4,>=2.5 (from requests->ray>=2.5.1->vllm)
Using cached idna-3.4-py3-none-any.whl (61 kB)
Collecting urllib3<3,>=1.21.1 (from requests->ray>=2.5.1->vllm)
Obtaining dependency information for urllib3<3,>=1.21.1 from https://files.pythonhosted.org/packages/9b/81/62fd61001fa4b9d0df6e31d47ff49cfa9de4af03adecf339c7bc30656b37/urllib3-2.0.4-py3-none-any.whl.metadata
Using cached urllib3-2.0.4-py3-none-any.whl.metadata (6.6 kB)
Collecting certifi>=2017.4.17 (from requests->ray>=2.5.1->vllm)
Obtaining dependency information for certifi>=2017.4.17 from https://files.pythonhosted.org/packages/4c/dd/2234eab22353ffc7d94e8d13177aaa050113286e93e7b40eae01fbf7c3d9/certifi-2023.7.22-py3-none-any.whl.metadata
Using cached certifi-2023.7.22-py3-none-any.whl.metadata (2.2 kB)
Collecting mpmath>=0.19 (from sympy->torch>=2.0.0->vllm)
Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)
Collecting sniffio>=1.1 (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm)
Using cached sniffio-1.3.0-py3-none-any.whl (10 kB)
Collecting exceptiongroup (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi->vllm)
Obtaining dependency information for exceptiongroup from https://files.pythonhosted.org/packages/ad/83/b71e58666f156a39fb29417e4c8ca4bc7400c0dd4ed9e8842ab54dc8c344/exceptiongroup-1.1.3-py3-none-any.whl.metadata
Using cached exceptiongroup-1.1.3-py3-none-any.whl.metadata (6.1 kB)
Requirement already satisfied: zipp>=3.1.0 in /home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages (from importlib-resources>=1.4.0->jsonschema->ray>=2.5.1->vllm) (3.16.2)
Using cached pydantic-1.10.12-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
Using cached ray-2.6.3-cp38-cp38-manylinux2014_x86_64.whl (57.0 MB)
Using cached numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
Using cached transformers-4.31.0-py3-none-any.whl (7.4 MB)
Using cached xformers-0.0.21-cp38-cp38-manylinux2014_x86_64.whl (167.0 MB)
Using cached fastapi-0.101.1-py3-none-any.whl (65 kB)
Using cached uvicorn-0.23.2-py3-none-any.whl (59 kB)
Using cached click-8.1.7-py3-none-any.whl (97 kB)
Using cached grpcio-1.57.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.3 MB)
Using cached huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
Using cached protobuf-4.24.1-cp37-abi3-manylinux2014_x86_64.whl (311 kB)
Using cached PyYAML-6.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (736 kB)
Using cached regex-2023.8.8-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (774 kB)
Using cached safetensors-0.3.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Using cached starlette-0.27.0-py3-none-any.whl (66 kB)
Using cached tqdm-4.66.1-py3-none-any.whl (78 kB)
Using cached typing_extensions-4.7.1-py3-none-any.whl (33 kB)
Using cached frozenlist-1.4.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (220 kB)
Using cached filelock-3.12.2-py3-none-any.whl (10 kB)
Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Using cached anyio-3.7.1-py3-none-any.whl (80 kB)
Using cached certifi-2023.7.22-py3-none-any.whl (158 kB)
Using cached charset_normalizer-3.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199 kB)
Using cached MarkupSafe-2.1.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Using cached urllib3-2.0.4-py3-none-any.whl (123 kB)
Using cached cmake-3.27.2-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.1 MB)
Using cached fsspec-2023.6.0-py3-none-any.whl (163 kB)
Using cached exceptiongroup-1.1.3-py3-none-any.whl (14 kB)
Building wheels for collected packages: vllm
Building wheel for vllm (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for vllm (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [134 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-38
creating build/lib.linux-x86_64-cpython-38/vllm
copying vllm/outputs.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/config.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/sequence.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/logger.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/block.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/utils.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/sampling_params.py -> build/lib.linux-x86_64-cpython-38/vllm
creating build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/arg_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/ray_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/async_llm_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/llm_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/weight_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/model_loader.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/input_metadata.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
creating build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
copying vllm/transformers_utils/config.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
creating build/lib.linux-x86_64-cpython-38/vllm/entrypoints
copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
creating build/lib.linux-x86_64-cpython-38/vllm/worker
copying vllm/worker/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
copying vllm/worker/cache_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
copying vllm/worker/worker.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
creating build/lib.linux-x86_64-cpython-38/vllm/core
copying vllm/core/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/core
copying vllm/core/scheduler.py -> build/lib.linux-x86_64-cpython-38/vllm/core
copying vllm/core/policy.py -> build/lib.linux-x86_64-cpython-38/vllm/core
copying vllm/core/block_manager.py -> build/lib.linux-x86_64-cpython-38/vllm/core
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/attention.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
copying vllm/model_executor/parallel_utils/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
copying vllm/model_executor/parallel_utils/parallel_state.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/bloom.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/baichuan.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/layers.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/random.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/mappings.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
creating build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/baichuan.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
creating build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
running build_ext
Traceback (most recent call last):
File "/home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/bizon/miniconda3/envs/myenv/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
return _build_backend().build_wheel(wheel_directory, config_settings,
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 434, in build_wheel
return self._build_with_temp_dir(
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 419, in _build_with_temp_dir
self.run_setup()
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 341, in run_setup
exec(code, locals())
File "<string>", line 145, in <module>
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1233, in run_command
super().run_command(command)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 349, in run
self.run_command("build")
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1233, in run_command
super().run_command(command)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1233, in run_command
super().run_command(command)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 88, in run
_build_ext.run(self)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
self.build_extensions()
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 499, in build_extensions
_check_cuda_version(compiler_name, compiler_version)
File "/tmp/pip-build-env-ntqotulk/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 387, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (12.0) mismatches the version that was used to compile
PyTorch (11.7). Please make sure to use the same CUDA versions.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for vllm
Failed to build vllm
ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects
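In case it helps with triage, here is a minimal sketch of the comparison that seems to be failing. It is not the code in `torch/utils/cpp_extension.py`; the helper names (`parse_major_minor`, `nvcc_release`, `compare_cuda`) are hypothetical, and I am assuming that only a difference in the major CUDA version is fatal while a minor difference is tolerated. The log above shows pip collecting `torch-2.0.1` together with `nvidia-cuda-runtime-cu11==11.7.99`, which would explain why the build saw PyTorch compiled against 11.7 rather than the cu118 build I installed.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '11.7' or '12.0.140'."""
    match = re.search(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"no CUDA version found in {version!r}")
    return int(match.group(1)), int(match.group(2))


def nvcc_release(nvcc_output: str) -> Tuple[int, int]:
    """Read the 'release X.Y' field from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' field found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def compare_cuda(toolkit_version: str, torch_cuda_version: str) -> str:
    """Classify how the local CUDA toolkit relates to the CUDA PyTorch was built with.

    Assumption (not taken from the PyTorch source): a different major version
    is treated as fatal, a different minor version only as a warning.
    """
    toolkit = parse_major_minor(toolkit_version)
    torch_cuda = parse_major_minor(torch_cuda_version)
    if toolkit == torch_cuda:
        return "match"
    if toolkit[0] != torch_cuda[0]:
        return "major-mismatch"
    return "minor-mismatch"


# Values taken from the error message above: toolkit 12.0, PyTorch built with 11.7.
print(compare_cuda("12.0", "11.7"))
```

On a real machine the two inputs would come from `nvcc --version` and `torch.version.cuda`, checked in the same environment that performs the build.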