
[Bug]: assortment of warnings / errors coming out of vllm basic python inference script #18634

@vadimkantorov

Description


Your current environment

Versions:

>>> import torch; torch.__version__
'2.7.0+cu126'
>>> import transformers; transformers.__version__
'4.52.2'
>>> import vllm; vllm.__version__
'0.9.1.dev59+gb6a6e7a52'

🐛 Describe the bug

My script is really basic: it prepares model input IDs with tokenizers, constructs a vllm.LLM, and then invokes .generate(...) once.

But I'm somehow getting a bunch of different nasty errors and warnings. Are they expected? Is there a way to eliminate them?

Thanks!
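For reference, a minimal sketch of the kind of script described above (the model name, prompt, and sampling settings are placeholders I've filled in, not details from the report; it also needs a GPU and a model download to actually run):

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Placeholder model; the report does not name the one actually used.
MODEL = "model-name-here"

# Prepare model input IDs with the tokenizer.
tokenizer = AutoTokenizer.from_pretrained(MODEL)
prompt_ids = tokenizer("Hello, world!")["input_ids"]

# Construct the LLM and invoke .generate(...) once on pre-tokenized input.
llm = LLM(model=MODEL)
outputs = llm.generate(
    [{"prompt_token_ids": prompt_ids}],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```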

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
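Following the warning's own second suggestion, the tokenizers message can be silenced by exporting the variable before any fork happens, e.g. at the very top of the script:

```python
import os

# Must be set before `tokenizers` does any parallel work and before the
# process forks; "false" also disables tokenizer parallelism outright.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

print(os.environ["TOKENIZERS_PARALLELISM"])
```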

[W523 21:36:39.228028808 TCPStore.cpp:125] [c10d] recvValue failed on SocketImpl(fd=170, addr=[localhost]:60438, remote=[localhost]:50427): failed to recv, got 0 bytes
Exception raised from recvBytes at /pytorch/torch/csrc/distributed/c10d/Utils.hpp:678 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x98 (0x7f4a777785e8 in /mnt/fs/venv_cu126_py312/lib/python3.12/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x5ba8afe (0x7f4a6065aafe in /mnt/fs/venv_cu126_py312/lib/python3.12/site-packages/torch/lib/libtorch_cpu.so)
frame #2: <unknown function> + 0x5baae40 (0x7f4a6065ce40 in /mnt/fs/venv_cu126_py312/lib/python3.12/site-packages/torch/lib/libtorch_cpu.so)
frame #3: <unknown function> + 0x5bab74a (0x7f4a6065d74a in /mnt/fs/venv_cu126_py312/lib/python3.12/site-packages/torch/lib/libtorch_cpu.so)
frame #4: c10d::TCPStore::check(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) + 0x2a9 (0x7f4a606571a9 in /mnt/fs/venv_cu126_py312/lib/python3.12/site-packages/torch/lib/libtorch_cpu.so)
frame #5: c10d::ProcessGroupNCCL::heartbeatMonitor() + 0x379 (0x7f4a1d8509a9 in /mnt/fs/venv_cu126_py312/lib/python3.12/site-packages/torch/lib/libtorch_cuda.so)
frame #6: <unknown function> + 0xdc253 (0x7f4b1dda3253 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #7: <unknown function> + 0x94ac3 (0x7f4b238e1ac3 in /lib/x86_64-linux-gnu/libc.so.6)
frame #8: <unknown function> + 0x126850 (0x7f4b23973850 in /lib/x86_64-linux-gnu/libc.so.6)

[W523 21:36:40.236201644 ProcessGroupNCCL.cpp:1659] [PG ID 0 PG GUID 0 Rank 3] Failed to check the "should dump" flag on TCPStore, (maybe TCPStore server has shut down too early), with error: failed to recv, got 0 bytes
[rank6]:[W523 21:36:40.236202281 ProcessGroupNCCL.cpp:1659] [PG ID 0 PG GUID 0 Rank 6] Failed to check the "should dump" flag on TCPStore, (maybe TCPStore server has shut down too early), with error: failed to recv, got 0 bytes

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Metadata

Labels: bug (Something isn't working)
