Description
Running examples/phi3v_example.py on the CPU backend fails with an ImportError for torchvision:

/vllm_2$ python examples/phi3v_example.py
WARNING 06-21 14:53:06 ray_utils.py:46] Failed to import Ray with ModuleNotFoundError("No module named 'ray'"). For multi-node inference, please install Ray with pip install ray.
/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
INFO 06-21 14:53:08 llm_engine.py:164] Initializing an LLM engine (v0.5.0.post1) with config: model='microsoft/Phi-3-vision-128k-instruct', speculative_config=None, tokenizer='microsoft/Phi-3-vision-128k-instruct', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=8128, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cpu, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None), seed=0, served_model_name=microsoft/Phi-3-vision-128k-instruct)
WARNING 06-21 14:53:10 cpu_executor.py:116] CUDA graph is not supported on CPU, fallback to the eager mode.
WARNING 06-21 14:53:10 cpu_executor.py:143] Environment variable VLLM_CPU_KVCACHE_SPACE (GB) for CPU backend is not set, using 4 by default.
INFO 06-21 14:53:10 selector.py:113] Cannot use _Backend.FLASH_ATTN backend on CPU.
INFO 06-21 14:53:10 selector.py:64] Using Torch SDPA backend.
INFO 06-21 14:53:11 selector.py:113] Cannot use _Backend.FLASH_ATTN backend on CPU.
INFO 06-21 14:53:11 selector.py:64] Using Torch SDPA backend.
INFO 06-21 14:53:12 weight_utils.py:218] Using model weights format ['*.safetensors']
INFO 06-21 14:53:14 cpu_executor.py:72] # CPU blocks: 682
Processed prompts: 0%| | 0/1 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/examples/phi3v_example.py", line 58, in <module>
[rank0]: run_phi3v()
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/examples/phi3v_example.py", line 31, in run_phi3v
[rank0]: outputs = llm.generate(
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/utils.py", line 727, in inner
[rank0]: return fn(*args, **kwargs)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/entrypoints/llm.py", line 304, in generate
[rank0]: outputs = self._run_engine(use_tqdm=use_tqdm)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/entrypoints/llm.py", line 556, in _run_engine
[rank0]: step_outputs = self.llm_engine.step()
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/engine/llm_engine.py", line 806, in step
[rank0]: output = self.model_executor.execute_model(
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/executor/cpu_executor.py", line 78, in execute_model
[rank0]: output = self.driver_worker.execute_model(execute_model_req)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/worker/cpu_worker.py", line 302, in execute_model
[rank0]: output = self.model_runner.execute_model(seq_group_metadata_list,
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/worker/cpu_model_runner.py", line 337, in execute_model
[rank0]: ) = self.prepare_input_tensors(seq_group_metadata_list)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/worker/cpu_model_runner.py", line 287, in prepare_input_tensors
[rank0]: ) = self._prepare_prompt(seq_group_metadata_list)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/worker/cpu_model_runner.py", line 132, in _prepare_prompt
[rank0]: mm_kwargs = self.multi_modal_input_processor(mm_data)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/multimodal/registry.py", line 142, in process_input
[rank0]: .process_input(data, model_config, vlm_config)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/multimodal/base.py", line 126, in process_input
[rank0]: return processor(data, model_config, vlm_config)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/multimodal/image.py", line 109, in _default_input_processor
[rank0]: image_processor = self._get_hf_image_processor(
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/multimodal/image.py", line 97, in _get_hf_image_processor
[rank0]: return cached_get_image_processor(
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/vllm-0.5.0.post1+cpu-py3.10-linux-x86_64.egg/vllm/transformers_utils/image_processor.py", line 21, in get_image_processor
[rank0]: processor: BaseImageProcessor = AutoImageProcessor.from_pretrained(
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/transformers/models/auto/image_processing_auto.py", line 398, in from_pretrained
[rank0]: image_processor_class = get_class_from_dynamic_module(
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 501, in get_class_from_dynamic_module
[rank0]: final_module = get_cached_module_file(
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 326, in get_cached_module_file
[rank0]: modules_needed = check_imports(resolved_module_file)
[rank0]: File "/home/sapidblue/SapidBlue/invoice_data_extraction/vllm_2/venv/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 181, in check_imports
[rank0]: raise ImportError(
[rank0]: ImportError: This modeling file requires the following packages that were not found in your environment: torchvision. Run pip install torchvision
Processed prompts: 0%| | 0/1 [00:01<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]
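
For reference, here is a minimal sketch (not from the original report) that reproduces the failing import path without going through vLLM. The traceback ends inside transformers' dynamic-module loader, which checks that every package imported by the model's remote image-processing code is installed; the model name is taken from the log above, and the assumption is that torchvision is the only missing dependency, as the ImportError states.

```python
# Hedged sketch: exercise the same code path the traceback ends in, i.e.
# loading Phi-3-vision's custom image processor via transformers.
from transformers import AutoImageProcessor

# Assumption: after `pip install torchvision` in the same venv, this loads cleanly.
processor = AutoImageProcessor.from_pretrained(
    "microsoft/Phi-3-vision-128k-instruct",
    trust_remote_code=True,  # the model ships a custom image processor that imports torchvision
)
print(type(processor).__name__)
```

If the same ImportError persists after installing torchvision, double-check that pip installed it into the virtual environment shown in the paths above rather than the system Python.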