[Bug]: ValueError: Model architectures ['LlamaForCausalLM'] failed to be inspected #11715

@npanpaliya

Description

Your current environment

Environment:
OS/Arch: IBM Linux, powerpc64le, ubi9, OCP cluster
vLLM version: 0.6.6
torch: 2.5.1
torchvision: 0.20.1

Model Input Dumps

None

🐛 Describe the bug

While deploying a model through a KServe inference service that uses the vLLM serving runtime, I'm getting the error below:

[root@bastion-0 ~]# oc logs tinyllama-predictor-6f7ccc8d86-dfj5r -c kserve-container
INFO 01-03 09:38:16 api_server.py:712] vLLM API server version 0.6.6.post2.dev29+g6036d4db.d20250103
INFO 01-03 09:38:16 api_server.py:713] args: Namespace(host=None, port=8080, uvicorn_log_level='info', allow_credentials=False, allowed_origins=['*'], allowed_methods=['*'], allowed_headers=['*'], api_key=None, lora_modules=None, prompt_adapters=None, chat_template=None, chat_template_content_format='auto', response_role='assistant', ssl_keyfile=None, ssl_certfile=None, ssl_ca_certs=None, ssl_cert_reqs=0, root_path=None, middleware=[], return_tokens_as_token_ids=False, disable_frontend_multiprocessing=False, enable_request_id_headers=False, enable_auto_tool_choice=False, tool_call_parser=None, tool_parser_plugin='', model='/mnt/models', task='auto', tokenizer=None, skip_tokenizer_init=False, revision=None, code_revision=None, tokenizer_revision=None, tokenizer_mode='auto', trust_remote_code=False, allowed_local_media_path=None, download_dir=None, load_format='auto', config_format=<ConfigFormat.AUTO: 'auto'>, dtype='auto', kv_cache_dtype='auto', quantization_param_path=None, max_model_len=None, guided_decoding_backend='xgrammar', logits_processor_pattern=None, distributed_executor_backend=None, worker_use_ray=False, pipeline_parallel_size=1, tensor_parallel_size=1, max_parallel_loading_workers=None, ray_workers_use_nsight=False, block_size=None, enable_prefix_caching=None, disable_sliding_window=False, use_v2_block_manager=True, num_lookahead_slots=0, seed=0, swap_space=4, cpu_offload_gb=0, gpu_memory_utilization=0.9, num_gpu_blocks_override=None, max_num_batched_tokens=None, max_num_seqs=None, max_logprobs=20, disable_log_stats=False, quantization=None, rope_scaling=None, rope_theta=None, hf_overrides=None, enforce_eager=False, max_seq_len_to_capture=8192, disable_custom_all_reduce=False, tokenizer_pool_size=0, tokenizer_pool_type='ray', tokenizer_pool_extra_config=None, limit_mm_per_prompt=None, mm_processor_kwargs=None, disable_mm_preprocessor_cache=False, enable_lora=False, enable_lora_bias=False, max_loras=1, max_lora_rank=16, lora_extra_vocab_size=256, lora_dtype='auto', long_lora_scaling_factors=None, max_cpu_loras=None, fully_sharded_loras=False, enable_prompt_adapter=False, max_prompt_adapters=1, max_prompt_adapter_token=0, device='auto', num_scheduler_steps=1, multi_step_stream_outputs=True, scheduler_delay_factor=0.0, enable_chunked_prefill=None, speculative_model=None, speculative_model_quantization=None, num_speculative_tokens=None, speculative_disable_mqa_scorer=False, speculative_draft_tensor_parallel_size=None, speculative_max_model_len=None, speculative_disable_by_batch_size=None, ngram_prompt_lookup_max=None, ngram_prompt_lookup_min=None, spec_decoding_acceptance_method='rejection_sampler', typical_acceptance_sampler_posterior_threshold=None, typical_acceptance_sampler_posterior_alpha=None, disable_logprobs_during_spec_decoding=None, model_loader_extra_config=None, ignore_patterns=[], preemption_mode=None, served_model_name=['tinyllama'], qlora_adapter_name_or_path=None, otlp_traces_endpoint=None, collect_detailed_traces=None, disable_async_output_proc=False, scheduling_policy='fcfs', override_neuron_config=None, override_pooler_config=None, compilation_config=None, kv_transfer_config=None, worker_cls='auto', generation_config=None, disable_log_requests=False, max_log_len=None, disable_fastapi_docs=False, enable_prompt_tokens_details=False)
INFO 01-03 09:38:16 api_server.py:199] Started engine process with PID 17
ERROR 01-03 09:38:23 registry.py:293] Error in inspecting model architecture 'LlamaForCausalLM'
ERROR 01-03 09:38:23 registry.py:293] Traceback (most recent call last):
ERROR 01-03 09:38:23 registry.py:293]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 488, in _run_in_subprocess
ERROR 01-03 09:38:23 registry.py:293]     returned.check_returncode()
ERROR 01-03 09:38:23 registry.py:293]   File "/opt/conda/lib/python3.11/subprocess.py", line 502, in check_returncode
ERROR 01-03 09:38:23 registry.py:293]     raise CalledProcessError(self.returncode, self.args, self.stdout,
ERROR 01-03 09:38:23 registry.py:293] subprocess.CalledProcessError: Command '['/opt/conda/bin/python3', '-m', 'vllm.model_executor.models.registry']' died with <Signals.SIGILL: 4>.
ERROR 01-03 09:38:23 registry.py:293] 
ERROR 01-03 09:38:23 registry.py:293] The above exception was the direct cause of the following exception:
ERROR 01-03 09:38:23 registry.py:293] 
ERROR 01-03 09:38:23 registry.py:293] Traceback (most recent call last):
ERROR 01-03 09:38:23 registry.py:293]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 291, in _try_inspect_model_cls
ERROR 01-03 09:38:23 registry.py:293]     return model.inspect_model_cls()
ERROR 01-03 09:38:23 registry.py:293]            ^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:23 registry.py:293]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 263, in inspect_model_cls
ERROR 01-03 09:38:23 registry.py:293]     return _run_in_subprocess(
ERROR 01-03 09:38:23 registry.py:293]            ^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:23 registry.py:293]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 491, in _run_in_subprocess
ERROR 01-03 09:38:23 registry.py:293]     raise RuntimeError(f"Error raised in subprocess:\n"
ERROR 01-03 09:38:23 registry.py:293] RuntimeError: Error raised in subprocess:
ERROR 01-03 09:38:23 registry.py:293] <frozen runpy>:128: RuntimeWarning: 'vllm.model_executor.models.registry' found in sys.modules after import of package 'vllm.model_executor.models', but prior to execution of 'vllm.model_executor.models.registry'; this may result in unpredictable behaviour
ERROR 01-03 09:38:23 registry.py:293] 
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/entrypoints/openai/api_server.py", line 783, in <module>
    uvloop.run(run_server(args))
  File "/opt/conda/lib/python3.11/site-packages/uvloop/__init__.py", line 105, in run
    return runner.run(wrapper())
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
  File "/opt/conda/lib/python3.11/site-packages/uvloop/__init__.py", line 61, in wrapper
    return await main
           ^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/entrypoints/openai/api_server.py", line 749, in run_server
    async with build_async_engine_client(args) as engine_client:
  File "/opt/conda/lib/python3.11/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/entrypoints/openai/api_server.py", line 118, in build_async_engine_client
    async with build_async_engine_client_from_engine_args(
  File "/opt/conda/lib/python3.11/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/entrypoints/openai/api_server.py", line 210, in build_async_engine_client_from_engine_args
    engine_config = engine_args.create_engine_config()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/arg_utils.py", line 1044, in create_engine_config
    model_config = self.create_model_config()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/arg_utils.py", line 970, in create_model_config
    return ModelConfig(
           ^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/config.py", line 343, in __init__
    self.multimodal_config = self._init_multimodal_config(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/config.py", line 398, in _init_multimodal_config
    if ModelRegistry.is_multimodal_model(architectures):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 427, in is_multimodal_model
    model_cls, _ = self.inspect_model_cls(architectures)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 387, in inspect_model_cls
    return self._raise_for_unsupported(architectures)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 344, in _raise_for_unsupported
    raise ValueError(
ValueError: Model architectures ['LlamaForCausalLM'] failed to be inspected. Please check the logs for more details.
ERROR 01-03 09:38:29 registry.py:293] Error in inspecting model architecture 'LlamaForCausalLM'
ERROR 01-03 09:38:29 registry.py:293] Traceback (most recent call last):
ERROR 01-03 09:38:29 registry.py:293]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 488, in _run_in_subprocess
ERROR 01-03 09:38:29 registry.py:293]     returned.check_returncode()
ERROR 01-03 09:38:29 registry.py:293]   File "/opt/conda/lib/python3.11/subprocess.py", line 502, in check_returncode
ERROR 01-03 09:38:29 registry.py:293]     raise CalledProcessError(self.returncode, self.args, self.stdout,
ERROR 01-03 09:38:29 registry.py:293] subprocess.CalledProcessError: Command '['/opt/conda/bin/python3', '-m', 'vllm.model_executor.models.registry']' died with <Signals.SIGILL: 4>.
ERROR 01-03 09:38:29 registry.py:293] 
ERROR 01-03 09:38:29 registry.py:293] The above exception was the direct cause of the following exception:
ERROR 01-03 09:38:29 registry.py:293] 
ERROR 01-03 09:38:29 registry.py:293] Traceback (most recent call last):
ERROR 01-03 09:38:29 registry.py:293]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 291, in _try_inspect_model_cls
ERROR 01-03 09:38:29 registry.py:293]     return model.inspect_model_cls()
ERROR 01-03 09:38:29 registry.py:293]            ^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:29 registry.py:293]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 263, in inspect_model_cls
ERROR 01-03 09:38:29 registry.py:293]     return _run_in_subprocess(
ERROR 01-03 09:38:29 registry.py:293]            ^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:29 registry.py:293]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 491, in _run_in_subprocess
ERROR 01-03 09:38:29 registry.py:293]     raise RuntimeError(f"Error raised in subprocess:\n"
ERROR 01-03 09:38:29 registry.py:293] RuntimeError: Error raised in subprocess:
ERROR 01-03 09:38:29 registry.py:293] <frozen runpy>:128: RuntimeWarning: 'vllm.model_executor.models.registry' found in sys.modules after import of package 'vllm.model_executor.models', but prior to execution of 'vllm.model_executor.models.registry'; this may result in unpredictable behaviour
ERROR 01-03 09:38:29 registry.py:293] 
ERROR 01-03 09:38:29 engine.py:366] Model architectures ['LlamaForCausalLM'] failed to be inspected. Please check the logs for more details.
ERROR 01-03 09:38:29 engine.py:366] Traceback (most recent call last):
ERROR 01-03 09:38:29 engine.py:366]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/multiprocessing/engine.py", line 357, in run_mp_engine
ERROR 01-03 09:38:29 engine.py:366]     engine = MQLLMEngine.from_engine_args(engine_args=engine_args,
ERROR 01-03 09:38:29 engine.py:366]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:29 engine.py:366]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/multiprocessing/engine.py", line 114, in from_engine_args
ERROR 01-03 09:38:29 engine.py:366]     engine_config = engine_args.create_engine_config(usage_context)
ERROR 01-03 09:38:29 engine.py:366]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:29 engine.py:366]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/arg_utils.py", line 1044, in create_engine_config
ERROR 01-03 09:38:29 engine.py:366]     model_config = self.create_model_config()
ERROR 01-03 09:38:29 engine.py:366]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:29 engine.py:366]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/arg_utils.py", line 970, in create_model_config
ERROR 01-03 09:38:29 engine.py:366]     return ModelConfig(
ERROR 01-03 09:38:29 engine.py:366]            ^^^^^^^^^^^^
ERROR 01-03 09:38:29 engine.py:366]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/config.py", line 343, in __init__
ERROR 01-03 09:38:29 engine.py:366]     self.multimodal_config = self._init_multimodal_config(
ERROR 01-03 09:38:29 engine.py:366]                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:29 engine.py:366]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/config.py", line 398, in _init_multimodal_config
ERROR 01-03 09:38:29 engine.py:366]     if ModelRegistry.is_multimodal_model(architectures):
ERROR 01-03 09:38:29 engine.py:366]        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:29 engine.py:366]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 427, in is_multimodal_model
ERROR 01-03 09:38:29 engine.py:366]     model_cls, _ = self.inspect_model_cls(architectures)
ERROR 01-03 09:38:29 engine.py:366]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:29 engine.py:366]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 387, in inspect_model_cls
ERROR 01-03 09:38:29 engine.py:366]     return self._raise_for_unsupported(architectures)
ERROR 01-03 09:38:29 engine.py:366]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-03 09:38:29 engine.py:366]   File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 344, in _raise_for_unsupported
ERROR 01-03 09:38:29 engine.py:366]     raise ValueError(
ERROR 01-03 09:38:29 engine.py:366] ValueError: Model architectures ['LlamaForCausalLM'] failed to be inspected. Please check the logs for more details.
Process SpawnProcess-1:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/conda/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/multiprocessing/engine.py", line 368, in run_mp_engine
    raise e
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/multiprocessing/engine.py", line 357, in run_mp_engine
    engine = MQLLMEngine.from_engine_args(engine_args=engine_args,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/multiprocessing/engine.py", line 114, in from_engine_args
    engine_config = engine_args.create_engine_config(usage_context)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/arg_utils.py", line 1044, in create_engine_config
    model_config = self.create_model_config()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/engine/arg_utils.py", line 970, in create_model_config
    return ModelConfig(
           ^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/config.py", line 343, in __init__
    self.multimodal_config = self._init_multimodal_config(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/config.py", line 398, in _init_multimodal_config
    if ModelRegistry.is_multimodal_model(architectures):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 427, in is_multimodal_model
    model_cls, _ = self.inspect_model_cls(architectures)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 387, in inspect_model_cls
    return self._raise_for_unsupported(architectures)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/vllm-0.6.6.post2.dev29+g6036d4db.d20250103.cpu-py3.11-linux-ppc64le.egg/vllm/model_executor/models/registry.py", line 344, in _raise_for_unsupported
    raise ValueError(
ValueError: Model architectures ['LlamaForCausalLM'] failed to be inspected. Please check the logs for more details
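
Note that the final ValueError is secondary: the root cause visible in the log is the registry-inspection subprocess (`python -m vllm.model_executor.models.registry`) dying with SIGILL (illegal instruction), which on ppc64le typically points to torch/vLLM CPU binaries built with instructions the host POWER CPU does not support. Below is a minimal sketch to reproduce the crash outside the KServe startup path; the diagnosis in the printed messages is an assumption, not a confirmed root cause:

```python
import signal
import subprocess
import sys

# Re-run the registry inspection the same way vLLM's _run_in_subprocess does,
# so the crash can be reproduced independently of kserve/api_server startup.
proc = subprocess.run(
    [sys.executable, "-m", "vllm.model_executor.models.registry"]
)

if proc.returncode == -signal.SIGILL:
    # On POSIX, a negative return code is the signal number that killed the child.
    print("Registry subprocess died with SIGILL (illegal instruction).")
    print("Likely cause (assumption): torch/vLLM CPU binaries compiled for a "
          "newer POWER ISA than this machine supports.")
else:
    print(f"Registry subprocess exited with return code {proc.returncode}")
```

If the subprocess dies with SIGILL here as well, a useful next step would be checking whether `import torch` alone crashes, which would isolate the problem to the torch build rather than vLLM.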

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot at the bottom right corner of the documentation page, which can answer many frequently asked questions.

Metadata

Assignees: No one assigned
Labels: bug (Something isn't working), stale (Over 90 days of inactivity)
