Huggingface error for models with arbitrary model name #103

Closed

strangiato opened this issue Mar 31, 2025 · 2 comments

strangiato commented Mar 31, 2025

When executing guidellm against a vLLM instance with an arbitrary served model name set, guidellm errors out with a Hugging Face error because it cannot access the tokenizer_config.json for that name.

Reproducing the issue

Deploy a vLLM instance with any model and set the following argument:

--served-model-name=my-model
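
For reference, a full deployment command might look something like the following; the Hugging Face model ID is only an example, any model works, and the important part is that --served-model-name differs from the real model name:

vllm serve mistralai/Mistral-7B-Instruct-v0.2 \
  --served-model-name=my-model \
  --port 8000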

Run a guidellm test against the endpoint:

guidellm \
  --target "http://localhost:8000/v1" \
  --model "my-model" \
  --data-type emulated \
  --data "prompt_tokens=512,generated_tokens=128"

Results

guidellm errors out with a 401 when fetching tokenizer_config.json for my-model, since my-model is not a valid Hugging Face model name.

requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/my-model/resolve/main/tokenizer_config.json

Stack Trace

The following is an example of a full stack trace of the error (from a run where the served model name was set to granite rather than my-model):

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/utils/_http.py", line 409, in hf_raise_for_status
    response.raise_for_status()
  File "/opt/app-root/lib64/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/granite/resolve/main/tokenizer_config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/utils/hub.py", line 424, in cached_files
    hf_hub_download(
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 961, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 1068, in _hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 1596, in _raise_on_head_call_error
    raise head_call_error
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 1484, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 1401, in get_hf_file_metadata
    r = _request_wrapper(
        ^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 285, in _request_wrapper
    response = _request_wrapper(
               ^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 309, in _request_wrapper
    hf_raise_for_status(response)
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/utils/_http.py", line 459, in hf_raise_for_status
    raise _format(RepositoryNotFoundError, message, response) from e
huggingface_hub.errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-67eb10fa-305b679009abf7055fe388ff;30339271-9f91-49ff-8324-c347a6b5da16)

Repository Not Found for url: https://huggingface.co/granite/resolve/main/tokenizer_config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
Invalid username or password.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/guidellm/main.py", line 239, in generate_benchmark_report
    tokenizer_inst = backend_inst.model_tokenizer()
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/guidellm/backend/base.py", line 173, in model_tokenizer
    return AutoTokenizer.from_pretrained(self.model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 910, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 742, in get_tokenizer_config
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/utils/hub.py", line 266, in cached_file
    file = cached_files(path_or_repo_id=path_or_repo_id, filenames=[filename], **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/utils/hub.py", line 456, in cached_files
    raise EnvironmentError(
OSError: granite is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/bin/guidellm", line 8, in <module>
    sys.exit(generate_benchmark_report_cli())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/guidellm/main.py", line 171, in generate_benchmark_report_cli
    generate_benchmark_report(
  File "/opt/app-root/lib64/python3.11/site-packages/guidellm/main.py", line 241, in generate_benchmark_report
    raise ValueError(
ValueError: Could not load model's tokenizer, --tokenizer must be provided for request generation

Why is this important

OpenShift AI sets the --served-model-name argument to the name of the ServingRuntime that the user provides when deploying a vLLM instance; it does not use the actual Hugging Face model name. As a result, any model deployed with OpenShift AI cannot be load tested with guidellm unless the user knows to customize the --served-model-name argument and to set it to the correct Hugging Face name.

sjmonson (Collaborator) commented Apr 1, 2025

You need to pass --tokenizer with either a path to the on-disk model or the name of the model on Hugging Face. guidellm needs access to the model's tokenizer for the "emulated" data mode since it is generating token sequences.
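
For example, the reproduction command above would look something like this; the --tokenizer value is a placeholder and should point at the Hugging Face ID (or local path) of whatever model is actually being served as my-model:

guidellm \
  --target "http://localhost:8000/v1" \
  --model "my-model" \
  --tokenizer "mistralai/Mistral-7B-Instruct-v0.2" \
  --data-type emulated \
  --data "prompt_tokens=512,generated_tokens=128"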

markurtz (Member) commented

As @sjmonson mentioned, passing the --tokenizer argument (now --processor on main) will let you work around this. The processor/tokenizer is needed for synthetic data generation to ensure the token counts are correct for the prompts being sent.

With #96 landing, the processor is now only invoked as needed. So another workaround is to pass in a dataset, either as an HF dataset or as a txt/csv/jsonl file; guidellm will use the text stored within as the prompts and will not require the processor to count tokens. A rough sketch of this is below.
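
A minimal sketch of the dataset workaround, assuming a plain text file with one prompt per line is accepted directly by --data (exact flags can differ between releases, so check guidellm --help for the installed version):

printf '%s\n' \
  "Summarize the main differences between TCP and UDP." \
  "Write a haiku about distributed systems." \
  > prompts.txt

guidellm \
  --target "http://localhost:8000/v1" \
  --model "my-model" \
  --data "prompts.txt"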

Closing this out, but feel free to re-ping if you hit any issues.
