Contributor

@david6666666 david6666666 commented Jul 31, 2025

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

Fix the Hermes tool parser's handling of non-string argument types.

#21372 describes one such case, where the argument type is integer.

Test Plan

Test the issue described in #21372.

Serving command:

vllm serve /workspace/models/Qwen3-4B     --reasoning-parser qwen3 --served-model-name 'Qwen/Qwen3-4B' --enable-auto-tool-choice --tool-call-parser hermes

Code:

import json

from openai import OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

messages = [
    {
        "role":
        "system",
        "content":
        "You are an artificial intelligence assistant who will call tools everytime when responding.",
    },
    {
        "role":
        "user",
        "content":
        "Hi! Do you have any detailed information about the product id 7355608 and inserted true?",
    },
]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_product_info",
            "description":
            "Get detailed information of a product based on its product ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "inserted": {
                        "type": "boolean",
                        "description": "inserted.",
                    },
                    "product_id": {
                        "type": "integer",
                        "description": "The product ID of the product.",
                    },
                },
                "required": ["product_id", "inserted"],
            },
        },
    },
]
use_stream = True
model = client.models.list().data[0].id
chat_completion = client.chat.completions.create(
    stream=use_stream,
    messages=messages,
    top_p=0.95,
    temperature=0.66,
    presence_penalty=0,
    frequency_penalty=0.04,
    model=model,
    tools=tools,
    extra_body={
        "top_k": 20,
        "repetition_penalty": 1.05,
        "chat_template_kwargs": {
            "enable_thinking": False
        },
    },
)

debug_list = list()
if use_stream:
    print("Tool call args:")
    for c in chat_completion:
        if c.choices[0].delta.tool_calls:
            print(c.choices[0].delta.tool_calls[0].function.arguments, end="")
        debug_list.append(c)
else:
    print("Chat completion tool calls:")
    print(chat_completion.choices[0].message.tool_calls)
    print(chat_completion.choices[0].message.tool_calls[0].function.arguments)
print("\n")

Test Result

use_stream = True

Tool call args:
None{"product_id": 7355608, "inserted": true}

use_stream = False

Chat completion tool calls:
[ChatCompletionMessageFunctionToolCall(id='chatcmpl-tool-67eb1b45d3b3474bae4bfde6c2fe20a2', function=Function(arguments='{"product_id": 7355608, "inserted": true}', name='get_product_info'), type='function')]
{"product_id": 7355608, "inserted": true}
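The leading None in the streaming output appears to come from the first delta, which carries the function name and has arguments set to None; the test script prints it as-is. A minimal sketch of accumulating the fragments while skipping that delta is given here. The fragment list is a hand-written stand-in for the values of delta.tool_calls[0].function.arguments, and join_argument_fragments is a hypothetical helper, not part of vLLM or the OpenAI client.

```python
import json
from typing import Iterable, Optional


def join_argument_fragments(fragments: Iterable[Optional[str]]) -> str:
    """Concatenate streamed argument fragments, skipping None/empty deltas."""
    return "".join(fragment for fragment in fragments if fragment)


# Hand-written stand-in for the streamed argument values, in arrival order.
# The first entry mimics the delta that carries only the function name.
fragments = [None, '{"product_id": 7355608, ', '"inserted": true}']

arguments_json = join_argument_fragments(fragments)
arguments = json.loads(arguments_json)
print(arguments)
```

Parsing the joined string with json.loads is also a quick way to confirm that the streamed arguments form valid JSON once the stream ends.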

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request fixes a bug in the Hermes tool parser where it failed to handle non-string argument types during streaming. The change correctly identifies whether the last argument is a string by inspecting the JSON string representation and adjusts how the string is processed. My review includes one suggestion to enhance the readability and maintainability of this critical but complex logic.

Comment on lines 326 to 329
stripped_cur_arguments_json = cur_arguments_json[:-2] \
    if (cur_arguments_json[-2] == '"'
        or cur_arguments_json[-2] == "'") else \
    cur_arguments_json[:-1]
Contributor

high

While this logic correctly fixes the issue with non-string arguments, the ternary expression with a backslash for line continuation is a bit dense and can be hard to read and maintain.

For better clarity and maintainability, I suggest refactoring this into a standard if/else block. This makes the logic more explicit and easier to understand at a glance, which is valuable for complex string manipulations like this.

if cur_arguments_json[-2] in ('"', "'"):
    # Last argument is a string, so remove the closing quote and brace.
    stripped_cur_arguments_json = cur_arguments_json[:-2]
else:
    # Last argument is not a string, so remove the closing brace only.
    stripped_cur_arguments_json = cur_arguments_json[:-1]
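
To see why the branch matters, here is a small sketch that applies the same slicing to complete argument strings produced by json.dumps. The function name strip_closing and the sample arguments are illustrative only. How the lost character surfaces in the final stream depends on the parser's diffing logic, which is not reproduced here.

```python
import json


def strip_closing(cur_arguments_json: str) -> str:
    """Mirror the suggested if/else: drop the trailing quote+brace or brace."""
    if cur_arguments_json[-2] in ('"', "'"):
        # Last argument is a string: remove the closing quote and brace.
        return cur_arguments_json[:-2]
    # Last argument is not a string: remove the closing brace only.
    return cur_arguments_json[:-1]


int_args = json.dumps({"product_id": 7355608})
bool_args = json.dumps({"inserted": True})
str_args = json.dumps({"city": "Paris"})

# Unconditionally slicing two characters eats part of a non-string value.
print(int_args[:-2])
print(bool_args[:-2])

# The type-aware version keeps the value intact.
print(strip_closing(int_args))
print(strip_closing(bool_args))
print(strip_closing(str_args))
```

With unconditional two-character slicing, the integer loses its last digit and the boolean becomes tru, which matches the kind of truncation reported in this thread.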


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@david6666666
Contributor Author

@DarkLight1337 @chaunceyjiang please review, thanks

@david6666666
Contributor Author

@aarnphm please review, thanks

Collaborator

@chaunceyjiang chaunceyjiang left a comment

Can you add some unit tests?

@chaunceyjiang
Collaborator

I used the test script you provided and the latest code from the main branch, but I couldn’t reproduce the issue you described.

Tool call args:
ChoiceDeltaToolCall(index=0, id='chatcmpl-tool-b843a3616a2a46ecb1a9dc2434713c3c', function=ChoiceDeltaToolCallFunction(arguments=None, name='get_product_info'), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"product_id": 735', name=None), type=None)
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='6', name=None), type=None)
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='0', name=None), type=None)
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='8', name=None), type=None)
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='}', name=None), type=None)

@BruceW-07
Contributor

BruceW-07 commented Aug 5, 2025

> I used the test script you provided and the latest code from the main branch, but I couldn’t reproduce the issue you described.
>
> Tool call args:
> ChoiceDeltaToolCall(index=0, id='chatcmpl-tool-b843a3616a2a46ecb1a9dc2434713c3c', function=ChoiceDeltaToolCallFunction(arguments=None, name='get_product_info'), type='function')
> ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"product_id": 735', name=None), type=None)
> ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='6', name=None), type=None)
> ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='0', name=None), type=None)
> ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='8', name=None), type=None)
> ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='}', name=None), type=None)

I think this is exactly the result described in the issue. The correct result should be "product_id": 7355608 (the number 5 should appear twice in the id)
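
The mismatch is easier to spot once the fragments are joined. This short check copies the argument fragments from the deltas quoted above and compares the parsed value with the ID from the user prompt; the variable names are illustrative.

```python
import json

# Argument fragments copied from the streamed deltas quoted above.
fragments = ['{"product_id": 735', "6", "0", "8", "}"]
streamed = json.loads("".join(fragments))

expected_product_id = 7355608  # the ID given in the user prompt

print(streamed["product_id"])
print(streamed["product_id"] == expected_product_id)
```

The joined string is valid JSON, which is why the output looks plausible at first glance, but the parsed ID is one digit short.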

@chaunceyjiang
Collaborator

[Bugfix] Fix hermes tool parser handling of non-string argument types

@BruceW-07 My question is: in your PR description, you mentioned that the Hermes tool parser cannot handle non-string argument types, but based on my tests, "product_id": 735608 is a non-string argument and it seems to be processed correctly.

@chaunceyjiang
Collaborator

> I think this is exactly the result described in the issue. The correct result should be "product_id": 7355608 (the number 5 should appear twice in the id)

Do you mean that when the argument type is non-string, the Hermes tool parser unexpectedly removes some characters?

@BruceW-07
Contributor

BruceW-07 commented Aug 5, 2025

> > I think this is exactly the result described in the issue. The correct result should be "product_id": 7355608 (the number 5 should appear twice in the id)
>
> Do you mean that when the argument type is non-string, the Hermes tool parser unexpectedly removes some characters?

Yes. As far as I can tell, the original code did not account for non-string arguments, so some characters were unexpectedly dropped from the result.

    stripped_cur_arguments_json = cur_arguments_json[:-2]
else:
    # last argument is not a string,
    # so remove the closing brace only.
Collaborator

Can we add an example in the comment?

Contributor

Sure! I added a simple example

Collaborator

Thanks~

# validate arguments
streamed_args = json.loads(function_args_str)
assert isinstance(streamed_args, dict)
assert isinstance(streamed_args.get("product_id"), int)
Collaborator

Can you write another test with a bool parameter?

I believe bool types may also encounter the truncation issue with the tool parser.
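
A combined check for both parameter types might look like the sketch below. It is not the test added in this PR; check_streamed_args is a hypothetical helper, and the argument strings are hand-written. One detail worth keeping in mind is that bool is a subclass of int in Python, so an isinstance(..., int) assertion alone would also accept True.

```python
import json


def check_streamed_args(function_args_str: str) -> dict:
    """Parse the accumulated arguments and validate the expected types."""
    streamed_args = json.loads(function_args_str)
    assert isinstance(streamed_args, dict)
    product_id = streamed_args.get("product_id")
    # bool is a subclass of int, so exclude it explicitly.
    assert isinstance(product_id, int) and not isinstance(product_id, bool)
    assert isinstance(streamed_args.get("inserted"), bool)
    return streamed_args


args = check_streamed_args('{"product_id": 7355608, "inserted": true}')
print(args)

# A truncated boolean is not valid JSON, so parsing fails loudly.
try:
    json.loads('{"product_id": 7355608, "inserted": tru}')
    truncated_rejected = False
except json.JSONDecodeError:
    truncated_rejected = True
print(truncated_rejected)
```

A truncated integer, by contrast, still parses, so comparing against the expected value is the only way to catch it.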

Contributor

No problem, I've added a new Boolean argument to the function.

Collaborator

openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

messages = [
    {
        "role":
        "system",
        "content":
        "You are an artificial intelligence assistant who will call tools everytime when responding.",
    },
    {
        "role":
        "user",
        "content":
        "Hi! Do you have any detailed information about the product id 7355608 and inserted true?",
    },
]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_product_info",
            "description":
            "Get detailed information of a product based on its product ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "inserted": {
                        "type": "boolean",
                        "description": "inserted.",
                    },
                    "product_id": {
                        "type": "integer",
                        "description": "The product ID of the product.",
                    },
                },
                "required": ["product_id", "inserted"],
            },
        },
    },
]
use_stream = True
model = client.models.list().data[0].id
chat_completion = client.chat.completions.create(
    stream=use_stream,
    messages=messages,
    top_p=0.95,
    temperature=0.66,
    presence_penalty=0,
    frequency_penalty=0.04,
    model=model,
    tools=tools,
    extra_body={
        "top_k": 20,
        "repetition_penalty": 1.05,
        "chat_template_kwargs": {
            "enable_thinking": False
        },
    },
)

I can reproduce the issue of the boolean value being truncated using this example.

Contributor

Thanks, I've added it to the unit test

@din0s

din0s commented Aug 5, 2025

Here's another example which might be good to test for:

{
    "type": "function",
    "function": {
        "name": "search",
        "parameters": {
            "type": "object",
            "properties": {
                "search_request": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string"
                        },
                        "retrieval_method": {
                            "enum": ["keyword", "neural", "rrf"],
                            "type": "string"
                        }
                    },
                    "required": ["query", "retrieval_method"]
                }
            },
            "required": ["search_request"]
        }
    }
}

Expected:
{"search_request": {"query": "latest transformers papers", "retrieval_method": "rrf"}}
Actual:
{"search_request": "latest transformers papers", "retrieval_method": "rrf"}
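
One way to catch this kind of flattening in a test is to walk the parsed arguments against the tool's parameter schema. The sketch below is a deliberately small checker written for illustration, not a full JSON Schema validator; find_problems is a hypothetical helper that only looks at object types and required keys.

```python
from typing import Any


def find_problems(schema: dict, value: Any, path: str = "$") -> list:
    """Report non-object values and missing required keys (objects only)."""
    if schema.get("type") != "object":
        return []  # this sketch ignores strings, enums, numbers, ...
    if not isinstance(value, dict):
        return [f"{path}: expected an object"]
    problems = [
        f"{path}.{key}: missing"
        for key in schema.get("required", [])
        if key not in value
    ]
    for key, sub_schema in schema.get("properties", {}).items():
        if key in value:
            problems += find_problems(sub_schema, value[key], f"{path}.{key}")
    return problems


# The "parameters" schema from the example above.
parameters = {
    "type": "object",
    "properties": {
        "search_request": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "retrieval_method": {
                    "enum": ["keyword", "neural", "rrf"],
                    "type": "string",
                },
            },
            "required": ["query", "retrieval_method"],
        }
    },
    "required": ["search_request"],
}

expected = {"search_request": {"query": "latest transformers papers",
                               "retrieval_method": "rrf"}}
actual = {"search_request": "latest transformers papers",
          "retrieval_method": "rrf"}

print(find_problems(parameters, expected))
print(find_problems(parameters, actual))
```

For the expected arguments the list is empty; for the actual output the checker reports that search_request should have been an object.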

@artmzhuk

artmzhuk commented Aug 6, 2025

Hi! Is there an ETA for this bugfix?

@BruceW-07
Contributor

> Here's another example which might be good to test for:
>
> {
>     "type": "function",
>     "function": {
>         "name": "search",
>         "parameters": {
>             "type": "object",
>             "properties": {
>                 "search_request": {
>                     "type": "object",
>                     "properties": {
>                         "query": {
>                             "type": "string"
>                         },
>                         "retrieval_method": {
>                             "enum": ["keyword", "neural", "rrf"],
>                             "type": "string"
>                         }
>                     },
>                     "required": ["query", "retrieval_method"]
>                 }
>             },
>             "required": ["search_request"]
>         }
>     }
> }
>
> Expected: {"search_request": {"query": "latest transformers papers", "retrieval_method": "rrf"}}
> Actual: {"search_request": "latest transformers papers", "retrieval_method": "rrf"}

Thanks for the example. I've fixed the issue and now get the correct result with Qwen3-4B. I haven't added it to the unit tests yet, because the tool_chat_template_hermes.jinja used in the tests seems to have a problem handling the example you provided.

@BruceW-07
Contributor

@aarnphm @chaunceyjiang please review, thanks!

@mergify mergify bot added ci/build deepseek Related to DeepSeek models llama Related to Llama models multi-modality Related to multi-modality (#4194) gpt-oss Related to GPT-OSS models speculative-decoding v1 labels Sep 19, 2025
@mergify mergify bot added the tpu Related to Google TPUs label Sep 19, 2025
@mergify mergify bot removed the tpu Related to Google TPUs label Sep 19, 2025
Signed-off-by: David Chen <[email protected]>
r'\{"name":\s*"' +
re.escape(function_name) + r'"\s*,\s*"arguments":\s*(.*)',
tool_call_portion.strip(), re.DOTALL)
cur_arguments_json = match.group(1)
Contributor

@gcalmettes gcalmettes Sep 19, 2025

There might still be a need to check if there is a match, as the match could be None for string arguments:

(APIServer pid=1) DEBUG 09-19 07:20:26 [entrypoints/.../tool_parsers/hermes_tool_parser.py:344] diffing old arguments: {}
(APIServer pid=1) DEBUG 09-19 07:20:26 [entrypoints/.../tool_parsers/hermes_tool_parser.py:345] against new ones: {'name': '263012.pdf'}
(APIServer pid=1) ERROR 09-19 07:20:26 [entrypoints/.../tool_parsers/hermes_tool_parser.py:441] Error trying to handle streaming tool call.
(APIServer pid=1) ERROR 09-19 07:20:26 [entrypoints/.../tool_parsers/hermes_tool_parser.py:441] Traceback (most recent call last):
(APIServer pid=1) ERROR 09-19 07:20:26 [entrypoints/.../tool_parsers/hermes_tool_parser.py:441]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py", line 375, in extract_tool_calls_streaming
(APIServer pid=1) ERROR 09-19 07:20:26 [entrypoints/.../tool_parsers/hermes_tool_parser.py:441]     cur_arguments_json = match.group(1)
(APIServer pid=1) ERROR 09-19 07:20:26 [entrypoints/.../tool_parsers/hermes_tool_parser.py:441]                          ^^^^^^^^^^^
(APIServer pid=1) ERROR 09-19 07:20:26 [entrypoints/.../tool_parsers/hermes_tool_parser.py:441] AttributeError: 'NoneType' object has no attribute 'group'

Contributor

adding this change fixed it in my case:

                if match:
                    cur_arguments_json = match.group(1)
                else:
                    cur_arguments_json = json.dumps(cur_arguments,
                                                ensure_ascii=False)
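
As a standalone illustration of the guard, the sketch below wraps the pattern from the diff in a hypothetical helper, arguments_json_for. Using re.search is an assumption, since the call itself is not visible in the quoted lines, and the non-matching input (keys in a different order) is only one made-up way the pattern could fail; the thread does not say what the actual model output looked like.

```python
import json
import re


def arguments_json_for(tool_call_portion: str, function_name: str,
                       cur_arguments: dict) -> str:
    """Return the raw arguments text, or re-serialize when the regex misses."""
    match = re.search(
        r'\{"name":\s*"' + re.escape(function_name) +
        r'"\s*,\s*"arguments":\s*(.*)',
        tool_call_portion.strip(), re.DOTALL)
    if match:
        return match.group(1)
    # Fallback: rebuild the JSON from the already-parsed arguments.
    return json.dumps(cur_arguments, ensure_ascii=False)


# Partial tool call where the pattern matches: the raw text is returned.
matched = arguments_json_for(
    '{"name": "get_product_info", "arguments": {"product_id": 735',
    "get_product_info", {"product_id": 735})
print(matched)

# Hypothetical output with "arguments" before "name": no match, so fall back.
fallback = arguments_json_for(
    '{"arguments": {"name": "263012.pdf"}, "name": "open_file"}',
    "open_file", {"name": "263012.pdf"})
print(fallback)
```

Without the guard, the second call would raise the same AttributeError as in the traceback, because match is None.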

Contributor Author

yes, thanks for your suggestion

@chaunceyjiang chaunceyjiang merged commit 0eecb31 into vllm-project:main Sep 22, 2025
43 checks passed
@github-project-automation github-project-automation bot moved this from To Triage to Done in gpt-oss Issues & Enhancements Sep 22, 2025
kingsmad pushed a commit to kingsmad/vllm that referenced this pull request Sep 22, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
…vllm-project#22002)

Signed-off-by: wangzi <[email protected]>
Signed-off-by: David Chen <[email protected]>
Co-authored-by: wangzi <[email protected]>
Co-authored-by: Chauncey <[email protected]>
Signed-off-by: charlifu <[email protected]>

Labels

ci/build deepseek Related to DeepSeek models frontend gpt-oss Related to GPT-OSS models llama Related to Llama models multi-modality Related to multi-modality (#4194) ready ONLY add when PR is ready to merge/full CI is needed speculative-decoding tool-calling v1

Projects

Status: Done


6 participants