feat: add support for tool_choice to responses api #4106
Conversation
✱ Stainless preview builds — edit this comment to update it; it will appear in the SDK's changelogs.
- 💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceAllowedTools` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
- 💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFileSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
- 💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceWebSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
- 💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFunctionTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
- 💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceMCPTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
✅ llama-stack-client-kotlin studio · code · diff
Your SDK built successfully.
generate ⚠️ → lint ✅ → test ❗ (new diagnostics: 10 notes)
- 💡 Model/Recommended: each of `#/components/schemas/OpenAIResponseInputToolChoiceAllowedTools`, `OpenAIResponseInputToolChoiceFileSearch`, `OpenAIResponseInputToolChoiceWebSearch`, `OpenAIResponseInputToolChoiceFunctionTool`, `OpenAIResponseInputToolChoiceMCPTool`, and `OpenAIResponseInputToolChoiceCustomTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
- 💡 Go/SchemaUnionDiscriminatorMissing: This union schema has more than one object variant, but no [`discriminator`](https://www.stainless.com/docs/reference/openapi-support#discriminator) property, so deserializing the union may be inefficient or ambiguous.
- 💡 Java/SchemaUnionDiscriminatorMissing (reported 3×): same as the Go note above.
✅ llama-stack-client-go studio · code · diff
Your SDK built successfully.
generate ⚠️ → lint ❗ → test ❗ (new diagnostics: 7 notes)
`go get github.com/stainless-sdks/llama-stack-client-go@56702f16003886e559b06f15b1e1ef7e64dce679`
- 💡 Model/Recommended: each of `#/components/schemas/OpenAIResponseInputToolChoiceAllowedTools`, `OpenAIResponseInputToolChoiceFileSearch`, `OpenAIResponseInputToolChoiceWebSearch`, `OpenAIResponseInputToolChoiceFunctionTool`, `OpenAIResponseInputToolChoiceMCPTool`, and `OpenAIResponseInputToolChoiceCustomTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
- 💡 Go/SchemaUnionDiscriminatorMissing: This union schema has more than one object variant, but no [`discriminator`](https://www.stainless.com/docs/reference/openapi-support#discriminator) property, so deserializing the union may be inefficient or ambiguous.
✅ llama-stack-client-python studio · code · diff
Your SDK built successfully.
generate ⚠️ → build ⏳ → lint ⏳ → test ⏳ (new diagnostics: 7 notes)
- 💡 Model/Recommended: each of `#/components/schemas/OpenAIResponseInputToolChoiceAllowedTools`, `OpenAIResponseInputToolChoiceFileSearch`, `OpenAIResponseInputToolChoiceWebSearch`, `OpenAIResponseInputToolChoiceFunctionTool`, `OpenAIResponseInputToolChoiceMCPTool`, and `OpenAIResponseInputToolChoiceCustomTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
- 💡 Go/SchemaUnionDiscriminatorMissing: This union schema has more than one object variant, but no [`discriminator`](https://www.stainless.com/docs/reference/openapi-support#discriminator) property, so deserializing the union may be inefficient or ambiguous.
This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
Last updated: 2025-12-02 14:49:15 UTC
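The SchemaUnionDiscriminatorMissing notes above are about deserialization cost: without a tag field, a decoder must try each union variant in turn. A minimal sketch of the discriminated-dispatch pattern the diagnostic recommends (the variant classes are illustrative stand-ins, not the generated SDK's types):

```python
# Sketch of discriminator-based union decoding; the classes below are
# illustrative, not the SDK's generated types.
from dataclasses import dataclass


@dataclass
class FileSearchChoice:
    type: str


@dataclass
class WebSearchChoice:
    type: str


# With a "type" discriminator, decoding is a single dict lookup instead of
# attempting to validate the payload against every variant in the union.
VARIANTS = {"file_search": FileSearchChoice, "web_search": WebSearchChoice}


def parse_tool_choice(payload: dict):
    return VARIANTS[payload["type"]](**payload)
```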
Force-pushed: 9bab29b → 55bd671
Force-pushed: 3fd6509 → a7e1132
Force-pushed: a7e1132 → 09597b6
Resolved review threads (outdated):
- src/llama_stack/providers/inline/agents/meta_reference/responses/utils.py
- src/llama_stack/providers/inline/agents/meta_reference/responses/streaming.py
ashwinb left a comment:
A bunch of inline comments. Thanks for this PR!
Resolved review threads (outdated):
- src/llama_stack/providers/inline/agents/meta_reference/responses/streaming.py (5 threads)
- src/llama_stack/providers/inline/agents/meta_reference/responses/utils.py
This pull request has merge conflicts that must be resolved before it can be merged. @jaideepr97 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
Force-pushed: 26a2ee8 → e6a6574
Force-pushed: 6de7dc7 → 9de2b1f
Added unit tests, but removed the integration test from this PR for now since it requires changes in the client to pass. cc @ashwinb, lmk if there is a different way to proceed here.
Force-pushed: 4ac35ec → 116ddb4
Through some anecdotal testing, I've been able to reproduce the output of queries specifying tool_choice both against OpenAI directly and when routed through Llama Stack.
Force-pushed: 116ddb4 → 983fba8
cdoern left a comment:
A few questions/comments. Looking good overall!
Resolved review threads:
- tests/unit/providers/agents/meta_reference/test_response_tool_context.py (outdated)
- src/llama_stack/providers/inline/agents/meta_reference/responses/streaming.py (outdated)
- src/llama_stack/providers/inline/agents/meta_reference/responses/streaming.py
```python
response_format: OpenAIResponseFormatParam
tool_context: ToolContext | None
responses_tool_choice: OpenAIResponseInputToolChoice | None = None
chat_tool_choice: str | dict[str, Any] | None = None
```
lots of different types here in this union, Is this going to be hard to enforce?
aren't we enforcing the type check by setting this union?
I think if we just did not have chat_tool_choice here in this struct and let it be a local in the working loop, it might be clearer? then you can even make responses_tool_choice be just tool_choice?
ack, updated accordingly
Force-pushed: 1e02307 → 6fe37f7
Force-pushed: 13756c8 → 811e573
@ashwinb would you have bandwidth to give this a second look?
Will review in detail soon. One quick comment: could you update the PR summary and remove the "Closes: " bit. We should close the issue only once we have landed client types and added integration tests.
Force-pushed: 811e573 → fd53229
```python
)
# chat_tool_choice can be str, dict-like object, or None
if isinstance(chat_tool_choice, str | type(None)):
    self.ctx.chat_tool_choice = chat_tool_choice
```
hm, why is this mutation to ctx necessary? in general, the "ctx" should be considered an immutable thing which is just a bag of parameters computed initially before hitting the main processing loop
ack, not updating the ctx anymore.
Though again, this was done following the example of other fields like chat_tools that are also modified and stored in the ctx earlier during the same processing loop.
```python
break

n_iter += 1
# After first iteration, reset tool_choice to "auto" to let model decide freely
```
this feels like a very model-specific thing baked deeply into the implementation with the API or documentation not making any note of it. does OpenAI talk about it, for example? I don't think we should do this at all since it is very surprising behavior
yeah, to be honest this particular fix came from claude-4.5-thinking; I wouldn't have come up with it myself.
Prior to this change, when I tried to query an OpenAI model via Llama Stack, I was not getting usable results when I enforced tool_choice: it ended up in an infinite loop of calling the same function over and over. After this fix was added, I was able to get the same quality of results from OpenAI as when querying it directly.
So it seems like an important fix for maintaining parity with querying OpenAI directly. I think it's important that Llama Stack not introduce any performance deterioration when a user wants to query OpenAI through Llama Stack. Having this fix in didn't seem to impact the results I saw from testing against Qwen either, but this was not an exhaustive test by any means.
I also understand your concern regarding this, so I'm not sure how to proceed here.
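To make the behavior under discussion concrete, here is a sketch (function and variable names are assumed, not the PR's code): honor the caller's tool_choice on the first inference call only, then fall back to "auto" so a forced tool isn't re-invoked forever.

```python
# Assumed-name sketch of the "reset to auto after the first iteration"
# behavior; chat_complete stands in for the underlying inference call.
def run_tool_loop(chat_complete, messages, tools, tool_choice, max_iters=5):
    response = None
    for n_iter in range(max_iters):
        # Enforce the caller's tool_choice only on the first call.
        effective_choice = tool_choice if n_iter == 0 else "auto"
        response = chat_complete(messages, tools=tools, tool_choice=effective_choice)
        if not response.get("tool_calls"):
            return response  # final answer, no further tool calls
        messages.append(response)  # tool execution / result handling elided
    return response
```

Without the fallback, a forced choice like `{"type": "mcp", "name": "namespaces_list"}` is re-sent on every iteration, which matches the repeated `namespaces_list` calls shown below.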
For reference, here is a tool_choice query I'm making against gpt-4o:
```python
respB = client.responses.create(
    model=args.model,
    tools=[
        {
            "type": "mcp",
            "server_label": MCP_LABEL,
            "server_url": MCP_SERVER_URL,
            "require_approval": "never",
        }
    ],
    tool_choice={
        "type": "mcp",
        "server_label": MCP_LABEL,
        "name": "namespaces_list",
    },
    input=[
        {
            "role": "user",
            "content": (
                "List what kubernetes MCP tools you are allowed to use in this context. "
                "Tell me something about the cluster. "
                "Try to call only the MCP tools that you have access to, and tell me "
                "which tools you called. If none are available, explain why."
            ),
        }
    ],
)
pretty_print_result(
    "B: no restriction at the MCP tool (server) level, tool choice is mcp "
    "with server label and tool name",
    respB,
)
```

Response without this fix:
```
=== B: no restriction at the MCP tool (server) level, tool choice is mcp with server label and tool name ===
Output text:
mcp_call[1]: kubernetes :: namespaces_list :: {}
mcp_call[2]: kubernetes :: namespaces_list :: {}
mcp_call[3]: kubernetes :: namespaces_list :: {}
mcp_call[4]: kubernetes :: namespaces_list :: {}
mcp_call[5]: kubernetes :: namespaces_list :: {}
mcp_call[6]: kubernetes :: namespaces_list :: {}
mcp_call[7]: kubernetes :: namespaces_list :: {}
mcp_call[8]: kubernetes :: namespaces_list :: {}
mcp_call[9]: kubernetes :: namespaces_list :: {}
mcp_call[10]: kubernetes :: namespaces_list :: {}
```
Response with this fix:

```
=== B: no restriction at the MCP tool (server) level, tool choice is mcp with server label and tool name ===
Output text:
Here is what I found about the Kubernetes cluster using the tools available:

### Tools Used
1. **Namespaces List**: This tool lists all the Kubernetes namespaces in the current cluster.
2. **Events List**: This tool lists all the Kubernetes events in the current cluster from all namespaces.

### Cluster Information

#### Namespaces
The cluster currently has the following namespaces:
- **default**: Active
- **kube-node-lease**: Active
- **kube-public**: Active
- **kube-system**: Active
- **local-path-storage**: Active

#### Events
Some of the notable events happening in the cluster include:
- **Node kind-control-plane**:
  - Kubelet started successfully.
  - Warning about failed node allocatable limits update.
  - Node is ready with sufficient memory, no disk pressure, and sufficient PID.
  - Registered the node successfully.
- **Pod Events**:
  - Pods in the kube-system namespace like `coredns` and `kube-proxy` have been scheduled, container images pulled successfully, and started without issues.
  - There are some scheduling warnings due to untolerated taints.
- **Resource Management Events**:
  - Controller manager and scheduler leadership has been successfully maintained.

These observations reflect a running cluster with active namespace management and event tracking. This combination ensures efficient operations and identification of potential issues for administrative action.

mcp_call[1]: kubernetes :: namespaces_list :: {}
mcp_call[2]: kubernetes :: events_list :: {}
```

```python
self.mcp_tool_to_server[t.name] = mcp_tool

# Add to reverse mapping for efficient server_label lookup
if mcp_tool.server_label not in self.server_label_to_tools:
```
this is more of a lazy question: when does it happen that the initial dict is not sufficient to cover mcp_tool -- i.e., you are seeing a new tool during the loop? maybe I am just confused
by initial dict are you maybe referring to mcp_tool_to_server? Because this line is populating server_label_to_tools, which is a separate reverse mapping so that it is easy to look up all the tools associated with a given server.
I added this construct because it is required to be able to quickly look up all tools associated with a given mcp server_label to process the mcp tool choice, and it seemed easier to build and maintain this reverse mapping as we see and process new tools rather than rebuild it from mcp_tool_to_server each time, which could get computationally expensive.
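A small sketch of the reverse mapping described above (class and method names are assumed, not the PR's code): maintain server_label → tool names alongside tool name → server so an mcp tool_choice carrying only a server_label can be resolved without scanning the forward map.

```python
# Assumed-name sketch of the forward and reverse MCP tool mappings.
from collections import defaultdict


class ToolRegistry:
    def __init__(self):
        self.mcp_tool_to_server: dict[str, str] = {}
        self.server_label_to_tools: dict[str, list[str]] = defaultdict(list)

    def register(self, tool_name: str, server_label: str) -> None:
        # Maintain both mappings as tools are discovered during the loop.
        self.mcp_tool_to_server[tool_name] = server_label
        self.server_label_to_tools[server_label].append(tool_name)

    def tools_for_server(self, server_label: str) -> list[str]:
        # O(1) lookup instead of scanning mcp_tool_to_server each time.
        return self.server_label_to_tools.get(server_label, [])
```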
ashwinb left a comment:
This is looking much better, thank you for the iteration @jaideepr97
Signed-off-by: Jaideep Rao <[email protected]>
Force-pushed: fd53229 → 36d7abd
What does this PR do?
Adds support for enforcing tool usage via the responses API. See https://platform.openai.com/docs/api-reference/responses/create#responses_create-tool_choice for details from the official documentation.
Note: at present this PR only supports `file_search` and `web_search` as options to enforce builtin tool usage.
Closes #3548
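For illustration, a sketch of the request shape this enables (the model and input values are placeholders, and per the note above only the built-in tools named there are supported for enforcement):

```python
# Illustrative request body for forcing a built-in tool via tool_choice
# on the responses API; model and input values are placeholders.
payload = {
    "model": "gpt-4o",
    "tools": [{"type": "web_search"}],
    # tool_choice may be a mode string ("auto", "required", "none") or an
    # object naming a specific tool, as in the OpenAI responses API.
    "tool_choice": {"type": "web_search"},
    "input": "Summarize today's top story.",
}
```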
Test Plan
`./scripts/unit-tests.sh tests/unit/providers/agents/meta_reference/test_response_tool_context.py`