Conversation

@jaideepr97
Contributor

@jaideepr97 jaideepr97 commented Nov 8, 2025

What does this PR do?

Adds support for enforcing tool usage via the responses API. See the official documentation at https://platform.openai.com/docs/api-reference/responses/create#responses_create-tool_choice for details.
Note: at present, this PR only supports `file_search` and `web_search` as options for enforcing built-in tool usage.

Closes #3548
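
To illustrate the request shape this feature targets, here is a hypothetical sketch that builds a responses-API payload forcing one of the supported built-in tools. The helper name and validation are illustrative, not part of this PR; note that `file_search` in the real API typically also needs `vector_store_ids` in the tool entry, omitted here for brevity.

```python
# Hypothetical sketch: build a responses.create payload that forces a
# built-in tool via tool_choice. Only file_search and web_search are
# supported by this PR, so anything else is rejected here.
def build_forced_tool_request(model: str, prompt: str, tool_type: str) -> dict:
    """Return a request body whose tool_choice forces a built-in tool."""
    if tool_type not in ("file_search", "web_search"):
        raise ValueError(f"unsupported built-in tool: {tool_type}")
    return {
        "model": model,
        "input": prompt,
        # the tool must also be listed in tools for the choice to be valid
        "tools": [{"type": tool_type}],
        # object-form tool_choice forces this specific built-in tool
        "tool_choice": {"type": tool_type},
    }
```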

Test Plan

./scripts/unit-tests.sh tests/unit/providers/agents/meta_reference/test_response_tool_context.py

@meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Nov 8, 2025
@github-actions
Contributor

github-actions bot commented Nov 8, 2025

✱ Stainless preview builds

This PR will update the llama-stack-client SDKs with the following commit message.

feat: add support for tool_choice to repsponses api


⚠️ llama-stack-client-node studio · code · diff

There was a regression in your SDK.
generate ⚠️ · build ✅ · lint ✅ · test ✅

npm install https://pkg.stainless.com/s/llama-stack-client-node/581b2bfc85cd9218b7476171de1c5031e0b7d8a5/dist.tar.gz
New diagnostics (5 warning, 7 note)

⚠️ Python/DuplicateDeclaration: We generated two separated types under the same name: `InputOpenAIResponseMessageOutput`. If they are the referring to the same type, they should be extracted to the same ref and be declared as a model. Otherwise, they should be renamed with `x-stainless-naming`
⚠️ Python/DuplicateDeclaration: We generated two separated types under the same name: `InputListOpenAIResponseMessageUnionOpenAIResponseInputFunctionToolCallOutputOpenAIResponseMessageInput`. If they are the referring to the same type, they should be extracted to the same ref and be declared as a model. Otherwise, they should be renamed with `x-stainless-naming`
⚠️ Python/DuplicateDeclaration: We generated two separated types under the same name: `DataOpenAIResponseMessageOutput`. If they are the referring to the same type, they should be extracted to the same ref and be declared as a model. Otherwise, they should be renamed with `x-stainless-naming`
⚠️ Python/NameNotAllowed: Encountered response property `model_type` which may conflict with Pydantic properties.

Pydantic uses model_ as a protected namespace that shouldn't be used for attributes of our own API's models.
Please rename it using the 'renameValue' transform.

⚠️ Python/NameNotAllowed: Encountered response property `model_type` which may conflict with Pydantic properties.

Pydantic uses model_ as a protected namespace that shouldn't be used for attributes of our own API's models.
Please rename it using the 'renameValue' transform.

💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceAllowedTools` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFileSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceWebSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFunctionTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceMCPTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
llama-stack-client-kotlin studio · code · diff

Your SDK built successfully.
generate ⚠️ · lint ✅ · test ❗

New diagnostics (10 note)
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceAllowedTools` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFileSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceWebSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFunctionTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceMCPTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceCustomTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Go/SchemaUnionDiscriminatorMissing: This union schema has more than one object variant, but no [`discriminator`](https://www.stainless.com/docs/reference/openapi-support#discriminator) property, so deserializing the union may be inefficient or ambiguous.
💡 Java/SchemaUnionDiscriminatorMissing: This union schema has more than one object variant, but no [`discriminator`](https://www.stainless.com/docs/reference/openapi-support#discriminator) property, so deserializing the union may be inefficient or ambiguous.
💡 Java/SchemaUnionDiscriminatorMissing: This union schema has more than one object variant, but no [`discriminator`](https://www.stainless.com/docs/reference/openapi-support#discriminator) property, so deserializing the union may be inefficient or ambiguous.
💡 Java/SchemaUnionDiscriminatorMissing: This union schema has more than one object variant, but no [`discriminator`](https://www.stainless.com/docs/reference/openapi-support#discriminator) property, so deserializing the union may be inefficient or ambiguous.
llama-stack-client-go studio · code · diff

Your SDK built successfully.
generate ⚠️ · lint ❗ · test ❗

go get github.com/stainless-sdks/llama-stack-client-go@56702f16003886e559b06f15b1e1ef7e64dce679
New diagnostics (7 note)
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceAllowedTools` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFileSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceWebSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFunctionTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceMCPTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceCustomTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Go/SchemaUnionDiscriminatorMissing: This union schema has more than one object variant, but no [`discriminator`](https://www.stainless.com/docs/reference/openapi-support#discriminator) property, so deserializing the union may be inefficient or ambiguous.
llama-stack-client-python studio · code · diff

Your SDK built successfully.
generate ⚠️ · build ⏳ · lint ⏳ · test ⏳

New diagnostics (7 note)
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceAllowedTools` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFileSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceWebSearch` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceFunctionTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceMCPTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Model/Recommended: `#/components/schemas/OpenAIResponseInputToolChoiceCustomTool` could potentially be defined as a [model](https://www.stainless.com/docs/guides/configure#models) within `#/resources/responses`.
💡 Go/SchemaUnionDiscriminatorMissing: This union schema has more than one object variant, but no [`discriminator`](https://www.stainless.com/docs/reference/openapi-support#discriminator) property, so deserializing the union may be inefficient or ambiguous.

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
Last updated: 2025-12-02 14:49:15 UTC

@jaideepr97 force-pushed the tool-choice branch 4 times, most recently from 9bab29b to 55bd671, on November 8, 2025 09:10
@jaideepr97 changed the title from "feat: add support for tool_choice to repsponses api" to "feat: add support for tool_choice to responses api" on Nov 8, 2025
@jaideepr97 force-pushed the tool-choice branch 2 times, most recently from 3fd6509 to a7e1132, on November 10, 2025 16:21
@jaideepr97 marked this pull request as ready for review on November 10, 2025 19:47
Contributor

@ashwinb ashwinb left a comment


A bunch of inline comments. Thanks for this PR!

@mergify

mergify bot commented Nov 18, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @jaideepr97 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@jaideepr97 jaideepr97 force-pushed the tool-choice branch 5 times, most recently from 6de7dc7 to 9de2b1f Compare November 22, 2025 13:59
@jaideepr97
Contributor Author

jaideepr97 commented Nov 22, 2025

Added unit tests, but removed the integration test from this PR for now since it requires client changes to pass. I'm guessing we will need to merge this PR, update the client, and then add integration tests in a follow-up PR.

cc @ashwinb lmk if there is a different way to proceed here

@jaideepr97
Contributor Author

Through some anecdotal testing I've been able to reproduce the same output when running queries that specify tool_choice, both against OpenAI directly and when routing through Llama Stack.
Also tested against a locally hosted Qwen3 model.

Collaborator

@cdoern cdoern left a comment


A few questions/comments. Looking good overall!

response_format: OpenAIResponseFormatParam
tool_context: ToolContext | None
responses_tool_choice: OpenAIResponseInputToolChoice | None = None
chat_tool_choice: str | dict[str, Any] | None = None
Collaborator


Lots of different types here in this union; is this going to be hard to enforce?

Contributor Author

@jaideepr97 jaideepr97 Nov 24, 2025


aren't we enforcing the type check by setting this union?

Contributor


I think if we just did not have chat_tool_choice here in this struct and let it be a local in the working loop, it might be clearer? then you can even make responses_tool_choice be just tool_choice?

Contributor Author


ack, updated accordingly

@jaideepr97 force-pushed the tool-choice branch 4 times, most recently from 1e02307 to 6fe37f7, on December 1, 2025 14:26
@jaideepr97 force-pushed the tool-choice branch 2 times, most recently from 13756c8 to 811e573, on December 1, 2025 19:11
@jaideepr97
Contributor Author

@ashwinb would you have bandwidth to give this a second look?

@ashwinb
Contributor

ashwinb commented Dec 1, 2025

Will review in detail soon. One quick comment: could you update the PR summary and remove the "Closes: " bit. We should close the issue only once we have landed client types and added integration tests.

)
# chat_tool_choice can be str, dict-like object, or None
if isinstance(chat_tool_choice, str | type(None)):
self.ctx.chat_tool_choice = chat_tool_choice
Contributor


hm, why is this mutation to ctx necessary? in general, the "ctx" should be considered an immutable thing which is just a bag of parameters computed initially before hitting the main processing loop

Contributor Author


ack, not updating the ctx anymore
Though again, this was done following the example of other fields like chat_tools, which are also modified and stored in the ctx earlier during the same processing loop.

break

n_iter += 1
# After first iteration, reset tool_choice to "auto" to let model decide freely
Contributor


this feels like a very model-specific thing baked deeply into the implementation with the API or documentation not making any note of it. does OpenAI talk about it, for example? I don't think we should do this at all since it is very surprising behavior

Contributor Author


Yeah, to be honest this particular fix came from claude-4.5-thinking; I wouldn't have come up with it myself.

Prior to this change, when I queried an OpenAI model via Llama Stack with tool_choice enforced, I was not getting usable results: it ended up in an infinite loop of calling the same function over and over. After this fix, I got the same quality of results through Llama Stack as when querying OpenAI directly, so it seems like an important fix for maintaining parity. I think it's important that Llama Stack not introduce any performance deterioration when a user routes OpenAI queries through it. Having this fix in didn't seem to affect the results I saw when testing against Qwen either, though that was by no means an exhaustive test.

I also understand your concern regarding this, so I'm not sure how to proceed here.

Contributor Author

@jaideepr97 jaideepr97 Dec 2, 2025


For reference:

Here is a tool_choice query I'm making against gpt-4o:

respB = client.responses.create(
    model=args.model,
    tools=[
        {
            "type": "mcp",
            "server_label": MCP_LABEL,
            "server_url": MCP_SERVER_URL,
            "require_approval": "never",
        }
    ],
    tool_choice={
        "type": "mcp",
        "server_label": MCP_LABEL,
        "name": "namespaces_list",
    },
    input=[
        {
            "role": "user",
            "content": (
                "List what kubernetes MCP tools you are allowed to use in this context. "
                "Tell me something about the cluster. Try to call only the MCP tools that "
                "you have access to, and tell me which tools you called. If none are "
                "available, explain why."
            ),
        }
    ],
)

pretty_print_result("B: no restriction at the MCP tool (server) level, tool choice is mcp with server label and tool name", respB)

Response without this fix:

=== B: no restriction at the MCP tool (server) level, tool choice is mcp with server label and tool name ===
Output text:
 
mcp_call[1]: kubernetes :: namespaces_list :: {}
mcp_call[2]: kubernetes :: namespaces_list :: {}
mcp_call[3]: kubernetes :: namespaces_list :: {}
mcp_call[4]: kubernetes :: namespaces_list :: {}
mcp_call[5]: kubernetes :: namespaces_list :: {}
mcp_call[6]: kubernetes :: namespaces_list :: {}
mcp_call[7]: kubernetes :: namespaces_list :: {}
mcp_call[8]: kubernetes :: namespaces_list :: {}
mcp_call[9]: kubernetes :: namespaces_list :: {}
mcp_call[10]: kubernetes :: namespaces_list :: {}

Response with this fix:

=== B: no restriction at the MCP tool (server) level, tool choice is mcp with server label and tool name ===
Output text:
 Here is what I found about the Kubernetes cluster using the tools available:

### Tools Used
1. **Namespaces List**: This tool lists all the Kubernetes namespaces in the current cluster.
2. **Events List**: This tool lists all the Kubernetes events in the current cluster from all namespaces.

### Cluster Information

#### Namespaces
The cluster currently has the following namespaces:
- **default**: Active
- **kube-node-lease**: Active
- **kube-public**: Active
- **kube-system**: Active
- **local-path-storage**: Active

#### Events
Some of the notable events happening in the cluster include:
- **Node kind-control-plane**: 
  - Kubelet started successfully.
  - Warning about failed node allocatable limits update.
  - Node is ready with sufficient memory, no disk pressure, and sufficient PID.
  - Registered the node successfully.
  
- **Pod Events**:
  - Pods in the kube-system namespace like `coredns` and `kube-proxy` have been scheduled, container images pulled successfully, and started without issues.
  - There are some scheduling warnings due to untolerated taints.
  
- **Resource Management Events**:
  - Controller manager and scheduler leadership has been successfully maintained.

These observations reflect a running cluster with active namespace management and event tracking. This combination ensures efficient operations and identification of potential issues for administrative action.
mcp_call[1]: kubernetes :: namespaces_list :: {}
mcp_call[2]: kubernetes :: events_list :: {}

self.mcp_tool_to_server[t.name] = mcp_tool

# Add to reverse mapping for efficient server_label lookup
if mcp_tool.server_label not in self.server_label_to_tools:
Contributor


this is more of a lazy question: when does it happen that the initial dict is not sufficient to cover mcp_tool -- i.e., you are seeing a new tool during the loop? maybe I am just confused

Contributor Author


By "initial dict" are you maybe referring to mcp_tool_to_server? Because this line is populating server_label_to_tools, which is a separate reverse mapping that makes it easy to look up all the tools associated with a given server.

I added this construct because processing the MCP tool choice requires quickly looking up all tools associated with a given MCP server_label, and it seemed easier to build and maintain this reverse mapping as we see and process new tools rather than rebuild it on the fly from mcp_tool_to_server each time, which could get computationally expensive.
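
The forward/reverse mapping pair described above can be sketched as follows (an illustrative structure under assumed names; the PR's actual classes and fields may differ):

```python
from collections import defaultdict


# Illustrative sketch of the two mappings discussed above: the forward map
# (tool name -> server label) and the incrementally maintained reverse map
# (server label -> tool names), so resolving an MCP tool_choice that names
# only a server_label is a single dict lookup instead of a rebuild.
class ToolIndex:
    def __init__(self) -> None:
        self.mcp_tool_to_server: dict[str, str] = {}
        self.server_label_to_tools: dict[str, list[str]] = defaultdict(list)

    def register(self, tool_name: str, server_label: str) -> None:
        # update both mappings as each tool is seen during processing
        self.mcp_tool_to_server[tool_name] = server_label
        self.server_label_to_tools[server_label].append(tool_name)

    def tools_for_server(self, server_label: str) -> list[str]:
        # O(1) lookup of all tools exposed by one MCP server
        return self.server_label_to_tools.get(server_label, [])
```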

Contributor

@ashwinb ashwinb left a comment


This is looking much better, thank you for the iteration @jaideepr97!

Development

Successfully merging this pull request may close these issues.

Responses Tool Choice