Responses max_tool_calls

### 🚀 Describe the new functionality needed

The Responses API accepts a [max_tool_calls](https://platform.openai.com/docs/api-reference/responses/object#responses/object-max_tool_calls) parameter that limits how many tool calls may be executed for a given response. There are a few things to consider when implementing this:

### Requirements
- When handling inference calls that get converted into chat completions:
    - If the model returns a list of tool call requests, truncate that list to the first `max_tool_calls` entries, then execute each remaining call.
    - If `max_tool_calls` < 0, return a `Bad Request` error.
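
The two rules above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: `ToolCall`, `BadRequestError`, `validate_max_tool_calls`, and `limit_tool_calls` are hypothetical names, `None` is assumed to mean "no limit", and `0` is assumed to mean "execute no tool calls" (the requirements only reject negative values).

```python
from dataclasses import dataclass
from typing import Optional


class BadRequestError(ValueError):
    """Hypothetical error standing in for an HTTP 400 Bad Request response."""


@dataclass
class ToolCall:
    """Hypothetical, minimal shape of a tool call request returned by the model."""

    id: str
    name: str
    arguments: str


def validate_max_tool_calls(max_tool_calls: Optional[int]) -> None:
    # Assumption: None means the caller did not set a limit.
    if max_tool_calls is not None and max_tool_calls < 0:
        raise BadRequestError("max_tool_calls must not be negative")


def limit_tool_calls(
    tool_calls: list[ToolCall], max_tool_calls: Optional[int]
) -> list[ToolCall]:
    """Return the tool calls that may be executed, keeping the model's order."""
    validate_max_tool_calls(max_tool_calls)
    if max_tool_calls is None:
        return list(tool_calls)
    # Truncate to the first max_tool_calls entries; the rest are not executed.
    return tool_calls[:max_tool_calls]
```

With five requested calls and `max_tool_calls=2`, only the first two would be passed on for execution. Note that this sketch applies the limit to a single list returned by the model; whether the limit should also be tracked across multiple inference turns within one response is not covered here.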

### 💡 Why is this needed? What if we don't build it?

This is key Responses functionality: it caps the number of tool calls that get executed, so the model doesn't get overwhelmed by context. Not building it leaves a feature gap.

### Other thoughts

_No response_

Responses max_tool_calls #3563
