Description
Context/motivation: reducing costs when using the OpenAI APIs through this SDK by taking advantage of prompt caching.
To increase the chances of prompts hitting the cache, OpenAI suggests structuring prompts with static or repeated content at the beginning and dynamic content at the end.
As far as I understand, based on experimentation and monitoring, caching applies to the entire content passed to the LLM, and the structure of that content is inherited from the structure of the request body JSON.
While debugging the SDK, I found that there is no straightforward way to control the structure (the top-level field order) of Responses API request bodies.
Example of the structure the SDK currently produces:
- Developer message (static)
- User message (static)
- User message (dynamic)
- Structured outputs schema (static)
Alternative structure to maximize the probability of cache hits (see the sketch after this list):
- Structured outputs schema (static)
- Developer message (static)
- User message (static)
- User message (dynamic)
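
For concreteness, here is a minimal sketch of the same logical body serialized in both orders, using the Responses API's real top-level field names (`model`, `input`, `text`); the exact default order emitted by the SDK is my assumption, and the values are placeholders:

```java
import java.util.LinkedHashMap;
import java.util.Map;

import com.fasterxml.jackson.databind.ObjectMapper;

// Prompt caching matches on prefixes, so moving the static "text" field
// (the structured outputs schema) ahead of the dynamic "input" keeps a
// longer static prefix stable across requests.
public class BodyOrderSketch {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        Map<String, Object> currentOrder = new LinkedHashMap<>();
        currentOrder.put("model", "gpt-4o");
        currentOrder.put("input", "dynamic user message"); // changes per call
        currentOrder.put("text", "{static JSON schema}");

        Map<String, Object> cacheFriendly = new LinkedHashMap<>();
        cacheFriendly.put("model", "gpt-4o");
        cacheFriendly.put("text", "{static JSON schema}"); // static prefix
        cacheFriendly.put("input", "dynamic user message");

        System.out.println(mapper.writeValueAsString(currentOrder));
        System.out.println(mapper.writeValueAsString(cacheFriendly));
    }
}
```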
I have tried calling the .text() function on the ResponseCreateParams builder after everything else, but it has no effect on the resulting request body.
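
My guess is that the builder only collects values, and the SDK's Jackson-based serialization emits fields in a fixed order defined by the params class, so the order of builder calls cannot matter. A toy illustration of that behavior (these are not the SDK's actual classes):

```java
import com.fasterxml.jackson.annotation.JsonPropertyOrder;
import com.fasterxml.jackson.databind.ObjectMapper;

// Jackson derives the field order from the class definition (or from
// @JsonPropertyOrder), not from the order in which values were assigned.
@JsonPropertyOrder({"model", "input", "text"})
class ToyParams {
    public String model;
    public String input;
    public String text;
}

public class SetterOrderDemo {
    public static void main(String[] args) throws Exception {
        ToyParams p = new ToyParams();
        p.text = "{static schema}"; // assigned first...
        p.model = "gpt-4o";
        p.input = "dynamic user message";
        System.out.println(new ObjectMapper().writeValueAsString(p));
        // ...but still prints:
        // {"model":"gpt-4o","input":"dynamic user message","text":"{static schema}"}
    }
}
```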
Is there a workaround for this, or could functionality to control the field order be added to the SDK?
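
One workaround I could imagine, assuming the SDK's underlying OkHttp client can be customized with an interceptor (I have not verified that such a hook is exposed), is to reorder the top-level keys of the outgoing JSON body just before it is sent:

```java
import java.io.IOException;
import java.util.List;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

import okhttp3.Interceptor;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;
import okio.Buffer;

// Sketch of an OkHttp interceptor that rewrites the outgoing JSON body so
// that static fields (e.g. "text") precede dynamic ones. The preferred
// ordering below is an assumption chosen for illustration.
public final class ReorderBodyInterceptor implements Interceptor {
    private static final ObjectMapper MAPPER = new ObjectMapper();
    private static final List<String> PREFERRED_ORDER =
            List.of("model", "text", "instructions", "input");

    @Override
    public Response intercept(Chain chain) throws IOException {
        Request request = chain.request();
        RequestBody body = request.body();
        if (body == null || body.contentType() == null
                || !"json".equals(body.contentType().subtype())) {
            return chain.proceed(request);
        }

        // Read the body as serialized by the SDK.
        Buffer buffer = new Buffer();
        body.writeTo(buffer);
        ObjectNode original = (ObjectNode) MAPPER.readTree(buffer.readUtf8());

        // Rebuild the object with preferred keys first, everything else after.
        ObjectNode reordered = MAPPER.createObjectNode();
        for (String key : PREFERRED_ORDER) {
            JsonNode value = original.remove(key);
            if (value != null) {
                reordered.set(key, value);
            }
        }
        reordered.setAll(original);

        RequestBody newBody = RequestBody.create(
                body.contentType(), MAPPER.writeValueAsBytes(reordered));
        return chain.proceed(request.newBuilder()
                .method(request.method(), newBody)
                .build());
    }
}
```

Whether reordering the body this way actually improves cache hits would still need to be validated by monitoring cached token counts in the usage stats.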