GH-1403: Implements Anthropic's prompt caching feature to improve token efficiency #4199
Conversation
Why add this to UserMessage/AbstractMessage? I thought a ChatOption would be sufficient. I'm concerned about having functionality in the message hierarchy that isn't universal.
Force-pushed e57caa3 to 2c97c14
…tOptions

- Add cacheControl field to AnthropicChatOptions with builder method
- Create AnthropicCacheType enum with EPHEMERAL type for type-safe cache creation
- Update AnthropicChatModel.createRequest() to apply cache control from options to user message ContentBlocks
- Extend ContentBlock record with cacheControl parameter and constructor for API compatibility
- Update Usage record to include cacheCreationInputTokens and cacheReadInputTokens fields
- Update StreamHelper to handle new Usage constructor with cache token parameters
- Add AnthropicApiIT.chatWithPromptCache() test for low-level API validation
- Add AnthropicChatModelIT.chatWithPromptCacheViaOptions() integration test
- Add comprehensive unit tests for AnthropicChatOptions cache control functionality
- Update documentation with cacheControl() method examples and usage patterns

Cache control is configured through AnthropicChatOptions rather than message classes to maintain provider portability. The cache control gets applied during request creation in AnthropicChatModel when building ContentBlocks for user messages.

Original implementation provided by @Claudio-code (Claudio Silva Junior). See spring-projects@15e5026

Fixes spring-projects#1403

Signed-off-by: Soby Chacko <[email protected]>
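The flow the commit describes, options carrying a cache-control setting that is copied onto user-message content blocks at request-build time, can be sketched with plain Java stand-ins. The type names below mirror the commit message but are modeled here for illustration; they are not the actual Spring AI classes.

```java
import java.util.List;

public class CacheControlSketch {

    // Stand-in for the AnthropicCacheType enum described in the commit.
    enum AnthropicCacheType { EPHEMERAL }

    // Stand-in for the cache_control object sent to the Anthropic API.
    record CacheControl(AnthropicCacheType type) {}

    // Simplified stand-in for the ContentBlock record the commit extends.
    record ContentBlock(String text, CacheControl cacheControl) {}

    // Stand-in for AnthropicChatOptions carrying an optional cacheControl.
    record ChatOptions(CacheControl cacheControl) {}

    // Sketch of what createRequest() does per the commit: apply the cache
    // control from the options to each user-message content block.
    static List<ContentBlock> applyCacheControl(List<String> userTexts, ChatOptions options) {
        return userTexts.stream()
                .map(t -> new ContentBlock(t, options.cacheControl()))
                .toList();
    }

    public static void main(String[] args) {
        ChatOptions options = new ChatOptions(new CacheControl(AnthropicCacheType.EPHEMERAL));
        List<ContentBlock> blocks = applyCacheControl(List.of("long reusable context"), options);
        System.out.println(blocks.get(0).cacheControl().type()); // prints EPHEMERAL
    }
}
```

Keeping the setting in the options rather than the message hierarchy matches the portability concern raised above: provider-specific behavior stays in the provider's options class.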
According to its request validation, Anthropic's API only supports a maximum of 4 cache blocks; I got this error with more than 4. Also, system messages can be cached as well by changing
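For reference, Anthropic's Messages API accepts a `cache_control` marker on system content blocks as well as on user content blocks, with at most 4 such markers per request. A minimal request shape (model name illustrative):

```json
{
  "model": "claude-3-5-sonnet-latest",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "Long, reusable system instructions go here...",
      "cache_control": { "type": "ephemeral" }
    }
  ],
  "messages": [
    { "role": "user", "content": "A question about the cached context." }
  ]
}
```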
@sobychacko I added a pull request to your branch that takes care of the above issue while also making the caching configurable via It allows fine-grained configuration of Anthropic prompt caching on outgoing chat requests. I added:
@adase11 Thanks for the feedback. We are addressing some of this feedback on our end with @markpollack and will update it via a separate PR. Thanks again for looking into it.
Appreciate it @sobychacko. If there's anything I can help with, I'm happy to. We're eagerly awaiting being able to leverage this.
GH-1403: Implements Anthropic's prompt caching feature to improve token efficiency

This implementation follows Anthropic's prompt caching API, which allows for more efficient token usage by caching frequently used prompts.

Original implementation provided by @Claudio-code (Claudio Silva Junior). See 15e5026

Fixes #1403