
Commit 19ad12a

doc: add cache usage doc
1 parent 5acfb20

File tree

2 files changed: +28 -1 lines changed


README.md

Lines changed: 23 additions & 0 deletions
@@ -253,6 +253,29 @@ async def analyze_call_feedback(input: CallFeedbackInput) -> AsyncIterator[Run[C
...
```

### Caching

By default, the cache setting is `auto`, meaning that agent runs are cached when the temperature is 0
(the default temperature value). In other words, running the same agent twice with the **exact** same
input returns the exact same output, and the underlying model is not called a second time.

The cache usage string literal is defined in the [cache_usage.py](./workflowai/core/domain/cache_usage.py) file. There are 3 possible values:

- `auto`: (default) Use cached results only when temperature is 0
- `always`: Always use cached results if available, regardless of model temperature
- `never`: Never use cached results, always execute a new run

The cache usage can be passed to the agent function as a keyword argument:

```python
@workflowai.agent(id="analyze-call-feedback")
async def analyze_call_feedback(_: CallFeedbackInput) -> AsyncIterator[CallFeedbackOutput]: ...

run = await analyze_call_feedback(CallFeedbackInput(...), use_cache="always")
```
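To illustrate the default `auto` behavior described above, here is a minimal sketch (not part of this commit; it reuses the `analyze_call_feedback` agent and the `CallFeedbackInput(...)` placeholder from the example above, and assumes cached runs return identical output):

```python
# Sketch: with use_cache="auto" (the default) and temperature 0,
# two runs with the exact same input return the same output; the
# second run is served from the cache instead of calling the model.
feedback = CallFeedbackInput(...)  # placeholder input, as in the example above

run1 = await analyze_call_feedback(feedback)  # calls the model
run2 = await analyze_call_feedback(feedback)  # served from the cache
assert run1 == run2  # assumption: a cached run returns identical output

# Force a fresh model call regardless of temperature:
run3 = await analyze_call_feedback(feedback, use_cache="never")
```
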
<!-- TODO: add cache usage at agent level when available -->

### Replying to a run

Some use cases require the ability to have a back-and-forth between the client and the LLM. For example:

workflowai/core/domain/cache_usage.py

Lines changed: 5 additions & 1 deletion
@@ -1,3 +1,7 @@
from typing import Literal

-CacheUsage = Literal["always", "never", "auto"]
+# Cache usage configuration for agent runs
+# - "auto": Use cached results only when temperature is 0
+# - "always": Always use cached results if available, regardless of model temperature
+# - "never": Never use cached results, always execute a new run
+CacheUsage = Literal["auto", "always", "never"]
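
As a hedged sketch of how this literal's semantics could be applied downstream (the `should_use_cache` helper below is hypothetical and not part of this commit; it only encodes the three documented values):

```python
from typing import Literal

CacheUsage = Literal["auto", "always", "never"]


def should_use_cache(use_cache: CacheUsage, temperature: float) -> bool:
    # "always" and "never" are unconditional; "auto" only allows
    # cached results when the temperature is 0 (the default).
    if use_cache == "always":
        return True
    if use_cache == "never":
        return False
    return temperature == 0


assert should_use_cache("auto", 0) is True
assert should_use_cache("auto", 0.7) is False
assert should_use_cache("never", 0) is False
```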
