Summary

  • add arguments_json to FunctionCall so we persist the exact tool-call payload string returned by the model
  • have the OpenAI client reuse the stored JSON when constructing follow-up requests, falling back to a canonical dump only if the string is missing
  • capture the raw JSON when parsing responses, keeping both the parsed dict (for tool execution) and the untouched string (for replay); a sketch of the shape follows this list
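
A minimal sketch of the intended shape, assuming a dataclass-style FunctionCall; the arguments field and the serialize_arguments helper are illustrative names, not the actual Mini-Agent API:

```python
from dataclasses import dataclass
import json

@dataclass
class FunctionCall:
    name: str
    arguments: dict                     # parsed form, used to execute the tool
    arguments_json: str | None = None   # raw string exactly as the model emitted it

def serialize_arguments(call: FunctionCall) -> str:
    """Prefer the byte-exact payload; fall back to a canonical dump."""
    if call.arguments_json is not None:
        return call.arguments_json
    # Fallback for tool calls recorded before arguments_json existed
    return json.dumps(call.arguments, sort_keys=True)
```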

Problem

When Mini-Agent is configured to send requests to a local LM Studio endpoint (or any local serving stack with a KV cache), each follow-up request must be byte-identical over the cached portion of the context. Today the request builder re-serializes every tool call in the transcript with json.dumps(..., sort_keys=True), which can change key ordering, whitespace, and float formatting relative to what the model actually emitted. The tool call prepended to Request #2 therefore differs byte-for-byte from the one the model saw during Request #1, so LM Studio treats the assistant history as a cache miss and reprocesses all prior tokens (~12k tokens per turn in our setup), wasting latency and compute.
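
To make the mismatch concrete, here is a minimal repro; the payload is hypothetical, and only the json.dumps(..., sort_keys=True) call comes from the actual request builder:

```python
import json

# What the model actually emitted in Request #1:
raw = '{"path": "src/main.py", "line": 42}'

# What the request builder produces when rebuilding the transcript for Request #2:
rebuilt = json.dumps(json.loads(raw), sort_keys=True)

print(rebuilt)         # {"line": 42, "path": "src/main.py"}
print(raw == rebuilt)  # False -> the assistant turn is no longer byte-identical,
                       # so the KV cache prefix match fails at this point
```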

Testing

latent-variable force-pushed the fix/deterministic-local-tools branch from 2eba424 to 91d7951 on November 18, 2025 at 03:55.