Fix conversation chaining: skip model routing when previous_response_id is present
When a Responses API request includes previous_response_id, the router now skips
model routing to ensure conversation continuity. This prevents routing subsequent
requests to different backend instances that don't have the conversation state.
- Added check for previous_response_id in handleResponsesAPIRequest
- Skip classification and model routing when conversation is chained
- Added test for this behavior (TestHandleResponsesAPIRequest_WithPreviousResponseID)
- Updated documentation to explain the limitation and recommended usage
Co-authored-by: rootfs <[email protected]>
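The check described above can be sketched as follows. This is a minimal illustration, not the router's actual code: only `previous_response_id` and the function name `handleResponsesAPIRequest` come from the commit; `resolveModel`, `routeByClassification`, and the model names are hypothetical stand-ins.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// responsesRequest models only the fields relevant to conversation chaining.
// Field names follow the OpenAI Responses API; the struct itself is illustrative.
type responsesRequest struct {
	Model              string `json:"model"`
	PreviousResponseID string `json:"previous_response_id,omitempty"`
}

// resolveModel returns the model to use for a POST /v1/responses body.
// When previous_response_id is present, the requested model is kept as-is so
// the request reaches the backend instance that holds the conversation state;
// otherwise the classifier may route to a different model.
func resolveModel(body []byte, routeByClassification func(string) string) (string, error) {
	var req responsesRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return "", err
	}
	if req.PreviousResponseID != "" {
		// Chained conversation: skip classification and model routing.
		return req.Model, nil
	}
	return routeByClassification(req.Model), nil
}

func main() {
	// Stand-in classifier that always picks one backend model.
	route := func(string) string { return "model-b" }

	m1, _ := resolveModel([]byte(`{"model":"auto"}`), route)
	m2, _ := resolveModel([]byte(`{"model":"auto","previous_response_id":"resp_123"}`), route)
	fmt.Println(m1, m2) // prints "model-b auto": only the unchained request is rerouted
}
```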
website/docs/api/router.md (1 addition, 1 deletion):

```diff
@@ -359,7 +359,7 @@ The router will still perform classification and routing, but the actual executi
 - GET `/v1/responses/{id}` requests pass through without modification (no routing or classification)
 - POST `/v1/responses` requests go through the full routing pipeline
-- The `previous_response_id` parameter is preserved during routing for conversation continuity
+- **Conversation Chaining Limitation**: When using `previous_response_id` to chain conversations, the router will **not** change the model, to ensure conversation continuity. This is because response state is stored on specific backend instances. For multi-turn conversations, specify a fixed model instead of "auto", or ensure all backend instances share response storage.
 - All Responses API features (tools, reasoning, streaming, background) work transparently through the router
```
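Following the recommendation in the updated documentation, a chained follow-up request should pin a concrete model rather than "auto". The request shape below follows the OpenAI Responses API; the model name and response id are illustrative:

```json
{
  "model": "my-fixed-model",
  "previous_response_id": "resp_abc123",
  "input": "Continue from the previous answer."
}
```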