Fail CI if snapshots aren't present #304
base: main
Pull request overview
This PR tightens the snapshot-based replay harness so that CI runs fail when tests issue new or changed requests without corresponding snapshots, and it updates tests/snapshots to align with that stricter behavior. It also adds stubs for new endpoints and tweaks specific E2E tests to avoid relying on live LLM behavior in CI.
Changes:
- Add or update session and permissions snapshot YAMLs (including a new `sendandwait_throws_on_timeout` snapshot and updated tool arguments/assistant messages).
- Enhance `ReplayingCapiProxy` to (a) provide a default model for `/models`, (b) stub agent memory endpoints, and (c) treat missing cached responses as CI errors instead of warnings.
- Update the Node.js session E2E tests to skip the timeout-specific `sendAndWait` test in CI, since it intentionally times out before a replayable response is available.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `test/snapshots/session/sendandwait_throws_on_timeout.yaml` | New snapshot capturing the model and initial system/user messages for the `sendAndWait`-timeout scenario. |
| `test/snapshots/session/send_returns_immediately_while_events_stream_in_background.yaml` | Adjusted tool-call arguments (intent string, shell description, `mode: "sync"`) and slightly rephrased final assistant text to match current behavior. |
| `test/snapshots/permissions/should_receive_toolcallid_in_permission_requests.yaml` | Minor assistant message wording change in the permissions snapshot to align with updated responses. |
| `test/harness/replayingCapiProxy.ts` | Introduced a default model for `/models`, added stub handlers for `/agents/.../memory/...` endpoints, and changed the CI fallback behavior so missing snapshots produce an error via `exitWithNoMatchingRequestError`. |
| `nodejs/test/e2e/session.test.ts` | Marked the `sendAndWait throws on timeout` test as `it.skipIf(process.env.CI === "true")` with clarifying comments, so CI doesn't depend on a replayable LLM response for this client-side timeout behavior. |
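
To illustrate why a client-side timeout needs no replayable response, here is a minimal sketch of a timeout wrapper. The `withTimeout` helper is hypothetical and is not taken from the SDK; how `sendAndWait` actually implements its timeout is not shown in this PR, so treat this only as one plausible shape.

```typescript
// Hypothetical helper (not the SDK's implementation): rejects if `work`
// has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });
  // Whichever promise settles first wins; the timer is cleared either way
  // so it cannot keep the process alive after the race is decided.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

With a wrapper like this, the rejection is produced locally by the timer, so the test outcome does not depend on what the server (or a replayed snapshot) would eventually return.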
Comments suppressed due to low confidence (1)
test/harness/replayingCapiProxy.ts:296
- When `process.env.CI === "true"` and no cached response is found, `exitWithNoMatchingRequestError` calls `options.onError`, but the code still falls through to `super.performRequest(options)`. This can cause the proxy to both send a `500 Proxy error` response via `onError` and then attempt to proxy the request normally, leading to double-callbacks and potential `write after end`/header-sent errors. To avoid conflicting responses and ensure CI fails immediately when snapshots are missing, this branch should return early after `exitWithNoMatchingRequestError` instead of continuing to `super.performRequest`.
// Fallback to normal proxying if no cached response found
// This implicitly captures the new exchange too
if (process.env.CI === "true") {
  await exitWithNoMatchingRequestError(
    options,
    state.testInfo,
    state.workDir,
    state.toolResultNormalizers,
  );
}
super.performRequest(options);
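
The early-return fix suggested in this comment can be sketched in isolation. The types and the `handleRequest` function below are hypothetical simplifications, since the real harness signatures are not shown on this page; the point is only that the CI error path answers the request once and never reaches the proxy fallback.

```typescript
// Hypothetical, simplified shapes; the real harness types are not shown here.
interface RequestOptions {
  path: string;
  onError(err: Error): void;
}

type Outcome = "replayed" | "failed-in-ci" | "proxied";

// Sketch of the suggested control flow: after the CI error path reports the
// failure, return instead of falling through to live proxying.
function handleRequest(
  options: RequestOptions,
  cache: Map<string, string>,
  isCI: boolean,
  proxy: (options: RequestOptions) => void,
): Outcome {
  if (cache.has(options.path)) {
    return "replayed";
  }
  if (isCI) {
    options.onError(new Error(`No cached response found for ${options.path}`));
    return "failed-in-ci"; // early return: the request is answered exactly once
  }
  proxy(options); // outside CI, fall back to live proxying
  return "proxied";
}
```

Returning a distinct outcome per branch is only a convenience for this sketch; in the actual code a bare `return;` after the awaited error call would presumably have the same effect.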
// Handle memory endpoints - return stub responses in tests
// Matches: /agents/*/memory/*/enabled, /agents/*/memory/*/recent, etc.
if (options.requestOptions.path?.match(/\/agents\/.*\/memory\//)) {
  let body: string;
  if (options.requestOptions.path.includes("/enabled")) {
    body = JSON.stringify({ enabled: false });
  } else if (options.requestOptions.path.includes("/recent")) {
    body = JSON.stringify({ memories: [] });
  } else {
    body = JSON.stringify({});
  }
  const headers = {
    "content-type": "application/json",
    ...commonResponseHeaders,
  };
  options.onResponseStart(200, headers);
  options.onData(Buffer.from(body));
  options.onResponseEnd();
  return;
}
Copilot
AI
Jan 30, 2026
The new memory endpoint stubs (/agents/.../memory/...) and the CI-only error path for missing cached responses (exitWithNoMatchingRequestError) do not appear to have dedicated tests in replayingCapiProxy.test.ts, even though this file has comprehensive coverage for other behaviors. Given that these branches change how the proxy behaves in CI and for new endpoints, consider adding tests that (1) exercise the memory endpoint handlers and (2) verify that in CI, a request without a matching snapshot produces the expected ::error annotation and fails the request instead of silently proxying through.
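
As a rough illustration of the first kind of test suggested here, the stub's body selection could be pulled into a pure function and checked directly. `memoryStubBody` is a hypothetical extraction that mirrors the path checks in the diff; it does not exist in the harness, and the sample paths in the comments are made up for illustration.

```typescript
// Hypothetical extraction of the stub's body selection, mirroring the path
// checks in the diff. Returns undefined for paths the stub does not handle.
function memoryStubBody(path: string): string | undefined {
  if (!/\/agents\/.*\/memory\//.test(path)) return undefined;
  if (path.includes("/enabled")) return JSON.stringify({ enabled: false });
  if (path.includes("/recent")) return JSON.stringify({ memories: [] });
  return JSON.stringify({});
}

// Example paths (illustrative only):
//   memoryStubBody("/agents/a1/memory/repo/enabled") yields '{"enabled":false}'
//   memoryStubBody("/agents/a1/memory/repo/recent")  yields '{"memories":[]}'
//   memoryStubBody("/models")                        yields undefined
```

A real test in `replayingCapiProxy.test.ts` would more likely drive the proxy end to end and assert on the status code and headers as well; the pure-function form is just the smallest piece that can be verified without the harness.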
SDK Consistency Review: Test Infrastructure Changes
This PR modifies the shared test infrastructure to make CI builds fail when snapshots aren't present, and conditionally skips the Node.js
✅ Good: Shared Test Harness Improvements
The changes to
These changes benefit all four SDKs (Node.js, Python, Go, .NET) since they all use this shared test harness.
No description provided.