Skip to content

Conversation

@ihrpr
Copy link
Contributor

@ihrpr ihrpr commented Nov 2, 2025

This is a very simplified conformance tests suit with:

  • requirements for servers to implement (naming, what features should be doing)
  • typescript everything server implementation (using register<> API, not we are moving away from tool, prompt resourse to register<> methods)
  • tests (some are very simple)

Copy link
Member

@pcarleton pcarleton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a bunch of comments, but they're non-blocking / thinking through where we go with this. I'd say let's merge this all in and iterate forward.

A couple things this prompted me to that I want to explore are:

  • how to test a scenario, specifically making sure it fails when it should. I think we'll want to make this super easy to do
  • what negative tests we might want for a server (i.e. server shouldn't do this)
  • starting to compile the big "MUST" / "SHOULD" list and start burning down our coverage


return {
client,
close: async () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not call close directly?

try {
const connection = await connectToServer(serverUrl);

// The connection process already does initialization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an example of where if we use the SDK, we lose some visibility. this one seems fine as long as the error the SDK throws is easy to interpret in the check output, but just flagging the pattern where we may want to break out of the sdk

}
}

export class PromptsGetSimpleScenario implements ClientScenario {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still getting a feel for the cost of an additional scenario, but I'd be inclined to do all these in 1 vs. separate scenarios for each.

@@ -0,0 +1,874 @@
# MCP Server Conformance Requirements
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was imagining this being a "suite" that is called something like "Basic", and it's a list of the tests to run, where each test describes the functionality it needs.

I like that this file describes the current SDK goal in a "1-stop" way though, so let's keep this, and we can explore "suites" or profiles later. Like for SDK's, we want this full set of features, but for servers that don't want some, we can have suites that don't fail on missing optional features.

import { ClientScenario, ConformanceCheck } from '../../types.js';
import { connectToServer } from './client-helper.js';

export class ResourcesListScenario implements ClientScenario {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar here, my current take is these would be better as 1 scenario w/ multiple checks

Progress
} from '@modelcontextprotocol/sdk/types.js';

export class ToolsListScenario implements ClientScenario {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these ones seem good as separate, but I might group it like:

  • list + call simple
  • call content types (image, audio, mixed, resource)
  • (rest as individual)

the rough heuristic I'm building up is "would it make sense for someone to implement a server that just does this scenario?" (open to iterating on that heuristic, I think we'll want it to be easy for folks to reason through when writing scenarios and using the scenarios). Right now, we're proposing folks have an everything-server that does all the things, but I don't want it to be a requirement that it's all 1 server since some scenarios will be mutually exclusive (i.e. an SDK should be able to support writing a server that doesn't support certain stateful features if that unlocks other transport things).

Some of this might also be fine to "start out everything super small, and then group things up later based on what's annoying or not as we use it more".

console.log(`\n=== Running scenario: ${scenarioName} ===`);
try {
const result = await runServerConformanceTest(serverUrl, scenarioName);
allResults.push({ scenario: scenarioName, checks: result.checks });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hypothesis here is we'll want results logged as we go, and summary at the end for faster feedback loops

@pcarleton pcarleton changed the base branch from pcarleton/server-runner to main November 4, 2025 14:56
@pcarleton pcarleton force-pushed the ihrpr/server-scenarios branch from 7635a76 to 7b3666a Compare November 4, 2025 14:59
@pcarleton pcarleton merged commit a692dd7 into main Nov 4, 2025
4 checks passed
@pcarleton pcarleton deleted the ihrpr/server-scenarios branch November 4, 2025 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants