Commit 4859a81

docs: add AGENTS.md and CONTRIBUTING.md (#194)
* docs: add AGENTS.md and CONTRIBUTING.md

  Codifies scenario design guidelines from past PR reviews:
  - fewer scenarios with more checks (CI cost multiplies per-SDK)
  - same check id for SUCCESS and FAIL, optimize for Ctrl+F
  - extend the everything-client/server rather than adding new example files
  - prove it passes and fails
  - reuse the CLI runner, don't add parallel entry points

  Also asks contributors to open an issue for discussion before a PR, and to target specific spec MUSTs or open issues rather than running generic AI bug-hunts against the harness.

* docs: broaden issue-first guidance to cover bugs, not just scenarios
1 parent 8f3994c commit 4859a81

2 files changed

Lines changed: 127 additions & 0 deletions

AGENTS.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
# AGENTS.md

Guidance for AI agents (and humans) contributing to the MCP conformance test framework.

## What this repo is

A test harness that exercises MCP SDK implementations against the protocol spec. The coverage number that matters here is **spec coverage**: how much of the protocol the scenarios test.

This repo uses **npm** (not pnpm/yarn). Don't commit `pnpm-lock.yaml` or `yarn.lock`.

## Where to start

**Open an issue first**, whether you've hit a bug in the harness or want to propose a new scenario. For scenarios, sketch which part of the spec you want to cover and roughly how; for bugs, include the command you ran and the output. Either way, a short discussion up front beats review churn on a PR that overlaps existing work or heads in a direction we're not going.

**Don't point an agent at the repo and ask it to "find bugs."** Generic bug-hunting on a test harness produces low-signal PRs (typo fixes, unused-variable cleanups, speculative refactors). If you want to contribute via an agent, give it a concrete target:

- Pick a specific MUST or SHOULD from the [MCP spec](https://modelcontextprotocol.io/specification/) that has no scenario yet, and ask the agent to draft one.
- Pick an [open issue](https://github.com/modelcontextprotocol/conformance/issues) and work on that.

The valuable contribution here is **spec coverage**, not harness polish.

## Scenario design: fewer scenarios, more checks

**The strongest rule in this repo:** prefer one scenario with many checks over many scenarios with one check each.

Why:

- Each scenario often spins up its own HTTP server. These suites run in CI on every push for every SDK, so per-scenario overhead multiplies fast.
- There is less code to maintain and update when the spec shifts.
- An SDK's progress shows up as "pass 7/10 checks" rather than "pass 1 test, fail another", which is a finer-grained signal from the same run.
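
To illustrate the "pass 7/10 checks" signal, here is a minimal sketch of tallying the checks from a single scenario run. The `SketchCheck` shape, the `summarize` helper, the check ids, and the `'SUCCESS' | 'FAILURE'` status values are assumptions made for this example; the real type is `ConformanceCheck` in `src/types.ts` and may differ.

```typescript
// Hypothetical minimal shape for illustration; the real type is
// `ConformanceCheck` in src/types.ts and may carry more fields.
type CheckStatus = 'SUCCESS' | 'FAILURE';

interface SketchCheck {
  id: string;
  status: CheckStatus;
}

// Summarize one scenario run as "passed/total checks passed".
function summarize(checks: SketchCheck[]): string {
  const passed = checks.filter((c) => c.status === 'SUCCESS').length;
  return `${passed}/${checks.length} checks passed`;
}

// Three checks from one (made-up) scenario run: two pass, one fails.
const run: SketchCheck[] = [
  { id: 'tools-list-returns-array', status: 'SUCCESS' },
  { id: 'tools-call-returns-content', status: 'SUCCESS' },
  { id: 'tools-call-unknown-tool-errors', status: 'FAILURE' }
];

console.log(summarize(run)); // prints "2/3 checks passed"
```

One server start-up yields three data points here; splitting the same checks into three scenarios would pay the start-up cost three times for the same information.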

### Granularity heuristic

Ask: **"Would it make sense for someone to implement a server/client that does just this scenario?"**

If two scenarios would always be implemented together, merge them. Examples:

- `tools/list` + a simple `tools/call` → one scenario
- All content-type variants (image, audio, mixed, resource) → one scenario
- Full OAuth flow with token refresh → one scenario, not separate "basic" + "refresh" scenarios. A client that passes "basic" but not "refresh" just shows up as passing N−2 checks.

Keep scenarios separate when they're genuinely independent features or when they're mutually exclusive (e.g., an SDK should support writing a server that _doesn't_ implement certain stateful features).
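
As a rough sketch of the first merge example, the function below covers `tools/list` and a simple `tools/call` in one scenario that emits one check per step. Everything named here (`ToolServer`, `runToolsScenario`, `SketchCheck`, the check ids, and the status values) is hypothetical and synchronous for brevity; it is not the harness's actual scenario API.

```typescript
// Hypothetical shapes for illustration only; not the harness's real API.
type CheckStatus = 'SUCCESS' | 'FAILURE';

interface SketchCheck {
  id: string;
  status: CheckStatus;
  errorMessage?: string;
}

// Stand-in for a connection to the server under test.
interface ToolServer {
  listTools(): { name: string }[];
  callTool(name: string): { content: unknown[] };
}

// One scenario, two checks: tools/list, then tools/call on the first tool.
function runToolsScenario(server: ToolServer): SketchCheck[] {
  const checks: SketchCheck[] = [];

  const tools = server.listTools();
  const first = tools[0];
  const listed = first !== undefined;
  checks.push({
    id: 'tools-list-non-empty',
    status: listed ? 'SUCCESS' : 'FAILURE',
    errorMessage: listed ? undefined : 'tools/list returned no tools'
  });

  // Only call a tool if one was listed; otherwise the check fails.
  const content = first ? server.callTool(first.name).content : [];
  const called = content.length > 0;
  checks.push({
    id: 'tools-call-returns-content',
    status: called ? 'SUCCESS' : 'FAILURE',
    errorMessage: called ? undefined : 'tools/call returned empty content'
  });

  return checks;
}

// A fake server that lists a tool but returns empty content when called.
const fakeServer: ToolServer = {
  listTools: () => [{ name: 'echo' }],
  callTool: () => ({ content: [] })
};

console.log(runToolsScenario(fakeServer).map((c) => `${c.id}: ${c.status}`));
```

A server that lists tools but mishandles calls shows up as passing one of two checks, rather than as one passing scenario and one failing scenario.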

### When a PR adds scenarios

- Start with **one end-to-end scenario** covering the happy path with many checks along the way.
- Don't add "step 1 only" and "step 1+2" as separate scenarios, since the second subsumes the first.
- Register the scenario in the appropriate suite list in `src/scenarios/index.ts` (`core`, `extensions`, `backcompat`, etc.).

## Check conventions

- **Same `id` for SUCCESS and FAIL.** A check should use one slug and flip `status` + `errorMessage`, not branch into `foo-success` vs `foo-failure` slugs.
- **Optimize for Ctrl+F on the slug.** Repetitive check blocks are fine: it is easier to find the failing one than to unwind a clever helper.
- Reuse `ConformanceCheck` and other types from `src/types.ts` rather than defining parallel shapes.
- Include `specReferences` pointing to the relevant spec section.
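
The conventions above might look roughly like this in practice. This is a sketch under assumptions: `SketchCheck` and `SpecReference` are stand-ins for the real types in `src/types.ts` (whose fields may differ), the slug and the `'FAILURE'` status value are invented for illustration, and the spec URL is a placeholder pointing at the specification root rather than a specific section.

```typescript
// Hypothetical stand-ins for the types in src/types.ts; field names are assumptions.
interface SpecReference {
  id: string;
  url: string;
}

interface SketchCheck {
  id: string;
  status: 'SUCCESS' | 'FAILURE';
  errorMessage?: string;
  specReferences: SpecReference[];
}

// One slug for both outcomes: only `status` and `errorMessage` flip.
function checkInitializeProtocolVersion(result: { protocolVersion?: unknown }): SketchCheck {
  const ok = typeof result.protocolVersion === 'string';
  return {
    id: 'initialize-protocol-version', // same id on success and failure
    status: ok ? 'SUCCESS' : 'FAILURE',
    errorMessage: ok ? undefined : 'initialize result is missing a string protocolVersion',
    specReferences: [
      // Placeholder reference; a real check would link the exact spec section.
      { id: 'MCP-Lifecycle', url: 'https://modelcontextprotocol.io/specification/' }
    ]
  };
}

const passing = checkInitializeProtocolVersion({ protocolVersion: '2025-06-18' });
const failing = checkInitializeProtocolVersion({});

console.log(passing.id === failing.id); // prints true
console.log(passing.status, failing.status); // prints SUCCESS FAILURE
```

Searching logs for `initialize-protocol-version` then finds the check whether it passed or failed.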

## Descriptions and wording

Be precise about what's **required** vs **optional**. A scenario description that tests optional behavior should make that clear, e.g. "Tests that a client _that wants a refresh token_ handles offline_access scope…" not "Tests that a client handles offline_access scope…". Don't accidentally promote a MAY/SHOULD to a MUST in the prose.

When in doubt about spec details (OAuth parameters, audiences, grant types), check the actual spec in `modelcontextprotocol` rather than guessing.

## Examples: prove it passes and fails

A new scenario should come with:

1. **A passing example**, usually by extending `examples/clients/typescript/everything-client.ts` or the everything-server, not a new file.
2. **Evidence it fails when it should**, ideally a negative example (a deliberately broken client), or at minimum a manual run showing the failure mode.

Delete unused example scenarios. If a scenario key in the everything-client has no corresponding test, remove it.
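
In miniature, "prove it passes and fails" means running the same check against a well-formed input and a deliberately broken one. The sketch below does that for a single check; `checkToolsListShape`, its slug, the fixtures, and the status values are hypothetical, and a real contribution would exercise the everything-client or everything-server rather than inline fixtures.

```typescript
// Hypothetical check for illustration; status values are an assumption.
type Status = 'SUCCESS' | 'FAILURE';

// Passes only when the tools/list result carries a `tools` array.
function checkToolsListShape(result: unknown): { id: string; status: Status } {
  const ok =
    typeof result === 'object' &&
    result !== null &&
    Array.isArray((result as { tools?: unknown }).tools);
  return { id: 'tools-list-shape', status: ok ? 'SUCCESS' : 'FAILURE' };
}

// Passing example: a well-formed result.
const goodResult = { tools: [{ name: 'echo' }] };

// Negative example: deliberately broken, `tools` is a string instead of an array.
const brokenResult = { tools: 'echo' };

console.log(checkToolsListShape(goodResult).status); // prints SUCCESS
console.log(checkToolsListShape(brokenResult).status); // prints FAILURE
```

If the broken fixture also reported SUCCESS, the check would not be testing anything, which is exactly what the negative example is meant to catch.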

## Don't add new ways to run tests

Use the existing CLI runner (`npx @modelcontextprotocol/conformance client|server ...`). If you need a feature the runner doesn't have, add it to the runner rather than building a parallel entry point.

## Before opening a PR

- `npm run build` passes
- `npm test` passes
- For non-trivial scenario changes, run against at least one real SDK (typescript-sdk or python-sdk) to see actual output. For changes to shared infrastructure (runner, tier-check), test against go-sdk or csharp-sdk too.
- Scenario is registered in the right suite in `src/scenarios/index.ts`

CONTRIBUTING.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
# Contributing

Thanks for helping improve the MCP conformance suite!

The most valuable contributions are **new conformance scenarios** that cover under-tested parts of the [MCP spec](https://modelcontextprotocol.io/specification/). If you're not sure where to start, ask in `#conformance-testing-wg` on the MCP Contributors Discord.

## Before you start

**Open an issue first**, whether you've found a bug or want to propose a new scenario. A short discussion up front saves everyone time on PRs that overlap existing work or head in a direction we're not going.

Then read **[AGENTS.md](./AGENTS.md)**, the design guide for scenarios and checks. The short version:

- **Fewer scenarios, more checks.** Each scenario spins up its own server and runs in CI for every SDK. One scenario with 10 checks beats 10 scenarios with one check each.
- **Prove it passes and fails.** Extend the existing everything-client/server to pass your scenario, and show (or include) a failing case.
- **Reuse the CLI runner.** Don't add parallel entry points.

If you're using an AI agent to help, please **don't** point it at the repo with a generic "find bugs" prompt. Instead, give it a specific MUST from the spec or an open issue to work on. See AGENTS.md for details.

## Setup

```sh
npm install
npm run build
npm test
```

This repo uses **npm**. Don't commit `pnpm-lock.yaml` or `yarn.lock`.

## Running your scenario

```sh
# Against the bundled TypeScript example
npm run build
node dist/index.js client --command "tsx examples/clients/typescript/everything-client.ts" --scenario <your-scenario>

# Against a server
node dist/index.js server --url http://localhost:3000/mcp --scenario <your-scenario>
```

See the [README](./README.md) for full CLI options and the [SDK Integration Guide](./SDK_INTEGRATION.md) for testing against a real SDK.

## Pull requests

- Register your scenario in the right suite in `src/scenarios/index.ts`
- Run against at least one real SDK before opening the PR; we'll ask what the output looked like
- Keep PRs focused; one feature or scenario group at a time
