
Add onboarding MCP server for new contributors#4839

Draft
hanniavalera wants to merge 7 commits into main from dev/hanniavalera/mcpNewContributors

Conversation

@hanniavalera
Contributor

@hanniavalera hanniavalera commented Mar 25, 2026

This changes the developer onboarding experience

The following changes are proposed:

  • Add an MCP server at onboarding-mcp that gives GitHub Copilot (agent mode) structured, repo-specific knowledge to help new contributors onboard faster
  • Add a pointer to the onboarding assistant at the top of CONTRIBUTING.md
  • Add mcp.json so the server is auto-registered for anyone who opens the workspace

The purpose of this change

New contributors to CMake Tools face a steep ramp-up: two operating modes (kits vs. presets), multiple generator types, a large src tree, and conventions scattered across CONTRIBUTING.md and docs/. This PR adds a lightweight MCP server that Copilot agent mode can call to answer onboarding questions without the contributor needing to hunt through files manually.

The server exposes 7 tools:

| Tool | What it does |
| --- | --- |
| get_setup_guide | Step-by-step local dev setup derived from CONTRIBUTING.md |
| check_pr_readiness | PR checklist with keyword-based warnings (e.g. flagging dependency changes) |
| explain_concept | Explains kits, presets, drivers, CTest, and 11 other concepts with source file links |
| find_source_file | Maps natural-language feature descriptions to the right source files |
| get_docs_page | Returns the matching docs page with summary and section headings |
| get_contributor_issues | Fetches open GitHub issues with contributor-friendliness signals |
| get_recent_changes | Fetches recent commits annotated with affected codebase areas |

The first five tools use static data baked into the server. The last two make live GitHub API calls, which work unauthenticated at 60 requests per hour or at 5,000 per hour with a GITHUB_TOKEN.
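For reviewers who want a picture of the optional-token behavior, here is a minimal sketch. It is not taken from the PR's source: `buildGitHubHeaders`, `fetchOpenIssues`, and the `User-Agent` string are hypothetical names chosen for illustration, and the real server may structure its requests differently.

```typescript
// Illustrative sketch only; names below are assumptions, not the PR's code.
const GITHUB_API = "https://api.github.com";

// Build request headers. Without a token the request is unauthenticated
// (lower rate limit); with a token an Authorization header is attached.
function buildGitHubHeaders(token?: string): Record<string, string> {
  const headers: Record<string, string> = {
    Accept: "application/vnd.github+json",
    "User-Agent": "onboarding-mcp-sketch", // hypothetical identifier
  };
  if (token) {
    headers.Authorization = `Bearer ${token}`;
  }
  return headers;
}

// Fetch open issues for a repository using Node 18+'s built-in fetch.
// Note: this endpoint also returns pull requests, so a real tool would
// probably filter out entries that carry a `pull_request` field.
async function fetchOpenIssues(
  owner: string,
  repo: string,
  token?: string
): Promise<unknown[]> {
  const url = `${GITHUB_API}/repos/${owner}/${repo}/issues?state=open&per_page=20`;
  const res = await fetch(url, { headers: buildGitHubHeaders(token) });
  if (!res.ok) {
    // Surface the remaining quota so a rate-limit failure is easy to spot.
    const remaining = res.headers.get("x-ratelimit-remaining");
    throw new Error(
      `GitHub API returned ${res.status} (rate limit remaining: ${remaining ?? "unknown"})`
    );
  }
  return (await res.json()) as unknown[];
}
```

In a server like this one, the token would presumably be read from the `GITHUB_TOKEN` environment variable and passed as the last argument, so the same code path serves both the unauthenticated and authenticated cases.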

Everything is self-contained under onboarding-mcp with its own package.json, tsconfig.json, and eslint config — no changes to the root build, dependencies, or yarn.lock.

Other Notes/Information

  • Built with @modelcontextprotocol/sdk v1.27.1, TypeScript ESM, and zod for input validation
  • Requires Node 18+ (for built-in fetch)
  • The mcp.json is committed so contributors get the server registered automatically — they just need to run yarn install && yarn build inside onboarding-mcp
  • Usage instructions and example prompts are in README.md and linked from CONTRIBUTING.md

@hanniavalera hanniavalera force-pushed the dev/hanniavalera/mcpNewContributors branch from 006b7c6 to 4c87b81 Compare March 25, 2026 15:39
@hanniavalera hanniavalera changed the title Dev/hanniavalera/mcp new contributors Add onboarding MCP server for new contributors Mar 25, 2026
@hanniavalera hanniavalera force-pushed the dev/hanniavalera/mcpNewContributors branch from 4c87b81 to fd48fe1 Compare March 25, 2026 15:43
@hanniavalera hanniavalera marked this pull request as draft March 25, 2026 16:00
@hanniavalera hanniavalera force-pushed the dev/hanniavalera/mcpNewContributors branch from fd48fe1 to 0c86ca0 Compare March 25, 2026 16:29
@gcampbell-msft
Collaborator

I haven't taken a deep look at this, but couldn't a lot of this, especially the static data, be surfaced through SKILLs rather than having to create the infrastructure and compile and build the MCP server?

@hanniavalera
Contributor Author

hanniavalera commented Mar 25, 2026

> I haven't taken a deep look at this, but couldn't a lot of this, especially the static data, be surfaced through SKILLs rather than having to create the infrastructure and compile and build the MCP server?

Valid! For get_setup_guide and get_docs_page specifically, you're right that Copilot with repo context could handle those without a server. The stronger justification is for the other 5 tools:

  • check_pr_readiness applies repo-specific logic against a contributor's input: it's procedural, not retrieval
  • explain_concept / find_source_file encode a curated concept → source file map that makes navigation deterministic rather than relying on Copilot inferring structure from a 22-file src tree
  • get_contributor_issues / get_recent_changes make live GitHub API calls with repo-specific enrichment, such as annotating commits with affected codebase areas using our own source map, or scoring issues for contributor-friendliness. The official GitHub MCP server can fetch raw issues and commits, but the domain-specific interpretation layered on top is what this server adds.
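To make the "procedural, not retrieval" point concrete, here is a rough sketch of what a keyword-based readiness check could look like. The rule set, the warning text, and the `checkPrReadiness` name are all assumptions for illustration; the actual rules in the server may differ.

```typescript
// Illustrative sketch only; rules and wording are hypothetical.
interface ReadinessRule {
  keywords: string[]; // lowercase substrings to look for in the PR text
  warning: string;    // message shown when any keyword matches
}

const RULES: ReadinessRule[] = [
  {
    // "dependenc" is a stem so it matches both "dependency" and "dependencies".
    keywords: ["package.json", "yarn.lock", "dependenc"],
    warning: "Dependency changes detected: explain why they are needed.",
  },
  {
    keywords: ["new setting", "settings schema"],
    warning: "Setting changes detected: check that the docs are updated.",
  },
];

// Return every warning whose rule has at least one keyword in the description.
function checkPrReadiness(description: string): string[] {
  const text = description.toLowerCase();
  return RULES.filter((rule) =>
    rule.keywords.some((keyword) => text.includes(keyword))
  ).map((rule) => rule.warning);
}
```

The result depends on the contributor's own input rather than on a fixed document, which is the distinction being drawn from the two pure-retrieval tools.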

That said, I'm happy to drop the two pure-retrieval tools if you think that makes the most sense :)

Hannia Valera added 7 commits March 26, 2026 10:54
…atures

- Updated README.md to reflect new tool phases and descriptions.
- Bump version to 0.2.0 in package.json.
- Implemented new tools for concept explanation, source file finding, and documentation retrieval.
- Added concepts data structure to provide detailed explanations of CMake Tools concepts.
- Created a source map to link keywords to relevant source files in the codebase.
- Developed a documentation map to connect topics with their respective documentation files.
- Introduced new tools: `explain_concept`, `find_source_file`, and `get_docs_page` for improved onboarding experience.
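As a rough illustration of the source map idea in the commit notes above, the sketch below maps keywords to files and ranks the matches. The entries, the `src/example/...` paths, and the `findSourceFiles` name are placeholders invented for this example; they do not reflect the real map or the repository layout.

```typescript
// Illustrative sketch only; entries and paths are placeholders.
interface SourceEntry {
  file: string;
  keywords: string[]; // lowercase keyword prefixes associated with the file
}

const SOURCE_MAP: SourceEntry[] = [
  { file: "src/example/kits.ts", keywords: ["kit", "compiler", "toolchain"] },
  { file: "src/example/presets.ts", keywords: ["preset", "cmakepresets"] },
  { file: "src/example/ctest.ts", keywords: ["test", "ctest"] },
];

// Split the query into lowercase words, count how many of each entry's
// keywords are a prefix of some word, and return files ordered by that count.
function findSourceFiles(query: string): string[] {
  const words = query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);

  return SOURCE_MAP.map((entry) => ({
    file: entry.file,
    score: entry.keywords.filter((keyword) =>
      words.some((word) => word.startsWith(keyword))
    ).length,
  }))
    .filter((result) => result.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((result) => result.file);
}
```

Because the lookup is a plain table scan, the same query always yields the same files, which is the determinism argument made earlier in the thread.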
@hanniavalera hanniavalera force-pushed the dev/hanniavalera/mcpNewContributors branch from 8e212c0 to f5c672b Compare March 26, 2026 15:54