fix: pin MCP server versions for reproducible benchmarks #246
Merged
MCP servers are constantly evolving, and version changes can affect benchmark results. To ensure reproducibility, we pin to specific versions:

- GitHub MCP Server: v0.15.0 (93 tools)
  - Switched from HTTP remote API to Docker-based STDIO for version control
  - Remote API at api.githubcopilot.com doesn't support version pinning
- Notion MCP Server: @notionhq/notion-mcp-server@1.9.1
  - Already pinned, no change needed

Also includes improvements to NotionStateManager:

- Add browser instance reuse within session for better performance
- Add source hub orphan cleanup to prevent duplicate page accumulation
- Add UI-based recovery for duplicate page detection
- Improve error handling and logging

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
c07f4ab to
eb51fbb
Compare
xyliugo
approved these changes
Jan 21, 2026
Summary
MCP servers are constantly evolving, and version changes can affect benchmark results. To ensure reproducibility, we pin to specific versions:
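GitHub MCP Server:
- Pinned to v0.15.0 (93 tools)
- Switched from the HTTP remote API to Docker-based STDIO for version control
- The remote API at api.githubcopilot.com doesn't support version pinning

Notion MCP Server:
- @notionhq/notion-mcp-server@1.9.1
- Already pinned, no change needed

NotionStateManager Improvements
Also includes improvements to NotionStateManager:
- Add browser instance reuse within a session for better performance
- Add source hub orphan cleanup to prevent duplicate page accumulation
- Add UI-based recovery for duplicate page detection
- Improve error handling and logging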
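For illustration, the version pins listed under Summary could be expressed as STDIO launch commands. This is a minimal sketch, not the code changed in this PR: the image name `ghcr.io/github/github-mcp-server`, the `GITHUB_PERSONAL_ACCESS_TOKEN` environment variable, and all function names are assumptions; only the versions v0.15.0 and 1.9.1 and the package name come from the description.

```python
# Hypothetical sketch: build pinned launch commands for the two MCP servers.
# Only the version numbers and the npm package name come from the PR text;
# the Docker image name and the token variable are assumptions.

GITHUB_MCP_IMAGE = "ghcr.io/github/github-mcp-server"  # assumed image name
GITHUB_MCP_TAG = "v0.15.0"                             # pinned in this PR
NOTION_MCP_PACKAGE = "@notionhq/notion-mcp-server"     # from the PR text
NOTION_MCP_VERSION = "1.9.1"                           # already pinned


def require_pinned(version: str) -> str:
    """Reject empty or floating versions so a run cannot drift silently."""
    if not version or version == "latest":
        raise ValueError(f"version must be pinned explicitly, got {version!r}")
    return version


def github_mcp_command(tag: str = GITHUB_MCP_TAG) -> list[str]:
    """Docker-based STDIO launch; the image tag fixes the server version."""
    return [
        "docker", "run", "-i", "--rm",
        # Pass the token through from the caller's environment (assumed name).
        "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
        f"{GITHUB_MCP_IMAGE}:{require_pinned(tag)}",
    ]


def notion_mcp_command(version: str = NOTION_MCP_VERSION) -> list[str]:
    """npx launch; the @version suffix fixes the package version."""
    return ["npx", "-y", f"{NOTION_MCP_PACKAGE}@{require_pinned(version)}"]
```

A benchmark harness could pass either list to `subprocess.Popen` with piped stdin and stdout. Routing every version through `require_pinned` means an accidental `latest` fails fast instead of changing the tool set between runs.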
Test plan
close #245