Skip to content

Conversation

@BobDickinson
Copy link

Summary

CLI tests were implemented in scripts (not any kind of test framework), which caused a number of issues, including tests that "failed" but didn't cause the script to return failure.

Also, none of the tests actually validated the thing being tested - they only validated whether the CLI run produced an error or a "success" (which included MCP protocol responses with isError: true).

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Documentation update
  • Refactoring (no functional changes)
  • Test updates
  • Build/CI improvements

Changes Made

Implemented all existing CLI tests in vitest, removed old script-based CLI tests. Updated all tests to ensure they actually validate the thing the test is supposed to test. Remove all references to server-everything and any other registry provided servers. All tests now use an MCP server test harness implemented within the test infra - a stdio server that collects cwd, env vars, and params and reports them as resources for validation, or an sse/http composable and instrumented test server that can provide headers, message replay, detailed metadata, etc).

Testing

  • Tested in UI mode
  • Tested in CLI mode
  • Tested with STDIO transport
  • Tested with SSE transport
  • Tested with Streamable HTTP transport
  • Added/updated automated tests
  • Manual testing performed

Test Results and/or Instructions

Run all tests via npn run test - all 85 tests should pass.

Checklist

  • Code follows the style guidelines (ran npm run prettier-fix)
  • Self-review completed
  • Code is commented where necessary
  • Documentation updated (README, comments, etc.)

Breaking Changes

No

Additional Context

Updated npm test target, so CI should work properly (but not tested)

@cliffhall
Copy link
Member

@claude can you please review this PR?

@claude
Copy link
Contributor

claude bot commented Jan 15, 2026

Claude finished @cliffhall's task in 1m 58s —— View job


PR Review: Convert CLI Tests to Vitest

Great work on this comprehensive test migration! This PR significantly improves the CLI test infrastructure. Below is my detailed review.


✅ Strengths

1. Solid Test Infrastructure

  • Professional test harness: The InstrumentedServer class is well-designed with request recording, header capture, and metadata tracking
  • Comprehensive helpers: cli-runner.ts, assertions.ts, and fixtures.ts provide clean abstractions
  • Proper isolation: Dynamic port allocation and unique config files via crypto.randomUUID() enable parallel test execution
  • Good cleanup: Proper use of try/finally blocks to clean up servers and config files

2. Improved Test Quality

  • Tests now validate actual behavior, not just "didn't error"
  • Good coverage of edge cases (environment variables with equals signs, base64 tokens, etc.)
  • Transport-specific testing (stdio, HTTP, SSE) with proper server implementations
  • Metadata and header propagation properly validated

3. Code Quality

  • Clean TypeScript with proper types
  • Well-organized test structure with descriptive describe blocks
  • Good use of vitest's assertion API
  • Helpful README documenting test organization and running commands

4. Configuration

  • Sensible vitest config with 15s timeout for subprocess tests
  • Updated CI workflow is clean and straightforward
  • Package.json changes are minimal and focused

⚠️ Issues & Concerns

🔴 Critical: Missing tsx Dependency

The tests use tsx to run the test MCP server (cli/__tests__/helpers/fixtures.ts:106), but tsx is not listed in any package.json. This will cause all tests to fail.

Impact: All stdio transport tests will fail with "command not found: tsx"

Fix needed: Add tsx to cli/package.json devDependencies:

"devDependencies": {
  "@types/express": "^5.0.6",
  "tsx": "^4.7.0",
  "vitest": "^4.0.17"
}

Fix this →

🟡 Moderate: Potential Process Leak on Windows

In cli/__tests__/helpers/cli-runner.ts:51-59, the timeout handler attempts to kill process groups with process.kill(-child.pid!, "SIGTERM") on non-Windows systems, but this may fail silently if the process has already exited. Consider adding better error handling:

try {
  if (process.platform === "win32") {
    child.kill("SIGTERM");
  } else {
    // On Unix, kill the process group
    process.kill(-child.pid!, "SIGTERM");
  }
} catch (e) {
  // Process might already be dead, try direct kill
  try {
    child.kill("SIGKILL");
  } catch (e2) {
    // Process is definitely dead
  }
}

🟡 Moderate: Missing Root Test Script

The root package.json has a test script that runs npm run prettier-check && cd client && npm test, but doesn't include CLI tests. Consider updating to:

"test": "npm run prettier-check && cd client && npm test && npm run test-cli"

Or at minimum, update the PR description's "Test Results" section since it says "Run all tests via npm run test" but that command doesn't actually run CLI tests from the root.

🟡 Moderate: HTTP Server Cleanup

In instrumented-server.ts:415-431, the stop() method closes the HTTP server but doesn't force-close active connections. For tests, this could cause port conflicts if connections don't close quickly. Consider:

async stop(): Promise<void> {
  await this.mcpServer.close();

  if (this.transport) {
    await this.transport.close();
    this.transport = undefined;
  }

  if (this.httpServer) {
    return new Promise((resolve) => {
      // Force close all connections
      this.httpServer!.closeAllConnections?.();
      this.httpServer!.close(() => {
        this.httpServer = undefined;
        resolve();
      });
    });
  }
}

🟢 Minor: Inconsistent Error Handling

The hasValidJsonOutput function in assertions.ts:57-66 is defined but never used. Consider removing it or using it in appropriate tests.

🟢 Minor: Test Timeout Configuration

The vitest config sets testTimeout: 15000 (15s) and cli-runner.ts uses a default timeout of 12s. While this works, the 3-second buffer is tight. Consider either:

  • Increasing vitest timeout to 20s
  • Reducing cli-runner default to 10s
  • Adding a comment explaining the relationship

🤔 Questions

  1. CI Coverage: The PR says "Updated npm test target, so CI should work properly (but not tested)" - should we verify CI passes before merging?

  2. Coverage Tracking: The README notes that coverage isn't used because code runs in spawned processes. Is this acceptable, or should we consider using c8/nyc for coverage?

  3. Parallel Execution: The README says tests within a file run sequentially. Can we enable intra-file parallelism with proper isolation for faster test runs?


📋 Testing Recommendations

Before merging:

  1. ✅ Add tsx dependency
  2. ✅ Run the full test suite in CI to verify everything passes
  3. ⚠️ Test on Windows if possible (the process cleanup code is platform-specific)
  4. ⚠️ Verify the test suite completes in reasonable time (currently 85 tests)

🎯 Summary

Recommendation: ✅ Approve after fixing the tsx dependency

This is excellent work that significantly improves the test infrastructure. The only blocking issue is the missing tsx dependency. Once that's added and CI confirms all tests pass, this is ready to merge.

The new test infrastructure is:

  • More maintainable (vitest vs custom scripts)
  • More reliable (proper assertions vs just checking exit codes)
  • Better isolated (no external registry dependencies)
  • Well-documented (helpful README)

Great job! 🎉


Detailed Stats
  • Files changed: 18
  • Lines added: 4,059
  • Lines removed: 2,490
  • Net addition: +1,569 lines
  • Test files: 4 main test files + 5 helper modules
  • Test cases: ~120 test cases (describe/it blocks)
  • Infrastructure: Complete test harness with stdio and HTTP/SSE server implementations

… up process termination (possible leak on Windows), beefed up http server cleanup (close all connections), removed unused hasValidJsonOutput, reduced CLI timeout to give it breathing room with vitest timeout.
@BobDickinson
Copy link
Author

I addressed the valid points from the Claude PR review. This should be ready to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants