Converted CLI tests to use vitest #1012

BobDickinson · 2026-01-15T05:21:50Z

Summary

CLI tests were implemented in scripts (not any kind of test framework), which caused a number of issues, including tests that "failed" but didn't cause the script to return failure.

Also, none of the tests actually validated the thing being tested - they only validated whether the CLI run produced an error or a "success" (which included MCP protocol responses with isError: true).

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Documentation update
Refactoring (no functional changes)
Test updates
Build/CI improvements

Changes Made

Implemented all existing CLI tests in vitest, removed old script-based CLI tests. Updated all tests to ensure they actually validate the thing the test is supposed to test. Remove all references to server-everything and any other registry provided servers. All tests now use an MCP server test harness implemented within the test infra - a stdio server that collects cwd, env vars, and params and reports them as resources for validation, or an sse/http composable and instrumented test server that can provide headers, message replay, detailed metadata, etc).

Testing

Test Results and/or Instructions

Run all tests via npn run test - all 85 tests should pass.

Checklist

Code follows the style guidelines (ran npm run prettier-fix)
Self-review completed
Code is commented where necessary
Documentation updated (README, comments, etc.)

Breaking Changes

No

Additional Context

Updated npm test target, so CI should work properly (but not tested)

… version, fixed problem with undetected failures (isError: true payloads).

…tests

…stry), all tests actually validate what they say they test.

cliffhall · 2026-01-15T18:43:57Z

@claude can you please review this PR?

claude · 2026-01-15T18:44:12Z

Claude finished @cliffhall's task in 1m 58s —— View job

PR Review: Convert CLI Tests to Vitest

Great work on this comprehensive test migration! This PR significantly improves the CLI test infrastructure. Below is my detailed review.

✅ Strengths

1. Solid Test Infrastructure

Professional test harness: The InstrumentedServer class is well-designed with request recording, header capture, and metadata tracking
Comprehensive helpers: cli-runner.ts, assertions.ts, and fixtures.ts provide clean abstractions
Proper isolation: Dynamic port allocation and unique config files via crypto.randomUUID() enable parallel test execution
Good cleanup: Proper use of try/finally blocks to clean up servers and config files

2. Improved Test Quality

Tests now validate actual behavior, not just "didn't error"
Good coverage of edge cases (environment variables with equals signs, base64 tokens, etc.)
Transport-specific testing (stdio, HTTP, SSE) with proper server implementations
Metadata and header propagation properly validated

3. Code Quality

Clean TypeScript with proper types
Well-organized test structure with descriptive describe blocks
Good use of vitest's assertion API
Helpful README documenting test organization and running commands

4. Configuration

Sensible vitest config with 15s timeout for subprocess tests
Updated CI workflow is clean and straightforward
Package.json changes are minimal and focused

⚠️ Issues & Concerns

🔴 Critical: Missing `tsx` Dependency

The tests use tsx to run the test MCP server (cli/__tests__/helpers/fixtures.ts:106), but tsx is not listed in any package.json. This will cause all tests to fail.

Impact: All stdio transport tests will fail with "command not found: tsx"

Fix needed: Add tsx to cli/package.json devDependencies:

"devDependencies": {
  "@types/express": "^5.0.6",
  "tsx": "^4.7.0",
  "vitest": "^4.0.17"
}

Fix this →

🟡 Moderate: Potential Process Leak on Windows

In cli/__tests__/helpers/cli-runner.ts:51-59, the timeout handler attempts to kill process groups with process.kill(-child.pid!, "SIGTERM") on non-Windows systems, but this may fail silently if the process has already exited. Consider adding better error handling:

try {
  if (process.platform === "win32") {
    child.kill("SIGTERM");
  } else {
    // On Unix, kill the process group
    process.kill(-child.pid!, "SIGTERM");
  }
} catch (e) {
  // Process might already be dead, try direct kill
  try {
    child.kill("SIGKILL");
  } catch (e2) {
    // Process is definitely dead
  }
}

🟡 Moderate: Missing Root Test Script

The root package.json has a test script that runs npm run prettier-check && cd client && npm test, but doesn't include CLI tests. Consider updating to:

"test": "npm run prettier-check && cd client && npm test && npm run test-cli"

Or at minimum, update the PR description's "Test Results" section since it says "Run all tests via npm run test" but that command doesn't actually run CLI tests from the root.

🟡 Moderate: HTTP Server Cleanup

In instrumented-server.ts:415-431, the stop() method closes the HTTP server but doesn't force-close active connections. For tests, this could cause port conflicts if connections don't close quickly. Consider:

async stop(): Promise<void> {
  await this.mcpServer.close();

  if (this.transport) {
    await this.transport.close();
    this.transport = undefined;
  }

  if (this.httpServer) {
    return new Promise((resolve) => {
      // Force close all connections
      this.httpServer!.closeAllConnections?.();
      this.httpServer!.close(() => {
        this.httpServer = undefined;
        resolve();
      });
    });
  }
}

🟢 Minor: Inconsistent Error Handling

The hasValidJsonOutput function in assertions.ts:57-66 is defined but never used. Consider removing it or using it in appropriate tests.

🟢 Minor: Test Timeout Configuration

The vitest config sets testTimeout: 15000 (15s) and cli-runner.ts uses a default timeout of 12s. While this works, the 3-second buffer is tight. Consider either:

Increasing vitest timeout to 20s
Reducing cli-runner default to 10s
Adding a comment explaining the relationship

🤔 Questions

CI Coverage: The PR says "Updated npm test target, so CI should work properly (but not tested)" - should we verify CI passes before merging?
Coverage Tracking: The README notes that coverage isn't used because code runs in spawned processes. Is this acceptable, or should we consider using c8/nyc for coverage?
Parallel Execution: The README says tests within a file run sequentially. Can we enable intra-file parallelism with proper isolation for faster test runs?

📋 Testing Recommendations

Before merging:

✅ Add tsx dependency
✅ Run the full test suite in CI to verify everything passes
⚠️ Test on Windows if possible (the process cleanup code is platform-specific)
⚠️ Verify the test suite completes in reasonable time (currently 85 tests)

🎯 Summary

Recommendation: ✅ Approve after fixing the tsx dependency

This is excellent work that significantly improves the test infrastructure. The only blocking issue is the missing tsx dependency. Once that's added and CI confirms all tests pass, this is ready to merge.

The new test infrastructure is:

More maintainable (vitest vs custom scripts)
More reliable (proper assertions vs just checking exit codes)
Better isolated (no external registry dependencies)
Well-documented (helpful README)

Great job! 🎉

Detailed Stats

Files changed: 18
Lines added: 4,059
Lines removed: 2,490
Net addition: +1,569 lines
Test files: 4 main test files + 5 helper modules
Test cases: ~120 test cases (describe/it blocks)
Infrastructure: Complete test harness with stdio and HTTP/SSE server implementations

… up process termination (possible leak on Windows), beefed up http server cleanup (close all connections), removed unused hasValidJsonOutput, reduced CLI timeout to give it breathing room with vitest timeout.

BobDickinson · 2026-01-15T19:49:36Z

I addressed the valid points from the Claude PR review. This should be ready to merge.

BobDickinson added 5 commits January 14, 2026 13:41

Pinned server-everything, fixed all tests to work with current pinned…

bf3e765

… version, fixed problem with undetected failures (isError: true payloads).

First working vitest implementation

5eec809

Refactoring some single-use configs fixtures and into the refeencing …

395de2a

…tests

No tests refere to server-everything (or any other server from a regi…

20292b1

…stry), all tests actually validate what they say they test.

Removed server-everything dep from CI, minor cleanup

f57bc30

Addressed Claude PR review comments: Added tsx dev dependency, beefed…

5ee7d77

… up process termination (possible leak on Windows), beefed up http server cleanup (close all connections), removed unused hasValidJsonOutput, reduced CLI timeout to give it breathing room with vitest timeout.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Converted CLI tests to use vitest #1012

Converted CLI tests to use vitest #1012

BobDickinson commented Jan 15, 2026

Uh oh!

cliffhall commented Jan 15, 2026

Uh oh!

claude bot commented Jan 15, 2026 •

edited

Loading

Uh oh!

BobDickinson commented Jan 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Converted CLI tests to use vitest #1012

Are you sure you want to change the base?

Converted CLI tests to use vitest #1012

Conversation

BobDickinson commented Jan 15, 2026

Summary

Type of Change

Changes Made

Testing

Test Results and/or Instructions

Checklist

Breaking Changes

Additional Context

Uh oh!

cliffhall commented Jan 15, 2026

Uh oh!

claude bot commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Convert CLI Tests to Vitest

✅ Strengths

1. Solid Test Infrastructure

2. Improved Test Quality

3. Code Quality

4. Configuration

⚠️ Issues & Concerns

🔴 Critical: Missing tsx Dependency

🟡 Moderate: Potential Process Leak on Windows

🟡 Moderate: Missing Root Test Script

🟡 Moderate: HTTP Server Cleanup

🟢 Minor: Inconsistent Error Handling

🟢 Minor: Test Timeout Configuration

🤔 Questions

📋 Testing Recommendations

🎯 Summary

Uh oh!

BobDickinson commented Jan 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

claude bot commented Jan 15, 2026 •

edited

Loading

🔴 Critical: Missing `tsx` Dependency