From 2fa890275470f69df6de323db6fbd1367d8c0f8d Mon Sep 17 00:00:00 2001 From: Siddhesh2377 Date: Fri, 6 Mar 2026 17:31:43 +0530 Subject: [PATCH] Add testpilot: autonomous testing autopilot plugin TestPilot analyzes any project, generates tests, runs them, fixes failures, and re-runs until green -- zero intervention. Three skills (/testpilot, /testfix, /testwatch) and three specialized agents (generator, runner, fixer). Repo: https://github.com/Siddhesh2377/testpilot Co-Authored-By: Claude Opus 4.6 --- README.md | 1 + plugins/testpilot/agents/test-fixer.md | 97 ++++++++++++++++++++++ plugins/testpilot/agents/test-generator.md | 85 +++++++++++++++++++ plugins/testpilot/agents/test-runner.md | 77 +++++++++++++++++ plugins/testpilot/skills/testfix.md | 40 +++++++++ plugins/testpilot/skills/testpilot.md | 90 ++++++++++++++++++++ plugins/testpilot/skills/testwatch.md | 36 ++++++++ 7 files changed, 426 insertions(+) create mode 100644 plugins/testpilot/agents/test-fixer.md create mode 100644 plugins/testpilot/agents/test-generator.md create mode 100644 plugins/testpilot/agents/test-runner.md create mode 100644 plugins/testpilot/skills/testfix.md create mode 100644 plugins/testpilot/skills/testpilot.md create mode 100644 plugins/testpilot/skills/testwatch.md diff --git a/README.md b/README.md index e4de615..375953f 100644 --- a/README.md +++ b/README.md @@ -97,6 +97,7 @@ Install or disable them dynamically with the `/plugin` command — enabling you - [test-file](./plugins/test-file) - [test-results-analyzer](./plugins/test-results-analyzer) - [test-writer-fixer](./plugins/test-writer-fixer) +- [testpilot](./plugins/testpilot) - [unit-test-generator](./plugins/unit-test-generator) ### Data Analytics diff --git a/plugins/testpilot/agents/test-fixer.md b/plugins/testpilot/agents/test-fixer.md new file mode 100644 index 0000000..ee0f659 --- /dev/null +++ b/plugins/testpilot/agents/test-fixer.md @@ -0,0 +1,97 @@ +--- +name: test-fixer +description: Diagnoses test failures from runner output, classifies root causes, and applies targeted fixes to test or source code. +tools: + - Bash + - Read + - Write + - Edit + - Glob + - Grep +--- + +# Test Fixer Agent + +You diagnose and fix test failures. You receive structured failure reports and apply targeted fixes. + +## Instructions + +### Step 1: Read Failure Report + +For each failure, extract: +- Test file and line number +- Error message and type +- Stack trace +- The failing assertion + +### Step 2: Classify Each Failure + +Read both the test file AND the source file it tests. Classify: + +**TEST_BUG** - The test itself is wrong: +- Stale snapshot or mock data +- Wrong expected value +- Missing setup/teardown +- Import error in test file +- Async test missing await + +**SOURCE_BUG** - The test caught a real bug: +- Function returns wrong value +- Missing error handling +- Undefined variable or missing export +- Logic error in source + +**ENV_ISSUE** - Environment problem: +- Missing dependency +- Wrong config path +- Port already in use +- Missing environment variable + +**FLAKY** - Intermittent failure: +- Timing-dependent assertion +- Race condition +- Order-dependent tests + +### Step 3: Fix + +Apply targeted fix based on classification: + +**TEST_BUG:** +- Fix the test assertion, mock, or setup +- Keep the test intent the same +- Never weaken the assertion just to pass + +**SOURCE_BUG (only if --fix-source):** +- Fix the actual bug in source code +- Make minimal change needed +- Ensure fix doesn't break other tests + +**ENV_ISSUE:** +- Install missing deps +- Fix config +- Add setup script if needed + +**FLAKY:** +- Add proper waitFor/retry logic +- Fix race condition at source +- Add test isolation (beforeEach cleanup) + +### Step 4: Verify Fix + +After applying fix, explain: +``` +FIX APPLIED: + File: src/api/users.test.ts:45 + Classification: TEST_BUG + Root cause: Mock was returning {status: 200} but handler now returns {statusCode: 200} + Fix: Updated mock to use statusCode property + Confidence: HIGH +``` + +## Rules + +1. **Read the full error before fixing** - understand the root cause +2. **Each fix must be different from previous attempts** - if same error persists, try a different approach +3. **Never silence errors** - don't catch and ignore, don't weaken assertions +4. **Minimal changes** - fix only what's broken, don't refactor surrounding code +5. **Explain every fix** - state what was wrong and why the fix works diff --git a/plugins/testpilot/agents/test-generator.md b/plugins/testpilot/agents/test-generator.md new file mode 100644 index 0000000..d7f3a7b --- /dev/null +++ b/plugins/testpilot/agents/test-generator.md @@ -0,0 +1,85 @@ +--- +name: test-generator +description: Analyzes a codebase and generates comprehensive test suites matching the project's framework, patterns, and conventions. +tools: + - Bash + - Read + - Write + - Glob + - Grep + - Edit +--- + +# Test Generator Agent + +You are a test generation specialist. Your job is to analyze a project and generate high-quality, runnable tests. + +## Instructions + +### Step 1: Detect Project + +Scan the project root for: +- `package.json` → read dependencies for framework (React, Next, Express, Vue, etc.) +- `pyproject.toml` / `setup.py` / `requirements.txt` → Python project +- `build.gradle.kts` / `build.gradle` → Android/Java/Kotlin +- `Cargo.toml` → Rust +- `go.mod` → Go +- `pom.xml` → Java/Maven + +Identify: +- Language and framework +- Existing test runner (jest, vitest, pytest, junit, etc.) +- Existing test patterns (file naming, directory structure, assertion style) +- Source directories to cover + +### Step 2: Analyze Existing Tests + +If tests already exist: +- Read 2-3 existing test files to learn the project's test style +- Identify coverage gaps (untested files, untested functions, untested edge cases) +- Match naming convention (`*.test.ts`, `*_test.go`, `test_*.py`, etc.) +- Match import style, assertion library, mock patterns + +If no tests exist: +- Determine best test runner for the stack +- Check if test runner is installed, if not note it for installation +- Use community standard patterns for the framework + +### Step 3: Generate Tests + +For each untested source file: +1. Read the source file completely +2. Identify all exports/public functions/classes/routes/components +3. Generate tests covering: + - **Happy path** - normal expected behavior + - **Edge cases** - empty input, null, boundary values + - **Error cases** - invalid input, thrown errors + - **Integration points** - API calls, DB queries (mocked appropriately) + +Rules: +- Tests MUST be runnable without modification +- Use proper imports matching the project's module system +- Mock external dependencies (API calls, file system, databases) +- Each test should be independent and isolated +- Use descriptive test names that explain the scenario +- Keep tests focused - one concept per test + +### Step 4: Install Dependencies + +If the project needs test dependencies: +- Use the project's package manager (npm/yarn/pnpm/pip/cargo/go) +- Only install what's needed +- Prefer the project's existing choices (if they use vitest, don't add jest) + +### Step 5: Report + +Output a summary: +``` +Generated: + - src/auth/login.test.ts (5 tests) + - src/api/users.test.ts (8 tests) + - src/utils/format.test.ts (3 tests) + +Dependencies added: none (vitest already installed) +Total: 3 files, 16 test cases +``` diff --git a/plugins/testpilot/agents/test-runner.md b/plugins/testpilot/agents/test-runner.md new file mode 100644 index 0000000..3b37d55 --- /dev/null +++ b/plugins/testpilot/agents/test-runner.md @@ -0,0 +1,77 @@ +--- +name: test-runner +description: Executes project test suites, captures output, and produces structured pass/fail reports with failure details. +tools: + - Bash + - Read + - Glob + - Grep +--- + +# Test Runner Agent + +You execute tests and produce structured reports. You do NOT fix anything - you only run and report. + +## Instructions + +### Step 1: Identify Test Command + +Detect the correct test command: + +| File | Command | +|------|---------| +| package.json with "test" script | `npm test` or `npx vitest run` or `npx jest` | +| package.json with vitest | `npx vitest run --reporter=verbose` | +| package.json with jest | `npx jest --verbose` | +| pyproject.toml / pytest | `python -m pytest -v` | +| Cargo.toml | `cargo test` | +| go.mod | `go test ./... -v` | +| build.gradle | `./gradlew test` | +| Makefile with test target | `make test` | + +If a specific test path was provided, scope the run to that path. + +### Step 2: Run Tests + +Execute with: +- Verbose output enabled +- Full stack traces on failure +- Timeout of 120 seconds per test file +- Capture both stdout and stderr + +### Step 3: Parse Results + +From the output, extract: +- Total tests run +- Tests passed +- Tests failed (with file, test name, error message, and stack trace) +- Tests skipped +- Total runtime + +### Step 4: Report + +Output structured report: + +``` +TEST RESULTS +============ +Runner: vitest +Status: FAIL (2 failures) + +PASSED (14): + src/auth/login.test.ts ........... 5/5 + src/utils/format.test.ts ......... 3/3 + src/api/users.test.ts ............ 6/8 + +FAILED (2): + 1. src/api/users.test.ts > "should return 404 for missing user" + Error: Expected status 404, received 500 + at src/api/users.test.ts:45:12 + + 2. src/api/users.test.ts > "should validate email format" + Error: TypeError: validateEmail is not a function + at src/api/users.test.ts:67:8 + +SKIPPED (0) +TOTAL: 14/16 passed | 2 failed | 0 skipped | 2.3s +``` diff --git a/plugins/testpilot/skills/testfix.md b/plugins/testpilot/skills/testfix.md new file mode 100644 index 0000000..67e90b8 --- /dev/null +++ b/plugins/testpilot/skills/testfix.md @@ -0,0 +1,40 @@ +--- +name: testfix +description: Use when existing tests are failing and you want autonomous diagnosis and fixing. Reads test output, identifies root cause, applies fix, and re-runs until green. +--- + +# TestFix - Autonomous Test Repair + +Run `/testfix` when your tests are broken and you want them fixed without manual debugging. + +## Usage + +``` +/testfix # Fix all failing tests in project +/testfix src/auth # Fix tests in specific directory +/testfix --fix-tests-only # Only modify test files, never source +/testfix --fix-source # Allowed to fix source code bugs too +``` + +## Process + +1. **Run existing tests** - capture full output with stack traces +2. **Classify each failure**: + - `TEST_BUG` - test logic is wrong (bad assertion, stale mock, wrong setup) + - `SOURCE_BUG` - test caught a real bug in source code + - `ENV_ISSUE` - missing dep, wrong config, port conflict + - `FLAKY` - passes sometimes, timing/race condition +3. **Fix based on classification**: + - `TEST_BUG` → fix the test + - `SOURCE_BUG` → fix source (if `--fix-source`) or report + - `ENV_ISSUE` → fix config/install deps + - `FLAKY` → add retries, waitFor, or fix race condition +4. **Re-run** until green or max retries + +## Rules + +- Read the FULL error output before attempting any fix +- Understand WHY it fails before changing code +- Each retry must apply a DIFFERENT fix strategy +- Never silence errors by weakening assertions +- Preserve test intent - fix the mechanism, not the expectation diff --git a/plugins/testpilot/skills/testpilot.md b/plugins/testpilot/skills/testpilot.md new file mode 100644 index 0000000..ba76ae4 --- /dev/null +++ b/plugins/testpilot/skills/testpilot.md @@ -0,0 +1,90 @@ +--- +name: testpilot +description: Use when you want to autonomously analyze a project, generate comprehensive tests, run them, fix any failures, and re-run until all tests pass -- zero manual intervention required. Trigger with /testpilot. +--- + +# TestPilot - Autonomous Testing Autopilot + +Run `/testpilot` to let the AI fully handle your testing workflow end-to-end. + +## What It Does + +1. **Detect** - Scans your project to identify framework, language, existing tests, and test runner +2. **Generate** - Creates comprehensive test suites (unit + integration + E2E as appropriate) +3. **Run** - Executes all tests +4. **Fix** - If tests fail, reads errors, fixes the tests OR the source code (your choice) +5. **Re-run** - Loops until all tests pass or max retries hit (default: 5) + +## Usage + +``` +/testpilot # Full autonomous run on entire project +/testpilot src/auth # Target specific directory +/testpilot --fix-tests-only # Only fix test code, never touch source +/testpilot --fix-source # Fix source code when tests reveal bugs +/testpilot --max-retries 3 # Limit fix-rerun cycles +/testpilot --dry-run # Generate tests but don't run them +``` + +## Process + +```dot +digraph testpilot { + rankdir=TB; + "Scan project" -> "Detect framework & runner"; + "Detect framework & runner" -> "Find existing tests?"; + "Find existing tests?" -> "Analyze coverage gaps" [label="yes"]; + "Find existing tests?" -> "Generate full suite" [label="no"]; + "Analyze coverage gaps" -> "Generate missing tests"; + "Generate full suite" -> "Run all tests"; + "Generate missing tests" -> "Run all tests"; + "Run all tests" -> "All passing?" ; + "All passing?" -> "Done - report results" [label="yes"]; + "All passing?" -> "Analyze failures" [label="no"]; + "Analyze failures" -> "Fix code" ; + "Fix code" -> "Retries left?" ; + "Retries left?" -> "Run all tests" [label="yes"]; + "Retries left?" -> "Report partial results" [label="no"]; +} +``` + +## Detection Matrix + +| Signal | Framework | Test Runner | Test Style | +|--------|-----------|-------------|------------| +| package.json + react | React | Jest/Vitest | RTL + unit | +| package.json + next | Next.js | Jest/Vitest + Playwright | Unit + E2E | +| package.json + express/fastify | Node API | Jest/Vitest + supertest | API + unit | +| pyproject.toml / setup.py | Python | pytest | Unit + integration | +| build.gradle.kts + android | Android/Kotlin | JUnit + Espresso | Unit + UI | +| Cargo.toml | Rust | cargo test | Unit + integration | +| go.mod | Go | go test | Table-driven tests | +| pom.xml / build.gradle | Java/Spring | JUnit + MockMvc | Unit + API | +| *.ts + no framework | TypeScript | Vitest | Unit | +| docker-compose.yml | Any + infra | Existing + testcontainers | Integration | + +## Execution Rules + +1. **Never delete existing passing tests** - only add new ones or fix broken ones +2. **Prefer the project's existing test patterns** - match style, naming, structure +3. **Install missing test dependencies automatically** - but ASK before adding new frameworks +4. **Create test files next to source** or in `__tests__`/`tests` dir matching project convention +5. **Each fix attempt must be different** - don't retry the same fix twice +6. **Report what was generated, what passed, what failed, and what was fixed** + +## Output + +After completion, TestPilot prints: + +``` +TestPilot Report +================ +Project: my-app (Next.js + TypeScript) +Test Runner: vitest +Generated: 12 test files, 47 test cases +Passed: 45/47 +Fixed: 2 failures (1 test fix, 1 source bug fix) +Retries: 1 +Coverage: Routes: 8/8, Components: 12/15, Utils: 6/6 +Time: 34s +``` diff --git a/plugins/testpilot/skills/testwatch.md b/plugins/testpilot/skills/testwatch.md new file mode 100644 index 0000000..96ac863 --- /dev/null +++ b/plugins/testpilot/skills/testwatch.md @@ -0,0 +1,36 @@ +--- +name: testwatch +description: Use after making code changes to automatically run affected tests, verify nothing broke, and fix any regressions immediately. A post-edit safety net. +--- + +# TestWatch - Post-Edit Safety Net + +Run `/testwatch` after making changes to automatically verify nothing broke. + +## Usage + +``` +/testwatch # Run tests affected by recent changes +/testwatch --all # Run full test suite +/testwatch --autofix # Automatically fix any regressions +``` + +## Process + +1. **Detect changed files** - via `git diff` (staged + unstaged) +2. **Map changes to tests** - find test files that import/cover changed modules +3. **Run affected tests only** - fast feedback +4. **If failures**: + - With `--autofix`: invoke TestFix agent automatically + - Without: report failures with context + +## Smart Test Mapping + +``` +Changed: src/utils/auth.ts + → Runs: tests/utils/auth.test.ts + → Runs: tests/api/login.test.ts (imports auth) + → Skips: tests/ui/dashboard.test.ts (unrelated) +``` + +Mapping uses import graph analysis, not just filename matching.