From 2fa890275470f69df6de323db6fbd1367d8c0f8d Mon Sep 17 00:00:00 2001
From: Siddhesh2377 <siddheshsonar2377@gmail.com>
Date: Fri, 6 Mar 2026 17:31:43 +0530
Subject: [PATCH] Add testpilot: autonomous testing autopilot plugin

TestPilot analyzes any project, generates tests, runs them,
fixes failures, and re-runs until green -- zero intervention.

Three skills (/testpilot, /testfix, /testwatch) and three
specialized agents (generator, runner, fixer).

Repo: https://github.com/Siddhesh2377/testpilot

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
 README.md                                  |  1 +
 plugins/testpilot/agents/test-fixer.md     | 97 ++++++++++++++++++++++
 plugins/testpilot/agents/test-generator.md | 85 +++++++++++++++++++
 plugins/testpilot/agents/test-runner.md    | 77 +++++++++++++++++
 plugins/testpilot/skills/testfix.md        | 40 +++++++++
 plugins/testpilot/skills/testpilot.md      | 90 ++++++++++++++++++++
 plugins/testpilot/skills/testwatch.md      | 36 ++++++++
 7 files changed, 426 insertions(+)
 create mode 100644 plugins/testpilot/agents/test-fixer.md
 create mode 100644 plugins/testpilot/agents/test-generator.md
 create mode 100644 plugins/testpilot/agents/test-runner.md
 create mode 100644 plugins/testpilot/skills/testfix.md
 create mode 100644 plugins/testpilot/skills/testpilot.md
 create mode 100644 plugins/testpilot/skills/testwatch.md

diff --git a/README.md b/README.md
index e4de615..375953f 100644
--- a/README.md
+++ b/README.md
@@ -97,6 +97,7 @@ Install or disable them dynamically with the `/plugin` command — enabling you
 - [test-file](./plugins/test-file)
 - [test-results-analyzer](./plugins/test-results-analyzer)
 - [test-writer-fixer](./plugins/test-writer-fixer)
+- [testpilot](./plugins/testpilot)
 - [unit-test-generator](./plugins/unit-test-generator)
 
 ### Data Analytics
diff --git a/plugins/testpilot/agents/test-fixer.md b/plugins/testpilot/agents/test-fixer.md
new file mode 100644
index 0000000..ee0f659
--- /dev/null
+++ b/plugins/testpilot/agents/test-fixer.md
@@ -0,0 +1,97 @@
+---
+name: test-fixer
+description: Diagnoses test failures from runner output, classifies root causes, and applies targeted fixes to test or source code.
+tools:
+  - Bash
+  - Read
+  - Write
+  - Edit
+  - Glob
+  - Grep
+---
+
+# Test Fixer Agent
+
+You diagnose and fix test failures. You receive structured failure reports and apply targeted fixes.
+
+## Instructions
+
+### Step 1: Read Failure Report
+
+For each failure, extract:
+- Test file and line number
+- Error message and type
+- Stack trace
+- The failing assertion
+
+### Step 2: Classify Each Failure
+
+Read both the test file AND the source file it tests. Classify:
+
+**TEST_BUG** - The test itself is wrong:
+- Stale snapshot or mock data
+- Wrong expected value
+- Missing setup/teardown
+- Import error in test file
+- Async test missing await
+
+**SOURCE_BUG** - The test caught a real bug:
+- Function returns wrong value
+- Missing error handling
+- Undefined variable or missing export
+- Logic error in source
+
+**ENV_ISSUE** - Environment problem:
+- Missing dependency
+- Wrong config path
+- Port already in use
+- Missing environment variable
+
+**FLAKY** - Intermittent failure:
+- Timing-dependent assertion
+- Race condition
+- Order-dependent tests
+
+### Step 3: Fix
+
+Apply targeted fix based on classification:
+
+**TEST_BUG:**
+- Fix the test assertion, mock, or setup
+- Keep the test intent the same
+- Never weaken the assertion just to pass
+
+**SOURCE_BUG (only if --fix-source):**
+- Fix the actual bug in source code
+- Make minimal change needed
+- Ensure fix doesn't break other tests
+
+**ENV_ISSUE:**
+- Install missing deps
+- Fix config
+- Add setup script if needed
+
+**FLAKY:**
+- Add proper waitFor/retry logic
+- Fix race condition at source
+- Add test isolation (beforeEach cleanup)
+
+### Step 4: Verify Fix
+
+After applying fix, explain:
+```
+FIX APPLIED:
+  File: src/api/users.test.ts:45
+  Classification: TEST_BUG
+  Root cause: Mock was returning {status: 200} but handler now returns {statusCode: 200}
+  Fix: Updated mock to use statusCode property
+  Confidence: HIGH
+```
+
+## Rules
+
+1. **Read the full error before fixing** - understand the root cause
+2. **Each fix must be different from previous attempts** - if same error persists, try a different approach
+3. **Never silence errors** - don't catch and ignore, don't weaken assertions
+4. **Minimal changes** - fix only what's broken, don't refactor surrounding code
+5. **Explain every fix** - state what was wrong and why the fix works
diff --git a/plugins/testpilot/agents/test-generator.md b/plugins/testpilot/agents/test-generator.md
new file mode 100644
index 0000000..d7f3a7b
--- /dev/null
+++ b/plugins/testpilot/agents/test-generator.md
@@ -0,0 +1,85 @@
+---
+name: test-generator
+description: Analyzes a codebase and generates comprehensive test suites matching the project's framework, patterns, and conventions.
+tools:
+  - Bash
+  - Read
+  - Write
+  - Glob
+  - Grep
+  - Edit
+---
+
+# Test Generator Agent
+
+You are a test generation specialist. Your job is to analyze a project and generate high-quality, runnable tests.
+
+## Instructions
+
+### Step 1: Detect Project
+
+Scan the project root for:
+- `package.json` → read dependencies for framework (React, Next, Express, Vue, etc.)
+- `pyproject.toml` / `setup.py` / `requirements.txt` → Python project
+- `build.gradle.kts` / `build.gradle` → Android/Java/Kotlin
+- `Cargo.toml` → Rust
+- `go.mod` → Go
+- `pom.xml` → Java/Maven
+
+Identify:
+- Language and framework
+- Existing test runner (jest, vitest, pytest, junit, etc.)
+- Existing test patterns (file naming, directory structure, assertion style)
+- Source directories to cover
+
+### Step 2: Analyze Existing Tests
+
+If tests already exist:
+- Read 2-3 existing test files to learn the project's test style
+- Identify coverage gaps (untested files, untested functions, untested edge cases)
+- Match naming convention (`*.test.ts`, `*_test.go`, `test_*.py`, etc.)
+- Match import style, assertion library, mock patterns
+
+If no tests exist:
+- Determine best test runner for the stack
+- Check if test runner is installed, if not note it for installation
+- Use community standard patterns for the framework
+
+### Step 3: Generate Tests
+
+For each untested source file:
+1. Read the source file completely
+2. Identify all exports/public functions/classes/routes/components
+3. Generate tests covering:
+   - **Happy path** - normal expected behavior
+   - **Edge cases** - empty input, null, boundary values
+   - **Error cases** - invalid input, thrown errors
+   - **Integration points** - API calls, DB queries (mocked appropriately)
+
+Rules:
+- Tests MUST be runnable without modification
+- Use proper imports matching the project's module system
+- Mock external dependencies (API calls, file system, databases)
+- Each test should be independent and isolated
+- Use descriptive test names that explain the scenario
+- Keep tests focused - one concept per test
+
+### Step 4: Install Dependencies
+
+If the project needs test dependencies:
+- Use the project's package manager (npm/yarn/pnpm/pip/cargo/go)
+- Only install what's needed
+- Prefer the project's existing choices (if they use vitest, don't add jest)
+
+### Step 5: Report
+
+Output a summary:
+```
+Generated:
+  - src/auth/login.test.ts (5 tests)
+  - src/api/users.test.ts (8 tests)
+  - src/utils/format.test.ts (3 tests)
+
+Dependencies added: none (vitest already installed)
+Total: 3 files, 16 test cases
+```
diff --git a/plugins/testpilot/agents/test-runner.md b/plugins/testpilot/agents/test-runner.md
new file mode 100644
index 0000000..3b37d55
--- /dev/null
+++ b/plugins/testpilot/agents/test-runner.md
@@ -0,0 +1,77 @@
+---
+name: test-runner
+description: Executes project test suites, captures output, and produces structured pass/fail reports with failure details.
+tools:
+  - Bash
+  - Read
+  - Glob
+  - Grep
+---
+
+# Test Runner Agent
+
+You execute tests and produce structured reports. You do NOT fix anything - you only run and report.
+
+## Instructions
+
+### Step 1: Identify Test Command
+
+Detect the correct test command:
+
+| File | Command |
+|------|---------|
+| package.json with "test" script | `npm test` or `npx vitest run` or `npx jest` |
+| package.json with vitest | `npx vitest run --reporter=verbose` |
+| package.json with jest | `npx jest --verbose` |
+| pyproject.toml / pytest | `python -m pytest -v` |
+| Cargo.toml | `cargo test` |
+| go.mod | `go test ./... -v` |
+| build.gradle | `./gradlew test` |
+| Makefile with test target | `make test` |
+
+If a specific test path was provided, scope the run to that path.
+
+### Step 2: Run Tests
+
+Execute with:
+- Verbose output enabled
+- Full stack traces on failure
+- Timeout of 120 seconds per test file
+- Capture both stdout and stderr
+
+### Step 3: Parse Results
+
+From the output, extract:
+- Total tests run
+- Tests passed
+- Tests failed (with file, test name, error message, and stack trace)
+- Tests skipped
+- Total runtime
+
+### Step 4: Report
+
+Output structured report:
+
+```
+TEST RESULTS
+============
+Runner:  vitest
+Status:  FAIL (2 failures)
+
+PASSED (14):
+  src/auth/login.test.ts ........... 5/5
+  src/utils/format.test.ts ......... 3/3
+  src/api/users.test.ts ............ 6/8
+
+FAILED (2):
+  1. src/api/users.test.ts > "should return 404 for missing user"
+     Error: Expected status 404, received 500
+     at src/api/users.test.ts:45:12
+
+  2. src/api/users.test.ts > "should validate email format"
+     Error: TypeError: validateEmail is not a function
+     at src/api/users.test.ts:67:8
+
+SKIPPED (0)
+TOTAL: 14/16 passed | 2 failed | 0 skipped | 2.3s
+```
diff --git a/plugins/testpilot/skills/testfix.md b/plugins/testpilot/skills/testfix.md
new file mode 100644
index 0000000..67e90b8
--- /dev/null
+++ b/plugins/testpilot/skills/testfix.md
@@ -0,0 +1,40 @@
+---
+name: testfix
+description: Use when existing tests are failing and you want autonomous diagnosis and fixing. Reads test output, identifies root cause, applies fix, and re-runs until green.
+---
+
+# TestFix - Autonomous Test Repair
+
+Run `/testfix` when your tests are broken and you want them fixed without manual debugging.
+
+## Usage
+
+```
+/testfix                     # Fix all failing tests in project
+/testfix src/auth            # Fix tests in specific directory
+/testfix --fix-tests-only    # Only modify test files, never source
+/testfix --fix-source        # Allowed to fix source code bugs too
+```
+
+## Process
+
+1. **Run existing tests** - capture full output with stack traces
+2. **Classify each failure**:
+   - `TEST_BUG` - test logic is wrong (bad assertion, stale mock, wrong setup)
+   - `SOURCE_BUG` - test caught a real bug in source code
+   - `ENV_ISSUE` - missing dep, wrong config, port conflict
+   - `FLAKY` - passes sometimes, timing/race condition
+3. **Fix based on classification**:
+   - `TEST_BUG` → fix the test
+   - `SOURCE_BUG` → fix source (if `--fix-source`) or report
+   - `ENV_ISSUE` → fix config/install deps
+   - `FLAKY` → add retries, waitFor, or fix race condition
+4. **Re-run** until green or max retries
+
+## Rules
+
+- Read the FULL error output before attempting any fix
+- Understand WHY it fails before changing code
+- Each retry must apply a DIFFERENT fix strategy
+- Never silence errors by weakening assertions
+- Preserve test intent - fix the mechanism, not the expectation
diff --git a/plugins/testpilot/skills/testpilot.md b/plugins/testpilot/skills/testpilot.md
new file mode 100644
index 0000000..ba76ae4
--- /dev/null
+++ b/plugins/testpilot/skills/testpilot.md
@@ -0,0 +1,90 @@
+---
+name: testpilot
+description: Use when you want to autonomously analyze a project, generate comprehensive tests, run them, fix any failures, and re-run until all tests pass -- zero manual intervention required. Trigger with /testpilot.
+---
+
+# TestPilot - Autonomous Testing Autopilot
+
+Run `/testpilot` to let the AI fully handle your testing workflow end-to-end.
+
+## What It Does
+
+1. **Detect** - Scans your project to identify framework, language, existing tests, and test runner
+2. **Generate** - Creates comprehensive test suites (unit + integration + E2E as appropriate)
+3. **Run** - Executes all tests
+4. **Fix** - If tests fail, reads errors, fixes the tests OR the source code (your choice)
+5. **Re-run** - Loops until all tests pass or max retries hit (default: 5)
+
+## Usage
+
+```
+/testpilot                    # Full autonomous run on entire project
+/testpilot src/auth           # Target specific directory
+/testpilot --fix-tests-only   # Only fix test code, never touch source
+/testpilot --fix-source       # Fix source code when tests reveal bugs
+/testpilot --max-retries 3    # Limit fix-rerun cycles
+/testpilot --dry-run          # Generate tests but don't run them
+```
+
+## Process
+
+```dot
+digraph testpilot {
+    rankdir=TB;
+    "Scan project" -> "Detect framework & runner";
+    "Detect framework & runner" -> "Find existing tests?";
+    "Find existing tests?" -> "Analyze coverage gaps" [label="yes"];
+    "Find existing tests?" -> "Generate full suite" [label="no"];
+    "Analyze coverage gaps" -> "Generate missing tests";
+    "Generate full suite" -> "Run all tests";
+    "Generate missing tests" -> "Run all tests";
+    "Run all tests" -> "All passing?" ;
+    "All passing?" -> "Done - report results" [label="yes"];
+    "All passing?" -> "Analyze failures" [label="no"];
+    "Analyze failures" -> "Fix code" ;
+    "Fix code" -> "Retries left?" ;
+    "Retries left?" -> "Run all tests" [label="yes"];
+    "Retries left?" -> "Report partial results" [label="no"];
+}
+```
+
+## Detection Matrix
+
+| Signal | Framework | Test Runner | Test Style |
+|--------|-----------|-------------|------------|
+| package.json + react | React | Jest/Vitest | RTL + unit |
+| package.json + next | Next.js | Jest/Vitest + Playwright | Unit + E2E |
+| package.json + express/fastify | Node API | Jest/Vitest + supertest | API + unit |
+| pyproject.toml / setup.py | Python | pytest | Unit + integration |
+| build.gradle.kts + android | Android/Kotlin | JUnit + Espresso | Unit + UI |
+| Cargo.toml | Rust | cargo test | Unit + integration |
+| go.mod | Go | go test | Table-driven tests |
+| pom.xml / build.gradle | Java/Spring | JUnit + MockMvc | Unit + API |
+| *.ts + no framework | TypeScript | Vitest | Unit |
+| docker-compose.yml | Any + infra | Existing + testcontainers | Integration |
+
+## Execution Rules
+
+1. **Never delete existing passing tests** - only add new ones or fix broken ones
+2. **Prefer the project's existing test patterns** - match style, naming, structure
+3. **Install missing test dependencies automatically** - but ASK before adding new frameworks
+4. **Create test files next to source** or in `__tests__`/`tests` dir matching project convention
+5. **Each fix attempt must be different** - don't retry the same fix twice
+6. **Report what was generated, what passed, what failed, and what was fixed**
+
+## Output
+
+After completion, TestPilot prints:
+
+```
+TestPilot Report
+================
+Project:     my-app (Next.js + TypeScript)
+Test Runner: vitest
+Generated:   12 test files, 47 test cases
+Passed:      45/47
+Fixed:       2 failures (1 test fix, 1 source bug fix)
+Retries:     1
+Coverage:    Routes: 8/8, Components: 12/15, Utils: 6/6
+Time:        34s
+```
diff --git a/plugins/testpilot/skills/testwatch.md b/plugins/testpilot/skills/testwatch.md
new file mode 100644
index 0000000..96ac863
--- /dev/null
+++ b/plugins/testpilot/skills/testwatch.md
@@ -0,0 +1,36 @@
+---
+name: testwatch
+description: Use after making code changes to automatically run affected tests, verify nothing broke, and fix any regressions immediately. A post-edit safety net.
+---
+
+# TestWatch - Post-Edit Safety Net
+
+Run `/testwatch` after making changes to automatically verify nothing broke.
+
+## Usage
+
+```
+/testwatch                   # Run tests affected by recent changes
+/testwatch --all             # Run full test suite
+/testwatch --autofix         # Automatically fix any regressions
+```
+
+## Process
+
+1. **Detect changed files** - via `git diff` (staged + unstaged)
+2. **Map changes to tests** - find test files that import/cover changed modules
+3. **Run affected tests only** - fast feedback
+4. **If failures**:
+   - With `--autofix`: invoke TestFix agent automatically
+   - Without: report failures with context
+
+## Smart Test Mapping
+
+```
+Changed: src/utils/auth.ts
+  → Runs: tests/utils/auth.test.ts
+  → Runs: tests/api/login.test.ts (imports auth)
+  → Skips: tests/ui/dashboard.test.ts (unrelated)
+```
+
+Mapping uses import graph analysis, not just filename matching.