Commit 3b0296e

Merge pull request #41 from cursor/add-agent-compatibility-plugin
Add agent-compatibility plugin
2 parents (9c39b57 + f006be7), commit 3b0296e

12 files changed: 395 additions & 7 deletions


.cursor-plugin/marketplace.json

Lines changed: 5 additions & 0 deletions
```diff
@@ -32,6 +32,11 @@
       "name": "ralph-loop",
       "source": "ralph-loop",
       "description": "Iterative self-referential AI loops using the Ralph Wiggum technique."
+    },
+    {
+      "name": "agent-compatibility",
+      "source": "agent-compatibility",
+      "description": "CLI-backed repo compatibility scans plus Cursor agents that audit startup, validation, and docs against reality."
     }
   ]
 }
```

README.md

Lines changed: 8 additions & 7 deletions
```diff
@@ -4,13 +4,14 @@ Official Cursor plugins for popular developer tools, frameworks, and SaaS produc
 
 ## Plugins
 
-| Plugin | Category | Description |
-|:-------|:---------|:------------|
-| [Teaching](teaching/) | Utilities | Skill maps, practice plans, and feedback loops |
-| [Continual Learning](continual-learning/) | Developer Tools | Incremental transcript-driven AGENTS.md memory updates with high-signal bullet points |
-| [Cursor Team Kit](cursor-team-kit/) | Developer Tools | Internal-style workflows for CI, code review, shipping, and testing |
-| [Create Plugin](create-plugin/) | Developer Tools | Meta workflows for creating Cursor plugins with scaffolding and submission checks |
-| [Ralph Loop](ralph-loop/) | Developer Tools | Iterative self-referential AI loops using the Ralph Wiggum technique |
+| `name` | Plugin | Author | Category | `description` (from marketplace) |
+|:-------|:-------|:-------|:---------|:-------------------------------------|
+| `continual-learning` | [Continual Learning](continual-learning/) | Cursor | Developer Tools | Incremental transcript-driven memory updates for AGENTS.md using high-signal bullet points only. |
+| `cursor-team-kit` | [Cursor Team Kit](cursor-team-kit/) | Cursor | Developer Tools | Internal team workflows used by Cursor developers for CI, code review, and shipping. |
+| `create-plugin` | [Create Plugin](create-plugin/) | Cursor | Developer Tools | Scaffold and validate new Cursor plugins. |
+| `agent-compatibility` | [Agent Compatibility](agent-compatibility/) | Cursor | Developer Tools | CLI-backed repo compatibility scans plus Cursor agents that audit startup, validation, and docs against reality. |
+
+Author values match each plugin’s `plugin.json` `author.name` (Cursor lists `plugins@cursor.com` in the manifest).
 
 ## Repository structure
```
Lines changed: 32 additions & 0 deletions
```json
{
  "name": "agent-compatibility",
  "displayName": "Agent Compatibility",
  "version": "1.0.0",
  "description": "CLI-backed repo compatibility scans plus Cursor agents that audit startup, validation, and docs against reality.",
  "author": {
    "name": "Cursor",
    "email": "plugins@cursor.com"
  },
  "homepage": "https://github.com/cursor/plugins/tree/main/agent-compatibility",
  "repository": "https://github.com/cursor/plugins",
  "license": "MIT",
  "logo": "assets/avatar.png",
  "keywords": [
    "agent-compatibility",
    "agents",
    "compatibility",
    "cursor-plugin",
    "repo-audit",
    "startup",
    "validation"
  ],
  "category": "developer-tools",
  "tags": [
    "agents",
    "compatibility",
    "quality",
    "workflow"
  ],
  "skills": "./skills/",
  "agents": "./agents/"
}
```

agent-compatibility/CHANGELOG.md

Lines changed: 11 additions & 0 deletions
```md
# Changelog

All notable changes to this plugin will be documented here.

## Unreleased

- Renamed the full-pass skill to `check-agent-compatibility`.
- Renamed `deterministic-scan-review` to `compatibility-scan-review`.
- Renamed `docs-reality-review` to `docs-reliability-review`.
- Clarified the score model so `Agent Compatibility Score` is the final blended score and `Deterministic Compatibility Score` is the raw CLI score.
- Tightened the README, marketplace copy, and agent wording for public release.
```

agent-compatibility/LICENSE

Lines changed: 21 additions & 0 deletions
```text
MIT License

Copyright (c) 2026 Cursor

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

agent-compatibility/README.md

Lines changed: 87 additions & 0 deletions
````markdown
# Agent Compatibility

Cursor plugin for checking how well a repo holds up under agent workflows. It pairs the published `agent-compatibility` CLI with focused reviews for startup, validation, and docs reliability.

By default, the full pass returns one overall score and one short list of the highest-leverage fixes. If the user wants the full breakdown, the agents can expose the component scores and the reasoning behind them.

## What it includes

- `check-agent-compatibility`: full compatibility pass
- `compatibility-scan-review`: raw CLI-backed scan
- `startup-review`: cold-start and bootstrap review
- `validation-review`: small-change verification review
- `docs-reliability-review`: docs reliability review

## Score model

- `Agent Compatibility Score`: final blended score shown to the user
- `Deterministic Compatibility Score`: raw score from the published CLI
- `Startup Compatibility Score`: how much guesswork it takes to boot the repo
- `Validation Loop Score`: how practical it is to verify a small change
- `Docs Reliability Score`: how closely the docs match the real setup path

The final score blends the deterministic scan with the workflow checks:

```text
Agent Compatibility Score = round((deterministic * 0.7) + (workflow * 0.3))
```

The CLI also reports an accelerator layer for committed agent tooling. That extra context informs recommendations, but it does not inflate the deterministic compatibility score itself.

## How to use it

Use `check-agent-compatibility` when you want the full pass. That skill fans out to the four review agents above, then returns a compact result:

```md
## Agent Compatibility Score: 72/100

Top fixes
- First issue
- Second issue
```

Ask for a breakdown if you want the component scores or the weighting.

## CLI notes

The plugin does not bundle the scanner. It runs the published npm package when needed.

Default scan (compact terminal dashboard):

```bash
npx -y agent-compatibility@latest .
```

JSON output:

```bash
npx -y agent-compatibility@latest --json .
```

Markdown output:

```bash
npx -y agent-compatibility@latest --md .
```

Plain text output:

```bash
npx -y agent-compatibility@latest --text .
```

Config override for ignored paths or weight overrides:

```bash
npx -y agent-compatibility@latest . --config ./agent-compatibility.config.json
```

The scanner is heuristic. It scores repo signals and surfaces likely friction, but it is not a full quality verdict on the codebase.

## Local install

If you want to use this plugin directly, symlink this directory into:

```bash
~/.cursor/plugins/local/agent-compatibility
```
````
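The 70/30 blend in the README's score model can be sketched in a few lines. This is a minimal illustration, with one stated assumption: the three workflow review scores average equally into the single `workflow` term, which the README does not specify.

```python
# Sketch of the 70/30 blend from the README's score model.
# Assumption: startup, validation, and docs scores average equally
# into the single "workflow" term; the README only defines the blend.

def blend_scores(deterministic: int, startup: int, validation: int, docs: int) -> int:
    workflow = (startup + validation + docs) / 3
    return round(deterministic * 0.7 + workflow * 0.3)

print(blend_scores(60, 93, 90, 87))  # → 69
```

With a deterministic score of 60 and workflow scores averaging 90, the blend lands at 69, showing how heavily the CLI scan dominates the final number.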
Lines changed: 40 additions & 0 deletions
```md
---
name: compatibility-scan-review
description: Run the agent-compatibility CLI and return the raw repository score with its main problems
model: fast
readonly: true
---

# Compatibility scan review

Runs the published scanner and reports the raw repository score.

## Trigger

Use when the task is specifically to run the published `agent-compatibility` scanner and report the raw compatibility result.

## Workflow

1. Try the published scanner first with `npx -y agent-compatibility@latest --json "<path>"`.
2. If you are clearly working inside the scanner source repo and the published package path fails for an environment reason, fall back to the local scanner entrypoint.
3. Only say the scanner is unavailable after you have actually tried the published package, and the local fallback when it is clearly available.
4. Prefer JSON when you need structured reasoning. Prefer Markdown when the user wants a direct report.
5. Keep the scanner's real score, summary direction, and problem ordering.
6. Do not bundle in startup, validation, or docs-reliability judgments. Those belong to separate agents.

## Output

Reply in **plain text only** (no markdown fences, no `#` headings, no emphasis syntax). Use this layout:

First line: `Deterministic Compatibility Score: <score>/100`

Then a short summary paragraph.

Then the line `Problems` followed by one bullet per line using `- `.

- Use the compatibility scan's real score.
- Keep accelerator context separate from the deterministic compatibility score itself.
- Include both rubric issues and accelerator issues when they matter.
- If there are no meaningful problems, write `- None.` under Problems.
- Do not treat scanner availability as a defect in the target repo.
- If the scanner truly cannot be run, say that the deterministic scan is unavailable because of the tool environment, not because the repo lacks a compatibility CLI.
```
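The plain-text report shape required above (and reused by the other review agents) can be illustrated with a small formatter. The helper is hypothetical, for illustration only; it is not part of the plugin or the CLI.

```python
# Hypothetical formatter for the plain-text report shape the review
# agents must emit: score line, summary paragraph, then Problems bullets.

def format_report(label, score, summary, problems):
    lines = [f"{label}: {score}/100", "", summary, "", "Problems"]
    if problems:
        lines += [f"- {p}" for p in problems]
    else:
        lines.append("- None.")  # required placeholder when nothing is wrong
    return "\n".join(lines)

print(format_report("Deterministic Compatibility Score", 72,
                    "Scans cleanly; a few setup gaps remain.", []))
```

Note the output is deliberately plain text: no fences, headings, or emphasis, so downstream tooling and the parent skill can parse it uniformly.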
Lines changed: 44 additions & 0 deletions
```md
---
name: docs-reliability-review
description: Check whether the documented setup and run paths reliably lead to the real working path
model: fast
readonly: true
---

# Docs reliability review

Follows the written setup path and reports where the docs drift from reality.

## Trigger

Use when the user wants to know whether the repo documentation is actually trustworthy for an agent starting fresh.

## Workflow

1. If a compatibility scan result is already available from the parent task, use it as context. Otherwise run the compatibility scan once.
2. Read the obvious documentation surfaces: `README`, setup docs, env docs, and contribution or agent guidance.
3. Follow the documented setup and run path as literally as practical.
4. Note where docs are accurate, stale, incomplete, or misleading.
5. Pick a specific score instead of a round bucket. Start from these anchors and move a few points if the evidence clearly warrants it:
   - around `93/100` if the docs lead to the working path with little or no correction.
   - around `84/100` if the docs drift in places but an agent can still get to the right setup or run path without much guesswork.
   - around `68/100` if the docs are stale enough that the agent has to reconstruct important steps from the tree or CI.
   - around `27/100` if the docs point the agent down the wrong path or omit key steps you need to proceed.
   - around `12/100` if the real path depends on private docs or internal context that is not available in the repo.
6. Prefer a specific score such as `81`, `85`, or `92` over a multiple of ten when that is the more honest read.

## Output

Reply in **plain text only** (no markdown fences, no `#` headings, no emphasis syntax). Use this layout:

First line: `Docs Reliability Score: <score>/100`

Then a short summary paragraph.

Then the line `Problems` followed by one bullet per line using `- `.

- Base the score on what happened when you followed the docs.
- Build Problems from real mismatches, omissions, or misleading guidance.
- If the repo is blocked on secrets or infrastructure, say so plainly and still use the same output shape.
- Minor drift or stale references should not drag a good repo into the mid-60s if the real path is still easy to recover.
- Score the damage from the drift, not the mere existence of drift.
```
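The anchor-then-adjust scoring the review agents describe can be sketched as a lookup plus a small bounded adjustment. The anchor values come from the rubric above; the ±5 clamp on "move a few points" is an assumption, not part of the spec.

```python
# Sketch of anchor-based scoring for the docs reliability review.
# Anchors come from the agent rubric; the +/-5 clamp is an assumption.

DOCS_ANCHORS = {
    "accurate": 93,         # docs lead to the working path with little correction
    "minor_drift": 84,      # some drift, but the real path is recoverable
    "stale": 68,            # key steps must be reconstructed from tree or CI
    "misleading": 27,       # docs point down the wrong path or omit key steps
    "private_context": 12,  # real path depends on unavailable internal docs
}

def anchored_score(anchor: str, adjustment: int = 0) -> int:
    adjustment = max(-5, min(5, adjustment))  # "move a few points"
    return max(0, min(100, DOCS_ANCHORS[anchor] + adjustment))

print(anchored_score("minor_drift", -3))  # → 81
```

This also makes the "prefer a specific score" guidance concrete: starting at 84 and adjusting by −3 yields 81 rather than a round bucket.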
Lines changed: 51 additions & 0 deletions
```md
---
name: startup-review
description: Try to bootstrap and start a repository like a cold agent, then report where the path breaks down
model: fast
readonly: true
---

# Startup review

Tries the cold-start path and reports how much work it takes to get the repo running.

## Trigger

Use when the user wants to know whether a repo is actually easy to start, not just whether it claims to be.

## Workflow

1. If a compatibility scan result is already available from the parent task, use it as context. Otherwise run the compatibility scan once.
2. Read the obvious startup surfaces: `README`, scripts, toolchain files, env examples, and workflow docs.
3. Pick the most likely bootstrap path and startup command.
4. Try to reach first success inside a fixed time budget.
5. If the first path fails, allow a small amount of recovery and note what you had to infer.
6. Do not infer a startup failure from a lockfile, a bound port, or an existing repo-local process by itself.
7. Only call startup blocked or failed when your own startup attempt fails, or when the documented startup path cannot be completed within the budget.
8. Pick a specific score instead of a round bucket. Start from these anchors and move a few points if the evidence clearly warrants it:
   - around `93/100` if the main startup path works inside the time budget, even if it needs ordinary local prerequisites such as Docker or a database.
   - around `84/100` if the repo starts, but only after some digging, a recovery step, or heavier setup than the docs suggest.
   - around `68/100` if a startup path probably exists but stays too manual, too ambiguous, or too expensive for normal agent use.
   - around `27/100` if you cannot get a credible startup path working from the repo and docs you have.
   - around `12/100` if the path is blocked on secrets, accounts, or infrastructure you cannot reasonably access.
9. Prefer a specific score such as `82`, `85`, or `91` over a multiple of ten when that is the more honest read.
10. Return the result in the same plain-text report shape as the deterministic scan.

## Output

Reply in **plain text only** (no markdown fences, no `#` headings, no emphasis syntax). Use this layout:

First line: `Startup Compatibility Score: <score>/100`

Then a short summary paragraph.

Then the line `Problems` followed by one bullet per line using `- `.

- Base the score on what happened when you actually tried to start the repo.
- Build Problems from the real startup friction you observed.
- If the repo is blocked on secrets, accounts, or external infra, say that plainly and still use the same output shape.
- Do not assume a Next.js lockfile or a port that does not answer HTTP immediately is a repo problem.
- Do not require an HTTP response unless the documented startup path clearly implies one and you actually started that path yourself.
- If the environment starts successfully, treat that as a strong result. Record the friction, but do not score it like a near-failure.
- Treat Docker, local services, and other standard dev prerequisites as friction, not failure.
- Error-message quality is secondary here unless it actually prevents startup or recovery.
```
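The "fixed time budget" in the workflow above can be approximated with a subprocess timeout. A minimal sketch under stated assumptions: the commands are placeholders (a real run would use the repo's documented bootstrap command), and a long-running dev server that times out with healthy output may still count as started in practice.

```python
import subprocess

# Sketch of "try to reach first success inside a fixed time budget".
# The command is a placeholder, not a real bootstrap path.

def try_startup(cmd, budget_s=5):
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=budget_s)
    except subprocess.TimeoutExpired:
        return "budget exceeded"  # did not finish inside the budget
    except FileNotFoundError:
        return "command missing"  # tool environment problem, not a repo defect
    return "started" if proc.returncode == 0 else "failed"

print(try_startup(["true"]))   # → started (on POSIX systems)
print(try_startup(["false"]))  # → failed
```

Distinguishing "command missing" from "failed" mirrors the rule above: scanner or tool unavailability is an environment issue and should not be scored as repo friction.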
Lines changed: 50 additions & 0 deletions
```md
---
name: validation-review
description: Assess whether an agent can verify a small change without guessing or running an unnecessarily heavy loop
model: fast
readonly: true
---

# Validation review

Checks whether an agent can verify a small change without falling back to a full-repo loop.

## Trigger

Use when the user wants to know whether an agent can safely verify its own work in a repo.

## Workflow

1. If a compatibility scan result is already available from the parent task, use it as context. Otherwise run the compatibility scan once.
2. Inspect the repo's declared test, lint, check, and typecheck paths.
3. Decide whether there is a practical scoped loop for a small change.
4. Try the most relevant validation path.
5. Judge whether the result is:
   - targeted
   - actionable
   - noisy
   - too expensive for normal iteration
6. Pick a specific score instead of a round bucket. Start from these anchors and move a few points if the evidence clearly warrants it:
   - around `93/100` if there is a repeatable validation path and it gives useful signal, even if it is broader than ideal.
   - around `84/100` if validation works but is heavier than it should be, repo-wide, or split across a few commands.
   - around `68/100` if a valid loop probably exists but picking the right one takes guesswork or the output is too noisy to trust quickly.
   - around `27/100` if there is no practical validation loop you can actually use.
   - around `12/100` if the loop is blocked on secrets, accounts, or infrastructure you cannot reasonably access.
7. Prefer a specific score such as `83`, `86`, or `91` over a multiple of ten when that is the more honest read.
8. Return the result in the same plain-text report shape as the deterministic scan.

## Output

Reply in **plain text only** (no markdown fences, no `#` headings, no emphasis syntax). Use this layout:

First line: `Validation Loop Score: <score>/100`

Then a short summary paragraph.

Then the line `Problems` followed by one bullet per line using `- `.

- Base the score on the loop you actually tried.
- Build Problems from the real validation friction you observed.
- Prefer concrete issues like "only full-repo test path exists" over generic quality advice.
- Do not score a repo in the mid-60s just because the loop is heavy. If an agent can still verify changes reliably, keep it in the good range and note the cost.
- Noisy logs and extra warnings matter only when they hide the actual validation result.
```
