From 7a0e4eb465d551ffaa71254177a72cbb14042e5f Mon Sep 17 00:00:00 2001
From: Paul Mulligan
Date: Thu, 2 Apr 2026 15:13:02 -0400
Subject: [PATCH 01/14] docs: add guides directory with index

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docs/guides/README.md | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100644 docs/guides/README.md

diff --git a/docs/guides/README.md b/docs/guides/README.md
new file mode 100644
index 0000000..0799c20
--- /dev/null
+++ b/docs/guides/README.md
@@ -0,0 +1,29 @@
+# Framework Guides
+
+Deep-dive documentation for Aurelius framework systems. These guides go beyond the [Quickstart](../onboarding/quickstart.md) and explain how each system works under the hood.
+
+## Core Pipeline Systems (P0)
+
+| Guide | What You Will Learn |
+|-------|-------------------|
+| [Design Token System](design-tokens.md) | Token structure, lockfile format, validation, sync strategy, drift detection |
+| [Visual QA Deep Dive](visual-qa.md) | How visual-diff.js works, sub-pixel detection, typography analysis, threshold tuning |
+| [Pipeline Caching & Performance](caching.md) | Incremental builds, cache invalidation, profiling, when to use --force |
+
+## Development Guides (P1)
+
+| Guide | What You Will Learn |
+|-------|-------------------|
+| [Hook System](hooks.md) | How hooks fire, execution order, creating custom hooks |
+| [Error Recovery](error-recovery.md) | What to do when a pipeline phase fails, how to resume, common failure modes |
+| [Agent Creation](agent-creation.md) | How to create a custom agent, required YAML frontmatter, tool declarations |
+
+## Framework-Specific Guides (P2)
+
+| Guide | What You Will Learn |
+|-------|-------------------|
+| [Vue Converter Workflow](vue-converter.md) | Vue 3 pipeline specifics, Composition API patterns |
+| [Svelte Converter Workflow](svelte-converter.md) | SvelteKit pipeline specifics, store patterns |
+| [React Native Converter Workflow](react-native-converter.md) | Expo pipeline specifics, NativeWind setup |
+| [Chrome Extension Pipeline](chrome-extension.md) | Manifest v3, service worker testing, extension E2E |
+| [PWA Pipeline](pwa.md) | Service worker lifecycle, offline testing, manifest validation |

From 485f16a9e327718d244c4b8a1040d64a3938e81e Mon Sep 17 00:00:00 2001
From: Paul Mulligan
Date: Thu, 2 Apr 2026 15:15:18 -0400
Subject: [PATCH 02/14] docs: add design token system guide

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docs/guides/design-tokens.md | 202 +++++++++++++++++++++++++++++++++++
 1 file changed, 202 insertions(+)
 create mode 100644 docs/guides/design-tokens.md

diff --git a/docs/guides/design-tokens.md b/docs/guides/design-tokens.md
new file mode 100644
index 0000000..299ae21
--- /dev/null
+++ b/docs/guides/design-tokens.md
@@ -0,0 +1,202 @@
+# Design Token System
+
+Design tokens are the single source of truth for colors, spacing, typography, and other visual properties extracted from Figma, Canva, or screenshot sources. Aurelius uses a lockfile-based system to ensure every component references the same values and to catch drift before it ships.
+
+## Token Structure
+
+All tokens live in `design-tokens.lock.json`. The top-level keys map directly to Tailwind config sections:
+
+```json
+{
+  "colors": {
+    "primary": "#3B82F6",
+    "secondary": "#10B981",
+    "background": "#FFFFFF",
+    "foreground": "#111827"
+  },
+  "spacing": {
+    "sm": "0.5rem",
+    "md": "1rem",
+    "lg": "1.5rem",
+    "xl": "2rem"
+  },
+  "typography": {
+    "heading": { "fontFamily": "Inter", "fontWeight": 700 },
+    "body": { "fontFamily": "Inter", "fontWeight": 400 }
+  },
+  "borderRadius": {
+    "sm": "0.25rem",
+    "md": "0.5rem",
+    "lg": "1rem"
+  },
+  "shadows": {
+    "sm": "0 1px 2px rgba(0,0,0,0.05)",
+    "md": "0 4px 6px rgba(0,0,0,0.1)"
+  },
+  "textContent": {
+    "heroHeading": "Build faster with Aurelius",
+    "ctaButton": "Get Started"
+  },
+  "metadata": {
+    "source": "figma",
+    "fileKey": "abc123",
+    "exportedAt": "2026-04-01T12:00:00Z"
+  }
+}
+```
+
+The lockfile also supports a nested `designTokens` wrapper (e.g., `lock.designTokens.colors`). Both scripts handle either format automatically.
+
+## Lockfile Locations
+
+The scripts search for the lockfile in this order:
+
+1. `src/styles/design-tokens.lock.json`
+2. `design-tokens.lock.json` (project root)
+
+First found wins. If neither exists, token drift checks are skipped and `sync-tokens.sh` exits with code 2.
+
+## How Tokens Flow
+
+```
+Design Source (Figma / Canva / Screenshot)
+        │
+        ▼
+  Phase 2 — Token Lock
+  Extracts tokens → design-tokens.lock.json
+        │
+        ├──────────────────────┐
+        ▼                      ▼
+  tailwind.config.ts      tokens.css
+  theme.extend.colors     --primary: #3B82F6;
+  theme.extend.spacing    --spacing-sm: 0.5rem;
+        │                      │
+        ▼                      ▼
+  Components use          Components use
+  Tailwind classes        CSS custom properties
+  (bg-primary, p-md)      (var(--primary))
+```
+
+Phase 2 of any pipeline (`/build-from-figma`, `/build-from-canva`, `/build-from-screenshot`) creates or updates the lockfile. From there, values are mapped into `tailwind.config.ts` (`theme.extend.colors`, `theme.extend.spacing`, etc.) and into CSS custom properties in `tokens.css`. Components then consume tokens through Tailwind utility classes or CSS variables — never through hardcoded values.
+
+## Token Validation (`verify-tokens.sh`)
+
+Run the validation script to catch hardcoded values that bypass the token system:
+
+```bash
+./scripts/verify-tokens.sh
+```
+
+The script performs five checks against your `src/` directory:
+
+| Check | What It Catches | Pattern |
+|-------|----------------|---------|
+| 1. Hardcoded hex colors in TSX | `color="#3B82F6"` in components | `#[0-9a-fA-F]{3,8}` in `*.tsx` |
+| 2. Arbitrary Tailwind values | `w-[200px]`, `p-[24px]` | `(w\|h\|p\|m\|gap\|...)-\[[0-9]+px\]` in `*.tsx` |
+| 3. Inline style attributes | `style={{ color: 'red' }}` | `style=\{\{` in `*.tsx` |
+| 4. Text content drift | Lockfile text missing from source | Compares `textContent` entries against `*.tsx` |
+| 5. Hardcoded colors in CSS | `color: #3B82F6;` outside token files | `#[0-9a-fA-F]{3,8}` in `*.css` (excludes `tokens.css` and `globals.css`) |
+
+**Exit codes:** 0 = all checks pass, 1 = violations found.
+
+Example output:
+
+```
+=== Token Verification ===
+
+▸ Checking for hardcoded hex colors in .tsx files...
+  ✗ Hardcoded hex colors found:
+    src/components/Hero.tsx:12:
+
+▸ Checking for arbitrary Tailwind values (w-[...], h-[...], p-[...], etc.)...
+  ✓ No arbitrary pixel values
+
+▸ Checking for inline style={{}} attributes...
+  ✓ No inline styles
+
+▸ Checking text content against lockfile (src/styles/design-tokens.lock.json)...
+  ✓ All lockfile text content found in source
+
+▸ Checking for hardcoded colors in CSS files (outside tokens)...
+  ✓ No hardcoded hex colors in CSS
+
+=== Summary ===
+✗ 1 violation(s) found
+  Fix violations or add '// token-ok' comment to intentional exceptions
+```
+
+## Token Drift Detection (`sync-tokens.sh`)
+
+While `verify-tokens.sh` checks components for hardcoded values, `sync-tokens.sh` checks whether the lockfile and your config files agree. Run it when you suspect tokens have drifted after a lockfile update:
+
+```bash
+# Report-only mode (default)
+./scripts/sync-tokens.sh
+
+# Sync source files to match lockfile
+./scripts/sync-tokens.sh --update
+
+# Machine-readable output
+./scripts/sync-tokens.sh --json
+```
+
+The script performs three checks:
+
+1. **Color tokens** — Every color in the lockfile must appear in `tailwind.config.ts`/`tailwind.config.js`.
+2. **Spacing tokens** — Every spacing value in the lockfile must appear in the Tailwind config.
+3. **CSS custom properties** — Every token category (`colors`, `spacing`, `typography`, `borderRadius`, `shadows`, `fontSizes`) is compared against `tokens.css`. Catches both value mismatches and missing properties.
+
+Tokens CSS is searched at `src/styles/tokens.css`, `src/tokens.css`, and `styles/tokens.css` (first found wins).
+
+**Exit codes:** 0 = no drift, 1 = drift detected, 2 = no lockfile found.
+
+**Drift classification:**
+
+| Drift Items | Status |
+|-------------|--------|
+| 0 | `no-drift` |
+| 1-3 | `minor-drift` |
+| 4+ | `major-drift` |
+
+In `--update` mode, the script rewrites CSS custom property values in `tokens.css` to match the lockfile. Tailwind config updates are flagged for manual review.
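The CSS custom property check and the drift classification table can be illustrated with a short JavaScript sketch. This is not the logic of `sync-tokens.sh` itself: the function names `parseCustomProperties`, `findColorDrift`, and `classifyDrift` are hypothetical, and the sketch assumes a color token `name` maps to a `--name` custom property, as the `--primary: #3B82F6;` line in the flow diagram suggests.

```javascript
// Hypothetical sketch of color drift detection; not the real sync-tokens.sh logic.

// Parse `--name: value;` declarations out of a tokens.css string.
function parseCustomProperties(cssText) {
  const props = new Map();
  const pattern = /(--[\w-]+)\s*:\s*([^;]+);/g;
  for (const match of cssText.matchAll(pattern)) {
    props.set(match[1], match[2].trim());
  }
  return props;
}

// Compare lockfile colors against CSS custom properties.
// Assumption: color token `primary` maps to `--primary`.
function findColorDrift(lockColors, cssText) {
  const props = parseCustomProperties(cssText);
  const drift = [];
  for (const [name, expected] of Object.entries(lockColors)) {
    const actual = props.get(`--${name}`);
    if (actual === undefined) {
      drift.push({ token: name, kind: 'missing', expected });
    } else if (actual.toLowerCase() !== expected.toLowerCase()) {
      drift.push({ token: name, kind: 'mismatch', expected, actual });
    }
  }
  return drift;
}

// Map a drift count to the status names from the classification table.
function classifyDrift(count) {
  if (count === 0) return 'no-drift';
  return count <= 3 ? 'minor-drift' : 'major-drift';
}

const lockColors = { primary: '#3B82F6', secondary: '#10B981', background: '#FFFFFF' };
const css = ':root { --primary: #3b82f6; --secondary: #0F766E; }';
const drift = findColorDrift(lockColors, css);
console.log(classifyDrift(drift.length), drift);
```

In this example `secondary` is a value mismatch and `background` is a missing property, so two drift items are found and the status falls in the `minor-drift` band.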
+
+## Pipeline Integration
+
+Tokens are checked and enforced at three points in the pipeline:
+
+- **Phase 0 (Token Sync)** — If a lockfile already exists, `sync-tokens.sh` runs automatically to detect drift before the build begins.
+- **Phase 2 (Token Lock)** — The `design-token-lock` skill extracts tokens from the design source and writes `design-tokens.lock.json`.
+- **Pre-commit hook** — `verify-tokens.sh` runs automatically on every `git commit`. If violations are found, the commit is blocked with a warning.
+
+This means tokens are validated on entry (Phase 0), created/updated mid-pipeline (Phase 2), and enforced on exit (pre-commit).
+
+## The `// token-ok` Escape Hatch
+
+Add `// token-ok` to any line to suppress `verify-tokens.sh` warnings for that line:
+
+```tsx
+// Third-party brand color — intentionally hardcoded
+ // token-ok
+
+// Tailwind arbitrary value needed for external embed
+ // token-ok
+```
+
+Use this sparingly. Valid use cases include third-party brand colors, embedded widget dimensions, and SVG path data. If you find yourself adding `// token-ok` to many lines, the lockfile is probably missing tokens — update it instead.
+
+## Troubleshooting
+
+**"No design-tokens.lock.json found"**
+The lockfile does not exist yet. Either run the pipeline (Phase 2 creates it) or create one manually following the structure above. Place it at `src/styles/design-tokens.lock.json` or in the project root.
+
+**"Token drift detected" / exit code 1 from sync-tokens.sh**
+The lockfile and your config files disagree. Run `./scripts/sync-tokens.sh --update` to sync CSS custom properties automatically, then manually review your Tailwind config for any remaining mismatches.
+
+**"Hardcoded hex color at src/components/Foo.tsx:42"**
+Replace the hardcoded value with a Tailwind token class (e.g., `bg-primary` instead of `bg-[#3B82F6]`) or a CSS variable (`var(--primary)`). If the value is intentionally hardcoded, add `// token-ok` to the line.
+
+**"Arbitrary pixel values found"**
+Replace arbitrary Tailwind values like `w-[200px]` with token-based classes like `w-48`. If the lockfile defines custom spacing (e.g., `"sidebar": "200px"`), map it in `tailwind.config.ts` under `theme.extend.spacing` and use `w-sidebar`.
+
+**"Missing text from lockfile"**
+A `textContent` entry in the lockfile does not appear anywhere in your `src/**/*.tsx` files. Either the text was changed in the component (update the lockfile) or a component was removed (remove the entry from `textContent`).
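To make the hardcoded-hex check and the `// token-ok` escape hatch concrete, here is a minimal JavaScript sketch. It is an illustration under assumptions, not the code of `verify-tokens.sh`: the function `findHexViolations` and the sample component are hypothetical, and only the `#[0-9a-fA-F]{3,8}` pattern and the per-line suppression rule come from this guide.

```javascript
// Hypothetical sketch of the hardcoded-hex check with `// token-ok` suppression.
const HEX_COLOR = /#[0-9a-fA-F]{3,8}/;

// Return one violation per line that contains a hex color and no `// token-ok` marker.
function findHexViolations(fileName, sourceText) {
  const violations = [];
  sourceText.split('\n').forEach((line, index) => {
    if (HEX_COLOR.test(line) && !line.includes('// token-ok')) {
      violations.push(`${fileName}:${index + 1}:${line.trim()}`);
    }
  });
  return violations;
}

// Made-up component source: line 2 is a violation, line 3 is an accepted exception.
const sample = [
  'export const Badge = () => (',
  '  <span className="bg-primary" color="#3B82F6" />',
  '  <span color="#1DA1F2" /> // token-ok',
  ');',
].join('\n');

const violations = findHexViolations('src/components/Badge.tsx', sample);
console.log(violations);
```

A wrapper script would then exit with 1 when the list is non-empty and 0 otherwise, matching the exit codes listed for `verify-tokens.sh`.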

From 83d2f3a595976670a463be329f39a579d5d88c56 Mon Sep 17 00:00:00 2001
From: Paul Mulligan
Date: Thu, 2 Apr 2026 15:18:01 -0400
Subject: [PATCH 03/14] docs: add visual QA deep dive guide

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docs/guides/visual-qa.md | 267 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 267 insertions(+)
 create mode 100644 docs/guides/visual-qa.md

diff --git a/docs/guides/visual-qa.md b/docs/guides/visual-qa.md
new file mode 100644
index 0000000..89cacea
--- /dev/null
+++ b/docs/guides/visual-qa.md
@@ -0,0 +1,267 @@
+# Visual QA Deep Dive
+
+The visual QA system provides pixel-level screenshot comparison using [pixelmatch](https://github.com/mapbox/pixelmatch). Instead of manual eyeballing, it programmatically compares actual and expected PNG files, outputs a diff image with magenta highlights, reports a mismatch percentage, and runs multi-layer analysis covering regions, sub-pixel artifacts, typography, and layout drift.
+
+## Usage
+
+### Single comparison
+
+```bash
+node scripts/visual-diff.js [options]
+```
+
+Options:
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--output ` | none | Path to save the diff image |
+| `--threshold ` | `0.02` | Maximum mismatch ratio to pass (0.02 = 2%) |
+| `--json` | off | Output JSON instead of human-readable text |
+| `--region-grid ` | `4` | Grid divisions for region analysis (4 = 4x4 = 16 regions) |
+| `--antialiasing ` | `true` | Ignore antialiasing differences |
+
+### Batch comparison
+
+```bash
+node scripts/visual-diff.js --batch [options]
+```
+
+Options:
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--output-dir ` | `.claude/visual-qa/diffs` | Directory for diff images |
+| `--threshold ` | `0.02` | Threshold applied to each file |
+
+Batch mode compares every `.png` in the actual directory against the file with the same name in the expected directory. Files present in only one directory are reported as SKIP or MISSING.
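The batch pairing rule can be sketched as follows. This is a hedged illustration rather than the implementation in `visual-diff.js`: `pairBatchFiles` is a hypothetical helper that works on plain filename lists, and it follows this guide's convention that a file without an expected counterpart is SKIP and an expected file without an actual counterpart is MISSING.

```javascript
// Hypothetical sketch of batch pairing; the real visual-diff.js may differ.
function pairBatchFiles(actualNames, expectedNames) {
  const isPng = (name) => name.endsWith('.png');
  const actual = actualNames.filter(isPng).sort();
  const expected = new Set(expectedNames.filter(isPng));

  // Every actual PNG is either compared or skipped for lack of a baseline.
  const plan = actual.map((name) =>
    expected.has(name)
      ? { file: name, action: 'compare' }
      : { file: name, action: 'SKIP', reason: `No matching expected file: ${name}` }
  );

  // Expected files with no actual counterpart are reported as MISSING.
  const actualSet = new Set(actual);
  for (const name of [...expected].sort()) {
    if (!actualSet.has(name)) plan.push({ file: name, action: 'MISSING' });
  }
  return plan;
}

const plan = pairBatchFiles(
  ['hero.png', 'nav.png', 'splash.png', 'notes.txt'],
  ['hero.png', 'nav.png', 'footer.png']
);
console.log(plan);
```

Non-PNG files such as `notes.txt` are ignored entirely, so they never appear in the plan.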
+
+## Exit Codes
+
+| Code | Meaning |
+|------|---------|
+| 0 | Pass -- mismatch is at or below the threshold |
+| 1 | Fail -- mismatch exceeds the threshold |
+| 2 | Error -- missing files, invalid arguments, or runtime failure |
+
+## How It Works
+
+1. Both PNGs are loaded via `pngjs`.
+2. If the images differ in size, both are scaled up to the larger dimensions. The extra area is filled with white.
+3. `pixelmatch` runs with a per-pixel color distance threshold of `0.1`. Diff pixels are colored magenta (`[255, 0, 255]`) by default.
+4. The overall mismatch percentage is calculated as `diffPixels / totalPixels`.
+5. Four analysis layers run on the result: region grid, sub-pixel classification, typography, and layout drift.
+
+## Region Grid Analysis
+
+The diff image is divided into a grid (default 4x4 = 16 regions). Each region receives a human-readable name (e.g., `top-left`, `upper-center-right`, `bottom-left`) and an individual mismatch percentage.
+
+Region status thresholds:
+
+| Status | Condition |
+|--------|-----------|
+| pass | mismatch <= 1% |
+| warn | mismatch 1--5% |
+| fail | mismatch > 5% |
+
+Regions are sorted by mismatch descending, so the worst areas appear first. Example output:
+
+```
+Problem Regions:
+  FAIL top-left — 8.42% diff
+  FAIL upper-center-left — 6.11% diff
+Warning Regions:
+  WARN bottom-right — 2.30% diff
+```
+
+Configure the grid size with `--region-grid` or `iterationLoop.regionGridSize` in `pipeline.config.json`.
+
+## Sub-Pixel Classification
+
+Not every diff pixel represents a real visual difference. The sub-pixel classifier uses connected-component labeling (4-connected flood fill) to group diff pixels into clusters, then classifies each cluster:
+
+- **Sub-pixel artifact**: cluster size <= `subPixelMaxClusterSize` (default 2 pixels)
+- **Real difference**: cluster size > `subPixelMaxClusterSize`
+
+The report includes total clusters, sub-pixel vs real counts, and the percentage of the diff that is sub-pixel noise. When more than 50% of diff pixels are sub-pixel artifacts, the output flags it:
+
+```
+Sub-Pixel Analysis:
+  Total diff clusters: 142 (118 sub-pixel, 24 real)
+  Sub-pixel artifacts: 62.3% of diff pixels
+  Real differences: 0.08% of image
+  NOTE: Majority of differences are sub-pixel rendering artifacts
+```
+
+Disable with `visualDiff.subPixelClassification: false` in `pipeline.config.json`.
+
+## Typography Analysis
+
+Detects font weight mismatches and font fallback issues by analyzing luminance patterns in horizontal bands (4px tall).
+
+### Font weight detection
+
+Compares the average dark-pixel luminance between actual and expected text bands. If the difference exceeds `fontWeightThreshold` (default 15), it reports a mismatch and the direction (`heavier` or `lighter`):
+
+```
+Typography Analysis:
+  WARN Font weight mismatch detected (expected is heavier, delta: 18.5)
+```
+
+### Font fallback detection
+
+Compares dark-pixel density between corresponding text bands. If the density difference exceeds `fontFallbackDensityThreshold` (default 0.05 = 5%), it flags a likely font fallback:
+
+```
+  WARN Font fallback likely (character density diff: 7.2%)
+```
+
+### Text line count mismatch
+
+If the number of detected text bands differs between actual and expected, this is also reported:
+
+```
+  WARN Text line count differs (actual: 12, expected: 14)
+```
+
+Disable with `visualDiff.typographyAnalysis: false`.
+
+## Layout Drift Detection
+
+Detects element shifts by building projection profiles -- the sum of dark pixels per row (horizontal profile) and per column (vertical profile) -- and using cross-correlation to find the offset that maximizes similarity.
+
+The result includes estimated `dx` and `dy` in pixels and the overall shift magnitude. If the magnitude exceeds `layoutShiftThresholdPx` (default 2px), a warning is emitted:
+
+```
+Layout Analysis:
+  WARN Layout shift detected: dx=3px, dy=-1px (magnitude: 3.16px)
+```
+
+Disable with `visualDiff.layoutDriftAnalysis: false`.
+
+## Threshold Tuning
+
+The default threshold is `0.02` (2%), loaded from `pipeline.config.json` at `visualDiff.threshold`. Override per-run with `--threshold`.
+
+When to adjust:
+
+| Scenario | Recommendation |
+|----------|----------------|
+| Cross-platform font rendering differences | Raise to 0.03--0.05 |
+| Anti-aliasing noise | Enable `--antialiasing true` (on by default) |
+| Retina vs standard screenshots | Ensure both screenshots use the same resolution |
+| Strict pixel-perfect requirement | Lower to 0.01 |
+
+## Pipeline Integration
+
+Visual diff runs as **Phase 5** of the build pipeline in an iteration loop:
+
+1. Capture screenshots of the built components.
+2. Compare against expected screenshots using `visual-diff.js`.
+3. If the result is FAIL, fix the identified differences.
+4. Re-screenshot and compare again.
+5. Repeat up to `iterationLoop.maxVisualIterations` (default 5) times.
+
+Two thresholds control the outcome:
+
+| Setting | Default | Meaning |
+|---------|---------|---------|
+| `iterationLoop.diffPassThreshold` | `0.02` | At or below = PASS |
+| `iterationLoop.diffWarnThreshold` | `0.05` | Between pass and warn = WARN, above = FAIL |
+
+Diff images are saved to `visualDiff.outputDir` (default `.claude/visual-qa/diffs`).
+
+## Default Breakpoints
+
+Screenshots are captured at each configured breakpoint for responsive comparison. From `pipeline.config.json`:
+
+| Name | Width |
+|------|-------|
+| mobile | 375px |
+| tablet | 768px |
+| desktop | 1440px |
+| wide | 1920px |
+
+The `requiredBreakpoints` setting (`["mobile", "tablet", "desktop"]`) controls which breakpoints must pass for the pipeline to proceed.
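The two iteration-loop thresholds and the `requiredBreakpoints` gate can be sketched together in JavaScript. This is a minimal sketch under assumptions, not the pipeline's actual code: `classifyMismatch` and `canProceed` are hypothetical names, and the sketch assumes that "must pass" means a required breakpoint has to reach PASS (whether WARN also unblocks the pipeline is not stated in this guide).

```javascript
// Hypothetical sketch of Phase 5 gating; function names are illustrative only.
const thresholds = { diffPassThreshold: 0.02, diffWarnThreshold: 0.05 };
const requiredBreakpoints = ['mobile', 'tablet', 'desktop'];

// Classify one mismatch ratio (0.02 = 2%) using the two thresholds.
function classifyMismatch(ratio, { diffPassThreshold, diffWarnThreshold }) {
  if (ratio <= diffPassThreshold) return 'PASS';
  if (ratio <= diffWarnThreshold) return 'WARN';
  return 'FAIL';
}

// Assumption: every required breakpoint must be PASS; optional ones are ignored.
function canProceed(ratiosByBreakpoint, required, limits) {
  return required.every((name) => {
    const ratio = ratiosByBreakpoint[name];
    return ratio !== undefined && classifyMismatch(ratio, limits) === 'PASS';
  });
}

const ratios = { mobile: 0.01, tablet: 0.02, desktop: 0.004, wide: 0.2 };
console.log(classifyMismatch(0.03, thresholds));
console.log(canProceed(ratios, requiredBreakpoints, thresholds));
```

Because `wide` is not in `requiredBreakpoints`, its high mismatch does not block the example run.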
+
+## Dark Mode Verification
+
+After the visual diff phase passes, the automated hook suggests running dark mode screenshot comparison via `check-dark-mode.sh`. This runs as a separate non-blocking phase (5.5) in the pipeline. It captures dark-theme screenshots and compares them using the same pixelmatch engine.
+
+## Batch Mode Summary
+
+In batch mode, the report includes aggregate analysis alongside per-file results:
+
+```
+=== Visual Diff Report (Batch) ===
+Actual: /project/screenshots/actual
+Expected: /project/screenshots/expected
+Diffs: /project/.claude/visual-qa/diffs
+
+Total: 8 | Pass: 5 | Fail: 2 | Skip: 1
+Overall: FAIL
+
+  PASS hero.png — 0.41% diff (1205 pixels)
+  FAIL nav.png — 4.82% diff (14230 pixels)
+    Problem areas: top-left, top-center-left
+    Font: weight mismatch (heavier)
+  FAIL footer.png — 3.15% diff (9100 pixels)
+    Layout: shift dx=4px dy=0px
+  SKIP splash.png — No matching expected file: splash.png
+```
+
+The JSON output additionally includes `analysisSummary` with counts and file lists for font issues, layout shifts, and sub-pixel dominant files.
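A file counts as sub-pixel dominant when most of its diff pixels sit in tiny clusters, as described under Sub-Pixel Classification. The sketch below shows one way to do that grouping with a 4-connected flood fill. It is an illustration only, not the code in `visual-diff.js`: the function `classifySubPixel`, its flat 0/1 mask input, and the returned field names are all assumptions made for the example.

```javascript
// Hypothetical sketch of sub-pixel classification; not the real visual-diff.js code.
// `mask` is a flat, row-major array of 0/1 flags (1 = diff pixel), width * height long.
function classifySubPixel(mask, width, height, maxClusterSize = 2) {
  const seen = new Uint8Array(mask.length);
  let subPixelClusters = 0;
  let realClusters = 0;
  let subPixelPixels = 0;
  let diffPixels = 0;

  for (let start = 0; start < mask.length; start++) {
    if (!mask[start] || seen[start]) continue;

    // Iterative 4-connected flood fill from this unvisited diff pixel.
    const stack = [start];
    seen[start] = 1;
    let size = 0;
    while (stack.length > 0) {
      const index = stack.pop();
      size++;
      const x = index % width;
      const y = (index - x) / width;
      const neighbors = [];
      if (x > 0) neighbors.push(index - 1);
      if (x < width - 1) neighbors.push(index + 1);
      if (y > 0) neighbors.push(index - width);
      if (y < height - 1) neighbors.push(index + width);
      for (const next of neighbors) {
        if (mask[next] && !seen[next]) {
          seen[next] = 1;
          stack.push(next);
        }
      }
    }

    diffPixels += size;
    if (size <= maxClusterSize) {
      subPixelClusters++;
      subPixelPixels += size;
    } else {
      realClusters++;
    }
  }

  const subPixelShare = diffPixels === 0 ? 0 : subPixelPixels / diffPixels;
  return {
    totalClusters: subPixelClusters + realClusters,
    subPixelClusters,
    realClusters,
    subPixelShare,
    subPixelDominant: subPixelShare > 0.5,
  };
}

// 5x4 mask: a 2-pixel cluster, a 2x2 block (real difference), and one isolated pixel.
const mask = [
  1, 1, 0, 0, 0,
  0, 0, 0, 1, 1,
  0, 0, 0, 1, 1,
  1, 0, 0, 0, 0,
];
const result = classifySubPixel(mask, 5, 4);
console.log(result);
```

An explicit stack is used instead of recursion so that large diff regions cannot overflow the call stack.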
+
+## Configuration Reference
+
+All settings live in `.claude/pipeline.config.json`:
+
+### `visualDiff` section
+
+| Key | Default | Description |
+|-----|---------|-------------|
+| `threshold` | `0.02` | Overall mismatch ratio to pass |
+| `diffColorRgb` | `[255, 0, 255]` | RGB color for diff pixels (magenta) |
+| `antialiasing` | `true` | Ignore antialiasing differences |
+| `subPixelClassification` | `true` | Enable sub-pixel artifact detection |
+| `subPixelMaxClusterSize` | `2` | Max cluster size to classify as sub-pixel |
+| `typographyAnalysis` | `true` | Enable font weight/fallback detection |
+| `fontWeightThreshold` | `15` | Luminance delta to flag weight mismatch |
+| `fontFallbackDensityThreshold` | `0.05` | Density delta to flag fallback (5%) |
+| `layoutDriftAnalysis` | `true` | Enable layout shift detection |
+| `layoutShiftThresholdPx` | `2` | Pixel shift magnitude to flag drift |
+| `outputDir` | `.claude/visual-qa/diffs` | Directory for diff images |
+
+### `iterationLoop` section
+
+| Key | Default | Description |
+|-----|---------|-------------|
+| `maxVisualIterations` | `5` | Max fix-and-recheck cycles |
+| `regionGridSize` | `4` | Grid divisions (4 = 4x4 = 16 regions) |
+| `diffPassThreshold` | `0.02` | Mismatch ratio for PASS |
+| `diffWarnThreshold` | `0.05` | Mismatch ratio for WARN (above = FAIL) |
+
+## Troubleshooting
+
+### Diff too high due to font rendering
+
+Different operating systems and browsers render fonts differently. Solutions:
+
+- Enable the antialiasing filter (`--antialiasing true`, on by default).
+- Raise the threshold slightly (0.03--0.05).
+- Normalize fonts by ensuring the same font files are loaded in both environments.
+
+### Region shows drift but looks identical
+
+Likely sub-pixel rendering artifacts. Check the sub-pixel analysis section of the output. If more than 50% of the diff is sub-pixel, the differences are noise and can be safely ignored by raising the threshold or increasing `subPixelMaxClusterSize`.
+
+### Layout shift detected but design is correct
+
+Cross-platform layout engines can produce small positional differences. Verify that both screenshots use the same viewport size, then adjust `layoutShiftThresholdPx` if the shift is within acceptable tolerance.
+
+### Batch mode missing files
+
+Ensure the actual and expected directories contain `.png` files with matching filenames. Files present in only one directory are reported as SKIP (no expected match) or MISSING (no actual match).
+
+### Font weight mismatch warning
+
+Verify that the correct font weights are loaded. A weight of 400 on one platform can render differently than on another. Check your `@font-face` declarations and ensure font files are not being substituted by the browser.

From 97f7826aab081d9d3ca14d7ee6b3141c24053b1d Mon Sep 17 00:00:00 2001
From: Paul Mulligan
Date: Thu, 2 Apr 2026 15:20:11 -0400
Subject: [PATCH 04/14] docs: add pipeline caching and performance guide

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docs/guides/caching.md | 224 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 224 insertions(+)
 create mode 100644 docs/guides/caching.md

diff --git a/docs/guides/caching.md b/docs/guides/caching.md
new file mode 100644
index 0000000..06a1269
--- /dev/null
+++ b/docs/guides/caching.md
@@ -0,0 +1,224 @@
+# Pipeline Caching & Performance
+
+The Aurelius build pipeline uses content-hash caching to skip unchanged phases, stage profiling to track timing per phase, and a metrics dashboard to visualize trends over time. All three systems are orchestrated by `incremental-build.sh`.
+
+## How Incremental Builds Work
+
+`./scripts/incremental-build.sh` is the main entry point. Before each pipeline phase it computes a SHA-256 hash of the phase's input files via `pipeline-cache.js`. If the hash matches what was stored on the last successful run, the phase is skipped (cache hit). If the hash differs, the phase runs and the new hash is recorded. Every phase is timed by `stage-profiler.js`, and after all phases complete the metrics dashboard is regenerated by `metrics-dashboard.js`.
+
+```bash
+# Run all phases (default)
+./scripts/incremental-build.sh
+
+# Run a single phase
+./scripts/incremental-build.sh lint
+
+# Available phases
+# all (default), lint, types, tests, build, bundle, a11y, tokens, quality
+```
+
+### Flags
+
+| Flag | Effect |
+|------|--------|
+| `--force` | Ignore cache, rebuild everything |
+| `--no-profile` | Skip stage profiling |
+| `--no-cache` | Disable caching entirely |
+| `--verbose` | Print detailed output for each phase |
+| `--parallel` | Run independent phases concurrently |
+
+## Cache Strategy
+
+Caching is content-addressable. Each input file is hashed with SHA-256 and the hashes are combined into a single digest per input category. Files are grouped into six categories:
+
+| Category | Glob patterns |
+|----------|--------------|
+| `source` | `src/**/*.{ts,tsx,js,jsx}`, `components/**/*.{ts,tsx}` |
+| `styles` | `src/**/*.css`, `styles/**/*.css`, `tailwind.config.*` |
+| `tests` | `**/*.test.{ts,tsx,js,jsx}`, `**/*.spec.{ts,tsx}` |
+| `config` | `package.json`, `tsconfig.json`, `vite.config.*`, `next.config.*` |
+| `tokens` | `design-tokens.lock.json`, `tailwind.config.*` |
+| `figma` | `build-spec.json`, `design-tokens.lock.json` |
+
+Each pipeline phase depends on one or more categories. When any file in a dependent category changes, the phase's combined hash changes and the cache is invalidated.
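The content-addressable scheme above can be sketched with Node's built-in `crypto` module. This is a hedged illustration, not the implementation in `pipeline-cache.js`: `hashCategory` and `hashPhase` are hypothetical names, files are modeled as an in-memory map of path to content, and the way per-file hashes are combined (sorted by path, joined with newlines) is an assumption of this sketch.

```javascript
// Hypothetical sketch of content-hash caching; pipeline-cache.js may combine hashes differently.
const { createHash } = require('node:crypto');

const sha256 = (text) => createHash('sha256').update(text).digest('hex');

// Combine per-file hashes into one digest for a category.
// Sorting by path keeps the digest independent of file-listing order (an assumption).
function hashCategory(files) {
  const combined = Object.keys(files)
    .sort()
    .map((path) => `${path}:${sha256(files[path])}`)
    .join('\n');
  return sha256(combined);
}

// A phase hash covers every category the phase depends on.
function hashPhase(categoryNames, filesByCategory) {
  const parts = categoryNames.map((name) => `${name}=${hashCategory(filesByCategory[name])}`);
  return sha256(parts.join('\n'));
}

// In-memory stand-ins for files on disk (path -> content).
const filesByCategory = {
  source: { 'src/Hero.tsx': 'export const Hero = () => null;' },
  styles: { 'src/styles/tokens.css': ':root { --primary: #3B82F6; }' },
  tests: { 'src/Hero.test.tsx': 'test("renders", () => {});' },
};

// The storybook phase depends on the source and styles categories.
const before = hashPhase(['source', 'styles'], filesByCategory);
filesByCategory.tests['src/Hero.test.tsx'] = 'test("renders twice", () => {});';
const afterTestEdit = hashPhase(['source', 'styles'], filesByCategory);
filesByCategory.styles['src/styles/tokens.css'] = ':root { --primary: #2563EB; }';
const afterStyleEdit = hashPhase(['source', 'styles'], filesByCategory);

console.log(before === afterTestEdit ? 'cache hit' : 'cache miss');
console.log(before === afterStyleEdit ? 'cache hit' : 'cache miss');
```

Editing a test file leaves the `storybook` hash unchanged because `tests` is not one of its input categories, while editing a stylesheet changes the hash and would invalidate the cached phase.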
+
+## Phase-to-Input Mapping
+
+| Phase | Input categories |
+|-------|-----------------|
+| token-sync | tokens, config |
+| intake | figma |
+| token-lock | figma, tokens |
+| tdd-scaffold | figma, tokens, tests |
+| component-build | source, styles, tokens, tests, config |
+| storybook | source, styles |
+| visual-diff | source, styles |
+| dark-mode | source, styles |
+| e2e-tests | source, tests, config |
+| cross-browser | source, styles |
+| quality-gate | source, tests, config |
+| responsive | source, styles |
+| report | _(no inputs -- always runs)_ |
+
+## Cache Commands
+
+All cache operations go through `node scripts/pipeline-cache.js`:
+
+```bash
+# Show cache status for every phase
+node scripts/pipeline-cache.js status
+node scripts/pipeline-cache.js status --json
+
+# Check whether a single phase's cache is still valid
+# Exit 0 = valid (would be skipped), exit 1 = invalid (would run)
+node scripts/pipeline-cache.js check component-build
+
+# Force a specific phase to re-run on the next build
+node scripts/pipeline-cache.js invalidate visual-diff
+
+# Remove stale entries older than N days
+node scripts/pipeline-cache.js clean --max-age 7
+
+# Compute the SHA-256 hash for a file or directory
+node scripts/pipeline-cache.js hash src/components/Hero.tsx
+```
+
+### Cache Location
+
+The manifest lives at `.claude/pipeline-cache/cache-manifest.json`. This file is git-ignored and local to each developer's machine.
+
+## When to Use `--force`
+
+Force a full rebuild when the cache may not reflect the true state of the project:
+
+- After a fresh clone or branch switch
+- After a `git rebase` that rewrites history
+- When you suspect the cache is stale
+- In CI environments where you want deterministic builds
+- After major configuration changes (e.g., upgrading TypeScript or Vite)
+
+```bash
+./scripts/incremental-build.sh --force
+```
+
+The config also lists files that always trigger a full rebuild regardless of cache: `package.json`, `pnpm-lock.yaml`, and `pipeline.config.json`.
+
+## Stage Profiling
+
+Every phase is timed automatically. You can also query profiling data directly with `node scripts/stage-profiler.js`:
+
+```bash
+# Begin timing a stage manually
+node scripts/stage-profiler.js start component-build
+
+# End timing with a status
+node scripts/stage-profiler.js end component-build --status pass
+
+# Generate a performance report
+node scripts/stage-profiler.js report
+node scripts/stage-profiler.js report --format json
+node scripts/stage-profiler.js report --format ascii
+
+# View past build runs
+node scripts/stage-profiler.js history
+node scripts/stage-profiler.js history --last 10
+
+# Find slow stages
+node scripts/stage-profiler.js analyze
+node scripts/stage-profiler.js analyze --slow-threshold 30000
+```
+
+### Where Metrics Are Stored
+
+Profiling data lives in `.claude/pipeline-cache/metrics/`. The current run is written to `current-run.json` and completed runs are appended to `history.json`. Each entry records the node version, platform, and architecture alongside timing data.
+
+### Slow Stage Detection
+
+A stage is flagged as slow when it exceeds 30 seconds (configurable via `profiling.slowStageThresholdMs` in `pipeline.config.json`). Two additional alerts exist:
+
+| Alert | Threshold | Config key |
+|-------|-----------|------------|
+| Slow stage | 30 000 ms | `profiling.slowStageThresholdMs` |
+| High memory | 1024 MB | `profiling.alerts.memoryThresholdMb` |
+| Trend degradation | 20% slower than average | `profiling.alerts.trendDegradationPercent` |
+
+When `stage-profiler.js analyze` detects a stage exceeding any threshold it prints a warning with the stage name, measured value, and the threshold it violated.
+
+## Metrics Dashboard
+
+The dashboard gives a visual overview of build performance over time.
+
+```bash
+# Generate an HTML dashboard with charts
+node scripts/metrics-dashboard.js generate
+
+# Print a terminal-friendly summary
+node scripts/metrics-dashboard.js summary
+
+# Show performance trends across recent builds
+node scripts/metrics-dashboard.js trends
+```
+
+Output goes to `.claude/visual-qa/dashboard/` by default (configurable via `dashboard.outputDirectory`). Supported formats are HTML and Markdown. When `dashboard.autoGenerateAfterBuild` is `true` (the default), the dashboard regenerates automatically after every build. Historical data is retained for 30 days.
+
+## Parallel Execution
+
+Adding `--parallel` runs independent phases concurrently instead of sequentially:
+
+```bash
+./scripts/incremental-build.sh --parallel
+```
+
+The dependency graph from `orchestration.phases` in `pipeline.config.json` determines which phases can overlap. For example, `storybook`, `visual-diff`, `dark-mode`, `cross-browser`, `quality-gate`, and `responsive` all depend only on `component-build`, so they can run at the same time once that phase completes.
+
+Maximum concurrency is controlled by `orchestration.maxConcurrent` (default 3). Phases that share exclusive resources are serialized automatically.
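How a dependency graph and a concurrency cap interact can be sketched as a simple wave planner. This is a simplified illustration, not the real orchestrator: `planWaves` is a hypothetical function, the `phases` object shape (phase name mapped to a list of dependencies) is assumed rather than taken from `pipeline.config.json`, and a real scheduler would likely start the next phase as soon as a slot frees up instead of waiting for a whole wave to finish.

```javascript
// Hypothetical sketch of dependency-aware scheduling; the real orchestrator may work differently.
// `phases` maps a phase name to the phase names it depends on (assumed shape).
function planWaves(phases, maxConcurrent = 3) {
  const done = new Set();
  const pending = new Set(Object.keys(phases));
  const waves = [];

  while (pending.size > 0) {
    // A phase is ready once all of its dependencies have completed.
    const ready = [...pending].filter((name) => phases[name].every((dep) => done.has(dep)));
    if (ready.length === 0) {
      throw new Error(`Unresolvable or circular dependencies: ${[...pending].join(', ')}`);
    }
    // Cap each wave at the concurrency limit; the rest wait for the next wave.
    const wave = ready.slice(0, maxConcurrent);
    wave.forEach((name) => {
      pending.delete(name);
      done.add(name);
    });
    waves.push(wave);
  }
  return waves;
}

const phases = {
  'component-build': [],
  storybook: ['component-build'],
  'visual-diff': ['component-build'],
  'dark-mode': ['component-build'],
  'cross-browser': ['component-build'],
  'quality-gate': ['component-build'],
  responsive: ['component-build'],
};
const waves = planWaves(phases, 3);
console.log(waves);
```

With a limit of 3, the six phases that depend only on `component-build` are split into two waves of three after the build phase completes. A phase that names a dependency missing from the graph is never ready, which the sketch reports as an error rather than looping forever.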
+ +## Configuration Reference + +Key sections of `pipeline.config.json` that control caching and performance: + +```jsonc +{ + "caching": { + "strategy": "content-hash", // Hashing algorithm strategy + "directory": ".claude/pipeline-cache", + "inputCategories": { /* ... */ }, // File globs per category + "phaseInputs": { /* ... */ }, // Categories per phase + "invalidateOnConfigChange": true + }, + "profiling": { + "metricsDirectory": ".claude/pipeline-cache/metrics", + "slowStageThresholdMs": 30000, + "alerts": { + "memoryThresholdMb": 1024, + "trendDegradationPercent": 20 + } + }, + "dashboard": { + "outputDirectory": ".claude/visual-qa/dashboard", + "autoGenerateAfterBuild": true, + "retentionDays": 30, + "formats": ["html", "md"] + }, + "incrementalBuild": { + "parallelPhases": true, + "skipCachedPhases": true, + "forceRebuildOn": ["package.json", "pnpm-lock.yaml", "pipeline.config.json"], + "maxParallelPhases": 4 + } +} +``` + +## Troubleshooting + +**Cache not invalidating after changes.** Check which input category contains the file you changed. If the file matches a new pattern that is not listed in `caching.inputCategories`, it will not be tracked. Run `node scripts/pipeline-cache.js status` to see current hashes. + +**Build slower than expected.** Run `node scripts/stage-profiler.js analyze` to identify which stages are exceeding thresholds. Check the trend degradation alert -- a 20% slowdown compared to the historical average triggers a warning. + +**Dashboard is empty.** The dashboard draws from `history.json` in the metrics directory. Run at least one full build to populate it. Verify that `dashboard.autoGenerateAfterBuild` is `true` in `pipeline.config.json`. + +**Parallel build fails.** Check the dependency graph in `orchestration.phases` for missing dependencies. If a phase needs another phase's output but does not declare it in `depends`, it will start too early. 
Fall back to sequential mode by omitting the `--parallel` flag: + +```bash +./scripts/incremental-build.sh +``` From 7c5d3115584072602cce64c30887361ff2ecc985 Mon Sep 17 00:00:00 2001 From: Paul Mulligan Date: Thu, 2 Apr 2026 15:22:15 -0400 Subject: [PATCH 05/14] docs: add hook system guide Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/guides/hooks.md | 153 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 153 insertions(+) create mode 100644 docs/guides/hooks.md diff --git a/docs/guides/hooks.md b/docs/guides/hooks.md new file mode 100644 index 0000000..ff01fa4 --- /dev/null +++ b/docs/guides/hooks.md @@ -0,0 +1,153 @@ +# Hook System + +Hooks are shell commands that run automatically after Claude Code tool-use events. They provide automated reminders, guards, and quality checks without requiring manual intervention. All hooks are configured in `.claude/settings.json`. + +## How Hooks Fire + +The `PostToolUse` event fires every time the Bash tool completes inside Claude Code. When this happens, the runtime iterates through all hooks with `"matcher": "Bash"` and executes each one. Two environment variables are available to every hook command: + +| Variable | Contents | +|----------|----------| +| `$TOOL_INPUT` | The command that was run (e.g. `pnpm build`, `git commit -m "feat: add hero"`) | +| `$TOOL_OUTPUT` | The stdout/stderr returned by the command | + +Each hook uses `grep` or other pattern-matching against these variables to decide whether to print a message. If the hook produces output, the message is shown to the user. If it produces no output, the hook passes silently. 
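To try a hook outside Claude Code, the firing mechanism can be simulated by hand. The sketch below assumes, as this guide describes, that the hook command sees `$TOOL_INPUT` and `$TOOL_OUTPUT` in its environment; `simulate_hook` and `post_build_hook` are hypothetical names used only for illustration.

```bash
# Hypothetical helper: run one hook command with the two variables
# set in its environment, as this guide describes the runtime doing.
simulate_hook() {
  TOOL_INPUT="$1" TOOL_OUTPUT="$2" bash -c "$3"
}

# A hook body in the style used throughout this guide:
# match on the command that ran and on what it printed.
post_build_hook='if echo "$TOOL_INPUT" | grep -q "pnpm build" && echo "$TOOL_OUTPUT" | grep -q "built in"; then echo "[post-build] Run the quality gate"; fi'

# Both patterns match, so the reminder is printed.
simulate_hook "pnpm build" "built in 2.31s" "$post_build_hook"

# Neither pattern matches, so the hook passes silently.
simulate_hook "pnpm test" "3 passed" "$post_build_hook"
```

Keeping the hook body in a single-quoted shell variable avoids the extra layer of JSON escaping while you iterate on the pattern.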
+ +## Hook Configuration Format + +Hooks live in the `hooks` object inside `.claude/settings.json`: + +```json +{ + "hooks": { + "PostToolUse": [ + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "bash -c 'if echo \"$TOOL_INPUT\" | grep -q \"pattern\"; then echo \"[hook-name] message\"; fi'", + "description": "Human-readable description" + } + ] + } + ] + } +} +``` + +Each hook entry has three fields: + +| Field | Purpose | +|-------|---------| +| `type` | Always `"command"` | +| `command` | The shell command to execute. Receives `$TOOL_INPUT` and `$TOOL_OUTPUT` | +| `description` | Shown to users in Claude Code. Describe what the hook does | + +## Execution Order + +All hooks in the array run sequentially after the Bash tool completes. Execution follows array position in `settings.json` -- the first entry runs first, the second runs second, and so on. Each hook is independent; one hook's output does not affect another. + +A hook that prints text shows a visible message. A hook that prints nothing passes silently. Both outcomes are normal. 
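The ordering rule above can be pictured as a simple loop over the array. This is an illustrative sketch, not the runtime's actual code: `run_hooks_in_order` is a hypothetical function, and the three hook commands are made-up stand-ins for entries in `settings.json`.

```bash
# Hypothetical stand-in for the runtime: run each hook command in array order.
run_hooks_in_order() {
  for hook_command in "$@"; do
    # Each hook gets its own shell, so nothing it sets leaks into the next one.
    bash -c "$hook_command"
  done
}

export TOOL_INPUT="git commit -m 'feat: add hero'"
export TOOL_OUTPUT="1 file changed"

# The first and third hooks match and print; the second passes silently.
run_hooks_in_order \
  'if echo "$TOOL_INPUT" | grep -q "git commit"; then echo "[token-guard] first"; fi' \
  'if echo "$TOOL_INPUT" | grep -q "pnpm build"; then echo "[post-build] not shown"; fi' \
  'if echo "$TOOL_INPUT" | grep -q "git commit"; then echo "[bundle-size] third"; fi'
```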
+ +## Built-In Hooks + +The framework ships with 8 hooks covering build quality, commit safety, and testing reminders: + +| Hook | Trigger | Action | +|------|---------|--------| +| Post-build QA | `pnpm build` succeeds (`built in` in output) | Reminds to run quality gate: vitest, tsc, verify-tokens | +| Pre-commit token guard | `git commit` detected in input | Runs `verify-tokens.sh`, warns if violations found | +| Dark mode reminder | `visual-diff.js` in input + `PASS` in output | Suggests running `check-dark-mode.sh` | +| Coverage enforcement | `vitest` in input + `Coverage` in output | Reminds to check 80% threshold from pipeline.config.json | +| Lighthouse CI | `pnpm build` succeeds | Suggests Lighthouse audit with thresholds from config (perf=80, a11y=90) | +| Bundle size guard | `git commit` + build dir exists | Checks dir size against `maxSizeKb` from config, warns if exceeded | +| Mutation testing reminder | `vitest` + `Tests N passed` in output | Suggests Stryker if `mutationTesting.reminder` is true in config | +| Regression reminder | `pnpm build` succeeds | Calls `regression-reminder.sh` to suggest regression tests if baselines exist | + +Notice the pattern: most hooks match on both `$TOOL_INPUT` (what command ran) and `$TOOL_OUTPUT` (what it returned). Matching on both reduces false triggers. + +## Creating a Custom Hook + +### Step by step + +1. Open `.claude/settings.json` +2. Find the `hooks.PostToolUse[0].hooks` array +3. Add a new object with `type`, `command`, and `description` +4. Write a bash command that pattern-matches `$TOOL_INPUT` and/or `$TOOL_OUTPUT` +5. Test by running the triggering command in Claude Code + +### Example: Changelog reminder after git tag + +```json +{ + "type": "command", + "command": "bash -c 'if echo \"$TOOL_INPUT\" | grep -q \"git tag\"; then echo \"[changelog] New tag created. 
Remember to update CHANGELOG.md\"; fi'", + "description": "Remind to update changelog after creating git tags" +} +``` + +### Example: Warn when installing a new dependency + +```json +{ + "type": "command", + "command": "bash -c 'if echo \"$TOOL_INPUT\" | grep -q \"pnpm add\"; then echo \"[dep-check] New dependency added. Run ./scripts/check-security.sh to audit.\"; fi'", + "description": "Suggest security audit after adding dependencies" +} +``` + +## Hook Scripts Directory + +When a hook grows too complex for a single inline bash command, extract the logic to a script file in `.claude/hooks/` and call it from settings.json. + +The regression reminder hook demonstrates this pattern. Instead of cramming file-counting logic into a JSON string, it delegates to a script: + +```json +{ + "type": "command", + "command": "bash .claude/hooks/regression-reminder.sh \"$TOOL_INPUT\" \"$TOOL_OUTPUT\"", + "description": "Reminds to run visual regression tests after successful builds" +} +``` + +The script at `.claude/hooks/regression-reminder.sh` receives the two arguments, does its grep checks and baseline counting, and echoes a message only when appropriate. + +When writing hook scripts: +- Accept `$1` (tool input) and `$2` (tool output) as positional arguments +- Echo a message when the hook should fire; print nothing otherwise +- Exit with code 0 in all cases + +## Best Practices + +- **Keep hooks fast.** They run after every matching Bash tool use. Aim for under 2 seconds. +- **Exit 0 for reminders.** Hooks are informational. Use exit code 0 so they never block the workflow. +- **Match carefully.** Pattern-match on both `$TOOL_INPUT` and `$TOOL_OUTPUT` when possible to avoid false triggers. +- **Prefix output with `[hook-name]`.** This makes it easy to identify which hook produced a message. +- **Extract complex logic.** If your command exceeds one or two conditions, move it to a script in `.claude/hooks/`. 
+- **Read thresholds from config.** When a hook checks a limit (bundle size, coverage), read it from `pipeline.config.json` rather than hardcoding. + +## Troubleshooting + +**Hook not firing** +Check that your `grep` pattern matches the actual command string. Test it manually: +```bash +echo "pnpm build --production" | grep -q "pnpm build" && echo "match" +``` +If the pattern uses special characters, make sure they are properly escaped inside the JSON string. + +**Hook firing too often** +Tighten the pattern. Match on both input and output instead of just one: +```bash +# Too broad -- fires on any vitest run +if echo "$TOOL_INPUT" | grep -q "vitest"; then ... + +# Better -- only fires when coverage output is present +if echo "$TOOL_INPUT" | grep -q "vitest" && echo "$TOOL_OUTPUT" | grep -q "Coverage"; then ... +``` + +**Hook blocking workflow** +Make sure the hook script always exits with code 0. A non-zero exit code may interrupt Claude Code's execution flow. + +**Hook output not visible** +The hook must echo to stdout. Check for quoting issues in the JSON command string -- mismatched quotes are the most common cause of silent failures. Test the command directly in a terminal first. From 01d2140a70656bab21535d602dff368240bc682e Mon Sep 17 00:00:00 2001 From: Paul Mulligan Date: Thu, 2 Apr 2026 15:24:14 -0400 Subject: [PATCH 06/14] docs: add error recovery guide Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/guides/error-recovery.md | 199 ++++++++++++++++++++++++++++++++++ 1 file changed, 199 insertions(+) create mode 100644 docs/guides/error-recovery.md diff --git a/docs/guides/error-recovery.md b/docs/guides/error-recovery.md new file mode 100644 index 0000000..93faaa3 --- /dev/null +++ b/docs/guides/error-recovery.md @@ -0,0 +1,199 @@ +# Error Recovery Guide + +Pipeline phases can fail for many reasons: network timeouts, missing dependencies, configuration drift, or simple typos. 
This guide explains how to diagnose failures, recover from them, and resume without starting over. + +## How the Pipeline Tracks Progress + +The pipeline uses TodoWrite tasks to track each phase. As phases complete they are marked done. Failed phases remain in-progress or pending. When you re-run the pipeline command, completed phases are checked against the cache. If their inputs have not changed the phase is skipped (cache hit). This means re-running the pipeline is always safe and efficient — you will never redo work that already succeeded. + +Check current cache state at any time: + +```bash +node scripts/pipeline-cache.js status +``` + +## Phase Failure Modes + +Every phase has characteristic failure patterns. The table below lists the most common ones and how to fix them. + +| Phase | Common Failures | How to Fix | +|-------|----------------|------------| +| 0: Token Sync | No lockfile (non-fatal, skipped), drift detected | Run `./scripts/sync-tokens.sh --update` or fix Tailwind config | +| 1: Intake | Figma MCP connection failed, invalid URL, Canva API timeout | Check MCP server is running, verify URL format | +| 2: Token Lock | Empty design (no extractable tokens), extraction timeout | Verify design has visible content, retry | +| 3: TDD Gate | Test generation fails, no components in build-spec | Check `build-spec.json` has a `components` array | +| 4: Build | Component compile errors, test failures (red phase still active) | Fix TypeScript errors, ensure tests from Phase 3 can pass | +| 4.5: Storybook | Story generation fails, missing component exports | Check component exports, run `./scripts/generate-stories.sh` manually | +| 5: Visual Diff | Screenshot capture fails, diff threshold exceeded after 5 iterations | Check dev server is running at expected port, lower threshold or fix components | +| 5.5: Dark Mode | Dark theme not configured, screenshot fails | Non-blocking — add dark mode support or skip | +| 6: E2E Tests | Browser not installed, test 
timeout, element not found | Run `./scripts/setup-playwright.sh`, increase timeout in `pipeline.config.json` | +| 7: Cross-Browser | Firefox/WebKit not installed | Run `./scripts/setup-playwright.sh` to install all browsers | +| 7.5: Regression | No baselines captured yet | Run `./scripts/capture-baselines.sh` first, then `./scripts/regression-test.sh` | +| 8: Quality Gate | Coverage below 80%, TypeScript errors, Lighthouse below thresholds | Write more tests, fix type errors, optimize performance | +| 8.5: Responsive | Dev server not running, screenshot timeout | Start dev server first, check ports | +| 9: Report | No failures — generates from available data | N/A | + +## Resuming a Failed Pipeline + +Re-run the same pipeline command you used originally: + +```bash +# Figma pipeline +/build-from-figma + +# Canva pipeline +/build-from-canva + +# Screenshot/URL pipeline +/build-from-screenshot +``` + +The caching system ensures completed phases are not repeated. Only the failed phase and subsequent phases run. Before re-running, check which phases already have valid cache: + +```bash +node scripts/pipeline-cache.js status +``` + +The output shows each phase name, its cache status (valid/invalid/missing), and when it last ran. 
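The skip-on-unchanged-inputs behavior can be sketched in a few lines of shell. This is a simplified illustration of content-hash caching in general, not the logic inside `pipeline-cache.js`: `hash_inputs`, `check_phase`, and the `.hash` files are hypothetical, and `cksum` stands in for whatever digest the real cache uses.

```bash
# Hypothetical sketch: these names do not come from pipeline-cache.js.
work_dir="$(mktemp -d)"
mkdir -p "$work_dir/cache"

# Combine the contents of all input files into one digest.
hash_inputs() {
  cat "$@" | cksum
}

# Print "hit" when the stored digest matches the current inputs;
# otherwise store the new digest and print "miss".
check_phase() {
  phase="$1"
  shift
  current="$(hash_inputs "$@")"
  stored="$(cat "$work_dir/cache/$phase.hash" 2>/dev/null || true)"
  if [ "$current" = "$stored" ]; then
    echo "hit"
  else
    echo "$current" > "$work_dir/cache/$phase.hash"
    echo "miss"
  fi
}

input_file="$work_dir/Button.tsx"
echo "export const Button = () => null" > "$input_file"

check_phase component-build "$input_file"   # no stored digest yet
check_phase component-build "$input_file"   # inputs unchanged
echo "// edited" >> "$input_file"
check_phase component-build "$input_file"   # contents changed
```

The three calls print `miss`, `hit`, and `miss` in that order, which mirrors why a re-run after a failure only repeats phases whose inputs changed or whose result was never stored.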
+ +## Forcing a Phase Re-Run + +If you need to re-run a specific phase even though its cache is valid, invalidate its cache entry: + +```bash +node scripts/pipeline-cache.js invalidate <phase-name> +``` + +Available phase names: + +| Phase Name | Pipeline Phase | +|------------|---------------| +| `token-sync` | 0: Token Sync | +| `intake` | 1: Intake | +| `token-lock` | 2: Token Lock | +| `tdd-scaffold` | 3: TDD Gate | +| `component-build` | 4: Build | +| `storybook` | 4.5: Storybook | +| `visual-diff` | 5: Visual Diff | +| `dark-mode` | 5.5: Dark Mode | +| `e2e-tests` | 6: E2E Tests | +| `cross-browser` | 7: Cross-Browser | +| `quality-gate` | 8: Quality Gate | +| `responsive` | 8.5: Responsive | +| `report` | 9: Report | + +After invalidation, re-run the pipeline command. Only the invalidated phase (and anything that depends on it) will re-execute. + +## Manual Phase Execution + +When debugging a failure, run individual scripts directly to get detailed output: + +```bash +# Token validation +./scripts/verify-tokens.sh + +# Token sync (dry run shows drift without updating) +./scripts/sync-tokens.sh --dry-run + +# Visual diff between two screenshots +node scripts/visual-diff.js actual.png expected.png + +# TypeScript type checking +./scripts/check-types.sh + +# Tests with coverage +./scripts/run-tests.sh + +# Accessibility audit +./scripts/check-accessibility.sh + +# Bundle size check +./scripts/check-bundle-size.sh + +# Security audit +./scripts/check-security.sh +``` + +Running a script manually does not affect pipeline cache state. You can experiment freely without invalidating anything. + +## Common Recovery Patterns + +### Visual diff stuck in iteration loop + +If the diff is still failing after 5 iterations, determine whether the difference is real or sub-pixel noise: + +```bash +node scripts/visual-diff.js actual.png expected.png --json +``` + +Check the `subPixelPercentage` field in the JSON output. 
If more than 50% of the diff is sub-pixel, the components are visually correct and you should raise the threshold in `pipeline.config.json` under `visualDiff.threshold`. If the differences are real, fix the component styling and re-run. + +### E2E tests timeout + +Ensure the dev server is running at the expected port before the pipeline launches E2E tests. Check timeout and retry settings in `pipeline.config.json`: + +```json +{ + "e2e": { + "timeout": 30000, + "retries": 2 + } +} +``` + +Increase `timeout` for slow environments. Increase `retries` if failures are intermittent. + +### Quality gate coverage too low + +Run tests with coverage to see exactly which files are under-covered: + +```bash +pnpm vitest run --coverage +``` + +The coverage threshold is set in `pipeline.config.json` under `tdd.coverageThreshold` (default 80%). Write tests for uncovered files, then re-run the pipeline. + +### Figma MCP not connecting + +1. Verify Figma Desktop is running. +2. Check that the MCP server is configured on port 3845. +3. If local MCP is unavailable, the pipeline falls back to the remote Figma MCP automatically. + +### Build fails with TypeScript errors + +Run the type checker independently to see all errors at once: + +```bash +./scripts/check-types.sh +``` + +Fix every reported error before re-running the pipeline. The Build phase (4) will not succeed while TypeScript errors remain. + +## Parallel Phase Failures + +Phases 4 through 9 run concurrently via the parallel orchestration system. A failure in one phase does not block independent phases. For example, if E2E tests fail but Visual Diff succeeds, the Visual Diff result is cached normally. + +After the batch completes, the summary shows the status of every phase: + +- **Succeeded** — cached, will not re-run. +- **Failed** — will re-run on next pipeline invocation. +- **Skipped** — blocked by a failed dependency, will run once the dependency passes. + +Fix the failed phase and re-run the pipeline. 
All successful phases are served from cache. + +## When to Start Fresh + +In most cases, resuming is the right approach. However, some situations call for a full rebuild: + +```bash +./scripts/incremental-build.sh --force +``` + +Start fresh after: + +- **Major config changes** — significant edits to `tsconfig.json`, `tailwind.config.*`, or `pipeline.config.json`. +- **Git rebase with conflicts** — resolved merge conflicts may leave the cache out of sync with the actual file contents. +- **Suspected corrupted cache** — if phases behave unexpectedly despite matching inputs. +- **Switching output targets** — changing `outputTarget` in `build-spec.json` (e.g., from `react` to `vue`) invalidates the entire build. + +The `--force` flag ignores all cached results and rebuilds every phase from scratch. From 563a1bfb37e0b9b263fe89d907c4a363c68e5d98 Mon Sep 17 00:00:00 2001 From: Paul Mulligan Date: Thu, 2 Apr 2026 15:26:58 -0400 Subject: [PATCH 07/14] docs: add agent creation guide Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/guides/agent-creation.md | 267 ++++++++++++++++++++++++++++++++++ 1 file changed, 267 insertions(+) create mode 100644 docs/guides/agent-creation.md diff --git a/docs/guides/agent-creation.md b/docs/guides/agent-creation.md new file mode 100644 index 0000000..ae4504f --- /dev/null +++ b/docs/guides/agent-creation.md @@ -0,0 +1,267 @@ +# Agent Creation + +Agents are specialized Markdown files with YAML frontmatter that live in `.claude/agents/`. Each agent defines a persona, a set of allowed tools, and detailed instructions that shape how Claude Code behaves when the agent is selected. Claude Code reads the `description` field to decide which agent best matches a given task, then loads that agent's instructions as the system prompt for the session. + +This framework ships with 53 agents covering engineering, design, testing, marketing, and operations. You can add your own by following the conventions below. 
+ +## File Location + +Create agent files at: + +``` +.claude/agents/<agent-name>.md +``` + +The filename should be kebab-case and match the `name` field in the YAML frontmatter. For example, an agent named `docs-writer` lives at `.claude/agents/docs-writer.md`. + +## YAML Frontmatter + +Every agent file starts with a YAML frontmatter block delimited by `---`. Three fields are required; the rest are optional. + +```yaml +--- +name: my-agent # Required: kebab-case identifier, must match filename +description: ... # Required: tells Claude Code WHEN to select this agent +tools: Read, Write, Bash # Required: comma-separated list of allowed tools +color: blue # Optional: agent color shown in the UI +model: sonnet # Optional: model override (sonnet, opus, haiku) +permissionMode: bypassPermissions # Optional: skip confirmation prompts +--- +``` + +### The `name` Field + +A kebab-case identifier that matches the filename (without the `.md` extension). This is how the agent is referenced internally. + +### The `description` Field + +This is the most important field. Claude Code uses it to decide which agent to dispatch for a task. A vague description means the agent will rarely (or never) be selected. + +Write descriptions that start with "Use this agent when..." and list specific trigger scenarios. Be concrete about the tasks the agent handles. + +**Good description:** + +> Use this agent when building user interfaces, implementing React/Vue components, handling state management, or optimizing frontend performance. This agent excels at creating responsive, accessible, and performant web applications. + +**Bad description:** + +> A helpful coding agent. + +The good description names specific activities (building UIs, implementing components, handling state, optimizing performance) so Claude Code can match it against user requests. The bad description is too generic to ever win selection over a more specific agent. 
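The filename convention described above (the file's base name must equal the `name` field) is easy to check mechanically. The helper below is a hypothetical sketch, not part of the framework; it assumes the frontmatter is the first `---` delimited block and that `name:` sits at the start of a line.

```bash
# Hypothetical helper, not part of the framework.
check_agent_name() {
  file="$1"
  expected="$(basename "$file" .md)"
  # Read the first `name:` line inside the leading --- ... --- block.
  actual="$(sed -n '/^---$/,/^---$/ s/^name:[[:space:]]*//p' "$file" | head -n 1)"
  if [ "$actual" = "$expected" ]; then
    echo "ok: $expected"
  else
    echo "mismatch: file=$expected name=$actual"
    return 1
  fi
}

# Demo against a throwaway agent file.
agents_dir="$(mktemp -d)"
printf '%s\n' \
  '---' \
  'name: docs-writer' \
  'description: Use this agent when writing or updating project documentation.' \
  'tools: Read, Write' \
  '---' \
  'You are a technical documentation specialist.' \
  > "$agents_dir/docs-writer.md"

check_agent_name "$agents_dir/docs-writer.md"
```

In a real project you might loop over `.claude/agents/*.md` and call the function on each file.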
+ +### The `tools` Field + +A comma-separated list of tools the agent is allowed to use. Only declare tools the agent actually needs -- this follows the principle of least privilege and keeps agents focused. + +**File operations:** + +| Tool | Purpose | +|------|---------| +| `Read` | Read file contents | +| `Write` | Create or overwrite files | +| `Edit` | Replace strings in existing files | +| `MultiEdit` | Multiple edits in a single operation | + +**Search:** + +| Tool | Purpose | +|------|---------| +| `Grep` | Search file contents by pattern | +| `Glob` | Find files by name pattern | + +**Shell:** + +| Tool | Purpose | +|------|---------| +| `Bash` | Execute shell commands | +| `KillShell` | Terminate a running shell process | + +**Interaction:** + +| Tool | Purpose | +|------|---------| +| `AskUserQuestion` | Prompt the user for input | +| `TaskOutput` | Return structured output from a sub-task | + +**Task management:** + +| Tool | Purpose | +|------|---------| +| `TodoWrite` | Track progress with a todo list | +| `Task` | Spawn sub-agent tasks | +| `Skill` | Invoke a registered skill | + +**Web:** + +| Tool | Purpose | +|------|---------| +| `WebFetch` | Fetch content from a URL | +| `WebSearch` | Search the web | + +**MCP tools:** + +MCP tools follow the pattern `mcp__<server>__<tool>`. Examples: + +- `mcp__figma__get_design_context` -- read a Figma design +- `mcp__playwright__browser_navigate` -- navigate a browser +- `mcp__chrome-devtools__take_screenshot` -- capture a screenshot + +Only include MCP tools if the agent's workflow requires them. + +## Choosing a Model + +The `model` field is optional. When omitted, the agent uses whatever model the user has active. When set, it overrides the model for the duration of the agent session. 
+ +| Value | Best for | +|-------|----------| +| `sonnet` | Most agents -- fast, capable, and cost-effective | +| `opus` | Deep reasoning tasks: design interpretation, architecture decisions, complex refactoring | +| `haiku` | Simple and fast tasks: linting reminders, quick lookups, formatting checks | + +Prefer `sonnet` unless the task genuinely requires deeper reasoning. Using `opus` for every agent wastes time and cost. + +## Permission Mode + +The `permissionMode` field controls whether the agent asks for confirmation before performing actions. + +- **Omitted (default):** The agent follows normal permission rules and prompts the user before destructive operations. +- **`bypassPermissions`:** The agent runs without confirmation prompts. Use this only for autonomous pipeline agents that execute without human oversight (e.g., the `figma-react-converter` agent inside the build pipeline). + +Use `bypassPermissions` sparingly. Most agents should operate under normal permission rules. + +## Agent Body + +Everything after the closing `---` of the frontmatter is the agent body. This is Markdown that serves as the agent's system prompt -- it defines the persona, responsibilities, workflow, and quality standards. + +### Recommended Structure + +**1. Opening paragraph** -- Define the persona and expertise in one or two sentences: + +```markdown +You are an elite frontend development specialist with deep expertise +in modern JavaScript frameworks, responsive design, and user interface +implementation. +``` + +**2. Primary Responsibilities** -- Numbered sections with specific tasks: + +```markdown +## Primary Responsibilities + +### 1. Component Architecture +When building interfaces, you will: +- Design reusable, composable component hierarchies +- Implement proper state management +- Create type-safe components with TypeScript +``` + +**3. Workflow** -- Step-by-step process for common tasks: + +```markdown +## Workflow +1. 
Read the existing codebase to understand conventions +2. Check for related components and shared utilities +3. Implement the feature following established patterns +4. Write tests alongside the implementation +5. Run lint and type checks before finishing +``` + +**4. Key Principles** -- Guiding rules: + +```markdown +## Key Principles +- Always use TypeScript strict mode +- Prefer composition over inheritance +- Every component must have a test file +``` + +**5. Quality Standards** -- What "done" looks like: + +```markdown +## Quality Standards +- All tests pass with 80%+ coverage +- No TypeScript errors +- Accessible (WCAG 2.1 AA) +- Responsive across breakpoints +``` + +Keep the body focused. Each agent should have one clear area of responsibility. If you find yourself writing an agent that covers everything, split it into multiple specialized agents. + +## Complete Example + +Here is a full agent file for a documentation writer. Save this as `.claude/agents/docs-writer.md`: + +```markdown +--- +name: docs-writer +description: Use this agent when writing or updating project documentation, creating user guides, generating API references, or improving README files. This agent produces clear, well-structured technical documentation. +tools: Read, Write, Edit, Grep, Glob, Bash +color: green +--- + +You are a technical documentation specialist. You write clear, concise, +and well-organized documentation that helps developers understand and use +the codebase effectively. + +## Primary Responsibilities + +### 1. Documentation Creation +- Write user guides with step-by-step instructions +- Create API reference documentation from source code +- Document architecture decisions and design rationale +- Maintain changelog entries for notable changes + +### 2. Documentation Maintenance +- Keep existing docs in sync with code changes +- Fix broken links and outdated references +- Improve clarity based on common support questions + +## Workflow + +1. 
Read existing documentation to understand conventions and tone +2. Explore the relevant source code with Grep and Glob +3. Draft the documentation in Markdown +4. Verify code examples by reading the actual source +5. Check for broken internal links + +## Quality Standards + +- Use second person ("you") for instructions +- Include code examples for every non-trivial concept +- Keep paragraphs short (3-4 sentences maximum) +- Add a table of contents for documents longer than 100 lines +``` + +## Testing Your Agent + +After creating an agent, test it by giving Claude Code a task that should trigger it. For example, with the `docs-writer` agent above, you might say: + +> "Write a user guide for the caching system." + +Check three things: + +1. **Selection** -- Does Claude Code pick the right agent? If not, revise the `description` field to include more specific trigger phrases. +2. **Tool usage** -- Does the agent use only the tools you declared? If it needs a tool you did not list, add it to the `tools` field. +3. **Instruction adherence** -- Does the agent follow the workflow and quality standards in the body? If not, make the instructions more explicit. + +## Registration + +After creating and testing your agent, register it in two places so the rest of the framework knows about it: + +1. **`CLAUDE.md`** -- Add the agent to the agent table under the appropriate category: + + ```markdown + | Documentation | 2 | docusaurus-expert, docs-writer | + ``` + +2. **`.claude/CUSTOM-AGENTS-GUIDE.md`** -- Add a full catalog entry with the agent name, description, tools, and a brief summary of what it does. + +## Best Practices + +- **Keep scope focused.** One clear responsibility per agent. A "frontend-developer" agent and a "test-writer-fixer" agent are better than a single "frontend-developer-and-tester" agent. +- **Declare minimal tools.** Only include tools the agent actually needs. An agent that writes documentation does not need `KillShell` or MCP browser tools. 
+- **Write actionable descriptions.** Start with "Use this agent when..." and list concrete scenarios. The description is what determines whether the agent gets selected. +- **Include examples in the body.** Show the agent what good output looks like. If you want a specific format, demonstrate it. +- **Use `permissionMode: bypassPermissions` sparingly.** Reserve it for autonomous pipeline agents that run end-to-end without human oversight. +- **Prefer `sonnet` as the default model.** Only override to `opus` for tasks that need deep reasoning (design interpretation, architecture planning, complex multi-file refactoring). +- **Test with real tasks.** The best way to refine an agent is to use it on actual work and iterate on the description and body based on what you observe. From 4d2acd3572bc9071b45f15abc4f46b2002cd8f15 Mon Sep 17 00:00:00 2001 From: Paul Mulligan Date: Thu, 2 Apr 2026 15:28:32 -0400 Subject: [PATCH 08/14] docs: add Vue converter workflow guide Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/guides/vue-converter.md | 144 +++++++++++++++++++++++++++++++++++ 1 file changed, 144 insertions(+) create mode 100644 docs/guides/vue-converter.md diff --git a/docs/guides/vue-converter.md b/docs/guides/vue-converter.md new file mode 100644 index 0000000..d85dd2a --- /dev/null +++ b/docs/guides/vue-converter.md @@ -0,0 +1,144 @@ +# Vue Converter Workflow Guide + +The pipeline generates Vue 3 components from any design source (Figma, Canva, or screenshots/URLs). This guide covers the Vue-specific details of the conversion process. For general multi-framework pipeline information, see [docs/multi-framework/README.md](../multi-framework/README.md). + +## When Vue Is Selected + +Vue output is activated in one of two ways: + +1. **Explicit selection** -- set `"outputTarget": "vue"` in `build-spec.json` during the intake phase. +2. **Auto-detection** -- the pipeline detects `vue` in the project's `package.json` dependencies and sets the target automatically. 
If `nuxt.config.*` is present, Nuxt 3 conventions are used instead of plain Vue + Vite. + +The intake skills (`figma-intake`, `canva-intake`, `screenshot-intake`) ask the user to confirm or override the detected target during the interview. + +## The `vue-converter` Agent + +The `vue-converter` agent (defined in `.claude/agents/vue-converter.md`) is dispatched during Phase 4 (Build) of the pipeline. It reads: + +- **`build-spec.json`** -- component list, layout hierarchy, and page structure +- **`design-tokens.lock.json`** -- the single source of truth for all design values +- **Screenshots** -- used for layout decisions only; token values always come from the lockfile + +The agent generates components in dependency order (leaf components first, then composites), runs `pnpm vitest run` after each batch, and tracks progress via TodoWrite. It operates autonomously with no user prompts during the build phase. + +## Component Patterns Generated + +### Single-File Components + +Every component is a `.vue` file with three blocks: + +```vue + + + + + +``` + +### Key patterns + +- **Props** -- defined with `defineProps()` and a TypeScript interface. Defaults via `withDefaults()`. +- **Events** -- typed with `defineEmits<{...}>()`. Templates use `@click`, `@input`, etc. +- **Slots** -- `<slot />` for default content projection (equivalent to React `children`). Named slots (`<slot name="...">`) for multiple insertion points. +- **Composables** -- reusable logic extracted into `use*.ts` files (equivalent to React custom hooks). Return reactive state via `ref()` and `reactive()`. + +## Styling + +Tailwind utility classes are applied directly in `