diff --git a/.gitignore b/.gitignore index 9e413bc56b..6bc45e7e6c 100644 --- a/.gitignore +++ b/.gitignore @@ -37,3 +37,6 @@ supabase/.temp/ # Throughput analysis — local-only, regenerate via scripts/garry-output-comparison.ts docs/throughput-*.json + +# gstack preamble feature-discovery state markers (written via dev symlink) +.feature-prompted-* diff --git a/AGENTS.md b/AGENTS.md index c1e5595fc5..6ea9206640 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -57,6 +57,7 @@ Invoke them by name (e.g., `/office-hours`). | `/context-save` | Save working context (git state, decisions, remaining work). | | `/context-restore` | Resume from a saved context, even across Conductor workspaces. | | `/learn` | Manage what gstack learned across sessions. | +| `/plan-status` | Check progress of a plan against the codebase and git log. Phase-by-phase DONE/PARTIAL/REMAINING dashboard. | | `/retro` | Weekly retro with per-person breakdowns and shipping streaks. | | `/health` | Code quality dashboard (type checker, linter, tests, dead code). | | `/benchmark` | Performance regression detection (page load, Core Web Vitals). | diff --git a/CHANGELOG.md b/CHANGELOG.md index 937e67e37f..83fc41cb53 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,54 @@ # Changelog +## [1.34.0.0] - 2026-05-11 + +## **`/plan-status` ships. Ask "where am I on this plan?" and get an answer in seconds.** +## **`/ship` now warns before creating a PR if the plan has open checkboxes.** + +Plans have always been written by gstack. Now gstack can tell you how far along you are. `/plan-status` reads the plan that already exists, cross-references git commits and filesystem state, and produces a dashboard: a gstack Lifecycle Dashboard (which workflow skills have run, matching the Review Readiness Dashboard format) plus a phase-by-phase DONE/PARTIAL/REMAINING breakdown with a Branch Commits table. No setup, no ticket creation, no pre-registration of work. The plan is already there. The git history is already there. 
The skill just cross-references them. + +As a side effect, `/ship`'s Step 1 pre-flight now runs a 3-second open-checkbox grep. If the current branch's plan has unchecked `- [ ]` items, it surfaces an informational note before PR creation. You decide whether to proceed or check `/plan-status` first. Never blocks. + +### The numbers that matter + +Measured against v1.32.0.0 on the `plan-status-skill` branch. No external API calls, no new dependencies. + +| Metric | v1.32.0.0 | v1.34.0.0 | Δ | +|---|---|---|---| +| Skills | 40 | 41 | **+1** | +| Gate-tier E2E tests | (prior count) | +1 | **+1** | +| Free (Tier 1) tests | 400 | 400 | 0 | +| SKILL.md files | 40 | 41 | **+1** | +| LOC added (skill + test + fixture + docs) | — | ~280 | — | + +The `/plan-status` preamble runs in ~10s typical (gstack-update-check + gstack-slug + plan find + git log + analytics write). Evidence gathering is O(n) on file references in the plan, typically 5-20 files, 5-15 seconds. Total expected runtime: 15-40 seconds — consistent with `/learn` and `/canary`. + +### What this means for you + +If you've run `/office-hours` or `/plan-ceo-review`, you have a plan file. Run `/plan-status` on your feature branch and get a scannable dashboard in under a minute. No setup. The skill finds your plan from `~/.gstack/projects//ceo-plans/` automatically. + +For non-code projects (board planning, grant pipelines, accreditation prep): run `gstack-config set plan_glob "~/board-plans/*.md"` to point the skill at any folder of markdown plan files. + +### Itemized changes + +#### Added + +- **`/plan-status` skill** (`plan-status/SKILL.md.tmpl` + generated `SKILL.md`). Five-step flow: resolve plan file → extract phases and success-criteria checkboxes → gather git and filesystem evidence → classify DONE/PARTIAL/REMAINING/DROPPED → produce gstack Lifecycle Dashboard + Plan Detail report. Evidence sources for v1: `git log`, `git diff --name-only`, filesystem `ls`, `Gemfile` grep. All read-only. 
Optional `plan_glob` knob (`gstack-config set plan_glob "~/your-plans/*.md"`) for non-default plan directories. +- **`test/fixtures/plans/sample-ruby-llm-plan.md`** — fixture plan for the E2E test (2 phases, 4 checkboxes, 2 file refs, 1 gem ref). +- **`test/skill-e2e-plan-status.test.ts`** — gate-tier E2E test. Loose assertions: transcript contains "Plan Status" and at least one of "DONE" or "REMAINING". Classified `gate` because it's deterministic, read-only, filesystem-only fixture, < $0.50/run. +- **`TODOS.md`** — added entry to monitor GitHub issue #1343 for maintainer response to community asks about non-code evidence sources. + +#### Changed + +- **`/ship` Step 1 pre-flight** (`ship/SKILL.md.tmpl`) — new item 5: 3-second bash grep checks for open `- [ ]` checkboxes in the current branch's plan file. Informational only, never blocks. Suggests running `/plan-status` if open items are found. +- **`docs/skills.md`** — added `/plan-status` skill description and full section. +- **`AGENTS.md`** — added `/plan-status` row. + +#### For contributors + +- `test/helpers/touchfiles.ts`: new entry `'plan-status': ['plan-status/**', 'test/fixtures/plans/sample-ruby-llm-plan.md']` in `E2E_TOUCHFILES` and `'plan-status': 'gate'` in `E2E_TIERS`. +- Ship golden baselines (`test/fixtures/golden/`) regenerated to include the new pre-flight step 5. + ## [1.33.2.0] - 2026-05-11 ## **`./setup` no longer pollutes the global install when run from a Conductor worktree.** @@ -155,6 +204,7 @@ If you've been hitting the 35-minute hang on `/sync-gbrain`, it's gone. The arch - `TODOS.md` filed P2: investigate `gbrain import` perf on large staging dirs (5,131 files takes >10 minutes when 501 takes 10 seconds — gbrain-side N+1 SQL or auto-link reconciliation suspected). P3: cache "no changes since last import" at the prepare-batch level for true no-op fast paths. 
- `Plan completion audit` ran via subagent on this branch: 17/21 DONE, 1 CHANGED (D3 made opt-in), 2 deferred (F8 benchmark harness as separate work, 24-path unit coverage went integration-only). + ## [1.32.0.0] - 2026-05-10 ## **Seven contributor PRs land. Three are security or hardening.** diff --git a/TODOS.md b/TODOS.md index 0516f972e1..a25344680b 100644 --- a/TODOS.md +++ b/TODOS.md @@ -1,5 +1,15 @@ # TODOS +## plan-status follow-on + +### P3: Monitor issue #1343 for community asks + +Check whether @wwybdd23-bot has replied to our clarifying questions on GitHub issue #1343 (non-code plan evidence: calendar, email, external services). If the maintainer engages, design an extensible evidence-source plugin interface before implementing. Scope for that work belongs in a separate PR, not this one. + +**Priority:** P3 (monitor only — no implementation until maintainer decision) + +--- + ## /sync-gbrain memory stage perf follow-up ### P2: Investigate `gbrain import` perf on large staging dirs diff --git a/VERSION b/VERSION index 0df2c524d3..41efb235e3 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.33.2.0 +1.34.0.0 diff --git a/docs/skills.md b/docs/skills.md index b20bf665d1..50344ffc5e 100644 --- a/docs/skills.md +++ b/docs/skills.md @@ -32,6 +32,7 @@ Detailed guides for every gstack skill — philosophy, workflow, and examples. | [`/devex-review`](#devex-review) | **DX Reviewer (live)** | Live developer experience audit. Walks the actual onboarding flow, measures TTHW, catches the docs lies. | | [`/plan-tune`](#plan-tune) | **Question Tuner** | Self-tune AskUserQuestion sensitivity per question. Mark questions as never-ask, always-ask, or only-for-one-way. | | [`/learn`](#learn) | **Memory** | Manage what gstack learned across sessions. Review, search, prune, and export project-specific patterns and preferences. | +| [`/plan-status`](#plan-status) | **Plan Status** | Check progress of a gstack plan against the codebase and git log. 
Phase-by-phase DONE/PARTIAL/REMAINING dashboard. gstack Lifecycle Dashboard shows which workflow skills have run. | | [`/context-save`](#context-save) | **Save State** | Save working context (git state, decisions, remaining work) so any future session can resume. | | [`/context-restore`](#context-restore) | **Restore State** | Resume from a saved context, even across Conductor workspace handoffs. | | [`/health`](#health) | **Code Quality Dashboard** | Wraps type checker, linter, tests, dead code detection. Computes a weighted 0-10 score; tracks trends over time. | @@ -969,6 +970,51 @@ Claude: 23 learnings for this project (14 high confidence, 6 medium, 3 low) --- +## `/plan-status` + +This is my **progress dashboard** for any gstack plan. + +After `/office-hours` or `/plan-ceo-review` writes a plan, implementation starts — and there's no way to answer "where am I?" without manually re-reading the whole plan and cross-referencing the codebase. `/plan-status` closes that gap. It reads the plan that already exists, cross-references evidence from git and the filesystem, and produces a scannable dashboard in seconds. No setup. No ticket creation. + +The output has two parts: a **gstack Lifecycle Dashboard** showing which workflow skills have run (matching the same format as the Review Readiness Dashboard), and a **Plan Detail** section with per-phase and per-criteria DONE/PARTIAL/REMAINING classification. 
+ +``` +You: /plan-status + +Claude: ## gstack Lifecycle Dashboard + +====================================================================================+ + | GSTACK LIFECYCLE DASHBOARD | + +====================================================================================+ + | Phase | Skill | Runs | Last Run | Status | Required | + |----------|---------------------|------|------------------|----------|-------------| + | Think | /office-hours | 1 | 2026-05-11 22:00 | DONE | no | + | Plan | /plan-ceo-review | 1 | 2026-05-11 22:00 | DONE | no | + | | /plan-eng-review | 0 | — | — | YES | + | Build | (commits) | 4 | — | DONE | YES | + ... + +------------------------------------------------------------------------------------+ + | VERDICT: IN PROGRESS — 1 required skill REMAINING (plan-eng-review) | + +====================================================================================+ + + ## Plan Status: 2026-05-11-migrate-to-ruby-llm.md + Branch: migrate-to-ruby-llm | As of: 2026-05-11 + + ### Phase Summary + | Phase | Status | Notes | + |-------|--------|-------| + | Phase 0: Foundation | DONE | Gems added, initializer present | + | Phase 1: Schema Migration | REMAINING | No migration files found | + + ### Summary + 1 of 2 phases complete. 2 of 4 success criteria met. + Key blockers: langchainrb still in Gemfile, no db/migrate/ files + Suggested next action: Run /plan-eng-review, then write the schema migration. +``` + +Configure a custom plan search path: `gstack-config set plan_glob "~/board-plans/*.md"` — useful for non-code projects like board planning or accreditation prep. + +--- + ## `/open-gstack-browser` This is my **co-presence mode**. diff --git a/gstack/llms.txt b/gstack/llms.txt index 8c5d4a3924..7e19d81fe8 100644 --- a/gstack/llms.txt +++ b/gstack/llms.txt @@ -44,6 +44,7 @@ Conventions: - [/plan-design-review](plan-design-review/SKILL.md): Designer's eye plan review — interactive, like CEO and Eng review. 
- [/plan-devex-review](plan-devex-review/SKILL.md): Interactive developer experience plan review. - [/plan-eng-review](plan-eng-review/SKILL.md): Eng manager-mode plan review. +- [/plan-status](plan-status/SKILL.md): Check progress of a gstack plan against the current codebase and git log. - [/plan-tune](plan-tune/SKILL.md): Self-tuning question sensitivity + developer psychographic for gstack (v1: observational). - [/qa](qa/SKILL.md): Systematically QA test a web application and fix bugs found. - [/qa-only](qa-only/SKILL.md): Report-only QA testing. diff --git a/package.json b/package.json index d4512f5e7d..6f3dd91642 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gstack", - "version": "1.33.2.0", + "version": "1.34.0.0", "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.", "license": "MIT", "type": "module", diff --git a/plan-status/SKILL.md b/plan-status/SKILL.md new file mode 100644 index 0000000000..048ec5f1b3 --- /dev/null +++ b/plan-status/SKILL.md @@ -0,0 +1,728 @@ +--- +name: plan-status +preamble-tier: 1 +version: 1.0.0 +description: | + Check progress of a gstack plan against the current codebase and git log. + Reads a plan file (CEO plan, eng plan, or any gstack plan doc), extracts + phases and success-criteria checkboxes, cross-references git commits and + filesystem state, and produces a done/in-progress/remaining status report. + Use when: "plan status", "how far along is the plan", "what's done", + "what's left", "check plan progress", "where are we on the plan". 
(gstack) +triggers: + - plan status + - how far along + - what's done on the plan + - what's left on the plan + - check plan progress + - where are we on the plan +allowed-tools: + - Bash + - Read + - Glob + - Grep + - AskUserQuestion +--- + + + +## Preamble (run first) + +```bash +_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true) +[ -n "$_UPD" ] && echo "$_UPD" || true +mkdir -p ~/.gstack/sessions +touch ~/.gstack/sessions/"$PPID" +_SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') +find ~/.gstack/sessions -mmin +120 -type f -exec rm {} + 2>/dev/null || true +_PROACTIVE=$(~/.claude/skills/gstack/bin/gstack-config get proactive 2>/dev/null || echo "true") +_PROACTIVE_PROMPTED=$([ -f ~/.gstack/.proactive-prompted ] && echo "yes" || echo "no") +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" +_SKILL_PREFIX=$(~/.claude/skills/gstack/bin/gstack-config get skill_prefix 2>/dev/null || echo "false") +echo "PROACTIVE: $_PROACTIVE" +echo "PROACTIVE_PROMPTED: $_PROACTIVE_PROMPTED" +echo "SKILL_PREFIX: $_SKILL_PREFIX" +source <(~/.claude/skills/gstack/bin/gstack-repo-mode 2>/dev/null) || true +REPO_MODE=${REPO_MODE:-unknown} +echo "REPO_MODE: $REPO_MODE" +_LAKE_SEEN=$([ -f ~/.gstack/.completeness-intro-seen ] && echo "yes" || echo "no") +echo "LAKE_INTRO: $_LAKE_SEEN" +_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || true) +_TEL_PROMPTED=$([ -f ~/.gstack/.telemetry-prompted ] && echo "yes" || echo "no") +_TEL_START=$(date +%s) +_SESSION_ID="$$-$(date +%s)" +echo "TELEMETRY: ${_TEL:-off}" +echo "TEL_PROMPTED: $_TEL_PROMPTED" +_EXPLAIN_LEVEL=$(~/.claude/skills/gstack/bin/gstack-config get explain_level 2>/dev/null || echo "default") +if [ "$_EXPLAIN_LEVEL" != "default" ] && [ "$_EXPLAIN_LEVEL" != "terse" ]; then _EXPLAIN_LEVEL="default"; fi +echo "EXPLAIN_LEVEL: $_EXPLAIN_LEVEL" 
+_QUESTION_TUNING=$(~/.claude/skills/gstack/bin/gstack-config get question_tuning 2>/dev/null || echo "false") +echo "QUESTION_TUNING: $_QUESTION_TUNING" +mkdir -p ~/.gstack/analytics +if [ "$_TEL" != "off" ]; then +echo '{"skill":"plan-status","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true +fi +for _PF in $(find ~/.gstack/analytics -maxdepth 1 -name '.pending-*' 2>/dev/null); do + if [ -f "$_PF" ]; then + if [ "$_TEL" != "off" ] && [ -x ~/.claude/skills/gstack/bin/gstack-telemetry-log ]; then + ~/.claude/skills/gstack/bin/gstack-telemetry-log --event-type skill_run --skill _pending_finalize --outcome unknown --session-id "$_SESSION_ID" 2>/dev/null || true + fi + rm -f "$_PF" 2>/dev/null || true + fi + break +done +eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true +_LEARN_FILE="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}/learnings.jsonl" +if [ -f "$_LEARN_FILE" ]; then + _LEARN_COUNT=$(wc -l < "$_LEARN_FILE" 2>/dev/null | tr -d ' ') + echo "LEARNINGS: $_LEARN_COUNT entries loaded" + if [ "$_LEARN_COUNT" -gt 5 ] 2>/dev/null; then + ~/.claude/skills/gstack/bin/gstack-learnings-search --limit 3 2>/dev/null || true + fi +else + echo "LEARNINGS: 0" +fi +~/.claude/skills/gstack/bin/gstack-timeline-log '{"skill":"plan-status","event":"started","branch":"'"$_BRANCH"'","session":"'"$_SESSION_ID"'"}' 2>/dev/null & +_HAS_ROUTING="no" +if [ -f CLAUDE.md ] && grep -q "## Skill routing" CLAUDE.md 2>/dev/null; then + _HAS_ROUTING="yes" +fi +_ROUTING_DECLINED=$(~/.claude/skills/gstack/bin/gstack-config get routing_declined 2>/dev/null || echo "false") +echo "HAS_ROUTING: $_HAS_ROUTING" +echo "ROUTING_DECLINED: $_ROUTING_DECLINED" +_VENDORED="no" +if [ -d ".claude/skills/gstack" ] && [ ! 
-L ".claude/skills/gstack" ]; then + if [ -f ".claude/skills/gstack/VERSION" ] || [ -d ".claude/skills/gstack/.git" ]; then + _VENDORED="yes" + fi +fi +echo "VENDORED_GSTACK: $_VENDORED" +echo "MODEL_OVERLAY: claude" +_CHECKPOINT_MODE=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_mode 2>/dev/null || echo "explicit") +_CHECKPOINT_PUSH=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_push 2>/dev/null || echo "false") +echo "CHECKPOINT_MODE: $_CHECKPOINT_MODE" +echo "CHECKPOINT_PUSH: $_CHECKPOINT_PUSH" +[ -n "$OPENCLAW_SESSION" ] && echo "SPAWNED_SESSION: true" || true +``` + +## Plan Mode Safe Operations + +In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`codex review`, writes to `~/.gstack/`, writes to the plan file, and `open` for generated artifacts. + +## Skill Invocation During Plan Mode + +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. + +If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" + +If `SKILL_PREFIX` is `"true"`, suggest/invoke `/gstack-*` names. 
Disk paths stay `~/.claude/skills/gstack/[skill-name]/SKILL.md`. + +If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). + +If output shows `JUST_UPGRADED `: print "Running gstack v{to} (just updated!)". If `SPAWNED_SESSION` is true, skip feature discovery. + +Feature discovery, max one prompt per session: +- Missing `~/.claude/skills/gstack/.feature-prompted-continuous-checkpoint`: AskUserQuestion for Continuous checkpoint auto-commits. If accepted, run `~/.claude/skills/gstack/bin/gstack-config set checkpoint_mode continuous`. Always touch marker. +- Missing `~/.claude/skills/gstack/.feature-prompted-model-overlay`: inform "Model overlays are active. MODEL_OVERLAY shows the patch." Always touch marker. + +After upgrade prompts, continue workflow. + +If `WRITING_STYLE_PENDING` is `yes`: ask once about writing style: + +> v1 prompts are simpler: first-use jargon glosses, outcome-framed questions, shorter prose. Keep default or restore terse? + +Options: +- A) Keep the new default (recommended — good writing helps everyone) +- B) Restore V0 prose — set `explain_level: terse` + +If A: leave `explain_level` unset (defaults to `default`). +If B: run `~/.claude/skills/gstack/bin/gstack-config set explain_level terse`. + +Always run (regardless of choice): +```bash +rm -f ~/.gstack/.writing-style-prompt-pending +touch ~/.gstack/.writing-style-prompted +``` + +Skip if `WRITING_STYLE_PENDING` is `no`. + +If `LAKE_INTRO` is `no`: say "gstack follows the **Boil the Lake** principle — do the complete thing when AI makes marginal cost near-zero. Read more: https://garryslist.org/posts/boil-the-ocean" Offer to open: + +```bash +open https://garryslist.org/posts/boil-the-ocean +touch ~/.gstack/.completeness-intro-seen +``` + +Only run `open` if yes. Always run `touch`. 
+ +If `TEL_PROMPTED` is `no` AND `LAKE_INTRO` is `yes`: ask telemetry once via AskUserQuestion: + +> Help gstack get better. Share usage data only: skill, duration, crashes, stable device ID. No code, file paths, or repo names. + +Options: +- A) Help gstack get better! (recommended) +- B) No thanks + +If A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry community` + +If B: ask follow-up: + +> Anonymous mode sends only aggregate usage, no unique ID. + +Options: +- A) Sure, anonymous is fine +- B) No thanks, fully off + +If B→A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry anonymous` +If B→B: run `~/.claude/skills/gstack/bin/gstack-config set telemetry off` + +Always run: +```bash +touch ~/.gstack/.telemetry-prompted +``` + +Skip if `TEL_PROMPTED` is `yes`. + +If `PROACTIVE_PROMPTED` is `no` AND `TEL_PROMPTED` is `yes`: ask once: + +> Let gstack proactively suggest skills, like /qa for "does this work?" or /investigate for bugs? + +Options: +- A) Keep it on (recommended) +- B) Turn it off — I'll type /commands myself + +If A: run `~/.claude/skills/gstack/bin/gstack-config set proactive true` +If B: run `~/.claude/skills/gstack/bin/gstack-config set proactive false` + +Always run: +```bash +touch ~/.gstack/.proactive-prompted +``` + +Skip if `PROACTIVE_PROMPTED` is `yes`. + +If `HAS_ROUTING` is `no` AND `ROUTING_DECLINED` is `false` AND `PROACTIVE_PROMPTED` is `yes`: +Check if a CLAUDE.md file exists in the project root. If it does not exist, create it. + +Use AskUserQuestion: + +> gstack works best when your project's CLAUDE.md includes skill routing rules. + +Options: +- A) Add routing rules to CLAUDE.md (recommended) +- B) No thanks, I'll invoke skills manually + +If A: Append this section to the end of CLAUDE.md: + +```markdown + +## Skill routing + +When the user's request matches an available skill, invoke it via the Skill tool. When in doubt, invoke the skill. 
+ +Key routing rules: +- Product ideas/brainstorming → invoke /office-hours +- Strategy/scope → invoke /plan-ceo-review +- Architecture → invoke /plan-eng-review +- Design system/plan review → invoke /design-consultation or /plan-design-review +- Full review pipeline → invoke /autoplan +- Bugs/errors → invoke /investigate +- QA/testing site behavior → invoke /qa or /qa-only +- Code review/diff check → invoke /review +- Visual polish → invoke /design-review +- Ship/deploy/PR → invoke /ship or /land-and-deploy +- Save progress → invoke /context-save +- Resume context → invoke /context-restore +``` + +Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"` + +If B: run `~/.claude/skills/gstack/bin/gstack-config set routing_declined true` and say they can re-enable with `gstack-config set routing_declined false`. + +This only happens once per project. Skip if `HAS_ROUTING` is `yes` or `ROUTING_DECLINED` is `true`. + +If `VENDORED_GSTACK` is `yes`, warn once via AskUserQuestion unless `~/.gstack/.vendoring-warned-$SLUG` exists: + +> This project has gstack vendored in `.claude/skills/gstack/`. Vendoring is deprecated. +> Migrate to team mode? + +Options: +- A) Yes, migrate to team mode now +- B) No, I'll handle it myself + +If A: +1. Run `git rm -r .claude/skills/gstack/` +2. Run `echo '.claude/skills/gstack/' >> .gitignore` +3. Run `~/.claude/skills/gstack/bin/gstack-team-init required` (or `optional`) +4. Run `git add .claude/ .gitignore CLAUDE.md && git commit -m "chore: migrate gstack from vendored to team mode"` +5. Tell the user: "Done. Each developer now runs: `cd ~/.claude/skills/gstack && ./setup --team`" + +If B: say "OK, you're on your own to keep the vendored copy up to date." + +Always run (regardless of choice): +```bash +eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true +touch ~/.gstack/.vendoring-warned-${SLUG:-unknown} +``` + +If marker exists, skip. 
+ +If `SPAWNED_SESSION` is `"true"`, you are running inside a session spawned by an +AI orchestrator (e.g., OpenClaw). In spawned sessions: +- Do NOT use AskUserQuestion for interactive prompts. Auto-choose the recommended option. +- Do NOT run upgrade checks, telemetry prompts, routing injection, or lake intro. +- Focus on completing the task and reporting results via prose output. +- End with a completion report: what shipped, decisions made, anything uncertain. + +## Artifacts Sync (skill start) + +```bash +_GSTACK_HOME="${GSTACK_HOME:-$HOME/.gstack}" +# Prefer the v1.27.0.0 artifacts file; fall back to brain file for users +# upgrading mid-stream before the migration script runs. +if [ -f "$HOME/.gstack-artifacts-remote.txt" ]; then + _BRAIN_REMOTE_FILE="$HOME/.gstack-artifacts-remote.txt" +else + _BRAIN_REMOTE_FILE="$HOME/.gstack-brain-remote.txt" +fi +# Use $HOME here: a tilde inside double quotes is not expanded by the shell. +_BRAIN_SYNC_BIN="$HOME/.claude/skills/gstack/bin/gstack-brain-sync" +_BRAIN_CONFIG_BIN="$HOME/.claude/skills/gstack/bin/gstack-config" + +# /sync-gbrain context-load: teach the agent to use gbrain when it's available. +# Per-worktree pin: post-spike redesign uses kubectl-style `.gbrain-source` in the +# git toplevel to scope queries. Look for the pin in the worktree (not a global +# state file) so that opening worktree B without a pin doesn't claim "indexed" +# just because worktree A was synced. Empty string when gbrain is not +# configured (zero context cost for non-gbrain users). +_GBRAIN_CONFIG="$HOME/.gbrain/config.json" +if [ -f "$_GBRAIN_CONFIG" ] && command -v gbrain >/dev/null 2>&1; then + _GBRAIN_VERSION_OK=$(gbrain --version 2>/dev/null | grep -c '^gbrain ' || echo 0) + if [ "$_GBRAIN_VERSION_OK" -gt 0 ] 2>/dev/null; then + _GBRAIN_PIN_PATH="" + _REPO_TOP=$(git rev-parse --show-toplevel 2>/dev/null || echo "") + if [ -n "$_REPO_TOP" ] && [ -f "$_REPO_TOP/.gbrain-source" ]; then + _GBRAIN_PIN_PATH="$_REPO_TOP/.gbrain-source" + fi + if [ -n "$_GBRAIN_PIN_PATH" ]; then + echo "GBrain configured. 
Prefer \`gbrain search\`/\`gbrain query\` over Grep for" + echo "semantic questions; use \`gbrain code-def\`/\`code-refs\`/\`code-callers\` for" + echo "symbol-aware code lookup. See \"## GBrain Search Guidance\" in CLAUDE.md." + echo "Run /sync-gbrain to refresh." + else + echo "GBrain configured but this worktree isn't pinned yet. Run \`/sync-gbrain --full\`" + echo "before relying on \`gbrain search\` for code questions in this worktree." + echo "Falls back to Grep until pinned." + fi + fi +fi + +_BRAIN_SYNC_MODE=$("$_BRAIN_CONFIG_BIN" get artifacts_sync_mode 2>/dev/null || echo off) + +# Detect remote-MCP mode (Path 4 of /setup-gbrain). Local artifacts sync is +# a no-op in remote mode; the brain server pulls from GitHub/GitLab on its +# own cadence. Read claude.json directly to keep this preamble fast (no +# subprocess to claude CLI on every skill start). +_GBRAIN_MCP_MODE="none" +if command -v jq >/dev/null 2>&1 && [ -f "$HOME/.claude.json" ]; then + _GBRAIN_MCP_TYPE=$(jq -r '.mcpServers.gbrain.type // .mcpServers.gbrain.transport // empty' "$HOME/.claude.json" 2>/dev/null) + case "$_GBRAIN_MCP_TYPE" in + url|http|sse) _GBRAIN_MCP_MODE="remote-http" ;; + stdio) _GBRAIN_MCP_MODE="local-stdio" ;; + esac +fi + +if [ -f "$_BRAIN_REMOTE_FILE" ] && [ ! 
-d "$_GSTACK_HOME/.git" ] && [ "$_BRAIN_SYNC_MODE" = "off" ]; then + _BRAIN_NEW_URL=$(head -1 "$_BRAIN_REMOTE_FILE" 2>/dev/null | tr -d '[:space:]') + if [ -n "$_BRAIN_NEW_URL" ]; then + echo "ARTIFACTS_SYNC: artifacts repo detected: $_BRAIN_NEW_URL" + echo "ARTIFACTS_SYNC: run 'gstack-brain-restore' to pull your cross-machine artifacts (or 'gstack-config set artifacts_sync_mode off' to dismiss forever)" + fi +fi + +if [ -d "$_GSTACK_HOME/.git" ] && [ "$_BRAIN_SYNC_MODE" != "off" ]; then + _BRAIN_LAST_PULL_FILE="$_GSTACK_HOME/.brain-last-pull" + _BRAIN_NOW=$(date +%s) + _BRAIN_DO_PULL=1 + if [ -f "$_BRAIN_LAST_PULL_FILE" ]; then + _BRAIN_LAST=$(cat "$_BRAIN_LAST_PULL_FILE" 2>/dev/null || echo 0) + _BRAIN_AGE=$(( _BRAIN_NOW - _BRAIN_LAST )) + [ "$_BRAIN_AGE" -lt 86400 ] && _BRAIN_DO_PULL=0 + fi + if [ "$_BRAIN_DO_PULL" = "1" ]; then + ( cd "$_GSTACK_HOME" && git fetch origin >/dev/null 2>&1 && git merge --ff-only "origin/$(git rev-parse --abbrev-ref HEAD)" >/dev/null 2>&1 ) || true + echo "$_BRAIN_NOW" > "$_BRAIN_LAST_PULL_FILE" + fi + "$_BRAIN_SYNC_BIN" --once 2>/dev/null || true +fi + +if [ "$_GBRAIN_MCP_MODE" = "remote-http" ]; then + # Remote-MCP mode: local artifacts sync is a no-op (brain admin's server + # pulls from GitHub/GitLab). Show the user this is by design, not broken. 
+ _GBRAIN_HOST=$(jq -r '.mcpServers.gbrain.url // empty' "$HOME/.claude.json" 2>/dev/null | sed -E 's|^https?://([^/:]+).*|\1|') + echo "ARTIFACTS_SYNC: remote-mode (managed by brain server ${_GBRAIN_HOST:-remote})" +elif [ -d "$_GSTACK_HOME/.git" ] && [ "$_BRAIN_SYNC_MODE" != "off" ]; then + _BRAIN_QUEUE_DEPTH=0 + [ -f "$_GSTACK_HOME/.brain-queue.jsonl" ] && _BRAIN_QUEUE_DEPTH=$(wc -l < "$_GSTACK_HOME/.brain-queue.jsonl" | tr -d ' ') + _BRAIN_LAST_PUSH="never" + [ -f "$_GSTACK_HOME/.brain-last-push" ] && _BRAIN_LAST_PUSH=$(cat "$_GSTACK_HOME/.brain-last-push" 2>/dev/null || echo never) + echo "ARTIFACTS_SYNC: mode=$_BRAIN_SYNC_MODE | last_push=$_BRAIN_LAST_PUSH | queue=$_BRAIN_QUEUE_DEPTH" +else + echo "ARTIFACTS_SYNC: off" +fi +``` + + + +Privacy stop-gate: if output shows `ARTIFACTS_SYNC: off`, `artifacts_sync_mode_prompted` is `false`, and gbrain is on PATH or `gbrain doctor --fast --json` works, ask once: + +> gstack can publish your artifacts (CEO plans, designs, reports) to a private GitHub repo that GBrain indexes across machines. How much should sync? + +Options: +- A) Everything allowlisted (recommended) +- B) Only artifacts +- C) Decline, keep everything local + +After answer: + +```bash +# Chosen mode: full | artifacts-only | off +"$_BRAIN_CONFIG_BIN" set artifacts_sync_mode +"$_BRAIN_CONFIG_BIN" set artifacts_sync_mode_prompted true +``` + +If A/B and `~/.gstack/.git` is missing, ask whether to run `gstack-artifacts-init`. Do not block the skill. + +At skill END before telemetry: + +```bash +~/.claude/skills/gstack/bin/gstack-brain-sync --discover-new 2>/dev/null || true +~/.claude/skills/gstack/bin/gstack-brain-sync --once 2>/dev/null || true +``` + + +## Model-Specific Behavioral Patch (claude) + +The following nudges are tuned for the claude model family. They are +**subordinate** to skill workflow, STOP points, AskUserQuestion gates, plan-mode +safety, and /ship review gates. If a nudge below conflicts with skill instructions, +the skill wins. 
Treat these as preferences, not rules. + +**Todo-list discipline.** When working through a multi-step plan, mark each task +complete individually as you finish it. Do not batch-complete at the end. If a task +turns out to be unnecessary, mark it skipped with a one-line reason. + +**Think before heavy actions.** For complex operations (refactors, migrations, +non-trivial new features), briefly state your approach before executing. This lets +the user course-correct cheaply instead of mid-flight. + +**Dedicated tools over Bash.** Prefer Read, Edit, Write, Glob, Grep over shell +equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer. + +## Voice + +Direct, concrete, builder-to-builder. Name the file, function, command, and user-visible impact. No filler. + +No em dashes. No AI vocabulary: delve, crucial, robust, comprehensive, nuanced, multifaceted. Never corporate or academic. Short paragraphs. End with what to do. + +The user has context you do not. Cross-model agreement is a recommendation, not a decision. The user decides. + +## Completion Status Protocol + +When completing a skill workflow, report status using one of: +- **DONE** — completed with evidence. +- **DONE_WITH_CONCERNS** — completed, but list concerns. +- **BLOCKED** — cannot proceed; state blocker and what was tried. +- **NEEDS_CONTEXT** — missing info; state exactly what is needed. + +Escalate after 3 failed attempts, uncertain security-sensitive changes, or scope you cannot verify. Format: `STATUS`, `REASON`, `ATTEMPTED`, `RECOMMENDATION`. + +## Operational Self-Improvement + +Before completing, if you discovered a durable project quirk or command fix that would save 5+ minutes next time, log it: + +```bash +~/.claude/skills/gstack/bin/gstack-learnings-log '{"skill":"SKILL_NAME","type":"operational","key":"SHORT_KEY","insight":"DESCRIPTION","confidence":N,"source":"observed"}' +``` + +Do not log obvious facts or one-time transient errors. 
+ +## Telemetry (run last) + +After workflow completion, log telemetry. Use skill `name:` from frontmatter. OUTCOME is success/error/abort/unknown. + +**PLAN MODE EXCEPTION — ALWAYS RUN:** This command writes telemetry to +`~/.gstack/analytics/`, matching preamble analytics writes. + +Run this bash: + +```bash +_TEL_END=$(date +%s) +_TEL_DUR=$(( _TEL_END - _TEL_START )) +rm -f ~/.gstack/analytics/.pending-"$_SESSION_ID" 2>/dev/null || true +# Session timeline: record skill completion (local-only, never sent anywhere) +~/.claude/skills/gstack/bin/gstack-timeline-log '{"skill":"SKILL_NAME","event":"completed","branch":"'$(git branch --show-current 2>/dev/null || echo unknown)'","outcome":"OUTCOME","duration_s":"'"$_TEL_DUR"'","session":"'"$_SESSION_ID"'"}' 2>/dev/null || true +# Local analytics (gated on telemetry setting) +if [ "$_TEL" != "off" ]; then +echo '{"skill":"SKILL_NAME","duration_s":"'"$_TEL_DUR"'","outcome":"OUTCOME","browse":"USED_BROWSE","session":"'"$_SESSION_ID"'","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true +fi +# Remote telemetry (opt-in, requires binary) +if [ "$_TEL" != "off" ] && [ -x ~/.claude/skills/gstack/bin/gstack-telemetry-log ]; then + ~/.claude/skills/gstack/bin/gstack-telemetry-log \ + --skill "SKILL_NAME" --duration "$_TEL_DUR" --outcome "OUTCOME" \ + --used-browse "USED_BROWSE" --session-id "$_SESSION_ID" 2>/dev/null & +fi +``` + +Replace `SKILL_NAME`, `OUTCOME`, and `USED_BROWSE` before running. + +## Plan Status Footer + +In plan mode before ExitPlanMode: if the plan file lacks `## GSTACK REVIEW REPORT`, run `~/.claude/skills/gstack/bin/gstack-review-read` and append the standard runs/status/findings table. With `NO_REVIEWS` or empty, append a 5-row placeholder with verdict "NO REVIEWS YET — run `/autoplan`". If a richer report exists, skip. + +PLAN MODE EXCEPTION — always allowed (it's the plan file). 
+ +## Plan-status Setup + +```bash +eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" && mkdir -p ~/.gstack/projects/$SLUG +_REPO=$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown") +_PLAN_GLOB=$(~/.claude/skills/gstack/bin/gstack-config get plan_glob 2>/dev/null || echo "") +_PLAN_DIR="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}" +echo "SLUG_RESOLVED: ${SLUG:-unknown}" +echo "PLAN_DIR: $_PLAN_DIR" +if [ -n "$_PLAN_GLOB" ]; then + _PLANS=$(ls -t $_PLAN_GLOB 2>/dev/null | head -10) +else + _PLANS=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | sort -r | head -10) +fi +echo "AVAILABLE_PLANS:" +if [ -z "$_PLANS" ]; then + echo "No plans found in $_PLAN_DIR. Run /office-hours or /plan-ceo-review to create one." +else + echo "$_PLANS" +fi +_GIT_LOG=$(git log --oneline $(git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null)..HEAD 2>/dev/null | head -30) +echo "BRANCH_COMMITS:" +echo "$_GIT_LOG" +_ANALYTICS="$HOME/.gstack/analytics/skill-usage.jsonl" +echo "SKILL_ANALYTICS:" +if [ -f "$_ANALYTICS" ] && [ -n "$_REPO" ] && [ "$_REPO" != "unknown" ]; then + grep "\"repo\":\"$_REPO\"" "$_ANALYTICS" 2>/dev/null | tail -200 +else + echo "none" +fi +``` + +Note: `plan_glob` is read from `~/.gstack/config.yaml` via `gstack-config` — not from environment variables or user input. Configure it with: `~/.claude/skills/gstack/bin/gstack-config set plan_glob "~/board-plans/*.md"`. Paths with spaces in directory names are not supported in v1. + +## Plan Mode Safe Operations + +In plan mode, allowed because they inform the plan: reads from `~/.gstack/`, reads from the codebase, `git log`, `git diff`, `find`, `ls`. + +## Skill Invocation During Plan Mode + +If the user invokes this skill in plan mode, follow it step by step. This skill is read-only — it never edits files. AskUserQuestion satisfies plan mode's end-of-turn requirement if no plan file is auto-resolved. 
+ +--- + +## Step 0 — Resolve the plan file + +If the user named a specific plan file in their message, use that path directly. + +Otherwise, use the `AVAILABLE_PLANS` list from the Setup block to pick the most relevant plan: +- If only one plan exists for this project, use it. +- If the current branch name appears in a plan filename, prefer that one. +- If multiple plans exist with no clear match, use AskUserQuestion to let the user pick. Present a numbered list of the plan filenames and ask which one to check. This is the only case where AskUserQuestion is needed in this skill — keep the prompt simple (no D format required). +- If `AVAILABLE_PLANS` is empty, stop and display the empty-state message from Setup. + +Read the resolved plan file in full before proceeding. + +## Step 1 — Extract structure + +From the plan file, identify: + +1. **Phases** — sections headed `### Phase N` or `## Phase N` or similar. Note the name and summary of each. +2. **Success criteria checkboxes** — lines matching `- [ ]` (incomplete) or `- [x]` (complete). These are ground truth from the plan author. +3. **File/directory references** — any `app/`, `config/`, `db/`, `test/` paths mentioned in the plan that we can verify exist or not. +4. **Gem/dependency references** — any gem names mentioned as "add" or "remove" that we can verify in Gemfile. +5. **Pre-migration checklist items** — if present, extract as a separate list. + +## Step 2 — Gather evidence + +Run these checks. Do them in parallel where possible. 
+ +**Git evidence:** +```bash +# Commits on this branch since diverging from main +git log --oneline $(git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null)..HEAD 2>/dev/null + +# Files changed on this branch +git diff --name-only $(git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null)..HEAD 2>/dev/null +``` + +**Filesystem evidence** — for each file/directory the plan says should exist or be deleted, check: +```bash +# Check existence of files the plan says should exist +ls 2>/dev/null || echo "MISSING: " + +# Check deletion of files the plan says should be gone +ls 2>/dev/null && echo "STILL EXISTS: " || echo "DELETED: " +``` + +**Gemfile evidence** — for each gem the plan says to add or remove: +```bash +grep -E "gem .\"\"" Gemfile 2>/dev/null && echo "PRESENT" || echo "ABSENT" +``` + +**Config/initializer evidence** — for any initializers or config the plan says to create: +```bash +ls config/initializers/.rb 2>/dev/null && echo "EXISTS" || echo "MISSING" +``` + +## Step 3 — Classify each item + +For each phase and each success-criteria checkbox, assign one of: + +- **DONE** — evidence confirms it: file exists/deleted as expected, gem added/removed, commit message matches, checkbox already `[x]` +- **PARTIAL** — some evidence but not complete (e.g. file exists but a required method is missing) +- **REMAINING** — no evidence; checkbox still `[ ]` and no corroborating git/filesystem signal +- **DROPPED** — plan explicitly marks it dropped/removed/n/a + +**Classification rules:** +- Use the evidence conservatively. If you're uncertain, mark REMAINING rather than DONE — a false positive is worse than a false negative. +- File existence alone is not sufficient to mark a planning artifact DONE. Require a content signal: a named section heading matching the plan's description, a completion marker, or a matching `[x]` checkbox within the file. +- Commit message matching is a strong positive signal. 
If the commit message explicitly names the deliverable, weight it heavily. +- A `[x]` checkbox in the plan file is ground truth from the plan author — mark DONE regardless of other evidence. + +## Step 4 — Produce the status report + +Output a structured report in three sections. + +### Section A: gstack Lifecycle Dashboard + +Parse the `SKILL_ANALYTICS` output from the Setup block. Each line is a JSON object: +`{"skill":"...","ts":"2026-05-11T22:00:00Z","repo":"..."}`. + +For each canonical skill in the table below, count occurrences and find the most recent `ts` (format as `YYYY-MM-DD HH:MM`). For the `(commits)` row, use the `BRANCH_COMMITS` count from the Setup block. + +**Status per row:** +- 0 runs: Status = `—`, Last Run = `—` +- > 0 runs: Status = `DONE`, Last Run = most recent timestamp +- `(commits)` row: Status = `DONE` if commits > 0, `—` if 0; Last Run = `—` always + +**Required column:** +- `YES` for: `/plan-eng-review`, `(commits)`, `/review`, `/ship` +- `no` for all others + +**VERDICT logic:** +- Count required YES rows where Status = `—` (not yet run), including `(commits)` if 0 commits +- 0 REMAINING: `CLEARED — all required lifecycle steps complete` +- All required REMAINING (nothing run): `NOT STARTED` +- Otherwise: `IN PROGRESS — N required skill(s) REMAINING: [names]` + +Display: + +``` ++====================================================================================+ +| GSTACK LIFECYCLE DASHBOARD | ++====================================================================================+ +| Phase | Skill | Runs | Last Run | Status | Required | +|----------|---------------------|------|------------------|----------|-------------| +| Think | /office-hours | 0 | — | — | no | +| Plan | /plan-ceo-review | 0 | — | — | no | +| | /plan-eng-review | 0 | — | — | YES | +| | /plan-design-review | 0 | — | — | no | +| | /plan-devex-review | 0 | — | — | no | +| | /autoplan | 0 | — | — | no | +| Build | (commits) | 0 | — | — | YES | +| Review | /review | 
0 | — | — | YES | +| | /design-review | 0 | — | — | no | +| Test | /qa | 0 | — | — | no | +| | /qa-only | 0 | — | — | no | +| | /benchmark | 0 | — | — | no | +| Ship | /ship | 0 | — | — | YES | +| | /land-and-deploy | 0 | — | — | no | +| Reflect | /retro | 0 | — | — | no | +| | /document-release | 0 | — | — | no | ++------------------------------------------------------------------------------------+ +| VERDICT: NOT STARTED | ++====================================================================================+ +``` + +Fill in real values from the analytics data; the table above shows the all-zero baseline shape. + +### Section B: Other Observed Skills + +After building the lifecycle dashboard, collect any skill names from `SKILL_ANALYTICS` that are NOT in the canonical skill list above and NOT `plan-status` itself. Display a companion table: + +``` ++================================================+ +| OTHER OBSERVED SKILLS | +| (not yet mapped to lifecycle dashboard) | ++================================================+ +| Skill | Runs | Last Run | +|--------------------+------+--------------------| +| /investigate | 2 | 2026-05-10 14:00 | +| /codex | 1 | 2026-05-09 11:00 | ++------------------------------------------------+ +| 2 skills observed | ++================================================+ +``` + +If no unlisted skills appear: +``` +Other observed skills: none — all analytics entries are covered by the lifecycle dashboard above. +``` + +This table is self-maintaining: any new gstack skill that gets invoked on the project surfaces here automatically. 
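A rough shell sketch of the Section B collection step follows. It assumes each analytics line carries a `"skill":"name"` field without a leading slash, as the telemetry block suggests. The function name `other_skills` and the hard-coded canonical list are illustrative and not part of gstack, and the Last Run column is left out to keep the sketch short.

```bash
# Hypothetical helper (not shipped with gstack).
# Usage: other_skills < skill-usage.jsonl
# Prints "/<skill> <runs>" for every skill that is neither in the lifecycle
# dashboard nor plan-status itself, sorted by skill name.
other_skills() {
  canonical="office-hours plan-ceo-review plan-eng-review plan-design-review plan-devex-review autoplan review design-review qa qa-only benchmark ship land-and-deploy retro document-release plan-status"
  # Pull the skill name out of each JSON line, then count runs per skill.
  sed -n 's/.*"skill":"\([^"]*\)".*/\1/p' | sort | uniq -c |
    while read -r count name; do
      case " $canonical " in
        *" $name "*) ;;                             # covered by the dashboard: skip
        *) printf '/%s %s\n' "$name" "$count" ;;
      esac
    done
}
```

An empty result corresponds to the "Other observed skills: none" line in the report.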
+ +### Section C: Plan Detail + +``` +## Plan Status: +Branch: | As of: + +### Phase Summary +| Phase | Status | Notes | +|-------|--------|-------| +| Phase 0: Foundation | DONE | All gems added, initializers present | +| Phase 1: Schema | DONE | Migrations in db/migrate/ | +| Phase 7: RAG migration | REMAINING | NotImplementedError stubs; no RubyLLM.embed calls | + +### Success Criteria +| Item | Status | +|------|--------| +| [ ] ruby_llm install generator accepted | DONE | +| [ ] langchainrb removed from Gemfile | REMAINING | +| [ ] Cross-tenant scoping test passes | REMAINING | +| [ ] ActsAsTenant require_tenant = true | PARTIAL — dev only, not production | + +### Branch Commits +| Hash | Message | In Scope | +|---------|------------------------------------------|----------| +| a1b2c3d | feat: add ruby_llm gem and initializer | YES | +| e4f5g6h | chore: remove langchainrb from Gemfile | YES | +| i7j8k9l | fix: unrelated bug in user controller | NO | + +### Pre-Migration Checklist (if present) +List any checklist items from the plan and their resolved status. + +### Summary +X of Y phases complete. Z of W success criteria met. +Key blockers: +Suggested next action: +``` + +**Notes column:** one phrase, not a sentence. Goal is a scannable dashboard, not a prose report. + +**Branch Commits:** For each commit in `BRANCH_COMMITS`, classify `In Scope` as YES if the commit message or changed files directly correspond to a plan deliverable (phase name, file reference, or gem reference from Step 1). Mark NO for commits unrelated to the plan. + +## Step 5 — Offer next action + +After the report, offer one of: +- "Want me to start on [top REMAINING item]?" +- "Want me to update the plan file's checkboxes to reflect what's done?" +- Nothing, if the plan is 100% complete — just say so. + +Do not auto-start work. User decides. 
diff --git a/plan-status/SKILL.md.tmpl b/plan-status/SKILL.md.tmpl new file mode 100644 index 0000000000..8c5fbbeb73 --- /dev/null +++ b/plan-status/SKILL.md.tmpl @@ -0,0 +1,272 @@ +--- +name: plan-status +preamble-tier: 1 +version: 1.0.0 +description: | + Check progress of a gstack plan against the current codebase and git log. + Reads a plan file (CEO plan, eng plan, or any gstack plan doc), extracts + phases and success-criteria checkboxes, cross-references git commits and + filesystem state, and produces a done/in-progress/remaining status report. + Use when: "plan status", "how far along is the plan", "what's done", + "what's left", "check plan progress", "where are we on the plan". (gstack) +triggers: + - plan status + - how far along + - what's done on the plan + - what's left on the plan + - check plan progress + - where are we on the plan +allowed-tools: + - Bash + - Read + - Glob + - Grep + - AskUserQuestion +--- + +{{PREAMBLE}} + +## Plan-status Setup + +```bash +{{SLUG_SETUP}} +_REPO=$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown") +_PLAN_GLOB=$(~/.claude/skills/gstack/bin/gstack-config get plan_glob 2>/dev/null || echo "") +_PLAN_DIR="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}" +echo "SLUG_RESOLVED: ${SLUG:-unknown}" +echo "PLAN_DIR: $_PLAN_DIR" +if [ -n "$_PLAN_GLOB" ]; then + _PLANS=$(ls -t $_PLAN_GLOB 2>/dev/null | head -10) +else + _PLANS=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | sort -r | head -10) +fi +echo "AVAILABLE_PLANS:" +if [ -z "$_PLANS" ]; then + echo "No plans found in $_PLAN_DIR. Run /office-hours or /plan-ceo-review to create one." 
+else + echo "$_PLANS" +fi +_GIT_LOG=$(git log --oneline $(git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null)..HEAD 2>/dev/null | head -30) +echo "BRANCH_COMMITS:" +echo "$_GIT_LOG" +_ANALYTICS="$HOME/.gstack/analytics/skill-usage.jsonl" +echo "SKILL_ANALYTICS:" +if [ -f "$_ANALYTICS" ] && [ -n "$_REPO" ] && [ "$_REPO" != "unknown" ]; then + grep "\"repo\":\"$_REPO\"" "$_ANALYTICS" 2>/dev/null | tail -200 +else + echo "none" +fi +``` + +Note: `plan_glob` is read from `~/.gstack/config.yaml` via `gstack-config` — not from environment variables or user input. Configure it with: `~/.claude/skills/gstack/bin/gstack-config set plan_glob "~/board-plans/*.md"`. Paths with spaces in directory names are not supported in v1. + +## Plan Mode Safe Operations + +In plan mode, allowed because they inform the plan: reads from `~/.gstack/`, reads from the codebase, `git log`, `git diff`, `find`, `ls`. + +## Skill Invocation During Plan Mode + +If the user invokes this skill in plan mode, follow it step by step. This skill is read-only — it never edits files. AskUserQuestion satisfies plan mode's end-of-turn requirement if no plan file is auto-resolved. + +--- + +## Step 0 — Resolve the plan file + +If the user named a specific plan file in their message, use that path directly. + +Otherwise, use the `AVAILABLE_PLANS` list from the Setup block to pick the most relevant plan: +- If only one plan exists for this project, use it. +- If the current branch name appears in a plan filename, prefer that one. +- If multiple plans exist with no clear match, use AskUserQuestion to let the user pick. Present a numbered list of the plan filenames and ask which one to check. This is the only case where AskUserQuestion is needed in this skill — keep the prompt simple (no D format required). +- If `AVAILABLE_PLANS` is empty, stop and display the empty-state message from Setup. + +Read the resolved plan file in full before proceeding. 
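The Step 0 selection rules can be sketched as a small shell function. This is only an illustration under stated assumptions: `pick_plan` is a hypothetical name, the branch is matched against the plan's basename with `/` turned into `-` (borrowed from the `/ship` pre-flight check), and the `ASK` and `NONE` sentinels stand in for the AskUserQuestion and empty-state paths.

```bash
# Hypothetical helper (not shipped with gstack).
# Usage: pick_plan BRANCH < list-of-plan-paths
# Prints NONE (no plans), ASK (ambiguous), or the chosen plan path.
pick_plan() {
  local branch="$1" plans count needle matches
  plans=$(grep -v '^[[:space:]]*$' || true)          # drop blank lines
  if [ -z "$plans" ]; then echo "NONE"; return 0; fi
  count=$(printf '%s\n' "$plans" | wc -l | tr -d ' ')
  if [ "$count" -eq 1 ]; then printf '%s\n' "$plans"; return 0; fi
  needle=$(printf '%s' "$branch" | tr '/' '-')
  # Keep only plans whose filename contains the branch name.
  matches=$(printf '%s\n' "$plans" | while IFS= read -r p; do
    case "$(basename "$p")" in
      *"$needle"*) printf '%s\n' "$p" ;;
    esac
  done)
  if [ -n "$matches" ] && [ "$(printf '%s\n' "$matches" | wc -l | tr -d ' ')" -eq 1 ]; then
    printf '%s\n' "$matches"
  else
    echo "ASK"                                       # zero or several matches: let the user pick
  fi
}
```

Returning a sentinel instead of prompting keeps the function read-only, in line with the skill's plan-mode guarantees.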
+ +## Step 1 — Extract structure + +From the plan file, identify: + +1. **Phases** — sections headed `### Phase N` or `## Phase N` or similar. Note the name and summary of each. +2. **Success criteria checkboxes** — lines matching `- [ ]` (incomplete) or `- [x]` (complete). These are ground truth from the plan author. +3. **File/directory references** — any `app/`, `config/`, `db/`, `test/` paths mentioned in the plan that we can verify exist or not. +4. **Gem/dependency references** — any gem names mentioned as "add" or "remove" that we can verify in Gemfile. +5. **Pre-migration checklist items** — if present, extract as a separate list. + +## Step 2 — Gather evidence + +Run these checks. Do them in parallel where possible. + +**Git evidence:** +```bash +# Commits on this branch since diverging from main +git log --oneline $(git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null)..HEAD 2>/dev/null + +# Files changed on this branch +git diff --name-only $(git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null)..HEAD 2>/dev/null +``` + +**Filesystem evidence** — for each file/directory the plan says should exist or be deleted, check: +```bash +# Check existence of files the plan says should exist +ls 2>/dev/null || echo "MISSING: " + +# Check deletion of files the plan says should be gone +ls 2>/dev/null && echo "STILL EXISTS: " || echo "DELETED: " +``` + +**Gemfile evidence** — for each gem the plan says to add or remove: +```bash +grep -E "gem .\"\"" Gemfile 2>/dev/null && echo "PRESENT" || echo "ABSENT" +``` + +**Config/initializer evidence** — for any initializers or config the plan says to create: +```bash +ls config/initializers/.rb 2>/dev/null && echo "EXISTS" || echo "MISSING" +``` + +## Step 3 — Classify each item + +For each phase and each success-criteria checkbox, assign one of: + +- **DONE** — evidence confirms it: file exists/deleted as expected, gem added/removed, commit message matches, checkbox already 
`[x]` +- **PARTIAL** — some evidence but not complete (e.g. file exists but a required method is missing) +- **REMAINING** — no evidence; checkbox still `[ ]` and no corroborating git/filesystem signal +- **DROPPED** — plan explicitly marks it dropped/removed/n/a + +**Classification rules:** +- Use the evidence conservatively. If you're uncertain, mark REMAINING rather than DONE — a false positive is worse than a false negative. +- File existence alone is not sufficient to mark a planning artifact DONE. Require a content signal: a named section heading matching the plan's description, a completion marker, or a matching `[x]` checkbox within the file. +- Commit message matching is a strong positive signal. If the commit message explicitly names the deliverable, weight it heavily. +- A `[x]` checkbox in the plan file is ground truth from the plan author — mark DONE regardless of other evidence. + +## Step 4 — Produce the status report + +Output a structured report in three sections. + +### Section A: gstack Lifecycle Dashboard + +Parse the `SKILL_ANALYTICS` output from the Setup block. Each line is a JSON object: +`{"skill":"...","ts":"2026-05-11T22:00:00Z","repo":"..."}`. + +For each canonical skill in the table below, count occurrences and find the most recent `ts` (format as `YYYY-MM-DD HH:MM`). For the `(commits)` row, use the `BRANCH_COMMITS` count from the Setup block. 
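For the Last Run column, a minimal sketch of the timestamp lookup might look like this. It assumes `ts` values are UTC ISO-8601 strings like the sample above, so a plain lexical sort orders them correctly. `last_run` is a hypothetical helper name.

```bash
# Hypothetical helper (not shipped with gstack).
# Usage: last_run SKILL < skill-usage.jsonl
# Prints the most recent run as "YYYY-MM-DD HH:MM", or nothing if never run.
last_run() {
  grep "\"skill\":\"$1\"" |
    sed -n 's/.*"ts":"\([0-9-]*\)T\([0-9]*:[0-9]*\).*/\1 \2/p' |
    sort | tail -1
}
```

The run count for the same row is the number of lines the `grep` stage matches.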
+ +**Status per row:** +- 0 runs: Status = `—`, Last Run = `—` +- > 0 runs: Status = `DONE`, Last Run = most recent timestamp +- `(commits)` row: Status = `DONE` if commits > 0, `—` if 0; Last Run = `—` always + +**Required column:** +- `YES` for: `/plan-eng-review`, `(commits)`, `/review`, `/ship` +- `no` for all others + +**VERDICT logic:** +- Count required YES rows where Status = `—` (not yet run), including `(commits)` if 0 commits +- 0 REMAINING: `CLEARED — all required lifecycle steps complete` +- All required REMAINING (nothing run): `NOT STARTED` +- Otherwise: `IN PROGRESS — N required skill(s) REMAINING: [names]` + +Display: + +``` ++====================================================================================+ +| GSTACK LIFECYCLE DASHBOARD | ++====================================================================================+ +| Phase | Skill | Runs | Last Run | Status | Required | +|----------|---------------------|------|------------------|----------|-------------| +| Think | /office-hours | 0 | — | — | no | +| Plan | /plan-ceo-review | 0 | — | — | no | +| | /plan-eng-review | 0 | — | — | YES | +| | /plan-design-review | 0 | — | — | no | +| | /plan-devex-review | 0 | — | — | no | +| | /autoplan | 0 | — | — | no | +| Build | (commits) | 0 | — | — | YES | +| Review | /review | 0 | — | — | YES | +| | /design-review | 0 | — | — | no | +| Test | /qa | 0 | — | — | no | +| | /qa-only | 0 | — | — | no | +| | /benchmark | 0 | — | — | no | +| Ship | /ship | 0 | — | — | YES | +| | /land-and-deploy | 0 | — | — | no | +| Reflect | /retro | 0 | — | — | no | +| | /document-release | 0 | — | — | no | ++------------------------------------------------------------------------------------+ +| VERDICT: NOT STARTED | ++====================================================================================+ +``` + +Fill in real values from the analytics data; the table above shows the all-zero baseline shape. 
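The VERDICT rules above can be sketched in shell as follows. This is a sketch under assumptions: skill names are stored without the leading slash, `lifecycle_verdict` is a hypothetical name, and the output is shortened to the verdict keyword plus the missing rows rather than the full dashboard strings.

```bash
# Hypothetical helper (not shipped with gstack).
# Usage: lifecycle_verdict COMMIT_COUNT < skill-usage.jsonl
lifecycle_verdict() {
  local commits="${1:-0}" data missing="" n=0 row label
  data=$(cat)
  # The four required rows, in dashboard order.
  for row in plan-eng-review commits review ship; do
    if [ "$row" = "commits" ]; then
      [ "$commits" -gt 0 ] && continue
      label="(commits)"
    else
      printf '%s\n' "$data" | grep -q "\"skill\":\"$row\"" && continue
      label="/$row"
    fi
    missing="${missing:+$missing, }$label"
    n=$((n + 1))
  done
  if [ "$n" -eq 0 ]; then
    echo "CLEARED"
  elif [ "$n" -eq 4 ]; then
    echo "NOT STARTED"
  else
    echo "IN PROGRESS: $n required REMAINING: $missing"
  fi
}
```

Matching on the full `"skill":"review"` token keeps `/plan-eng-review` and `/design-review` runs from being counted as `/review`.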
+ +### Section B: Other Observed Skills + +After building the lifecycle dashboard, collect any skill names from `SKILL_ANALYTICS` that are NOT in the canonical skill list above and NOT `plan-status` itself. Display a companion table: + +``` ++================================================+ +| OTHER OBSERVED SKILLS | +| (not yet mapped to lifecycle dashboard) | ++================================================+ +| Skill | Runs | Last Run | +|--------------------+------+--------------------| +| /investigate | 2 | 2026-05-10 14:00 | +| /codex | 1 | 2026-05-09 11:00 | ++------------------------------------------------+ +| 2 skills observed | ++================================================+ +``` + +If no unlisted skills appear: +``` +Other observed skills: none — all analytics entries are covered by the lifecycle dashboard above. +``` + +This table is self-maintaining: any new gstack skill that gets invoked on the project surfaces here automatically. + +### Section C: Plan Detail + +``` +## Plan Status: +Branch: | As of: + +### Phase Summary +| Phase | Status | Notes | +|-------|--------|-------| +| Phase 0: Foundation | DONE | All gems added, initializers present | +| Phase 1: Schema | DONE | Migrations in db/migrate/ | +| Phase 7: RAG migration | REMAINING | NotImplementedError stubs; no RubyLLM.embed calls | + +### Success Criteria +| Item | Status | +|------|--------| +| [ ] ruby_llm install generator accepted | DONE | +| [ ] langchainrb removed from Gemfile | REMAINING | +| [ ] Cross-tenant scoping test passes | REMAINING | +| [ ] ActsAsTenant require_tenant = true | PARTIAL — dev only, not production | + +### Branch Commits +| Hash | Message | In Scope | +|---------|------------------------------------------|----------| +| a1b2c3d | feat: add ruby_llm gem and initializer | YES | +| e4f5g6h | chore: remove langchainrb from Gemfile | YES | +| i7j8k9l | fix: unrelated bug in user controller | NO | + +### Pre-Migration Checklist (if present) +List any checklist 
items from the plan and their resolved status. + +### Summary +X of Y phases complete. Z of W success criteria met. +Key blockers: +Suggested next action: +``` + +**Notes column:** one phrase, not a sentence. Goal is a scannable dashboard, not a prose report. + +**Branch Commits:** For each commit in `BRANCH_COMMITS`, classify `In Scope` as YES if the commit message or changed files directly correspond to a plan deliverable (phase name, file reference, or gem reference from Step 1). Mark NO for commits unrelated to the plan. + +## Step 5 — Offer next action + +After the report, offer one of: +- "Want me to start on [top REMAINING item]?" +- "Want me to update the plan file's checkboxes to reflect what's done?" +- Nothing, if the plan is 100% complete — just say so. + +Do not auto-start work. User decides. diff --git a/ship/SKILL.md b/ship/SKILL.md index 25119fb391..d592c26b14 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -920,6 +920,31 @@ For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope < Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9. +5. 
Check for open plan items: + +```bash +eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true +_PLAN_DIR="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}" +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +_PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | \ + grep -i "$(echo "$_BRANCH" | tr '/' '-')" 2>/dev/null | sort -r | head -1) +if [ -z "$_PLAN_FILE" ]; then + _PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | sort -r | head -1) +fi +if [ -n "$_PLAN_FILE" ]; then + _REMAINING=$(grep -c -e "- \[ \]" "$_PLAN_FILE" 2>/dev/null || true) + echo "PLAN_FILE: $_PLAN_FILE" + echo "OPEN_CHECKBOXES: ${_REMAINING:-0}" +else + echo "PLAN_FILE: none" +fi +``` + +If `OPEN_CHECKBOXES` > 0, print an informational warning (never block): +"Note: Plan has N open checkbox(es) in ``. Run /plan-status to review what's REMAINING before the PR lands — or proceed if these items are intentionally deferred." + +If `PLAN_FILE` is none, skip silently. + --- ## Step 2: Distribution Pipeline Check diff --git a/ship/SKILL.md.tmpl b/ship/SKILL.md.tmpl index 5a7c34661d..61265072cf 100644 --- a/ship/SKILL.md.tmpl +++ b/ship/SKILL.md.tmpl @@ -95,6 +95,31 @@ For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope < Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9. +5.
Check for open plan items: + +```bash +eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true +_PLAN_DIR="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}" +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +_PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | \ + grep -i "$(echo "$_BRANCH" | tr '/' '-')" 2>/dev/null | sort -r | head -1) +if [ -z "$_PLAN_FILE" ]; then + _PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | sort -r | head -1) +fi +if [ -n "$_PLAN_FILE" ]; then + _REMAINING=$(grep -c -e "- \[ \]" "$_PLAN_FILE" 2>/dev/null || true) + echo "PLAN_FILE: $_PLAN_FILE" + echo "OPEN_CHECKBOXES: ${_REMAINING:-0}" +else + echo "PLAN_FILE: none" +fi +``` + +If `OPEN_CHECKBOXES` > 0, print an informational warning (never block): +"Note: Plan has N open checkbox(es) in ``. Run /plan-status to review what's REMAINING before the PR lands — or proceed if these items are intentionally deferred." + +If `PLAN_FILE` is none, skip silently. + --- ## Step 2: Distribution Pipeline Check diff --git a/test/fixtures/golden/claude-ship-SKILL.md b/test/fixtures/golden/claude-ship-SKILL.md index 25119fb391..d592c26b14 100644 --- a/test/fixtures/golden/claude-ship-SKILL.md +++ b/test/fixtures/golden/claude-ship-SKILL.md @@ -920,6 +920,31 @@ For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope < Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9. +5.
Check for open plan items: + +```bash +eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true +_PLAN_DIR="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}" +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +_PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | \ + grep -i "$(echo "$_BRANCH" | tr '/' '-')" 2>/dev/null | sort -r | head -1) +if [ -z "$_PLAN_FILE" ]; then + _PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | sort -r | head -1) +fi +if [ -n "$_PLAN_FILE" ]; then + _REMAINING=$(grep -c -e "- \[ \]" "$_PLAN_FILE" 2>/dev/null || true) + echo "PLAN_FILE: $_PLAN_FILE" + echo "OPEN_CHECKBOXES: ${_REMAINING:-0}" +else + echo "PLAN_FILE: none" +fi +``` + +If `OPEN_CHECKBOXES` > 0, print an informational warning (never block): +"Note: Plan has N open checkbox(es) in ``. Run /plan-status to review what's REMAINING before the PR lands — or proceed if these items are intentionally deferred." + +If `PLAN_FILE` is none, skip silently. + --- ## Step 2: Distribution Pipeline Check diff --git a/test/fixtures/golden/codex-ship-SKILL.md b/test/fixtures/golden/codex-ship-SKILL.md index 7770a8906e..d505ca6944 100644 --- a/test/fixtures/golden/codex-ship-SKILL.md +++ b/test/fixtures/golden/codex-ship-SKILL.md @@ -909,6 +909,31 @@ For Design Review: run `source <($GSTACK_ROOT/bin/gstack-diff-scope 2>/de Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9. +5.
Check for open plan items: + +```bash +eval "$($GSTACK_ROOT/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true +_PLAN_DIR="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}" +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +_PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | \ + grep -i "$(echo "$_BRANCH" | tr '/' '-')" 2>/dev/null | sort -r | head -1) +if [ -z "$_PLAN_FILE" ]; then + _PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | sort -r | head -1) +fi +if [ -n "$_PLAN_FILE" ]; then + _REMAINING=$(grep -c -e "- \[ \]" "$_PLAN_FILE" 2>/dev/null || true) + echo "PLAN_FILE: $_PLAN_FILE" + echo "OPEN_CHECKBOXES: ${_REMAINING:-0}" +else + echo "PLAN_FILE: none" +fi +``` + +If `OPEN_CHECKBOXES` > 0, print an informational warning (never block): +"Note: Plan has N open checkbox(es) in ``. Run /plan-status to review what's REMAINING before the PR lands — or proceed if these items are intentionally deferred." + +If `PLAN_FILE` is none, skip silently. + --- ## Step 2: Distribution Pipeline Check diff --git a/test/fixtures/golden/factory-ship-SKILL.md b/test/fixtures/golden/factory-ship-SKILL.md index baae7421d9..b16973198e 100644 --- a/test/fixtures/golden/factory-ship-SKILL.md +++ b/test/fixtures/golden/factory-ship-SKILL.md @@ -911,6 +911,31 @@ For Design Review: run `source <($GSTACK_ROOT/bin/gstack-diff-scope 2>/de Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9. +5.
Check for open plan items: + +```bash +eval "$($GSTACK_ROOT/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true +_PLAN_DIR="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}" +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +_PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | \ + grep -i "$(echo "$_BRANCH" | tr '/' '-')" 2>/dev/null | sort -r | head -1) +if [ -z "$_PLAN_FILE" ]; then + _PLAN_FILE=$(find "$_PLAN_DIR" -name "*.md" 2>/dev/null | grep -E "ceo-plans|eng-plans" | sort -r | head -1) +fi +if [ -n "$_PLAN_FILE" ]; then + _REMAINING=$(grep -c -e "- \[ \]" "$_PLAN_FILE" 2>/dev/null || true) + echo "PLAN_FILE: $_PLAN_FILE" + echo "OPEN_CHECKBOXES: ${_REMAINING:-0}" +else + echo "PLAN_FILE: none" +fi +``` + +If `OPEN_CHECKBOXES` > 0, print an informational warning (never block): +"Note: Plan has N open checkbox(es) in ``. Run /plan-status to review what's REMAINING before the PR lands — or proceed if these items are intentionally deferred." + +If `PLAN_FILE` is none, skip silently. + --- ## Step 2: Distribution Pipeline Check diff --git a/test/fixtures/plans/sample-ruby-llm-plan.md b/test/fixtures/plans/sample-ruby-llm-plan.md new file mode 100644 index 0000000000..32c1541e08 --- /dev/null +++ b/test/fixtures/plans/sample-ruby-llm-plan.md @@ -0,0 +1,33 @@ +# CEO Plan: Implement ruby_llm integration +**Branch:** add-ruby-llm | **Date:** 2026-04-01 | **Author:** example + +## Phase 0: Foundation + +Add the ruby_llm gem and configure the client. Create an initializer that loads the +API key from the environment and sets the default model.
+
+**Files:**
+- `config/initializers/ruby_llm.rb` — new initializer
+- `app/services/llm_service.rb` — new service wrapping the client
+
+**Success criteria:**
+- [x] ruby_llm gem added to Gemfile and installed successfully
+- [x] config/initializers/ruby_llm.rb created with API key config
+- [ ] LlmService exposes a `complete(prompt)` method
+- [ ] LlmService handles API errors and returns a structured result
+
+## Phase 1: First Feature
+
+Build a summarization endpoint backed by ruby_llm. Add a model and controller to
+store requests and return streamed responses.
+
+**Files:**
+- `app/models/summary_request.rb` — new model
+- `app/controllers/summaries_controller.rb` — new controller
+
+**Gem:**
+- `ruby_llm` (already added in Phase 0)
+
+**Success criteria:**
+- app/models/summary_request.rb exists with validations
+- POST /summaries creates a SummaryRequest and enqueues a job
diff --git a/test/helpers/touchfiles.ts b/test/helpers/touchfiles.ts
index 5043884c32..6e6dc288cf 100644
--- a/test/helpers/touchfiles.ts
+++ b/test/helpers/touchfiles.ts
@@ -213,6 +213,9 @@ export const E2E_TOUCHFILES: Record = {
   // Learnings
   'learnings-show': ['learn/**', 'bin/gstack-learnings-search', 'bin/gstack-learnings-log', 'scripts/resolvers/learnings.ts'],
 
+  // Plan Status
+  'plan-status': ['plan-status/**', 'test/fixtures/plans/sample-ruby-llm-plan.md'],
+
   // Session Intelligence (timeline, context recovery, /context-save + /context-restore)
   'timeline-event-flow': ['bin/gstack-timeline-log', 'bin/gstack-timeline-read'],
   'context-recovery-artifacts': ['scripts/resolvers/preamble.ts', 'bin/gstack-timeline-log', 'bin/gstack-slug', 'learn/**'],
@@ -549,6 +552,9 @@ export const E2E_TIERS: Record = {
   // Learnings — gate (functional guardrail: seeded learnings must appear)
   'learnings-show': 'gate',
 
+  // Plan Status — gate (deterministic, read-only, filesystem-only fixture, < $0.50/run)
+  'plan-status': 'gate',
+
   // Document-release — gate (CHANGELOG guardrail)
   'document-release': 'gate',
 
diff --git a/test/skill-e2e-plan-status.test.ts b/test/skill-e2e-plan-status.test.ts
new file mode 100644
index 0000000000..401288a553
--- /dev/null
+++ b/test/skill-e2e-plan-status.test.ts
@@ -0,0 +1,128 @@
+/**
+ * Gate-tier E2E for /plan-status.
+ *
+ * Verifies the skill can resolve a plan file, classify phase/criteria items,
+ * and produce a status report. Uses a filesystem-only fixture (no git commits
+ * that match plan deliverables) — the git evidence path is not exercised here.
+ *
+ * Gate tier: deterministic, read-only, filesystem-only fixture, < $0.50/run.
+ */
+
+import { describe, test, expect, beforeAll, afterAll } from 'bun:test';
+import { runSkillTest } from './helpers/session-runner';
+import {
+  ROOT, runId,
+  describeIfSelected, testConcurrentIfSelected,
+  logCost, recordE2E,
+  createEvalCollector, finalizeEvalCollector,
+} from './helpers/e2e-helpers';
+import { spawnSync } from 'child_process';
+import * as fs from 'fs';
+import * as path from 'path';
+import * as os from 'os';
+
+const evalCollector = createEvalCollector('e2e-plan-status');
+
+describeIfSelected('Plan Status E2E', ['plan-status'], () => {
+  let workDir: string;
+  let gstackHome: string;
+
+  beforeAll(() => {
+    workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-e2e-plan-status-'));
+    gstackHome = path.join(workDir, '.gstack-home');
+
+    const run = (cmd: string, args: string[]) =>
+      spawnSync(cmd, args, { cwd: workDir, stdio: 'pipe', timeout: 5000 });
+    run('git', ['init', '-b', 'main']);
+    run('git', ['config', 'user.email', 'test@test.com']);
+    run('git', ['config', 'user.name', 'Test']);
+    fs.writeFileSync(path.join(workDir, 'README.md'), '# Test project\n');
+    run('git', ['add', '.']);
+    run('git', ['commit', '-m', 'initial commit']);
+
+    // Install the plan-status skill
+    const skillDir = path.join(workDir, '.claude', 'skills', 'plan-status');
+    fs.mkdirSync(skillDir, { recursive: true });
+    fs.copyFileSync(path.join(ROOT, 'plan-status', 'SKILL.md'), path.join(skillDir, 'SKILL.md'));
+
+    // Copy bin scripts referenced by the preamble
+    const binDir = path.join(workDir, 'bin');
+    fs.mkdirSync(binDir, { recursive: true });
+    for (const script of ['gstack-update-check', 'gstack-slug', 'gstack-config', 'gstack-repo-mode']) {
+      const src = path.join(ROOT, 'bin', script);
+      if (fs.existsSync(src)) {
+        fs.copyFileSync(src, path.join(binDir, script));
+        fs.chmodSync(path.join(binDir, script), 0o755);
+      }
+    }
+
+    // Copy the fixture plan
+    fs.copyFileSync(
+      path.join(ROOT, 'test', 'fixtures', 'plans', 'sample-ruby-llm-plan.md'),
+      path.join(workDir, 'fixture-plan.md'),
+    );
+
+    // Create a minimal gstack analytics file (empty — no prior skill runs)
+    const analyticsDir = path.join(gstackHome, 'analytics');
+    fs.mkdirSync(analyticsDir, { recursive: true });
+    fs.writeFileSync(path.join(analyticsDir, 'skill-usage.jsonl'), '');
+
+    // Routing CLAUDE.md
+    fs.writeFileSync(path.join(workDir, 'CLAUDE.md'), `# Test project
+
+## Skill routing
+When the user invokes /plan-status, ALWAYS use the Skill tool first.
+
+Environment:
+- The plan-status skill is at ./.claude/skills/plan-status/SKILL.md
+- Bin scripts are at ./bin/ — replace ~/.claude/skills/gstack/bin/ with ./bin/
+- Use GSTACK_HOME="${gstackHome}" for all gstack bin scripts
+- The analytics file is at ${gstackHome}/analytics/skill-usage.jsonl
+`);
+  });
+
+  afterAll(() => {
+    try { fs.rmSync(workDir, { recursive: true, force: true }); } catch {}
+    finalizeEvalCollector(evalCollector);
+  });
+
+  testConcurrentIfSelected('plan-status', async () => {
+    const result = await runSkillTest({
+      prompt: `Run /plan-status on the plan file at ./fixture-plan.md.
+
+IMPORTANT:
+- Use GSTACK_HOME="${gstackHome}" for all gstack bin scripts.
+- The bin scripts are at ./bin/ (replace ~/.claude/skills/gstack/bin/ with ./bin/ in any commands).
+- The plan file is ./fixture-plan.md — use it directly, do not search for other plans.
+- Do NOT use AskUserQuestion.
+- Do NOT make any file edits.
+- Produce the full status report and stop after Step 5.`,
+      workingDirectory: workDir,
+      maxTurns: 20,
+      allowedTools: ['Bash', 'Read', 'Glob', 'Grep'],
+      timeout: 180_000,
+      testName: 'plan-status',
+      runId,
+    });
+
+    logCost('/plan-status', result);
+
+    const output = result.output;
+
+    // Loose assertions: report header present + at least one status label
+    const hasHeader = /plan.?status/i.test(output);
+    const hasDone = /\bDONE\b/.test(output);
+    const hasRemaining = /\bREMAINING\b/.test(output);
+    const hasStatusLabel = hasDone || hasRemaining;
+
+    const exitOk = ['success', 'error_max_turns'].includes(result.exitReason);
+
+    recordE2E(evalCollector, '/plan-status', 'Plan status E2E', result, {
+      passed: exitOk && hasHeader && hasStatusLabel,
+    });
+
+    expect(exitOk).toBe(true);
+    expect(hasHeader).toBe(true);
+    expect(hasStatusLabel).toBe(true);
+  }, 240_000);
+});
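
Review note on the open-checkbox count in the `/ship` pre-flight step, which relies on `grep -c`. Below is a minimal sketch of that count as a standalone helper. The function name `count_open_checkboxes` is hypothetical and not part of this diff. It guards against two `grep` behaviours that are easy to trip over: a pattern that begins with `-` is parsed as an option cluster unless it is passed via `-e`, and `grep -c` prints `0` while exiting with status 1 when nothing matches, so a fallback such as `|| echo "0"` would emit a second `0`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the diff): print the number of lines in a
# plan file that contain an unchecked Markdown checkbox, "- [ ]".
count_open_checkboxes() {
  _cb_file="$1"
  # -F treats the pattern as a literal string, so "[" and "]" need no escaping.
  # -e marks the next argument as the pattern even though it starts with "-".
  # grep -c exits 1 on zero matches but still prints "0"; "|| true" keeps that
  # single value. An unreadable file prints nothing, handled by the default below.
  _cb_count=$(grep -cF -e "- [ ]" "$_cb_file" 2>/dev/null || true)
  echo "${_cb_count:-0}"
}
```

Run against the sample plan fixture added in this diff, which has two unchecked items in Phase 0, `count_open_checkboxes test/fixtures/plans/sample-ruby-llm-plan.md` should print `2`. A missing or unreadable file yields `0`, which matches the "skip silently" intent of the step.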