Context
Surfaced 2026-05-14 from an interactive operator discussion while improving the existing /claude-code-hermit:capability-brainstorm skill. The operator asked whether capability-brainstorm currently surfaces domain-feature ideas (e.g. "your test suite is slow", "5 motion sensors but only 1 is wired into automations"), or only hermit-side capability ideas (new skills, agents, routines).
It only surfaces hermit-side. The skill reads project context (memory, available MCPs and channels, recent compiled artifacts, README + CLAUDE.md + top-level ls) for grounding, but the ideas it emits are pinned to category: capability, which /proposal-create defines as "new agent, skill, or heartbeat item." Domain-feature ideas grounded in actual project state never make it out.
After a short brainstorm of what each fleet hermit could surface if the scope were widened, the operator confirmed the appetite for this and asked for it captured as a proposal.
Problem
Operators have useful domain-feature ideas — concrete, observable from the data the hermit already touches — that go unsurfaced. Examples per fleet:
- dev-hermit: "tests cover lib/ at 94% but cli/ has 0 tests; CI green is misleading"; "db.test.js is 38s of your 45s test runtime"; "README mentions feature X but no module implements it"
- homeassistant-hermit: "you have a bom_dia script but no boa_noite equivalent"; "5 motion sensors exist; only 1 is wired into automations"; "persiana scripts cover bedroom but not the office janelão"
- fitness-hermit: "12 cardio sessions, 1 strength session in last 3 weeks — imbalance vs stated goal"; "sleep is in your goals but no sleep entry in 17 days"; "same workout repeated 4 weeks running"
The current capability-brainstorm could be widened to emit both classes, but that runs into three real problems:
- Reading-depth mismatch. Capability-brainstorm's scan is sketch-level by design (top 15 lines of compiled artifacts, README, top-level ls). That is fine for "what skill could exist?" but too thin for "what's wrong with your auth code?" Forcing it deeper inflates token cost on every invocation, including the ones that only want hermit-side ideas.
- Kill-criteria signal degradation. Capability-brainstorm's self-audit retires the skill if triage-survival drops below 25% or PROP-acceptance below 30%. A 25% rate mixing capability-class and domain-class ideas is uninterpretable — you can't tell which class is dying, and tuning the prompt to lift one class may sink the other.
- Per-fleet context lives in the fleet. Fleet plugins already have domain reading built in (HA's entity registry MCP, dev-hermit's codebase scanners, fitness-hermit's activity log shape). Core has none of this and shouldn't grow it.
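The kill-criteria point can be made concrete with a small worked example. The sketch below is illustrative only: the `ClassStats` shape and the counts are assumptions, not data from any real self-audit. It shows two opposite situations that pool to the same 25% blended survival rate, which is why a mixed rate cannot say which class is dying.

```python
from dataclasses import dataclass


@dataclass
class ClassStats:
    """Triage outcomes for one class of idea (hypothetical shape)."""
    proposed: int  # ideas the brainstorm emitted
    survived: int  # ideas that survived triage


def survival_rate(stats: ClassStats) -> float:
    """Share of proposed ideas that survived; 0.0 when nothing was proposed."""
    return stats.survived / stats.proposed if stats.proposed else 0.0


def blended_rate(*classes: ClassStats) -> float:
    """Single pooled rate, as a widened skill's self-audit would see it."""
    proposed = sum(c.proposed for c in classes)
    survived = sum(c.survived for c in classes)
    return survived / proposed if proposed else 0.0


# Scenario A (made-up counts): capability ideas do well, domain ideas all die.
a_capability, a_domain = ClassStats(10, 5), ClassStats(10, 0)
# Scenario B (made-up counts): the exact opposite.
b_capability, b_domain = ClassStats(10, 0), ClassStats(10, 5)

# Both scenarios pool to 5 survivors out of 20 proposals.
print(blended_rate(a_capability, a_domain), blended_rate(b_capability, b_domain))
# Only the per-class rates tell the two scenarios apart.
print(survival_rate(a_capability), survival_rate(a_domain))
print(survival_rate(b_capability), survival_rate(b_domain))
```

Tracking the two classes in separate skills keeps each rate interpretable on its own.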
Proposed Solution
Add a domain-brainstorm skill to each fleet plugin, scoped to that fleet's domain. Uniform name across fleets (domain-brainstorm) since the domain is implied by the hermit context — operators don't need to remember dev-brainstorm vs ha-brainstorm vs fitness-brainstorm.
Skill structure (mirror capability-brainstorm's shape):
- On-demand only; never autonomous. Triggers on phrases like "what should I be fixing?", "anything wrong with X?", "brainstorm features".
- Cap 2 ideas per invocation. Same concrete-friction + ≥2-named-grounding-items gates.
- Single pass through the /proposal-create → proposal-triage pipeline.
- Output PROPs land in the standard .claude-code-hermit/proposals/ stream (operator-confirmed) so review stays unified.
- Each fleet's skill carries its own kill-criteria threshold tuned to that domain — survival rates won't be comparable across fleets.
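A minimal sketch of the cap and the grounding gate listed above, under stated assumptions. The `Idea` shape is hypothetical, and "concrete friction" is reduced here to a non-empty string, whereas in the skill itself that judgment would be made by the model following the prompt. Counting only distinct grounding items is also an assumption.

```python
from dataclasses import dataclass, field

MAX_IDEAS_PER_INVOCATION = 2  # "cap 2 ideas per invocation"
MIN_GROUNDING_ITEMS = 2       # ">=2 named grounding items"


@dataclass
class Idea:
    """Hypothetical candidate produced by a brainstorm pass."""
    title: str
    friction: str  # the concrete friction the idea addresses
    grounding: list[str] = field(default_factory=list)  # named files, entities, log entries


def passes_gates(idea: Idea) -> bool:
    """Keep an idea only if it names a friction and enough distinct grounding items."""
    has_friction = bool(idea.friction.strip())
    has_grounding = len(set(idea.grounding)) >= MIN_GROUNDING_ITEMS
    return has_friction and has_grounding


def select_ideas(candidates: list[Idea]) -> list[Idea]:
    """Apply the gates in order, then enforce the per-invocation cap."""
    return [idea for idea in candidates if passes_gates(idea)][:MAX_IDEAS_PER_INVOCATION]
```

Applying the cap after the gates means a weak candidate never takes one of the two slots from a grounded one.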
Per-fleet reading depth (where the value is):
- dev-hermit: recent git activity (last 50 commits, churned files), test runner output (slowest suites, coverage gaps), package.json/Cargo.toml/pyproject delta vs lockfile, README ↔ source drift
- homeassistant-hermit: entity registry (existing, unused, recently added), automation/script files (asymmetries, coverage gaps), recent event log shape
- fitness-hermit: recent activity log entries, workout-type distribution, stated goals from operator profile, sleep/macro/cardio gaps
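As one illustration of what deeper per-fleet reading could yield, the sketch below flags test suites that dominate total runtime, the kind of observation behind "db.test.js is 38s of your 45s test runtime". The function name, the 50% threshold, and the input shape are assumptions; a real dev-hermit skill would take durations from whatever the project's test runner reports.

```python
def dominant_suites(
    durations: dict[str, float], share_threshold: float = 0.5
) -> list[tuple[str, float]]:
    """Return (suite, share of total runtime) for suites at or above the threshold.

    `durations` maps suite name to seconds. Results are sorted by share, largest first.
    """
    total = sum(durations.values())
    if total <= 0:
        return []
    flagged = [
        (name, seconds / total)
        for name, seconds in durations.items()
        if seconds / total >= share_threshold
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# db.test.js timing is from the example above; the other two suites are made up
# so that the total comes to 45 seconds.
timings = {"db.test.js": 38.0, "api.test.js": 4.0, "util.test.js": 3.0}
for suite, share in dominant_suites(timings):
    print(f"{suite} is {share:.0%} of test runtime")
```

A finding like this names both a file and a measurement, so it can serve as grounding for the idea it supports.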
Pipeline plumbing (small core changes):
- Add domain-brainstorm as a recognized Evidence Source in proposal-triage (bypass recurrence like capability-brainstorm, since the brainstorm pass establishes the candidate).
- Either add a new category: domain-feature in /proposal-create for filterability, or reuse improvement and rely on tags. The category route is preferred because it gives operators a clean filter in /proposal-list.
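The two plumbing changes can be sketched as plain data plus two small checks. This models the intended behaviour only; proposal-triage and /proposal-list are skills, and nothing here is their actual implementation. The `Proposal` shape and the function names are hypothetical, and the category set lists only the values named in this proposal.

```python
from dataclasses import dataclass

# Evidence sources whose brainstorm pass already establishes the candidate,
# so triage would skip the recurrence requirement for them.
BYPASS_RECURRENCE_SOURCES = frozenset({"capability-brainstorm", "domain-brainstorm"})

# Only the categories named in this proposal; the real list may be longer.
KNOWN_CATEGORIES = frozenset({"capability", "improvement", "domain-feature"})


@dataclass
class Proposal:
    """Hypothetical minimal view of a PROP for triage and listing."""
    prop_id: str
    category: str
    evidence_source: str


def requires_recurrence(proposal: Proposal) -> bool:
    """True when triage should still demand recurrence evidence."""
    return proposal.evidence_source not in BYPASS_RECURRENCE_SOURCES


def filter_by_category(proposals: list[Proposal], category: str) -> list[Proposal]:
    """The clean filter a dedicated category would give operators."""
    if category not in KNOWN_CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return [p for p in proposals if p.category == category]
```

With a dedicated category the filter is an exact match, which is the filterability argument for preferring it over reusing improvement with tags.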
Implementation priority (operator-confirmed): dev > HA > fitness. Ship dev's first, observe for a few weeks of real use, then HA, then fitness.
Impact
Effort. Roughly one skill per fleet plugin (3 total at full rollout), each ~100 lines mirroring capability-brainstorm's shape. Core changes are minor: extend proposal-triage's Evidence Source list and, optionally, add a category value. Estimated effort is days per fleet for the first ship, and about a day per fleet for subsequent ones once the pattern stabilizes.
Benefit. Surfaces a class of ideas the hermit currently watches but can't act on. The operator's "things I should fix but forgot to ask about" category gets a structured intake. Per-fleet scoping keeps each skill's reading depth (and token cost) calibrated to where the signal is.
Risk. Domain ideas can be lower-quality than capability ideas because deeper reading is still token-bounded — a sketch read of a 50k-LOC repo will miss most real problems. Mitigated by per-fleet kill criteria (retire if survival/acceptance drops) and the cap-of-2 discipline. Worst case: a fleet's domain-brainstorm produces noise, fails its self-audit, and gets retired — non-destructive failure mode.
Open design questions (decide during implementation, not blockers):
- New Evidence Source: domain-brainstorm vs reusing capability-brainstorm: leaning new, for distinct kill-criteria interpretation.
- New category: domain-feature vs reusing improvement: leaning new, for filterability.
- Whether the dev-hermit version should accept a --depth=light|deep parameter for token-budget control, or pick one heuristic.
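If the depth question is settled in favour of a parameter, and if the dev-hermit skill delegates its scan to a helper script, the flag could be parsed as below. Both conditions are assumptions, as is the "light" default; the proposal leaves the choice open.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for a hypothetical helper script behind the dev-hermit skill."""
    parser = argparse.ArgumentParser(prog="domain-brainstorm")
    parser.add_argument(
        "--depth",
        choices=["light", "deep"],
        default="light",  # assumed default: the cheaper scan unless asked otherwise
        help="how much project state to read before brainstorming",
    )
    return parser


args = build_parser().parse_args(["--depth=deep"])
print(args.depth)
```

Restricting the flag to two named choices keeps token budgeting coarse and predictable, at the cost of the single-heuristic simplicity the alternative would offer.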
Filed via hermit-scribe · proposal=PROP-022 · session=null