feat(ORCH): orchestrator-first plugin spec — idea + research (closes #501)#508
Luis85 wants to merge 21 commits
Conversation
Resolves all 7 open decisions and 5 open questions from issue #501. Establishes the orchestrator-first architecture vision: the orchestrator becomes the dispatch authority (not advisory-only), packaged as a proper Claude Code plugin via settings.json + .claude-plugin/plugin.json.

Research synthesis covers:
- Claude plugin architecture constraints
- Anthropic native Orchestrator-Subagent pattern recommendation (over LangGraph/CrewAI)
- Competitive landscape (Devin, Copilot Workspace, Cursor, BMAD, Kiro, GitHub Spec Kit)
- Full Specorator codebase audit (36 agents, 38 skills, 85 commands, 12 plugin groups)

Artifacts:
- specs/goal-oriented-orchestrator-plugin/workflow-state.md
- specs/goal-oriented-orchestrator-plugin/idea.md (IDEA-ORCH-001)
- specs/goal-oriented-orchestrator-plugin/research.md (RESEARCH-ORCH-001)

Next stage: /spec:requirements to produce EARS-formatted requirements.

https://claude.ai/code/session_01UKFqNZBDevmYtpiU3QLnVD
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e96c4db163
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
23 EARS-formatted functional requirements (REQ-ORCH-001–023) across four groups: orchestrator dispatch authority, goal-loop conductor (6-phase resolution loop), plugin packaging, and backward compatibility. 8 NFRs with explicit thresholds (performance, reliability, compatibility, build integrity, security). Zero open clarifications. Quality gate green.

North star: ≥70% of goal-loop sessions reach Session Summary without manual /spec:* invocation.
Counter-metric: <25% abandonment after post-scope HITL gate.

Prerequisite flagged in release criteria: workflow-state.md Zod schema (ADR-0042) must be in place before implementing REQ-ORCH-002/022.

https://claude.ai/code/session_01UKFqNZBDevmYtpiU3QLnVD
Three-part design for the orchestrator-first plugin:

Part A (ux-designer): 10 conversation flows (Mermaid), 4 fully specified AskUserQuestion gate designs with exact option copy, stall detection gate, resume flow, empty/loading/error states for all failure modes.

Part B (ui-designer): 12-state CLI screen inventory, 6 output component patterns (progress banner, gate header, criteria list, verdict table, artifact link, option labels), CLI token set (phase labels, separator style, emphasis conventions), microcopy standards (forbidden words, vocabulary rules, tense constraints).

Part C (architect): system overview diagram, 12-component responsibility table, data model for workflow-state.md goal_loop block + scope.md + session-summary.md, 2-scenario data flow (happy path + stall path), 6 subagent spawn contracts, requirements coverage for all 23 REQ-ORCH-NNN.

ADRs filed:
- ADR-0046: orchestrator tool list expanded to dispatch authority
- ADR-0047: workflow-state.md schema extended with goal_loop block
- ADR-0048: scope.md and session-summary.md introduced as new artifact types

https://claude.ai/code/session_01UKFqNZBDevmYtpiU3QLnVD
💡 Codex Review
Reviewed commit: e303d0de28
19 interface contracts (SPEC-ORCH-001–019), 9 typed data structures with Zod schemas, full goal-loop state machine (13-state Mermaid diagram), 21 validation rules, 15 edge cases, 45 TEST-ORCH-NNN test scenarios.

Key precision added beyond design:
- Issue reference regex patterns (bare #NNN + full GitHub URL)
- Researcher count heuristic: deterministic concern-area algorithm
- De-duplication: 80% unique-token overlap + 20% length tolerance
- GoalLoopState Zod schema (workflow-state.md goal_loop block, ADR-0047)
- scope.md + session-summary.md exact schemas (ADR-0048)
- .claude-plugin/plugin.json + settings.json generation contracts
- check-agents.ts prohibited-frontmatter validation rule

All 23 REQ-ORCH-NNN traced in coverage table. Quality gate all-green.

https://claude.ai/code/session_01UKFqNZBDevmYtpiU3QLnVD
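For readers following the thread, the two issue-reference shapes named in this commit (bare #NNN and a full GitHub issue URL) could be matched roughly as below. This is only a sketch: the actual regexes in spec.md are not quoted in this conversation, so `BARE_ISSUE`, `ISSUE_URL`, `IssueRef`, and `extractIssueRefs` are hypothetical names and patterns, not the spec's.

```typescript
// Hypothetical patterns; the regexes defined in spec.md are not shown in this thread.
// Bare form: "#501", not preceded by a word character or "/" (so URL fragments are skipped).
const BARE_ISSUE = /(?:^|[^\w\/])#(\d+)\b/g;
// Full form: https://github.com/<owner>/<repo>/issues/<number>
const ISSUE_URL = /https:\/\/github\.com\/([\w.-]+)\/([\w.-]+)\/issues\/(\d+)/g;

interface IssueRef {
  number: number;
  repo?: string; // "owner/name", only known when the reference was a full URL
}

// Collect every match of a global regex using exec(), which works on old targets too.
function collectMatches(re: RegExp, text: string): RegExpExecArray[] {
  const out: RegExpExecArray[] = [];
  re.lastIndex = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    out.push(m);
  }
  return out;
}

// Returns URL references first, then bare references, in order of appearance.
function extractIssueRefs(text: string): IssueRef[] {
  const refs: IssueRef[] = [];
  for (const m of collectMatches(ISSUE_URL, text)) {
    refs.push({ number: Number(m[3]), repo: m[1] + "/" + m[2] });
  }
  for (const m of collectMatches(BARE_ISSUE, text)) {
    refs.push({ number: Number(m[1]) });
  }
  return refs;
}
```

With these assumed patterns, a string such as `PR#508` is deliberately not treated as a bare reference because the `#` follows a word character; whether the real spec makes the same choice is not stated here.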
💡 Codex Review
Reviewed commit: 2b5de6e839
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: e908b307a8
- design.md: replace T-AUTH-NNN placeholder IDs with T-ORCH-NNN to match workflow area ORCH (fixes 4 check:traceability area-mismatch errors)
- spec.md: add REQ-ORCH-* covering references to TEST-ORCH-038/039/040/042/044/045 rows that only cited EC-ORCH-* IDs (fixes 6 check:traceability coverage errors)
- spec.md: fix 'unparseable' -> 'unparsable' (fixes typos spell check failure)

https://claude.ai/code/session_011TPNgd7jBv3ySSyvaTifA1
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: a7aa87e76b
…ility

P1: REQ-ORCH-018..023 Downstream fields pointed to SPEC-ORCH-018..023, which do not exist (spec.md defines contracts up to SPEC-ORCH-017). Mapped each requirement to the correct existing SPEC ID:
- REQ-ORCH-018 → SPEC-ORCH-015 (settings.json declaration contract)
- REQ-ORCH-019 → SPEC-ORCH-014, SPEC-ORCH-016 (plugin.json + build script)
- REQ-ORCH-020 → SPEC-ORCH-017 (check-agents.ts frontmatter validation)
- REQ-ORCH-021 → SPEC-ORCH-001 (backward compat via agent definition)
- REQ-ORCH-022 → SPEC-ORCH-003/005/008/009/011 (all HITL + stall gates)
- REQ-ORCH-023 → SPEC-ORCH-002 (goal-loop conductor entry point)

P2: design.md Gate 1 abort branch said "No artifacts were written", but SPEC-ORCH-003 §On response X and §Post-conditions X confirm that both scope.md and workflow-state.md ARE written on abort. Updated the sequence diagram to show the workflow-state.md write step and corrected the user-facing message to match the spec contract.

P2: REQ-ORCH-022 scoped the pre-write guarantee to "three defined HITL gates" only, but REQ-ORCH-014 also fires AskUserQuestion via the stall gate (SPEC-ORCH-008). Updated statement, acceptance criteria, and Downstream field to include the stall gate so every AskUserQuestion path carries the same persistence guarantee.

https://claude.ai/code/session_011TPNgd7jBv3ySSyvaTifA1
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: f8d4c0f4d3
… tests

TEST-ORCH-027: update path from dist/claude-plugin to claude-plugin/specorator.
TEST-ORCH-034: remove sample-of-10 qualifier; test must cover all 85 slash commands.

Addresses Codex P2 review threads on PR #508.
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: cd44cfa81c
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: 6001076969
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: 6001076969
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: 1689a2bda7
💡 Codex Review
Reviewed commit: efd36dd4a3
…ME exclusion, ADR-0047 fields
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: ea9762c184
Add REQ-* IDs to test table coverage column so check:traceability validateTestCoverage passes — each TEST-ORCH-NNN row now references at least one REQ-* or NFR-* ID alongside its SPEC-* coverage.
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: 868ddd15b8
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: 868ddd15b8
💡 Codex Review
Reviewed commit: dfd703f183
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: 42c1438950
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: 6bd3d6c005
💡 Codex Review
Reviewed commit: cd2900307a
- Extend --check flag contract to also validate settings.json
- Add R-ORCH-FRONTMATTER rule to SPEC-ORCH-017 check-agents validation
- Update NFR-ORCH-006 to cover both R-ORCH-TOOLS and R-ORCH-FRONTMATTER rules

https://claude.ai/code/session_011TPNgd7jBv3ySSyvaTifA1
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: c6d58a317d
…ents

spec.md:
- §2.4: make AskUserQuestion gate conditional on missing goal statement
- §4.3: research synthesis writes to research.md, not scope.md section
- §5.2: design output is design.md, not scope.md section
- §6.1: plan output is tasks.md with flat YAML format
- §7.4: add normative section for SPECORATOR_HEAVY_MODEL selection
- §8.1: reduce stall threshold from >10 to >3 identical outputs
- §19: add NFR-ORCH-007 for SPECORATOR_HEAVY_MODEL

requirements.md:
- REQ-ORCH-002 Downstream: SPEC-ORCH-002 → SPEC-ORCH-001
- REQ-ORCH-005 Downstream: SPEC-ORCH-005 → SPEC-ORCH-002
- Release criteria: SPEC-ORCH-016 → SPEC-ORCH-013

https://claude.ai/code/session_011TPNgd7jBv3ySSyvaTifA1
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: f0f90c27db
- §5.2/§5.3: Design phase produces canonical design.md; Gate 2 edits update design.md (not scope.md). Add scope.md cross-reference bullet.
- §7.1: Implement wave reads tasks from tasks.md, not scope.md plan section.
- §17.1: Add R-ORCH-PROHIBITED-FRONTMATTER rule requiring check-agents.ts to reject hooks/mcpServers/permissionMode in orchestrator frontmatter (satisfies REQ-ORCH-020).
- §21.1: Align TEST-ORCH-012 threshold to > 3 identical outputs (matches §8.1 stall detection rule).
- §23.1: Correct REQ-ORCH-004 traceability to point to §7.4 (model selection) rather than §7.1.

https://claude.ai/code/session_011TPNgd7jBv3ySSyvaTifA1
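The "> 3 identical outputs" threshold that this commit aligns between §8.1 and TEST-ORCH-012 can be illustrated with a small check. This is a sketch under assumptions, not the spec's algorithm: it assumes "identical" means exact string equality and that the outputs must be consecutive, neither of which is confirmed in this thread. `STALL_THRESHOLD` and `isStalled` are hypothetical names.

```typescript
// Assumed reading of §8.1: a stall is more than 3 consecutive identical outputs.
const STALL_THRESHOLD = 3;

// Returns true when the trailing run of identical outputs exceeds the threshold.
function isStalled(outputs: string[], threshold: number = STALL_THRESHOLD): boolean {
  if (outputs.length === 0) {
    return false;
  }
  const last = outputs[outputs.length - 1];
  let run = 0;
  // Walk backwards while the outputs keep matching the most recent one.
  for (let i = outputs.length - 1; i >= 0 && outputs[i] === last; i--) {
    run++;
  }
  return run > threshold;
}
```

Under this reading, three identical outputs in a row do not trip the gate, and the fourth does, which is the boundary a test like TEST-ORCH-012 would presumably exercise.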
@codex review (generated by Claude Code)
💡 Codex Review
Reviewed commit: f22b2df97e
| **R-ORCH-TOOLS** — for any agent file with `name: orchestrator`, the `tools:` list MUST exactly match the list in SPEC-ORCH-001 §1.1. | ||
| **R-ORCH-FRONTMATTER** — no agent file may declare prohibited tools in its `tools:` frontmatter field. Prohibited values: `Bash`, `WebSearch`, `WebFetch`, `mcp__github__*` (any). These tools belong to specialist subagents, not the orchestrator. |
Limit prohibited-tool check to orchestrator agents
The new R-ORCH-FRONTMATTER rule says no agent file may contain Bash, WebSearch, WebFetch, or mcp__github__*, but many existing specialist agents already rely on those tools (for example .claude/agents/dev.md, .claude/agents/analyst.md, and .claude/agents/roadmap-manager.md). If check-agents.ts implements this literally, npm run verify will fail for unrelated agents and break backward compatibility guarantees; this prohibition needs to be scoped to the orchestrator only.
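One way the scoping requested in this comment might look is sketched below. It assumes agent frontmatter has already been parsed into an object with `name` and `tools` fields; `AgentFrontmatter`, `checkOrchFrontmatter`, and the example tool name `mcp__github__create_issue` are hypothetical and not taken from check-agents.ts.

```typescript
// Hypothetical shape of parsed agent frontmatter; the real check-agents.ts types are not shown here.
interface AgentFrontmatter {
  name: string;
  tools: string[];
}

// Prohibited values as listed in the R-ORCH-FRONTMATTER rule text.
const PROHIBITED_TOOLS = ["Bash", "WebSearch", "WebFetch"];
const PROHIBITED_PREFIX = "mcp__github__";

// Returns one error message per prohibited tool, but only for the orchestrator agent.
function checkOrchFrontmatter(agent: AgentFrontmatter): string[] {
  if (agent.name !== "orchestrator") {
    return []; // specialist agents keep their existing tools
  }
  return agent.tools
    .filter((t) => PROHIBITED_TOOLS.indexOf(t) !== -1 || t.indexOf(PROHIBITED_PREFIX) === 0)
    .map((t) => 'R-ORCH-FRONTMATTER: orchestrator must not declare tool "' + t + '"');
}
```

With the early return in place, an agent such as `dev` that declares `Bash` yields no errors, while an orchestrator declaring `Bash` or any `mcp__github__*` tool is rejected.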
1. Mark task `status: failed` in `workflow-state.md`.
2. Pause execution of all dependent tasks.
3. Present failure summary to user with options: `retry | skip | abort`.
4. On `retry`: re-execute failed task (max 2 retries per task).
Align retry limit with the stated 3-retry stall contract
This section caps task retries at 2, but the same feature spec and requirements define stall escalation after three unproductive retries; implementing the 2-retry limit as written will trigger earlier termination and make the three-retry behavior unachievable. Use a single retry threshold across failure handling, stall handling, and acceptance criteria to avoid contradictory behavior.
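A minimal illustration of the "single retry threshold" suggestion follows. The value 3 is taken from the three-retry stall contract described in this comment; which value the authors finally adopt is not stated in the thread, and `MAX_TASK_RETRIES`, `FailureAction`, and `onTaskFailure` are hypothetical names.

```typescript
// One shared constant so failure handling and stall escalation cannot drift apart.
// The value 3 is an assumption based on the three-retry stall contract mentioned above.
const MAX_TASK_RETRIES = 3;

type FailureAction = "retry" | "escalate-stall";

// Decide what to do after a task fails, given how many retries were already used.
function onTaskFailure(retriesSoFar: number): FailureAction {
  return retriesSoFar < MAX_TASK_RETRIES ? "retry" : "escalate-stall";
}
```

Because both the retry path and the stall path read the same constant, changing the limit in one place changes it everywhere, which is the consistency the comment asks for.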
Summary
Closes #501. Produces Stage 1 (Idea) and Stage 2 (Research) spec artifacts for the orchestrator-first Claude plugin architecture: the refactor that makes the goal-oriented resolution loop the core of Specorator.
- `idea.md` (IDEA-ORCH-001): resolves all 7 open decisions (D1–D7) and 5 open questions from issue #501 (idea: goal-oriented orchestrator plugin, Research → Design → Plan → Implement → Review loop); defines the orchestrator-first architecture vision with a full flow diagram; establishes constraints (subagent no-nesting hard limit, plugin security strips, naming collision resolution between ADR-0036 and `.claude-plugin/`); refines acceptance criteria
- `research.md` (RESEARCH-ORCH-001): full synthesis of 4 parallel research workstreams: Claude Code plugin architecture, multi-agent orchestrator patterns, competitive landscape, and Specorator codebase audit
- `workflow-state.md`: stage tracking and active decision register
Key decisions resolved
- `grill` skill
- `design.md` artifact + inline summary
- `tasks.md` with explicit `depends_on` DAG edges
- `isolation: worktree` per implementer (Claude Code native)
- `.claude-plugin/plugin.json` + `settings.json { "agent": "orchestrator" }`
Architecture headline
The orchestrator gets dispatch authority (currently Read/Grep only, advisory). When the Specorator plugin is enabled, `settings.json` `agent: orchestrator` makes it the main session agent. It runs the goal-loop: Scope → parallel Research wave → Design synthesis → HITL gate → Plan (DAG) → parallel Implement waves (worktree-isolated) → Review → Session summary.
Three synchronous HITL gates (AskUserQuestion): post-scope, post-design-approval, post-review.
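To make the "Plan (DAG) → parallel Implement waves" step concrete, here is a rough sketch of how tasks with `depends_on` edges could be grouped into waves. The `Task` shape, the `planWaves` function, and the task IDs in the usage note are assumptions for illustration; the actual tasks.md schema is defined in the spec and is not reproduced in this PR description.

```typescript
// Hypothetical minimal task shape; only id and depends_on are assumed here.
interface Task {
  id: string;
  depends_on: string[];
}

// Group tasks into waves: every task in a wave depends only on tasks from earlier waves,
// so all tasks within one wave could be dispatched in parallel.
function planWaves(tasks: Task[]): string[][] {
  const done: { [id: string]: boolean } = {};
  let pending = tasks.slice();
  const waves: string[][] = [];
  while (pending.length > 0) {
    const ready = pending.filter((t) => t.depends_on.every((d) => done[d] === true));
    if (ready.length === 0) {
      // Nothing can run: the remaining tasks form a cycle or reference an unknown id.
      throw new Error("Cycle or unknown dependency among: " + pending.map((t) => t.id).join(", "));
    }
    waves.push(ready.map((t) => t.id));
    for (const t of ready) {
      done[t.id] = true;
    }
    pending = pending.filter((t) => done[t.id] !== true);
  }
  return waves;
}
```

For example, if T-ORCH-003 depends on T-ORCH-001 and T-ORCH-002, and T-ORCH-004 depends on T-ORCH-003, this yields three waves: the first two tasks together, then T-ORCH-003, then T-ORCH-004. Worktree isolation per implementer and the HITL gates sit outside this sketch.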
Recommended implementation pattern: Anthropic native Orchestrator-Subagent (no third-party framework), chosen for zero dependencies, direct alignment with published Anthropic patterns, and file-based `workflow-state.md` as durable checkpoint.
Competitive context
Specorator is the only SDD tool that combines: EARS notation + stable REQ→TEST traceability IDs + CLI verify gate + multi-track breadth (12 tracks) + tool-agnostic Layer 0. Nearest threats: AWS Kiro (EARS-native but no traceability chain/verify gate), GitHub Spec Kit (simple but shallow), BMAD (role-separated but enterprise-heavy, no EARS).
Primary positioning angle: "The spec is the memory". File-based artifacts survive session boundaries, context rot, and team rotation. No competitor matches this architecturally.
Test plan
- `idea.md` passes quality gate checklist (all boxes checked in the artifact)
- `research.md` passes quality gate checklist (all boxes checked in the artifact)
- […] `idea.md` are traceable to future EARS requirements
- `workflow-state.md` correctly reflects stage `research` / status `active` with both artifacts marked `complete`
Next step
`/spec:requirements`: PM produces EARS-formatted functional requirements, NFRs, and success metrics for the orchestrator-first plugin.
https://claude.ai/code/session_01UKFqNZBDevmYtpiU3QLnVD
Generated by Claude Code