RunMaestro · dwizzzle · Apr 1, 2026 · Apr 1, 2026 · Apr 1, 2026 · Apr 1, 2026
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
@@ -0,0 +1,140 @@
+# Copilot Instructions for Maestro
+
+Maestro is an Electron desktop app for orchestrating multiple AI coding agents (Claude Code, Codex, OpenCode, Factory Droid) with a keyboard-first interface. It uses a dual-process architecture (main + renderer) with strict context isolation.
+
+## Build, Test, and Lint
+
+```bash
+npm run dev              # Dev with hot reload (isolated data, safe alongside production)
+npm run build            # Full production build (main + renderer + web + CLI)
+npm run lint             # TypeScript type checking (renderer, main, cli configs)
+npm run lint:eslint      # ESLint code quality
+npm run test             # Run all unit tests (vitest)
+npm run test:watch       # Watch mode
+npm run format:check     # Check Prettier formatting
+npm run validate:push    # Full pre-push validation (format + lint + eslint + test)
+```
+
+Run a single test file:
+
+```bash
+npx vitest run src/__tests__/path/to/file.test.ts
+```
+
+Run tests matching a name pattern:
+
+```bash
+npx vitest run -t "pattern"
+```
+
+Other test suites:
+
+```bash
+npm run test:e2e           # Playwright end-to-end (requires build first)
+npm run test:integration   # Integration tests
+npm run test:performance   # Performance tests
+```
+
+## Architecture
+
+### Dual-Process (Electron)
+
+- **Main process** (`src/main/`): Node.js backend — process spawning (PTY via `node-pty`), IPC handlers, agent detection, session storage, SQLite via `better-sqlite3`.
+- **Renderer** (`src/renderer/`): React frontend — no Node.js access. Communicates via `window.maestro.*` IPC bridge defined in `preload.ts`.
+- **Preload** (`src/main/preload.ts`): Secure IPC bridge via `contextBridge`. All new IPC must go through here.
+- **Shared** (`src/shared/`): Types and utilities shared across processes.
+- **CLI** (`src/cli/`): Standalone CLI tool (`maestro-cli`) for headless batch automation.
+- **Web** (`src/web/`): Mobile-optimized React app for remote control.
+
+### Agent Model
+
+Each agent runs **two processes simultaneously**: an AI process (suffixed `-ai`) and a terminal process (suffixed `-terminal`). The `Session` interface in code represents an agent (historical naming). Use "agent" in user-facing language; reserve "session" for provider-level conversation contexts.
+
+### IPC Pattern
+
+To add a new IPC capability:
+
+1. Add handler in `src/main/index.ts` via `ipcMain.handle('namespace:action', ...)`
+2. Expose in `src/main/preload.ts` via `ipcRenderer.invoke()`
+3. Add types to `MaestroAPI` interface in preload.ts
+
+### Key Entry Points
+
+| Task               | Files                                                                                         |
+| ------------------ | --------------------------------------------------------------------------------------------- |
+| IPC handlers       | `src/main/index.ts`, `src/main/preload.ts`                                                    |
+| Keyboard shortcuts | `src/renderer/constants/shortcuts.ts`, `App.tsx`                                              |
+| Settings           | `src/renderer/hooks/useSettings.ts`                                                           |
+| Themes             | `src/renderer/constants/themes.ts`, `src/shared/theme-types.ts`                               |
+| Modal priorities   | `src/renderer/constants/modalPriorities.ts`                                                   |
+| Agent definitions  | `src/shared/agentIds.ts`, `src/main/agents/definitions.ts`, `src/main/agents/capabilities.ts` |
+| Output parsers     | `src/main/parsers/`, `src/main/parsers/index.ts`                                              |
+| System prompts     | `src/prompts/*.md`                                                                            |
+
+## Code Conventions
+
+### Formatting & Style
+
+- **Tabs for indentation** in TypeScript/JavaScript (not spaces). JSON/YAML use 2-space indent.
+- Prettier config: tabs, single quotes, trailing commas (es5), 100 char print width.
+- Husky pre-commit hooks auto-format staged files.
+
+### TypeScript
+
+- Strict mode enabled across all configs (`tsconfig.json`, `tsconfig.main.json`, `tsconfig.cli.json`).
+- Three separate tsconfig files: renderer/web/shared, main process, and CLI.
+- `@typescript-eslint/no-explicit-any` is currently `off` (legacy; avoid adding new `any`).
+- `react-hooks/exhaustive-deps` is intentionally `off` — this codebase uses refs to access latest values without causing re-renders.
+
+### React & UI
+
+- Functional components with hooks only.
+- Use Tailwind for layout, **inline styles for theme colors** (e.g., `style={{ color: theme.colors.textMain }}`). Never hardcode hex colors for themed elements.
+- Modals must register with the LayerStack system (don't handle Escape locally).
+- Focus management: use `tabIndex={-1}` + `outline-none` for programmatic focus.
+
+### Settings Pattern
+
+New settings follow a wrapper function pattern:
+
+1. State with `useState` in `useSettings.ts`
+2. Wrapper function that updates state AND calls `window.maestro.settings.set()`
+3. Load in `useEffect` from `window.maestro.settings.get()`
+
+### Error Handling
+
+- Let unexpected exceptions bubble up — Sentry captures them automatically.
+- Handle only expected/recoverable errors explicitly; re-throw unexpected ones.
+- Use `captureException`/`captureMessage` from `src/main/utils/sentry.ts` for explicit reporting.
+- Use `execFileNoThrow` for external commands (never shell-based execution).
+- Always `spawn()` with `shell: false`.
+
+### SSH Remote Execution
+
+Any feature spawning agent processes **must** support SSH remote execution:
+
+1. Check `session.sshRemoteConfig?.enabled`
+2. Use `wrapSpawnWithSsh()` from `src/main/utils/ssh-spawn-wrapper.ts`
+3. Use agent's `binaryName` for remote execution (not local paths)
+4. Don't hardcode `claude-code` — respect the configured agent type
+
+### Commit Messages
+
+Use conventional commits: `feat:`, `fix:`, `docs:`, `refactor:`, `test:`, `chore:`.
+
+### Branching
+
+- `main` = stable (odd minor versions, e.g., 0.15.x)
+- `rc` = pre-release (even minor versions, e.g., 0.16.x)
+- Bug fixes → `main`. New features → `rc`.
+
+## Performance
+
+- Memoize expensive computations with `useMemo`; use Maps for O(1) lookups instead of `Array.find()`.
+- Batch IPC calls; use `useBatchedSessionUpdates` for high-frequency updates.
+- Prefer 3-second intervals over 1-second for non-critical polling. Use event-driven updates via IPC when possible.
+- Clean up all timers, event listeners, and subscriptions in `useEffect` cleanup.
+
+## Encore Features (Feature Gating)
+
+Optional features disabled by default. When disabled, they must be completely invisible (no shortcuts, no menu items). Pattern: add flag to `EncoreFeatureFlags` in `src/renderer/types/index.ts`, default to `false` in `useSettings.ts`, gate all UI access points.
diff --git a/docs/Copilot-CLI-Phase2-Plan.md b/docs/Copilot-CLI-Phase2-Plan.md
@@ -0,0 +1,239 @@
+# Copilot CLI Phase 2: Full Parity Plan
+
+Use this document to kick off a new session once Phase 1 (core agent integration) is verified working. Phase 1 code is already merged — the agent ID, definition, capabilities, output parser, and error patterns are in place.
+
+## Prerequisite: Verify Phase 1
+
+Before starting Phase 2, confirm these work:
+
+1. `copilot` binary detected in Settings → AI Agents
+2. Creating a new Copilot CLI agent session succeeds
+3. Sending a message produces output (parsed from JSONL)
+4. Session resume via `--resume SESSION-ID` works
+5. Model selection via config option works
+
+If the JSONL output schema doesn't match the parser's assumptions, **fix the parser first** — see "JSON Schema Investigation" below.
+
+---
+
+## JSON Schema Investigation (Do This First)
+
+Run: `copilot -p "Say hello in one sentence" --output-format json --allow-all`
+
+Capture the full output and document the actual JSONL event types. The parser in `src/main/parsers/copilot-cli-output-parser.ts` uses heuristic type matching. Refine it to match the actual schema:
+
+1. What event type contains the session ID? (e.g., `session.started`, `init`, `thread.started`)
+2. What event type contains streaming text? (e.g., `content.delta`, `assistant.message`, `text`)
+3. What event type contains tool calls? (e.g., `tool_use`, `tool.started`, `tool.completed`)
+4. What event type contains usage stats? (e.g., `usage`, `turn.completed`, `stats`)
+5. What event type signals completion? (e.g., `result`, `complete`, `done`)
+6. How are errors structured? (e.g., `{ type: "error", error: { message: "..." } }`)
+
+Update `CopilotCliRawMessage` interface and `transformMessage()` method to match.
+
+---
+
+## Todo: Session Storage Browser
+
+**Goal**: Enable browsing/resuming past sessions from the Right Bar.
+
+**Files**:
+
+- New: `src/main/storage/copilot-cli-session-storage.ts`
+- Edit: `src/main/storage/index.ts` — register `CopilotCliSessionStorage`
+- Edit: `src/main/agents/capabilities.ts` — set `supportsSessionStorage: true`
+
+**Implementation**:
+
+1. Investigate session file format at `~/.copilot/session-state/` (may also be `~/.copilot/sessions/`)
+2. Extend `BaseSessionStorage` from `src/main/storage/base-session-storage.ts`
+3. Implement required methods:
+   - `listSessions(projectPath, options)` — list session files, extract metadata (title, date, agent)
+   - `readSessionMessages(projectPath, sessionId, options)` — parse session file into `SessionMessage[]`
+   - `searchSessions(projectPath, query)` — search session content
+   - `getGlobalStats()` — aggregate usage statistics (optional)
+4. Register in `initializeSessionStorages()` in `src/main/storage/index.ts`
+
+**Reference**: Follow `codex-session-storage.ts` or `factory-droid-session-storage.ts` patterns.
+
+---
+
+## Todo: Usage Stats & Cost Tracking
+
+**Goal**: Show token counts and cost in the UI (MainPanel token display, cost widget).
+
+**Files**:
+
+- Edit: `src/main/parsers/copilot-cli-output-parser.ts` — refine `extractUsageFromRaw()`
+- Edit: `src/main/agents/capabilities.ts` — set `supportsUsageStats: true`, `supportsCostTracking: true`
+
+**Implementation**:
+
+1. From the JSON schema investigation, identify which event carries usage data
+2. Map fields to `ParsedEvent.usage` (inputTokens, outputTokens, cacheReadTokens, costUsd)
+3. If Copilot CLI doesn't report cost directly, leave `supportsCostTracking: false` and only enable `supportsUsageStats: true`
+4. Context window: parse from JSON events if reported, otherwise use the user's configured value
+
+---
+
+## Todo: Thinking/Reasoning Display
+
+**Goal**: Show model reasoning/thinking content in the AI Terminal.
+
+**Files**:
+
+- Edit: `src/main/parsers/copilot-cli-output-parser.ts`
+- Edit: `src/main/agents/capabilities.ts` — set `supportsThinkingDisplay: true`
+
+**Implementation**:
+
+1. Check if Copilot CLI JSON output includes reasoning/thinking tokens (separate from main content)
+2. If yes: emit them as `type: 'text'` with `isPartial: true` (like Codex reasoning items)
+3. If no: leave `supportsThinkingDisplay: false`
+
+---
+
+## Todo: Read-Only Mode
+
+**Goal**: Restrict the agent to read-only operations for safe analysis.
+
+**Files**:
+
+- Edit: `src/main/agents/definitions.ts` — set `readOnlyArgs` and `readOnlyCliEnforced`
+- Edit: `src/main/agents/capabilities.ts` — set `supportsReadOnlyMode: true`
+
+**Implementation**:
+
+1. Test: `copilot -p "prompt" --deny-tool=write --deny-tool=create --deny-tool=apply_patch --output-format json`
+2. If this reliably prevents file modifications, update the definition:
+   ```typescript
+   readOnlyArgs: ['--deny-tool=write', '--deny-tool=create', '--deny-tool=apply_patch'],
+   readOnlyCliEnforced: true,
+   ```
+3. If `--deny-tool` doesn't work for read-only, use prompt-only enforcement (leave `readOnlyCliEnforced: false`)
+
+---
+
+## Todo: Image Input
+
+**Goal**: Allow attaching images/screenshots to prompts.
+
+**Files**:
+
+- Edit: `src/main/agents/definitions.ts` — add `imageArgs`
+- Edit: `src/main/agents/capabilities.ts` — set `supportsImageInput: true`
+
+**Implementation**:
+
+1. Check if Copilot CLI supports image input via `@ filename.png` or a flag like `-i`
+2. If supported via a flag: add `imageArgs: (imagePath: string) => ['--flag', imagePath]`
+3. If supported via stdin/stream-json: set `supportsStreamJsonInput: true` and add `--input-format stream-json` handling
+4. If not supported: leave `supportsImageInput: false`
+
+---
+
+## Todo: Wizard Support
+
+**Goal**: Enable inline wizard (structured output conversations) with Copilot CLI.
+
+**Files**:
+
+- Edit: `src/main/agents/capabilities.ts` — set `supportsWizard: true`
+
+**Implementation**:
+
+1. Test sending a structured wizard prompt to Copilot CLI
+2. Verify the agent follows the structured output format (numbered steps, clear sections)
+3. If output quality is sufficient: enable `supportsWizard: true`
+4. The wizard system is prompt-driven, so no code changes are needed if the agent handles prompts well
+
+---
+
+## Todo: Group Chat Moderation
+
+**Goal**: Allow Copilot CLI agents to serve as group chat moderators.
+
+**Files**:
+
+- Edit: `src/main/agents/capabilities.ts` — set `supportsGroupChatModeration: true`
+
+**Implementation**:
+
+1. Test group chat with Copilot CLI as moderator
+2. Verify it can coordinate between agents, route questions, and synthesize responses
+3. Group chat uses prompt-based coordination, so no code changes needed if quality is sufficient
+
+---
+
+## Todo: Context Export
+
+**Goal**: Allow exporting Copilot CLI session context for transfer to other agents.
+
+**Files**:
+
+- Edit: `src/main/agents/capabilities.ts` — set `supportsContextExport: true`
+
+**Implementation**:
+
+- Depends on session storage being implemented first
+- Context export reads session messages and formats them for another agent
+- Once `CopilotCliSessionStorage.readSessionMessages()` works, enable this flag
+
+---
+
+## Todo: Result Messages
+
+**Goal**: Detect when the agent has finished its response for Auto Run sequencing.
+
+**Files**:
+
+- Edit: `src/main/parsers/copilot-cli-output-parser.ts` — refine `isResultMessage()`
+- Edit: `src/main/agents/capabilities.ts` — set `supportsResultMessages: true`
+
+**Implementation**:
+
+1. From JSON schema investigation, identify the completion signal
+2. Update `transformMessage()` to emit `type: 'result'` for the correct event type
+3. Update `isResultMessage()` to match
+
+---
+
+## Capability Parity Matrix
+
+| Capability       | Claude Code | Codex | OpenCode | Factory Droid | Copilot CLI (Phase 1) | Copilot CLI (Target) |
+| ---------------- | :---------: | :---: | :------: | :-----------: | :-------------------: | :------------------: |
+| Resume           |     ✅      |  ✅   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Read-Only        |     ✅      |  ✅   |    ✅    |      ✅       |          ❌           |          ✅          |
+| JSON Output      |     ✅      |  ✅   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Session ID       |     ✅      |  ✅   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Image Input      |     ✅      |  ✅   |    ✅    |      ✅       |          ❌           |          ❓          |
+| Slash Commands   |     ✅      |  ❌   |    ❌    |      ❌       |          ✅           |          ✅          |
+| Session Storage  |     ✅      |  ✅   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Cost Tracking    |     ✅      |  ❌   |    ✅    |      ❌       |          ❌           |          ❓          |
+| Usage Stats      |     ✅      |  ✅   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Batch Mode       |     ✅      |  ✅   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Streaming        |     ✅      |  ✅   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Result Messages  |     ✅      |  ❌   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Model Selection  |     ❌      |  ✅   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Thinking Display |     ✅      |  ✅   |    ✅    |      ✅       |          ❌           |          ✅          |
+| Context Merge    |     ✅      |  ✅   |    ✅    |      ✅       |          ✅           |          ✅          |
+| Context Export   |     ✅      |  ✅   |    ✅    |      ✅       |          ❌           |          ✅          |
+| Wizard           |     ✅      |  ✅   |    ✅    |      ❌       |          ❌           |          ✅          |
+| Group Chat Mod   |     ✅      |  ✅   |    ✅    |      ✅       |          ❌           |          ✅          |
+
+❓ = depends on CLI capability (needs investigation)
+
+---
+
+## Suggested Order of Work
+
+1. **JSON Schema Investigation** — must be first; everything else depends on it
+2. **Parser Refinement** — fix parser to match actual schema
+3. **Usage Stats** — quick win, high visibility
+4. **Result Messages** — needed for Auto Run to work properly
+5. **Session Storage** — enables session browsing in Right Bar
+6. **Read-Only Mode** — safety feature
+7. **Thinking Display** — nice-to-have
+8. **Image Input** — if supported by CLI
+9. **Context Export** — depends on session storage
+10. **Wizard + Group Chat** — quality-dependent, test and enable