feat: add GitHub Copilot CLI as first-class agent by dwizzzle · Pull Request #701 · RunMaestro/Maestro

dwizzzle · 2026-04-01T03:22:25Z

Summary

Adds `copilot-cli` as a first-class agent in Maestro, on par with Claude Code, Codex, OpenCode, and Factory Droid.

What's included

Core integration (8 modified files, 2 new files):

Agent ID, definition, capabilities, display name, beta badge, context window
CLI args: `-p` (batch), `--output-format json`, `--resume`, `--allow-all`, `--model`
Config options for model selection and context window size

Output parser — verified against actual Copilot CLI JSONL output:

11 event types: `session.*`, `assistant.message_delta`, `assistant.message`, `tool.execution_start/complete`, `result`
Streaming text display, tool use tracking, session ID extraction, output token accumulation
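
The dispatch from Copilot CLI event types to parser events could look roughly like the sketch below. It is an illustration, not the parser in this PR: the `CopilotEvent` shape, the `Parsed` union, and the `data.deltaContent`, `data.content` and `data.toolRequests` field names are assumptions. Only the event type names and the `sessionId` on `result` come from the description above.

```typescript
// Hypothetical shapes for illustration; field names under `data` are assumptions.
type CopilotEvent = {
  type?: unknown;
  sessionId?: string;
  data?: { deltaContent?: string; content?: string; toolRequests?: unknown[] };
};

type Parsed =
  | { kind: "init" }
  | { kind: "text"; text: string; isPartial: boolean }
  | { kind: "tool_use" }
  | { kind: "result"; text: string }
  | { kind: "usage"; sessionId: string | undefined }
  | null;

function parseLine(line: string): Parsed {
  let raw: unknown;
  try {
    raw = JSON.parse(line);
  } catch {
    return null; // not JSON, nothing to parse
  }
  if (typeof raw !== "object" || raw === null) return null;
  const msg = raw as CopilotEvent;
  if (typeof msg.type !== "string") return null;

  switch (msg.type) {
    case "session.tools_updated":
      return { kind: "init" };
    case "assistant.message_delta":
      // Streaming chunk: partial text
      return { kind: "text", text: msg.data?.deltaContent ?? "", isPartial: true };
    case "assistant.message":
      // A message carrying tool requests is treated as tool use, otherwise as final text
      if ((msg.data?.toolRequests?.length ?? 0) > 0) return { kind: "tool_use" };
      return { kind: "result", text: msg.data?.content ?? "" };
    case "tool.execution_start":
    case "tool.execution_complete":
      return { kind: "tool_use" };
    case "result":
      // Session complete: this is where the session ID is extracted
      return { kind: "usage", sessionId: msg.sessionId };
    default:
      return null; // event types not covered by this sketch are ignored
  }
}
```

Returning `null` for unknown or malformed lines keeps the caller simple: it can skip anything the parser does not recognize.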

Session storage browser — reads ~/.copilot/session-state//:

Parses `workspace.yaml` for metadata (summary, cwd, timestamps)
Reads `events.jsonl` for message history
Supports pagination, search, and project path filtering
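
As a rough illustration of the pagination step, the sketch below pages through the contents of an `events.jsonl` file that has already been read into a string. The `data.content` field, the `offset`/`limit` parameter names, and the `Message` type are assumptions for this example. The `{ messages, total, hasMore }` result shape follows the one used in the review suggestion later in this thread.

```typescript
type Message = { role: "user" | "assistant"; content: string };
type Page = { messages: Message[]; total: number; hasMore: boolean };

// Pages over message events in JSONL text. Field names under `data` are assumed.
function pageMessages(jsonl: string, offset: number, limit: number): Page {
  const all: Message[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let ev: unknown;
    try {
      ev = JSON.parse(line);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (typeof ev !== "object" || ev === null) continue;
    const { type, data } = ev as { type?: unknown; data?: { content?: unknown } | null };
    const content = data?.content;
    if ((type === "user.message" || type === "assistant.message") && typeof content === "string") {
      all.push({ role: type === "user.message" ? "user" : "assistant", content });
    }
  }
  const messages = all.slice(offset, offset + limit);
  return { messages, total: all.length, hasMore: offset + messages.length < all.length };
}
```

Malformed lines are skipped rather than treated as fatal, so one bad record does not hide the rest of the history.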

Error patterns — auth failures, rate limiting, network errors, token exhaustion

UI — Added to `SUPPORTED_AGENTS` and wizard agent tiles (no more 'Coming Soon')

Testing

TypeScript compiles cleanly across all 3 configs
Maestro launches, detects `copilot` binary, registers parser
Ran `copilot -p ... --output-format json` to verify JSONL schema matches parser

Follow-up (Phase 2)

`docs/Copilot-CLI-Phase2-Plan.md` covers remaining parity items: read-only mode, wizard, group chat moderation.

Add copilot-cli agent support with full JSONL output parsing, session storage browsing, error detection, and UI integration.

Agent definition:
- Binary: copilot, batch mode via -p flag
- JSON output: --output-format json (JSONL)
- Session resume: --resume SESSION-ID
- YOLO mode: --allow-all
- Model selection: --model flag

Output parser (verified against actual CLI output):
- 11 event types: session lifecycle, streaming deltas, tool execution, assistant messages, and result with session ID
- Accumulates outputTokens from assistant.message events

Session storage:
- Reads ~/.copilot/session-state/<uuid>/workspace.yaml for metadata
- Parses events.jsonl for message history
- Supports pagination, search, and project filtering

Also includes:
- Error patterns (auth, rate limit, network, token exhaustion)
- UI: added to SUPPORTED_AGENTS and wizard agent tiles
- copilot-instructions.md for Copilot sessions in this repo

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

greptile-apps · 2026-04-01T03:27:30Z

Greptile Summary

This PR integrates GitHub Copilot CLI (copilot) as a first-class agent in Maestro, following the established patterns used by Claude Code, Codex, OpenCode, and Factory Droid. The integration spans all required layers: agent ID/definition/capabilities, a JSONL output parser, a session storage browser, error-pattern matching, and UI registration (wizard tile + SUPPORTED_AGENTS). A Phase 2 roadmap document is included for future parity work.

Key changes:

src/main/parsers/copilot-cli-output-parser.ts — New parser handling 11 Copilot CLI JSONL event types; contains two bugs around accumulatedOutputTokens never resetting between sessions and tokens being silently dropped when the result event lacks a usage field.
src/main/storage/copilot-cli-session-storage.ts — New session storage reading ~/.copilot/session-state/<uuid>/; unvalidated sessionId in public methods creates a path traversal risk.
src/main/agents/capabilities.ts — Capability flags for Copilot CLI; supportsSessionStorage and supportsResultMessages are set true but the Phase 2 plan parity matrix marks both as Phase 1 unsupported.
All shared/registry wiring (agent IDs, metadata, constants, parser index, storage index) is clean and consistent with existing agent patterns.

Confidence Score: 4/5

Safe to merge after fixing the two P1 token-accumulation bugs in the output parser.

Two P1 issues exist in the new output parser: accumulatedOutputTokens is never reset between sessions (singleton parser instance), and the accumulated token count is silently discarded when the result event has no usage field. Both produce incorrect usage stats from the second session onward. The session storage and UI changes are clean. Fixing the two accumulator issues is straightforward and does not require architectural changes.

src/main/parsers/copilot-cli-output-parser.ts — token accumulator reset and unconditional usage reporting; src/main/storage/copilot-cli-session-storage.ts — sessionId validation before path join.

Important Files Changed

Filename	Overview
src/main/parsers/copilot-cli-output-parser.ts	New JSONL parser for Copilot CLI; two bugs: accumulatedOutputTokens never resets between sessions (singleton parser), and accumulated tokens are silently discarded when the result event lacks a usage field.
src/main/storage/copilot-cli-session-storage.ts	New session storage reading ~/.copilot/session-state/; logic is solid but readSessionMessages/getSessionPath accept an unvalidated sessionId that is directly concatenated into a file path, enabling potential path traversal.
src/main/agents/capabilities.ts	Copilot CLI capability block added; supportsSessionStorage and supportsResultMessages are both set true but the Phase 2 plan parity matrix marks both as Phase 1 unsupported, requiring clarification.
src/main/agents/definitions.ts	Copilot CLI agent definition added with correct CLI flags (-p, --output-format json, --resume, --allow-all, --model) and UI config options for model and context window.
src/main/parsers/error-patterns.ts	Copilot CLI error patterns added covering auth failures, rate limiting, network errors, and token exhaustion; patterns are reasonable and registered correctly.
src/renderer/components/Wizard/screens/AgentSelectionScreen.tsx	Copilot CLI tile added to the wizard; GRID_ROWS bumped to 3 for 7 items (correct), but the constant is hardcoded rather than derived from tile count.
docs/Copilot-CLI-Phase2-Plan.md	Phase 2 roadmap doc; useful, but the parity matrix is already stale — session storage shipped in this PR but is still shown as Phase 1 unsupported.

Sequence Diagram

sequenceDiagram
    participant UI as Renderer (UI)
    participant Main as Main Process
    participant Parser as CopilotCliOutputParser
    participant Storage as CopilotCliSessionStorage
    participant CLI as copilot binary

    UI->>Main: Launch agent (copilot -p "..." --output-format json --allow-all)
    Main->>CLI: spawn()

    CLI-->>Parser: session.tools_updated (JSONL)
    Parser-->>Main: ParsedEvent { type: 'init' }

    CLI-->>Parser: assistant.message_delta (streaming)
    Parser-->>Main: ParsedEvent { type: 'text', isPartial: true }

    CLI-->>Parser: assistant.message (with toolRequests)
    Parser-->>Main: ParsedEvent { type: 'tool_use' }

    CLI-->>Parser: tool.execution_start / tool.execution_complete
    Parser-->>Main: ParsedEvent { type: 'tool_use', toolState }

    CLI-->>Parser: assistant.message (text only)
    Parser-->>Main: ParsedEvent { type: 'result', text }
    Note over Parser: accumulatedOutputTokens += outputTokens

    CLI-->>Parser: result (sessionId, usage)
    Parser-->>Main: ParsedEvent { type: 'usage', sessionId, usage }
    Note over Parser: ⚠ tokens not reset after this point

    Main->>UI: Session complete, sessionId stored

    UI->>Main: Browse past sessions
    Main->>Storage: listSessions(projectPath)
    Storage->>Storage: readdir ~/.copilot/session-state/
    Storage->>Storage: parseWorkspaceYaml per UUID dir
    Storage-->>Main: AgentSessionInfo[]
    Main-->>UI: Session list

    UI->>Main: Resume session (--resume SESSION-ID)
    Main->>CLI: spawn with resumeArgs

Comments Outside Diff (2)

src/main/storage/copilot-cli-session-storage.ts, line 1304-1311

sessionId not validated before path construction

readSessionMessages (and getSessionPath, getSearchableMessages) construct a file path by directly concatenating the caller-supplied sessionId into the session base directory:
```
const sessionDir = path.join(getCopilotSessionDir(), sessionId);
```
If sessionId contains path traversal sequences (e.g. ../../../etc/passwd), path.join will resolve them and the code will attempt to read files outside ~/.copilot/session-state/. listSessions generates clean UUIDs from directory reads, but readSessionMessages is a public interface that accepts arbitrary strings. Adding a UUID-format guard before path construction would eliminate this risk:
```
if (!/^[0-9a-f-]{36}$/i.test(sessionId)) {
    return { messages: [], total: 0, hasMore: false };
}
```
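
A slightly stricter variant of the guard above is sketched here. It assumes session directory names are canonical hyphenated UUIDs (8-4-4-4-12 hex digits), which this thread does not confirm. The helper names `isValidSessionId` and `sessionDirFor` are hypothetical. The character-class pattern in the suggestion would also accept strings such as 36 hyphens; those cannot traverse directories, but they are not UUIDs either.

```typescript
// Canonical UUID layout: 8-4-4-4-12 hexadecimal digits (assumed directory naming).
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isValidSessionId(sessionId: string): boolean {
  return UUID_RE.test(sessionId);
}

// Returns the session directory, or null when the ID is not a well-formed UUID.
// `baseDir` stands in for whatever getCopilotSessionDir() returns.
function sessionDirFor(baseDir: string, sessionId: string): string | null {
  if (!isValidSessionId(sessionId)) return null;
  return `${baseDir}/${sessionId}`;
}
```

Because a valid ID contains only hex digits and hyphens, the joined path cannot contain `..` or a separator, so no further normalization is needed in this sketch.
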
src/renderer/components/Wizard/screens/AgentSelectionScreen.tsx, line 1476-1478

Hardcoded GRID_ROWS will need manual bumping for future agents

The change from GRID_ROWS = 2 to GRID_ROWS = 3 is correct for 7 items, but this constant will need to be bumped again when a future tile is added. Consider deriving it programmatically:
```
const GRID_ROWS = Math.ceil(AGENT_TILES.length / GRID_COLS);
```
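
To make the arithmetic concrete, here is a small sketch of the derived row count. The column count of 3 is an assumption inferred from the numbers in this comment (2 rows before, 3 rows for 7 tiles); the `gridRows` helper is hypothetical.

```typescript
const GRID_COLS = 3; // assumed column count, not confirmed in this thread

// Number of rows needed to lay out `tileCount` tiles in `cols` columns.
function gridRows(tileCount: number, cols: number): number {
  return Math.ceil(tileCount / cols);
}

const rowsForSevenTiles = gridRows(7, GRID_COLS); // 3
```

With this form, adding an eighth or ninth tile keeps the count at 3, and a tenth tile moves it to 4 without a manual edit.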


greptile-apps · 2026-04-01T03:27:35Z

src/main/parsers/copilot-cli-output-parser.ts

+	readonly agentId: ToolType = 'copilot-cli';
+
+	// Accumulate output tokens from assistant.message events for usage reporting
+	private accumulatedOutputTokens = 0;


accumulatedOutputTokens never resets between sessions

CopilotCliOutputParser is registered as a singleton (one instance for the lifetime of the process). The accumulatedOutputTokens field is never reset, so after the first Copilot CLI session ends and a second session starts, the token count from session 1 will be added to session 2's tally. Every subsequent session will report an inflated (and incorrect) output-token count.

Other parsers avoid this by reading token values directly from each event rather than accumulating across sessions. The fix is to reset the counter when the top-level result event is emitted:

```
case 'result': {
	const event: ParsedEvent = {
		type: 'usage',
		sessionId: msg.sessionId,
		raw: msg,
	};
	event.usage = {
		inputTokens: 0,
		outputTokens: this.accumulatedOutputTokens,
	};
	// Reset for next session
	this.accumulatedOutputTokens = 0;
	return event;
}
```

greptile-apps · 2026-04-01T03:27:37Z

src/main/parsers/copilot-cli-output-parser.ts

+
+			// ---- Result (session complete) ----
+
+			case 'result': {
+				const event: ParsedEvent = {
+					type: 'usage',
+					sessionId: msg.sessionId,
+					raw: msg,
+				};
+
+				// Extract usage stats
+				if (msg.usage) {
+					event.usage = {
+						inputTokens: 0, // Copilot CLI doesn't report input tokens
+						outputTokens: this.accumulatedOutputTokens,
+						// No per-token cost — Copilot uses premium requests model
+					};
+				}
+


Accumulated tokens silently dropped when result event has no usage field

accumulatedOutputTokens is only written into event.usage when msg.usage is truthy. If the result event is missing the top-level usage object (network truncation, CLI version difference, empty session, etc.), the entire per-turn accumulation is discarded and the caller receives a ParsedEvent with type: 'usage' but no event.usage set.

Because extractUsage returns event.usage || null, the caller will receive null, meaning all output-token data for the session is silently lost. Report the accumulated tokens unconditionally:

```
case 'result': {
	const event: ParsedEvent = {
		type: 'usage',
		sessionId: msg.sessionId,
		raw: msg,
	};
	// Report accumulated output tokens regardless of whether msg.usage exists
	event.usage = {
		inputTokens: 0,
		outputTokens: this.accumulatedOutputTokens,
	};
	this.accumulatedOutputTokens = 0;
	return event;
}
```
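
The two accumulator fixes can be seen together in a stripped-down sketch. This is not the PR's parser: the `TokenAccumulator` class and the `data.outputTokens` field location are assumptions made for illustration. It only shows that a long-lived instance reports per-session totals once the counter is reset on `result` and reported unconditionally.

```typescript
type Usage = { inputTokens: number; outputTokens: number };
// Hypothetical event shape; where outputTokens lives on the event is an assumption.
type Msg = { type: string; usage?: unknown; data?: { outputTokens?: number } };

class TokenAccumulator {
  private accumulated = 0;

  // Returns usage on `result`, null for every other event.
  handle(msg: Msg): Usage | null {
    if (msg.type === "assistant.message") {
      this.accumulated += msg.data?.outputTokens ?? 0;
      return null;
    }
    if (msg.type === "result") {
      // Report regardless of whether msg.usage is present
      const usage: Usage = { inputTokens: 0, outputTokens: this.accumulated };
      this.accumulated = 0; // reset so the next session starts from zero
      return usage;
    }
    return null;
  }
}

// One shared instance across two sessions, as with a singleton parser.
const acc = new TokenAccumulator();
acc.handle({ type: "assistant.message", data: { outputTokens: 5 } });
acc.handle({ type: "assistant.message", data: { outputTokens: 7 } });
const first = acc.handle({ type: "result" }); // no usage field on the event
acc.handle({ type: "assistant.message", data: { outputTokens: 3 } });
const second = acc.handle({ type: "result", usage: {} });
```

Here `first` carries 12 output tokens and `second` carries 3. Without the reset, the second session would report 15.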

greptile-apps · 2026-04-01T03:27:39Z

src/main/agents/capabilities.ts

+	 *
+	 * Phase 1 capabilities are conservative — advanced features (session storage
+	 * browsing, cost tracking, thinking display) will be enabled in Phase 2
+	 * after verifying the JSON output schema.
+	 */
+	'copilot-cli': {
+		supportsResume: true, // --resume SESSION-ID
+		supportsReadOnlyMode: false, // No explicit CLI flag; may use --deny-tool in future
+		supportsJsonOutput: true, // --output-format json (JSONL format)
+		supportsSessionId: true, // sessionId in 'result' event - Verified
+		supportsImageInput: false, // Not documented in CLI reference
+		supportsImageInputOnResume: false,
+		supportsSlashCommands: true, // /help, /compact, /model, /resume, /usage, etc.
+		supportsSessionStorage: true, // ~/.copilot/session-state/<uuid>/ - Verified
+		supportsCostTracking: false, // Uses premium requests model, not per-token cost
+		supportsUsageStats: true, // outputTokens in assistant.message events - Verified
+		supportsBatchMode: true, // -p flag for programmatic execution
+		requiresPromptToStart: true, // Requires -p prompt in batch mode


Capabilities contradict the Phase 2 parity matrix

supportsSessionStorage: true and supportsResultMessages: true are set here, but the parity matrix in docs/Copilot-CLI-Phase2-Plan.md explicitly marks both as ❌ for "Copilot CLI (Phase 1)".

supportsSessionStorage is fine because CopilotCliSessionStorage is actually implemented in this PR — the plan doc is simply stale on that point. However supportsResultMessages needs a second look: the plan doc marks it as unsupported in Phase 1 and lists it as a Phase 2 TODO. If it truly is unsupported, the flag should remain false to avoid enabling Auto Run prematurely. If it is supported as implemented, the plan doc should be updated to reflect that.

41 tests covering all 11 JSONL event types verified against actual copilot CLI output.

Includes:
- Session lifecycle: mcp_server_status_changed, mcp_servers_loaded, tools_updated
- Conversation: user.message, assistant.turn_start/end, message_delta, message
- Tool execution: tool.execution_start, tool.execution_complete
- Completion: result with sessionId and usage
- Error detection: auth, rate limit, network, exit codes
- End-to-end: full session simulations (simple + tool use)
- Edge cases: empty deltas, long output truncation, token accumulation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ssues On Windows, long prompts passed as PowerShell CLI args get garbled due to escaping issues with special characters. Changed copilot-cli to use '-p -' (read from stdin) and sendPromptViaStdinRaw=true. Also added sendPromptViaStdinRaw as an agent definition field so other agents can opt into stdin-based prompt delivery. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When an output parser is registered (JSONL agents like copilot-cli, codex, opencode, factory-droid), non-JSON lines from stdout are now suppressed instead of being displayed to the user as raw text. This prevents PowerShell profile banners and MCP server startup messages from cluttering the agent output. Only agents WITHOUT an output parser (terminal, legacy mode) continue to emit raw non-JSON lines, which is correct for terminal sessions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
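
The routing rule described in this commit might be sketched as follows. The `routeStdoutLine` function and the `Route` labels are hypothetical, and the handling of JSON lines for agents without a parser (shown raw here) is an assumption, since the commit message only covers non-JSON lines.

```typescript
type Route = "parse" | "raw" | "drop";

// True when the line is a parseable JSON object, i.e. one JSONL record.
function isJsonObjectLine(line: string): boolean {
  const trimmed = line.trim();
  if (!trimmed.startsWith("{")) return false;
  try {
    JSON.parse(trimmed);
    return true;
  } catch {
    return false;
  }
}

function routeStdoutLine(line: string, hasOutputParser: boolean): Route {
  if (line.trim() === "") return "drop";
  // Terminal and legacy sessions have no parser: show everything as raw text.
  if (!hasOutputParser) return "raw";
  // JSONL agents: structured lines go to the parser, banners and noise are suppressed.
  return isJsonObjectLine(line) ? "parse" : "drop";
}
```

For example, a PowerShell profile banner is dropped for a JSONL agent but still shown in a terminal session.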

nolanmclark · 2026-04-01T14:35:02Z

Had a similar PR if you wanted to build on it or take anything from it. :) #566

The promptArgs function was being skipped when sendPromptViaStdinRaw=true because the spawner's promptViaStdin guard prevents adding prompt args. But '-p -' isn't prompt text — it's a flag telling copilot to read stdin. Moved to batchModeArgs so it's always present in batch mode. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

copilot -p - means 'prompt is the literal dash character', not 'read from stdin'. When stdin is piped (sendPromptViaStdinRaw), copilot reads it automatically without -p. Removed -p - from batchModeArgs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot CLI outputs MCP server startup messages, PowerShell profile banners, and initialization noise to stderr. These were being displayed via the onStderr renderer handler. Now suppressed for all JSONL agents with output parsers (except Codex which has special stderr handling). Error detection still runs first, so real errors are captured. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Two fixes for the raw JSONL display issue:

1. detectErrorFromParsed now handles session.error events (Copilot CLI format: data.message, data.errorType). Previously, session.error wasn't caught because it has no top-level 'error' field, causing the error to fall through to detectErrorFromExit which dumps the full stdoutBuffer in raw.
2. detectErrorFromExit no longer includes stderr/stdout in raw — for JSONL agents the stdoutBuffer contains all parsed JSON events which is noise, not useful error context.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
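
A minimal sketch of the first fix, assuming only what the commit message states: a `session.error` event carries `data.message` and `data.errorType` and has no top-level `error` field. The `detectSessionError` name, the `AgentError` type, and the fallback strings are hypothetical.

```typescript
type AgentError = { type: string; message: string };

// Detects a Copilot CLI style session.error event in an already-parsed JSONL record.
function detectSessionError(parsed: unknown): AgentError | null {
  if (typeof parsed !== "object" || parsed === null) return null;
  const ev = parsed as { type?: unknown; data?: unknown };
  if (ev.type !== "session.error") return null;

  const data = (typeof ev.data === "object" && ev.data !== null ? ev.data : {}) as {
    message?: unknown;
    errorType?: unknown;
  };
  // Fallback values are placeholders for this sketch
  const message = typeof data.message === "string" ? data.message : "Unknown session error";
  const type = typeof data.errorType === "string" ? data.errorType : "unknown";
  return { type, message };
}
```

Catching the error at the parsed-event stage means the exit-code path never has to fall back on the raw stdout buffer.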

The root cause of raw JSONL display: copilot-cli's --output-format json args didn't match the isStreamJsonMode heuristic check, so the process was treated as batch-JSON (single JSON blob) instead of streaming JSONL. On exit, handleBatchModeExit tried to JSON.parse the entire buffer, failed, and dumped the raw content to the display. Fixed by adding outputParser presence as a signal for stream-json mode. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
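
The shape of this fix can be illustrated as below. The existing heuristic in Maestro is not shown in this thread, so the `stream-json` argument check is an assumption standing in for it. Only the added signal, the presence of an output parser, comes from the commit message.

```typescript
// Decides whether stdout should be handled as streaming JSONL rather than one JSON blob.
function isStreamJsonMode(args: string[], hasOutputParser: boolean): boolean {
  // Assumed stand-in for the pre-existing args heuristic
  const argsLookStreaming = args.includes("stream-json");
  // New signal: a registered output parser implies line-by-line JSONL handling
  return argsLookStreaming || hasOutputParser;
}
```

With `--output-format json` the argument check alone fails, so without the parser signal the process would be treated as batch JSON and parsed as a single blob on exit.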

P1: Reset accumulatedOutputTokens on result event (singleton parser was carrying tokens across sessions). Report tokens unconditionally even when result event has no usage field. P2: Add UUID validation to session storage public methods to prevent path traversal. Update Phase 2 parity matrix to reflect shipped features. Derive GRID_ROWS from AGENT_TILES.length. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dwizzzle · 2026-04-01T18:57:36Z

Review Feedback Addressed (commit `05ae36d`)

Thanks for the thorough review. All 5 issues fixed:

P1 - Token accumulator bugs (both fixed)

accumulatedOutputTokens never resets - Now resets to 0 after the result event. The parser is a singleton so this was carrying tokens across sessions.
Tokens silently dropped when result has no usage field - Usage stats are now reported unconditionally on result, regardless of whether msg.usage exists.

P2 - Session storage, docs, UI (all fixed)

Path traversal in session storage - Added UUID format validation to readSessionMessages, getSessionPath, and getSearchableMessages before path construction.
Capabilities vs Phase 2 parity matrix - Updated docs/Copilot-CLI-Phase2-Plan.md to mark session storage, result messages, and usage stats as shipped.
Hardcoded GRID_ROWS - Changed to Math.ceil(AGENT_TILES.length / GRID_COLS) so it auto-adjusts when future agents are added.

Tests updated to cover token reset behavior and unconditional usage reporting. All 41 tests pass.

pedramamini · 2026-04-02T01:32:07Z

@coderabbitai analyze this PR

dwizzzle and others added 3 commits March 31, 2026 20:37

dwizzzle and others added 5 commits April 1, 2026 08:00

dwizzzle force-pushed the feat/copilot-cli-agent branch from 82f1fa8 to be67a81 Compare April 1, 2026 18:41

feat: add GitHub Copilot CLI as first-class agent #701
dwizzzle wants to merge 10 commits into RunMaestro:main from dwizzzle:feat/copilot-cli-agent
