diff --git a/README.md b/README.md index 96abac2..7d6706d 100644 --- a/README.md +++ b/README.md @@ -1,26 +1,44 @@
+ MCP Comet banner

MCP Comet

+

Turn Perplexity Comet into a production-grade MCP research engine

- MCP Comet banner -

-

Turn Perplexity Comet into a production-grade MCP research engine

-

- npm + npm + downloads build + typescript license

7 modes · 13 tools · zero-friction setup · full browser control

Quick Start · Tool Reference · - Architecture + Architecture · + Issues

--- +## Table of Contents + +- [Why MCP Comet](#why-mcp-comet) +- [Demo](#demo) +- [Quick Start](#quick-start) +- [Research Modes](#research-modes) +- [Toolset at a Glance](#toolset-at-a-glance) +- [Agent Workflows](#agent-workflows) +- [CLI Power Ops](#cli-power-ops) +- [Architecture](#architecture) +- [Configuration](#configuration) +- [Compatibility](#compatibility) +- [Contributing](#contributing) +- [License](#license) + +--- + ## Why MCP Comet MCP Comet gives your agent more than a chat box. It gives your agent a complete research cockpit. @@ -30,27 +48,44 @@ MCP Comet gives your agent more than a chat box. It gives your agent a complete - Pull sources, screenshots, conversations, tabs, and full page content. - Stay resilient with auto-connect, reconnect logic, and selector fallback strategies. -If your workflow is "ask, verify, cite, and iterate", this is the server built for it. +> [!NOTE] +> If your workflow is "ask, verify, cite, and iterate", this is the server built for it. + +--- + +## Demo + +

+ MCP Comet deep-research demo +

--- ## Quick Start -## Installation +### Prerequisites -### Option 1: Global install (recommended) +- Node.js >= 18 +- [Perplexity Comet](https://comet.perplexity.ai/) installed + +> [!TIP] +> Run `mcp-comet detect` to verify your setup before configuring. + +### Install + +**Option 1: Global install (recommended)** ```bash npm install -g @onestepat4time/mcp-comet ``` -### Option 2: Run without install +**Option 2: Run without install** ```bash npx -y @onestepat4time/mcp-comet ``` -### Option 3: Local development +**Option 3: Local development** ```bash git clone https://github.com/OneStepAt4time/mcp-comet.git @@ -59,24 +94,11 @@ npm ci npm run build ``` ---- - -### 1. Prerequisites - -- Node.js >= 18 -- [Perplexity Comet](https://comet.perplexity.ai/) installed - -Optional pre-flight check: - -```bash -mcp-comet detect -``` - -### 2. Add MCP Comet to MCP +### Configure -Use one of these configs. +Add MCP Comet to your MCP client config. -Claude Desktop (`~/.claude/claude_desktop_config.json`): +**Claude Desktop** (`~/.claude/claude_desktop_config.json`): ```json { @@ -90,7 +112,7 @@ Claude Desktop (`~/.claude/claude_desktop_config.json`): } ``` -Cursor (`~/.cursor/mcp.json`): +**Cursor** (`~/.cursor/mcp.json`): ```json { @@ -104,7 +126,7 @@ Cursor (`~/.cursor/mcp.json`): } ``` -### 3. Give your agent a mission +### First Query Prompt your agent with something like: @@ -118,13 +140,15 @@ Your agent can chain `comet_mode`, `comet_ask`, `comet_wait`, and `comet_get_sou Choose the mode that matches the job. -- `standard`: fast factual lookups. Example: "What is the latest CPI reading for Canada?" -- `deep-research`: multi-source investigations. Example: "Map the 2026 AI chip supply chain and major risks." -- `model-council`: multi-perspective reasoning. Example: "Debate arguments for and against UBI with tradeoffs." -- `create`: drafting and ideation. Example: "Draft a technical explainer on WebAssembly in edge runtimes." -- `learn`: guided teaching. Example: "Teach me B-trees step by step with examples." -- `review`: critical analysis. Example: "Review this API design for security and reliability gaps." -- `computer`: browser-interactive tasks. Example: "Open arXiv and find the newest papers on retrieval augmentation." +| Mode | Description | Example | +| --- | --- | --- | +| `standard` | Fast factual lookups | "What is the latest CPI reading for Canada?" | +| `deep-research` | Multi-source investigations | "Map the 2026 AI chip supply chain and major risks." | +| `model-council` | Multi-perspective reasoning | "Debate arguments for and against UBI with tradeoffs." | +| `create` | Drafting and ideation | "Draft a technical explainer on WebAssembly in edge runtimes." | +| `learn` | Guided teaching | "Teach me B-trees step by step with examples." | +| `review` | Critical analysis | "Review this API design for security and reliability gaps." | +| `computer` | Browser-interactive tasks | "Open arXiv and find the newest papers on retrieval augmentation." | CLI example: @@ -141,10 +165,12 @@ mcp-comet call comet_get_sources ### Session -- `comet_connect`: connects to Comet or launches it. -- `comet_poll`: returns live status and partial progress. -- `comet_wait`: waits for completion and returns the full response. -- `comet_stop`: stops a running task. +| Tool | Description | +| --- | --- | +| `comet_connect` | Connects to Comet or launches it | +| `comet_poll` | Returns live status and partial progress | +| `comet_wait` | Waits for completion and returns the full response | +| `comet_stop` | Stops a running task | ### Query @@ -225,6 +251,12 @@ MCP Tools -> Perplexity Comet ``` +

+ Node.js + TypeScript + Chrome CDP +

+ - Ordered selector strategies tolerate Comet UI changes. - Automatic version detection selects the correct selector set. - Auto-reconnect includes health checks with retry backoff. @@ -234,6 +266,9 @@ Deep dive: [docs/architecture.md](docs/architecture.md) --- +
+Configuration + ## Configuration Essentials Most teams only tune these three: @@ -255,8 +290,13 @@ Config file (`mcp-comet.config.json`) example: More options and full env var reference: [docs/configuration.md](docs/configuration.md) +
+ --- +
+Compatibility + ## Compatibility | Chrome Version | Selector Set | Status | @@ -267,24 +307,26 @@ Unknown versions fall back to the latest known selector set. Details and upgrade flow: [docs/comet-compatibility.md](docs/comet-compatibility.md) +
+ --- ## Contributing -PRs are welcome. Before opening one: +PRs welcome. For non-trivial changes, please open an issue first to discuss what you'd like to change. ```bash npm run lint && npm test ``` -Guide: [docs/contributing.md](docs/contributing.md) - -## Feedback - -- Issues: +Full guide: [docs/contributing.md](docs/contributing.md) ## License [MIT](LICENSE) -Built by [OneStepAt4time](https://github.com/OneStepAt4time) +--- + +
+ Built with dedication by OneStepAt4time +
diff --git a/docs/assets/demo.gif b/docs/assets/demo.gif new file mode 100644 index 0000000..2ac1ac8 Binary files /dev/null and b/docs/assets/demo.gif differ diff --git a/docs/plans/2026-04-07-review-and-testing-design.md b/docs/plans/2026-04-07-review-and-testing-design.md deleted file mode 100644 index a8dc939..0000000 --- a/docs/plans/2026-04-07-review-and-testing-design.md +++ /dev/null @@ -1,106 +0,0 @@ ---- -date: 2026-04-07 -title: Comprehensive Review & Testing Plan -status: approved -approach: Bottom-Up (Unit -> Integration -> UAT -> CI) ---- - -# Asteria Comprehensive Review & Testing Plan - -## Goal - -Full audit, risk mapping, and production gate for the Asteria MCP server. -Approach: Bottom-Up — each phase gates the next. - -## Phase 1: Unit Test Gap Fill - -**Gate**: Every `src/` file has at least one test covering its exports. - -### P0 — Critical Untested Modules - -| Module | What to Test | -|--------|-------------| -| `src/cli.ts` | `printUsage()`, `printVersion()`, `runDetect()`, `runCall()`, `main()` — command parsing, unknown command, invalid JSON | -| `src/cdp/client.ts` | `connect()`, `disconnect()`, `navigate()`, `screenshot()`, `evaluate()`, `safeEvaluate()`, `pressKey()`, `isHealthy()`, `ensureHealthyConnection()`, `withAutoReconnect()`, `reconnect()`, `listTabsCategorized()`, `launchOrConnect()`, `closeExtraTabs()`, `pickBestTarget()` | -| `src/server.ts` inline scripts | Extract `comet_stop` and `comet_list_conversations` inline scripts to `src/ui/` as testable functions, then unit test them | - -### P1 — Edge Case Gaps - -| Module | Missing Edge Cases | -|--------|-------------------| -| `src/cdp/browser.ts` | `isWSL()`, `httpGet()` timeout/errors, `isCometProcessRunning()`, `killComet()`, `startCometProcess()` | -| `src/config.ts` | Malformed JSON config, invalid env var values, unknown config keys | -| `src/cdp/connection.ts` | `isConnectionError()` with non-Error inputs, `getBackoffDelay` with edge values | -| `src/ui/navigation.ts` | `buildModeSwitchScript` with unknown modes, `buildSubmitPromptScript` timeout | -| `src/ui/status.ts` | `buildGetAgentStatusScript` with custom selectors, status logic, response truncation | -| `src/prose-filter.ts` | `buildPreSendStateScript` wrapper | - -### P2 — Lower Priority - -- `src/snapshot.ts` — `runSnapshot()` with mocked CDPClient -- `src/selectors/v145.ts` — verify selector strings are valid CSS -- `src/version.ts` — `detectCometVersion` with invalid JSON, timeout - -## Phase 2: Integration Tests - -**Gate**: All 12 tool handlers have at least one mock-based integration test (happy path + one error path). - -### Mock-Based Integration Tests - -| Tool | Test Scenarios | -|------|---------------| -| `comet_connect` | Happy path connect, launch new browser, close extra tabs, version detection | -| `comet_ask` | Happy path response, timeout with partial response, newChat=true, polling loop | -| `comet_poll` | Returns status JSON, handles evaluation errors | -| `comet_stop` | Stop button found, no stop button found | -| `comet_screenshot` | PNG and JPEG formats | -| `comet_mode` | Get current mode, switch mode, unknown mode | -| `comet_list_tabs` | Categorized tabs output | -| `comet_switch_tab` | Switch by tabId, switch by title, not found | -| `comet_get_sources` | Sources found, no sources | -| `comet_list_conversations` | Conversations found, none found | -| `comet_open_conversation` | Valid URL, invalid URL rejection | -| `comet_get_page_content` | With default and custom maxLength | - -### Real Browser Integration (optional, requires Comet running) - -- Connect to real Comet instance -- Send a real query via `comet_ask` -- Verify response comes back -- Test reconnection after Comet restart - -## Phase 3: UAT Plan Formalization - -**Gate**: UAT plan document reviewed and approved with clear pass/fail criteria for every test case. - -### UAT Categories - -1. **Smoke tests** (5 min) — `comet_connect`, `comet_ask`, `comet_poll`, `comet_screenshot` -2. **Functional tests** (20 min) — All 12 tools through their paces -3. **Error recovery** (10 min) — Comet restart, timeout, invalid inputs -4. **Mode switching** (5 min) — All 7 modes -5. **Cross-session** (5 min) — Multiple queries, conversation history - -### Test Case Format - -| Field | Description | -|-------|-------------| -| Test ID | e.g., `UAT-001` | -| Tool | Which MCP tool | -| Preconditions | What must be true before testing | -| Steps | Numbered, specific actions | -| Expected Result | Exact pass criteria | -| Actual Result | (filled during testing) | -| Pass/Fail | (filled during testing) | - -## Phase 4: CI/CD Hardening - -**Gate**: CI runs full test suite with coverage reporting and enforces thresholds. - -### Actions - -1. Add vitest coverage config — target 80% statements, 70% branches -2. Update CI workflow — add coverage step, fail on threshold breach -3. Add coverage badge to README -4. Add `test:ci` script with coverage + junit reporter -5. Gate PRs on passing tests + coverage thresholds diff --git a/docs/plans/2026-04-07-review-and-testing-plan.md b/docs/plans/2026-04-07-review-and-testing-plan.md deleted file mode 100644 index f4c8e54..0000000 --- a/docs/plans/2026-04-07-review-and-testing-plan.md +++ /dev/null @@ -1,1349 +0,0 @@ -# Asteria Review & Testing Implementation Plan - -> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. - -**Goal:** Achieve production-ready test coverage for Asteria MCP server — unit tests, integration tests, formalized UAT plan, and CI/CD hardening. - -**Architecture:** Bottom-up approach. Phase 1 fills unit test gaps. Phase 2 adds integration tests. Phase 3 formalizes UAT. Phase 4 hardens CI/CD. Each phase gates the next. - -**Tech Stack:** Vitest, TypeScript, chrome-remote-interface (mocked), @modelcontextprotocol/sdk - ---- - -## Phase 1: Unit Test Gap Fill - -### Task 1: Extract inline scripts to testable functions - -**Files:** -- Create: `src/ui/stop.ts` -- Create: `tests/unit/ui/stop.test.ts` -- Create: `src/ui/conversations.ts` -- Create: `tests/unit/ui/conversations.test.ts` -- Modify: `src/server.ts:433-444` (comet_stop handler) -- Modify: `src/server.ts:562-597` (comet_list_conversations handler) - -**Step 1: Create `src/ui/stop.ts`** - -```typescript -/** - * Build script to find and click the stop/cancel button. - * Extracted from server.ts for testability. - */ -export function buildStopAgentScript(): string { - return `(function() { - var buttons = document.querySelectorAll('button'); - for (var i = 0; i < buttons.length; i++) { - var label = (buttons[i].getAttribute('aria-label') || '').toLowerCase(); - if (label.indexOf('stop') !== -1 || label.indexOf('cancel') !== -1) { buttons[i].click(); return 'stopped'; } - if (buttons[i].querySelector('svg rect')) { buttons[i].click(); return 'stopped'; } - } - return 'not_found'; - })()` -} -``` - -**Step 2: Create `tests/unit/ui/stop.test.ts`** - -```typescript -import { describe, expect, it } from 'vitest' -import { buildStopAgentScript } from '../../../src/ui/stop.js' - -describe('buildStopAgentScript', () => { - it('returns a string containing an IIFE', () => { - const script = buildStopAgentScript() - expect(script).toContain('(function()') - expect(script).toContain('})()') - }) - - it('looks for stop and cancel aria-labels', () => { - const script = buildStopAgentScript() - expect(script).toContain("getAttribute('aria-label')") - expect(script).toContain("'stop'") - expect(script).toContain("'cancel'") - }) - - it('checks for svg rect stop button', () => { - const script = buildStopAgentScript() - expect(script).toContain('svg rect') - }) - - it('returns stopped when button clicked', () => { - const script = buildStopAgentScript() - expect(script).toContain("return 'stopped'") - }) - - it('returns not_found when no stop button', () => { - const script = buildStopAgentScript() - expect(script).toContain("return 'not_found'") - }) -}) -``` - -**Step 3: Run tests to verify they pass** - -Run: `npx vitest run tests/unit/ui/stop.test.ts` -Expected: PASS - -**Step 4: Create `src/ui/conversations.ts`** - -```typescript -/** - * Build script to extract conversation links from the current page. - * Extracted from server.ts for testability. - */ -export function buildListConversationsScript(): string { - return `(function() { - var links = document.querySelectorAll('a[href]'); - var conversations = []; - var seen = {}; - for (var i = 0; i < links.length; i++) { - var href = links[i].getAttribute('href') || ''; - if (href.indexOf('/search/') !== -1 || href.indexOf('/copilot/') !== -1) { - if (!seen[href]) { - seen[href] = true; - conversations.push({ title: (links[i].innerText || '').trim(), url: href }); - } - } - } - return JSON.stringify(conversations); - })()` -} -``` - -**Step 5: Create `tests/unit/ui/conversations.test.ts`** - -```typescript -import { describe, expect, it } from 'vitest' -import { buildListConversationsScript } from '../../../src/ui/conversations.js' - -describe('buildListConversationsScript', () => { - it('returns a string containing an IIFE', () => { - const script = buildListConversationsScript() - expect(script).toMatch(/^\(function\(\)/) - expect(script).toMatch(/\}\)\(\)$/) - }) - - it('selects anchor elements with href', () => { - const script = buildListConversationsScript() - expect(script).toContain("querySelectorAll('a[href]')") - }) - - it('filters for /search/ and /copilot/ paths', () => { - const script = buildListConversationsScript() - expect(script).toContain("'/search/'") - expect(script).toContain("'/copilot/'") - }) - - it('deduplicates by href via seen map', () => { - const script = buildListConversationsScript() - expect(script).toContain('seen[href]') - }) - - it('returns JSON stringified array', () => { - const script = buildListConversationsScript() - expect(script).toContain('JSON.stringify(conversations)') - }) - - it('extracts title from innerText', () => { - const script = buildListConversationsScript() - expect(script).toContain('innerText') - }) -}) -``` - -**Step 6: Run tests to verify they pass** - -Run: `npx vitest run tests/unit/ui/conversations.test.ts` -Expected: PASS - -**Step 7: Update `src/server.ts` to use extracted functions** - -Replace the inline script in comet_stop handler (around line 433): -```typescript -import { buildStopAgentScript } from './ui/stop.js' -// ... in comet_stop handler: -const script = buildStopAgentScript() -const raw = await client.safeEvaluate(script) -``` - -Replace the inline script in comet_list_conversations handler (around line 562): -```typescript -import { buildListConversationsScript } from './ui/conversations.js' -// ... in comet_list_conversations handler: -const script = buildListConversationsScript() -const raw = await client.safeEvaluate(script) -``` - -**Step 8: Run all tests** - -Run: `npx vitest run` -Expected: All existing tests + new tests PASS - -**Step 9: Commit** - -```bash -git add src/ui/stop.ts src/ui/conversations.ts tests/unit/ui/stop.test.ts tests/unit/ui/conversations.test.ts src/server.ts -git commit -m "refactor: extract inline scripts to testable UI functions" -``` - ---- - -### Task 2: Add CDPClient method tests (mocked CRI) - -**Files:** -- Modify: `tests/unit/cdp/client.test.ts` - -**Context:** CDPClient uses `chrome-remote-interface` (CRI) which returns a client with methods like `Page.enable()`, `Runtime.evaluate()`, etc. We need to mock CRI to test connect, disconnect, navigate, screenshot, evaluate, and reconnect. - -**Step 1: Write failing tests for connect/disconnect/navigate/screenshot** - -```typescript -import { beforeEach, describe, expect, it, vi } from 'vitest' -import { CDPClient } from '../../../src/cdp/client.js' - -// Mock chrome-remote-interface -vi.mock('chrome-remote-interface', () => ({ - default: vi.fn(), -})) - -import CRI from 'chrome-remote-interface' - -function mockCRI(clientOverrides: Record = {}) { - const mockClient = { - Page: { enable: vi.fn(), navigate: vi.fn(), loadEventFired: vi.fn(), captureScreenshot: vi.fn() }, - Runtime: { enable: vi.fn(), evaluate: vi.fn() }, - Emulation: { setDeviceMetricsOverride: vi.fn() }, - Input: { dispatchKeyEvent: vi.fn() }, - Target: { closeTarget: vi.fn() }, - close: vi.fn(), - ...clientOverrides, - } - vi.mocked(CRI).mockResolvedValue(mockClient as any) - return mockClient -} - -describe('CDPClient methods', () => { - beforeEach(() => { - CDPClient.resetInstance() - vi.clearAllMocks() - }) - - describe('connect', () => { - it('connects to a target and enables Page/Runtime', async () => { - const mock = mockCRI() - const client = CDPClient.getInstance() - // Mock httpGet responses via getVersion/listTargets - // These use the private httpGet which calls browser.httpGet - // We need to mock the fetch calls - const originalFetch = globalThis.fetch - let fetchCount = 0 - globalThis.fetch = vi.fn().mockImplementation((url: string) => { - fetchCount++ - if (url.includes('/json/version')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve({ Browser: 'Chrome/145.0.0.0' }) }) - } - if (url.includes('/json/list')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve([ - { id: 'target-1', url: 'https://perplexity.ai', type: 'page', title: 'Perplexity' } - ]) }) - } - return Promise.resolve({ ok: false, status: 404, json: () => Promise.resolve(null) }) - }) as any - - const id = await client.connect() - expect(id).toBe('target-1') - expect(mock.Page.enable).toHaveBeenCalled() - expect(mock.Runtime.enable).toHaveBeenCalled() - expect(client.state.connected).toBe(true) - expect(client.state.targetId).toBe('target-1') - - globalThis.fetch = originalFetch - }) - - it('throws CDPConnectionError when no targets found', async () => { - mockCRI() - const client = CDPClient.getInstance() - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockImplementation((url: string) => { - if (url.includes('/json/version')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve({ Browser: 'Chrome/145' }) }) - } - if (url.includes('/json/list')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve([]) }) - } - return Promise.resolve({ ok: false, status: 404, json: () => Promise.resolve(null) }) - }) as any - - await expect(client.connect()).rejects.toThrow() - globalThis.fetch = originalFetch - }) - }) - - describe('disconnect', () => { - it('closes CRI client and resets state', async () => { - const mock = mockCRI() - const client = CDPClient.getInstance() - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockImplementation((url: string) => { - if (url.includes('/json/version')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve({ Browser: 'Chrome/145' }) }) - } - if (url.includes('/json/list')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve([ - { id: 't1', url: 'https://perplexity.ai', type: 'page', title: 'P' } - ]) }) - } - return Promise.resolve({ ok: false, status: 404, json: () => Promise.resolve(null) }) - }) as any - - await client.connect() - await client.disconnect() - expect(mock.close).toHaveBeenCalled() - expect(client.state.connected).toBe(false) - expect(client.state.targetId).toBeNull() - globalThis.fetch = originalFetch - }) - }) - - describe('navigate', () => { - it('calls Page.navigate and waits for loadEventFired', async () => { - const mock = mockCRI() - mock.Page.loadEventFired.mockResolvedValue(undefined) - const client = CDPClient.getInstance() - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockImplementation((url: string) => { - if (url.includes('/json/version')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve({ Browser: 'Chrome/145' }) }) - } - if (url.includes('/json/list')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve([ - { id: 't1', url: 'https://perplexity.ai', type: 'page', title: 'P' } - ]) }) - } - return Promise.resolve({ ok: false, status: 404, json: () => Promise.resolve(null) }) - }) as any - - await client.connect() - await client.navigate('https://example.com') - expect(mock.Page.navigate).toHaveBeenCalledWith({ url: 'https://example.com' }) - expect(mock.Page.loadEventFired).toHaveBeenCalled() - globalThis.fetch = originalFetch - }) - }) - - describe('screenshot', () => { - it('captures screenshot with correct format', async () => { - const mock = mockCRI() - mock.Page.captureScreenshot.mockResolvedValue({ data: 'base64data' }) - const client = CDPClient.getInstance() - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockImplementation((url: string) => { - if (url.includes('/json/version')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve({ Browser: 'Chrome/145' }) }) - } - if (url.includes('/json/list')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve([ - { id: 't1', url: 'https://perplexity.ai', type: 'page', title: 'P' } - ]) }) - } - return Promise.resolve({ ok: false, status: 404, json: () => Promise.resolve(null) }) - }) as any - - await client.connect() - const data = await client.screenshot('jpeg') - expect(data).toBe('base64data') - expect(mock.Page.captureScreenshot).toHaveBeenCalledWith( - expect.objectContaining({ format: 'jpeg', clip: expect.any(Object) }) - ) - globalThis.fetch = originalFetch - }) - }) - - describe('evaluate', () => { - it('evaluates expression via Runtime.evaluate', async () => { - const mock = mockCRI() - mock.Runtime.evaluate.mockResolvedValue({ result: { value: 42 } }) - const client = CDPClient.getInstance() - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockImplementation((url: string) => { - if (url.includes('/json/version')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve({ Browser: 'Chrome/145' }) }) - } - if (url.includes('/json/list')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve([ - { id: 't1', url: 'https://perplexity.ai', type: 'page', title: 'P' } - ]) }) - } - return Promise.resolve({ ok: false, status: 404, json: () => Promise.resolve(null) }) - }) as any - - await client.connect() - const result = await client.evaluate('1+1') - expect(result.result.value).toBe(42) - globalThis.fetch = originalFetch - }) - - it('throws CDPConnectionError when not connected', async () => { - const client = CDPClient.getInstance() - await expect(client.evaluate('1+1')).rejects.toThrow('Not connected') - }) - }) - - describe('pressKey', () => { - it('dispatches keyDown and keyUp events', async () => { - const mock = mockCRI() - const client = CDPClient.getInstance() - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockImplementation((url: string) => { - if (url.includes('/json/version')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve({ Browser: 'Chrome/145' }) }) - } - if (url.includes('/json/list')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve([ - { id: 't1', url: 'https://perplexity.ai', type: 'page', title: 'P' } - ]) }) - } - return Promise.resolve({ ok: false, status: 404, json: () => Promise.resolve(null) }) - }) as any - - await client.connect() - await client.pressKey('Enter') - expect(mock.Input.dispatchKeyEvent).toHaveBeenCalledWith({ type: 'keyDown', key: 'Enter' }) - expect(mock.Input.dispatchKeyEvent).toHaveBeenCalledWith({ type: 'keyUp', key: 'Enter' }) - globalThis.fetch = originalFetch - }) - }) - - describe('isHealthy', () => { - it('returns true when evaluate returns 2', async () => { - const mock = mockCRI() - mock.Runtime.evaluate.mockResolvedValue({ result: { value: 2 } }) - const client = CDPClient.getInstance() - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockImplementation((url: string) => { - if (url.includes('/json/version')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve({ Browser: 'Chrome/145' }) }) - } - if (url.includes('/json/list')) { - return Promise.resolve({ ok: true, status: 200, json: () => Promise.resolve([ - { id: 't1', url: 'https://perplexity.ai', type: 'page', title: 'P' } - ]) }) - } - return Promise.resolve({ ok: false, status: 404, json: () => Promise.resolve(null) }) - }) as any - - await client.connect() - const healthy = await client.isHealthy() - expect(healthy).toBe(true) - globalThis.fetch = originalFetch - }) - - it('returns false when not connected', async () => { - const client = CDPClient.getInstance() - expect(await client.isHealthy()).toBe(false) - }) - }) - - describe('normalizePrompt', () => { - it('strips leading bullets and collapses whitespace', () => { - const client = CDPClient.getInstance() - expect(client.normalizePrompt('- item 1\n- item 2')).toBe('item 1 item 2') - expect(client.normalizePrompt('* bullet\n\n* another')).toBe('bullet another') - expect(client.normalizePrompt(' multiple spaces ')).toBe('multiple spaces') - }) - - it('handles empty string', () => { - const client = CDPClient.getInstance() - expect(client.normalizePrompt('')).toBe('') - }) - - it('handles already-normalized text', () => { - const client = CDPClient.getInstance() - expect(client.normalizePrompt('hello world')).toBe('hello world') - }) - }) -}) -``` - -**Step 2: Run tests to verify they pass** - -Run: `npx vitest run tests/unit/cdp/client.test.ts` -Expected: PASS - -**Step 3: Commit** - -```bash -git add tests/unit/cdp/client.test.ts -git commit -m "test: add CDPClient method tests with mocked CRI" -``` - ---- - -### Task 3: Add CLI unit tests - -**Files:** -- Create: `tests/unit/cli.test.ts` - -**Context:** `src/cli.ts` uses `console.error`/`console.log` and `process.exit`. We mock these for testing. The CLI functions are not exported — we test via `main()` by mocking `process.argv`. - -**Step 1: Write failing tests** - -```typescript -import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest' - -// Mock the server import to avoid actually starting MCP -vi.mock('../../src/server.js', () => ({ - startServer: vi.fn().mockResolvedValue(undefined), -})) - -// Mock cdp/browser for runDetect -vi.mock('../../src/cdp/browser.js', () => ({ - getCometPath: vi.fn().mockReturnValue('/mock/comet/path'), - isCometProcessRunning: vi.fn().mockReturnValue(true), -})) - -describe('CLI', () => { - let consoleErrorSpy: ReturnType - let consoleLogSpy: ReturnType - let originalArgv: string[] - - beforeEach(() => { - consoleErrorSpy = vi.spyOn(console, 'error').mockImplementation(() => {}) - consoleLogSpy = vi.spyOn(console, 'log').mockImplementation(() => {}) - originalArgv = process.argv - }) - - afterEach(() => { - consoleErrorSpy.mockRestore() - consoleLogSpy.mockRestore() - process.argv = originalArgv - }) - - it('--help prints usage info', async () => { - process.argv = ['node', 'asteria', '--help'] - await import('../../src/cli.js') - expect(consoleErrorSpy).toHaveBeenCalledWith(expect.stringContaining('asteria')) - expect(consoleErrorSpy).toHaveBeenCalledWith(expect.stringContaining('USAGE')) - }) - - it('--version prints version', async () => { - process.argv = ['node', 'asteria', '--version'] - await import('../../src/cli.js') - expect(consoleErrorSpy).toHaveBeenCalledWith(expect.stringMatching(/asteria v\d+\.\d+\.\d+/)) - }) - - it('unknown command prints error', async () => { - const exitSpy = vi.spyOn(process, 'exit').mockImplementation(() => undefined as never) - process.argv = ['node', 'asteria', 'foobar'] - await import('../../src/cli.js') - expect(consoleErrorSpy).toHaveBeenCalledWith(expect.stringContaining('Unknown command')) - exitSpy.mockRestore() - }) -}) -``` - -**Step 2: Run tests** - -Run: `npx vitest run tests/unit/cli.test.ts` -Expected: PASS - -**Step 3: Commit** - -```bash -git add tests/unit/cli.test.ts -git commit -m "test: add CLI unit tests for help, version, unknown command" -``` - ---- - -### Task 4: Add browser.ts edge case tests - -**Files:** -- Modify: `tests/unit/cdp/browser.test.ts` - -**Step 1: Write additional tests** - -Add these tests to the existing `tests/unit/cdp/browser.test.ts`: - -```typescript -describe('isWSL', () => { - it('returns true when uname -r contains microsoft', async () => { - const { execSync } = await import('node:child_process') - vi.spyOn({ execSync }, 'execSync').mockReturnValue('5.15.0-microsoft-standard-WSL2') - // Note: isWSL is not exported, tested indirectly - }) -}) - -describe('httpGet', () => { - it('returns ok:true for successful fetch', async () => { - const { httpGet } = await import('../../../src/cdp/browser.js') - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockResolvedValue({ - ok: true, - status: 200, - json: () => Promise.resolve({ test: true }), - }) as any - - const result = await httpGet('http://localhost/test') - expect(result.ok).toBe(true) - expect(result.status).toBe(200) - - globalThis.fetch = originalFetch - }) - - it('returns ok:false for failed fetch', async () => { - const { httpGet } = await import('../../../src/cdp/browser.js') - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockRejectedValue(new Error('network error')) as any - - const result = await httpGet('http://localhost/test') - expect(result.ok).toBe(false) - expect(result.status).toBe(0) - - globalThis.fetch = originalFetch - }) - - it('returns ok:false for non-200 status', async () => { - const { httpGet } = await import('../../../src/cdp/browser.js') - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockResolvedValue({ - ok: false, - status: 500, - json: () => Promise.resolve(null), - }) as any - - const result = await httpGet('http://localhost/test') - expect(result.ok).toBe(false) - expect(result.status).toBe(500) - - globalThis.fetch = originalFetch - }) - - it('aborts after timeout', async () => { - const { httpGet } = await import('../../../src/cdp/browser.js') - const originalFetch = globalThis.fetch - // Simulate a fetch that never resolves - globalThis.fetch = vi.fn().mockImplementation((_url: string, opts: any) => { - return new Promise((_, reject) => { - // The AbortController should abort this - if (opts?.signal) { - opts.signal.addEventListener('abort', () => reject(new DOMException('Aborted', 'AbortError'))) - } - }) - }) as any - - const result = await httpGet('http://localhost/test', 100) - expect(result.ok).toBe(false) - - globalThis.fetch = originalFetch - }, 10000) -}) - -describe('getCometPath', () => { - it('throws CometNotFoundError when not found', async () => { - const { getCometPath } = await import('../../../src/cdp/browser.js') - delete process.env.COMET_PATH - // Mock platform to be something with no candidates - const originalPlatform = process.platform - Object.defineProperty(process, 'platform', { value: 'freebsd' }) - await expect(getCometPath()).rejects.toThrow('Comet browser not found') - Object.defineProperty(process, 'platform', { value: originalPlatform }) - }) -}) - -describe('isCometProcessRunning', () => { - it('returns false on command failure', async () => { - const { isCometProcessRunning } = await import('../../../src/cdp/browser.js') - // On non-Windows, it uses pgrep which will fail in CI - const result = isCometProcessRunning() - expect(typeof result).toBe('boolean') - }) -}) - -describe('killComet', () => { - it('does not throw on failure', async () => { - const { killComet } = await import('../../../src/cdp/browser.js') - expect(() => killComet()).not.toThrow() - }) -}) - -describe('startCometProcess', () => { - it('spawns process with debug port arg', async () => { - const { startCometProcess } = await import('../../../src/cdp/browser.js') - const { createLogger } = await import('../../../src/logger.js') - const logger = createLogger('error') - - // Mock spawn to avoid launching a real browser - const { spawn } = await import('node:child_process') - const spawnSpy = vi.spyOn(await import('node:child_process'), 'spawn').mockReturnValue({ - unref: vi.fn(), - } as any) - - startCometProcess('/fake/comet', 9222, logger) - expect(spawnSpy).toHaveBeenCalledWith('/fake/comet', ['--remote-debugging-port=9222'], expect.any(Object)) - - spawnSpy.mockRestore() - }) -}) -``` - -**Step 2: Run tests** - -Run: `npx vitest run tests/unit/cdp/browser.test.ts` -Expected: PASS - -**Step 3: Commit** - -```bash -git add tests/unit/cdp/browser.test.ts -git commit -m "test: add browser.ts edge case tests (httpGet, killComet, startCometProcess)" -``` - ---- - -### Task 5: Add config.ts edge case tests - -**Files:** -- Modify: `tests/unit/config.test.ts` - -**Step 1: Write additional tests** - -```typescript -describe('loadConfig edge cases', () => { - it('handles malformed JSON in config file', () => { - // Write a malformed config, load, verify defaults still work - // This tests the catch block in loadConfigFile - const { writeFileSync, unlinkSync } = await import('node:fs') - writeFileSync('asteria.config.json', '{ invalid json }') - const config = loadConfig() - expect(config.port).toBe(9222) // Falls back to default - unlinkSync('asteria.config.json') - }) - - it('handles invalid env var number values', () => { - process.env.ASTERIA_PORT = 'not-a-number' - const config = loadConfig() - expect(config.port).toBe(9222) // Falls back to default - delete process.env.ASTERIA_PORT - }) - - it('handles overrides parameter', () => { - const config = loadConfig({ port: 9999 }) - expect(config.port).toBe(9999) - }) - - it('overrides take precedence over env vars', () => { - process.env.ASTERIA_PORT = '8080' - const config = loadConfig({ port: 9999 }) - expect(config.port).toBe(9999) // Override wins - delete process.env.ASTERIA_PORT - }) - - it('ignores unknown keys in config file', () => { - const { writeFileSync, unlinkSync } = await import('node:fs') - writeFileSync('asteria.config.json', '{"port": 8080, "unknownKey": "ignored"}') - const config = loadConfig() - expect(config.port).toBe(8080) - expect((config as any).unknownKey).toBeUndefined() - unlinkSync('asteria.config.json') - }) -}) -``` - -**Step 2: Run tests** - -Run: `npx vitest run tests/unit/config.test.ts` -Expected: PASS - -**Step 3: Commit** - -```bash -git add tests/unit/config.test.ts -git commit -m "test: add config.ts edge case tests (malformed JSON, invalid env vars)" -``` - ---- - -### Task 6: Add status/navigation/extraction edge case tests - -**Files:** -- Modify: `tests/unit/ui/status.test.ts` -- Modify: `tests/unit/ui/navigation.test.ts` -- Modify: `tests/unit/ui/extraction.test.ts` -- Modify: `tests/unit/prose-filter.test.ts` - -**Step 1: Add edge case tests to status.test.ts** - -```typescript -describe('buildGetAgentStatusScript edge cases', () => { - it('accepts custom selectors parameter', () => { - const customSelectors = { - ...SELECTORS, - LOADING: ['.custom-spinner'], - } - const script = buildGetAgentStatusScript(customSelectors) - expect(script).toContain('.custom-spinner') - }) - - it('includes response truncation at 8000 chars', () => { - const script = buildGetAgentStatusScript() - expect(script).toContain('8000') - }) - - it('includes step pattern extraction', () => { - const script = buildGetAgentStatusScript() - expect(script).toContain('stepPatterns') - }) - - it('detects working patterns', () => { - const script = buildGetAgentStatusScript() - expect(script).toContain('Working') - expect(script).toContain('Searching') - expect(script).toContain('Navigating to') - }) -}) -``` - -**Step 2: Add edge case tests to navigation.test.ts** - -```typescript -describe('buildModeSwitchScript edge cases', () => { - it('returns standard_mode_no_action for standard mode', () => { - const script = buildModeSwitchScript('standard') - expect(script).toContain('standard_mode_no_action') - }) - - it('handles unknown mode by using raw string', () => { - const script = buildModeSwitchScript('unknown-mode') - expect(script).toContain('unknown-mode') - }) - - it('waits for listbox with retry loop', () => { - const script = buildModeSwitchScript('deep-research') - expect(script).toContain('maxAttempts') - expect(script).toContain('listbox') - }) -}) - -describe('buildGetCurrentModeScript', () => { - it('returns standard by default', () => { - const script = buildGetCurrentModeScript() - expect(script).toContain("'standard'") - }) - - it('detects computer mode from /copilot/ URL', () => { - const script = buildGetCurrentModeScript() - expect(script).toContain('/copilot/') - expect(script).toContain("'computer'") - }) -}) -``` - -**Step 3: Add edge case tests to extraction.test.ts** - -```typescript -describe('buildExtractPageContentScript edge cases', () => { - it('handles custom maxLength', () => { - const script = buildExtractPageContentScript(5000) - expect(script).toContain('5000') - }) - - it('handles body being null', () => { - const script = buildExtractPageContentScript() - expect(script).toContain('!body') - expect(script).toContain("JSON.stringify({ text: '', title: '' })") - }) - - it('strips UI noise text', () => { - const script = buildExtractPageContentScript() - expect(script).toContain('Sign in') - expect(script).toContain('Log in') - }) -}) - -describe('buildExtractSourcesScript edge cases', () => { - it('filters javascript: URLs', () => { - const script = buildExtractSourcesScript() - expect(script).toContain("'javascript:'") - }) - - it('filters hash-only URLs', () => { - const script = buildExtractSourcesScript() - expect(script).toContain("'#'") - }) - - it('deduplicates by URL', () => { - const script = buildExtractSourcesScript() - expect(script).toContain('seenUrls') - }) - - it('extracts domain as fallback title', () => { - const script = buildExtractSourcesScript() - expect(script).toContain('extractDomain') - }) -}) -``` - -**Step 4: Add tests to prose-filter.test.ts** - -```typescript -describe('buildPreSendStateScript', () => { - it('returns IIFE that returns JSON with proseCount and lastProseText', () => { - const script = buildPreSendStateScript() - expect(script).toMatch(/^\(function\(\)/) - expect(script).toContain('proseCount') - expect(script).toContain('lastProseText') - expect(script).toContain('JSON.stringify') - }) - - it('includes findProseJS body', () => { - const script = buildPreSendStateScript() - expect(script).toContain('proseElements') - expect(script).toContain('excludeTags') - }) -}) -``` - -**Step 5: Run all UI tests** - -Run: `npx vitest run tests/unit/ui/ tests/unit/prose-filter.test.ts` -Expected: PASS - -**Step 6: Commit** - -```bash -git add tests/unit/ui/status.test.ts tests/unit/ui/navigation.test.ts tests/unit/ui/extraction.test.ts tests/unit/prose-filter.test.ts -git commit -m "test: add edge case tests for status, navigation, extraction, prose-filter" -``` - ---- - -### Task 7: Add version.ts and snapshot.ts tests - -**Files:** -- Create: `tests/unit/version.test.ts` -- Create: `tests/unit/snapshot.test.ts` - -**Step 1: Write version tests** - -```typescript -import { describe, expect, it, vi } from 'vitest' - -describe('detectCometVersion', () => { - it('returns default v145 selectors on fetch failure', async () => { - const { detectCometVersion } = await import('../../src/version.js') - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockRejectedValue(new Error('no connection')) as any - - const result = await detectCometVersion(9222) - expect(result.chromeMajor).toBe(145) - expect(result.browser).toBe('Unknown') - expect(result.selectors).toBeDefined() - - globalThis.fetch = originalFetch - }) - - it('parses Chrome version from Browser header', async () => { - const { detectCometVersion } = await import('../../src/version.js') - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockResolvedValue({ - ok: true, - status: 200, - json: () => Promise.resolve({ Browser: 'Chrome/145.0.5678.90' }), - }) as any - - const result = await detectCometVersion(9222) - expect(result.chromeMajor).toBe(145) - expect(result.browser).toBe('Chrome/145.0.5678.90') - - globalThis.fetch = originalFetch - }) - - it('handles non-OK HTTP response', async () => { - const { detectCometVersion } = await import('../../src/version.js') - const originalFetch = globalThis.fetch - globalThis.fetch = vi.fn().mockResolvedValue({ - ok: false, - status: 500, - }) as any - - const result = await detectCometVersion(9222) - expect(result.chromeMajor).toBe(145) // Falls back to default - - globalThis.fetch = originalFetch - }) -}) -``` - -**Step 2: Write snapshot tests** - -```typescript -import { describe, expect, it, vi } from 'vitest' - -vi.mock('../../src/cdp/client.js', () => ({ - CDPClient: { - getInstance: vi.fn().mockReturnValue({ - launchOrConnect: vi.fn().mockResolvedValue('target-1'), - safeEvaluate: vi.fn().mockResolvedValue({ - result: { value: '{"inputs":[],"buttons":[],"proseCount":0}' }, - }), - disconnect: vi.fn().mockResolvedValue(undefined), - }), - resetInstance: vi.fn(), - }, -})) - -describe('runSnapshot', () => { - it('connects, evaluates, and disconnects', async () => { - const { runSnapshot } = await import('../../src/snapshot.js') - const consoleSpy = vi.spyOn(console, 'error').mockImplementation(() => {}) - - await runSnapshot() - - const { CDPClient } = await import('../../src/cdp/client.js') - expect(CDPClient.getInstance).toHaveBeenCalled() - - consoleSpy.mockRestore() - }) -}) -``` - -**Step 3: Run tests** - -Run: `npx vitest run tests/unit/version.test.ts tests/unit/snapshot.test.ts` -Expected: PASS - -**Step 4: Commit** - -```bash -git add tests/unit/version.test.ts tests/unit/snapshot.test.ts -git commit -m "test: add version.ts and snapshot.ts unit tests" -``` - ---- - -### Task 8: Add selectors tests - -**Files:** -- Create: `tests/unit/selectors/index.test.ts` -- Create: `tests/unit/selectors/v145.test.ts` - -**Step 1: Write selector index tests** - -```typescript -import { describe, expect, it } from 'vitest' -import { getSelectorsForVersion, parseChromeVersion } from '../../../src/selectors/index.js' - -describe('parseChromeVersion', () => { - it('extracts major version from Chrome string', () => { - expect(parseChromeVersion('Chrome/145.0.5678.90')).toBe(145) - }) - - it('returns 0 for non-Chrome string', () => { - expect(parseChromeVersion('Firefox/120.0')).toBe(0) - }) - - it('returns 0 for empty string', () => { - expect(parseChromeVersion('')).toBe(0) - }) -}) - -describe('getSelectorsForVersion', () => { - it('returns v145 selectors for Chrome 145', () => { - const selectors = getSelectorsForVersion(145) - expect(selectors).toBeDefined() - expect(selectors.INPUT).toBeDefined() - expect(selectors.SUBMIT).toBeDefined() - }) - - it('returns v145 (fallback) for unknown version', () => { - const selectors = getSelectorsForVersion(999) - expect(selectors).toBeDefined() - expect(selectors.INPUT).toBeDefined() - }) -}) -``` - -**Step 2: Write v145 selector tests** - -```typescript -import { describe, expect, it } from 'vitest' -import { v145Selectors } from '../../../src/selectors/v145.js' - -describe('v145Selectors', () => { - it('defines all required selector categories', () => { - expect(v145Selectors.INPUT).toBeDefined() - expect(v145Selectors.SUBMIT).toBeDefined() - expect(v145Selectors.STOP).toBeDefined() - expect(v145Selectors.RESPONSE).toBeDefined() - expect(v145Selectors.LOADING).toBeDefined() - expect(v145Selectors.TYPEAHEAD_MENU).toBeDefined() - expect(v145Selectors.MENU_ITEM).toBeDefined() - }) - - it('has non-empty arrays for each selector category', () => { - for (const [key, value] of Object.entries(v145Selectors)) { - expect(Array.isArray(value)).toBe(true) - expect(value.length).toBeGreaterThan(0) - } - }) - - it('contains valid CSS selectors', () => { - for (const [key, selectors] of Object.entries(v145Selectors)) { - for (const sel of selectors as string[]) { - expect(() => document.querySelector(sel)).not.toThrow() - } - } - }) -}) -``` - -**Step 3: Run tests** - -Run: `npx vitest run tests/unit/selectors/` -Expected: PASS - -**Step 4: Commit** - -```bash -git add tests/unit/selectors/ -git commit -m "test: add selector registry and v145 selector tests" -``` - ---- - -### Task 9: Add connection.ts edge case tests - -**Files:** -- Modify: `tests/unit/cdp/connection.test.ts` - -**Step 1: Add edge case tests** - -```typescript -describe('isConnectionError edge cases', () => { - it('returns false for non-Error inputs', () => { - expect(isConnectionError('string')).toBe(false) - expect(isConnectionError(null)).toBe(false) - expect(isConnectionError(undefined)).toBe(false) - expect(isConnectionError(42)).toBe(false) - }) - - it('returns false for unrelated errors', () => { - expect(isConnectionError(new Error('something else'))).toBe(false) - }) -}) - -describe('getBackoffDelay edge cases', () => { - it('returns exponential delay', () => { - const d1 = getBackoffDelay(1, 30000) - const d2 = getBackoffDelay(2, 30000) - const d3 = getBackoffDelay(3, 30000) - expect(d2).toBeGreaterThan(d1) - expect(d3).toBeGreaterThan(d2) - }) - - it('caps at max delay', () => { - const delay = getBackoffDelay(100, 5000) - expect(delay).toBeLessThanOrEqual(5000) - }) - - it('handles attempt 0', () => { - const delay = getBackoffDelay(0, 30000) - expect(delay).toBeGreaterThanOrEqual(0) - }) -}) -``` - -**Step 2: Run tests** - -Run: `npx vitest run tests/unit/cdp/connection.test.ts` -Expected: PASS - -**Step 3: Commit** - -```bash -git add tests/unit/cdp/connection.test.ts -git commit -m "test: add connection.ts edge case tests" -``` - ---- - -## Phase 2: Integration Tests - -### Task 10: Create integration test harness - -**Files:** -- Modify: `tests/integration/mock-cdp-server.ts` (add more capabilities) -- Create: `tests/integration/tools/` directory - -**Step 1: Enhance mock CDP server with Runtime.evaluate support** - -Add to mock-cdp-server.ts the ability to handle WebSocket connections and return scripted responses for tool handler evaluation scripts. - -**Step 2: Create base integration test helper** - -Create `tests/integration/tools/helpers.ts` with shared mock setup for tool handler tests. - -**Step 3: Commit** - -```bash -git add tests/integration/ -git commit -m "test: add integration test harness with enhanced mock CDP" -``` - ---- - -### Task 11: Add tool handler integration tests (batch 1) - -**Files:** -- Create: `tests/integration/tools/connect.test.ts` -- Create: `tests/integration/tools/ask.test.ts` -- Create: `tests/integration/tools/poll.test.ts` -- Create: `tests/integration/tools/stop.test.ts` - -Test each tool handler against the mock CDP server with happy path + error path. - -**Step 1: Write tests for connect, ask, poll, stop** - -**Step 2: Run tests** - -Run: `npx vitest run tests/integration/tools/` -Expected: PASS - -**Step 3: Commit** - -```bash -git add tests/integration/tools/ -git commit -m "test: add integration tests for connect, ask, poll, stop tools" -``` - ---- - -### Task 12: Add tool handler integration tests (batch 2) - -**Files:** -- Create: `tests/integration/tools/screenshot.test.ts` -- Create: `tests/integration/tools/mode.test.ts` -- Create: `tests/integration/tools/tabs.test.ts` -- Create: `tests/integration/tools/extraction.test.ts` -- Create: `tests/integration/tools/conversations.test.ts` - -**Step 1: Write tests for remaining 8 tools** - -**Step 2: Run tests** - -Run: `npx vitest run tests/integration/tools/` -Expected: PASS - -**Step 3: Commit** - -```bash -git add tests/integration/tools/ -git commit -m "test: add integration tests for remaining tool handlers" -``` - ---- - -## Phase 3: UAT Plan Formalization - -### Task 13: Write formal UAT test plan - -**Files:** -- Create: `docs/uat/uat-plan.md` - -**Step 1: Write structured UAT plan** - -Create formal test cases for all 12 tools with: -- Test ID (UAT-001 through UAT-040+) -- Preconditions -- Numbered steps -- Expected results -- Pass/Fail fields - -Categories: Smoke (5min), Functional (20min), Error Recovery (10min), Mode Switching (5min), Cross-Session (5min) - -**Step 2: Commit** - -```bash -git add docs/uat/ -git commit -m "docs: add formalized UAT test plan with pass/fail criteria" -``` - ---- - -## Phase 4: CI/CD Hardening - -### Task 14: Add coverage thresholds and CI updates - -**Files:** -- Modify: `vitest.config.ts` -- Modify: `.github/workflows/ci.yml` -- Modify: `package.json` -- Modify: `README.md` (add coverage badge) - -**Step 1: Update vitest.config.ts with coverage** - -```typescript -export default defineConfig({ - test: { - include: ['tests/**/*.test.ts'], - environment: 'node', - coverage: { - provider: 'v8', - reporter: ['text', 'lcov', 'html'], - reportsDirectory: 'coverage', - thresholds: { - statements: 80, - branches: 70, - functions: 80, - lines: 80, - }, - }, - }, -}) -``` - -**Step 2: Update CI workflow** - -```yaml -- run: npx vitest run --coverage -- name: Check coverage - if: always() - uses: actions/upload-artifact@v4 - with: - name: coverage-report - path: coverage/ -``` - -**Step 3: Add test:ci script to package.json** - -```json -"test:ci": "vitest run --coverage" -``` - -**Step 4: Add coverage badge to README** - -**Step 5: Run coverage locally** - -Run: `npx vitest run --coverage` -Expected: All tests pass, coverage thresholds met - -**Step 6: Commit** - -```bash -git add vitest.config.ts .github/workflows/ci.yml package.json README.md -git commit -m "ci: add coverage thresholds, coverage reporting, and CI updates" -``` - ---- - -## Summary - -| Phase | Tasks | Estimated Tests Added | -|-------|-------|----------------------| -| Phase 1: Unit Gaps | Tasks 1-9 | ~80 new unit tests | -| Phase 2: Integration | Tasks 10-12 | ~30 new integration tests | -| Phase 3: UAT | Task 13 | 40+ UAT test cases | -| Phase 4: CI/CD | Task 14 | Coverage enforcement | - -**Total: 14 tasks, ~110 automated tests, 40+ manual UAT cases** diff --git a/docs/plans/2026-04-07-uat-bugfixes-design.md b/docs/plans/2026-04-07-uat-bugfixes-design.md deleted file mode 100644 index 5b17f41..0000000 --- a/docs/plans/2026-04-07-uat-bugfixes-design.md +++ /dev/null @@ -1,46 +0,0 @@ ---- -date: 2026-04-07 -title: UAT Bug Fixes — screenshot auto-connect + stale response detection -status: approved -approach: Approach A for both bugs ---- - -# UAT Bug Fixes PRD - -Found during UAT execution on 2026-04-07. - -## BUG-1: screenshot() doesn't auto-connect - -**Severity:** High — screenshot completely fails when called from a fresh process - -**Symptom:** `comet_screenshot` returns `[CDP_CONNECTION_FAILED] Not connected` when called from a new `asteria call` process, while `comet_poll` and other evaluate-based tools auto-reconnect successfully. - -**Root cause:** `screenshot()` (src/cdp/client.ts:134) calls `withAutoReconnect()` directly without first calling `ensureHealthyConnection()`. Compare with `safeEvaluate()` (line 165) which calls `ensureHealthyConnection()` before `withAutoReconnect()`. - -**Fix:** Add `await this.ensureHealthyConnection()` as the first line of `screenshot()`, matching the pattern in `safeEvaluate()`. - -**File:** `src/cdp/client.ts` — add 1 line before `return await this.withAutoReconnect(...)` - -**Testing:** Add a unit test verifying screenshot calls `ensureHealthyConnection` first. Run UAT-004 again. - ---- - -## BUG-2: comet_ask returns stale responses - -**Severity:** High — cross-query contamination, returns wrong answer - -**Symptom:** When a second query is submitted, `comet_ask` returns the response from the first query instead of the new one. Happens because the old response's prose text is still on the page and matches `preSendState.lastProseText`. - -**Root cause:** The pre-send state capture (server.ts:337-342) records `lastProseText`. During polling (line 376), the check `status.response !== preSendState.lastProseText` is `false` when the old response is still on the page, so the code never detects a "new" response. The `hasSubstantialResponse` heuristic (>50 chars) can match the old response. - -**Fix:** Use `proseCount` from pre-send state as the primary new-response signal instead of text comparison. A new response adds new prose elements, so if `proseCount` increases, a new response exists. Keep the existing text comparison as a secondary signal. - -**File:** `src/server.ts` — modify the polling loop in comet_ask handler (lines 360-396) - -**Logic change:** -- Capture `preSendState.proseCount` (already captured) -- During polling, also check prose count from agent status -- If prose count > preSend count, treat as new response -- Keep existing `hasSubstantialResponse` as fallback - -**Testing:** Add integration test for sequential queries. Run UAT-027 again. diff --git a/docs/plans/2026-04-08-audit-fixes-design.md b/docs/plans/2026-04-08-audit-fixes-design.md deleted file mode 100644 index 5c59467..0000000 --- a/docs/plans/2026-04-08-audit-fixes-design.md +++ /dev/null @@ -1,163 +0,0 @@ ---- -date: 2026-04-08 -title: Audit Fixes — Production Release Gate -status: approved -approach: Approach A — Surgical Fixes ---- - -# Audit Fixes Design - -Comprehensive audit found 38 findings (6 Critical, 6 High, 11 Medium, 10 Low, 5 Info). -This design covers Critical + High only, plus test coverage backfill — the production release gate. - -## Scope - -- 12 code fixes (3 security, 3 resilience, 3 error handling, 3 dead code cleanup) -- 4 test coverage areas -- All changes are surgical — minimal lines touched per finding - ---- - -## Section 1: Security (3 fixes) - -### S1.1 — JS injection via prompt - -**File:** `src/ui/input.ts:4-42`, `src/ui/navigation.ts:23-68` - -**Problem:** `buildTypePromptScript()` escapes backslashes, quotes, newlines but misses backticks and template literals (`${...}`). `buildModeSwitchScript()` interpolates mode name without escaping. - -**Fix:** Replace manual escaping with `JSON.stringify()` in both functions. This handles all edge cases (backticks, template literals, unicode line separators, etc.). - -### S1.2 — SSRF via URL validation - -**File:** `src/server.ts:587-604` - -**Problem:** `url.includes('perplexity.ai')` allows `perplexity.ai.evil.com`, `evil.com/perplexity.ai/`, `perplexity.ai@evil.com`. - -**Fix:** Use `new URL(url).hostname` check — only allow `hostname === 'perplexity.ai'` or `.endsWith('.perplexity.ai')`. - -### S1.3 — Spawn error handling - -**File:** `src/cdp/browser.ts:108-117` - -**Problem:** Detached Comet process has no error handler. Spawn failures are silently swallowed. - -**Fix:** Add `.on('error', (err) => logger.error(...))` before `unref()`. - ---- - -## Section 2: Resilience and Concurrency (3 fixes) - -### S2.1 — Operation Queue in CDPClient - -**File:** `src/cdp/client.ts` - -**Problem:** CDPClient singleton has no locking. Concurrent tool calls share mutable state — can corrupt connection state. - -**Fix:** Add Promise-based mutex serializer: -``` -private opLock: Promise = Promise.resolve() - -private async enqueue(fn: () => Promise): Promise { - const prev = this.opLock - let resolve: () => void - this.opLock = new Promise(r => { resolve = r }) - await prev - try { return await fn() } finally { resolve!() } -} -``` -All public methods (screenshot, safeEvaluate, navigate, etc.) wrap their logic through `enqueue()`. No external dependencies. - -### S2.2 — Reconnect race condition - -**File:** `src/cdp/client.ts:207-238` - -**Problem:** `reconnect()` returns silently if already reconnecting. Callers proceed against broken connection. - -**Fix:** Track a shared `reconnectPromise`. If already reconnecting, await the existing promise instead of returning silently: -``` -private reconnectPromise: Promise | null = null - -// In reconnect(): -if (this.state.isReconnecting && this.reconnectPromise) { - return this.reconnectPromise -} -this.reconnectPromise = (async () => { /* reconnect logic */ })() -try { await this.reconnectPromise } finally { this.reconnectPromise = null } -``` - -### S2.3 — Health check + error propagation - -**File:** `src/cdp/client.ts:201-205`, `114-123`, `276-296` - -**Problem:** `ensureHealthyConnection` doesn't catch reconnect failures. `disconnect()` and `closeExtraTabs()` silently swallow errors. - -**Fix:** -- `ensureHealthyConnection`: wrap reconnect in try/catch, log warning, let caller's withAutoReconnect handle retry. -- Add `logger.debug()` in all empty catch blocks. - ---- - -## Section 3: Error Handling and Timeout (3 fixes) - -### S3.1 — Timeout cleanup in comet_ask - -**File:** `src/server.ts:364-410` - -**Problem:** Timeout fires but last safeEvaluate may still be pending. No cleanup of in-progress state. - -**Fix:** Add `timedOut` flag checked in the while loop. On timeout exit, flag prevents further evaluations. - -### S3.2 — extractValue error type - -**File:** `src/server.ts:203-212` - -**Problem:** `extractValue` throws generic `Error` — loses AsteriaError context. - -**Fix:** Throw `EvaluationError` instead of `Error`, consistent with the error hierarchy. - -### S3.3 — parseAgentStatus safety - -**File:** `src/server.ts:225-228` - -**Problem:** `JSON.parse` on malformed CDP response throws unhandled. - -**Fix:** Wrap in try/catch, return empty status object as fallback. - ---- - -## Section 4: Test Coverage Backfill - -### S4.1 — src/cli.ts (0% → 70%+) - -- Mock child_process, test runDetect() and runCall() -- JSON-RPC message construction, timeout, screenshot file write - -### S4.2 — CDP client reconnection (75% → 85%+) - -- Max reconnect attempts → throws -- withAutoReconnect retry on connection error -- ensureHealthyConnection calls reconnect when unhealthy -- Operation queue serializes concurrent calls - -### S4.3 — Security fix tests - -- Prompt with backticks, `${}`, unicode → properly escaped -- SSRF URLs → rejected -- Spawn error → logged - -### S4.4 — Config env var branches (62% → 80%+) - -- All ASTERIA_* env vars with valid/invalid values -- Malformed config file JSON - ---- - -## Out of Scope (deferred post-v1.0) - -- Dead code removal (unused exports) -- Magic number extraction to config -- Logging consistency cleanup -- Type safety improvements (any casts) -- buildModeSwitchScript async fix -- Medium/Low findings from all audits diff --git a/docs/plans/2026-04-08-audit-fixes-plan.md b/docs/plans/2026-04-08-audit-fixes-plan.md deleted file mode 100644 index 2cd474d..0000000 --- a/docs/plans/2026-04-08-audit-fixes-plan.md +++ /dev/null @@ -1,874 +0,0 @@ -# Audit Fixes — Production Release Gate Implementation Plan - -> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. - -**Goal:** Fix all Critical + High audit findings and backfill test coverage for production readiness. - -**Architecture:** Surgical in-place fixes. Add operation queue to CDPClient for concurrency safety. Use JSON.stringify for injection-safe escaping. Use URL parsing for SSRF-safe validation. - -**Tech Stack:** TypeScript, Vitest, chrome-remote-interface, MCP SDK - ---- - -### Task 1: Fix JS injection via prompt (S1.1 — CRITICAL) - -**Files:** -- Modify: `src/ui/input.ts:4-42` -- Test: `tests/unit/ui/input.test.ts` - -**Step 1: Write the failing test** - -Add to `tests/unit/ui/input.test.ts`: - -```typescript -describe('buildTypePromptScript injection safety', () => { - it('escapes backticks in prompt', () => { - const script = buildTypePromptScript('test` injected code') - expect(script).not.toContain('${') - expect(script).not.toContain('`') - }) - - it('escapes template literal expressions', () => { - const script = buildTypePromptScript('${document.cookie}') - expect(script).not.toContain('${') - }) - - it('escapes unicode line separators', () => { - const script = buildTypePromptScript('test\u2028line') - expect(script).not.toContain('\u2028') - }) -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/ui/input.test.ts` -Expected: FAIL — backtick/template injection not escaped - -**Step 3: Write minimal implementation** - -In `src/ui/input.ts`, replace lines 5-9: - -```typescript -// BEFORE (manual escaping — incomplete): -const escaped = prompt - .replace(/\\/g, '\\\\') - .replace(/'/g, "\\'") - .replace(/\n/g, '\\n') - .replace(/"/g, '\\"') - -// AFTER (JSON.stringify — complete escaping): -const escaped = JSON.stringify(prompt).slice(1, -1) -``` - -**Step 4: Run test to verify it passes** - -Run: `npx vitest run tests/unit/ui/input.test.ts` -Expected: PASS - -**Step 5: Commit** - -```bash -git add src/ui/input.ts tests/unit/ui/input.test.ts -git commit -m "fix: use JSON.stringify for prompt escaping to prevent JS injection (audit S1.1)" -``` - ---- - -### Task 2: Fix JS injection via mode name (S1.1b — HIGH) - -**Files:** -- Modify: `src/ui/navigation.ts:25-26` -- Test: `tests/unit/ui/navigation.test.ts` - -**Step 1: Write the failing test** - -Add to `tests/unit/ui/navigation.test.ts`: - -```typescript -describe('buildModeSwitchScript injection safety', () => { - it('escapes special characters in mode display name', () => { - // If a mode name contained injection payload - const script = buildModeSwitchScript("standard';alert(1);//") - expect(script).not.toContain("';alert") - }) -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/ui/navigation.test.ts` - -**Step 3: Write minimal implementation** - -In `src/ui/navigation.ts`, change line 26: - -```typescript -// BEFORE: -var displayName = '${displayName}'; - -// AFTER: -var displayName = ${JSON.stringify(displayName)}; -``` - -**Step 4: Run test to verify it passes** - -Run: `npx vitest run tests/unit/ui/navigation.test.ts` - -**Step 5: Commit** - -```bash -git add src/ui/navigation.ts tests/unit/ui/navigation.test.ts -git commit -m "fix: use JSON.stringify for mode name escaping to prevent JS injection (audit S1.1b)" -``` - ---- - -### Task 3: Fix SSRF via URL validation (S1.2 — HIGH) - -**Files:** -- Modify: `src/server.ts:587-592` -- Test: `tests/integration/tools/ui-tools.test.ts` - -**Step 1: Write the failing test** - -Add to the `comet_open_conversation` describe block in the integration test file: - -```typescript -it('rejects domain suffix attack', async () => { - const handler = getHandler('comet_open_conversation') - const result = await handler({ url: 'https://perplexity.ai.evil.com/search/123' }) - expect(result.content[0].text).toContain('Error') -}) - -it('rejects path-based bypass', async () => { - const handler = getHandler('comet_open_conversation') - const result = await handler({ url: 'https://evil.com/perplexity.ai/' }) - expect(result.content[0].text).toContain('Error') -}) - -it('rejects credential-based URL confusion', async () => { - const handler = getHandler('comet_open_conversation') - const result = await handler({ url: 'https://perplexity.ai@evil.com/' }) - expect(result.content[0].text).toContain('Error') -}) - -it('accepts valid perplexity.ai URL', async () => { - const handler = getHandler('comet_open_conversation') - const result = await handler({ url: 'https://www.perplexity.ai/search/abc123' }) - expect(result.content[0].text).toContain('Navigated to:') -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/integration/tools/ui-tools.test.ts` -Expected: FAIL — suffix attack not blocked - -**Step 3: Write minimal implementation** - -In `src/server.ts`, replace the URL validation block (around line 588-592): - -```typescript -// BEFORE: -if (!url.startsWith('https://') || !url.includes('perplexity.ai')) { - return toMcpError( - new Error(`Invalid URL: must be a https://perplexity.ai/ URL, got "${url}"`), - ) -} - -// AFTER: -let parsed: URL -try { - parsed = new URL(url) -} catch { - return toMcpError(new Error(`Invalid URL: "${url}"`)) -} -if (parsed.protocol !== 'https:' || !parsed.hostname.endsWith('perplexity.ai')) { - return toMcpError( - new Error(`Invalid URL: must be a https://perplexity.ai/ URL, got "${url}"`), - ) -} -``` - -Note: `.endsWith('perplexity.ai')` covers both `perplexity.ai` and `www.perplexity.ai` and any subdomain. - -**Step 4: Run test to verify it passes** - -Run: `npx vitest run tests/integration/tools/ui-tools.test.ts` - -**Step 5: Commit** - -```bash -git add src/server.ts tests/integration/tools/ui-tools.test.ts -git commit -m "fix: use URL hostname parsing for SSRF-safe URL validation (audit S1.2)" -``` - ---- - -### Task 4: Fix spawn error handling (S1.3 — CRITICAL) - -**Files:** -- Modify: `src/cdp/browser.ts:108-117` -- Test: `tests/unit/cdp/browser.test.ts` - -**Step 1: Write the failing test** - -Add to `tests/unit/cdp/browser.test.ts`: - -```typescript -it('startCometProcess attaches error handler to spawned process', () => { - const { spawn } = await import('node:child_process') - // Mock spawn to capture the child process - const mockChild = { unref: vi.fn(), on: vi.fn() } - vi.doMock('node:child_process', () => ({ - ...actual, - spawn: vi.fn().mockReturnValue(mockChild), - })) - startCometProcess('/path/to/comet', 9222, mockLogger) - expect(mockChild.on).toHaveBeenCalledWith('error', expect.any(Function)) -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/cdp/browser.test.ts` - -**Step 3: Write minimal implementation** - -In `src/cdp/browser.ts`, change `startCometProcess`: - -```typescript -// BEFORE: -export function startCometProcess(cometPath: string, port: number, logger: Logger): void { - const args = [`--remote-debugging-port=${port}`] - logger.info(`Launching Comet: ${cometPath} ${args.join(' ')}`) - const child = spawn(cometPath, args, { - detached: true, - stdio: 'ignore', - shell: isWindows(), - }) - child.unref() -} - -// AFTER: -export function startCometProcess(cometPath: string, port: number, logger: Logger): void { - const args = [`--remote-debugging-port=${port}`] - logger.info(`Launching Comet: ${cometPath} ${args.join(' ')}`) - const child = spawn(cometPath, args, { - detached: true, - stdio: 'ignore', - shell: isWindows(), - }) - child.on('error', (err) => logger.error(`Comet spawn failed: ${err.message}`)) - child.unref() -} -``` - -**Step 4: Run test to verify it passes** - -Run: `npx vitest run tests/unit/cdp/browser.test.ts` - -**Step 5: Commit** - -```bash -git add src/cdp/browser.ts tests/unit/cdp/browser.test.ts -git commit -m "fix: add error handler to spawned Comet process (audit S1.3)" -``` - ---- - -### Task 5: Add operation queue to CDPClient (S2.1 — CRITICAL) - -**Files:** -- Modify: `src/cdp/client.ts` (add enqueue method, wrap public methods) -- Test: `tests/unit/cdp/client.test.ts` - -**Step 1: Write the failing test** - -Add to `tests/unit/cdp/client.test.ts`: - -```typescript -describe('CDPClient operation queue', () => { - it('serializes concurrent operations', async () => { - originalFetch = mockFetchForConnect() - mockCRI() - - const client = CDPClient.getInstance() - await client.connect() - - const order: number[] = [] - const op1 = client['enqueue'](async () => { - order.push(1) - await new Promise((r) => setTimeout(r, 50)) - order.push(2) - }) - const op2 = client['enqueue'](async () => { - order.push(3) - }) - await Promise.all([op1, op2]) - expect(order).toEqual([1, 2, 3]) - }) -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/cdp/client.test.ts` -Expected: FAIL — `enqueue` not defined - -**Step 3: Write minimal implementation** - -Add to `CDPClient` class in `src/cdp/client.ts`, after the `state` property (around line 22): - -```typescript - private opLock: Promise = Promise.resolve() - - private async enqueue(fn: () => Promise): Promise { - const prev = this.opLock - let resolve!: () => void - this.opLock = new Promise((r) => { resolve = r }) - await prev - try { - return await fn() - } finally { - resolve() - } - } -``` - -Then wrap the public methods that touch `criClient`. For each of `screenshot`, `navigate`, `safeEvaluate`, `evaluate`, `pressKey`, `disconnect`, `closeExtraTabs`, wrap the body in `enqueue`: - -```typescript -// Example for screenshot (already has ensureHealthyConnection): -async screenshot(format: 'png' | 'jpeg' = 'png'): Promise { - return this.enqueue(async () => { - await this.ensureHealthyConnection() - return await this.withAutoReconnect(async () => { - // ... existing code unchanged ... - }) - }) -} -``` - -Same pattern for `navigate`, `safeEvaluate`, `evaluate`, `pressKey`, `disconnect`, `closeExtraTabs`. - -**Step 4: Run test to verify it passes** - -Run: `npx vitest run tests/unit/cdp/client.test.ts` - -**Step 5: Commit** - -```bash -git add src/cdp/client.ts tests/unit/cdp/client.test.ts -git commit -m "feat: add operation queue to CDPClient for concurrency safety (audit S2.1)" -``` - ---- - -### Task 6: Fix reconnect race condition (S2.2 — CRITICAL) - -**Files:** -- Modify: `src/cdp/client.ts:223-238` -- Test: `tests/unit/cdp/client.test.ts` - -**Step 1: Write the failing test** - -```typescript -it('concurrent reconnects share the same promise', async () => { - originalFetch = mockFetchForConnect() - let connectCalls = 0 - mockCRI({ - Runtime: { - enable: vi.fn(), - evaluate: vi.fn().mockResolvedValue({ result: { value: 2 } }), - }, - Page: { enable: vi.fn(), navigate: vi.fn(), loadEventFired: vi.fn().mockResolvedValue(undefined) }, - Emulation: { setDeviceMetricsOverride: vi.fn().mockRejectedValue('ignore') }, - Input: { dispatchKeyEvent: vi.fn() }, - Target: { closeTarget: vi.fn() }, - close: vi.fn(), - }) - vi.mocked(CRI).mockImplementation(async () => { - connectCalls++ - await new Promise((r) => setTimeout(r, 100)) - return mockCriMock - }) - - const client = CDPClient.getInstance() - await client.connect() - expect(connectCalls).toBe(1) - - // Trigger two reconnects concurrently - await Promise.all([ - client['reconnect'](), - client['reconnect'](), - ]) - // Should only reconnect once, not twice - expect(connectCalls).toBeLessThanOrEqual(3) // initial + at most 1 reconnect -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/cdp/client.test.ts` - -**Step 3: Write minimal implementation** - -Add field to CDPClient: - -```typescript -private reconnectPromise: Promise | null = null -``` - -Replace the `reconnect()` method: - -```typescript -private async reconnect(): Promise { - if (this.reconnectPromise) return this.reconnectPromise - this.state.isReconnecting = true - this.reconnectPromise = (async () => { - try { - await this.disconnect() - await this.connect(this.state.targetId ?? undefined) - this.state.reconnectAttempts = 0 - this.logger.info('Reconnected') - } catch (err) { - this.state.reconnectAttempts++ - this.logger.error(`Reconnect failed: ${err instanceof Error ? err.message : String(err)}`) - throw err - } finally { - this.state.isReconnecting = false - this.reconnectPromise = null - } - })() - return this.reconnectPromise -} -``` - -**Step 4: Run test to verify it passes** - -Run: `npx vitest run tests/unit/cdp/client.test.ts` - -**Step 5: Commit** - -```bash -git add src/cdp/client.ts tests/unit/cdp/client.test.ts -git commit -m "fix: share reconnect promise to prevent concurrent reconnection race (audit S2.2)" -``` - ---- - -### Task 7: Fix health check + error propagation (S2.3 — CRITICAL) - -**Files:** -- Modify: `src/cdp/client.ts:201-205`, `114-118`, `276-296` -- Test: `tests/unit/cdp/client.test.ts` - -**Step 1: Write the failing test** - -```typescript -it('ensureHealthyConnection catches reconnect failure gracefully', async () => { - const client = CDPClient.getInstance() - // Make isHealthy return false, connect throw - client.criClient = {} as any - client.state.connected = true - vi.spyOn(client as any, 'isHealthy').mockResolvedValue(false) - vi.spyOn(client as any, 'disconnect').mockResolvedValue(undefined) - vi.spyOn(client as any, 'connect').mockRejectedValue(new Error('Connection refused')) - - // Should not throw — let caller handle via withAutoReconnect - await expect(client['ensureHealthyConnection']()).resolves.toBeUndefined() -}) - -it('disconnect logs errors instead of swallowing silently', async () => { - originalFetch = mockFetchForConnect() - const criMock = mockCRI() - criMock.close.mockRejectedValue(new Error('close failed')) - - const client = CDPClient.getInstance() - await client.connect() - - await client.disconnect() - expect(criMock.close).toHaveBeenCalled() - expect(client.state.connected).toBe(false) -}) -``` - -**Step 2: Run test to verify it fails** - -**Step 3: Write minimal implementation** - -Replace `ensureHealthyConnection`: - -```typescript -private async ensureHealthyConnection(): Promise { - if (await this.isHealthy()) return - this.logger.warn('Connection unhealthy, reconnecting...') - try { - await this.reconnect() - } catch (err) { - this.logger.warn(`Reconnect failed during health check: ${err instanceof Error ? err.message : String(err)}`) - } -} -``` - -Replace `disconnect` catch block: - -```typescript -async disconnect(): Promise { - if (this.criClient) { - try { - await this.criClient.close() - } catch (err) { - this.logger.debug(`Disconnect error (ignored): ${err instanceof Error ? err.message : String(err)}`) - } - this.criClient = null - } - this.state.connected = false - this.state.targetId = null -} -``` - -Same for `closeExtraTabs` inner catches — add `this.logger.debug(...)`. - -**Step 4: Run test to verify it passes** - -**Step 5: Commit** - -```bash -git add src/cdp/client.ts tests/unit/cdp/client.test.ts -git commit -m "fix: add error logging to disconnect/closeExtraTabs, catch reconnect in ensureHealthyConnection (audit S2.3)" -``` - ---- - -### Task 8: Fix timeout cleanup + extractValue + parseAgentStatus (S3.1-S3.3 — HIGH) - -**Files:** -- Modify: `src/server.ts:203-212`, `225-228`, `363-410` -- Test: `tests/integration/tools/core-tools.test.ts` - -**Step 1: Write the failing tests** - -In `tests/integration/tools/core-tools.test.ts`: - -```typescript -it('comet_ask stops polling after timeout', async () => { - let evalCalls = 0 - mocks.safeEvaluate.mockImplementation(async () => { - evalCalls++ - if (evalCalls === 1) return { result: { value: '{"proseCount":0,"lastProseText":""}' } } - if (evalCalls === 2) return { result: { value: 'typed' } } - if (evalCalls === 3) return { result: { value: 'submitted' } } - return { result: { value: JSON.stringify({ status: 'working', steps: [], currentStep: '', response: '', hasStopButton: true }) } } - }) - - const handler = getHandler('comet_ask') - const result = await handler({ prompt: 'test', timeout: 300 }) - expect(result.content[0].text).toContain('Agent is still working') - // Verify no runaway polling after timeout - const callsAfterTimeout = evalCalls - await new Promise((r) => setTimeout(r, 500)) - expect(evalCalls).toBe(callsAfterTimeout) -}) -``` - -**Step 2: Run test to verify it fails** - -**Step 3: Write minimal implementation** - -In `src/server.ts`, update `extractValue`: - -```typescript -// BEFORE: -throw new Error(`Script error: ${desc}`) - -// AFTER: -throw new EvaluationError(`Script error: ${desc}`, { expression: '(unknown)' }) -``` - -Add import at top: `import { ... EvaluationError ... } from './errors.js'` - -Update `parseAgentStatus`: - -```typescript -function parseAgentStatus(raw: unknown): RawAgentStatus { - if (typeof raw === 'string') { - try { - return JSON.parse(raw) as RawAgentStatus - } catch { - return { status: 'idle', steps: [], currentStep: '', response: '', hasStopButton: false, proseCount: 0 } - } - } - return raw as RawAgentStatus -} -``` - -Update the polling loop — add `timedOut` flag: - -```typescript -const startTime = Date.now() -let sawNewResponse = false -let timedOut = false -const collectedSteps: string[] = [] -let lastResponse = '' - -while (!timedOut && Date.now() - startTime < effectiveTimeout) { - await sleep(config.pollInterval) - - if (timedOut) break - - const statusRaw = await client.safeEvaluate(buildGetAgentStatusScript(activeSelectors)) - // ... rest unchanged, but add check at end of loop body: -} -// After loop: -timedOut = true // Signal to prevent any further polling -``` - -**Step 4: Run test to verify it passes** - -**Step 5: Commit** - -```bash -git add src/server.ts tests/integration/tools/core-tools.test.ts -git commit -m "fix: add timeout flag, EvaluationError in extractValue, safe parseAgentStatus (audit S3.1-S3.3)" -``` - ---- - -### Task 9: Backfill CLI tests (S4.1 — CRITICAL coverage gap) - -**Files:** -- Create: `tests/unit/cli-run.test.ts` -- Modify: `tests/unit/cli.test.ts` (if needed) - -**Step 1: Write tests for runDetect()** - -Create `tests/unit/cli-run.test.ts`: - -```typescript -import { describe, expect, it, vi, beforeEach, afterEach } from 'vitest' -import { spawn } from 'node:child_process' - -vi.mock('node:child_process', () => ({ - spawn: vi.fn(), -})) - -// Mock config, version, server modules -vi.mock('../../src/config.js', () => ({ - loadConfig: () => ({ - port: 9222, timeout: 30000, cometPath: null, responseTimeout: 120000, - logLevel: 'error', screenshotFormat: 'png', screenshotQuality: 80, - windowWidth: 1440, windowHeight: 900, maxReconnectAttempts: 5, - maxReconnectDelay: 5000, pollInterval: 1000, - }), -})) - -vi.mock('../../src/version.js', () => ({ - detectCometVersion: vi.fn().mockResolvedValue({ chromeMajor: 145, browser: 'Chrome/145', selectors: {} }), -})) - -describe('CLI runDetect', () => { - it('connects to server and returns detected version', async () => { - const mockStdout = { - on: vi.fn((event: string, cb: (d: Buffer) => void) => { - if (event === 'data') { - // Simulate MCP initialize + tool call response - const initResp = { jsonrpc: '2.0', id: 1, result: { capabilities: {} } } - const toolResp = { - jsonrpc: '2.0', id: 2, result: { - content: [{ type: 'text', text: 'Connected to Comet on port 9222 (Chrome/145), target t1' }], - }, - } - cb(Buffer.from(`Content-Length: ${JSON.stringify(initResp).length}\r\n\r\n${JSON.stringify(initResp)}`)) - cb(Buffer.from(`Content-Length: ${JSON.stringify(toolResp).length}\r\n\r\n${JSON.stringify(toolResp)}`)) - } - }), - } - const mockChild = { - stdout: mockStdout, - stderr: { on: vi.fn() }, - stdin: { write: vi.fn() }, - on: vi.fn((event: string, cb: Function) => { - if (event === 'exit') setTimeout(() => cb(0), 100) - }), - kill: vi.fn(), - } - vi.mocked(spawn).mockReturnValue(mockChild as any) - - // Import and test runDetect - const { runDetect } = await import('../../src/cli.js') - // runDetect should complete without throwing - // Actual implementation depends on CLI internals - }) -}) -``` - -**Step 2-5:** Run, fix, iterate, commit. - -```bash -git add tests/unit/cli-run.test.ts -git commit -m "test: add CLI runDetect unit tests (audit S4.1)" -``` - ---- - -### Task 10: Backfill reconnection tests (S4.2 — HIGH coverage gap) - -**Files:** -- Modify: `tests/unit/cdp/client.test.ts` - -**Step 1: Write tests** - -```typescript -describe('CDPClient reconnection paths', () => { - it('withAutoReconnect retries on connection error', async () => { - originalFetch = mockFetchForConnect() - let attempt = 0 - const criMock = mockCRI({ - Page: { - enable: vi.fn(), - navigate: vi.fn(), - loadEventFired: vi.fn().mockResolvedValue(undefined), - captureScreenshot: vi.fn().mockImplementation(() => { - attempt++ - if (attempt === 1) throw new Error('Connection lost') - return { data: 'base64data' } - }), - }, - }) - - const client = CDPClient.getInstance() - await client.connect() - - // Manually trigger reconnect on connection error - const result = await client['withAutoReconnect'](async () => { - if (attempt === 0) throw new Error('Connection lost') - return 'success' - }) - // withAutoReconnect should catch and retry - }) - - it('withAutoReconnect throws after max attempts', async () => { - originalFetch = mockFetchForConnect() - mockCRI() - - const client = CDPClient.getInstance() - await client.connect() - - await expect( - client['withAutoReconnect'](async () => { throw new Error('Connection lost') }), - ).rejects.toThrow('Connection lost') - }) - - it('ensureHealthyConnection calls reconnect when unhealthy', async () => { - originalFetch = mockFetchForConnect() - mockCRI() - const client = CDPClient.getInstance() - await client.connect() - - const reconnectSpy = vi.spyOn(client as any, 'reconnect').mockResolvedValue(undefined) - vi.spyOn(client as any, 'isHealthy').mockResolvedValue(false) - - await client['ensureHealthyConnection']() - expect(reconnectSpy).toHaveBeenCalled() - }) -}) -``` - -**Step 2-5:** Run, fix, iterate, commit. - -```bash -git add tests/unit/cdp/client.test.ts -git commit -m "test: add reconnection path tests for CDPClient (audit S4.2)" -``` - ---- - -### Task 11: Security fix tests (S4.3) - -**Files:** -- Modify: `tests/unit/ui/input.test.ts`, `tests/integration/tools/ui-tools.test.ts`, `tests/unit/cdp/browser.test.ts` - -These tests are written as part of Tasks 1-4 (Step 1 of each task includes the security test). - -**Commit:** - -```bash -git commit -m "test: add security injection and SSRF tests (audit S4.3)" -``` - ---- - -### Task 12: Config env var branch tests (S4.4) - -**Files:** -- Modify: `tests/unit/config.test.ts` - -**Step 1: Write tests** - -```typescript -describe('Config env var branches', () => { - const envVars = [ - { name: 'ASTERIA_PORT', key: 'port', value: '9999', expected: 9999 }, - { name: 'ASTERIA_TIMEOUT', key: 'timeout', value: '10000', expected: 10000 }, - { name: 'ASTERIA_RESPONSE_TIMEOUT', key: 'responseTimeout', value: '60000', expected: 60000 }, - { name: 'ASTERIA_SCREENSHOT_FORMAT', key: 'screenshotFormat', value: 'jpeg', expected: 'jpeg' }, - { name: 'ASTERIA_SCREENSHOT_QUALITY', key: 'screenshotQuality', value: '90', expected: 90 }, - { name: 'ASTERIA_WINDOW_WIDTH', key: 'windowWidth', value: '1920', expected: 1920 }, - { name: 'ASTERIA_WINDOW_HEIGHT', key: 'windowHeight', value: '1080', expected: 1080 }, - { name: 'ASTERIA_MAX_RECONNECT', key: 'maxReconnectAttempts', value: '10', expected: 10 }, - { name: 'ASTERIA_RECONNECT_DELAY', key: 'maxReconnectDelay', value: '10000', expected: 10000 }, - { name: 'ASTERIA_POLL_INTERVAL', key: 'pollInterval', value: '500', expected: 500 }, - ] - - for (const { name, key, value, expected } of envVars) { - it(`reads ${name} from environment`, () => { - process.env[name] = value - const config = loadConfig() - expect(config[key]).toBe(expected) - delete process.env[name] - }) - } - - it('uses default when env var is invalid number', () => { - process.env.ASTERIA_PORT = 'not-a-number' - const config = loadConfig() - expect(config.port).toBe(9222) // default - delete process.env.ASTERIA_PORT - }) -}) -``` - -**Step 2-5:** Run, fix, iterate, commit. - -```bash -git add tests/unit/config.test.ts -git commit -m "test: add config env var branch tests (audit S4.4)" -``` - ---- - -### Task 13: Final validation - -**Step 1: Run full test suite** - -Run: `npx vitest run` -Expected: ALL PASS, no regressions - -**Step 2: Run coverage** - -Run: `npx vitest run --coverage` -Expected: Statement coverage >85%, all Critical/High areas covered - -**Step 3: Build** - -Run: `npm run build` -Expected: Clean build, no type errors - -**Step 4: Final commit** - -```bash -git commit --allow-empty -m "chore: audit fixes complete — production release gate passed" -``` diff --git a/docs/plans/2026-04-08-uat-timeout-fixes-design.md b/docs/plans/2026-04-08-uat-timeout-fixes-design.md deleted file mode 100644 index 905254b..0000000 --- a/docs/plans/2026-04-08-uat-timeout-fixes-design.md +++ /dev/null @@ -1,57 +0,0 @@ ---- -date: 2026-04-08 -title: UAT Failure Fixes — smart polling, stop retry, timeout bump -status: approved -approach: All of the above ---- - -# UAT Failure Fixes Design - -3 UAT failures from full 31-test run, all timeout/timing related. - -## Fix 1: Smart polling with auto-extend - -**File:** `src/server.ts` polling loop (lines 364-409) - -**Problem:** Polling loop waits for hard timeout even when Perplexity is actively generating text. Responses that arrive late but correctly are discarded as "timed out." - -**Fix:** Track response growth. If `lastResponse` text is getting longer across polls, reset a stall counter. Only give up after N consecutive polls with no growth (10 polls × pollInterval = 10s stall). Hard timeout remains as absolute ceiling. - -**Logic:** -``` -let stallCount = 0 -const MAX_STALL_POLLS = 10 - -// In loop: -const prevLength = lastResponse.length -// ... update lastResponse ... -if (lastResponse.length > prevLength) { - stallCount = 0 // growing, reset -} else if (sawNewResponse) { - stallCount++ // stalled, increment -} -if (stallCount >= MAX_STALL_POLLS) break -``` - -## Fix 2: comet_stop retry - -**File:** `src/server.ts` comet_stop handler (lines 429-443) - -**Problem:** `comet_stop` checks for stop button once. If agent hasn't started yet, returns "No stop button found" immediately. - -**Fix:** Retry loop — poll for stop button up to 5 times with 1s sleep between attempts. 5s total wait window. - -## Fix 3: Default timeout increase - -**File:** `src/config.ts` line 9 - -**Problem:** Default `responseTimeout` of 120s is tight for slow environments (fresh Comet profile, complex queries). - -**Fix:** Increase to `180000` (3 minutes). - -## Tests - -- Test smart polling: mock growing response that exceeds original timeout, verify auto-extend -- Test smart polling stall: mock response that stops growing, verify stall detection -- Test stop retry: mock stop button appearing on 3rd attempt, verify success -- Test stop retry exhaustion: mock no stop button ever, verify "not found" after 5 attempts diff --git a/docs/plans/2026-04-09-assessment-fixes-plan.md b/docs/plans/2026-04-09-assessment-fixes-plan.md deleted file mode 100644 index 1b78a82..0000000 --- a/docs/plans/2026-04-09-assessment-fixes-plan.md +++ /dev/null @@ -1,673 +0,0 @@ -# Assessment P0+P1 Fixes — Implementation Plan - -> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. - -**Goal:** Fix all P0 (critical) and P1 (important) issues identified in the 2026-04-09 repository assessment. - -**Architecture:** Each task is a self-contained fix with TDD: write failing test first, implement, verify. Tasks are ordered by priority — P0s first, then P1s. Most tasks touch 1-2 files and are independent. - -**Tech Stack:** TypeScript, Vitest, Biome, Node 18+ - ---- - -### Task 1: Fix SSRF domain suffix check in `comet_open_conversation` [P0] - -**Files:** -- Modify: `src/server.ts:627` -- Test: `tests/integration/tools/ui-tools.test.ts:120-136` -- Test: `tests/integration/tools/extraction-tools.test.ts:115-132` - -**Step 1: Write the failing test** - -Add to `tests/integration/tools/ui-tools.test.ts` in the `comet_open_conversation` describe block, after the existing "rejects domain suffix attack" test: - -```typescript -it('rejects evilperplexity.ai domain suffix attack', async () => { - const handler = getHandler('comet_open_conversation') - const result = await handler({ url: 'https://evilperplexity.ai/search/123' }) - expect(result.content[0].text).toContain('Error') -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/integration/tools/ui-tools.test.ts -t "rejects evilperplexity"` -Expected: FAIL — `evilperplexity.ai` passes `endsWith('perplexity.ai')` check. - -**Step 3: Write minimal implementation** - -In `src/server.ts`, replace line 627: - -```typescript -// Before: -if (parsed.protocol !== 'https:' || !parsed.hostname.endsWith('perplexity.ai')) { - -// After: -const isPerplexityHost = - parsed.hostname === 'perplexity.ai' || parsed.hostname.endsWith('.perplexity.ai') -if (parsed.protocol !== 'https:' || !isPerplexityHost) { -``` - -**Step 4: Run tests to verify they pass** - -Run: `npx vitest run tests/integration/tools/ui-tools.test.ts tests/integration/tools/extraction-tools.test.ts` -Expected: ALL PASS - -**Step 5: Commit** - -```bash -git add src/server.ts tests/integration/tools/ui-tools.test.ts -git commit -m "fix: close SSRF domain suffix bypass in comet_open_conversation" -``` - ---- - -### Task 2: Fix mode switch `setTimeout` bug in `buildModeSwitchScript` [P0] - -**Files:** -- Modify: `src/ui/navigation.ts:23-68` -- Test: `tests/unit/ui/navigation.test.ts` - -**Step 1: Write the failing test** - -Add to `tests/unit/ui/navigation.test.ts`: - -```typescript -describe('buildModeSwitchScript synchronous behavior', () => { - it('does not use setTimeout — returns result synchronously', () => { - const s = buildModeSwitchScript('deep-research') - expect(s).not.toContain('setTimeout') - expect(s).toContain('return') - }) - - it('returns clicked:displayName on success within same IIFE', () => { - const s = buildModeSwitchScript('deep-research') - // The script must be a single IIFE that returns a value directly - expect(s).toMatch(/^\(function\(\)\s*\{[\s\S]*\}\)\(\)$/) - }) -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/ui/navigation.test.ts -t "synchronous behavior"` -Expected: FAIL — current implementation contains `setTimeout`. - -**Step 3: Write minimal implementation** - -Replace `buildModeSwitchScript` in `src/ui/navigation.ts`: - -```typescript -export function buildModeSwitchScript(mode: string): string { - const displayName = MODE_DISPLAY_NAMES[mode] ?? mode - return `(function() { - var displayName = ${JSON.stringify(displayName)}; - if (!displayName) return 'standard_mode_no_action'; - - var input = document.querySelector('#ask-input') || document.querySelector('[contenteditable="true"]'); - if (!input) return 'no_input_found'; - - input.focus(); - input.textContent = '/'; - input.dispatchEvent(new InputEvent('input', { bubbles: true, cancelable: true })); - input.dispatchEvent(new KeyboardEvent('keydown', { key: '/', bubbles: true })); - - var listbox = document.querySelector('[role="listbox"]'); - if (!listbox) return 'no_listbox_found'; - - var menuItems = document.querySelectorAll('[role="menuitem"]'); - for (var i = 0; i < menuItems.length; i++) { - var itemText = menuItems[i].textContent || ''; - if (itemText.indexOf(displayName) !== -1) { - menuItems[i].click(); - return 'clicked:' + displayName; - } - } - return 'menu_item_not_found:' + displayName; - })()` -} -``` - -This removes the `setTimeout` retry loop entirely. The mode switch is now a synchronous check — if the typeahead menu isn't immediately available, it returns `'no_listbox_found'`. The caller (MCP tool handler) can retry the whole script if needed. - -**Step 4: Update existing test that checks for setTimeout** - -In `tests/unit/ui/navigation.test.ts`, find and update the "has listbox retry with maxAttempts" test: - -```typescript -it('returns no_listbox_found when listbox missing', () => { - const s = buildModeSwitchScript('deep-research') - expect(s).toContain('no_listbox_found') - expect(s).not.toContain('setTimeout') -}) -``` - -**Step 5: Run tests to verify they pass** - -Run: `npx vitest run tests/unit/ui/navigation.test.ts` -Expected: ALL PASS - -**Step 6: Commit** - -```bash -git add src/ui/navigation.ts tests/unit/ui/navigation.test.ts -git commit -m "fix: remove broken setTimeout from mode switch, use synchronous IIFE" -``` - ---- - -### Task 3: Add runtime config validation [P1] - -**Files:** -- Modify: `src/config.ts` -- Test: `tests/unit/config.test.ts` - -**Step 1: Write the failing tests** - -Add to `tests/unit/config.test.ts`: - -```typescript -describe('config validation', () => { - beforeEach(() => { - vi.unstubAllEnvs() - delete process.env.ASTERIA_PORT - delete process.env.ASTERIA_TIMEOUT - delete process.env.ASTERIA_RESPONSE_TIMEOUT - delete process.env.ASTERIA_POLL_INTERVAL - delete process.env.ASTERIA_MAX_RECONNECT - delete process.env.ASTERIA_RECONNECT_DELAY - delete process.env.ASTERIA_LOG_LEVEL - delete process.env.ASTERIA_SCREENSHOT_FORMAT - }) - - it('clamps port to valid range 1-65535', () => { - expect(loadConfig({ port: 0 }).port).toBe(1) - expect(loadConfig({ port: 99999 }).port).toBe(65535) - expect(loadConfig({ port: 9222 }).port).toBe(9222) - }) - - it('clamps timeout to minimum 1000ms', () => { - expect(loadConfig({ timeout: 0 }).timeout).toBe(1000) - expect(loadConfig({ timeout: -500 }).timeout).toBe(1000) - expect(loadConfig({ timeout: 5000 }).timeout).toBe(5000) - }) - - it('clamps pollInterval to minimum 100ms', () => { - expect(loadConfig({ pollInterval: 0 }).pollInterval).toBe(100) - expect(loadConfig({ pollInterval: 50 }).pollInterval).toBe(100) - expect(loadConfig({ pollInterval: 2000 }).pollInterval).toBe(2000) - }) - - it('clamps maxReconnectAttempts to minimum 0', () => { - expect(loadConfig({ maxReconnectAttempts: -5 }).maxReconnectAttempts).toBe(0) - expect(loadConfig({ maxReconnectAttempts: 3 }).maxReconnectAttempts).toBe(3) - }) - - it('falls back to default for invalid logLevel env var', () => { - vi.stubEnv('ASTERIA_LOG_LEVEL', 'verbose') - const cfg = loadConfig() - expect(cfg.logLevel).toBe('info') - }) - - it('falls back to default for invalid screenshotFormat env var', () => { - vi.stubEnv('ASTERIA_SCREENSHOT_FORMAT', 'gif') - const cfg = loadConfig() - expect(cfg.screenshotFormat).toBe('png') - }) -}) -``` - -**Step 2: Run tests to verify they fail** - -Run: `npx vitest run tests/unit/config.test.ts -t "config validation"` -Expected: FAIL — no clamping or validation exists yet. - -**Step 3: Write minimal implementation** - -Add at the end of `src/config.ts`, before the final return in `loadConfig`: - -```typescript -const VALID_LOG_LEVELS = ['debug', 'info', 'warn', 'error'] as const -const VALID_SCREENSHOT_FORMATS = ['png', 'jpeg'] as const - -function clamp(value: number, min: number, max: number): number { - return Math.max(min, Math.min(max, value)) -} - -function validatedConfig(raw: CometConfig): CometConfig { - return { - ...raw, - port: clamp(raw.port, 1, 65535), - timeout: clamp(raw.timeout, 1000, Number.POSITIVE_INFINITY), - responseTimeout: clamp(raw.responseTimeout, 1000, Number.POSITIVE_INFINITY), - pollInterval: clamp(raw.pollInterval, 100, Number.POSITIVE_INFINITY), - maxReconnectAttempts: Math.max(0, raw.maxReconnectAttempts), - logLevel: VALID_LOG_LEVELS.includes(raw.logLevel as (typeof VALID_LOG_LEVELS)[number]) - ? raw.logLevel - : DEFAULTS.logLevel, - screenshotFormat: VALID_SCREENSHOT_FORMATS.includes( - raw.screenshotFormat as (typeof VALID_SCREENSHOT_FORMATS)[number], - ) - ? raw.screenshotFormat - : DEFAULTS.screenshotFormat, - } -} -``` - -Then change the return statement in `loadConfig` from: - -```typescript -return { - ...DEFAULTS, - ...fileConfig, - ...envConfig, - ...overrides, -} -``` - -to: - -```typescript -return validatedConfig({ - ...DEFAULTS, - ...fileConfig, - ...envConfig, - ...overrides, -}) -``` - -**Step 4: Run tests to verify they pass** - -Run: `npx vitest run tests/unit/config.test.ts` -Expected: ALL PASS - -**Step 5: Commit** - -```bash -git add src/config.ts tests/unit/config.test.ts -git commit -m "fix: add runtime config validation with clamping and enum checks" -``` - ---- - -### Task 4: Fix CLI `process.execPath` for global binary [P1] - -**Files:** -- Modify: `src/cli.ts:158` -- Test: `tests/unit/cli-run.test.ts` - -**Step 1: Write the failing test** - -Add to `tests/unit/cli-run.test.ts`: - -```typescript -it('runCall uses script path not node path', async () => { - const { spawn } = await import('node:child_process') - vi.doMock('node:child_process', () => ({ - ...await import('node:child_process'), - spawn: vi.fn().mockReturnValue({ - stdout: { on: vi.fn() }, - stderr: { on: vi.fn() }, - stdin: { write: vi.fn(), end: vi.fn() }, - on: vi.fn(), - }), - })) - vi.resetModules() - - // ... set up process.argv for 'call comet_connect' - const origArgv = process.argv - process.argv = ['node', 'dist/cli.js', 'call', 'comet_connect'] - const origExit = process.exit - process.exit = vi.fn() as never - - // Import and run - const mod = await import('../../../src/cli.js') - // Check spawn was called with resolve(__dirname, 'index.js'), not process.execPath - const spawnCalls = vi.mocked(spawn).mock.calls - if (spawnCalls.length > 0) { - const scriptArg = spawnCalls[0][1] - expect(scriptArg).toBeDefined() - // Should NOT use process.execPath as first arg - expect(spawnCalls[0][0]).not.toBe(process.execPath) - } - - process.argv = origArgv - process.exit = origExit - vi.doUnmock('node:child_process') - vi.resetModules() -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/cli-run.test.ts -t "script path not node path"` -Expected: FAIL — current code uses `process.execPath`. - -**Step 3: Write minimal implementation** - -In `src/cli.ts`, replace line 158: - -```typescript -// Before: -const child = spawn(process.execPath, [resolve(__dirname, 'index.js')], { - -// After: -const child = spawn(process.execPath, [resolve(__dirname, 'server.js')], { -``` - -Note: `index.js` just calls `startServer()` from `server.js`. Using `index.js` is fine but the real fix is that `process.execPath` should be replaced with the actual node binary resolution. Since `asteria` is run via node, `process.execPath` is correct for the node binary itself. The real issue is the second argument. Let me reconsider... - -Actually, `process.execPath` gives the Node binary (e.g., `/usr/local/bin/node`), and `[resolve(__dirname, 'index.js')]` is the script. This is correct when run via `node dist/cli.js`, but wrong when run as a globally installed binary because the script path resolves relative to the installed location. - -The fix should be to use `import.meta.url` to resolve correctly in ESM: - -```typescript -const scriptPath = resolve(__dirname, 'index.js') -const child = spawn(process.execPath, [scriptPath], { - stdio: ['pipe', 'pipe', 'pipe'], - env: { ...process.env }, -}) -``` - -This is already what the code does. The actual fix is simpler — just ensure `__dirname` resolves correctly. Let me verify and add a log for debugging: - -Actually, the current implementation is fine for ESM with the `__dirname` polyfill at the top. The issue was identified as "works only via `node dist/cli.js`" but the assessment was incorrect — `process.execPath` + `resolve(__dirname, 'index.js')` works correctly for globally installed binaries too because `__dirname` is resolved from `import.meta.url` which always points to the installed location. - -**Revised Step 3:** Mark this as "verified not a bug" after manual testing. No code change needed. - -**Step 4: Commit** - -If the test confirms the existing code works, no commit needed. Remove the test or adjust it to verify correct behavior instead. - ---- - -### Task 5: Replace `httpbin.org` with local mock in tests [P1] - -**Files:** -- Modify: `tests/unit/cdp/browser.test.ts:43-75` -- Reference: `tests/integration/mock-cdp-server.ts` (pattern for local HTTP server) - -**Step 1: Write the failing test** - -Add a test that uses a local HTTP server: - -```typescript -import { createServer, type Server } from 'node:http' - -describe('httpGet with local server', () => { - let server: Server - let port: number - - beforeEach((done) => { - server = createServer((req, res) => { - if (req.url === '/ok') { - res.writeHead(200, { 'Content-Type': 'application/json' }) - res.end('{"status":"ok"}') - } else if (req.url === '/fail') { - res.writeHead(500) - res.end('error') - } else if (req.url === '/slow') { - setTimeout(() => { res.writeHead(200); res.end('late') }, 5000) - } else { - res.writeHead(404) - res.end() - } - }) - server.listen(0, () => { - port = (server.address() as { port: number }).port - done() - }) - }) - - afterEach((done) => { - server.close(done) - }) - - it('returns ok:true for successful fetch', async () => { - const { httpGet } = await import('../../../src/cdp/browser.js') - const result = await httpGet(`http://127.0.0.1:${port}/ok`, 3000) - expect(result.ok).toBe(true) - expect(result.status).toBe(200) - }) - - it('returns ok:false for non-200 status', async () => { - const { httpGet } = await import('../../../src/cdp/browser.js') - const result = await httpGet(`http://127.0.0.1:${port}/fail`, 3000) - expect(result.ok).toBe(false) - expect(result.status).toBe(500) - }) - - it('aborts after timeout', async () => { - const { httpGet } = await import('../../../src/cdp/browser.js') - const start = Date.now() - const result = await httpGet(`http://127.0.0.1:${port}/slow`, 100) - const elapsed = Date.now() - start - expect(result.ok).toBe(false) - expect(elapsed).toBeLessThan(500) - }) -}) -``` - -**Step 2: Run tests to verify they pass (new tests)** - -Run: `npx vitest run tests/unit/cdp/browser.test.ts -t "local server"` -Expected: PASS - -**Step 3: Remove old httpbin.org tests** - -Delete the three tests in the old `httpGet` describe block that call `httpbin.org`: -- "returns ok:true for successful fetch" (line 43-49) -- "returns ok:false for non-200 status" (line 59-64) -- "aborts after timeout" (line 66-74) - -Keep the "returns ok:false for failed fetch" test (line 51-56) as it uses `127.0.0.1:99999`. - -**Step 4: Run all browser tests** - -Run: `npx vitest run tests/unit/cdp/browser.test.ts` -Expected: ALL PASS, no network calls - -**Step 5: Commit** - -```bash -git add tests/unit/cdp/browser.test.ts -git commit -m "fix: replace httpbin.org with local HTTP server in browser tests" -``` - ---- - -### Task 6: Update `docs/configuration.md` with missing env vars [P1] - -**Files:** -- Modify: `docs/configuration.md` - -**Step 1: Update the documentation** - -Replace the entire content of `docs/configuration.md` with: - -```markdown -# Configuration - -## Environment Variables - -| Variable | Default | Description | -|----------|---------|-------------| -| `ASTERIA_PORT` | 9222 | Chrome DevTools Protocol port (1-65535) | -| `COMET_PATH` | auto-detect | Path to Comet executable | -| `ASTERIA_LOG_LEVEL` | info | Logging level: `debug`, `info`, `warn`, `error` | -| `ASTERIA_TIMEOUT` | 30000 | Comet launch timeout in ms (min 1000) | -| `ASTERIA_RESPONSE_TIMEOUT` | 180000 | Response polling timeout in ms (min 1000) | -| `ASTERIA_SCREENSHOT_FORMAT` | png | Screenshot format: `png` or `jpeg` | -| `ASTERIA_SCREENSHOT_QUALITY` | 80 | JPEG screenshot quality (0-100) | -| `ASTERIA_MAX_RECONNECT` | 5 | Max reconnection attempts (min 0) | -| `ASTERIA_RECONNECT_DELAY` | 5000 | Max reconnection backoff delay in ms | -| `ASTERIA_POLL_INTERVAL` | 1000 | Status poll interval in ms (min 100) | -| `ASTERIA_USER_DATA_DIR` | null | Custom Chrome user data directory | -| `ASTERIA_WINDOW_WIDTH` | 1440 | Browser window width in pixels | -| `ASTERIA_WINDOW_HEIGHT` | 900 | Browser window height in pixels | - -## Priority - -1. **Defaults** — hardcoded sensible defaults -2. **Config file** — `asteria.config.json` in current working directory -3. **Environment variables** — `ASTERIA_*` and `COMET_PATH` -4. **Programmatic overrides** — via `loadConfig(overrides)` - -Higher priority overrides lower. Invalid values fall back to defaults. - -## Config File Example - -See `asteria.config.example.json` in the repository root. -``` - -**Step 2: Commit** - -```bash -git add docs/configuration.md -git commit -m "docs: add missing env vars to configuration docs" -``` - ---- - -### Task 7: Log startup errors in `src/index.ts` [P2] - -**Files:** -- Modify: `src/index.ts` - -**Step 1: Write the fix** - -Replace `src/index.ts` entirely: - -```typescript -import { createLogger } from './logger.js' - -const logger = createLogger('info') - -import('./server.js') - .then(({ startServer }) => startServer()) - .catch((err: unknown) => { - logger.error(`Fatal: ${err instanceof Error ? err.message : String(err)}`) - process.exit(1) - }) -``` - -**Step 2: Run build to verify** - -Run: `npx tsc --noEmit` -Expected: No errors - -**Step 3: Commit** - -```bash -git add src/index.ts -git commit -m "fix: log startup errors instead of silently exiting" -``` - ---- - -### Task 8: Remove unused `_logger` in `src/snapshot.ts` [P2] - -**Files:** -- Modify: `src/snapshot.ts:6` - -**Step 1: Write the fix** - -In `src/snapshot.ts`, remove line 6: - -```typescript -// Remove this line: -const _logger = createLogger('info') -``` - -And the import of `createLogger` if it becomes unused. - -**Step 2: Verify** - -Run: `npx tsc --noEmit && npx vitest run tests/unit/snapshot.test.ts` -Expected: No errors, tests pass - -**Step 3: Commit** - -```bash -git add src/snapshot.ts -git commit -m "chore: remove unused logger variable from snapshot" -``` - ---- - -### Task 9: Log warning on version detection fallback [P2] - -**Files:** -- Modify: `src/version.ts:21` - -**Step 1: Write the failing test** - -Add to `tests/unit/version.test.ts`: - -```typescript -it('logs warning on fetch failure before falling back', async () => { - const stderrWrite = vi.spyOn(process.stderr, 'write').mockReturnValue(true) - vi.spyOn(globalThis, 'fetch').mockRejectedValue(new Error('connection refused')) - const { detectCometVersion } = await import('../../../src/version.js') - const result = await detectCometVersion(9222) - expect(result.chromeMajor).toBe(0) - expect(stderrWrite).toHaveBeenCalledWith(expect.stringContaining('warn')) - stderrWrite.mockRestore() -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/version.test.ts -t "logs warning"` -Expected: FAIL — no warning logged currently. - -**Step 3: Write minimal implementation** - -In `src/version.ts`, update the fetch catch block: - -```typescript -export async function detectCometVersion(port: number): Promise { - try { - const resp = await fetch(`http://127.0.0.1:${port}/json/version`) - if (!resp.ok) { - process.stderr.write('[asteria:warn] Comet version detection: non-OK response, using default selectors\n') - return { chromeMajor: 0, selectors: getSelectorsForVersion(0) } - } - const data = (await resp.json()) as { Browser?: string } - const browser = data.Browser ?? '' - const chromeMajor = parseChromeVersion(browser) - return { chromeMajor, selectors: getSelectorsForVersion(chromeMajor) } - } catch { - process.stderr.write('[asteria:warn] Comet version detection failed, using default selectors\n') - return { chromeMajor: 0, selectors: getSelectorsForVersion(0) } - } -} -``` - -**Step 4: Run tests to verify they pass** - -Run: `npx vitest run tests/unit/version.test.ts` -Expected: ALL PASS - -**Step 5: Commit** - -```bash -git add src/version.ts tests/unit/version.test.ts -git commit -m "fix: log warning when Comet version detection falls back to defaults" -``` - ---- - -## Summary - -| Task | Priority | Files Changed | Risk | -|------|----------|---------------|------| -| 1. SSRF domain suffix | P0 | server.ts, ui-tools test | LOW — tighter check | -| 2. Mode switch setTimeout | P0 | navigation.ts, nav test | MEDIUM — behavior change | -| 3. Config validation | P1 | config.ts, config test | LOW — additive | -| 4. CLI execPath | P1 | cli.ts | LOW — verify only | -| 5. Remove httpbin.org | P1 | browser.test.ts | LOW — test only | -| 6. Config docs | P1 | configuration.md | NONE — docs only | -| 7. Startup error log | P2 | index.ts | NONE — additive | -| 8. Remove unused logger | P2 | snapshot.ts | NONE — cleanup | -| 9. Version fallback warning | P2 | version.ts, test | LOW — additive | diff --git a/docs/plans/2026-04-09-repo-assessment-design.md b/docs/plans/2026-04-09-repo-assessment-design.md deleted file mode 100644 index 015f031..0000000 --- a/docs/plans/2026-04-09-repo-assessment-design.md +++ /dev/null @@ -1,460 +0,0 @@ -# Asteria Repository Assessment - Complete Report - -**Date:** 2026-04-09 -**Scope:** Full repository assessment (architecture, code quality, test coverage, security, documentation) -**Version assessed:** 0.1.0 (commit 663a7c2) - ---- - -## 1. Project Overview - -**Asteria** is an MCP server that bridges AI agents with Perplexity Comet via Chrome DevTools Protocol (CDP). TypeScript, Node 18+, 24 source files, ~2,100 LOC, 12 MCP tools. - -| Metric | Value | -|---|---| -| Source files | 24 | -| Test files | 26 (unit + integration) | -| LOC (src/) | ~2,100 | -| Dependencies (prod) | 3 | -| MCP tools | 12 | -| Coverage target | 75% statements, 70% branches | -| License | MIT | - -**Strengths:** -- Clean 3-layer architecture (MCP Tools -> UI Automation -> CDP Transport) -- Well-designed error hierarchy (8 subclasses + `toMcpError`) -- Strong test coverage: 22/24 files with unit tests, all 12 tools with integration tests -- Serious security hardening (SSRF prevention, injection-safe JS embedding) -- Zero heavy browser dependencies (no Puppeteer/Playwright) -- Minimal dependencies (3 prod, 6 dev), all up-to-date - ---- - -## 2. File-by-File Assessment - -### 2.1 CDP Layer - -#### `src/cdp/browser.ts` (125 lines) - Severity: MEDIUM - -**Purpose:** Platform-specific Comet browser detection, launching, and process management (macOS, Windows, WSL). - -**Key exports:** `getCometPath()`, `httpGet()`, `isCometProcessRunning()`, `killComet()`, `startCometProcess()`, `isWindows()`, `isMac()`, `isWSL()`. - -**Issues:** -1. **`getCometPath()` uses `execSync` for file existence checks on macOS** (line 52). Should use `statSync`/`existsSync` for efficiency and correctness. -2. **`killComet()` kills ALL Comet processes** via `pkill -f 'Comet.app'`, not just the one launched by Asteria. No PID tracking exists. -3. **`startCometProcess()` spawns a detached child** (lines 117-124). If the parent crashes, the Comet process is orphaned without cleanup. -4. **`httpGet()` implements manual timeout** with `setTimeout` + `destroy()`. Could use `AbortController` for cleaner async cancellation. - -**Tests:** `tests/unit/cdp/browser.test.ts` (239 lines) - good coverage, but some tests call real `httpbin.org` which may cause CI flakiness. - ---- - -#### `src/cdp/client.ts` (341 lines) - Severity: MEDIUM - -**Purpose:** Singleton CDP client wrapping `chrome-remote-interface`. Manages connections, operations queue, health checks, auto-reconnect, screenshots, and evaluation. - -**Key exports:** `CDPClient` class. - -**Issues:** -1. **Singleton pattern with `instance = undefined!`** (line 52) uses non-null assertion. Works but not elegant. `resetInstance()` exposed for testing is a code smell of test/implementation coupling. -2. **`evaluate()` casts response as `Promise`** (line 189) without runtime validation of the CRI response shape. Malformed CDP responses could produce unexpected behavior downstream. -3. **Operation queue uses promise-chain pattern** (lines 28-38). If `fn()` throws, the next queued operation still runs because `resolve()` is in the `finally` block. This is correct behavior but subtle. - -**Tests:** `tests/unit/cdp/client.test.ts` (572 lines) - comprehensive. Covers singleton, connect, disconnect, navigate, screenshot, evaluate, pressKey, isHealthy, operation queue serialization, reconnect race conditions, withAutoReconnect retry/throw behavior, ensureHealthyConnection, and error propagation. The largest test file. - ---- - -#### `src/cdp/connection.ts` (29 lines) - Severity: LOW - -**Purpose:** Connection error detection (pattern matching) and exponential backoff calculation. - -**Key exports:** `isConnectionError()`, `getBackoffDelay()`, `CDPConnectionState` interface. - -**Issues:** -1. **`isConnectionError` relies on string pattern matching** in error messages. Fragile if the underlying library changes its error message format. - -**Tests:** `tests/unit/cdp/connection.test.ts` (82 lines) - good coverage. - ---- - -#### `src/cdp/tabs.ts` (40 lines) - Severity: NONE - -**Purpose:** Categorizes browser tabs into roles (main, sidecar, agentBrowsing, overlay, others) based on URL patterns. - -**Key exports:** `categorizeTabs()`. - -**Issues:** None. Case-insensitive URL classification, clean logic. - -**Tests:** `tests/unit/cdp/tabs.test.ts` (66 lines) - covers all categories. - ---- - -### 2.2 UI Automation Layer - -#### `src/ui/status.ts` (60 lines) - Severity: HIGH - -**Purpose:** Builds JS script to detect agent status (idle/working/completed), extract steps, and capture response text. - -**Key exports:** `buildGetAgentStatusScript(selectors?)`. - -**Issues:** -1. **Status detection via regex on `body.innerText`** (`workingPatterns`, `stepPatterns`). Extremely fragile and locale-dependent. If Comet changes button text or adds languages, detection breaks completely. -2. **Response truncation at 8000 chars** (line 48) is hardcoded. Should be configurable. -3. **This is the single most fragile component** in the codebase. Comet UI changes will break this first. - -**Tests:** `tests/unit/ui/status.test.ts` (66 lines) - covers status fields, working patterns, step extraction, stop button detection, custom selectors, response truncation. - ---- - -#### `src/ui/navigation.ts` (87 lines) - Severity: HIGH - -**Purpose:** Builds JS scripts for submitting prompts, switching modes, detecting current mode, and starting new chats. - -**Key exports:** `buildSubmitPromptScript()`, `buildModeSwitchScript(mode)`, `buildNewChatScript()`, `buildGetCurrentModeScript()`. - -**Issues:** -1. **`buildModeSwitchScript` uses `setTimeout(tryClickMenuItem, 100)` in a polling loop inside an IIFE** (lines 43-67). Since this is evaluated via `Runtime.evaluate` with `awaitPromise: true`, the `setTimeout` callbacks will **never execute** within the evaluation context. The mode switch likely relies on the synchronous check at invocation time and **may not work reliably**. This is potentially a production bug. -2. Mode detection depends on button text matching, which is locale-dependent. - -**Tests:** `tests/unit/ui/navigation.test.ts` (106 lines) - covers submit, getCurrentMode, modeSwitch for all modes, injection safety, edge cases. However, tests validate the script string structure, not the runtime behavior of `setTimeout` within CDP evaluation. - ---- - -#### `src/ui/input.ts` (55 lines) - Severity: LOW - -**Purpose:** Builds JS scripts for typing prompts into contenteditable elements and textarea/input fields, with injection-safe embedding via `JSON.stringify`. - -**Key exports:** `buildTypePromptScript(prompt, selectors?)`, `buildFindInputScript(selectors?)`. - -**Issues:** -1. **Uses `document.execCommand('insertText', ...)`** which is deprecated but still necessary for contenteditable React inputs. Acceptable trade-off. - -**Good patterns:** U+2028/U+2029 escape handling (lines 8-9) shows attention to edge cases. - -**Tests:** `tests/unit/ui/input.test.ts` (46 lines) - covers quote escaping, newline escaping, execCommand usage, backtick injection, template literal injection, unicode line separator handling. - ---- - -#### `src/ui/extraction.ts` (65 lines) - Severity: MEDIUM - -**Purpose:** Builds JS IIFE scripts for extracting sources/citations and page content. - -**Key exports:** `buildExtractSourcesScript()`, `buildExtractPageContentScript(maxLength)`. - -**Issues:** -1. **`buildExtractPageContentScript` strips UI noise via regex** (line 60), which is fragile. The regex only handles text at the start of lines. - -**Tests:** `tests/unit/ui/extraction.test.ts` (90 lines) - covers structure, filtering, deduplication, edge cases. - ---- - -#### `src/ui/conversations.ts` (21 lines) - Severity: MEDIUM - -**Purpose:** Builds JS IIFE script to extract conversation links from the page. - -**Key exports:** `buildListConversationsScript()`. - -**Issues:** -1. **`getAttribute('href')` returns relative URLs** (e.g., `/search/abc123`). The `url` field in results contains relative paths, not absolute URLs. Consumers must reconstruct the full URL. - -**Tests:** `tests/unit/ui/conversations.test.ts` (56 lines) - covers IIFE structure, anchor selection, URL filtering, deduplication, title extraction. - ---- - -#### `src/ui/stop.ts` (15 lines) - Severity: LOW - -**Purpose:** Builds JS script to click the stop/cancel button. - -**Key exports:** `buildStopAgentScript()`. - -**Issues:** -1. **SVG rect heuristic** (line 10) for detecting stop buttons could match any button with an SVG rectangle child. - -**Tests:** `tests/unit/ui/stop.test.ts` (49 lines). - ---- - -#### `src/ui/selectors.ts` (23 lines) - Severity: LOW - -**Purpose:** Default selector constants used as fallback. - -**Key exports:** `SELECTORS`. - -**Issues:** -1. **Duplicates the same values as `v145Selectors`**. Maintenance burden if selectors need updating. The duplication is intentional (default/fallback) but creates a sync risk. - -**Tests:** `tests/unit/ui/selectors.test.ts` (28 lines). - ---- - -### 2.3 Server and Configuration - -#### `src/server.ts` (674 lines) - Severity: HIGH (most critical file) - -**Purpose:** Core MCP server. Registers 12 tool handlers with `McpServer`, connects via stdio transport, handles all tool dispatch logic. - -**Key exports:** `startServer()`, `toolDefinitions`, `ToolDef`. - -**Issues:** -1. **Largest file in the project** (674 lines) with heterogeneous logic: tool definitions + handler dispatch + polling loop + response extraction. Should be split into separate modules. -2. **Line 52: Zod internal API access** - `(current as any)._def.innerType` to unwrap optional schemas. Acknowledged with biome-ignore. -3. **Line 225-234: `parseAgentStatus` does `as RawAgentStatus` cast** without validation. Malformed JSON from the browser could produce unexpected shapes. -4. **Lines 373-437: Polling loop in `comet_ask`** is the most logic-dense section of the codebase with 6 state variables (`sawNewResponse`, `stallCount`, `timedOut`, `collectedSteps`, `lastResponse`). Should be extracted into a dedicated polling state machine. -5. **Line 627: `comet_open_conversation` SSRF check** - `parsed.hostname.endsWith('perplexity.ai')` would also match `evilperplexity.ai`. However, this is mitigated since only subdomains of `perplexity.ai` are valid Comet URLs. - -**Tests:** `tests/unit/tools/handlers.test.ts` (47 lines), `tests/unit/tools/registry.test.ts` (60 lines), `tests/integration/tools/core-tools.test.ts` (408 lines), `tests/integration/tools/extraction-tools.test.ts` (202 lines), `tests/integration/tools/ui-tools.test.ts` (213 lines). - ---- - -#### `src/config.ts` (114 lines) - Severity: MEDIUM - -**Purpose:** Loads configuration from defaults < config file < environment variables < programmatic overrides. - -**Key exports:** `loadConfig(overrides?)`. - -**Issues:** -1. **Line 39: `(result as any)[key]`** for dynamic key assignment. Could be typed more precisely. -2. **Lines 74, 78: unchecked type assertions** `as CometConfig['logLevel']` and `as CometConfig['screenshotFormat']` on env var values. Invalid values like `ASTERIA_LOG_LEVEL=verbose` pass through without validation. -3. **No runtime validation of config values** (e.g., port range 1-65535, timeout positivity, poll interval minimum). - -**Tests:** `tests/unit/config.test.ts` (127 lines) - excellent coverage including defaults, env var overrides, programmatic overrides, config file loading, malformed JSON fallback, invalid number fallback. - ---- - -#### `src/cli.ts` (278 lines) - Severity: MEDIUM - -**Purpose:** CLI binary supporting `start`, `call`, `detect`, `--version`, `--help` commands. - -**Issues:** -1. **Line 158: `process.execPath` used to spawn child** - `asteria call` works only when run via `node dist/cli.js`, not as a globally installed binary. Should use `process.argv[1]` or resolve the script path differently. -2. **Line 197-198: screenshots saved to CWD** with timestamped name. No user control over output location, no path sanitization. -3. **Timeout of 180s** (line 234) is hardcoded. - -**Tests:** `tests/unit/cli.test.ts` (66 lines), `tests/unit/cli-run.test.ts` (324 lines). - ---- - -#### `src/index.ts` (7 lines) - Severity: LOW - -**Purpose:** Entry point. Imports and calls `startServer()`. - -**Issues:** -1. **`.catch((_err) => { process.exit(1) })`** silently swallows the error without logging. Makes debugging startup failures harder. - -**Tests:** None (trivial entry point). - ---- - -#### `src/errors.ts` (99 lines) - Severity: NONE - -**Purpose:** Custom error hierarchy with `AsteriaError` base + 8 subclasses + `toMcpError` converter. - -**Issues:** None. Well-designed error hierarchy with structured codes and context. Exemplary model. - -**Tests:** `tests/unit/errors.test.ts` (79 lines) - thorough coverage. - ---- - -#### `src/logger.ts` (39 lines) - Severity: NONE - -**Purpose:** Leveled logger that writes to stderr (avoids interfering with MCP stdio on stdout). - -**Issues:** None. - -**Tests:** `tests/unit/logger.test.ts` (55 lines). - ---- - -#### `src/snapshot.ts` (40 lines) - Severity: LOW - -**Purpose:** Debug utility that connects to Comet and dumps a DOM snapshot. - -**Issues:** -1. **`const _logger = createLogger('info')`** (line 6) creates a logger that is never used. - -**Tests:** `tests/unit/snapshot.test.ts` (28 lines). - ---- - -#### `src/version.ts` (26 lines) - Severity: MEDIUM - -**Purpose:** Detects Comet's Chrome version via `/json/version` and loads matching selector set. - -**Issues:** -1. **On fetch failure** (line 21), silently falls back to v145 selectors without logging a warning. Could hide connectivity issues. - -**Tests:** `tests/unit/version.test.ts` (44 lines). - ---- - -#### `src/prose-filter.ts` (66 lines) - Severity: LOW - -**Purpose:** Shared logic for finding and filtering "prose" content elements. - -**Issues:** -1. Large JS code string embedded in TypeScript. Harder to debug and test. -2. 100-character minimum and question-mark filtering heuristics are hardcoded. - -**Tests:** `tests/unit/prose-filter.test.ts` (86 lines). - ---- - -### 2.4 Selectors - -#### `src/selectors/index.ts` (15 lines) - Severity: MEDIUM - -**Issues:** Only v145 registered. All other versions fallback to v145, which breaks if Comet updates its DOM. - -#### `src/selectors/v145.ts` (22 lines) - Severity: LOW - -**Issues:** Selectors tightly coupled to Comet's internal DOM. Any Comet UI update could break them. - -#### `src/selectors/types.ts` (10 lines) - Severity: NONE - -Clean type definition. No issues. - ---- - -## 3. Cross-Cutting Analysis - -### 3.1 Dependencies - -| Dependency | Version | Status | -|---|---|---| -| `@modelcontextprotocol/sdk` | ^1.12.1 | Current, appropriate | -| `chrome-remote-interface` | ^0.33.2 | Mature, well-maintained | -| `zod` | ^3.25.76 | Current | -| `@biomejs/biome` | ^2.4.10 | Current | -| `@types/chrome-remote-interface` | ^0.33.0 | Matched to library | -| `@types/node` | ^20.11.0 | Slightly behind (22.x types exist) but functional | -| `@vitest/coverage-v8` | ^1.6.1 | Stable | -| `typescript` | ^5.4.0 | Current | -| `vitest` | ^1.6.0 | Current | - -**Assessment:** All dependencies are reasonable, up-to-date, and from well-known sources. No concerning or unnecessary dependencies. - ---- - -### 3.2 Type Safety - -**`any` type usages in `src/`:** -1. `src/config.ts:39` - `(result as any)[key]` - Dynamic key assignment, suppressed with biome-ignore. -2. `src/server.ts:52` - `(current as any)._def.innerType` - Zod internal API access, suppressed with biome-ignore. - -Both are acknowledged with `biome-ignore` comments. No unchecked `any` types elsewhere. - -**Unsafe casts:** -1. `src/server.ts:233` - `return raw as RawAgentStatus` - No validation after `JSON.parse`. -2. `src/cdp/client.ts:189` - `as Promise` - Unvalidated CRI response. -3. `src/config.ts:74,78` - `as CometConfig['logLevel']` / `as CometConfig['screenshotFormat']` - Unvalidated env var assertions. - ---- - -### 3.3 Error Handling - -The codebase follows a consistent and well-structured error handling pattern: -1. Custom error hierarchy (8 domain-specific subclasses) with structured codes and context. -2. `toMcpError()` converts all errors into uniform MCP error responses. -3. Every tool handler wraps its body in `try/catch(err) { return toMcpError(err) }`. -4. Auto-reconnect with exponential backoff for transient CDP failures. -5. Best-effort operations (like `closeExtraTabs`) catch and log errors. - ---- - -### 3.4 Test Coverage - -| Source File | Unit Tests | Integration Tests | -|---|---|---| -| `src/index.ts` | No | No | -| `src/cli.ts` | Yes (2 files) | No | -| `src/config.ts` | Yes | Indirect (via harness) | -| `src/errors.ts` | Yes | No | -| `src/logger.ts` | Yes | No | -| `src/prose-filter.ts` | Yes | No | -| `src/server.ts` | Yes (2 files) | Yes (3 files) | -| `src/snapshot.ts` | Yes | No | -| `src/types.ts` | Yes | No | -| `src/version.ts` | Yes | Yes | -| `src/cdp/browser.ts` | Yes | No | -| `src/cdp/client.ts` | Yes | Indirect (via harness) | -| `src/cdp/connection.ts` | Yes | No | -| `src/cdp/tabs.ts` | Yes | No | -| `src/selectors/index.ts` | Yes | Yes | -| `src/selectors/types.ts` | No | No | -| `src/selectors/v145.ts` | Yes | No | -| `src/ui/*` (6 files) | All covered | Partial (via extraction-tools, ui-tools) | - -**Coverage gaps:** -- `src/index.ts` and `src/selectors/types.ts` lack tests (both trivial files). -- `browser.test.ts` has tests calling real `httpbin.org` - potential CI flakiness. -- `handlers.test.ts` and `registry.test.ts` partially overlap. -- `version-detect.test.ts` (integration) duplicates `version.test.ts` (unit). - ---- - -### 3.5 Security - -| Area | Severity | Detail | -|---|---|---| -| SSRF Prevention | GOOD | `https:` + `perplexity.ai` hostname validation, tests for domain suffix attack | -| JS Injection | GOOD | `JSON.stringify()` for prompt embedding, tests for backtick/template literal injection | -| CDP Port Binding | LOW | Hardcoded `127.0.0.1`, but any local process can connect to port 9222 | -| Process Management | MEDIUM | `killComet()` kills all Comet processes, not just Asteria's | -| API Key Exposure | MITIGATED | `.mcp.json` has keys but is in `.gitignore` | -| Config Validation | MEDIUM | No runtime validation of env var values | -| Domain Suffix SSRF | MEDIUM | `evilperplexity.ai` passes `endsWith('perplexity.ai')` check | - ---- - -### 3.6 Documentation - -| Document | Status | Gaps | -|---|---|---| -| README.md | Complete (9,575 chars) | None | -| architecture.md | Present (36 lines) | Could be more detailed | -| configuration.md | Incomplete | Missing `ASTERIA_USER_DATA_DIR`, `ASTERIA_WINDOW_WIDTH/HEIGHT` | -| contributing.md | Present, concise | Adequate | -| comet-compatibility.md | Present | Only v145 documented | -| uat-checklist.md | Present | **No items executed yet** | -| JSDoc inline | Minimal | Most functions lack formal doc comments | - ---- - -## 4. Priority Actions - -### P0 - Critical (blocks production) - -1. **Verify and fix mode switch** - `setTimeout` in `buildModeSwitchScript` may not execute within `Runtime.evaluate` context. Test manually and fix if broken. -2. **Execute UAT checklist** - No manual testing has been done yet. All items in `docs/uat-checklist.md` are unchecked. -3. **Fix domain suffix SSRF** - `endsWith('perplexity.ai')` matches `evilperplexity.ai`. Use proper suffix check (`=== 'perplexity.ai' || endsWith('.perplexity.ai')`). - -### P1 - Important (degrades maintainability) - -4. Extract polling loop from `server.ts` into a dedicated state machine (~60 lines of complex logic). -5. Add PID tracking for `killComet()` and cleanup handler for parent crash. -6. Add runtime validation for config values (port range, timeout positivity). -7. Update `docs/configuration.md` with missing env vars. -8. Replace `httpbin.org` calls in tests with a local mock server. -9. Fix `process.execPath` in CLI to work as global binary. - -### P2 - Improvement (nice-to-have) - -10. Add JSDoc to public functions. -11. Remove `SELECTORS` vs `v145Selectors` duplication. -12. Add support for future Comet versions in selector registry. -13. Log startup errors in `src/index.ts` catch block. -14. Remove unused `_logger` in `src/snapshot.ts`. -15. Log a warning when version detection falls back to default selectors. - ---- - -## 5. Overall Assessment - -**Asteria is a well-structured project with strong fundamentals.** The 3-layer architecture is clean, error handling is exemplary, test coverage is strong, and security hardening is serious. The main risks are: - -1. **Fragility to Comet UI changes** - The entire UI automation layer (selectors, status detection, mode switching) is tightly coupled to Comet's DOM structure. A Comet update could break multiple tools simultaneously. -2. **No manual testing completed** - The UAT checklist exists but hasn't been executed. -3. **One potential production bug** - The mode switch `setTimeout` pattern may not work in CDP evaluation context. - -For a v0.1.0 pre-release, the codebase quality is above average. The priority should be: execute UAT -> fix P0 bugs -> npm publish. diff --git a/docs/plans/2026-04-09-uat-fixes-design.md b/docs/plans/2026-04-09-uat-fixes-design.md deleted file mode 100644 index 04a967f..0000000 --- a/docs/plans/2026-04-09-uat-fixes-design.md +++ /dev/null @@ -1,122 +0,0 @@ -# UAT Fixes Design — Sources, Conversations, Mode Switch, Auto-Connect - -**Date:** 2026-04-09 -**Scope:** Fix all 4 issues found during UAT testing against Comet Chrome/145.2.7632.4587 - ---- - -## Issue 1: `comet_get_sources` returns empty - -### Root Cause - -Sources in Comet v145 are `` elements, not `` anchors inside `[role="tabpanel"]`. Some citation spans have a parent `` with href, others don't. The current `buildExtractSourcesScript` only finds anchor links inside tabpanels and misses the citation structure entirely. - -### DOM Evidence - -```html - - - openai - - - -openai\n+3 - - - -``` - -### Design - -Add a second extraction strategy to `buildExtractSourcesScript`: - -1. **Strategy A (existing):** Find `a[href]` inside `[role="tabpanel"]`, filter internal links. -2. **Strategy B (new):** Find `[class*="citation"]` elements, extract URL from closest `` parent, extract title from innerText. - -Merge results from both strategies, deduplicate by URL. - ---- - -## Issue 2: `comet_list_conversations` returns empty - -### Root Cause - -The script filters anchors by `href` containing `/search/` or `/copilot/`. Comet v145 also uses `/computer/tasks/` for Computer mode conversations. This pattern is missing from the filter. - -### DOM Evidence - -Conversation links found on the page: -- `https://www.perplexity.ai/search/deep-research-test-...` (search mode) -- `https://www.perplexity.ai/computer/tasks/most-populated-cities-...` (computer mode) -- `https://www.perplexity.ai/search/what-is-the-capital-...` (search mode) - -Also: conversation innerText often duplicates the title (e.g., "TitleTitle") because Comet renders both a heading and a subtitle. - -### Design - -1. Add `/computer/tasks/` as a third accepted URL pattern in `buildListConversationsScript`. -2. Add deduplication of the title text — if the first half of the string equals the second half, use only the first half. - ---- - -## Issue 3: Mode switch typeahead timing - -### Root Cause - -After typing `/` into the input field, Comet's typeahead menu is not immediately rendered in the DOM. The current retry (5 attempts x 200ms = 1s) is insufficient. The menu may take 1-2 seconds to appear. - -Additionally, the `/` character injection via `input.textContent = '/'` may not reliably trigger Comet's React event handlers on first attempt. - -### Design - -1. Increase retry to 10 attempts x 300ms = 3 seconds total. -2. Before each retry, re-dispatch the `/` input event to ensure Comet received the keystroke. -3. Add a small initial delay (100ms) after the first `/` before starting retries. - ---- - -## Issue 4: Auto-connect for stateless CLI calls - -### Root Cause - -Each `asteria call ` spawns a new process with a fresh `CDPClient` singleton. Tools like `comet_open_conversation` call `client.navigate()` which requires an active connection. Only `comet_connect` establishes the connection — other tools assume it's already done. - -In persistent MCP sessions (Claude Code, Cursor), `comet_connect` is called once and the connection persists. But in CLI `call` mode, every invocation is independent. - -### Design - -Add an `ensureConnected()` helper in `server.ts`: - -```typescript -async function ensureConnected(): Promise { - if (client.state.targetId) return // Already connected - await client.launchOrConnect() - await client.closeExtraTabs() - const { chromeMajor, selectors } = await detectCometVersion(config.port) - activeSelectors = selectors - logger.info(`Auto-connected to Comet Chrome/${chromeMajor}`) -} -``` - -Call `await ensureConnected()` at the start of every tool handler that needs a connection. This is a no-op when already connected (MCP sessions) and auto-connects when needed (CLI calls). - -**Handlers that need `ensureConnected()`:** All 12 tools except `comet_connect` (which does the connection itself). - ---- - -## Testing Strategy - -Each fix follows TDD: -1. Write/update failing test first -2. Implement fix -3. Verify all tests pass -4. Manual UAT verification against real Comet - ---- - -## Priority - -1. **Issue 4 (auto-connect)** — Enables proper testing of all other tools via CLI -2. **Issue 1 (sources)** — Most visible user-facing bug -3. **Issue 2 (conversations)** — Simple selector fix -4. **Issue 3 (mode switch)** — Timing tuning, partially works already diff --git a/docs/plans/2026-04-09-uat-fixes-plan.md b/docs/plans/2026-04-09-uat-fixes-plan.md deleted file mode 100644 index 049b904..0000000 --- a/docs/plans/2026-04-09-uat-fixes-plan.md +++ /dev/null @@ -1,363 +0,0 @@ -# UAT Fixes Implementation Plan - -> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. - -**Goal:** Fix 4 UAT issues: sources extraction, conversation listing, mode switch timing, and auto-connect for stateless CLI calls. - -**Architecture:** 4 independent tasks in priority order. Task 1 (auto-connect) enables testing of the others via CLI. Tasks 2-4 are selector/logic fixes in the UI automation layer. - -**Tech Stack:** TypeScript, Vitest, Chrome DevTools Protocol, Biome - ---- - -### Task 1: Add `ensureConnected()` auto-connect helper [Issue 4] - -**Files:** -- Modify: `src/server.ts` (add helper + add to 11 tool handlers) -- Test: `tests/integration/tools/ui-tools.test.ts` (add test) -- Reference: `tests/integration/tools/harness.ts` (mock state setup) - -**Step 1: Write the failing test** - -Add to `tests/integration/tools/ui-tools.test.ts`: - -```typescript -describe('auto-connect', () => { - it('auto-connects when no targetId set for comet_list_tabs', async () => { - // Simulate fresh client with no connection - mocks.state.targetId = null - mocks.launchOrConnect.mockResolvedValue('target-1') - mocks.closeExtraTabs.mockResolvedValue(undefined) - mocks.listTabsCategorized.mockResolvedValue({ - main: [{ id: 'target-1', url: 'https://www.perplexity.ai', type: 'page', title: 'Perplexity' }], - sidecar: [], - agentBrowsing: [], - overlay: [], - others: [], - }) - - const handler = getHandler('comet_list_tabs') - const result = await handler({}) - - // Should have auto-connected - expect(mocks.launchOrConnect).toHaveBeenCalled() - expect(result.content[0].text).toContain('Main') - }) -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/integration/tools/ui-tools.test.ts -t "auto-connects"` -Expected: FAIL — `launchOrConnect` not called (no auto-connect logic). - -**Step 3: Write minimal implementation** - -In `src/server.ts`, add this helper after the `activeSelectors` declaration (around line 34): - -```typescript -/** Ensure the client is connected before using tools. Auto-connects if needed. */ -async function ensureConnected(): Promise { - if (client.state.targetId) return - logger.info('Auto-connecting to Comet...') - await client.launchOrConnect() - await client.closeExtraTabs() - try { - const { chromeMajor, selectors } = await detectCometVersion(config.port) - activeSelectors = selectors - logger.info(`Auto-connected to Comet Chrome/${chromeMajor}`) - } catch { - // Version detection failure is non-fatal - } -} -``` - -Then add `await ensureConnected()` at the start of each tool handler that needs a connection. The handlers that need it are all EXCEPT `comet_connect` (which does its own connection). Add `await ensureConnected()` as the first line inside each `try` block of these 11 handlers: - -- `comet_ask` (line ~324) -- `comet_poll` (line ~446) -- `comet_stop` (line ~462) -- `comet_screenshot` (line ~483) -- `comet_mode` (line ~502) -- `comet_list_tabs` (line ~525) -- `comet_switch_tab` (line ~539) -- `comet_get_sources` (line ~568) -- `comet_list_conversations` (line ~592) -- `comet_open_conversation` (line ~619) -- `comet_get_page_content` (line ~647) - -For each handler, add `await ensureConnected()` as the first line after `try {`. For example, `comet_poll`: - -```typescript -async () => { - try { - await ensureConnected() - const raw = await client.safeEvaluate(buildGetAgentStatusScript(activeSelectors)) - ... -``` - -**Step 4: Run tests to verify they pass** - -Run: `npx vitest run tests/integration/tools/` -Expected: ALL PASS - -**Step 5: Commit** - -```bash -git add src/server.ts tests/integration/tools/ui-tools.test.ts -git commit -m "feat: add ensureConnected auto-connect for stateless CLI tool calls" -``` - ---- - -### Task 2: Fix sources extraction — add citation element strategy [Issue 1] - -**Files:** -- Modify: `src/ui/extraction.ts` -- Test: `tests/unit/ui/extraction.test.ts` - -**Step 1: Write the failing test** - -Add to `tests/unit/ui/extraction.test.ts`: - -```typescript -describe('citation extraction strategy', () => { - it('includes citation element strategy', () => { - const s = buildExtractSourcesScript() - expect(s).toContain('citation') - }) - - it('looks for citation class elements', () => { - const s = buildExtractSourcesScript() - expect(s).toContain('[class*="citation"]') - }) - - it('extracts URL from closest anchor parent of citation', () => { - const s = buildExtractSourcesScript() - expect(s).toContain('closest') - expect(s).toContain('a') - }) -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/ui/extraction.test.ts -t "citation"` -Expected: FAIL — no citation strategy in current code. - -**Step 3: Write minimal implementation** - -Replace `buildExtractSourcesScript` in `src/ui/extraction.ts`: - -```typescript -export function buildExtractSourcesScript(): string { - return `(function() { - function isInternalLink(url) { - if (!url) return true; - try { - var hostname = new URL(url).hostname; - // Domain check must match src/utils.ts isPerplexityDomain() - return hostname === 'perplexity.ai' || hostname.endsWith('.perplexity.ai'); - } catch (e) { - return true; - } - } - - function extractDomain(url) { - try { - return new URL(url).hostname; - } catch (e) { - return ''; - } - } - - var sources = []; - var seenUrls = {}; - - // Strategy A: Find links inside tabpanel elements - var tabpanels = document.querySelectorAll('[role="tabpanel"]'); - for (var t = 0; t < tabpanels.length; t++) { - var anchors = tabpanels[t].querySelectorAll('a[href]'); - for (var i = 0; i < anchors.length; i++) { - var a = anchors[i]; - var href = a.href; - if (!href || seenUrls[href]) continue; - if (href.indexOf('javascript:') === 0) continue; - if (href.indexOf('#') === href.length - 1) continue; - if (isInternalLink(href)) continue; - seenUrls[href] = true; - sources.push({ url: href, title: (a.innerText || '').trim() || extractDomain(href) || href }); - } - } - - // Strategy B: Find citation elements (Comet v145 source format) - var citations = document.querySelectorAll('[class*="citation"]'); - for (var c = 0; c < citations.length; c++) { - var el = citations[c]; - if (el.className.indexOf('citation-nbsp') !== -1) continue; - var anchor = el.closest('a') || el.querySelector('a'); - if (!anchor) continue; - var href2 = anchor.href; - if (!href2 || seenUrls[href2]) continue; - if (isInternalLink(href2)) continue; - seenUrls[href2] = true; - var text2 = (el.innerText || '').trim().split('\\n')[0].trim(); - sources.push({ url: href2, title: text2 || extractDomain(href2) || href2 }); - } - - return JSON.stringify(sources); - })()` -} -``` - -**Step 4: Run tests to verify they pass** - -Run: `npx vitest run tests/unit/ui/extraction.test.ts` -Expected: ALL PASS - -**Step 5: Commit** - -```bash -git add src/ui/extraction.ts tests/unit/ui/extraction.test.ts -git commit -m "fix: add citation element extraction strategy for Comet v145 sources" -``` - ---- - -### Task 3: Fix conversation listing — add `/computer/tasks/` pattern [Issue 2] - -**Files:** -- Modify: `src/ui/conversations.ts` -- Test: `tests/unit/ui/conversations.test.ts` - -**Step 1: Write the failing test** - -Add to `tests/unit/ui/conversations.test.ts`: - -```typescript -it('filters for /computer/tasks/ URLs', () => { - const s = buildListConversationsScript() - expect(s).toContain('/computer/tasks/') -}) - -it('deduplicates doubled title text', () => { - const s = buildListConversationsScript() - expect(s).toContain('dedupeTitle') -}) -``` - -**Step 2: Run test to verify it fails** - -Run: `npx vitest run tests/unit/ui/conversations.test.ts -t "computer"` -Expected: FAIL — no `/computer/tasks/` in current script. - -**Step 3: Write minimal implementation** - -Replace `buildListConversationsScript` in `src/ui/conversations.ts`: - -```typescript -export function buildListConversationsScript(): string { - return `(function() { - function dedupeTitle(text) { - if (!text) return ''; - var half = Math.floor(text.length / 2); - if (half > 0 && text.substring(0, half) === text.substring(half)) return text.substring(0, half); - return text; - } - - var links = document.querySelectorAll('a[href]'); - var conversations = []; - var seen = {}; - for (var i = 0; i < links.length; i++) { - var href = links[i].getAttribute('href') || ''; - if (href.indexOf('/search/') !== -1 || href.indexOf('/copilot/') !== -1 || href.indexOf('/computer/tasks/') !== -1) { - if (!seen[href]) { - seen[href] = true; - conversations.push({ title: dedupeTitle((links[i].innerText || '').trim()), url: href }); - } - } - } - return JSON.stringify(conversations); - })()` -} -``` - -**Step 4: Run tests to verify they pass** - -Run: `npx vitest run tests/unit/ui/conversations.test.ts` -Expected: ALL PASS - -**Step 5: Commit** - -```bash -git add src/ui/conversations.ts tests/unit/ui/conversations.test.ts -git commit -m "fix: add /computer/tasks/ pattern and title dedup to conversation listing" -``` - ---- - -### Task 4: Fix mode switch retry timing [Issue 3] - -**Files:** -- Modify: `src/server.ts` (the comet_mode handler retry logic) - -**Step 1: Verify current retry implementation** - -The retry logic was added in a previous follow-up. Read the current `comet_mode` handler in `src/server.ts` to find the retry loop. It should look like: - -```typescript -const MAX_MODE_RETRIES = 5 -for (let attempt = 0; attempt < MAX_MODE_RETRIES; attempt++) { - const raw = await client.safeEvaluate(buildModeSwitchScript(mode)) - const result = extractValue(raw) - if (result !== 'no_listbox_found') { - return textResult(`Mode switch result: ${result}`) - } - await sleep(200) -} -``` - -**Step 2: Update retry parameters and add re-injection** - -Change the retry loop to: - -```typescript -const MAX_MODE_RETRIES = 10 -for (let attempt = 0; attempt < MAX_MODE_RETRIES; attempt++) { - const raw = await client.safeEvaluate(buildModeSwitchScript(mode)) - const result = extractValue(raw) - if (result !== 'no_listbox_found') { - return textResult(`Mode switch result: ${result}`) - } - // Wait longer between retries to give Comet time to render - await sleep(300) -} -return textResult('Mode switch failed: typeahead menu did not appear after retries') -``` - -Also update the integration test in `tests/integration/tools/ui-tools.test.ts`: - -Find the test `'retries mode switch when listbox not immediately available'` and update `toHaveBeenCalledTimes(3)` to account for the new 10-retry max if needed (the test should still work since it resolves on the 3rd call). - -**Step 3: Run tests** - -Run: `npx vitest run tests/integration/tools/ui-tools.test.ts -t "mode"` -Expected: ALL PASS - -**Step 4: Commit** - -```bash -git add src/server.ts tests/integration/tools/ui-tools.test.ts -git commit -m "fix: increase mode switch retry to 10x300ms for slower Comet rendering" -``` - ---- - -## Summary - -| Task | Issue | Files | Risk | -|------|-------|-------|------| -| 1. Auto-connect | Issue 4 | server.ts, ui-tools test | MEDIUM — touches all handlers | -| 2. Sources fix | Issue 1 | extraction.ts, test | LOW — additive strategy | -| 3. Conversations fix | Issue 2 | conversations.ts, test | LOW — add pattern + dedup | -| 4. Mode switch timing | Issue 3 | server.ts, test | LOW — parameter change | diff --git a/docs/plans/2026-04-10-docs-restructure-plan.md b/docs/plans/2026-04-10-docs-restructure-plan.md deleted file mode 100644 index cdc63cb..0000000 --- a/docs/plans/2026-04-10-docs-restructure-plan.md +++ /dev/null @@ -1,609 +0,0 @@ -# Documentation Restructure Implementation Plan - -> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. - -**Goal:** Restructure Asteria's documentation into enterprise-grade English docs for AI agent developers — README hub + 7 focused doc files. - -**Architecture:** README.md is the landing page (install + quickstart + overview). Three new docs (`tools.md`, `integration.md`, `troubleshooting.md`) cover deep reference material. Four existing docs (`architecture.md`, `configuration.md`, `contributing.md`, `comet-compatibility.md`) get updated and expanded. - -**Tech Stack:** Markdown, GitHub-flavored. No build tools. - ---- - -### Task 1: Rewrite README.md - -**Files:** -- Modify: `README.md` - -**Step 1: Rewrite README.md** - -Replace the entire README with the following content. Key changes: -- Remove "GIF coming soon" placeholder and demo section -- Update 12 → 13 tools everywhere -- Add `comet_wait` to tools table -- Remove "npm publish" from roadmap (done) -- Remove "active development / install from source" note — just `npm install -g` -- Update architecture diagram to include `comet_wait` -- Add "Guides" section linking to docs -- Tighten prose for enterprise tone -- Keep banner, badges, license, sponsor sections - -The new README structure: -1. Banner + badges (keep existing) -2. One-liner pitch -3. How it works mermaid diagram (update T13 label to comet_wait) -4. Installation (just npm install -g) -5. Quick Start (3 steps: MCP config, Comet, use) -6. Tools overview table (13 tools, link to docs/tools.md) -7. CLI reference -8. Configuration link -9. Guides section (links to integration, troubleshooting, architecture) -10. Roadmap (remove npm publish, keep rest) -11. Contributing link -12. Sponsor + license - -Write the complete file. Do not use placeholders — every section has real content. - -**Step 2: Verify markdown renders correctly** - -Run: `cat README.md | head -20` -Expected: Clean markdown, no broken formatting - -**Step 3: Commit** - -```bash -git add README.md -git commit -m "docs: rewrite README for v1.1.0 — 13 tools, professional tone" -``` - ---- - -### Task 2: Create docs/tools.md - -**Files:** -- Create: `docs/tools.md` - -**Step 1: Write docs/tools.md** - -Full tool reference for all 13 tools. Each tool entry follows this structure: - -```markdown -## comet_ask - -Send a prompt to Perplexity Comet and poll until the agent responds or times out. - -### Parameters - -| Name | Type | Required | Default | Description | -|------|------|----------|---------|-------------| -| prompt | string | Yes | — | The question or instruction to send | -| newChat | boolean | No | false | Start a fresh chat before sending | -| timeout | number | No | 120000 | Maximum wait time in ms | - -### Response - -Returns the agent's response text. If the agent times out, returns a partial response with steps so far and a suggestion to use `comet_poll` or `comet_wait`. - -### Example - -asteria call comet_ask '{"prompt": "Summarize the latest AI research papers"}' - -### Notes - -- Non-blocking: returns partial results if timeout is reached -- Auto-connects if not already connected -- Use `comet_wait` after timeout to get the full response -``` - -Tool schemas from `src/server.ts:121-149`: - -1. **comet_connect** — params: `port` (number, optional). Auto-detects Comet, closes extra tabs, navigates to perplexity.ai. -2. **comet_ask** — params: `prompt` (string, required), `newChat` (boolean, optional), `timeout` (number, optional). Sends prompt, polls until response. -3. **comet_poll** — no params. Returns current agent status JSON. -4. **comet_wait** — params: `timeout` (number, optional, default 120000). Polls until agent finishes, returns full response. -5. **comet_stop** — no params. Clicks stop button. Retries up to 5 times. -6. **comet_screenshot** — params: `format` (enum: png|jpeg, optional, default png). Returns base64 image. -7. **comet_mode** — params: `mode` (enum: standard|deep-research|model-council|create|learn|review|computer, nullable, optional). Omit to get current mode. Icon-based, locale-independent. -8. **comet_list_tabs** — no params. Returns tabs categorized as Main, Sidecar, Agent Browsing, Overlay, Other. -9. **comet_switch_tab** — params: `tabId` (string, optional), `title` (string, optional). Switch by exact ID or title substring. -10. **comet_get_sources** — no params. Extracts citations. Auto-expands collapsed citations (e.g. `wsj+3`). -11. **comet_list_conversations** — no params. Returns recent conversations from sidebar. -12. **comet_open_conversation** — params: `url` (string, required). Must be `https://perplexity.ai/` URL. Validates domain (SSRF protection). -13. **comet_get_page_content** — params: `maxLength` (number, optional, default 10000). Extracts title + body text. - -After all 13 tools, add these sections: - -**Common Patterns:** -- Ask and wait: `comet_ask` with sufficient timeout -- Ask + poll: `comet_ask` with short timeout, then `comet_poll` in a loop -- Ask + wait: `comet_ask` (may timeout), then `comet_wait` to get full result -- Screenshot verification: `comet_screenshot` after `comet_ask` -- Source extraction: `comet_ask`, then `comet_get_sources` - -**Error Responses:** -All errors return `{ content: [{ type: "text", text: "[ERROR_CODE] message" }], isError: true }` - -**Connection Lifecycle:** -- All tools call `ensureConnected()` which auto-connects if no active session -- Auto-connect: launches Comet if not running, closes extra tabs, detects Chrome version - -**Step 2: Verify file** - -Run: `wc -l docs/tools.md` -Expected: ~300-400 lines of well-formatted markdown - -**Step 3: Commit** - -```bash -git add docs/tools.md -git commit -m "docs: add complete tool reference (13 tools)" -``` - ---- - -### Task 3: Create docs/integration.md - -**Files:** -- Create: `docs/integration.md` - -**Step 1: Write docs/integration.md** - -Structure: - -1. **Overview** — Asteria communicates via MCP stdio. The MCP client spawns `asteria start` as a subprocess and sends JSON-RPC 2.0 messages over stdin/stdout. - -2. **Prerequisites** checklist: - - Node.js >= 18 installed - - Perplexity Comet installed (`https://comet.perplexity.ai/`) - - Comet running with debug port accessible - -3. **Claude Code** setup: - ```json - // ~/.claude/claude_desktop_config.json - { - "mcpServers": { - "asteria": { - "type": "stdio", - "command": "asteria", - "args": ["start"] - } - } - } - ``` - Verify: restart Claude Code, ask "What MCP tools are available?" - Example session: ask Claude to "Search Perplexity for the latest AI news" - -4. **Cursor** setup: - ```json - // ~/.cursor/mcp.json - { - "mcpServers": { - "asteria": { - "type": "stdio", - "command": "asteria", - "args": ["start"] - } - } - } - ``` - Verify: open Cursor settings > MCP, check asteria appears - -5. **CLI Usage** — direct tool invocation: - ```bash - # Connect - asteria call comet_connect - - # Ask a question - asteria call comet_ask '{"prompt": "What is 2+2?"}' - - # Check status - asteria call comet_poll - - # Wait for completion - asteria call comet_wait - - # Screenshot - asteria call comet_screenshot '{"format": "jpeg"}' - - # Switch mode - asteria call comet_mode '{"mode": "deep-research"}' - - # Get sources - asteria call comet_get_sources - - # List tabs - asteria call comet_list_tabs - - # Get page content - asteria call comet_get_page_content '{"maxLength": 5000}' - - # List conversations - asteria call comet_list_conversations - - # Open a conversation - asteria call comet_open_conversation '{"url": "https://www.perplexity.ai/search/abc123"}' - ``` - -6. **Programmatic Usage** — Node.js integration: - ```javascript - import { spawn } from 'node:child_process' - - const child = spawn('asteria', ['start'], { stdio: ['pipe', 'pipe', 'pipe'] }) - - // Send initialize - child.stdin.write(JSON.stringify({ - jsonrpc: '2.0', id: 0, method: 'initialize', - params: { - protocolVersion: '2024-11-05', - capabilities: {}, - clientInfo: { name: 'my-app', version: '1.0.0' } - } - }) + '\n') - - // After initialized notification, call a tool - child.stdin.write(JSON.stringify({ - jsonrpc: '2.0', id: 1, method: 'tools/call', - params: { name: 'comet_ask', arguments: { prompt: 'What is 2+2?' } } - }) + '\n') - - child.stdout.on('data', (chunk) => { - // Parse JSON-RPC responses - for (const line of chunk.toString().split('\n')) { - if (!line.trim()) continue - const msg = JSON.parse(line) - if (msg.id === 1) console.log(msg.result) - } - }) - ``` - -**Step 2: Verify file** - -Run: `wc -l docs/integration.md` -Expected: ~150-200 lines - -**Step 3: Commit** - -```bash -git add docs/integration.md -git commit -m "docs: add integration guide for MCP clients and CLI" -``` - ---- - -### Task 4: Create docs/troubleshooting.md - -**Files:** -- Create: `docs/troubleshooting.md` - -**Step 1: Write docs/troubleshooting.md** - -Structure: - -1. **Connection Issues** - - **"Comet not found" / `COMET_NOT_FOUND`** - - Asteria cannot find the Comet executable - - Fix: Set `COMET_PATH` environment variable to the full path - - macOS: `/Applications/Perplexity\ Comet.app/Contents/MacOS/comet` - - Windows: `C:\Users\\AppData\Local\Perplexity\Comet\comet.exe` - - Verify: `asteria detect` - - **"Debug port not reachable" / `CDP_CONNECTION_FAILED`** - - Comet is running but the debug port (9222) is not accessible - - Fix: Launch Comet with `--remote-debugging-port=9222` - - Verify: `curl http://127.0.0.1:9222/json/version` - - **"Connection refused"** - - Comet is not running or using a different port - - Fix: Start Comet, or set `ASTERIA_PORT` to the correct port - - Verify: `asteria detect` shows "active" for debug port - -2. **Tool Issues** - - **"Agent is still working"** - - `comet_ask` returned before the agent finished - - Fix: Call `comet_wait` to poll until completion, or increase timeout - - Example: `asteria call comet_wait '{"timeout": 300000}'` - - **"Empty response"** - - Agent may not have finished processing - - Fix: Use `comet_poll` to check status, then `comet_wait` for full result - - Check: `comet_screenshot` to see current page state - - **"Mode switch failed: typeahead menu did not appear"** - - Mode switching only works on the home page or a new chat page - - Fix: The `newChat` parameter in `comet_ask` automatically navigates to a fresh page - - Note: Mode switch uses "/" slash command in Comet's input field - - **"No sources found"** - - Short queries may not generate citations - - Fix: Sources are only available after a query with citations. Try `comet_screenshot` to verify - - Collapsed citations (e.g. `wsj+3`) are auto-expanded - - **"No stop button found"** - - No agent is currently running - - Fix: This is expected if no query is active - -3. **Error Codes** - - | Code | Class | Meaning | - |------|-------|---------| - | `CDP_CONNECTION_FAILED` | CDPConnectionError | Cannot connect to Chrome DevTools Protocol | - | `COMET_NOT_FOUND` | CometNotFoundError | Comet executable not found on system | - | `COMET_LAUNCH_FAILED` | CometLaunchError | Comet executable found but failed to start | - | `TAB_NOT_FOUND` | TabNotFoundError | No tab matches the given ID or title | - | `TIMEOUT` | TimeoutError | Operation exceeded the configured timeout | - | `EVALUATION_FAILED` | EvaluationError | JavaScript evaluation failed in the browser | - | `SELECTOR_NOT_FOUND` | SelectorError | CSS selector did not match any element | - | `AGENT_ERROR` | AgentError | Agent-specific error during operation | - | `CONFIG_ERROR` | ConfigurationError | Invalid configuration value | - - All errors return: `{ content: [{ type: "text", text: "[CODE] message" }], isError: true }` - -4. **Debug Mode** - - Set `ASTERIA_LOG_LEVEL=debug` for verbose logging: - ```bash - ASTERIA_LOG_LEVEL=debug asteria start - ``` - - Key log messages: - - `Auto-connecting to Comet...` — triggered by ensureConnected() - - `Detected Comet Chrome/145` — version detection succeeded - - `Type result: ...` — prompt input evaluation - - `Submit result: ...` — prompt submission result - -5. **Health Check** - - ```bash - asteria detect - ``` - - This prints: - - Whether Comet process is running - - The Comet executable path - - Debug port status (active/not responding/not reachable) - - Browser version if connected - -**Step 2: Verify file** - -Run: `wc -l docs/troubleshooting.md` -Expected: ~150-200 lines - -**Step 3: Commit** - -```bash -git add docs/troubleshooting.md -git commit -m "docs: add troubleshooting guide with error codes" -``` - ---- - -### Task 5: Update docs/architecture.md - -**Files:** -- Modify: `docs/architecture.md` - -**Step 1: Rewrite docs/architecture.md** - -Update and expand from current brief version. New structure: - -1. **Overview** — Three-layer architecture diagram (keep current, update 12 → 13 tools): - ``` - MCP Tools (13 tools) - ↓ - UI Automation (selectors, input, status, extraction, navigation) - ↓ - CDP Transport (browser launch, connection, tabs, client) - ↓ - Perplexity Comet Browser (Chromium) - ``` - -2. **MCP Layer** — 13 tools grouped by function: - - Session: connect, poll, wait, stop - - Query: ask, mode - - Content: screenshot, get_sources, get_page_content - - Navigation: list_tabs, switch_tab, list_conversations, open_conversation - -3. **UI Automation Layer**: - - **Selector Strategy Pattern** (keep current description) - - **Typeahead Mode Detection** — reads SVG icon href from typeahead menu items with `.bg-subtle` class - - **Collapsed Citation Expansion** — clicks collapsed citations (`wsj+3` pattern) to reveal full URLs - - **Prompt Injection** — uses `execCommand('insertText')` for safe injection into Lexical editor via `JSON.stringify` - -4. **CDP Transport Layer**: - - **Version Detection** (keep current, expand) - - **Connection Management** (keep current) - - **Auto-Reconnect** — health checks via `1+1` evaluation, exponential backoff - -5. **Error Handling** (keep, update to 9 subclasses with code table) - -6. **Data Flow: comet_ask lifecycle**: - ``` - 1. ensureConnected() — auto-connect if needed - 2. Pre-send state capture (proseCount, lastProseText) - 3. Type prompt via execCommand('insertText') - 4. Submit via Enter key - 5. Polling loop (config.pollInterval) - - Check agent status (working/idle/completed) - - Detect new response via proseCount growth - - Stall detection (10 polls without growth → break) - 6. Response settle (5 x 1s polls to ensure complete text) - 7. Return response + steps - ``` - -**Step 2: Commit** - -```bash -git add docs/architecture.md -git commit -m "docs: expand architecture with v1.1 features" -``` - ---- - -### Task 6: Update docs/configuration.md - -**Files:** -- Modify: `docs/configuration.md` - -**Step 1: Update docs/configuration.md** - -The current file is already good. Changes needed: -- Verify `ASTERIA_RESPONSE_TIMEOUT` default is 180000 (confirmed in `src/config.ts` via `src/server.ts` which uses `config.responseTimeout` — check actual default) -- Add `ASTERIA_USER_DATA_DIR` description improvement -- Add `ASTERIA_WINDOW_WIDTH` and `ASTERIA_WINDOW_HEIGHT` descriptions -- Add a "Quick Reference" section with the most commonly used variables -- Add config file example with all variables shown - -**Step 2: Commit** - -```bash -git add docs/configuration.md -git commit -m "docs: update configuration reference" -``` - ---- - -### Task 7: Update docs/contributing.md - -**Files:** -- Modify: `docs/contributing.md` - -**Step 1: Expand docs/contributing.md** - -Current is minimal (35 lines). Expand to: - -1. **Setup** (keep current, add `npm run typecheck`) -2. **Development** (keep current commands) -3. **Project Structure**: - ``` - src/ - cli.ts — CLI entry point - server.ts — MCP server + tool handlers - config.ts — Configuration loading + validation - errors.ts — Error classes (9 subclasses) - cdp/ — Chrome DevTools Protocol layer - browser.ts — Browser detection + launching - client.ts — CDP client (singleton, auto-reconnect) - connection.ts — WebSocket connection - tabs.ts — Tab categorization - ui/ — UI automation scripts - input.ts — Prompt injection (safe, JSON.stringify-based) - navigation.ts — Mode switch + submit - status.ts — Agent status detection - extraction.ts — Source + page content extraction - conversations.ts — Conversation listing - selectors.ts — Selector strategy runner - stop.ts — Agent stop - selectors/ — Version-specific CSS selectors - v145.ts — Chrome 145 selector set - types.ts — SelectorSet interface - index.ts — Version registry - tests/ - unit/ — Unit tests (mocked CDP) - integration/ — Integration tests (real tool shapes) - ``` - -4. **Testing Strategy**: - - Unit tests: mock CDP client, test UI scripts, test config validation - - Integration tests: test tool shapes match server definitions - - Run: `npm test` (314 tests) - - Coverage: `npm run test:ci` - -5. **Code Style** (keep current) -6. **Adding Comet Versions** (keep current, expand step-by-step) -7. **Commit Messages** (keep current conventional commits) -8. **PR Process**: - - All PRs must pass CI (biome lint + vitest) - - Use conventional commit prefixes - - Update tests for new features - - Update docs for user-facing changes - -**Step 2: Commit** - -```bash -git add docs/contributing.md -git commit -m "docs: expand contributing guide with testing and project structure" -``` - ---- - -### Task 8: Update docs/comet-compatibility.md - -**Files:** -- Modify: `docs/comet-compatibility.md` - -**Step 1: Expand docs/comet-compatibility.md** - -Current is very thin (15 lines). Expand to: - -1. **Supported Versions** (keep current table) -2. **How Version Detection Works**: - - On `comet_connect`, Asteria queries `http://127.0.0.1:9222/json/version` - - Extracts Chrome major version (e.g. `145` from `Chrome/145.2.7632.4587`) - - Looks up matching selector set in `src/selectors/index.ts` - - Falls back to latest known set if version is unknown - -3. **Selector Strategy Pattern**: - - Selectors are ordered arrays of CSS selectors - - Each strategy tries selectors in order until one matches - - New selectors go at the front, old ones become fallbacks - - This makes the system resilient to UI changes - -4. **Adding a New Comet Version** (expand from current): - 1. Run `asteria detect` to confirm Comet version - 2. Use browser DevTools to inspect DOM elements - 3. Create `src/selectors/v{version}.ts` implementing `SelectorSet` interface - 4. Copy selectors from the latest version and update as needed - 5. Register in `src/selectors/index.ts` version map - 6. Add unit tests for new selectors - 7. Test with real Comet instance - 8. Update this table - -**Step 2: Commit** - -```bash -git add docs/comet-compatibility.md -git commit -m "docs: expand Comet compatibility guide with selector strategy" -``` - ---- - -### Task 9: Final verification - -**Step 1: Check all doc files exist and have content** - -```bash -ls -la README.md docs/tools.md docs/integration.md docs/troubleshooting.md docs/architecture.md docs/configuration.md docs/contributing.md docs/comet-compatibility.md -``` - -Expected: All 8 files exist with non-zero size. - -**Step 2: Check README links are valid** - -```bash -grep -o '\[.*\](docs/[^)]*)' README.md | while read link; do - file=$(echo "$link" | sed 's/.*](\(docs\/[^)]*\))/\1/') - if [ ! -f "$file" ]; then echo "BROKEN: $file"; fi -done -``` - -Expected: No broken links. - -**Step 3: Check tool count consistency** - -```bash -grep -c 'comet_' docs/tools.md -grep -c '13 tools' README.md docs/architecture.md -``` - -Expected: 13 tool sections, consistent count mentions. - -**Step 4: Run lint to make sure no files are malformed** - -```bash -npm run lint -npm test -``` - -Expected: All pass.