feat: v0.2.0 — output scanning, security model transparency, authz positioning

dormstern · claude · dormstern · commit fbe05283537f · 2026-02-20T17:31:27.000+02:00
Output scanner: post-execution deny-keyword detection flags suspicious
AnchorBrowser output in audit trail. Domain hints enrich audit events.
CLI refactored to shared audit logger with flagged event count.

README: added Security Model section (honest IAM analogy + what's enforced
vs not), Roadmap (v0.2/v1.0), authz framing. SECURITY.md: added Trust
Model with full threat matrix. Positioning cheat sheet for calls.

74 tests passing (was 61). Zero breaking changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/README.md b/README.md
@@ -205,8 +205,41 @@ flowchart LR
 ### Three layers of protection
 
 1. **Credential isolation** — your password stays in an isolated cloud browser. The agent gets a pre-authenticated session, never the credentials themselves.
-2. **Scoped boundaries** — the agent can only do what your policy allows. Read inbox? Yes. Delete contacts? Blocked before it starts.
-3. **Audit + kill switch** — every action logged (allowed and blocked). Budget enforced. Instant session destruction when you're done.
+2. **Scoped boundaries** — tasks that don't match your policy are blocked before they start. Deny-first pattern matching with Unicode bypass protection.
+3. **Audit + kill switch** — every action logged (allowed and blocked). Budget enforced. Session destruction when you're done.
+
+## Security Model
+
+In security terms, leashed is **application-layer authz for AI agents** — it governs what agents are *authorized to do*, not who they are or what credentials they hold. Think of it like an AWS IAM policy that checks what you *request*, not what the underlying service *executes*.
+
+### What leashed enforces today (v0.1)
+
+| Layer | Enforced | How |
+|-------|----------|-----|
+| Task gating | Yes | Deny-first glob pattern matching on task strings |
+| Time + action budgets | Yes | Configurable expiration and action limits |
+| Credential isolation | Yes | Passwords stay in AnchorBrowser's isolated session, never exposed to the agent |
+| Session destruction | Yes | `leash.yank()` destroys the cloud browser session |
+| Audit trail | Yes | Every task request (allowed + blocked) logged to JSONL |
+| Unicode bypass protection | Yes | Strips zero-width chars, combining marks, BiDi controls |
+
+### What leashed does NOT enforce (yet)
+
+| Layer | Status | Why |
+|-------|--------|-----|
+| Browser action validation | Roadmap (v1.0) | AnchorBrowser executes tasks autonomously — leashed has no visibility into actual browser clicks/navigation |
+| URL/domain restrictions | Roadmap (v1.0) | Requires AnchorBrowser session-level allowlists (not yet available in their SDK) |
+| Semantic equivalence | By design | `"forward email"` and `"send email to myself"` are different strings — glob patterns match literally, not semantically |
+
+### The honest version
+
+The policy engine checks the **task description string** — the human-readable instruction you pass to `leash.task()`. If the string matches a deny pattern, it never reaches the browser. If it's allowed, AnchorBrowser's AI executes it autonomously.
+
+This means: a well-intentioned agent that uses descriptive task names gets real governance. A deliberately adversarial agent that lies about what it's doing can bypass pattern matching — just like a developer with an IAM read-only key could name their Lambda "ReadOnlyFunction" while it actually writes to S3.
+
+**leashed is a seatbelt, not a cage.** It stops the 95% of accidents that come from misconfiguration, scope creep, and unintended actions. It does not stop a determined attacker with direct API access.
+
+For defense-in-depth, see [SECURITY.md](./SECURITY.md).
 
 ## CLI
 
@@ -218,6 +251,23 @@ npx leashed yank     # Kill switch — destroy session immediately
 
 [Full API reference & policy examples →](./docs/API.md)
 
+## Roadmap
+
+leashed is v0.1 — the governance primitives. Here's what's coming:
+
+### v0.2 — Output Scanning
+- Post-execution validation: scan AnchorBrowser output for policy-violating content
+- Domain hints in policy: `domains: [linkedin.com]` for documentation and audit enrichment
+- Structured output schemas for safer result parsing
+
+### v1.0 — Session-Level Enforcement (with AnchorBrowser)
+- URL allowlists at the session level — the browser itself refuses to navigate outside your policy
+- Browser action audit trail — not just task requests, but actual clicks, form fills, and navigation
+- Webhook callbacks for real-time policy violation alerts
+- This is the "IAM enforcement" layer — restrictions enforced by the infrastructure, not just the intent
+
+Want to help shape v1.0? [Open an issue](https://github.com/dormstern/leashed/issues) or reach out.
+
 ## Empowered by AnchorBrowser
 
 leashed runs on [AnchorBrowser](https://anchorbrowser.io) — ephemeral, hardened cloud browser sessions purpose-built for AI agents. Each session is isolated, auto-expires, and leaves no trace. [Cloudflare](https://cloudflare.com) verified bot partner. SOC2 Type 2 and ISO27001 certified. Trusted by [Google](https://google.com), [Coinbase](https://coinbase.com), and [Composio](https://composio.dev). Stealth proxies, CAPTCHA solving, anti-fingerprinting, and full session isolation out of the box.
diff --git a/SECURITY.md b/SECURITY.md
@@ -31,3 +31,29 @@ We will acknowledge receipt within 48 hours and aim to release a fix within 7 da
 - Glob pattern matching operates on the literal task string. It cannot detect semantic equivalents (e.g., "forward" vs "send").
 - The audit log is a local file. For tamper-proof logging, export to an immutable store (S3 with object lock, a database, or syslog).
 - The expire timer and kill switch are best-effort — an in-flight AnchorBrowser task may complete after the kill signal.
+
+## Trust Model
+
+leashed operates at the **intent layer** — it evaluates task description strings before forwarding to AnchorBrowser. It does NOT have visibility into browser-level execution.
+
+### Threat model
+
+| Threat | Mitigated? | Notes |
+|--------|-----------|-------|
+| Accidental scope creep (agent uses descriptive task names) | Yes | Policy gating blocks unintended categories |
+| Credential exposure to agent code | Yes | Credentials stay in AnchorBrowser's isolated session |
+| Unlimited session duration | Yes | Time-based expiration + action budgets |
+| Session left running after use | Yes | `leash.yank()` + CLI `npx leashed yank` |
+| Unicode obfuscation of task strings | Yes | sanitizeTask() strips invisible characters |
+| Deliberately adversarial task labeling | Partially | Pattern matching is literal, not semantic |
+| Direct AnchorBrowser API bypass | No | Agent with API key can skip leashed entirely |
+| In-browser action divergence | No | AnchorBrowser AI executes autonomously |
+| Prompt injection via web content | No | AnchorBrowser's responsibility — report to them |
+
+### Defense-in-depth recommendations
+
+1. Use `default: deny` and explicit allow lists
+2. Keep `max_actions` low — budget limits blast radius even if patterns are bypassed
+3. Use `expire_after` — session auto-kills limit exposure window
+4. Review audit logs regularly — `npx leashed audit` or export JSONL to your SIEM
+5. For production: complement leashed with AnchorBrowser's own session monitoring
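
Recommendation 1 and the "literal, not semantic" caveat can be illustrated with a small sketch. The real `evaluatePolicy` and `matchesPattern` in `src/policy.ts` are not shown in this diff and may differ (for example, they also sanitize Unicode), so `globMatches`, `evaluate`, and `PolicyLike` below are hypothetical stand-ins that only assume deny patterns are checked before allow patterns:

```typescript
type Decision = 'allow' | 'deny'

// Hypothetical reduced policy shape, not the package's LeashConfig.
interface PolicyLike {
  allow?: string[]
  deny?: string[]
  default: Decision
}

// Hypothetical glob matcher: "*" matches any run of characters,
// every other character is matched literally (case-insensitive).
function globMatches(pattern: string, task: string): boolean {
  const escaped = pattern
    .split('*')
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('[\\s\\S]*')
  return new RegExp(`^${escaped}$`, 'i').test(task)
}

// Deny-first evaluation: deny patterns win, then allow patterns, then the default.
function evaluate(policy: PolicyLike, task: string): Decision {
  if (policy.deny?.some(p => globMatches(p, task))) return 'deny'
  if (policy.allow?.some(p => globMatches(p, task))) return 'allow'
  return policy.default
}

const strict: PolicyLike = {
  allow: ['read*', 'check*'],
  deny: ['*forward*', '*delete*'],
  default: 'deny',
}
const permissive: PolicyLike = { ...strict, default: 'allow' }

console.log(evaluate(strict, 'read inbox'))                // allow
console.log(evaluate(strict, 'forward email to boss'))     // deny: matches a deny pattern
console.log(evaluate(strict, 'read inbox then delete it')) // deny: deny wins over allow
console.log(evaluate(strict, 'send email to myself'))      // deny: no allow match, default applies
console.log(evaluate(permissive, 'send email to myself'))  // allow: the rephrased task slips through
```

Under these assumptions, the rephrased task is only stopped when the default is `deny`, which is why an explicit allow list matters more than a long deny list.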
diff --git a/package.json b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "leashed",
-  "version": "0.1.2",
+  "version": "0.2.0",
   "description": "AI got hands. This is the leash. Policy, audit, kill switch for any AI agent with access to your accounts.",
   "type": "module",
   "main": "dist/index.js",
diff --git a/src/cli.ts b/src/cli.ts
@@ -2,7 +2,7 @@
 
 import { readFileSync, existsSync } from 'node:fs'
 import AnchorClient from 'anchorbrowser'
-import type { AuditEvent } from './types.js'
+import { createAuditLogger } from './audit.js'
 import { SESSION_FILE, DEFAULT_AUDIT_FILE } from './constants.js'
 
 const AUDIT_FILE = DEFAULT_AUDIT_FILE
@@ -56,22 +56,8 @@ async function killSession() {
   try { unlinkSync(SESSION_FILE) } catch {}
 }
 
-function readAuditEvents(): AuditEvent[] {
-  if (!existsSync(AUDIT_FILE)) return []
-  const events: AuditEvent[] = []
-  for (const line of readFileSync(AUDIT_FILE, 'utf-8').split('\n')) {
-    if (!line.trim()) continue
-    try {
-      events.push(JSON.parse(line) as AuditEvent)
-    } catch {
-      // Skip corrupt lines
-    }
-  }
-  return events
-}
-
 function printStatus() {
-  const events = readAuditEvents()
+  const events = createAuditLogger(AUDIT_FILE).export()
   if (events.length === 0) {
     console.log('No audit events found.')
     return
@@ -80,6 +66,7 @@ function printStatus() {
   const allowed = events.filter(e => e.action === 'allowed').length
   const blocked = events.filter(e => e.action === 'blocked').length
   const errors = events.filter(e => e.action === 'error').length
+  const flagged = events.filter(e => e.flags && e.flags.length > 0).length
   const killed = events.some(e => e.action === 'killed')
   const agent = events[0]?.agent ?? 'unknown'
 
@@ -88,11 +75,12 @@ function printStatus() {
   console.log(`Allowed: ${allowed}`)
   console.log(`Blocked: ${blocked}`)
   console.log(`Errors:  ${errors}`)
+  console.log(`Flagged: ${flagged}`)
   console.log(`Total:   ${events.length}`)
 }
 
 function printAudit() {
-  const events = readAuditEvents()
+  const events = createAuditLogger(AUDIT_FILE).export()
   if (events.length === 0) {
     console.log('No audit events found.')
     return
@@ -106,7 +94,8 @@ function printAudit() {
     const action = e.action.padEnd(9)
     const task = e.task.length > 40 ? e.task.slice(0, 37) + '...' : e.task
     const reason = e.reason ? ` (${e.reason})` : ''
-    console.log(`${time}  ${action} ${task}${reason}`)
+    const flagIndicator = e.flags && e.flags.length > 0 ? ' [!]' : ''
+    console.log(`${time}  ${action} ${task}${reason}${flagIndicator}`)
   }
 }
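
For readers who consume the JSONL audit file outside the CLI, the counts printed by `printStatus()` can be reproduced with a short sketch. It assumes one JSON object per line carrying the `flags` field added in this commit. The `parseAuditLines` and `summarize` helpers and the sample lines are hypothetical, and the real `createAuditLogger(...).export()` may parse the file differently:

```typescript
// Minimal subset of the AuditEvent shape from src/types.ts in this commit.
interface AuditEventLike {
  action: 'allowed' | 'blocked' | 'error' | 'killed'
  task: string
  flags?: { pattern: string; keyword: string; snippet: string }[]
}

// Hypothetical helper: parse JSONL text, skipping blank and corrupt lines.
function parseAuditLines(text: string): AuditEventLike[] {
  const events: AuditEventLike[] = []
  for (const line of text.split('\n')) {
    if (!line.trim()) continue
    try {
      events.push(JSON.parse(line) as AuditEventLike)
    } catch {
      // Skip corrupt lines rather than failing the whole report
    }
  }
  return events
}

// Hypothetical helper: applies the same filters as printStatus().
function summarize(events: AuditEventLike[]) {
  return {
    allowed: events.filter(e => e.action === 'allowed').length,
    blocked: events.filter(e => e.action === 'blocked').length,
    flagged: events.filter(e => e.flags && e.flags.length > 0).length,
    total: events.length,
  }
}

// Illustrative audit content: two allowed tasks (one flagged), one corrupt line, one blocked task.
const sample = [
  JSON.stringify({ action: 'allowed', task: 'read inbox' }),
  JSON.stringify({
    action: 'allowed',
    task: 'check contacts',
    flags: [{ pattern: '*export*', keyword: 'export', snippet: 'exported 500 contacts' }],
  }),
  '{ not valid json',
  JSON.stringify({ action: 'blocked', task: 'delete contacts' }),
].join('\n')

const summary = summarize(parseAuditLines(sample))
console.log(summary) // allowed: 2, blocked: 1, flagged: 1, total: 3
```

Since this commit attaches flags only to allowed events, the flagged count is a subset of the allowed count rather than a separate bucket, so the printed lines do not sum to the total.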
 
diff --git a/src/index.ts b/src/index.ts
@@ -2,13 +2,15 @@ import { writeFileSync, existsSync } from 'node:fs'
 import { loadPolicy, evaluatePolicy } from './policy.js'
 import { createAuditLogger, type AuditLogger } from './audit.js'
 import { createSessionManager, type SessionManager } from './session.js'
-import type { LeashConfig, LeashResult, AuditEvent, LeashStatus } from './types.js'
+import { scanOutput } from './output-scanner.js'
+import type { LeashConfig, LeashResult, AuditEvent, LeashStatus, OutputFlag } from './types.js'
 import { SESSION_FILE, DEFAULT_AUDIT_FILE } from './constants.js'
 
-export type { LeashConfig, LeashResult, AuditEvent, LeashStatus } from './types.js'
+export type { LeashConfig, LeashResult, AuditEvent, LeashStatus, OutputFlag } from './types.js'
 export { loadPolicy, evaluatePolicy, matchesPattern } from './policy.js'
 export { createAuditLogger } from './audit.js'
 export { createSessionManager } from './session.js'
+export { scanOutput } from './output-scanner.js'
 export { SESSION_FILE, DEFAULT_AUDIT_FILE } from './constants.js'
 
 export interface Leash {
@@ -98,6 +100,7 @@ export function createLeash(
           task: description,
           action: 'blocked',
           reason: expiredReason,
+          ...(config.domains?.length ? { domains: config.domains } : {}),
         }
         logger.log(event)
         blockedCount++
@@ -115,6 +118,7 @@ export function createLeash(
           task: description,
           action: 'blocked',
           reason: decision.reason,
+          ...(config.domains?.length ? { domains: config.domains } : {}),
         }
         logger.log(event)
         blockedCount++
@@ -136,17 +140,25 @@ export function createLeash(
           // Non-fatal: CLI yank won't work but task still succeeds
         }
 
+        // Post-execution output scan — detect deny keywords in output
+        const flags = scanOutput(output, config)
+
         const event: AuditEvent = {
           id: auditId,
           timestamp: new Date().toISOString(),
           agent: agentName,
           task: description,
           action: 'allowed',
           duration,
+          ...(flags.length > 0 ? { flags } : {}),
+          ...(config.domains?.length ? { domains: config.domains } : {}),
         }
         logger.log(event)
         allowedCount++
-        return { allowed: true, output, auditId }
+
+        const result: LeashResult = { allowed: true, output, auditId }
+        if (flags.length > 0) result.flags = flags
+        return result
       } catch (err) {
         const event: AuditEvent = {
           id: auditId,
diff --git a/src/output-scanner.ts b/src/output-scanner.ts
@@ -0,0 +1,36 @@
+import type { LeashConfig, OutputFlag } from './types.js'
+
+/**
+ * Scan AnchorBrowser output for keywords from deny patterns.
+ * Detection only — flags suspicious content for audit review.
+ */
+export function scanOutput(output: string, config: LeashConfig): OutputFlag[] {
+  if (!output || !config.deny?.length) return []
+
+  const flags: OutputFlag[] = []
+  const normalizedOutput = output.toLowerCase()
+
+  for (const pattern of config.deny) {
+    const keyword = extractKeyword(pattern)
+    if (!keyword) continue
+
+    const idx = normalizedOutput.indexOf(keyword)
+    if (idx !== -1) {
+      const start = Math.max(0, idx - 20)
+      const end = Math.min(output.length, idx + keyword.length + 20)
+      const snippet = output.slice(start, end).trim()
+      flags.push({ pattern, keyword, snippet })
+    }
+  }
+
+  return flags
+}
+
+/**
+ * Extract a matchable keyword from a glob pattern.
+ * "*send*" → "send", "delete*" → "delete", "*" → null
+ */
+function extractKeyword(pattern: string): string | null {
+  const keyword = pattern.replace(/\*/g, '').trim().toLowerCase()
+  return keyword.length >= 2 ? keyword : null
+}
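
The scanner above uses plain substring matching on keywords derived from deny patterns, so it can flag benign words that merely contain a keyword. The sketch below re-declares trimmed copies of the types and functions from this diff so that it runs standalone. The `DenyConfig` name and the sample strings are illustrative assumptions, not part of the package:

```typescript
// Trimmed, standalone copy of the OutputFlag shape in this diff.
interface OutputFlag {
  pattern: string
  keyword: string
  snippet: string
}

// Hypothetical stand-in for LeashConfig: only the field the scanner reads.
interface DenyConfig {
  deny?: string[]
}

// "*send*" -> "send", "delete*" -> "delete", "*" -> null (too short to match on)
function extractKeyword(pattern: string): string | null {
  const keyword = pattern.replace(/\*/g, '').trim().toLowerCase()
  return keyword.length >= 2 ? keyword : null
}

function scanOutput(output: string, config: DenyConfig): OutputFlag[] {
  if (!output || !config.deny?.length) return []
  const flags: OutputFlag[] = []
  const normalized = output.toLowerCase()
  for (const pattern of config.deny) {
    const keyword = extractKeyword(pattern)
    if (!keyword) continue
    const idx = normalized.indexOf(keyword)
    if (idx === -1) continue
    // Up to 20 characters of context on each side of the first match
    const start = Math.max(0, idx - 20)
    const end = Math.min(output.length, idx + keyword.length + 20)
    flags.push({ pattern, keyword, snippet: output.slice(start, end).trim() })
  }
  return flags
}

const config: DenyConfig = { deny: ['*send*', '*delete*', '*'] }

// A real hit: the output reports a deletion.
const hit = scanOutput('Deleted 3 messages from the inbox.', config)

// A benign hit: "sender" contains "send", so a read-only summary is flagged too.
const benign = scanOutput('The sender of the latest message is Alice.', config)

console.log(hit.map(f => f.keyword))    // only "delete" is flagged
console.log(benign.map(f => f.keyword)) // only "send" is flagged
```

Because flags are detection-only, a false positive like the second case costs a review rather than a blocked task. A word-boundary match would reduce that noise, but it would also miss inflected forms such as "exported", which the tests in this commit expect to be flagged.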
diff --git a/src/policy.ts b/src/policy.ts
@@ -51,6 +51,7 @@ export function loadPolicy(configOrPath: string | LeashConfig): LeashConfig {
       default: policy.default ?? 'deny',
       expire: policy.expire_after,
       maxActions: policy.max_actions,
+      domains: policy.domains,
     }
   }
   return {
diff --git a/src/types.ts b/src/types.ts
@@ -5,13 +5,21 @@ export interface LeashConfig {
   expire?: string
   maxActions?: number
   agent?: string
+  domains?: string[]
+}
+
+export interface OutputFlag {
+  pattern: string
+  keyword: string
+  snippet: string
 }
 
 export interface LeashResult {
   allowed: boolean
   output?: string
   reason?: string
   auditId: string
+  flags?: OutputFlag[]
 }
 
 export interface AuditEvent {
@@ -22,6 +30,8 @@ export interface AuditEvent {
   action: 'allowed' | 'blocked' | 'error' | 'killed'
   reason?: string
   duration?: number
+  flags?: OutputFlag[]
+  domains?: string[]
 }
 
 export interface LeashStatus {
@@ -42,4 +52,5 @@ export interface YamlPolicy {
   default?: 'allow' | 'deny'
   expire_after?: string
   max_actions?: number
+  domains?: string[]
 }
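
The `domains` field is optional at every layer, and `src/index.ts` attaches it to audit events with a conditional spread, so events from policies without domains carry no `domains` key at all. A small sketch of that idiom, using a hypothetical `buildEvent` helper and a reduced event shape:

```typescript
// Reduced event shape for illustration; the real AuditEvent has more fields.
interface EventLike {
  task: string
  action: 'allowed' | 'blocked'
  domains?: string[]
}

// Hypothetical helper isolating the conditional-spread idiom used in src/index.ts.
function buildEvent(task: string, action: EventLike['action'], domains?: string[]): EventLike {
  return {
    task,
    action,
    // Spread an object only when there is at least one domain; otherwise spread nothing.
    ...(domains?.length ? { domains } : {}),
  }
}

const withDomains = buildEvent('read inbox', 'allowed', ['linkedin.com'])
const withoutDomains = buildEvent('read inbox', 'allowed')
const emptyDomains = buildEvent('read inbox', 'allowed', [])

console.log(JSON.stringify(withDomains))    // {"task":"read inbox","action":"allowed","domains":["linkedin.com"]}
console.log(JSON.stringify(withoutDomains)) // {"task":"read inbox","action":"allowed"}
console.log('domains' in emptyDomains)      // false
```

With this idiom, an empty `domains` array is treated the same as no domains, and JSONL lines stay free of `"domains":[]` noise.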
diff --git a/tests/output-scanner.test.ts b/tests/output-scanner.test.ts
@@ -0,0 +1,67 @@
+import { describe, it, expect } from 'vitest'
+import { scanOutput } from '../src/output-scanner.js'
+import type { LeashConfig } from '../src/types.js'
+
+describe('scanOutput', () => {
+  const config: LeashConfig = {
+    allow: ['read*', 'check*'],
+    deny: ['*send*', '*delete*', '*export*'],
+    default: 'deny',
+  }
+
+  it('returns empty array when no deny keywords match output', () => {
+    const flags = scanOutput('Here are your 5 unread messages from today.', config)
+    expect(flags).toEqual([])
+  })
+
+  it('flags output containing a deny keyword', () => {
+    const flags = scanOutput('Successfully exported 500 contacts to CSV file.', config)
+    expect(flags).toHaveLength(1)
+    expect(flags[0].pattern).toBe('*export*')
+    expect(flags[0].keyword).toBe('export')
+    expect(flags[0].snippet).toContain('export')
+  })
+
+  it('flags multiple deny keywords in same output', () => {
+    const flags = scanOutput('Deleted 3 messages and exported the archive.', config)
+    expect(flags).toHaveLength(2)
+    const keywords = flags.map(f => f.keyword)
+    expect(keywords).toContain('delete')
+    expect(keywords).toContain('export')
+  })
+
+  it('is case-insensitive', () => {
+    const flags = scanOutput('EXPORTED all contacts to spreadsheet', config)
+    expect(flags).toHaveLength(1)
+    expect(flags[0].keyword).toBe('export')
+  })
+
+  it('returns empty array for empty output', () => {
+    expect(scanOutput('', config)).toEqual([])
+  })
+
+  it('returns empty array for null-ish output', () => {
+    expect(scanOutput(null as unknown as string, config)).toEqual([])
+    expect(scanOutput(undefined as unknown as string, config)).toEqual([])
+  })
+
+  it('returns empty array when config has no deny patterns', () => {
+    const noDeny: LeashConfig = { allow: ['*'], default: 'allow' }
+    const flags = scanOutput('exported everything', noDeny)
+    expect(flags).toEqual([])
+  })
+
+  it('skips wildcard-only patterns', () => {
+    const wildcardConfig: LeashConfig = { deny: ['*'], default: 'deny' }
+    const flags = scanOutput('some output text', wildcardConfig)
+    expect(flags).toEqual([])
+  })
+
+  it('provides context snippet around matched keyword', () => {
+    const output = 'The system successfully exported all 500 contacts to a CSV file on disk.'
+    const flags = scanOutput(output, config)
+    expect(flags).toHaveLength(1)
+    // Snippet should contain surrounding context, not just the keyword
+    expect(flags[0].snippet.length).toBeGreaterThan('export'.length)
+  })
+})
diff --git a/tests/policy.test.ts b/tests/policy.test.ts
diff --git a/tests/shield.test.ts b/tests/shield.test.ts
