18 changes: 18 additions & 0 deletions README.md
@@ -117,6 +117,7 @@ Pick by the **shape** of the problem, not the topic:
| A problem decomposed before you act | **think** | Sequential thinking — linear or reflexion (forced self-critique), optional branches; surfaces open questions + a `goal` handoff. |
| Your own decision pressure-tested | **nero** | Adversarial self-challenge — the top-rated *critic* (by tribunal rating, not the best builder) attacks it and returns a verdict. |
| A library/repo/spec/fact looked up with sources | **research** | Keyless, web-grounded, *cited* research — Agon discovers sources (npm/GitHub/MDN/IETF/Stack Overflow/Wikipedia, **no API key**), an engine drafts a grounded answer, and Agon **verifies every citation**. |
| Your **own browser** driven to research / check a page | **chrome** | Drive your real browser (navigate / read / screenshot / click / type) from the REPL, the CLI, or Cesar. Reuses a running agon that the side panel is attached to, else embeds a transient bridge. Needs the Agon browser extension. |
| Code built competitively against a test | **forge** | Engines race on the same task in isolated worktrees; the best test-passing patch is applied. |
| The **first** passing solution, fast | **speculate** | N engines race; the first to pass the test wins and is applied immediately. |
| A routine build with one engine | **pipeline** | Single-engine build → review → fix loop; no competition overhead. |
@@ -138,6 +139,7 @@ Pick by the **shape** of the problem, not the topic:
- Need to think a problem through before acting → `think`
- Need your own decision attacked before you commit → `nero`
- Need a library/repo/spec/fact looked up with trustworthy, verified sources → `research`
- Need your own browser driven to research or check a page design → `chrome`
- Need the whole panel on a high-stakes, hard-to-reverse call → `council`
- Need a whole open-ended thing built unattended, away from the desk → `conquer`

@@ -219,6 +221,22 @@ agon research "what is a Merkle tree?" --json # emit the Rese
- **Citations are verified, not trusted.** Every cited URL is re-fetched; dead/redirected/mismatched ones are surfaced so you know which lines to trust.
- **For external CLIs too** — `agon call research "<question>" --jsonl` lets Codex / Antigravity / Claude get web-grounded, citation-checked answers without their own search key.

### Chrome
Drive your **own browser** from Agon to research live pages or check a design without leaving the flow. Where `research` reads keyless source APIs, `chrome` drives the *real* browser you're logged into: it navigates, reads the rendered page, screenshots it, and (with approval) clicks and types. The agentic ReAct brain runs the turn through a loopback bridge that the **Agon browser extension's side panel** attaches to; the panel lends the page tools and executes them in your Chrome.

```bash
/chrome open news.ycombinator.com and summarize the top 3 posts # in the agon REPL
/chrome check the pricing page design — is the hero cluttered?
agon chrome "open the staging dashboard and tell me if the chart overflows" # from the CLI
agon chrome "..." --auto-approve # act on the page without prompting (unattended)
agon call chrome "<task>" [--auto-approve] # machine bridge for external CLIs (Codex/Antigravity)
```

- **Reuse-or-embed, no separate terminal.** `chrome` reuses a running agon (`agon serve` or the REPL) the side panel is already attached to; if nothing's running it embeds a transient in-process bridge just for the turn and writes the 0600 connection file the panel auto-connects through. The REPL `/chrome` hosts the bridge itself, so the panel auto-connects to your live agon.
- **Your approval is the backstop.** Read-only page tools (navigate / read / screenshot) run without prompting; page-changing actions (click / type) prompt with Agon's own Y/N gate (the REPL permission UI, or a terminal prompt from the CLI). `--auto-approve` allows them unattended — required for a non-interactive caller (`agon call chrome`) to act on the page. Page content is treated as untrusted: the approval gate is the prompt-injection backstop.
- **Never Cesar.** The browser turn runs on the agentic brain, never your live Cesar session (no split-brain); its result is fed back to Cesar so the conversation builds on what the browser found. In the REPL it's also recorded for `Ctrl+R` / `agon last`.
- **Requires the Agon browser extension** with its side panel open + attached. With no panel attached, the brain answers in text only (and says so).
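
The approval rules in the bullets above can be summarised as a small decision function. This is only a sketch of the described policy, not Agon's implementation: `decideToolCall`, `GateContext`, and the tool-name sets are hypothetical names, and failing closed on an unknown tool is an assumption.

```typescript
// Hypothetical sketch of the approval policy described above (not Agon's real API).
type Decision = 'run' | 'prompt' | 'deny';

interface GateContext {
  interactive: boolean; // a human can answer a Y/N prompt (REPL or CLI terminal)
  autoApprove: boolean; // the caller passed --auto-approve
}

// Tool names mirror the README wording; the real tool identifiers may differ.
const READ_ONLY_TOOLS = new Set(['navigate', 'read', 'screenshot']);
const PAGE_CHANGING_TOOLS = new Set(['click', 'type']);

function decideToolCall(tool: string, ctx: GateContext): Decision {
  // Read-only page tools run without prompting.
  if (READ_ONLY_TOOLS.has(tool)) return 'run';
  // Assumption: anything that is not a known page tool is refused.
  if (!PAGE_CHANGING_TOOLS.has(tool)) return 'deny';
  // --auto-approve lets page-changing actions run unattended.
  if (ctx.autoApprove) return 'run';
  // Otherwise a human must confirm; a non-interactive caller cannot, so it is denied.
  return ctx.interactive ? 'prompt' : 'deny';
}

console.log(decideToolCall('click', { interactive: false, autoApprove: false }));
```

Under these assumptions, a non-interactive caller such as `agon call chrome` without `--auto-approve` can still read the page but cannot click or type, which matches the "denied" behaviour the docs describe.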

### Brainstorm
Engines bid their confidence level on how they would tackle a complex problem. Engines with higher confidence bids are allocated more tokens and priority.

2 changes: 2 additions & 0 deletions docs/modes.md
@@ -17,6 +17,7 @@ Run it inside the git repo you are working on.
- `agon think "<problem>" [--strategy linear|reflexion] [--steps N] [--branches N]` — sequential thinking: decompose a problem into structured thoughts (reflexion forces a self-critique+revision per step; --branches explores alternatives), surface open questions, and emit a refined spec to hand to `agon goal`. Use to think before acting, or for engines with weak built-in reasoning.
- `agon nero "<decision>" [--reasoning "<why>"] [--focus "<concern>"] [--confidence N]` — adversarial self-challenge. The top-rated CRITIC (picked by tribunal-discipline rating, not the best builder) attacks your decision with concrete failure scenarios and returns a verdict (FLAWED | PROCEED WITH CAUTION | SOUND) plus its own confidence the original is correct. Use this INSTEAD of an internal evil-twin / devil's-advocate pass — it gives you a real second model, not your own reasoning mirrored.
- `agon research "<question>" [--count N] [--engine <id>]` — keyless, web-grounded, CITED research. Agon (not the model) discovers sources via first-party endpoints that need NO API key — npm registry, GitHub repo search, MDN, IETF/RFC datatracker, Stack Overflow, Wikipedia — WebFetches them, an engine drafts an answer grounded ONLY in that content with inline [n] citations, and Agon then re-fetches and VERIFIES every citation (rejecting dead/redirected/mismatched URLs). Use it to look up a library/repo/spec/Q&A/encyclopedic fact and get an answer with sources you can trust. A truly-general web query (no keyless lane) reports that rather than guessing.
- `agon chrome "<task>"` (machine: `agon call chrome "<task>" [--auto-approve] [--engine <id>]`) — drive the USER's own browser to research or check a page design (navigate / read / screenshot / click / type). Reuses a running agon (serve/REPL) the side panel is attached to, else embeds a transient bridge just for the run. Read-only page tools need no approval; add `--auto-approve` so a non-interactive caller can perform page-changing actions (without it they're denied). Requires the Agon browser extension with its side panel open + attached; with no panel attached the brain answers in text only.
- `agon review <uncommitted|branch:NAME|commit:SHA>` — non-interactive multi-engine code review.
- `agon goal "<intent>" --queue <dir|.jsonl> --gate "<test cmd>"` — autonomous controller: drives a task queue to completion unattended, looping build -> witness -> gate -> review (panel + judge) -> fix -> commit per task on a goal/ branch. Bound it with `--max-hours`/`--budget`; `--push` pushes each task. Long-running (designed for 8-24h).
- `agon conquer "<task>" --gate "<test cmd>"` — supervised-autonomous BUILD of an OPEN-ENDED task. Cesar drives a pluggable builder CLI (codex/claude/agy) in agent mode turn by turn; when the builder hits a fork it asks and Cesar convenes the cheapest sufficient consult (nero/tribunal/brainstorm/council) and feeds back a compact verdict; when it claims done, a layered done-oracle runs (the `--gate` command + diff acceptance-drift + a nero falsification round) and it STOPS at a HUMAN merge gate — it never auto-merges to main. The open-ended sibling to `goal`: use `conquer` when you CANNOT write a clean discriminating oracle up front (build a whole tool), `goal` when you can. `--push` needs a clean tree; bound it with `--max-turns`/`--max-hours`.
@@ -63,6 +64,7 @@ The direct `agon <mode>` commands stream human output. For JSONL lifecycle + out
- explore, no decision needed -> campfire
- think before acting / decompose -> think
- pressure-test your own decision -> nero
- check a live page / a design in your browser -> chrome
- judge existing code -> review
- drive a whole task queue to done -> goal
- build a whole open-ended thing -> conquer
2 changes: 2 additions & 0 deletions packages/cli/src/commands/chrome.ts
@@ -0,0 +1,2 @@
// Re-export from KERN-generated chrome command (source: kern/commands/chrome.kern)
export { chromeCommand, runChrome } from '../generated/commands/chrome.js';
23 changes: 20 additions & 3 deletions packages/cli/src/generated/blocks/results-formatter.ts
@@ -1,6 +1,6 @@
// @generated by kern v4.0.0 — DO NOT EDIT. Source: src/kern/blocks/results-formatter.kern

- import type { SessionResult, BrainstormResultData, CampfireResultData, TribunalResultData, ForgeResultData, ThinkResultData, CouncilResultData, SynthesisResultData, NeroResultData, ReviewResultData, ResearchResultData, ChatSession } from '@kernlang/agon-core';
+ import type { SessionResult, BrainstormResultData, CampfireResultData, TribunalResultData, ForgeResultData, ThinkResultData, CouncilResultData, SynthesisResultData, NeroResultData, ReviewResultData, ResearchResultData, ChromeResultData, ChatSession } from '@kernlang/agon-core';

// @kern-source: results-formatter:3
export const BOLD: string = '\x1b[1m';
@@ -263,6 +263,22 @@ export function formatResearch(r: SessionResult, idx: number): string {
}

// @kern-source: results-formatter:227
export function formatChrome(r: SessionResult, idx: number): string {
const data = r.data as ChromeResultData;
const lines: string[] = [];
lines.push(`${BOLD}${CYAN}${RULE}${RESET}`);
lines.push(`${BOLD} CHROME #${idx} ${DIM}· ${formatTime(r.timestamp)} · ${RESET}${BOLD}"${r.question}"${RESET}`);
lines.push(`${DIM} Engine: ${data.engineId}${data.pageActivity ? ' · drove the page' : ' · text only (no page tools ran)'}${RESET}`);
lines.push(`${BOLD}${CYAN}${RULE}${RESET}`);
lines.push('');
if (data.answer) {
lines.push(data.answer);
lines.push('');
}
return lines.join('\n');
}

// @kern-source: results-formatter:241
export function formatReview(r: SessionResult, idx: number): string {
const data = r.data as ReviewResultData;
const lines: string[] = [];
@@ -286,10 +302,10 @@ export function formatReview(r: SessionResult, idx: number): string {
return lines.join('\n');
}

- // @kern-source: results-formatter:248
+ // @kern-source: results-formatter:262
export function formatSessionResults(results: SessionResult[]): string {
if (results.length === 0) {
- return `${DIM}No results in this session yet. Run /brainstorm, /campfire, /tribunal, /forge, /think, /council, /synthesis, /nero, /research, or /review first.${RESET}\n`;
+ return `${DIM}No results in this session yet. Run /brainstorm, /campfire, /tribunal, /forge, /think, /council, /synthesis, /nero, /research, /chrome, or /review first.${RESET}\n`;
}

const sections: string[] = [];
@@ -308,6 +324,7 @@ export function formatSessionResults(results: SessionResult[]): string {
case 'synthesis': sections.push(formatSynthesis(r, idx)); break;
case 'nero': sections.push(formatNero(r, idx)); break;
case 'research': sections.push(formatResearch(r, idx)); break;
case 'chrome': sections.push(formatChrome(r, idx)); break;
case 'review': sections.push(formatReview(r, idx)); break;
}
}
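
To see what the new `formatChrome` formatter renders, it can be run in isolation. The sketch below copies the function body from the hunk above and supplies stand-ins for everything the diff does not show: the `DIM`, `CYAN`, `RESET`, and `RULE` values, the `formatTime` helper, and the two interfaces are assumptions, reduced to the fields the formatter reads.

```typescript
// Stand-in values (assumptions): only BOLD's value appears in the diff.
const BOLD = '\x1b[1m';
const DIM = '\x1b[2m';
const CYAN = '\x1b[36m';
const RESET = '\x1b[0m';
const RULE = '-'.repeat(48);

// Simplified stand-ins for the @kernlang/agon-core types (hypothetical shapes).
interface ChromeResultData { engineId: string; pageActivity: boolean; answer?: string }
interface SessionResult { question: string; timestamp: number; data: unknown }

// Hypothetical replacement for the module's formatTime helper: HH:MM in UTC.
function formatTime(ts: number): string {
  return new Date(ts).toISOString().slice(11, 16);
}

// Body copied from the diff.
function formatChrome(r: SessionResult, idx: number): string {
  const data = r.data as ChromeResultData;
  const lines: string[] = [];
  lines.push(`${BOLD}${CYAN}${RULE}${RESET}`);
  lines.push(`${BOLD} CHROME #${idx} ${DIM}· ${formatTime(r.timestamp)} · ${RESET}${BOLD}"${r.question}"${RESET}`);
  lines.push(`${DIM} Engine: ${data.engineId}${data.pageActivity ? ' · drove the page' : ' · text only (no page tools ran)'}${RESET}`);
  lines.push(`${BOLD}${CYAN}${RULE}${RESET}`);
  lines.push('');
  if (data.answer) {
    lines.push(data.answer);
    lines.push('');
  }
  return lines.join('\n');
}

const sample: SessionResult = {
  question: 'does the chart overflow on the staging dashboard?',
  timestamp: Date.UTC(2025, 0, 1, 9, 30),
  data: { engineId: 'demo-engine', pageActivity: false, answer: 'No panel was attached, so this is a text-only answer.' },
};

const rendered = formatChrome(sample, 1);
console.log(rendered);
```

The `pageActivity` flag is what distinguishes a turn that actually drove the page from one that fell back to a text-only answer, so the header line tells the reader how much to trust the result as an observation of the live page.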
38 changes: 28 additions & 10 deletions packages/cli/src/generated/bridge/agentic-brain-client.ts
@@ -104,14 +104,22 @@ export function buildAgentSystemPrompt(tools: CapabilitySpec[], base?: string):
'TOOL LINE NOW. One tool per step — you then get its result and may call another tool.',
'Read the page before acting on it. Never fabricate a result or claim an action you',
'did not take. Reply in prose ONLY when giving the user your FINAL answer (no tool line).',
'',
'BE PROACTIVE — take initiative. When the user asks you to DO something on the web —',
'search for / find / look up something, open or navigate to a site, fill a field, click —',
'ACT by calling tools; the user wants it DONE, not described, and does NOT want to spell',
'out each step. Drive it yourself across multiple steps (read → act → read → act) until',
'the goal is reached. Need a site you are not on? navigate there, then readPage, then act.',
'This holds in EVERY language: a request written in German/French/etc. STILL requires tool',
'lines — never answer a "do X" request with only a prose promise to do it.',
);
return lines.join('\n');
}

/**
* The growing ReAct transcript re-sent to the (stateless, exec-mode) engine each step: the original request plus every prior tool call and its result, ending with a nudge to act or answer.
*/
- // @kern-source: agentic-brain-client:119
+ // @kern-source: agentic-brain-client:127
export function renderAgentTranscript(userInput: string, steps: Array<{ name: string; input: Record<string, unknown>; output: string }>): string {
const lines: string[] = [`User request: ${userInput}`, ''];
if (steps.length === 0) {
@@ -128,21 +136,31 @@ export function renderAgentTranscript(userInput: string, steps: Array<{ name: st
}

/**
- * True when an engine's reply DESCRIBES an action ('Let me navigate…', 'I'll click…') but emitted no tool call: a short intent preamble, not a final answer. Used to NUDGE the engine to actually emit the tool line instead of treating the narration as a final answer (the common weak-engine failure). Looks only at the head + caps length to avoid matching a real prose answer that happens to say 'review' or 'let me know'.
+ * True when an engine's reply DESCRIBES an action ('Let me navigate…', 'Ich suche…') but emitted no tool call (a short intent preamble, not a final answer), so the loop can NUDGE it to actually emit the tool line (the common weak-engine failure: the brain says it will act, then stops). Covers English + German narration, the panel's two languages; other languages degrade to no-nudge (the proactivity system prompt still pushes the model to act in any language; this is only the backstop). Bounded to short replies so a real prose answer that mentions 'review'/'search' isn't misread as narration.
*/
- // @kern-source: agentic-brain-client:136
+ // @kern-source: agentic-brain-client:144
export function looksLikeActionIntent(text: string): boolean {
- const head = text.trim().slice(0, 200).toLowerCase();
+ const s = text.trim();
+ if (s.length === 0 || s.length >= 500) return false; // a substantive reply is a real answer, not a preamble
+ const head = s.slice(0, 200).toLowerCase();
if (/\blet me know\b/.test(head)) return false; // a sign-off, not an action
- const intent = /\b(let me|i'?ll|i will|i'?m going to|i am going to|let's|first,? i|now i'?ll|going to)\b/.test(head);
- const verb = /\b(navigat|click|type|go to|open|fill|select|read|review|look at|check|press|scroll|enter|search|find)\b/.test(head);
- return intent && verb && text.trim().length < 500;
+ // English: a first-person "about to act" opener AND an action verb. Verb stems anchor at
+ // the word START only (no trailing \b), so a stem matches its inflections (navigat→navigate).
+ const enIntent = /\b(let me|i'?ll|i will|i'?m going to|i am going to|let's|now i'?ll|going to)\b/.test(head);
+ const enVerb = /\b(navigat|go to|open|fill|select|read|review|look at|check|press|scroll|click|type|search|find|brows)/.test(head);
+ // German: a first-person SELF-action ("ich suche/navigiere/öffne…", "starte ich"). Listed as
+ // explicit "ich <verb>" pairs (not a bare adverb or a stem) so opinion phrasing like
+ // "ich finde X gut" (I think X is good) and "jetzt sieht …" (now … looks) are NOT misread as
+ // intent. ASCII-anchored before "ich", which sidesteps the \b-vs-umlaut problem (\b fails
+ // before 'ö', so a bare "\böffne" never matched).
+ const deAction = /\b(ich\s+(werde|möchte|gehe|navigiere|klicke|tippe|öffne|lese|suche|scrolle|wähle|prüfe|fülle|drücke|starte|beginne|rufe|schaue)|starte ich|beginne ich|lass mich)\b/.test(head);
+ return (enIntent && enVerb) || deAction;
}
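
The behaviour of the revised heuristic is easiest to check against a few sample replies. The block below copies the post-change `looksLikeActionIntent` from this hunk into a standalone file (comments trimmed); the sample sentences are made up for illustration and are not taken from the project's tests.

```typescript
// Post-change version of the function, copied from the diff.
function looksLikeActionIntent(text: string): boolean {
  const s = text.trim();
  if (s.length === 0 || s.length >= 500) return false; // a substantive reply is a real answer
  const head = s.slice(0, 200).toLowerCase();
  if (/\blet me know\b/.test(head)) return false; // a sign-off, not an action
  const enIntent = /\b(let me|i'?ll|i will|i'?m going to|i am going to|let's|now i'?ll|going to)\b/.test(head);
  const enVerb = /\b(navigat|go to|open|fill|select|read|review|look at|check|press|scroll|click|type|search|find|brows)/.test(head);
  const deAction = /\b(ich\s+(werde|möchte|gehe|navigiere|klicke|tippe|öffne|lese|suche|scrolle|wähle|prüfe|fülle|drücke|starte|beginne|rufe|schaue)|starte ich|beginne ich|lass mich)\b/.test(head);
  return (enIntent && enVerb) || deAction;
}

// Illustrative inputs (invented for this example) with the result each should produce.
const samples: Array<[string, boolean]> = [
  ['Let me navigate to the pricing page.', true],       // English opener + verb stem
  ['Ich öffne die Seite.', true],                       // German "ich <verb>" pair with an umlaut
  ['Ich finde das Design gut.', false],                 // opinion phrasing, not a self-action
  ['Let me know if you need anything else.', false],    // sign-off is excluded up front
  ['The hero is cluttered: three CTAs compete.', false], // a plain prose answer
  ["I'll click the button. " + 'x'.repeat(500), false], // long replies count as real answers
];

for (const [reply, expected] of samples) {
  const got = looksLikeActionIntent(reply);
  console.log(`${got === expected ? 'ok  ' : 'FAIL'} ${JSON.stringify(reply.slice(0, 45))} -> ${got}`);
}
```

Because the English verb stems no longer carry a trailing `\b`, a stem such as `navigat` now matches `navigate`, which the old pattern could not. The trade-off is that a stem like `read` also matches `ready`, so the first-person opener and the length cap are what keep false positives down.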

/**
* A compact, human-readable one-liner for the approval popup's `command` field — what the agent is about to do.
*/
- // @kern-source: agentic-brain-client:146
+ // @kern-source: agentic-brain-client:164
export function describeAgentAction(name: string, input: Record<string, unknown>): string {
let arg = '';
try { arg = JSON.stringify(input); } catch { arg = '{…}'; }
@@ -153,7 +171,7 @@ export function describeAgentAction(name: string, input: Record<string, unknown>
/**
* v2 BrainClient: a bounded ReAct tool-loop over one engine, with client-lent capabilities (registerCapability) the brain pulls mid-turn via capability-request, and a per-action approval gate for destructive tools. Construct with the daemon's EngineRegistry; open() binds engine/cwd; runTurn() drives the loop; provideCapabilityResult/provideApproval answer the *-request events by requestId.
*/
- // @kern-source: agentic-brain-client:157
+ // @kern-source: agentic-brain-client:175
export class AgenticTurnBrainClient implements BrainClient {
private registry: EngineRegistry;
private adapter: EngineAdapter;
@@ -496,7 +514,7 @@ export class AgenticTurnBrainClient implements BrainClient {
/**
* Factory mirroring createHeadlessTurnBrainClient: build the v2 agentic tool-loop BrainClient from the daemon's EngineRegistry.
*/
- // @kern-source: agentic-brain-client:522
+ // @kern-source: agentic-brain-client:540
export function createAgenticTurnBrainClient(registry: EngineRegistry): BrainClient {
return new AgenticTurnBrainClient(registry);
}
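
The doc comments in this file describe a bounded ReAct loop: a stateless engine is re-sent the growing transcript each step, a reply is either a tool line or a final prose answer, and a reply that only narrates an action is nudged. The class body is not shown in the diff, so the following is a minimal sketch of that control flow under stated assumptions. Every name here (`runTurn`, `renderTranscript`, `parseToolLine`, `narrates`, `scriptedEngine`) is hypothetical, the `TOOL <name> <json>` line format is invented for the example, and `narrates` is a much-simplified stand-in for `looksLikeActionIntent`.

```typescript
// Hypothetical sketch of a bounded ReAct loop with a narration nudge (not Agon's code).
type Step = { name: string; input: Record<string, unknown>; output: string };
type Engine = (prompt: string) => Promise<string>;
type Tools = Record<string, (input: Record<string, unknown>) => Promise<string>>;

// The engine is stateless, so each step re-sends the request plus all prior steps.
function renderTranscript(userInput: string, steps: Step[], nudge: boolean): string {
  const lines: string[] = [`User request: ${userInput}`, ''];
  steps.forEach((s, i) => {
    lines.push(`Step ${i + 1}: ${s.name} ${JSON.stringify(s.input)}`, `Result: ${s.output}`, '');
  });
  if (nudge) lines.push('You described an action but emitted no tool line. Emit it now.');
  lines.push('Emit one tool line, or give your final answer in prose.');
  return lines.join('\n');
}

// Assumed tool-line format: TOOL <name> <json-object>
function parseToolLine(reply: string): { name: string; input: Record<string, unknown> } | null {
  const m = /^TOOL (\w+) (\{.*\})$/m.exec(reply.trim());
  if (!m) return null;
  try {
    return { name: m[1], input: JSON.parse(m[2]) as Record<string, unknown> };
  } catch {
    return null;
  }
}

// Simplified stand-in for looksLikeActionIntent.
function narrates(reply: string): boolean {
  const s = reply.trim();
  return s.length < 500 && /^(let me|i'll|i will)\b/i.test(s);
}

async function runTurn(engine: Engine, tools: Tools, userInput: string, maxSteps = 8): Promise<string> {
  const steps: Step[] = [];
  let nudge = false;
  for (let i = 0; i < maxSteps; i++) {
    const reply = await engine(renderTranscript(userInput, steps, nudge));
    const call = parseToolLine(reply);
    if (call) {
      const tool = tools[call.name];
      const output = tool ? await tool(call.input) : `unknown tool: ${call.name}`;
      steps.push({ ...call, output });
      nudge = false;
      continue;
    }
    // Nudge once on narration; a second prose reply is accepted as the answer.
    if (!nudge && narrates(reply)) { nudge = true; continue; }
    return reply;
  }
  return 'Step budget exhausted before a final answer.';
}

// A scripted engine for the demo: returns canned replies in order.
function scriptedEngine(replies: string[]): Engine {
  let i = 0;
  return async () => replies[Math.min(i++, replies.length - 1)];
}

const demoTools: Tools = { navigate: async (input) => `loaded ${String(input['url'])}` };
runTurn(
  scriptedEngine([
    'Let me navigate to the page.',
    'TOOL navigate {"url":"https://example.com"}',
    'The page title is Example Domain.',
  ]),
  demoTools,
  'open example.com and report the title',
).then((answer) => console.log(answer)); // prints "The page title is Example Domain."
```

The step budget is what makes the loop "bounded": a model that keeps emitting tool lines or keeps narrating still terminates. The approval gate and the capability-request round trip that the real client performs are left out of this sketch.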