@@ -11,22 +11,23 @@ How to run Gambit, the agent harness framework, locally and observe runs.
 - Command help: `deno run -A src/cli.ts help <command>` (or
   `deno run -A src/cli.ts <command> -h`).
 - Run once:
-  `deno run -A src/cli.ts run <deck> [--context <json|string>] [--message <json|string>] [--model <id>] [--model-force <id>] [--trace <file>] [--state <file>] [--stream] [--responses] [--verbose]`
+  `deno run -A src/cli.ts run <deck> [--context <json|string>] [--message <json|string>] [--model <id>] [--model-force <id>] [--trace <file>] [--state <file>] [--stream] [--responses] [--verbose] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]`
 - Check models: `deno run -A src/cli.ts check <deck>`
 - REPL: `deno run -A src/cli.ts repl <deck>` (defaults to
   `src/decks/gambit-assistant.deck.md` in a local checkout). Streams by default
   and keeps state in memory for the session.
-- Test bot (CLI):
-  `deno run -A src/cli.ts test-bot <root-deck> --test-deck <persona-deck> [--context <json|string>] [--bot-input <json|string>] [--message <json|string>] [--max-turns <n>] [--state <file>] [--grade <grader-deck> ...] [--trace <file>] [--responses] [--verbose]`
+- Scenario (CLI):
+  `deno run -A src/cli.ts scenario <root-deck> --test-deck <persona-deck> [--context <json|string>] [--bot-input <json|string>] [--message <json|string>] [--max-turns <n>] [--state <file>] [--grade <grader-deck> ...] [--trace <file>] [--responses] [--verbose] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]`
 - Grade (CLI):
-  `deno run -A src/cli.ts grade <grader-deck> --state <file> [--model <id>] [--model-force <id>] [--trace <file>] [--responses] [--verbose]`
+  `deno run -A src/cli.ts grade <grader-deck> --state <file> [--model <id>] [--model-force <id>] [--trace <file>] [--responses] [--verbose] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]`
 - Export bundle (CLI):
   `deno run -A src/cli.ts export [<deck>] --state <file> --out <bundle.tar.gz>`
-- Debug UI: `deno run -A src/cli.ts serve <deck> --port 8000` then open
+- Debug UI: `deno run -A src/cli.ts serve <deck> --port 8000` or
+  `deno run -A src/cli.ts serve --artifact <bundle.tar.gz>` then open
   http://localhost:8000/. This serves a multi-page UI:
 
   - Debug (default): `http://localhost:8000/debug`
-  - Test: `http://localhost:8000/test-bot`
+  - Test: `http://localhost:8000/test`
   - Calibrate: `http://localhost:8000/calibrate`
 
   The WebSocket server streams turns, traces, and status updates.
@@ -46,21 +47,33 @@ How to run Gambit, the agent harness framework, locally and observe runs.
 - `GAMBIT_RESPONSES_MODE=1`: env alternative to `--responses` for runtime/state.
 - `GAMBIT_OPENROUTER_RESPONSES=1`: route OpenRouter calls through the Responses
   API (experimental; chat remains the default path).
+- Worker execution defaults on for deck-executing surfaces. Use
+  `--no-worker-sandbox` (or `--legacy-exec`) to roll back to legacy in-process
+  execution. `--sandbox/--no-sandbox` still work as deprecated aliases.
+- `gambit.toml` config equivalent:
+  ```toml
+  [execution]
+  worker_sandbox = false  # same as --no-worker-sandbox
+  # legacy_exec = true    # equivalent rollback toggle
+  ```
 
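The flag and `gambit.toml` rules added in this hunk can be read as a small resolution function. The sketch below is illustrative only and is not taken from the Gambit source: the names `ExecutionConfig` and `resolveWorkerSandbox` are hypothetical, and two behaviors are assumptions rather than documented facts, namely that CLI flags take precedence over `gambit.toml` and that the last flag on the command line wins.

```typescript
// Hypothetical mirror of the [execution] table in gambit.toml.
interface ExecutionConfig {
  worker_sandbox?: boolean;
  legacy_exec?: boolean;
}

// Returns true when deck execution should run in the worker sandbox.
// Assumption: CLI flags override config, and the last flag seen wins.
function resolveWorkerSandbox(
  args: string[],
  config: ExecutionConfig = {},
): boolean {
  let fromFlags: boolean | undefined;
  for (const arg of args) {
    // --sandbox / --no-sandbox are the deprecated aliases named in the docs.
    if (arg === "--worker-sandbox" || arg === "--sandbox") {
      fromFlags = true;
    } else if (
      arg === "--no-worker-sandbox" ||
      arg === "--no-sandbox" ||
      arg === "--legacy-exec"
    ) {
      fromFlags = false;
    }
  }
  if (fromFlags !== undefined) return fromFlags;

  // Config fallback: either toggle rolls back to legacy in-process execution.
  if (config.legacy_exec === true) return false;
  if (config.worker_sandbox !== undefined) return config.worker_sandbox;

  // Documented default: worker execution is on.
  return true;
}
```

With no flags and no config the function returns `true`, which matches the documented default. Whether the real CLI resolves conflicting flags and config in this order should be checked against the implementation.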
 ## State and tracing
 
-- `--state <file>` (run/test-bot/grade/export): load/persist messages so you can
+- `--state <file>` (run/scenario/grade/export): load/persist messages so you can
   continue a conversation; skips `gambit_context` on resume. `grade` writes
   `meta.gradingRuns` back into the session state, while `export` reads the state
   file to build the bundle.
 - `--out <file>` (export): bundle output path (tar.gz).
-- `--grade <grader-deck>` (test-bot): can be repeated; graders run in the order
+- `--grade <grader-deck>` (scenario): can be repeated; graders run in the order
   provided and append results to `meta.gradingRuns` in the same session state
   file.
 - `--trace <file>` writes JSONL trace events; `--verbose` prints trace to
   console. Combine with `--stream` to watch live output while capturing traces.
 - `--port <n>` overrides debug UI port (default 8000); `PORT` env is honored
   when `--port` is not provided.
+- `--artifact <bundle.tar.gz>` (serve only): restore and serve a bundle created
+  by `gambit export` (or FAQ download). Mutually exclusive with explicit deck
+  path.
 - `serve` auto-builds the debug UI bundle on every start and generates source
   maps by default in dev environments (set `GAMBIT_ENV=development` or
   `NODE_ENV=development`, or pass `--bundle`/`--sourcemap` explicitly).
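The `--port` / `PORT` precedence described in this hunk (flag first, then the environment variable, then the default of 8000) could be expressed as below. This is a minimal sketch rather than the project's code: `resolvePort` and `DEFAULT_PORT` are hypothetical names, and rejecting out-of-range values with an error is an assumption about validation that the docs do not state.

```typescript
const DEFAULT_PORT = 8000; // documented default for the debug UI

// flagPort: raw value of --port, if given; envPort: raw value of PORT, if set.
function resolvePort(flagPort?: string, envPort?: string): number {
  // --port is checked first; PORT is only consulted when --port is absent.
  for (const candidate of [flagPort, envPort]) {
    if (candidate === undefined || candidate === "") continue;
    const port = Number(candidate);
    if (Number.isInteger(port) && port > 0 && port <= 65535) return port;
    // Assumption: an unusable value is an error rather than silently ignored.
    throw new Error(`Invalid port: ${candidate}`);
  }
  return DEFAULT_PORT;
}
```

In a Deno entry point the second argument would typically come from `Deno.env.get("PORT")`.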
@@ -91,17 +104,17 @@ How to run Gambit, the agent harness framework, locally and observe runs.
   `window.gambitFormatTrace` hook in the page; return a string or
   `{role?, summary?, details?, depth?}` to override the entry that appears in
   the Traces & Tools pane.
-- The Test page reuses the same simulator runtime but drives persona/test-bot
+- The Test page reuses the same simulator runtime but drives persona/scenario
   decks so you can batch synthetic conversations, inspect per-turn scoring, and
   export JSONL artifacts for later ingestion. List personas by declaring
   `[[testDecks]]` entries in your root deck (for example
   `gambit/examples/advanced/voice_front_desk/decks/root.deck.md`). Each entry’s
   `path` should point to a persona deck (Markdown or TS) that includes
   `acceptsUserTurns = true`; the persona deck’s own `contextSchema` and defaults
-  power the Scenario/Test Bot form (see
+  power the Scenario form (see
   `gambit/examples/advanced/voice_front_desk/tests/new_patient_intake.deck.md`).
   Editing those deck files is how you add/remove personas now—there is no
-  `.gambit/test-bot.md` override.
+  `.gambit/scenario.md` override.
 - The Calibrate page is the regroup/diagnostics view for graders that run
   against saved Debug/Test sessions; it currently serves as a placeholder until
   the grading transport lands.
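As a sketch of the `window.gambitFormatTrace` hook mentioned at the top of this hunk, the snippet below registers a formatter that returns the documented override shape. Only the return shape (`string` or `{role?, summary?, details?, depth?}`) comes from the docs. The field types, the `TraceOverride` and `formatTrace` names, and the idea that the incoming event carries a `type` field are all assumptions made for illustration.

```typescript
// Return shape from the docs; the field types are assumptions.
type TraceOverride =
  | string
  | { role?: string; summary?: string; details?: string; depth?: number };

// Hypothetical formatter. The shape of the incoming event is not documented
// in this section, so it is treated as unknown and narrowed defensively.
function formatTrace(event: unknown): TraceOverride {
  if (typeof event !== "object" || event === null) {
    return String(event);
  }
  const record = event as Record<string, unknown>;
  // Assumption: events expose a string "type"; fall back to a generic label.
  const label = typeof record.type === "string" ? record.type : "event";
  return {
    summary: label,
    details: JSON.stringify(record, null, 2),
  };
}

// In a browser page globalThis is window, so this sets
// window.gambitFormatTrace.
(globalThis as unknown as Record<string, unknown>).gambitFormatTrace =
  formatTrace;
```

Returning a plain string is the simplest override. The object form is useful when the summary and the expanded details should differ.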