Releases: mnapoli/exspec
Releases · mnapoli/exspec
0.1.7
Anti rabbit-holing
- Prompt guidance: the agent now skips setup steps immediately when the browser UI clearly can't accomplish them, instead of trying every possible workaround
domainTimeoutconfig: new frontmatter option to cap how long a domain (group of scenarios) can run, in minutes. Partial results are preserved if the timeout is reached.
---
domainTimeout: 10
---0.3.0
- Recommendations: exspec will say when some tests are confusing
- #1 Now uses the playwright CLI instead of the MCP server, should be faster and more token-efficient
- Detailed logs! (
--verbose) - The agent now doesn't load local MCP servers, skills, rules, etc. -> much more token efficient
- Many bugfixes!
Example of recommendation:
Full Changelog: 0.2.2...0.3.0
0.2.2
Changes
- Codified scenario IDs: Scenarios are now identified by IDs (
s1,s2, ...) instead of free-text names, fixing reconciliation failures when agents add prefixes or rephrase scenario names - Fix
--allowedToolsformat: Use comma-separated format in a single arg to properly restrict agent tool access - Duration display: Show elapsed time at the end of CLI output
0.2.1
Bug fix + real-time streaming
Bug fix
- Fix
--allowedToolsformat: Space-separated args instead of comma-separated. The previous format gave the agent access to all tools (Bash, Write, etc.), causing it to create random files on disk instead of testing in the browser.
New features
- Real-time results: Scenario results (✓/✗) now appear in the CLI as soon as the agent reports them, instead of waiting for the entire domain to complete.
- Agent activity log: The results markdown file now includes a collapsible section listing all tool calls the agent made, for debugging when things go wrong.
- Verbose mode:
--verboseshows agent tool calls in real-time in the CLI.
0.2.0
MCP tool-based result reporting
Replaces fragile text parsing with a structured MCP tool for reporting scenario results. This fixes the issue where the Claude agent would deviate from the prescribed output format, causing most scenarios to be marked "not executed" despite actually running.
What changed
- New MCP reporter server: Exposes a
report_scenario_resulttool that the agent calls after each scenario. Tool calls are schema-validated, so the result format is guaranteed. - JSONL-based results: Each scenario result is written as a JSON line to a temp file, then read by the runner — no more regex parsing of free text.
- Partial results on crash: If the agent errors mid-run, results reported before the crash are preserved.
Breaking changes
- Minimum Node version bumped from 20 to 22 (current LTS)
0.1.6
0.1.5
Reworked CLI output
Cleaner, test-runner-style output:
- Show scenario results inline with ✓/✗ symbols
- Display failed step name and error message for failures
- Colored output (green ✓, red ✗, dim details) — respects
NO_COLOR - Removed progress dots, tool names, and per-domain cost from output
- Results path shown at the top, "Detailed results in ..." at the bottom
Other
- Added test fixtures (
npm run fixtures) targeting example.com - Added CLAUDE.md
0.1.4
Setup commands
You can now run shell commands before tests start using YAML frontmatter in exspec.md:
---
setup:
- php artisan migrate:fresh --seed
---This is useful for resetting the database or preparing test data before running scenarios.
Details
- Commands run once before all tests, on the local machine
- Accepts a single string or a list of commands
- Aborts immediately (exit 1) if any command fails
- 2-minute timeout per command
- Output hidden by default, visible with
--verbose - Unknown frontmatter keys are rejected (catches typos)
- Frontmatter is stripped before being passed to the AI agent
0.1.3
0.1.2
What's new
- Track not-executed scenarios: when the agent runs but doesn't report results for expected scenarios, they are now marked as "not executed" with a contextual reason
- Non-zero exit code when no scenarios pass or any are not executed
- Warn and discard unexpected scenario names reported by the agent
- Consistent summary formatting across CLI output and results file
- Standardized error message truncation with ellipsis indicator
Full Changelog: 0.1.1...0.1.2