Releases: mnapoli/exspec

0.1.7

13 Apr 17:41
46ab6f8

Anti rabbit-holing

  • Prompt guidance: the agent now skips setup steps immediately when the browser UI clearly can't accomplish them, instead of trying every possible workaround
  • domainTimeout config: new frontmatter option to cap how long a domain (group of scenarios) can run, in minutes. Partial results are preserved if the timeout is reached.
---
domainTimeout: 10
---
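
As a rough illustration of how a per-domain cap can still preserve partial results, here is a minimal TypeScript sketch. The names `runDomain` and `ScenarioResult` are hypothetical and not taken from exspec's source, and the sketch only checks the deadline between scenarios, whereas a real runner would presumably also have to interrupt the agent mid-scenario.

```typescript
// Hypothetical result shape; exspec's real types are not shown in these notes.
type ScenarioResult = { id: string; passed: boolean };

// Runs scenarios in order until the domain's time budget is spent.
// `now` is injectable so the cutoff can be exercised without waiting.
function runDomain(
  scenarios: Array<{ id: string; run: () => boolean }>,
  domainTimeoutMinutes: number,
  now: () => number = Date.now,
): { results: ScenarioResult[]; timedOut: boolean } {
  const deadline = now() + domainTimeoutMinutes * 60 * 1000;
  const results: ScenarioResult[] = [];
  for (const scenario of scenarios) {
    if (now() >= deadline) {
      // Budget exhausted: return what was collected so far.
      return { results, timedOut: true };
    }
    results.push({ id: scenario.id, passed: scenario.run() });
  }
  return { results, timedOut: false };
}
```

With `domainTimeout: 10` and scenarios that each take six minutes, this sketch would return the first two results and flag the domain as timed out.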

0.3.0

31 Mar 20:12
a088e92

  • Recommendations: exspec now reports when it finds a test confusing
  • #1 Now uses the Playwright CLI instead of the MCP server, which should be faster and more token-efficient
  • Detailed logs! (--verbose)
  • The agent no longer loads local MCP servers, skills, rules, etc., which makes it much more token-efficient
  • Many bugfixes!

[Screenshot: example of a recommendation]

Full Changelog: 0.2.2...0.3.0

0.2.2

30 Mar 21:57
35024b4

Changes

  • Codified scenario IDs: Scenarios are now identified by IDs (s1, s2, ...) instead of free-text names, fixing reconciliation failures when agents add prefixes or rephrase scenario names
  • Fix --allowedTools format: Use comma-separated format in a single arg to properly restrict agent tool access
  • Duration display: Show elapsed time at the end of CLI output
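
The --allowedTools fix above (comma-separated, in a single arg) might look roughly like this in TypeScript. The function name `allowedToolsArgs` and the tool names in the usage line are placeholders, not exspec's actual allow-list.

```typescript
// Builds the CLI arguments that restrict which tools the agent may call.
// Returns exactly two elements: the flag, then one comma-joined value.
function allowedToolsArgs(tools: string[]): string[] {
  return ["--allowedTools", tools.join(",")];
}

// Placeholder tool names, for illustration only.
const args = allowedToolsArgs(["ToolA", "ToolB"]);
```

Keeping the list in one argument avoids relying on how the agent CLI splits or collects repeated values.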

0.2.1

30 Mar 17:31
f117c9c

Bug fix + real-time streaming

Bug fix

  • Fix --allowedTools format: Tools are now passed as space-separated args instead of a comma-separated value. The previous format gave the agent access to all tools (Bash, Write, etc.), causing it to create random files on disk instead of testing in the browser.

New features

  • Real-time results: Scenario results (✓/✗) now appear in the CLI as soon as the agent reports them, instead of waiting for the entire domain to complete.
  • Agent activity log: The results markdown file now includes a collapsible section listing all tool calls the agent made, for debugging when things go wrong.
  • Verbose mode: --verbose shows agent tool calls in real-time in the CLI.

0.2.0

30 Mar 16:36
1a94279

MCP tool-based result reporting

Replaces fragile text parsing with a structured MCP tool for reporting scenario results. This fixes the issue where the Claude agent would deviate from the prescribed output format, causing most scenarios to be marked "not executed" despite actually running.

What changed

  • New MCP reporter server: Exposes a report_scenario_result tool that the agent calls after each scenario. Tool calls are schema-validated, so the result format is guaranteed.
  • JSONL-based results: Each scenario result is written as a JSON line to a temp file, then read by the runner — no more regex parsing of free text.
  • Partial results on crash: If the agent errors mid-run, results reported before the crash are preserved.
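
A minimal TypeScript sketch of reading such a JSONL file, assuming one JSON object per line. The field names `scenario` and `status` are hypothetical; the actual schema of report_scenario_result is not shown in these notes.

```typescript
// Hypothetical record shape, for illustration only.
type ReportedResult = { scenario: string; status: "passed" | "failed" };

// Parses JSONL content line by line, keeping every line that is valid JSON.
function parseResults(jsonl: string): ReportedResult[] {
  const results: ReportedResult[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    try {
      results.push(JSON.parse(line) as ReportedResult);
    } catch {
      // A crash can leave a half-written last line; skip it and keep the rest.
    }
  }
  return results;
}
```

Because each line is parsed independently, a truncated final line does not invalidate the results written before it.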

Breaking changes

  • Minimum Node version bumped from 20 to 22 (current LTS)

0.1.6

24 Mar 19:45
928a241

  • Add --version / -v flag
  • Add --help / -h flag

0.1.5

24 Mar 16:15
db5ca93

Reworked CLI output

Cleaner, test-runner-style output:

  • Show scenario results inline with ✓/✗ symbols
  • Display failed step name and error message for failures
  • Colored output (green ✓, red ✗, dim details) — respects NO_COLOR
  • Removed progress dots, tool names, and per-domain cost from output
  • Results path shown at the top, "Detailed results in ..." at the bottom
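
For reference, honoring NO_COLOR can be sketched as below in TypeScript. The helper names are hypothetical, and the environment is passed in explicitly so the snippet stays self-contained; in a Node CLI it would typically be `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// NO_COLOR convention: any non-empty value disables colored output.
function useColor(env: Env): boolean {
  const value = env["NO_COLOR"];
  return value === undefined || value === "";
}

// Formats one scenario line with a green ✓ or red ✗ when color is allowed.
function formatResult(name: string, passed: boolean, env: Env): string {
  const symbol = passed ? "✓" : "✗";
  if (!useColor(env)) return `${symbol} ${name}`;
  const color = passed ? "\x1b[32m" : "\x1b[31m"; // ANSI green or red
  return `${color}${symbol}\x1b[0m ${name}`;
}
```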

Other

  • Added test fixtures (npm run fixtures) targeting example.com
  • Added CLAUDE.md

0.1.4

23 Mar 22:01
37ee651

Setup commands

You can now run shell commands before tests start using YAML frontmatter in exspec.md:

---
setup:
  - php artisan migrate:fresh --seed
---

This is useful for resetting the database or preparing test data before running scenarios.

Details

  • Commands run once before all tests, on the local machine
  • Accepts a single string or a list of commands
  • Aborts immediately (exit 1) if any command fails
  • 2-minute timeout per command
  • Output hidden by default, visible with --verbose
  • Unknown frontmatter keys are rejected (catches typos)
  • Frontmatter is stripped before being passed to the AI agent
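
Two of the rules above (string-or-list input, rejection of unknown keys) can be sketched in TypeScript as follows. This assumes the frontmatter has already been parsed into a plain object; `setupCommands` and `KNOWN_KEYS` are hypothetical names, not exspec's API.

```typescript
// Keys that appear in these release notes; the real list may differ.
const KNOWN_KEYS = ["setup", "domainTimeout"];

// Validates the frontmatter keys and normalizes `setup` to a list of commands.
function setupCommands(frontmatter: Record<string, unknown>): string[] {
  for (const key of Object.keys(frontmatter)) {
    if (KNOWN_KEYS.indexOf(key) === -1) {
      // Rejecting unknown keys catches typos such as "setpu".
      throw new Error(`Unknown frontmatter key: ${key}`);
    }
  }
  const setup = frontmatter["setup"];
  if (setup === undefined) return [];
  if (typeof setup === "string") return [setup];
  if (Array.isArray(setup) && setup.every((c) => typeof c === "string")) {
    return setup as string[];
  }
  throw new Error("setup must be a string or a list of strings");
}
```

Normalizing to a list up front means the code that runs the commands only has to handle one shape.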

0.1.3

23 Mar 19:53
ecb912c

Fix release pipeline to set version from git tags

0.1.2

23 Mar 15:13
51cba16

What's new

  • Track not-executed scenarios: when the agent runs but doesn't report results for expected scenarios, they are now marked as "not executed" with a contextual reason
  • Non-zero exit code when no scenarios pass or any are not executed
  • Warn and discard unexpected scenario names reported by the agent
  • Consistent summary formatting across CLI output and results file
  • Standardized error message truncation with ellipsis indicator
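
The reconciliation described above could be sketched like this in TypeScript. All names are hypothetical, and the flag covers only the two exit conditions listed in this release; how failed scenarios affect the exit code is not stated here.

```typescript
type Status = "passed" | "failed" | "not executed";

// Compares the scenarios that were expected with what the agent reported.
function reconcile(
  expected: string[],
  reported: Array<{ name: string; passed: boolean }>,
): { statuses: Map<string, Status>; warnings: string[]; forceNonZeroExit: boolean } {
  const statuses = new Map<string, Status>();
  // Every expected scenario starts as "not executed" until a report arrives.
  for (const name of expected) statuses.set(name, "not executed");

  const warnings: string[] = [];
  for (const r of reported) {
    if (!statuses.has(r.name)) {
      warnings.push(`Discarding unexpected scenario: ${r.name}`);
      continue;
    }
    statuses.set(r.name, r.passed ? "passed" : "failed");
  }

  const values = Array.from(statuses.values());
  const nonePassed = !values.some((s) => s === "passed");
  const anyNotExecuted = values.some((s) => s === "not executed");
  return { statuses, warnings, forceNonZeroExit: nonePassed || anyNotExecuted };
}
```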

Full Changelog: 0.1.1...0.1.2