feat!: replace @playwright/mcp with playwright-cli #1

Closed
pierreboissinot wants to merge 1 commit into mnapoli:main from le-phare:feat-playwright-cli

@pierreboissinot

Context

Hey @mnapoli ! Following up on our exchange under your LinkedIn post about exspec —
I mentioned I was experimenting with playwright-cli (https://github.com/microsoft/playwright-cli)
as a more token-efficient alternative to the MCP-based Playwright integration.

Here's a working implementation.

Summary

  • Replace @playwright/mcp with @playwright/cli (playwright-cli)
  • Agent uses direct shell commands (Bash(playwright-cli:*)) instead of MCP tool calls
  • Lightweight YAML snapshots with ref-based selectors (e3, e15) instead of full accessibility trees
  • Skills-less operation: agent discovers commands via playwright-cli --help
  • No MCP server, no temp config files, fewer round-trips

Why

The MCP approach returns verbose accessibility trees on every browser_snapshot call —
each interaction bloats the context window. playwright-cli returns concise YAML with
numbered refs, which means:

  • Fewer tokens per interaction: compact snapshots vs. full a11y trees
  • No MCP schema overhead: no tool definitions loaded into context
  • Simpler architecture: no MCP server to configure, no temp JSON files
  • Better debuggability: playwright-cli show for live session monitoring
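
To make the ref-based approach concrete, here is a minimal TypeScript sketch of how refs could be pulled out of a snapshot. The `[ref=eN]` marker syntax and the sample snapshot text are assumptions for illustration only (the actual playwright-cli output is not shown in this PR), and `extractRefs` is a hypothetical helper, not part of exspec or playwright-cli.

```typescript
// Hypothetical helper: maps each ref (e.g. "e3") to the element description
// on the same snapshot line. Assumes lines shaped like:
//   - heading "Example Domain" [level=1] [ref=e3]
function extractRefs(snapshot: string): Record<string, string> {
  const refs: Record<string, string> = {};
  for (const line of snapshot.split("\n")) {
    // Capture everything after the leading "- " up to the [ref=eN] marker.
    const match = line.match(/^\s*-\s*(.+?)\s*\[ref=(e\d+)\]/);
    const description = match?.[1];
    const ref = match?.[2];
    if (description !== undefined && ref !== undefined) {
      refs[ref] = description;
    }
  }
  return refs;
}

// Illustrative snapshot text (invented for this example, not real CLI output).
const exampleSnapshot = [
  "- generic [ref=e2]:",
  '  - heading "Example Domain" [level=1] [ref=e3]',
  '  - link "More information" [ref=e15]',
].join("\n");

console.log(extractRefs(exampleSnapshot));
```

The point of the sketch is that the agent only needs a short ref such as e3 or e15 to target an element, rather than a full accessibility tree path.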

What changed

| File | Change |
| --- | --- |
| package.json | @playwright/mcp → @playwright/cli |
| src/runner.ts | Removed MCP config generation, --allowedTools "Bash(playwright-cli:*)" |
| src/prompt.ts | Added headed option with {HEADED_MODE} interpolation |
| src/cli.ts | Passes headed to prompt instead of runner |
| prompt-template.md | Full rewrite: playwright-cli commands, --help discovery, ref-based selectors |
| README.md | Updated "How it works" section |
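
As a rough illustration of the {HEADED_MODE} interpolation mentioned for src/prompt.ts, a sketch could look like the following. Only the placeholder name comes from this PR; the `renderPrompt` function, the `PromptOptions` type, and the wording substituted into the template are assumptions, since the real src/prompt.ts is not shown here.

```typescript
// Hypothetical options type; the PR only says a "headed" option was added.
interface PromptOptions {
  headed: boolean;
}

// Replaces every {HEADED_MODE} placeholder in the template with an
// instruction for the agent. The instruction text below is invented.
function renderPrompt(template: string, options: PromptOptions): string {
  const headedInstruction = options.headed
    ? "Open the browser in headed mode so the session is visible."
    : "Run the browser headless.";
  // split/join replaces all occurrences without needing String.replaceAll.
  return template.split("{HEADED_MODE}").join(headedInstruction);
}

console.log(renderPrompt("Browser mode: {HEADED_MODE}", { headed: true }));
```

Moving the headed/headless choice into the prompt means the runner no longer needs to pass it through any server configuration; the agent reads it as part of its instructions.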

Test plan

  • All 54 unit tests pass
  • Lint clean, build succeeds
  • npm run fixtures runs successfully against example.com
    • Scenario 1 (title check): PASS
    • Scenario 3 (intentional failure): FAIL as expected

  BREAKING CHANGE: swap MCP-based browser automation for playwright-cli
  (https://github.com/microsoft/playwright-cli), a CLI-based alternative
  that returns lightweight YAML snapshots instead of full accessibility
  trees, reducing token usage and speeding up test execution.

  - Replace @playwright/mcp dependency with @playwright/cli
  - Use Bash(playwright-cli:*) instead of mcp__playwright__* tools
  - Remove MCP config generation in runner.ts
  - Update prompt-template.md to use playwright-cli commands with
    ref-based selectors (e.g. e3, e15) and skills-less discovery
    via --help
  - Move headed/headless mode from runner to prompt interpolation
@mnapoli
Owner

mnapoli commented Mar 27, 2026

Very interesting thank you!

What do you think about this mention in the cli readme:

[image]

Do you think it's still worth switching here? If so why?

@mnapoli
Owner

mnapoli commented Mar 27, 2026

Oh also why not include the official skill? Because it might be useful for Claude to use the CLI optimally? (i.e. I'd be afraid we do a poorer job at teaching Claude how to use the CLI than the official skill)

@pierreboissinot
Author

> Oh also why not include the official skill? Because it might be useful for Claude to use the CLI optimally? (i.e. I'd be afraid we do a poorer job at teaching Claude how to use the CLI than the official skill)

OK, I'll look into that idea on Monday. I'm attending https://shift-hackathon.com this weekend.

  prompt,
  "--allowedTools",
- "mcp__playwright__*",
+ "Bash(playwright-cli:*)",
If I am not mistaken, we can let it load the skill by using:

Bash(playwright-cli:*) Skill

And adding the section below at prompt-template.md:5:

First, invoke the playwright-cli skill (via the Skill tool with skill name "playwright-cli") to set up browser interaction capabilities. If the skill is not available, fall back to running playwright-cli --help to discover available commands.
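
A minimal sketch of how the runner could assemble that argument list, assuming the argument order visible in the diff above (prompt, then --allowedTools, then the tool list). `buildAgentArgs` is a hypothetical helper and not the actual code in src/runner.ts; passing the tools as one space-separated string mirrors the suggestion above and is itself an assumption about what the agent CLI accepts.

```typescript
// Hypothetical helper: builds the trailing part of the agent's argv.
// "Bash(playwright-cli:*)" and "Skill" are the tool names from this thread.
function buildAgentArgs(prompt: string, allowSkill: boolean): string[] {
  const tools = ["Bash(playwright-cli:*)"];
  if (allowSkill) {
    // Lets the agent invoke the Skill tool, so it can load the
    // playwright-cli skill before falling back to --help discovery.
    tools.push("Skill");
  }
  return [prompt, "--allowedTools", tools.join(" ")];
}

console.log(buildAgentArgs("Run the scenario", true));
```

Keeping the skill behind a flag would make it easy to compare agent behavior with and without the official skill loaded.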

@pierreboissinot
Author

pierreboissinot commented Mar 31, 2026

> Very interesting thank you!
>
> What do you think about this mention in the cli readme:
>
> [image]
>
> Do you think it's still worth switching here? If so why?

@mnapoli
It depends on the usage. Personally, I use playwright-cli to run tests and Playwright MCP to heal tests based on a test report.

In my opinion, playwright-cli is faster and costs fewer tokens, so yes, it's worth it.

@mnapoli
Owner

mnapoli commented Mar 31, 2026

Thanks. I'm working on a big refactor and will switch to the CLI version. I've noticed (big surprise) that AI sometimes fails to follow instructions correctly, so I need to improve the internals first.

@pierreboissinot
Author

> Thanks. I'm working on a big refactor and will switch to the CLI version. I've noticed (big surprise) that AI sometimes fails to follow instructions correctly, so I need to improve the internals first.

OK, so should I leave this PR as it is for now?

@mnapoli
Owner

mnapoli commented Mar 31, 2026

Yes, thank you for the contribution! Even if it wasn't that exact code, the idea was great.

I released a pretty major update: https://github.com/mnapoli/exspec/releases/tag/0.3.0

I fixed a lot of weird behaviors, making things (hopefully) much more stable and reliable. And since we have more logs now, we can see where the agent gets confused.

That prompted me to add "Recommendations" -> the agent can now recommend improvements to tests. E.g. I had a confusing test that was wasting time and tokens because the agent had to figure out how to understand the website; now I get told about that.

I think next features could be:

  • report warnings on things spotted but not directly related to the test (e.g. test is passing, but something needs to be flagged)
  • optimizing some tests into code (turning this into a mix of Cucumber and full AI) -> I'm not sure about that, I need to experiment

@mnapoli closed this on Mar 31, 2026

3 participants