appautomaton/webmaton


Webmaton

Acknowledgement

Special thanks to LINUX.DO — originally published on linux.do. Thank you to the community for the incredible support and feedback.

Important

These skills require uv and Python 3.13. Runnable skill entrypoints use PEP 723 inline metadata — dependencies resolve automatically via uv run. No requirements.txt needed.
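As a rough illustration of what PEP 723 inline metadata looks like, here is a minimal sketch of a script that `uv run` can execute without a requirements.txt. The file is hypothetical and not taken from this repository; the `requires-python` bound is an assumption based on the Python 3.13 requirement above, and the dependency list is left empty where a real skill would name its packages.

```python
# /// script
# requires-python = ">=3.13"   # assumed bound, based on the stated Python 3.13 requirement
# dependencies = []            # a real entrypoint would list its packages here
# ///
"""Hypothetical entrypoint showing PEP 723 inline metadata (not a file from this repo)."""
import sys


def describe_runtime() -> str:
    """Return the running interpreter version as 'major.minor'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


if __name__ == "__main__":
    print(f"running on Python {describe_runtime()}")
```

Saved as, say, `example.py`, this would be started with `uv run example.py`; uv reads the comment block at the top and prepares a matching environment before running the script.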

Webmaton is a curated toolkit of portable, high-fidelity agent skills for web work — deep research, page capture, and browser automation. Each skill is self-contained, documented, and designed to drop into any agent runtime (OpenCode, Claude, Codex, and others) with minimal setup.

The name is a portmanteau of web and automaton — tools that let AI agents see, read, and interact with the web the way a human researcher would.


Skills

| Skill | What it does | Best for |
| --- | --- | --- |
| agentic-search | Grok-primary deep research with grounded citations, Tavily/Firecrawl source discovery, verbatim extraction, and rerankable sessions. | Research tasks that need sources, not summaries. |
| html-to-markdown | Browser capture + deterministic HTML→Markdown conversion with metadata, link/image inventory, and quality signals. | Converting JS-heavy pages or static articles into clean, structured Markdown. |
| nodriver-browser | Persistent Chrome/Chromium automation via nodriver: clicks, typing, screenshots, DOM snapshots, and multi-step flows. | Anything that requires interacting with a page like a human (logins, buttons, forms). |
| playwright-cli | Playwright-backed browser sessions with snapshots, element refs, generated test code, storage, network, tracing, and video commands. | Repeatable browser flows, Playwright test debugging, and test generation. |
| chrome-devtools-cli | Chrome DevTools action CLI for snapshots, page interaction, console/network inspection, screenshots, Lighthouse, and performance traces. | Frontend runtime debugging, layout inspection, and performance diagnostics. |

Design principles

  1. Self-contained entrypoints — Every runnable skill entrypoint uses PEP 723 inline metadata, so dependencies resolve automatically via uv run. Private _*.py helper modules are imported by entrypoints and are not standalone commands.
  2. Composable sessions — agentic-search persists research sessions to disk, letting you search, extract quotes, rerank sources, and compose findings across multiple invocations.
  3. Browser-first fidelity — When a page needs JavaScript, login state, or DOM interaction, we reach for a real browser (Chrome → Chromium → Playwright fallback). For static content, we fetch directly. No overkill.
  4. Portable by default — Skills are symlink-friendly and runtime-agnostic. Drop them into ~/.codex/skills/, ~/.claude/skills/, or your agent workspace and they just work.

Quick start

Clone the repository and symlink the skills you need into your agent's skill directory:

git clone <repo-url> webmaton
cd webmaton

# Example: make agentic-search available to Claude
ln -s "$(pwd)/skills/agentic-search" ~/.claude/skills/agentic-search

Each skill's SKILL.md contains invocation examples, reference docs, and failure-mode guidance.
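To link every skill at once rather than one at a time, a small helper along these lines may be convenient. It is a sketch under assumptions: `link_skills` is a hypothetical name, and it treats any folder under `skills/` that contains a SKILL.md as a skill.

```python
from pathlib import Path


def link_skills(repo_skills: Path, target: Path) -> list[Path]:
    """Symlink each skill folder (one containing SKILL.md) into target; skip existing names."""
    target.mkdir(parents=True, exist_ok=True)
    created: list[Path] = []
    for skill in sorted(repo_skills.iterdir()):
        link = target / skill.name
        # Ignore non-skill entries and never overwrite something already in place.
        if not (skill / "SKILL.md").is_file() or link.exists() or link.is_symlink():
            continue
        link.symlink_to(skill.resolve(), target_is_directory=True)
        created.append(link)
    return created


if __name__ == "__main__":
    # Run from the repository root; the target mirrors the Claude example above.
    for link in link_skills(Path("skills"), Path.home() / ".claude" / "skills"):
        print(f"linked {link}")
```

Because existing entries are skipped, running it again after pulling new skills only adds the links that are missing.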


Requirements

  • Python 3.13
  • uv (for uv run script execution)
  • Node.js and npm for CLI-backed browser skills:
    • Node.js 18+ for playwright-cli
    • Node.js 20.19+ plus current Chrome stable for chrome-devtools-cli
  • API keys for providers you plan to use:
    • GROK_API_KEY / GROK_API_URL — for Grok-powered search and fetch
    • TAVILY_API_KEY — for Tavily search and site mapping
    • FIRECRAWL_API_KEY — for Firecrawl fallback fetching
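Before invoking a skill, it can help to check which provider variables are present. The sketch below uses the variable names listed above; the provider labels and the `missing_vars` helper are illustrative, and it assumes both Grok variables are needed, which the list does not state explicitly.

```python
import os
from collections.abc import Mapping

# Variable names come from the list above; the grouping labels are illustrative.
PROVIDER_VARS: dict[str, tuple[str, ...]] = {
    "grok": ("GROK_API_KEY", "GROK_API_URL"),
    "tavily": ("TAVILY_API_KEY",),
    "firecrawl": ("FIRECRAWL_API_KEY",),
}


def missing_vars(env: Mapping[str, str] = os.environ) -> dict[str, list[str]]:
    """Map each provider to the variables that are unset or empty in env."""
    report: dict[str, list[str]] = {}
    for provider, names in PROVIDER_VARS.items():
        missing = [name for name in names if not env.get(name)]
        if missing:
            report[provider] = missing
    return report


if __name__ == "__main__":
    for provider, names in missing_vars().items():
        print(f"{provider}: missing {', '.join(names)}")
```

Only the providers you plan to use need their variables set, so a non-empty report is not necessarily an error.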

Underlying tools

html-to-markdown uses markmaton for deterministic HTML-to-Markdown conversion, with nodriver handling browser-rendered capture when JavaScript is needed.

playwright-cli uses Microsoft's @playwright/cli. chrome-devtools-cli uses the chrome-devtools command from Google's chrome-devtools-mcp package.


License

MIT

About

Portable web-research and browser-automation SKILLs for Claude Code, Codex, and OpenCode — Playwright, Chrome DevTools, nodriver, and HTML-to-Markdown.
