📖 Installation Guide — quick start, manual setup, and troubleshooting
FastMCP 3.1 helper for Arcade ToolBench: curated links, rescoring steps, Glama vs ToolBench notes, and an optional explanation of Arcade's own MCP product (their hosted integrations runtime, separate from grading).
This project is not affiliated with Arcade. It does not call ToolBench HTTP APIs; the optional Playwright scraper loads public pages like a browser. You are responsible for complying with Arcade's terms and for using the scraper only in line with the intended use section below.
git clone https://github.com/sandraschi/toolbench-mcp
cd toolbench-mcp
just
This opens an interactive dashboard showing all available commands. Run `just bootstrap` to install dependencies, then `just serve` or `just dev` to start.
If you don't have just installed:
If you have never heard of it, you are not alone; it is easy to miss.
ToolBench is a public website run by Arcade at toolbench.arcade.dev. It is part of Arcade's ecosystem around MCP servers (the protocol many AI coding tools use to expose tools to models).
Roughly, ToolBench is where Arcade:
- Indexes and grades MCP servers that appear in their ecosystem (report cards, scores, methodology).
- Publishes methodology (how scores are derived), improve guidance, and a submit flow so maintainers can request rescoring after changes.
- Surfaces links to related Arcade docs (e.g. agentic tool patterns).
It is not the same thing as:
- Glama (glama.ai) or other directories: those are mainly for discovery; ToolBench tends to emphasize transparent methodology and actionable report pages for servers Arcade tracks.
- Arcade's MCP product (hosted integrations like Gmail/Slack through their runtime): that is a separate product for using Arcade's tools with auth; it does not replace reading your ToolBench report or improving your own server repo.
This repo (toolbench-mcp) gives you a small MCP server plus a local webapp so agents and humans can open the same links, follow rescoring steps, and optionally run a bundled Playwright scraper to save offline copies of assessment pages (respect rate limits; see the script's `--delay-seconds` / `--jitter-seconds`).
Not legal advice. Whether automated access is allowed is governed by Arcade's current policies (and any terms shown on ToolBench). Read their legal pages yourself, for example Arcade Terms of Service and Arcade Privacy Policy. ToolBench is an Arcade surface, so those documents are the right starting point unless Arcade publishes ToolBench-specific terms elsewhere.
How this project expects you to use the scraper: as a maintainer archiving or reviewing feedback about your own MCP servers (public report pages that concern your repos) so you can fix issues and request rescoring through official flows. That is a narrow, legitimate engineering use case; it is not a license to bulk-harvest other people's report cards or to run aggressive crawls against production.
Safer patterns:
- Prefer `scrape` with a `urls.txt` you built yourself (only URLs for your servers' assessments). That avoids the `discover` path, which follows search-result pagination and can collect many third-party URLs if you point it at a broad listing.
- Use generous `--delay-seconds` / `--jitter-seconds` and do not run multiple scrapers in parallel against ToolBench.
- When Arcade offers an official path (e.g. Submit / rescoring, or API access if you are approved), prefer that over scraping for anything business-critical.
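The pacing advice above can be illustrated with a minimal Python sketch. It assumes each pause is the base delay plus a uniformly random jitter; how the bundled script actually interprets `--delay-seconds` and `--jitter-seconds` is not documented here, and `fetch_sequentially` and its `fetch` callback are hypothetical names used only for illustration.

```python
import random
import time
from typing import Callable, Iterable, List


def fetch_sequentially(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 5.0,
    jitter_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch URLs one at a time, pausing between requests (never in parallel).

    Assumption: pause = delay_seconds + uniform(0, jitter_seconds).
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds + random.uniform(0.0, jitter_seconds))
        results.append(fetch(url))
    return results
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting; in normal use the default `time.sleep` applies.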
If you are unsure, ask Arcade (support or your contact) before scaling automated access.
Only if you want their hosted integrations (Gmail, Slack, GitHub, ...) through Arcade's MCP clients. That product is separate from ToolBench grading. It will not by itself raise scores on your GitHub MCP servers; use methodology + repo fixes for that.
| Tool | Purpose |
|---|---|
| `toolbench_guide` | `operation`: `get_help` \| `list_official_links` \| `rescoring_after_improvements` \| `glama_vs_toolbench` \| `arcade_mcp_product` |
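As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC `tools/call` request in the shape the MCP specification uses. The tool name and the `operation` values come from the table above; whether the tool accepts further arguments is not stated here, and `build_tool_call` is a hypothetical helper.

```python
import json

# Operation values as listed in the tool table.
OPERATIONS = (
    "get_help",
    "list_official_links",
    "rescoring_after_improvements",
    "glama_vs_toolbench",
    "arcade_mcp_product",
)


def build_tool_call(operation: str, request_id: int = 1) -> dict:
    """Build an MCP tools/call request for toolbench_guide (illustrative only)."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "toolbench_guide",
            "arguments": {"operation": operation},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_tool_call("get_help"), indent=2))
```

In practice an MCP client library would send this for you after the usual initialization handshake; the sketch only shows the payload.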
The scraper lives in this repo:
`scripts/scrape_toolbench_assessments.py`: `discover` assessment URLs from a ToolBench search results page (with pagination), `scrape` a list of URLs, or run `full` (discover then scrape). Output defaults to a per-run directory under `scrape_out/` when used from the webapp API.
The webapp drives that script via the backend:
- `GET /api/scraper/status`: script path, Playwright installed, output root
- `POST /api/scraper/discover|scrape|full`: same args as the CLI (JSON body)
- `GET /api/scraper/tree?subdir=`: list output files under `scrape_out/<subdir>/`
- `GET /api/scraper/file?rel_path=`: preview a file (path relative to output root)
- `DELETE /api/scraper/output?subdir=`: clear one subfolder
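The read-only endpoints listed above could be called from Python roughly as follows. This is a sketch that assumes the backend answers on the default `127.0.0.1:10817` and returns JSON; the response fields are not documented here, and `scraper_url` and `get_json` are hypothetical helpers.

```python
import json
import urllib.parse
import urllib.request

# Assumption: default backend host and port from this README.
BASE_URL = "http://127.0.0.1:10817"


def scraper_url(endpoint: str, **query: str) -> str:
    """Build a URL such as /api/scraper/tree?subdir=<name>."""
    url = f"{BASE_URL}/api/scraper/{endpoint}"
    if query:
        url += "?" + urllib.parse.urlencode(query)
    return url


def get_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON (requires the backend to be running)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `get_json(scraper_url("status"))` would fetch the status payload, and `scraper_url("tree", subdir="some-run")` builds the listing URL for one output subfolder.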
Install Playwright into this venv: pip install -e ".[scraper]" then python -m playwright install chromium.
| Env | Purpose |
|---|---|
| `TOOLBENCH_SCRAPER_SCRIPT` | Override path to `scrape_toolbench_assessments.py` (default: `<repo>/scripts/scrape_toolbench_assessments.py`) |
| `TOOLBENCH_SCRAPER_OUTPUT_ROOT` | Override output root (default: `<repo>/scrape_out`) |
- stdio: `toolbench-mcp` or `python -m toolbench_mcp`
- HTTP: `toolbench-mcp --serve`, MCP at `http://127.0.0.1:10817/mcp` (see `/health`)
Iron Shell: sidebar (Home, Tools, Apps, Help, Settings), topbar + breadcrumbs, main scroll, logger panel (timestamped levels, auto-scroll with pause-on-scroll-up), SOTA Chat FAB (Ctrl+K / Cmd+K; personas: Reductionist / Debugger / Explainer; logs to panel until an LLM is wired). Dark default + light in Settings. Glom On: `GET /api/meta/local-llm` probes Ollama (port 11434) and LM Studio (port 1234). `/apps` loads `public/glama.json`. `/tools` = MCP inspector + Playwright scraper.
| Backend | Path |
|---|---|
| MCP manifest | GET /api/meta/tools |
| Local LLM probe | GET /api/meta/local-llm |
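To give a feel for what a local LLM probe involves, here is a minimal sketch that checks whether anything accepts TCP connections on the Ollama and LM Studio ports mentioned above. It is not the backend's implementation, which may well probe over HTTP instead; `port_is_open` and `probe_local_llms` are hypothetical names.

```python
import socket
from typing import Dict

# Ports as stated in this README for the two local runtimes.
LOCAL_LLM_PORTS = {"ollama": 11434, "lm_studio": 1234}


def port_is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_local_llms(host: str = "127.0.0.1") -> Dict[str, bool]:
    """Map each known runtime name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in LOCAL_LLM_PORTS.items()}
```

An open port only shows that something is listening; it does not confirm which service it is.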
| Role | Port |
|---|---|
| Vite (UI) | 10816 |
| FastAPI + MCP | 10817 |
cd D:\Dev\repos\toolbench-mcp
.\start.ps1
Double-click start.bat (repo root) or webapp\start.bat (frontend only) if you prefer a launcher without typing.
webapp\start.ps1 alone: if nothing is listening on 10817, it opens a second PowerShell window with python -m toolbench_mcp --serve, waits for /health, then starts Vite so you avoid ECONNREFUSED on the proxy. If the backend never comes up (venv missing, wrong cwd), start it manually: cd to repo root, .\.venv\Scripts\Activate.ps1, python -m toolbench_mcp --serve.
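The wait-for-`/health` step described above can be sketched in Python as a simple polling loop. This is an illustration rather than a port of `start.ps1`: it assumes `/health` answers with HTTP 200 once the backend is up, and the attempt count and interval are arbitrary choices.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# Assumption: default backend address from this README.
HEALTH_URL = "http://127.0.0.1:10817/health"


def backend_is_healthy(url: str = HEALTH_URL, timeout: float = 1.0) -> bool:
    """Return True if the health endpoint responds with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_for_backend(
    check: Callable[[], bool] = backend_is_healthy,
    attempts: int = 30,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until check() succeeds or the attempts run out."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(interval)  # wait before the next probe
    return False
```

Starting the frontend only after `wait_for_backend()` returns `True` avoids the proxy hitting a port nobody is listening on yet.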
Or use two terminals. Terminal A: `python -m toolbench_mcp --serve`. Terminal B: `cd webapp; npm install; npm run dev`.
| Variable | Default |
|---|---|
| `TOOLBENCH_MCP_HOST` | `127.0.0.1` |
| `TOOLBENCH_MCP_PORT` | `10817` |
| `TOOLBENCH_MCP_HTTP_PATH` | `/mcp` |
| `TOOLBENCH_MCP_WEBAPP_PORT` | `10816` |
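A short sketch of how these variables combine into the MCP endpoint URL, using the defaults from the table. The `mcp_url` helper is hypothetical, and the project's own settings code may resolve them differently.

```python
import os
from typing import Mapping


def mcp_url(env: Mapping[str, str] = os.environ) -> str:
    """Compose the MCP HTTP endpoint from environment variables with documented defaults."""
    host = env.get("TOOLBENCH_MCP_HOST", "127.0.0.1")
    port = int(env.get("TOOLBENCH_MCP_PORT", "10817"))
    path = env.get("TOOLBENCH_MCP_HTTP_PATH", "/mcp")
    return f"http://{host}:{port}{path}"


if __name__ == "__main__":
    print(mcp_url())
```

With no overrides set, this yields the `http://127.0.0.1:10817/mcp` address mentioned earlier.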
cd D:\Dev\repos\toolbench-mcp
py -3 -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -e ".[dev]"
ruff check src
pytest
This project adheres to SOTA 14.1 industrial standards for high-fidelity agentic orchestration:
- Python (Core): Ruff for linting and formatting. Zero-tolerance for `print` statements in core handlers (T201).
- Webapp (UI): Biome for sub-millisecond linting. Strict `noConsoleLog` enforcement.
- Protocol Compliance: Hardened `stdout`/`stderr` isolation to ensure crash-resistant JSON-RPC communication.
- Automation: Justfile recipes for all fleet operations (`just lint`, `just fix`, `just dev`).
- Security: Automated audits via `bandit` and `safety`.
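The `stdout`/`stderr` isolation point is easiest to see in code. On the stdio transport, `stdout` carries newline-delimited JSON-RPC messages, so diagnostics have to go elsewhere. The sketch below shows one common way to do that with the standard `logging` module; it is not taken from this project's source, and the logger name and helper functions are hypothetical.

```python
import json
import logging
import sys


def configure_stderr_logging(name: str = "toolbench_demo") -> logging.Logger:
    """Return a logger that writes only to stderr, leaving stdout for protocol data."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from any root handlers
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def write_message(message: dict, stream=sys.stdout) -> None:
    """Write one JSON-RPC message as a single line and flush immediately."""
    stream.write(json.dumps(message) + "\n")
    stream.flush()
```

A stray `print` in a handler would interleave plain text with these JSON lines and break the client's parser, which is why the Ruff T201 rule is a natural fit for this kind of server.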
MIT