SAY-5/mcp-agentlab

AgentLab

AgentLab is an AI agent orchestration framework with MCP-style tool integrations. The Go orchestrator runs a multi-step agent loop against tool servers that speak JSON-RPC 2.0 over stdio. Every tool response is validated against a JSON Schema declared by the tool itself. Retries are bounded, use exponential backoff, and go through an explicit transient-vs-permanent classifier. Every step writes an OpenTelemetry span carrying the input arguments, a result preview, and the attempt count.

What this studies

This repo is a small lab for four ideas:

  1. MCP-style tool protocols. Tools are subprocesses, not in-process callables. Each one speaks JSON-RPC 2.0 over newline-delimited JSON on stdin/stdout. See docs/jsonrpc-wire.md.
  2. Structured-output validation as a load-bearing piece. Every tool declares a resultSchema in its first tools/list response. The orchestrator validates every response against that schema before passing it to the next step. Validation failures are classified as permanent and are not retried.
  3. OTel-style tracing for agent steps. One root span per run, one span per step, one span per attempt. Result previews, retry counts, and failure classes live on the span attributes. See docs/tracing.md.
  4. Retry classification. Network errors and JSON-RPC -32603 (internal error) are transient and are retried with backoff delays of 100 ms, 400 ms, and 1600 ms. Schema-validation errors, -32002 (permanent tool error), and bad-input codes are permanent and are not retried. See internal/orchestrator/retry.go.
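
The retry rules in item 4 can be sketched as a small classifier plus a fixed backoff schedule. This is an illustration in Python only: the repo's actual classifier is written in Go and lives in internal/orchestrator/retry.go. The names `ToolError`, `classify`, and `run_with_retry`, and the `"network"` / `"rpc"` / `"schema"` kind labels, are hypothetical. The sketch also assumes one initial attempt followed by at most three retries, and that any code other than -32603 is permanent.

```python
import time

# Delays before each retry, as stated above: 100 ms, 400 ms, 1600 ms.
BACKOFF_SECONDS = (0.1, 0.4, 1.6)

TRANSIENT_RPC_CODES = {-32603}  # JSON-RPC internal error


class ToolError(Exception):
    """Hypothetical error type; kind is "network", "rpc", or "schema"."""

    def __init__(self, kind, code=None):
        super().__init__(f"{kind} error (code={code})")
        self.kind = kind
        self.code = code


def classify(err):
    """Return "transient" or "permanent" for a failed attempt."""
    if err.kind == "network":
        return "transient"
    if err.kind == "rpc" and err.code in TRANSIENT_RPC_CODES:
        return "transient"
    # Schema-validation failures, -32002, and bad-input codes land here.
    return "permanent"


def run_with_retry(call, sleep=time.sleep):
    """Run call(), retrying transient failures on the fixed schedule.

    Returns (result, attempts). Re-raises the last error when the failure
    is permanent or the schedule is exhausted.
    """
    attempts = 0
    delays = iter(BACKOFF_SECONDS)
    while True:
        attempts += 1
        try:
            return call(), attempts
        except ToolError as err:
            if classify(err) == "permanent":
                raise
            delay = next(delays, None)
            if delay is None:
                raise
            sleep(delay)
```

Passing `sleep` as a parameter keeps the schedule testable without real waiting. The attempt counter is what would end up on the span attributes.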

How this differs from SAY-5/agentic-runner

The two repos sit on different axes of the agent design space.

| Axis                   | agentic-runner                                  | agentlab (this repo)                                             |
|------------------------|-------------------------------------------------|------------------------------------------------------------------|
| Language split         | Pure Python, single process                     | Go orchestrator + Python tool servers (two processes/tool)       |
| Tool interface         | In-process callables, looked up in a registry   | Subprocess JSON-RPC 2.0 over stdio (MCP-style)                   |
| Tool count and surface | One generic registry; tools added as functions  | 8 distinct tools, each its own Python server                     |
| Provider focus         | Replan loop: validate output, re-decompose plan | Fixed scripted plan; the focus is the orchestrator side          |
| Schema validation      | Pydantic at tool boundary                       | Pydantic at tool boundary plus gojsonschema in Go on every result |
| Tracing                | Python logging + step records                   | OTel-style span tree (root + step + attempt)                     |
| Retry classifier       | Inline per-tool                                 | Centralised in the orchestrator; transient/permanent table       |

They are deliberately complementary. agentic-runner is the place to study provider-driven replanning; agentlab is the place to study the protocol and orchestrator layer.

The 8 tools

Each tool is a separate Python process. The orchestrator spawns it, discovers its schema via tools/list, and routes calls to it via tools/call.
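
The discover-then-call exchange can be pictured as newline-delimited JSON-RPC 2.0 messages. The sketch below is a Python illustration of that framing, not the repo's Go client. The helper names `encode_request`, `decode_response`, and `RpcError` are hypothetical, and the `name` / `arguments` parameter shape and the `expression` argument are assumptions borrowed from MCP conventions. The authoritative message shapes are in docs/jsonrpc-wire.md.

```python
import json


def encode_request(request_id, method, params=None):
    """Serialise one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    # json.dumps escapes newlines inside strings, so one message is one line.
    return json.dumps(message, separators=(",", ":")) + "\n"


class RpcError(Exception):
    """Raised when a response carries a JSON-RPC error object."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def decode_response(line, expected_id):
    """Parse one response line; return its result or raise RpcError."""
    message = json.loads(line)
    if message.get("jsonrpc") != "2.0" or message.get("id") != expected_id:
        raise ValueError("not a matching JSON-RPC 2.0 response")
    if "error" in message:
        raise RpcError(message["error"]["code"], message["error"]["message"])
    return message["result"]


# Discovery first, then a call routed by tool name (argument shape assumed).
discover = encode_request(1, "tools/list")
call = encode_request(
    2,
    "tools/call",
    {"name": "calculate", "arguments": {"expression": "13960000 / 1000000"}},
)
```

Matching the response `id` against the request `id` is what lets the caller detect a desynchronised stream on a shared stdin/stdout pipe.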

| Tool         | Purpose                                                              |
|--------------|----------------------------------------------------------------------|
| file_read    | Read a UTF-8 text file (with byte cap and truncation flag).          |
| file_write   | Write a UTF-8 text file (overwrites; optional create_parents).       |
| http_get     | GET a URL. In CI, served from a fixture map (no real network).       |
| calculate    | Safe arithmetic evaluator using a real recursive-descent parser.     |
| query_db     | SELECT-only queries against a bundled SQLite (cities, countries).    |
| summarize    | Deterministic first-N-sentences summariser. No LLM call.             |
| extract_json | Pull a JSON object from a string and validate against a JSON Schema. |
| finish       | Terminate the loop with a final answer.                              |

See docs/tool-protocol.md for the skeleton if you want to add another one.
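
For orientation, the general shape of such a server is sketched below: read one JSON line, dispatch on the method, write one JSON line back. This is not the skeleton from docs/tool-protocol.md. The `shout` tool, the handler names, and the response layout are hypothetical; only the `tools/list` and `tools/call` method names and the `resultSchema` field come from this README. The -32601 code is the standard JSON-RPC 2.0 "method not found" error.

```python
import json
import sys

# Hypothetical tool used only for illustration; not one of the repo's 8 tools.
TOOL = {
    "name": "shout",
    "description": "Upper-case the input text.",
    "resultSchema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}


def handle(request):
    """Map one decoded JSON-RPC request to one response dict."""
    reply = {"jsonrpc": "2.0", "id": request.get("id")}
    method = request.get("method")
    if method == "tools/list":
        # The schema travels with the listing so the caller can cache it.
        reply["result"] = {"tools": [TOOL]}
    elif method == "tools/call":
        args = request.get("params", {}).get("arguments", {})
        reply["result"] = {"text": str(args.get("text", "")).upper()}
    else:
        reply["error"] = {"code": -32601, "message": "method not found"}
    return reply


def handle_line(line):
    """Decode one request line and return the newline-terminated reply."""
    return json.dumps(handle(json.loads(line))) + "\n"


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Answer newline-delimited requests until stdin closes."""
    for line in stdin:
        if line.strip():
            stdout.write(handle_line(line))
            stdout.flush()  # the parent process is blocked on this line
```

Calling `serve()` at the bottom of the module would turn it into a runnable subprocess. Flushing after every reply matters because the parent reads line by line from a pipe.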

Architecture

                +-----------------------------+
                |     cmd/agentlab (Go)       |
                +--------------+--------------+
                               |
                +--------------v--------------+
                |  internal/orchestrator       |
                |  - step loop                 |
                |  - retry classifier          |
                |  - OTel tracer               |
                +------+------+--------+-------+
                       |      |        |
              +--------v-+ +--v----+ +-v-----------+
              | provider | | tools | |  trace      |
              | (Fake/   | | reg + | |  exporter   |
              | Claude   | | schema| |  (mem/OTLP) |
              | stub)    | | cache | |             |
              +----+-----+ +---+---+ +-------------+
                   |           |
            scripted YAML      | JSON-RPC 2.0 over stdio (one subprocess per tool)
                               v
        +----------+ +-----------+ +----------+ +----------+ ...
        | file_read| | file_write| | calculate| |  finish  |
        | (Python) | | (Python)  | | (Python) | | (Python) |
        +----------+ +-----------+ +----------+ +----------+

Quickstart

Requirements: Go 1.22+, Python 3.11 or 3.12, Docker (optional).

make up        # creates a venv, installs deps, builds bin/agentlab
make demo      # runs the 8-tool demo, writes demo-output/{result,trace}.jsonl
make lint      # golangci-lint + ruff + black --check
make typecheck # go vet + mypy --strict
make test      # go test + pytest + the 8-tool integration

Or via Docker:

docker compose build
docker compose run --rm agentlab

Demo run output

$ make demo
registered 8 tools: [calculate extract_json file_read file_write finish http_get query_db summarize]
steps=8 final_answer="Tokyo, capital of Japan, population ~13.96M" done=true

The plan in tasks/multi_step_demo.yaml touches every tool exactly once, in this order:

| Step | Tool         | Result (truncated)                                                              |
|------|--------------|---------------------------------------------------------------------------------|
| 0    | file_write   | {"path": "/tmp/agentlab-demo.txt", "bytes_written": 139}                        |
| 1    | file_read    | {"content": "Tokyo is the capital of Japan...", "bytes_read": 139, ...}         |
| 2    | calculate    | {"value": 13.96, "expression": "13960000 / 1000000"}                            |
| 3    | query_db     | {"rows": [{"name": "Tokyo", "population": 13960000, "country": "JP"}], ...}     |
| 4    | http_get     | {"status": 200, "url": "http://agentlab.local/tokyo.json", "body": "..."}       |
| 5    | extract_json | {"data": {"city": "Tokyo", "population": 13960000, "country": "JP"}, ...}       |
| 6    | summarize    | {"summary": "Tokyo is the capital of Japan...", "sentence_count": 2}            |
| 7    | finish       | {"answer": "Tokyo, capital of Japan, population ~13.96M", "done": true}         |

Trace numbers from one local run:

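| Metric                                             | Value                            |
|----------------------------------------------------|----------------------------------|
| Spans emitted                                      | 17                               |
| Step spans                                         | 8                                |
| Attempt spans                                      | 8 (no retries; happy-path run)   |
| Sum of step latencies                              | 41.8 ms                          |
| Wall-clock end-to-end (including subprocess spawn) | sub-second                       |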

Latencies vary by machine. The shape (17 spans, 8 steps, 0 retries on the happy path) is asserted by the integration test in tests/integration/demo_test.go.
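
The 17-span figure follows from the span tree shape: one root span, one span per step, and one span per attempt. A small sketch of that arithmetic, with a hypothetical helper name (the real assertion lives in the Go integration test):

```python
def expected_span_count(attempts_per_step):
    """Spans for one run: 1 root + 1 per step + 1 per attempt."""
    return 1 + len(attempts_per_step) + sum(attempts_per_step)


# Happy path: 8 steps, each succeeding on its first attempt.
happy_path = [1] * 8

# Same plan, but one step needs a second attempt after a transient failure.
one_retry = [1, 1, 2, 1, 1, 1, 1, 1]
```

Under this model every retry adds exactly one attempt span, so a run with a single retry would emit 18 spans instead of 17.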

What this is not

  • Not a real MCP implementation. The wire format and tools/list / tools/call names borrow from MCP, but resources, prompts, content blocks, capability negotiation, and sampling are out of scope. See docs/mcp-comparison.md for the full list of deltas.
  • No real LLM in CI. The FakeProvider reads a scripted YAML plan; every test and the make demo flow uses it. A ClaudeProvider stub exists in internal/provider/claude.go to document the BYOK swap path, but it is env-gated and never invoked in CI.
  • No auth on tool servers. The subprocess trust model assumes that whoever spawned the server gets to talk to it. There is no transport encryption or authentication.
  • No streaming tool outputs. Strict request/response.
  • No tool composition or sub-agents. A scripted plan is one linear list of (tool, arguments) pairs.
  • No parallel tool calls. Tool calls are sequential. The companion agentic-runner repo studies a different shape (provider-driven replan); parallel tool dispatch is left for a future repo.
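
The "linear list of (tool, arguments) pairs" constraint can be pictured with a few lines of Python. This is a sketch under assumptions, not the repo's Go step loop: `run_plan` and the in-process stand-in tools are hypothetical, and stopping when a result carries `done: true` is inferred from the finish step's output in the demo table.

```python
def run_plan(plan, tools):
    """Execute (tool, arguments) pairs strictly in order.

    Stops early when a result carries done=True. There is no branching,
    no nesting, and no concurrent dispatch.
    """
    results = []
    for name, arguments in plan:
        result = tools[name](arguments)  # one call at a time
        results.append((name, result))
        if result.get("done"):
            break
    return results


# Stand-in tools; the real ones are subprocesses reached over JSON-RPC.
tools = {
    "calculate": lambda a: {"value": a["x"] / a["y"]},
    "finish": lambda a: {"answer": a["answer"], "done": True},
}

plan = [
    ("calculate", {"x": 13960000, "y": 1000000}),
    ("finish", {"answer": "population ~13.96M"}),
]

results = run_plan(plan, tools)
```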

Project layout

agentlab/
  cmd/agentlab/          Go CLI entrypoint
  internal/
    orchestrator/        agent loop, retry classifier
    jsonrpc/             JSON-RPC 2.0 client + stdio transport
    tools/               registry, schema cache, validation
    provider/            Provider interface, FakeProvider, ClaudeProvider stub
    trace/               span model and exporters (in-memory + OTLP/JSON)
    config/              YAML loader
    chaos/               in-process fault injection for tests
  agentlab_tools/        the 8 Python tool servers + shared protocol base
  tasks/                 the canonical demo plan + top-level config
  tests/                 Go integration + Python unit suites
  docs/                  wire format, MCP comparison, tool-author guide, tracing

License

MIT, see LICENSE.
