leather

Local agent infrastructure in one stdlib-only Go binary.

Leather runs declarative agents on your workstation, server, or Raspberry Pi: scheduled jobs, one-shot runs, webhook-driven workflows, tool calling, and auditable outputs.

No Python stack. No hosted control plane. No broker, telemetry, or dependency pile.

leather run ~/.leather/agents/summarizer.agent.md
leather serve --pretty --stats
leather ingest pr.json --kind github.pr --curing pr-review

Why leather

Instead of…	leather gives you
A Python stack, a virtualenv, and a `requirements.txt` you re-pin every quarter	One Go binary. Stdlib only. `go.mod` has zero `require` entries.
A hosted control plane or always-on SaaS	A single process you run on your workstation, server, or Raspberry Pi. No phone-home. No telemetry. Loopback API by default.
A heavy broker (Redis, RabbitMQ, Temporal) just to fan webhooks into agents	Built-in queues with backpressure, retries, and dead-letter routing. HMAC-validated webhook intake. Same process.
Writing a Python plugin every time an agent needs a tool	Skills, toolsets, MCP servers, and a shell-manifest companion binary — all stdlib Go.
"Context window exceeded" halfway through a multi-megabyte PR	Oversized inputs are paged into bounded cuts; the model only ever sees what fits.
Grepping yesterday's logs to reconstruct what an agent actually did	JSONL run history, deterministic replay, lineage on every artifact (agent, curing, input hide, timestamps).
Picking between a cron tool and a workflow engine	Scheduled agents and webhook-driven multi-agent pipelines, in one binary, side by side.

Examples

Every example is a single make target. Copy the env template, point it at your local model, and go:

Example 02: scheduled agent

cp examples/env.example examples/.env
$EDITOR examples/.env
make example-02

Animated GIF of a terminal running 'make example-02' to start a scheduled agent, then showing the agent's output in the terminal

Example 06: multi-agent curing + devtools UI

make example-06

Animated GIF of a terminal running 'make example-06' to start a multi-agent curing

Screenshot of the devtools UI showing a session timeline with prompt and tool events

The full catalog lives in examples/ — twelve runnable demos from hello-mock to a high-volume CI gate.

Capabilities

Local agent runtime

Agents are Markdown + YAML. They run against any OpenAI-compatible endpoint. Token budgets are tracked per request; context is summarized or truncated before the model's limit is hit.

Tools, skills, MCP, and shell

Skills (*.skill.yaml) define tools plus optional prompt/parameter metadata.
Toolsets (*.toolset.yaml) bundle named tool collections.
MCP servers (mcp-servers.yaml) plug in any stdio‑transport MCP server.
shell-mcp is a companion binary that turns a JSON manifest into a fast local tool surface — git, gh, anything you'd put behind a shell command.

Curings — multi-stage workflows

A *.curing.yaml binds one agent to one input queue. Compose pipelines by writing one curing's output into the next curing's input queue. Runs under plain leather serve; no tannery required.

Queues — per‑curing FIFO with bounded depth, backpressure, configurable concurrency, exponential‑backoff retry, and a dead‑letter queue for items that exhaust their retry budget. Inspectable via /queues and /queues/{name}.
CLI ingest — leather ingest path/to/file --kind <hide-kind> --curing <name> drops a hide directly onto a curing queue, no HTTP needed.
Process lock — non‑blocking <state-dir>/leather.lock; a second leather serve against the same state directory exits with code 2 and a clear stderr message.

HTTP poll workers

*.worker.yaml files under --worker-dir run background HTTP pollers that push results into named queues — RSS feeds, status pages, JSON APIs — with retry/backoff and the same dead-letter routing as curings.

Tannery — HTTP intake, hides, and artifacts

Add a tannery.yaml next to your config and leather serve also stands up:

Intake (POST /intake) and webhooks (POST /webhooks/{name}) — HMAC‑validated, per‑route body‑size caps, source/event matching dispatches into the right curing queue.
Hides — raw inputs (PR threads, API responses, logs, files) stored content‑addressed under hide_dir. Agents only ever see a bounded cut through a paged HideBuffer, so multi‑megabyte inputs can't blow the context window. Browseable via /hides and /hides/{id}.
Artifacts — promoted outputs stored content‑addressed under artifact_dir with lineage (which curing, which input hide(s), which agent, when). Queryable via /artifacts and /artifacts/{id}.

Replay, snapshots, and run history

--persist-runs writes every turn to JSONL with rotation
--replay and --replay-live reconstruct past sessions deterministically
/snapshot captures live state; /replay/control drives playback when the API is enabled

Browser DevTools UI

With --api, leather serve exposes a single-page UI at /ui/devtools.html: session timeline, prompt/tool event inspector, curing flow diagram, queue and worker status, and live SSE updates. No build step, no JS dependencies — served straight from the binary.

HTTP API

Loopback-bound by default (127.0.0.1:7749):

Runtime: /healthz, /status, /metrics, /config, /history
Scheduler & queues: /jobs, /jobs/{id}, /queues, /queues/{name}, /workers
Cache: /cache/stats
Tannery (when enabled): /intake, /webhooks/*, /hides, /hides/{id}, /artifacts, /artifacts/{id}, /curings
Replay: /snapshot, /replay/control
DevTools: /api/devtools/snapshot, /api/devtools/inspect/..., /api/devtools/trace/...

Notifications

Finished agent runs can be delivered to messaging sinks via per-agent output routes. Built-in Telegram and Signal backends ship in internal/notify; curing outputs can additionally land in queues or be promoted to artifacts.

Build your own

1. Write an agent

---
name: summarizer
---
You are a concise planning assistant. Output bullet points only.

That's a complete *.agent.md file. Front matter declares identity, the body is the system prompt.

2. Give it a schedule (optional)

agent: summarizer
schedule: "0 9 * * *"
model: llama3
prompt: Summarize the three most important things to do today.

*.lifecycle.yaml files sit next to agents and carry the when and how.

3. Run it

leather validate                                  # check everything parses
leather run ~/.leather/agents/summarizer.agent.md # run once
leather serve --pretty --stats                    # run on schedule
leather chat --model llama3                       # talk to the model interactively

4. Add a workflow (when one agent isn't enough)

Two agents passing a hide between them, dispatched by a webhook:

# tannery.yaml
hide_dir: ./.tannery/hides
artifact_dir: ./.tannery/artifacts
curing_dir: ./curings
webhooks:
  - name: github
    path: /webhooks/github
    source: github
    secret: "{{env:GITHUB_WEBHOOK_SECRET}}"
routes:
  - name: pr-review
    match:
      source: github
      event_type: pull_request
    hide_kind: github.pull_request
    curing: triage
    queue: triage-in

Each curing binds one agent to one queue. Chain them by writing the output of the first into the input queue of the second:

# curings/triage.curing.yaml — first stage
name: triage
agent: triage      # classifies the PR, tags it
hide_types: [github.pull_request]
queue: triage-in
output:
  queue: review-in # hand off to the reviewer

# curings/review.curing.yaml — second stage
name: review
agent: reviewer    # reads the diff in cuts, writes the verdict
hide_types: [github.pull_request]
queue: review-in
output:
  artifact: true

See examples/06-multi-agent-curing for a working two-curing chain and examples/10-ci-gate for a webhook-driven fan-out.

leather serve --config tannery/config.yaml
# any POST to /webhooks/github with a valid HMAC now enqueues a curing run

You've just built a two-stage agent pipeline triggered by a GitHub webhook, running in a single local process.

The vocabulary

leather uses a deliberate metaphor borrowed from leatherworking. Sixty seconds here makes the rest of the docs read smoothly:

Leather    the CLI/runtime/binary
Tanning    your local working area — configs, agents, curings, tools (a folder)
Tannery    the long-running workspace service composed from queue+worker+scheduler+session
Hide       raw input material — a PR thread, an API response, a log
HideBuffer in-memory paged view of one hide
Cut        the bounded slice an agent actually sees in its context window
Curing     a named N-agent workflow that transforms hides into artifacts
Operation  one agent working on one or more cuts in a single turn
Artifact   stabilized output with lineage (which curing, which hide, when)
Intake     how hides get created (webhook, HTTP poll, CLI ingest, file)

Full reference: docs/GLOSSARY.md.

Install

From source:

git clone https://github.com/tgpski/leather
cd leather
make build && make build-shell-mcp

With go install:

go install github.com/tgpski/leather/cmd/leather@latest
go install github.com/tgpski/leather/cmd/shell-mcp@latest

Verify the install — no LLM endpoint required:

leather --version    # prints version
make example-01      # runs a mock-LLM example end-to-end

Commands

Command	Purpose
`leather serve`	Run scheduler, queue workers, and (when enabled) HTTP API, tannery, or replay.
`leather run`	Execute one agent definition once and exit.
`leather chat`	Interactive chat session with token‑budget management.
`leather ingest`	Create a hide from a file or stdin and (optionally) enqueue a curing.
`leather validate`	Validate config, agents, lifecycles, skills, workers, and MCP servers.
`leather test-agent`	Run an agent against `MockLLM` and print the transcript.
`leather status`	Print scheduler state and current token‑budget settings.
`leather replay`	Replay a snapshot or live session.
`leather version` / `leather help`	The obvious.

Configuration

config.Load seeds defaults from built‑ins and LEATHER_* env vars, overlays config.yaml, then applies explicitly‑set CLI flags. In order of precedence:

Explicit flag
YAML config
Environment variable
Built‑in default

Every flag has a matching env var: --flag-name → LEATHER_FLAG_NAME.

Shared flags (click to expand)

Flag	Env var	Default	Notes
`--config`	`LEATHER_CONFIG`	`~/.leather/config.yaml`	Config file path.
`--agent-dir`	`LEATHER_AGENT_DIR`	`~/.leather/agents`	Agent and lifecycle directory.
`--model`	`LEATHER_MODEL`	empty	Global default model name.
`--temperature`	`LEATHER_TEMPERATURE`	`0.7`	Global default sampling temperature.
`--log-level`	`LEATHER_LOG_LEVEL`	`info`	`debug`, `info`, `warn`, `error`.
`--log-format`	`LEATHER_LOG_FORMAT`	`text`	`text` or `json`.
`--max-tokens`	`LEATHER_MAX_TOKENS`	`8192`	Default max context window.
`--completion-reserve`	`LEATHER_COMPLETION_RESERVE`	`1024`	Tokens reserved for completion.
`--summarize-threshold`	`LEATHER_SUMMARIZE_THRESHOLD`	`0.85`	Summarization trigger ratio.
`--llm-endpoint`	`LEATHER_LLM_ENDPOINT`	`http://localhost:11434`	OpenAI‑compatible base URL.
`--llm-api-key`	`LEATHER_LLM_API_KEY`	empty	Bearer token for cloud OpenAI‑compatible endpoints. Empty = no auth.
`--llm-timeout`	`LEATHER_LLM_TIMEOUT`	`60s`	Request timeout.
`--scheduler-tick`	`LEATHER_SCHEDULER_TICK`	`1m`	Scheduler wake interval.
`--max-concurrent-jobs`	`LEATHER_MAX_CONCURRENT_JOBS`	`4`	Scheduler concurrency cap.
`--run-duration`	`LEATHER_RUN_DURATION`	`0`	Serve exits after this duration; `0` means unlimited.
`--max-jobs`	`LEATHER_MAX_JOBS`	`0`	Serve exits after this many completed jobs; `0` means unlimited.
`--state-dir`	`LEATHER_STATE_DIR`	`~/.leather/.state`	Scheduler state root.
`--api`	`LEATHER_API`	`false`	Enable serve HTTP API.
`--api-addr`	`LEATHER_API_ADDR`	`127.0.0.1:7749`	HTTP API bind address. Loopback recommended; see security note below.
`--log-file`	`LEATHER_LOG_FILE`	empty	Tee structured logs to file, or file‑only in pretty mode.
`--pretty`	`LEATHER_PRETTY`	`false`	Human‑readable console rendering.
`--pretty-mode`	`LEATHER_PRETTY_MODE`	`all`	`messages` or `all`.
`--stats`	`LEATHER_STATS`	`false`	Print token stats and summaries.
`--tokens-per-turn`	`LEATHER_TOKENS_PER_TURN`	`false`	Print per‑turn token usage in pretty mode.
`--persist-runs`	`LEATHER_PERSIST_RUNS`	`false`	Persist JSONL run history.
`--run-history-dir`	`LEATHER_RUN_HISTORY_DIR`	empty	Defaults to `<state-dir>/runs` when persistence is enabled.
`--run-max-bytes`	`LEATHER_RUN_MAX_BYTES`	`10485760`	Rotate run logs at this size.
`--replay`	`LEATHER_REPLAY`	empty	Snapshot replay file.
`--replay-live`	`LEATHER_REPLAY_LIVE`	empty	Live replay directory.
`--replay-speed`	`LEATHER_REPLAY_SPEED`	`1.0`	Replay speed multiplier.
`--tool-dir`	`LEATHER_TOOL_DIR`	empty	Directory containing `.skill.yaml` and `.toolset.yaml`.
`--default-toolsets`	`LEATHER_DEFAULT_TOOLSETS`	empty	Comma‑separated toolsets applied to every agent.
`--max-tool-rounds`	`LEATHER_MAX_TOOL_ROUNDS`	`5`	Default tool‑call round cap.
`--worker-dir`	`LEATHER_WORKER_DIR`	empty	Directory containing `*.worker.yaml`.
`--cache-dir`	`LEATHER_CACHE_DIR`	empty	Defaults to `<state-dir>/cache` in serve mode.
`--mcp-servers-file`	`LEATHER_MCP_SERVERS_FILE`	empty	`run`, `serve`, and `validate` fall back to `~/.leather/mcp-servers.yaml`.
`--loop`	`LEATHER_LOOP`	`1`	Repeat `leather run` this many times.

Cloud LLM endpoints

leather speaks the OpenAI Chat Completions API, so any cloud or self‑hosted endpoint that implements the same spec works without a Go recompile — only llm_endpoint, model, and an API key change.

Provide the bearer token in whichever form fits your deployment:

# Inline (good for one‑offs)
leather run --agent hello \
  --llm-endpoint https://api.openai.com \
  --model gpt-4o-mini \
  --llm-api-key "sk-..."

# Environment variable
export LEATHER_LLM_API_KEY="sk-..."
leather serve --llm-endpoint https://api.openai.com --model gpt-4o-mini

For long‑running servers the recommended form is the Unix pass store, with an env var as fallback. In config.yaml:

llm_endpoint: https://api.openai.com
model: gpt-4o-mini
llm_api_key:
  pass: openai/api-key       # `pass show openai/api-key`
  env:  OPENAI_API_KEY       # fallback when pass is empty or unavailable

Go deeper

Runnable examples → examples/ — end-to-end demos, each one a single make target away
Architecture & data flow → docs/ARCHITECTURE.md
Glossary → docs/GLOSSARY.md
Per‑package docs → docs/modules/
AI agent contributor guide → AGENTS.md
Domain‑specific subagent guides → .subagents/

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.agents		.agents
.github		.github
.subagents		.subagents
cmd		cmd
docs		docs
examples		examples
internal		internal
schemas		schemas
scripts		scripts
ui		ui
.gitignore		.gitignore
.golangci.yml		.golangci.yml
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
ROADMAP.md		ROADMAP.md
SECURITY.md		SECURITY.md
doc.go		doc.go
go.mod		go.mod

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

leather

Why leather

Examples

Capabilities

Local agent runtime

Tools, skills, MCP, and shell

Curings — multi-stage workflows

HTTP poll workers

Tannery — HTTP intake, hides, and artifacts

Replay, snapshots, and run history

Browser DevTools UI

HTTP API

Notifications

Build your own

1. Write an agent

2. Give it a schedule (optional)

3. Run it

4. Add a workflow (when one agent isn't enough)

The vocabulary

Install

Commands

Configuration

Cloud LLM endpoints

Go deeper

About

Uh oh!

Releases 3

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

leather

Why leather

Examples

Capabilities

Local agent runtime

Tools, skills, MCP, and shell

Curings — multi-stage workflows

HTTP poll workers

Tannery — HTTP intake, hides, and artifacts

Replay, snapshots, and run history

Browser DevTools UI

HTTP API

Notifications

Build your own

1. Write an agent

2. Give it a schedule (optional)

3. Run it

4. Add a workflow (when one agent isn't enough)

The vocabulary

Install

Commands

Configuration

Cloud LLM endpoints

Go deeper

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages