"Where are my testicles, Summer?"
The simplest multi-agent system. ~570 lines of core Python.
Small enough to read in an evening.
Snuffles is a compact repo for learning how agents work: messages, prompts, tool calls, routing, and event logs. The goal is to show the mechanics without hiding them behind a large framework.
The name comes from Rick and Morty. Snuffles starts out as an ordinary dog, then becomes self-aware after Rick gives him an intelligence helmet. That extra layer turns Snuffles into Snowball. In this repo, the LLM plays the same role: a thin intelligence layer on top of a few simple parts.
That is the idea behind this repo: start with a few small, readable parts, add an LLM, and watch them turn into agent behavior. There is no big framework here on purpose.
Snuffles is not trying to be a production platform. It is a small scaffold for understanding how an agent system works without reading a large codebase first.
OpenAI-compatible API:
```shell
cd snuffles
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

export OPENAI_API_KEY="sk-..."
# Optional: defaults to gpt-4.1-mini
export LLM_MODEL="gpt-4.1-mini"

python examples/01_single_agent.py
```

Local OpenAI-compatible model (Ollama, LM Studio, vLLM, etc.):
```shell
export OPENAI_BASE_URL="http://localhost:11434/v1"
export OPENAI_API_KEY="unused"
export LLM_MODEL="<local-model-name>"
python examples/01_single_agent.py
```

Optional AWS Bedrock path:
```shell
pip install boto3
export LLM_PROVIDER="bedrock"
export AWS_REGION="us-east-1"
export AWS_PROFILE="<your-profile>"
export LLM_MODEL="<bedrock-model-id>"
python examples/01_single_agent.py
```

- `snuffles/message.py` — what a message is, and what gets logged as an event
- `snuffles/agent.py` — the whole agent definition: name, instructions, tools, model
- `snuffles/bus.py` — the two queues that move messages around
- `snuffles/loop.py` — the actual think -> act -> observe loop
- `snuffles/orchestrator.py` — message routing and agent-to-agent re-injection
- `snuffles/trigger.py`, `snuffles/log.py`, `snuffles/llm.py` — proactive wakeups, logging, and provider glue
If you only read two files, read `loop.py` and `orchestrator.py`.
| Primitive | File | What it means |
|---|---|---|
| Message | `message.py` | A unit of communication: sender, to, content, timestamp |
| Agent | `agent.py` | Identity plus instructions plus tools |
| Bus | `bus.py` | The inbound and outbound queues |
| Loop | `loop.py` | Ask the LLM, run tools, feed tool results back |
| Trigger | `trigger.py` | Wake an agent on a timer or file change |
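For intuition, the Message primitive could be sketched as a small dataclass. This is a sketch only: the field names come from the table above, but the actual definition in `message.py` may differ in detail.

```python
# Hypothetical sketch of the Message primitive; not the repo's actual code.
from dataclasses import dataclass, field
import time


@dataclass
class Message:
    sender: str     # who produced the message ("user" or an agent name)
    to: str         # which agent (or "user") should receive it
    content: str    # the text payload
    # Creation time, filled in automatically when the message is built.
    timestamp: float = field(default_factory=time.time)
```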
Supporting modules:
| File | Role |
|---|---|
orchestrator.py |
Pull messages from the bus, route them, and re-inject agent-to-agent replies |
llm.py |
Thin provider layer for OpenAI-compatible APIs and Bedrock |
log.py |
Stdout plus optional JSONL event log |
```
snuffles/
├── snuffles/
│   ├── message.py
│   ├── agent.py
│   ├── bus.py
│   ├── loop.py
│   ├── trigger.py
│   ├── orchestrator.py
│   ├── llm.py
│   └── log.py
├── examples/
│   ├── 01_single_agent.py
│   ├── 02_proactive.py
│   ├── 03_two_agents.py
│   └── 04_delegation.py
└── pyproject.toml
```
When you send:
```python
await bus.send(Message(sender="user", to="assistant", content="What is Tokyo's population?"))
```

Snuffles does exactly this:
1. `Bus.send()` puts the message on the inbound queue.
2. `Orchestrator.run()` receives it and logs `message_routed`.
3. `run_loop()` builds a conversation with:
   - the agent instructions as the system message
   - the inbound message content as the user message
4. The loop calls the LLM.
5. If the LLM asks for tools, Snuffles executes them and appends the tool results.
6. When the LLM returns final text:
   - plain text replies to the original sender
   - a JSON envelope can route to someone else
7. The orchestrator publishes the outbound `Message`.
8. If that message targets another agent, it gets re-injected into the inbound queue.
The examples start `orch.run()` in the background, wait for outbound replies, then
call `orch.stop()` so the script exits cleanly after the lesson is over.
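To make that start/stop pattern concrete, here is a self-contained toy with the same shape: two asyncio queues and a background task. The `ToyOrchestrator` class is invented for illustration; it is not the actual snuffles code.

```python
# A toy illustrating "run in the background, wait for outbound, then stop".
# ToyOrchestrator is hypothetical; the real Orchestrator/Bus differ.
import asyncio


class ToyOrchestrator:
    def __init__(self):
        self.inbound: asyncio.Queue = asyncio.Queue()
        self.outbound: asyncio.Queue = asyncio.Queue()
        self._running = False

    async def run(self):
        # Pull from inbound, "process", publish to outbound until stopped.
        self._running = True
        while self._running:
            msg = await self.inbound.get()
            await self.outbound.put(f"reply to: {msg}")

    def stop(self):
        self._running = False


async def main():
    orch = ToyOrchestrator()
    task = asyncio.create_task(orch.run())   # start in the background
    await orch.inbound.put("hello")
    reply = await orch.outbound.get()        # wait for the outbound reply
    orch.stop()
    task.cancel()                            # unblock the pending queue wait
    return reply


print(asyncio.run(main()))
```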
Snuffles supports one explicit routing convention for final model responses:
```json
{"to": "writer", "content": "Summarize these research notes into a final answer."}
```

If the final model response is valid JSON with string `to` and `content` fields,
Snuffles turns it into the final outbound `Message`.
If the final model response is anything else, Snuffles falls back to:

```python
to = trigger_message.sender
content = response.content
```
This keeps routing explicit. There is no planner layer or built-in message-sending tool.
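The envelope check described above can be sketched as a small helper. `route_final_response` is a hypothetical function written for illustration, not the repo's actual code; only the rule it implements comes from the text.

```python
# Hedged sketch of the routing convention: a JSON envelope with string
# "to" and "content" fields routes explicitly; anything else replies to
# whoever triggered the loop.
import json


def route_final_response(response_text: str, trigger_sender: str) -> dict:
    try:
        data = json.loads(response_text)
        if (isinstance(data, dict)
                and isinstance(data.get("to"), str)
                and isinstance(data.get("content"), str)):
            # Valid envelope: route to the named recipient.
            return {"to": data["to"], "content": data["content"]}
    except json.JSONDecodeError:
        pass
    # Fallback: plain text replies to the original sender.
    return {"to": trigger_sender, "content": response_text}


print(route_final_response('{"to": "writer", "content": "notes"}', "user"))
print(route_final_response("Tokyo has 13.96M people", "user"))
```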
- `examples/01_single_agent.py` — one agent, one tool, one reply, then clean shutdown
- `examples/02_proactive.py` — timer-driven agent; this one is intentionally long-running
- `examples/03_two_agents.py` — `researcher -> writer -> user` with explicit JSON routing
- `examples/04_delegation.py` — `user -> manager -> calculator -> manager -> user`
Every important step is printed to stdout and can also be written to JSONL:
```
[14:32:01.234] message_routed | | from=user, to=assistant, content=What is Tokyo's population?
[14:32:01.235] loop_start | assistant | trigger=What is Tokyo's population?
[14:32:01.236] llm_call | assistant | iteration=1, message_count=2
[14:32:02.891] tool_call | assistant | tool=web_search, args={"query":"Tokyo population"}
[14:32:03.456] tool_result | assistant | tool=web_search, result=Tokyo has 13.96M people
[14:32:03.457] llm_call | assistant | iteration=2, message_count=4
[14:32:04.123] llm_response | assistant | content=The population of Tokyo is approximately 13.96M
[14:32:04.124] loop_end | assistant | to=user, content=The population of Tokyo is approximately 13.96M
```
Write to a file:
```python
from pathlib import Path

from snuffles.log import EventLog  # assuming EventLog is defined in snuffles/log.py

log = EventLog(path=Path("events.jsonl"))
```

Then inspect it:
```shell
cat events.jsonl | jq 'select(.kind == "tool_call")'
cat events.jsonl | jq 'select(.agent == "researcher")'
grep loop_end events.jsonl
```

- No persistent memory across turns
- No planner or task graph layer
- No retries, backoff, or production guardrails
- No extra lifecycle framework beyond the few primitives above
- No GUI or playground in the core repo
This is intentional. Start with the basics first.