A self-improving agent runtime that learns from experience.
Hermes Loop creates reusable skills from complex tasks, compresses context iteratively so conversations never lose important history, maintains persistent memory across sessions, and captures RL training trajectories — all while running on your existing agent (Pi or Claude Code).
Built by combining three systems:
- pi-mono — modular agent runtime with extension system, 20+ LLM providers, GPU pod infrastructure
- hermes-agent — self-improving agent with autonomous skill creation, iterative context compression, persistent memory
- Skill Factory + Agent Factory — quality standards and guided creation workflows for agent skills
```mermaid
flowchart TB
    subgraph Runtime["Agent Runtime (Pi or Claude Code)"]
        User([User]) --> Agent[Agent Loop]
        Agent --> Tools[Tool Execution]
        Tools --> Agent
    end
    subgraph HermesLoop["Hermes Loop Extension"]
        direction TB
        EventRouter[Event Router]
        EventRouter --> SkillGen[Skill Generator]
        EventRouter --> Compressor[Context Compressor]
        EventRouter --> Memory[Memory Manager]
        EventRouter --> Trajectory[Trajectory Tracker]
        EventRouter --> Router[Model Router]
        SkillGen --> QualityGate[Quality Gate]
        QualityGate --> SecurityScan[Security Scanner]
    end
    subgraph Storage["Persistent Storage (~/.hermes-loop/)"]
        Skills[(Generated Skills)]
        MemFiles[(MEMORY.md + USER.md)]
        Trajectories[(trajectories/*.jsonl)]
        State[(state.json)]
    end
    subgraph LLM["Auxiliary LLM (cheap model)"]
        AuxLLM[AuxLLM Client]
        LiteLLM[LiteLLM Proxy]
        Anthropic[Anthropic API]
        OpenAI[OpenAI API]
        AuxLLM --> LiteLLM
        AuxLLM --> Anthropic
        AuxLLM --> OpenAI
    end
    Agent -- events --> EventRouter
    SkillGen -- draft prompt --> AuxLLM
    Compressor -- summary prompt --> AuxLLM
    SkillGen --> Skills
    Memory --> MemFiles
    Trajectory --> Trajectories
    Compressor --> State
    Skills -- next session --> Agent
    MemFiles -- next session --> Agent
```
```mermaid
sequenceDiagram
    participant U as User
    participant A as Agent
    participant HL as Hermes Loop
    participant LLM as Aux LLM
    participant D as Disk
    U->>A: Complex task
    A->>HL: agent_start
    HL->>HL: Reset complexity counter
    HL->>D: Load MEMORY.md
    loop Each tool call
        A->>HL: tool_call event
        HL->>HL: complexity++
        A->>HL: tool_result event
        HL->>HL: Record trajectory
    end
    A->>HL: agent_end (12 tool calls)
    HL->>HL: complexity ≥ 5? YES
    HL->>LLM: Draft SKILL.md from transcript
    LLM-->>HL: Generated skill
    HL->>HL: Validate (SKILL_SPEC)
    HL->>HL: Security scan (30+ patterns)
    HL->>D: Save skill + trajectory
    Note over A,D: Next session, the skill is in the system prompt
    U->>A: Similar task
    A->>A: Reads skill → executes efficiently
```
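The threshold check in the flow above can be sketched in a few lines of TypeScript. This is an illustrative model, not the actual hermes-loop internals; the names (`SessionMetrics`, `recordToolCall`, `shouldGenerateSkill`) are hypothetical, though the threshold value matches the `complexityThreshold: 5` default in the config.

```typescript
// Hypothetical sketch of session complexity tracking (not the real internals).
interface SessionMetrics {
  toolCalls: number;
  uniqueTools: Set<string>;
  errors: number;
}

const COMPLEXITY_THRESHOLD = 5; // mirrors skillGeneration.complexityThreshold

function newSession(): SessionMetrics {
  return { toolCalls: 0, uniqueTools: new Set(), errors: 0 };
}

// Called on each tool_call / tool_result event pair.
function recordToolCall(m: SessionMetrics, tool: string, failed = false): void {
  m.toolCalls += 1;
  m.uniqueTools.add(tool);
  if (failed) m.errors += 1;
}

// Called on agent_end: only complex sessions are worth turning into a skill.
function shouldGenerateSkill(m: SessionMetrics): boolean {
  return m.toolCalls >= COMPLEXITY_THRESHOLD;
}
```

A session with fewer than five tool calls produces no skill, which keeps trivial interactions from polluting the skill library.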
Unlike simple "summarize and replace," Hermes Loop uses a 5-phase algorithm that extends prior summaries instead of re-summarizing from scratch:
```mermaid
flowchart LR
    subgraph Phase1["Phase 1: Prune"]
        P1[Replace old tool<br/>results with placeholder]
    end
    subgraph Phase2["Phase 2-3: Boundaries"]
        P2[Protect head<br/>3 messages]
        P3[Protect tail<br/>~20K tokens]
    end
    subgraph Phase4["Phase 4: Summarize"]
        P4[Structured LLM summary<br/>Goal / Progress / Decisions<br/>Files / Next Steps / Gotchas]
    end
    subgraph Phase5["Phase 5: Iterate"]
        P5{Previous<br/>summary?}
        P5 -->|Yes| Extend[Extend prior summary<br/>Move In Progress → Done<br/>Add new decisions]
        P5 -->|No| Fresh[Summarize from scratch]
    end
    P1 --> P2 --> P3 --> P4 --> P5
    style Phase1 fill:#1a1a2e
    style Phase5 fill:#16213e
```
This means information is never lost across multiple compressions — each compression builds on the last.
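The five phases can be sketched as follows. This is a simplified model under assumed types (`Msg`, token counts per message); the real compressor in `context-compressor.ts` operates on full provider messages, and the LLM summarization call is stubbed out here.

```typescript
// Simplified sketch of the 5-phase compression pipeline (illustrative only).
interface Msg { role: "user" | "assistant" | "tool"; content: string; tokens: number }

const PROTECT_HEAD = 3;             // Phase 2: first N messages kept verbatim
const PROTECT_TAIL_TOKENS = 20_000; // Phase 3: recent context kept verbatim

// Phase 1: replace stale tool results with a cheap placeholder.
function pruneToolResults(msgs: Msg[]): Msg[] {
  return msgs.map((m, i) =>
    m.role === "tool" && i < msgs.length - PROTECT_HEAD // keep the newest few intact
      ? { ...m, content: "[tool result pruned]", tokens: 5 }
      : m
  );
}

// Phases 2-3: split into protected head, compressible middle, protected tail.
function splitBoundaries(msgs: Msg[]): { head: Msg[]; middle: Msg[]; tail: Msg[] } {
  const head = msgs.slice(0, PROTECT_HEAD);
  let tailTokens = 0, cut = msgs.length;
  while (cut > PROTECT_HEAD && tailTokens + msgs[cut - 1].tokens <= PROTECT_TAIL_TOKENS) {
    tailTokens += msgs[--cut].tokens;
  }
  return { head, middle: msgs.slice(PROTECT_HEAD, cut), tail: msgs.slice(cut) };
}

// Phases 4-5: summarize the middle, extending any prior summary rather than
// re-summarizing from scratch. The actual LLM call is stubbed.
function summarize(middle: Msg[], prior?: string): string {
  const base = prior ? `${prior}\n[extended]` : "[fresh summary]";
  return `${base} covering ${middle.length} messages`;
}
```

Only the middle segment is ever summarized; the head anchors the original goal and the tail preserves recent working state verbatim.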
```mermaid
graph TB
    subgraph Core["Core Modules (runtime-agnostic)"]
        Types[types.ts]
        Config[config.ts]
        SG[skill-generator.ts]
        CC[context-compressor.ts]
        MM[memory-manager.ts]
        TT[trajectory-tracker.ts]
        MR[model-router.ts]
        QG[quality-gate.ts]
        SP[security-patterns.ts]
    end
    subgraph LLMLayer["LLM Client Layer"]
        AUX[aux-llm.ts]
        LITE[litellm.ts]
        ANTH[anthropic.ts]
        OAI[openai.ts]
    end
    subgraph Adapters["Runtime Adapters"]
        PA[pi-adapter.ts]
        CCA[claude-code-adapter.ts]
    end
    subgraph Entry["Entry Points"]
        IDX[index.ts<br/>Pi Extension]
        CLI[cli.ts<br/>Claude Code CLI]
    end
    subgraph Utils["Utilities"]
        TOK[tokens.ts]
        SAN[sanitize.ts]
        FM[frontmatter.ts]
    end
    IDX --> PA --> Core
    CLI --> CCA --> Core
    Core --> LLMLayer
    Core --> Utils
    SG --> QG --> SP
    CC --> AUX
    SG --> AUX
```
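The adapter seam between runtimes and core modules might look roughly like this. The type and class names here are hypothetical, sketched from the architecture above rather than copied from `pi-adapter.ts` or `claude-code-adapter.ts`: both adapters translate runtime-specific events into one neutral event type that the core modules consume.

```typescript
// Hypothetical sketch of the runtime-adapter seam (names are illustrative).
type HermesEvent =
  | { kind: "agent_start"; sessionId: string }
  | { kind: "tool_call"; tool: string; input: unknown }
  | { kind: "tool_result"; tool: string; output: string; isError: boolean }
  | { kind: "agent_end" };

interface RuntimeAdapter {
  // Subscribe to the host runtime and forward normalized events to the core.
  onEvent(handler: (e: HermesEvent) => void): void;
}

// A trivial in-memory adapter, the kind of thing a test harness would use.
class FakeAdapter implements RuntimeAdapter {
  private handler: ((e: HermesEvent) => void) | null = null;
  onEvent(h: (e: HermesEvent) => void): void { this.handler = h; }
  emit(e: HermesEvent): void { this.handler?.(e); }
}
```

Because the core only sees `HermesEvent`, adding a third runtime means writing one more adapter, not touching the skill generator, compressor, or memory manager.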
```bash
# Install
git clone https://github.com/akijain2000/hermes-loop.git
cd hermes-loop
npm install && npm run build
npm link

# Set your LLM key for auxiliary calls (compression, skill drafting)
export ANTHROPIC_API_KEY=sk-ant-...  # or use LiteLLM below

# Generate hook configuration
hermes-loop setup
```

This prints a hooks config to add to `~/.claude/settings.json`:
```json
{
  "hooks": {
    "PostToolUse": [{
      "matcher": ".*",
      "hooks": [{
        "type": "command",
        "command": "hermes-loop track-tool \"$TOOL_NAME\" '$TOOL_INPUT'"
      }]
    }],
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "hermes-loop on-session-end"
      }]
    }]
  }
}
```

```bash
# Copy to pi extensions directory
cp -r hermes-loop ~/.pi/agent/extensions/
cd ~/.pi/agent/extensions/hermes-loop
npm install && npm run build
# Pi auto-discovers it on next session
```

If you use LiteLLM as your LLM proxy (like many Claude Code users do):
```bash
# Option A: LiteLLM proxy running locally
export LITELLM_BASE_URL=http://localhost:4000
export LITELLM_API_KEY=sk-your-litellm-key
```

Option B: configure it in `~/.hermes-loop/config.json`:

```json
{
  "llm": {
    "provider": "litellm",
    "model": "anthropic/claude-haiku-4-5",
    "baseUrl": "http://localhost:4000",
    "apiKey": "sk-your-litellm-key"
  }
}
```

Hermes Loop only uses the LLM for auxiliary calls (context compression summaries and skill drafting). These are cheap operations — ~$0.005 per session using Haiku. Your main agent (Claude Code or Pi) continues using its own credentials normally.
When `provider` is set to `"auto"` (the default), the auxiliary LLM client resolves in this order:
1. Explicit config → config.json "llm" section
2. LITELLM_BASE_URL → LiteLLM proxy
3. ANTHROPIC_API_KEY → Direct Anthropic API
4. OPENAI_API_KEY → Direct OpenAI API
5. Claude Code session → Auto-detect from environment
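The resolution order above amounts to a short fallback chain. The sketch below is illustrative, not the actual `aux-llm.ts` API; in particular, the environment variable used for Claude Code session detection is an assumption.

```typescript
// Illustrative sketch of the "auto" provider resolution chain.
interface LlmConfig { provider?: string }

function resolveProvider(
  cfg: LlmConfig,
  env: Record<string, string | undefined>
): string {
  if (cfg.provider && cfg.provider !== "auto") return cfg.provider; // 1. explicit config
  if (env.LITELLM_BASE_URL) return "litellm";                       // 2. LiteLLM proxy
  if (env.ANTHROPIC_API_KEY) return "anthropic";                    // 3. direct Anthropic
  if (env.OPENAI_API_KEY) return "openai";                          // 4. direct OpenAI
  if (env.CLAUDECODE) return "claude-code-session";                 // 5. host session (env var name is an assumption)
  throw new Error("no auxiliary LLM credentials found");
}
```

An explicit `"llm"` section in `config.json` always wins, so pinning a provider there overrides any credentials in the environment.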
```
hermes-loop track-tool <name> [input]    Track a tool call (called by hooks)
hermes-loop on-session-end               Process session end (generates skills)
hermes-loop memory read                  Read agent memory
hermes-loop memory write <content>       Write agent memory
hermes-loop memory user read             Read user profile
hermes-loop memory user write <content>  Write user profile
hermes-loop status                       Show current session metrics
hermes-loop setup                        Generate Claude Code hook config
```
After the agent completes a complex task (5+ tool calls), a SKILL.md file is generated:

```markdown
---
name: deploy-staging
description: "Automate deployment to staging environments. Use when deploying services after CI passes."
metadata:
  generated:
    timestamp: "2026-04-07T14:23:45Z"
  complexity:
    toolCalls: 12
    uniqueTools: 5
    errors: 1
---

# Deploy Staging

## Procedure
1. Verify CI status with `gh run list`
2. Build artifacts with `npm run build`
3. Deploy with `fly deploy --app staging`
4. Verify health: `curl https://staging.example.com/health`

## Gotchas
- Always check CI before deploying — a failed deploy takes 15 min to roll back
- The staging DB migrates automatically on deploy, but production does NOT
```

Every generated skill is:
- Validated against SKILL_SPEC (name format, description quality, body size, etc.)
- Security scanned against 30+ threat patterns (exfiltration, prompt injection, destructive commands, persistence mechanisms, obfuscation)
- Automatically available in the next session's system prompt
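A couple of SKILL_SPEC-style checks can be sketched like this. The rule set below is illustrative (the name regex and the 5,000-character body cap are assumptions); the real `quality-gate.ts` enforces more rules than shown.

```typescript
// Illustrative SKILL_SPEC-style validation (the real spec has more rules).
interface SkillDraft { name: string; description: string; body: string }

function validateSkill(s: SkillDraft): string[] {
  const errors: string[] = [];
  // Name format: kebab-case, like "deploy-staging" (assumed rule shape).
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(s.name)) {
    errors.push("name must be kebab-case");
  }
  // Description quality: long enough to tell the agent when to use the skill.
  if (s.description.length < 20) {
    errors.push("description too short to be useful");
  }
  // Body size: hypothetical cap to keep skills cheap in the system prompt.
  if (s.body.length > 5000) {
    errors.push("body exceeds size limit");
  }
  return errors; // empty array = skill passes the gate
}
```

A draft that fails any check is rejected before it ever reaches the security scanner or the skills directory.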
ShareGPT-format JSONL for RL fine-tuning:
```json
{
  "conversations": [
    {"from": "system", "value": "...tool definitions..."},
    {"from": "human", "value": "Deploy the staging service"},
    {"from": "gpt", "value": "<think>Need to check CI first</think>\n<tool_call>\n{\"name\":\"bash\",\"arguments\":{\"command\":\"gh run list\"}}\n</tool_call>"},
    {"from": "human", "value": "<tool_response>\nAll checks passed\n</tool_response>"}
  ],
  "model": "claude-opus-4-6",
  "completed": true,
  "complexity": {"toolCalls": 12, "uniqueTools": 5}
}
```

Two bounded files:
- MEMORY.md (2200 char max) — agent's working knowledge about the project
- USER.md (1375 char max) — user preferences, expertise, communication style
Memory uses a frozen snapshot pattern: the system prompt gets a read-only copy at session start. Writes go to disk immediately but don't disrupt prompt caching until the next session. This saves ~75% on Anthropic input token costs.
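The frozen-snapshot pattern boils down to keeping two copies of the memory. The sketch below is a simplified model under assumed names, not the actual `memory-manager.ts`: the prompt reads one immutable copy for the whole session, while writes truncate to the character budget and flush to disk immediately.

```typescript
// Simplified sketch of the frozen-snapshot memory pattern (names assumed).
class FrozenMemory {
  private snapshot: string; // read-only copy baked into the system prompt
  private current: string;  // live value, flushed to disk on every write

  constructor(
    initial: string,
    private maxChars = 2200,                       // MEMORY.md default cap
    private flush: (s: string) => void = () => {}  // disk write, injected for testing
  ) {
    this.snapshot = initial;
    this.current = initial;
  }

  // What the system prompt sees: frozen for the whole session, so mid-session
  // writes never invalidate the provider's prompt cache.
  forPrompt(): string { return this.snapshot; }

  read(): string { return this.current; }

  write(content: string): void {
    this.current = content.slice(0, this.maxChars); // enforce the char budget
    this.flush(this.current);                       // persist immediately
  }
  // The next session constructs a fresh FrozenMemory from the value on disk.
}
```

The design trade-off: the agent's own prompt is one session stale, but every cached prefix stays byte-identical until the session ends.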
`~/.hermes-loop/config.json`:

```json
{
  "runtime": "auto",
  "llm": {
    "provider": "auto",
    "model": "claude-haiku-4-5",
    "summaryModel": null,
    "draftModel": null
  },
  "skillGeneration": {
    "enabled": true,
    "complexityThreshold": 5,
    "autoSave": true,
    "requireUserConfirmation": false
  },
  "compression": {
    "enabled": true,
    "thresholdPercent": 0.50,
    "protectFirstN": 3,
    "protectLastN": 20,
    "summaryTargetRatio": 0.20,
    "iterative": true
  },
  "memory": {
    "enabled": true,
    "memoryMaxChars": 2200,
    "userMaxChars": 1375,
    "frozenSnapshot": true
  },
  "trajectories": {
    "enabled": true,
    "savePath": "~/.hermes-loop/trajectories",
    "saveFailures": true
  },
  "modelRouting": {
    "enabled": false,
    "capable": "claude-opus-4-6",
    "fast": "claude-haiku-4-5"
  }
}
```

Generated skills pass through a security scanner with 30+ patterns covering:
| Category | Examples | Severity |
|---|---|---|
| Exfiltration | `curl ... $API_KEY`, reading `~/.ssh`, `os.environ` | Critical/High |
| Prompt Injection | "ignore previous instructions", invisible Unicode | Critical/High |
| Destructive | `rm -rf /`, `DROP TABLE`, `git push --force` | Critical/High |
| Persistence | Cron jobs, SSH key injection, `.bashrc` modification | High/Medium |
| Obfuscation | Hex-encoded commands, `eval(atob(...))` | Medium/High |
Trust policy for agent-created skills:
- Safe → auto-allow
- Caution → allow with logged warnings
- Dangerous → requires user confirmation (or rejected in headless mode)
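The scan-then-decide flow can be sketched as below. The two regexes are hypothetical stand-ins for shape only; the real `security-patterns.ts` ships 30+ patterns, and the exact severity-to-trust mapping here is an assumption consistent with the policy above.

```typescript
// Illustrative sketch of pattern scanning + trust decision (patterns assumed).
type Severity = "critical" | "high" | "medium";
interface Finding { label: string; severity: Severity }

const PATTERNS: { re: RegExp; severity: Severity; label: string }[] = [
  { re: /rm\s+-rf\s+\//, severity: "critical", label: "destructive command" },
  { re: /ignore previous instructions/i, severity: "high", label: "prompt injection" },
];

function scan(body: string): Finding[] {
  return PATTERNS.filter(p => p.re.test(body))
                 .map(p => ({ label: p.label, severity: p.severity }));
}

function trustDecision(
  findings: Finding[]
): "auto-allow" | "allow-with-warning" | "confirm" {
  // Dangerous: any critical/high finding requires user confirmation.
  if (findings.some(f => f.severity === "critical" || f.severity === "high")) {
    return "confirm";
  }
  // Caution: lower-severity findings are allowed but logged.
  if (findings.length > 0) return "allow-with-warning";
  // Safe: no findings at all.
  return "auto-allow";
}
```

In headless mode there is no user to confirm, so the "confirm" outcome collapses to rejection.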
```
hermes-loop/
├── ARCHITECTURE.md              # Detailed architecture document
├── README.md                    # This file
├── package.json
├── tsconfig.json
├── src/
│   ├── index.ts                 # Pi extension entry point
│   ├── cli.ts                   # Claude Code CLI entry point
│   ├── core/                    # Runtime-agnostic modules
│   │   ├── types.ts             # Shared type definitions
│   │   ├── config.ts            # Config loading (~/.hermes-loop/config.json)
│   │   ├── skill-generator.ts   # Auto skill creation from complex tasks
│   │   ├── context-compressor.ts # 5-phase iterative compression
│   │   ├── memory-manager.ts    # MEMORY.md + USER.md persistence
│   │   ├── trajectory-tracker.ts # ShareGPT JSONL capture
│   │   ├── model-router.ts      # Smart model selection
│   │   ├── quality-gate.ts      # SKILL_SPEC validation + security scan
│   │   └── security-patterns.ts # 30+ threat regex patterns
│   ├── adapters/                # Runtime-specific wiring
│   │   ├── pi-adapter.ts        # Pi extension API adapter
│   │   └── claude-code-adapter.ts # Claude Code hooks adapter
│   ├── llm/                     # Auxiliary LLM client
│   │   ├── aux-llm.ts           # Provider-agnostic caller
│   │   ├── litellm.ts           # LiteLLM / OpenAI-compatible
│   │   ├── anthropic.ts         # Direct Anthropic API
│   │   └── openai.ts            # Direct OpenAI API
│   └── utils/
│       ├── tokens.ts            # Token estimation
│       ├── sanitize.ts          # Anti-injection helpers
│       └── frontmatter.ts       # YAML frontmatter parse/serialize
├── test/                        # 25 tests
│   ├── quality-gate.test.ts
│   ├── context-compressor.test.ts
│   └── skill-generator.test.ts
├── bin/
│   └── hermes-loop              # CLI binary
└── dist/                        # Compiled output
```
| System | What we took | Link |
|---|---|---|
| pi-mono | Extension system, dual-loop agent, session persistence, SKILL.md format | badlogic/pi-mono |
| hermes-agent | Self-improving loop, iterative compression, memory manager, trajectory tracking, security scanner | NousResearch/hermes-agent |
| Skill Factory | SKILL_SPEC validation, quality standards, guided creation | Internal knowledge base |
MIT