# Hermes Loop

A self-improving agent runtime that learns from experience.

Hermes Loop creates reusable skills from complex tasks, compresses context iteratively so conversations never lose important history, maintains persistent memory across sessions, and captures RL training trajectories — all while running on your existing agent (Pi or Claude Code).

Built by combining three systems:

- **pi-mono** — modular agent runtime with extension system, 20+ LLM providers, GPU pod infrastructure
- **hermes-agent** — self-improving agent with autonomous skill creation, iterative context compression, persistent memory
- **Skill Factory + Agent Factory** — quality standards and guided creation workflows for agent skills

## How It Works

```mermaid
flowchart TB
    subgraph Runtime["Agent Runtime (Pi or Claude Code)"]
        User([User]) --> Agent[Agent Loop]
        Agent --> Tools[Tool Execution]
        Tools --> Agent
    end

    subgraph HermesLoop["Hermes Loop Extension"]
        direction TB
        EventRouter[Event Router]

        EventRouter --> SkillGen[Skill Generator]
        EventRouter --> Compressor[Context Compressor]
        EventRouter --> Memory[Memory Manager]
        EventRouter --> Trajectory[Trajectory Tracker]
        EventRouter --> Router[Model Router]

        SkillGen --> QualityGate[Quality Gate]
        QualityGate --> SecurityScan[Security Scanner]
    end

    subgraph Storage["Persistent Storage (~/.hermes-loop/)"]
        Skills[(Generated Skills)]
        MemFiles[(MEMORY.md + USER.md)]
        Trajectories[(trajectories/*.jsonl)]
        State[(state.json)]
    end

    subgraph LLM["Auxiliary LLM (cheap model)"]
        AuxLLM[AuxLLM Client]
        LiteLLM[LiteLLM Proxy]
        Anthropic[Anthropic API]
        OpenAI[OpenAI API]
        AuxLLM --> LiteLLM
        AuxLLM --> Anthropic
        AuxLLM --> OpenAI
    end

    Agent -- events --> EventRouter
    SkillGen -- draft prompt --> AuxLLM
    Compressor -- summary prompt --> AuxLLM
    SkillGen --> Skills
    Memory --> MemFiles
    Trajectory --> Trajectories
    Compressor --> State
    Skills -- next session --> Agent
    MemFiles -- next session --> Agent
```

## The Learning Loop

```mermaid
sequenceDiagram
    participant U as User
    participant A as Agent
    participant HL as Hermes Loop
    participant LLM as Aux LLM
    participant D as Disk

    U->>A: Complex task
    A->>HL: agent_start
    HL->>HL: Reset complexity counter
    HL->>D: Load MEMORY.md

    loop Each tool call
        A->>HL: tool_call event
        HL->>HL: complexity++
        A->>HL: tool_result event
        HL->>HL: Record trajectory
    end

    A->>HL: agent_end (12 tool calls)
    HL->>HL: complexity ≥ 5? YES

    HL->>LLM: Draft SKILL.md from transcript
    LLM-->>HL: Generated skill

    HL->>HL: Validate (SKILL_SPEC)
    HL->>HL: Security scan (30+ patterns)
    HL->>D: Save skill + trajectory

    Note over A,D: Next session, the skill is in the system prompt
    U->>A: Similar task
    A->>A: Reads skill → executes efficiently
```
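The per-session complexity tracking in this loop can be sketched as a small event handler. This is an illustrative sketch only: the names (`SessionTracker`, `AgentEvent`) and the event shapes are assumptions, not the project's actual API.

```typescript
// Hypothetical sketch of the event-driven complexity tracking shown in the
// sequence diagram. Names and types are illustrative, not the real code.
type AgentEvent =
  | { kind: "agent_start" }
  | { kind: "tool_call"; tool: string }
  | { kind: "tool_result"; output: string }
  | { kind: "agent_end" };

class SessionTracker {
  private toolCalls = 0;
  private tools = new Set<string>();
  readonly threshold: number;

  constructor(threshold = 5) {
    this.threshold = threshold;
  }

  handle(event: AgentEvent): { shouldGenerateSkill: boolean } {
    switch (event.kind) {
      case "agent_start":
        this.toolCalls = 0;        // reset complexity counter
        this.tools.clear();
        break;
      case "tool_call":
        this.toolCalls++;          // complexity++
        this.tools.add(event.tool);
        break;
      case "agent_end":
        // complexity >= threshold triggers skill drafting from the transcript
        return { shouldGenerateSkill: this.toolCalls >= this.threshold };
    }
    return { shouldGenerateSkill: false };
  }

  get complexity() {
    return { toolCalls: this.toolCalls, uniqueTools: this.tools.size };
  }
}
```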

## Iterative Context Compression

Unlike simple "summarize and replace," Hermes Loop uses a 5-phase algorithm that extends prior summaries instead of re-summarizing from scratch:

```mermaid
flowchart LR
    subgraph Phase1["Phase 1: Prune"]
        P1[Replace old tool<br/>results with placeholder]
    end

    subgraph Phase2["Phase 2-3: Boundaries"]
        P2[Protect head<br/>3 messages]
        P3[Protect tail<br/>~20K tokens]
    end

    subgraph Phase4["Phase 4: Summarize"]
        P4[Structured LLM summary<br/>Goal / Progress / Decisions<br/>Files / Next Steps / Gotchas]
    end

    subgraph Phase5["Phase 5: Iterate"]
        P5{Previous<br/>summary?}
        P5 -->|Yes| Extend[Extend prior summary<br/>Move In Progress → Done<br/>Add new decisions]
        P5 -->|No| Fresh[Summarize from scratch]
    end

    P1 --> P2 --> P3 --> P4 --> P5

    style Phase1 fill:#1a1a2e
    style Phase5 fill:#16213e
```

This means information is never lost across multiple compressions — each compression builds on the last.
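The Phase 5 "extend rather than re-summarize" step can be sketched as a merge over a structured summary. This is a minimal sketch under assumed types; the real compressor works on LLM-generated text sections, and the `Summary` shape below is illustrative.

```typescript
// Illustrative sketch of Phase 5: extend the prior summary instead of
// re-summarizing from scratch. The Summary shape is an assumption.
interface Summary {
  goal: string;
  done: string[];
  inProgress: string[];
  decisions: string[];
}

// Merge newly observed progress into the previous summary. Items that were
// "in progress" and have since finished move to "done"; nothing is dropped.
function extendSummary(prev: Summary, finished: string[], newDecisions: string[]): Summary {
  const done = [...prev.done, ...prev.inProgress.filter(t => finished.includes(t))];
  const inProgress = prev.inProgress.filter(t => !finished.includes(t));
  return {
    goal: prev.goal,                                   // the goal is stable across compressions
    done,
    inProgress,
    decisions: [...prev.decisions, ...newDecisions],   // decisions only accumulate
  };
}
```

Because each pass only appends to and reclassifies the prior summary, earlier facts survive an arbitrary number of compressions.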

## Architecture

```mermaid
graph TB
    subgraph Core["Core Modules (runtime-agnostic)"]
        Types[types.ts]
        Config[config.ts]
        SG[skill-generator.ts]
        CC[context-compressor.ts]
        MM[memory-manager.ts]
        TT[trajectory-tracker.ts]
        MR[model-router.ts]
        QG[quality-gate.ts]
        SP[security-patterns.ts]
    end

    subgraph LLMLayer["LLM Client Layer"]
        AUX[aux-llm.ts]
        LITE[litellm.ts]
        ANTH[anthropic.ts]
        OAI[openai.ts]
    end

    subgraph Adapters["Runtime Adapters"]
        PA[pi-adapter.ts]
        CCA[claude-code-adapter.ts]
    end

    subgraph Entry["Entry Points"]
        IDX[index.ts<br/>Pi Extension]
        CLI[cli.ts<br/>Claude Code CLI]
    end

    subgraph Utils["Utilities"]
        TOK[tokens.ts]
        SAN[sanitize.ts]
        FM[frontmatter.ts]
    end

    IDX --> PA --> Core
    CLI --> CCA --> Core
    Core --> LLMLayer
    Core --> Utils
    SG --> QG --> SP
    CC --> AUX
    SG --> AUX
```

## Quick Start

### With Claude Code

```bash
# Install
git clone https://github.com/akijain2000/hermes-loop.git
cd hermes-loop
npm install && npm run build
npm link

# Set your LLM key for auxiliary calls (compression, skill drafting)
export ANTHROPIC_API_KEY=sk-ant-...     # or use LiteLLM below

# Generate hook configuration
hermes-loop setup
```

This prints a hooks configuration to add to `~/.claude/settings.json`:

```json
{
  "hooks": {
    "PostToolUse": [{
      "matcher": ".*",
      "hooks": [{
        "type": "command",
        "command": "hermes-loop track-tool \"$TOOL_NAME\" '$TOOL_INPUT'"
      }]
    }],
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "hermes-loop on-session-end"
      }]
    }]
  }
}
```

### With Pi

```bash
# Copy to pi extensions directory
cp -r hermes-loop ~/.pi/agent/extensions/
cd ~/.pi/agent/extensions/hermes-loop
npm install && npm run build

# Pi auto-discovers it on next session
```

### With LiteLLM

If you use LiteLLM as your LLM proxy (as many Claude Code users do):

```bash
# Option A: LiteLLM proxy running locally
export LITELLM_BASE_URL=http://localhost:4000
export LITELLM_API_KEY=sk-your-litellm-key
```

Or, Option B, configure it in `~/.hermes-loop/config.json`:

```json
{
  "llm": {
    "provider": "litellm",
    "model": "anthropic/claude-haiku-4-5",
    "baseUrl": "http://localhost:4000",
    "apiKey": "sk-your-litellm-key"
  }
}
```

Hermes Loop only uses the LLM for auxiliary calls (context compression summaries and skill drafting). These are cheap operations — ~$0.005 per session using Haiku. Your main agent (Claude Code or Pi) continues using its own credentials normally.

### Provider Resolution Order

When `provider` is set to `"auto"` (the default), the auxiliary LLM client resolves in this order:

```text
1. Explicit config      →  config.json "llm" section
2. LITELLM_BASE_URL     →  LiteLLM proxy
3. ANTHROPIC_API_KEY    →  Direct Anthropic API
4. OPENAI_API_KEY       →  Direct OpenAI API
5. Claude Code session  →  Auto-detect from environment
```
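The fallback chain above can be sketched as a simple resolver. The environment variable names match the README; the function name and config shape are illustrative assumptions, not the actual `aux-llm.ts` API.

```typescript
// Sketch of the "auto" provider resolution order. Function and config
// shapes are illustrative; only the env-var names come from the README.
interface LlmConfig {
  provider?: string;
  baseUrl?: string;
  apiKey?: string;
}

function resolveProvider(
  config: LlmConfig,
  env: Record<string, string | undefined>
): string {
  if (config.provider && config.provider !== "auto") return config.provider; // 1. explicit config
  if (env.LITELLM_BASE_URL) return "litellm";                                // 2. LiteLLM proxy
  if (env.ANTHROPIC_API_KEY) return "anthropic";                             // 3. direct Anthropic
  if (env.OPENAI_API_KEY) return "openai";                                   // 4. direct OpenAI
  return "claude-code-session";                                              // 5. session fallback
}
```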

## CLI Reference

```text
hermes-loop track-tool <name> [input]    Track a tool call (called by hooks)
hermes-loop on-session-end               Process session end (generates skills)
hermes-loop memory read                  Read agent memory
hermes-loop memory write <content>       Write agent memory
hermes-loop memory user read             Read user profile
hermes-loop memory user write <content>  Write user profile
hermes-loop status                       Show current session metrics
hermes-loop setup                        Generate Claude Code hook config
```

## What Gets Generated

### Skills (`~/.pi/agent/skills/` or `~/.claude/skills/`)

After completing a complex task (5+ tool calls), a SKILL.md file is generated:

```markdown
---
name: deploy-staging
description: "Automate deployment to staging environments. Use when deploying services after CI passes."
metadata:
  generated:
    timestamp: "2026-04-07T14:23:45Z"
    complexity:
      toolCalls: 12
      uniqueTools: 5
      errors: 1
---

# Deploy Staging

## Procedure
1. Verify CI status with `gh run list`
2. Build artifacts with `npm run build`
3. Deploy with `fly deploy --app staging`
4. Verify health: `curl https://staging.example.com/health`

## Gotchas
- Always check CI before deploying — a failed deploy takes 15 min to roll back
- The staging DB migrates automatically on deploy, but production does NOT
```

Every generated skill is:

- Validated against SKILL_SPEC (name format, description quality, body size, etc.)
- Security-scanned against 30+ threat patterns (exfiltration, prompt injection, destructive commands, persistence mechanisms, obfuscation)
- Automatically available in the next session's system prompt
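As a rough illustration, SKILL_SPEC-style validation amounts to a list of checks over the draft. The specific rules and limits below are hypothetical; the actual checks live in `quality-gate.ts` and will differ in detail.

```typescript
// Hedged sketch of SKILL_SPEC-style validation. The rules and limits here
// are assumptions for illustration, not the real quality-gate.ts checks.
interface SkillDraft {
  name: string;
  description: string;
  body: string;
}

function validateSkill(s: SkillDraft): string[] {
  const errors: string[] = [];
  // name format: kebab-case, starts with a letter (assumed rule)
  if (!/^[a-z][a-z0-9-]*$/.test(s.name)) errors.push("name must be kebab-case");
  // description quality: long enough to tell the agent when to use it (assumed threshold)
  if (s.description.length < 20) errors.push("description too short to be useful");
  // body size: must fit comfortably in a system prompt (assumed limit)
  if (s.body.length > 10_000) errors.push("body too large for a system prompt");
  return errors; // empty array = draft passes the gate
}
```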

### Trajectories (`~/.hermes-loop/trajectories/`)

ShareGPT-format JSONL for RL fine-tuning:

```json
{
  "conversations": [
    {"from": "system", "value": "...tool definitions..."},
    {"from": "human", "value": "Deploy the staging service"},
    {"from": "gpt", "value": "<think>Need to check CI first</think>\n<tool_call>\n{\"name\":\"bash\",\"arguments\":{\"command\":\"gh run list\"}}\n</tool_call>"},
    {"from": "human", "value": "<tool_response>\nAll checks passed\n</tool_response>"}
  ],
  "model": "claude-opus-4-6",
  "completed": true,
  "complexity": {"toolCalls": 12, "uniqueTools": 5}
}
```

### Memory (`~/.hermes-loop/memory/`)

Two bounded files:

- `MEMORY.md` (2200 char max) — the agent's working knowledge about the project
- `USER.md` (1375 char max) — user preferences, expertise, communication style

Memory uses a frozen snapshot pattern: the system prompt gets a read-only copy at session start. Writes go to disk immediately but don't disrupt prompt caching until the next session. This saves ~75% on Anthropic input token costs.
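The frozen-snapshot pattern can be sketched as two copies of the memory: one captured at session start and served to the prompt, one kept live for writes. The class below is an illustrative sketch; the real `memory-manager.ts` persists to disk rather than holding strings in memory.

```typescript
// Sketch of the frozen-snapshot pattern: the system prompt reads a copy taken
// at session start, so the cacheable prompt prefix never changes mid-session,
// while writes land in the live store immediately. Names are illustrative.
class FrozenMemory {
  private snapshot: string; // read-only view used in the system prompt
  private live: string;     // current contents (on disk in the real system)

  constructor(initial: string) {
    this.snapshot = initial; // frozen at session start
    this.live = initial;
  }

  promptView(): string {
    return this.snapshot;   // never changes mid-session → prompt cache stays warm
  }

  write(content: string, maxChars = 2200): void {
    this.live = content.slice(0, maxChars); // bounded, like MEMORY.md's 2200-char cap
  }

  nextSession(): FrozenMemory {
    return new FrozenMemory(this.live);     // writes become visible next session
  }
}
```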

## Configuration

`~/.hermes-loop/config.json`:

```json
{
  "runtime": "auto",
  "llm": {
    "provider": "auto",
    "model": "claude-haiku-4-5",
    "summaryModel": null,
    "draftModel": null
  },
  "skillGeneration": {
    "enabled": true,
    "complexityThreshold": 5,
    "autoSave": true,
    "requireUserConfirmation": false
  },
  "compression": {
    "enabled": true,
    "thresholdPercent": 0.50,
    "protectFirstN": 3,
    "protectLastN": 20,
    "summaryTargetRatio": 0.20,
    "iterative": true
  },
  "memory": {
    "enabled": true,
    "memoryMaxChars": 2200,
    "userMaxChars": 1375,
    "frozenSnapshot": true
  },
  "trajectories": {
    "enabled": true,
    "savePath": "~/.hermes-loop/trajectories",
    "saveFailures": true
  },
  "modelRouting": {
    "enabled": false,
    "capable": "claude-opus-4-6",
    "fast": "claude-haiku-4-5"
  }
}
```

## Security

Generated skills pass through a security scanner with 30+ patterns covering:

| Category | Examples | Severity |
|---|---|---|
| Exfiltration | `curl ... $API_KEY`, reading `~/.ssh`, `os.environ` | Critical/High |
| Prompt Injection | "ignore previous instructions", invisible Unicode | Critical/High |
| Destructive | `rm -rf /`, `DROP TABLE`, `git push --force` | Critical/High |
| Persistence | Cron jobs, SSH key injection, `.bashrc` modification | High/Medium |
| Obfuscation | Hex-encoded commands, `eval(atob(...))` | Medium/High |

Trust policy for agent-created skills:

- **Safe** → auto-allow
- **Caution** → allow with logged warnings
- **Dangerous** → requires user confirmation (or rejected in headless mode)
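A minimal sketch of how such a scanner can map pattern hits to a trust verdict, assuming a much smaller pattern set than the real `security-patterns.ts` (which covers 30+ patterns across more categories):

```typescript
// Minimal sketch of regex-based threat scanning with a trust verdict.
// The four patterns here are illustrative examples, not the real pattern set.
type Severity = "critical" | "high" | "medium";

const PATTERNS: { re: RegExp; severity: Severity; label: string }[] = [
  { re: /rm\s+-rf\s+\//, severity: "critical", label: "destructive: rm -rf on a root path" },
  { re: /curl[^\n]*\$[A-Z_]*KEY/, severity: "critical", label: "exfiltration: secret in a curl call" },
  { re: /ignore\s+previous\s+instructions/i, severity: "high", label: "prompt injection" },
  { re: /eval\(atob\(/, severity: "medium", label: "obfuscation: eval(atob(...))" },
];

function scanSkill(body: string): "safe" | "caution" | "dangerous" {
  const hits = PATTERNS.filter(p => p.re.test(body));
  if (hits.some(h => h.severity === "critical")) return "dangerous"; // needs user confirmation
  if (hits.length > 0) return "caution";                             // allow, log warnings
  return "safe";                                                     // auto-allow
}
```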

## Project Structure

```text
hermes-loop/
├── ARCHITECTURE.md            # Detailed architecture document
├── README.md                  # This file
├── package.json
├── tsconfig.json
├── src/
│   ├── index.ts               # Pi extension entry point
│   ├── cli.ts                 # Claude Code CLI entry point
│   ├── core/                  # Runtime-agnostic modules
│   │   ├── types.ts           # Shared type definitions
│   │   ├── config.ts          # Config loading (~/.hermes-loop/config.json)
│   │   ├── skill-generator.ts # Auto skill creation from complex tasks
│   │   ├── context-compressor.ts  # 5-phase iterative compression
│   │   ├── memory-manager.ts  # MEMORY.md + USER.md persistence
│   │   ├── trajectory-tracker.ts  # ShareGPT JSONL capture
│   │   ├── model-router.ts    # Smart model selection
│   │   ├── quality-gate.ts    # SKILL_SPEC validation + security scan
│   │   └── security-patterns.ts   # 30+ threat regex patterns
│   ├── adapters/              # Runtime-specific wiring
│   │   ├── pi-adapter.ts      # Pi extension API adapter
│   │   └── claude-code-adapter.ts # Claude Code hooks adapter
│   ├── llm/                   # Auxiliary LLM client
│   │   ├── aux-llm.ts         # Provider-agnostic caller
│   │   ├── litellm.ts         # LiteLLM / OpenAI-compatible
│   │   ├── anthropic.ts       # Direct Anthropic API
│   │   └── openai.ts          # Direct OpenAI API
│   └── utils/
│       ├── tokens.ts          # Token estimation
│       ├── sanitize.ts        # Anti-injection helpers
│       └── frontmatter.ts     # YAML frontmatter parse/serialize
├── test/                      # 25 tests
│   ├── quality-gate.test.ts
│   ├── context-compressor.test.ts
│   └── skill-generator.test.ts
├── bin/
│   └── hermes-loop            # CLI binary
└── dist/                      # Compiled output
```

## Inspired By

| System | What we took | Link |
|---|---|---|
| pi-mono | Extension system, dual-loop agent, session persistence, SKILL.md format | badlogic/pi-mono |
| hermes-agent | Self-improving loop, iterative compression, memory manager, trajectory tracking, security scanner | NousResearch/hermes-agent |
| Skill Factory | SKILL_SPEC validation, quality standards, guided creation | Internal knowledge base |

## License

MIT
