diff --git a/agents/antoinezambelli__forge-guardrails/README.md b/agents/antoinezambelli__forge-guardrails/README.md new file mode 100644 index 0000000..812c750 --- /dev/null +++ b/agents/antoinezambelli__forge-guardrails/README.md @@ -0,0 +1,61 @@ +# forge-guardrails + +**A reliability layer for self-hosted LLM tool-calling.** + +Forge takes an 8B local model from single-digit pass rates to **84% across a +26-scenario eval suite** — and lifts Claude Sonnet from 85% to 98% on the same +workload — by applying composable guardrails without changing a single line of +your agent code. + +## What it does + +Forge intercepts model responses before they reach your client or downstream +logic and applies (in order): + +1. **Response validation** — every tool call is checked against the declared `tools` array; unknown names and malformed shapes are caught before surfacing. +2. **Rescue parsing** — Mistral `[TOOL_CALLS]`, Qwen `` XML, and fenced JSON are normalised to the canonical OpenAI `tool_calls` schema. +3. **Retry with error tracking** — on validation failure, forge retries inference with a corrective tool-result message (up to `max_retries`, default 3). +4. **Synthetic `respond` tool injection** — prevents small models from breaking the tool-calling loop by routing plain-text responses through a structured call (stripped from the outbound response so the client never sees it). + +## Three usage modes + +| Mode | How | Best for | +|------|-----|----------| +| **Proxy server** | `python -m forge.proxy` | Drop-in for existing clients (opencode, aider, Claude Code, Continue) | +| **WorkflowRunner** | Python API | Building new agentic loops directly on forge | +| **Guardrails middleware** | `Guardrails` facade | Adding forge's stack inside your own orchestration loop | + +## Supported backends + +Ollama · llama-server (llama.cpp) · Llamafile · vLLM · Anthropic API + +## Quick install + +```bash +pip install forge-guardrails # core +pip install "forge-guardrails[anthropic]" # + Anthropic client +``` + +## Claude Code integration + +Point Claude Code at a forge-guarded local model: + +```bash +python -m forge.proxy --backend llamaserver --gguf path/to/model.gguf --port 8081 +# then: +ANTHROPIC_BASE_URL=http://localhost:8081 ANTHROPIC_AUTH_TOKEN=anything claude +``` + +## Eval harness + +```bash +python -m tests.eval.eval_runner --backend llamafile --gguf path/to/model.gguf --runs 10 +python -m tests.eval.report eval_results.jsonl +``` + +## Links + +- Repository: https://github.com/antoinezambelli/forge +- PyPI: https://pypi.org/project/forge-guardrails/ +- Paper: https://doi.org/10.1145/3786335.3813193 +- Docs: https://github.com/antoinezambelli/forge/tree/main/docs diff --git a/agents/antoinezambelli__forge-guardrails/metadata.json b/agents/antoinezambelli__forge-guardrails/metadata.json new file mode 100644 index 0000000..0ac4c36 --- /dev/null +++ b/agents/antoinezambelli__forge-guardrails/metadata.json @@ -0,0 +1,14 @@ +{ + "name": "forge-guardrails", + "author": "antoinezambelli", + "description": "A reliability layer for self-hosted LLM tool-calling — guardrails, rescue parsing, retry loops, and proxy mode for Ollama, llama.cpp, vLLM, and Anthropic.", + "repository": "https://github.com/antoinezambelli/forge", + "version": "0.7.2", + "category": "developer-tools", + "tags": ["llm", "tool-calling", "guardrails", "self-hosted", "ollama", "llama-cpp", "proxy", "agentic", "reliability", "python"], + "license": "MIT", + "model": "anthropic:claude-sonnet-4-6", + "adapters": ["system-prompt", "claude-code"], + "icon": false, + "banner": false +}