Skillforge turns successful agent traces into reusable skills.
It compiles a source run into:
- a portable skill.contract.json
- an OpenClaw-friendly SKILL.md
- a verification.report.json
- example inputs for reuse

The goal is simple: stop losing good agent work in transcripts. Capture what worked, parameterize it, attach approval gates, and make it reusable across OpenClaw or your own agent stack.
Most agent tooling can execute one-off tasks, but successful runs usually die as logs. Skillforge promotes a successful trace into a reusable asset:
- extract inputs like paths, URLs, repositories, dates, and emails
- convert raw steps into a reusable execution plan
- infer tool requirements and approval gates
- export an installable markdown skill plus a portable JSON contract
- statically verify that the generated skill can be re-rendered safely
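
The input-extraction step in the list above can be pictured with a small TypeScript sketch. It is an illustration only, not Skillforge's actual extractor: the `parameterize` function, the `ExtractedInput` type, the regular expressions, and the `{{name}}` placeholder syntax are all assumptions made for this example.

```typescript
// Hypothetical sketch: none of these names come from Skillforge itself.
type InputKind = "url" | "email" | "date" | "path";

interface ExtractedInput {
  name: string;    // placeholder name, e.g. "url_1"
  kind: InputKind;
  example: string; // literal value seen in the trace
}

// Deliberately simple patterns. URLs go first because they contain
// path-like text that the path pattern would otherwise claim.
const PATTERNS: Array<[InputKind, RegExp]> = [
  ["url", /https?:\/\/[^\s"'<>]+/g],
  ["email", /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g],
  ["date", /\b\d{4}-\d{2}-\d{2}\b/g],
  ["path", /(?:\.{1,2}\/|\/)?(?:[\w.-]+\/)+[\w.-]+/g],
];

// Replace literals in one trace step with placeholders and record
// each literal as an example input value.
function parameterize(step: string): { template: string; inputs: ExtractedInput[] } {
  const inputs: ExtractedInput[] = [];
  let template = step;
  for (const [kind, pattern] of PATTERNS) {
    template = template.replace(pattern, (match) => {
      // Reuse the placeholder when the same literal appears twice.
      let found = inputs.find((i) => i.kind === kind && i.example === match);
      if (!found) {
        const n = inputs.filter((i) => i.kind === kind).length + 1;
        found = { name: `${kind}_${n}`, kind, example: match };
        inputs.push(found);
      }
      return `{{${found.name}}}`;
    });
  }
  return { template, inputs };
}
```

Under these assumptions, a step such as `cp a/b.txt a/b.txt` becomes `cp {{path_1}} {{path_1}}` with a single recorded input. The patterns are intentionally naive (a URL followed by a full stop would keep the punctuation, for instance), so a real extractor would need stricter rules.
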
npm install
npm run build

Or run the CLI directly in development:

npm run dev -- inspect examples/fix-flaky-test.trace.json

Compile a trace into a skill bundle:

npm run dev -- compile examples/fix-flaky-test.trace.json --out generated-skills

Inspect a trace without writing files:

npm run dev -- inspect examples/publish-weekly-report.trace.json

Verify a generated contract:

npm run dev -- verify generated-skills/fix-flaky-auth-test/skill.contract.json

List a local skill registry:

npm run dev -- list generated-skills

Skillforge accepts:
- normalized trace JSON with objective + steps
- objects containing messages
- objects containing events, entries, or trace
- JSONL event streams

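
One way to picture how these input shapes might be told apart is the sketch below. It is not Skillforge's parser: the field names come from the list above, while `detectTraceShape`, the `TraceShape` labels, and the order of the checks are assumptions made for illustration.

```typescript
// Hypothetical shape detection; only the field names come from the README.
type TraceShape = "normalized" | "messages" | "events" | "jsonl" | "unknown";

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function detectTraceShape(raw: string): TraceShape {
  const text = raw.trim();
  if (text === "") return "unknown";

  try {
    // First, try to read the whole input as one JSON document.
    const doc: unknown = JSON.parse(text);
    if (!isRecord(doc)) return "unknown";
    if ("objective" in doc && Array.isArray(doc["steps"])) return "normalized";
    if (Array.isArray(doc["messages"])) return "messages";
    if (["events", "entries", "trace"].some((k) => Array.isArray(doc[k]))) return "events";
    return "unknown";
  } catch {
    // Not a single JSON document; fall through and try JSONL.
  }

  // JSONL: every non-empty line must parse to a JSON object.
  const lines = text.split("\n").filter((l) => l.trim() !== "");
  const allObjects = lines.every((line) => {
    try {
      return isRecord(JSON.parse(line));
    } catch {
      return false;
    }
  });
  return lines.length > 0 && allObjects ? "jsonl" : "unknown";
}
```

The sketch only classifies the input and never executes anything from it. One known limitation of this ordering is that a one-line JSONL stream parses as a single JSON object and is reported as `unknown`.
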
The compiler is intentionally conservative. It aims to create a useful reusable skill from incomplete data without executing any part of the source trace.
Each compiled skill bundle contains:
- skill.contract.json: portable contract for any agent runtime
- SKILL.md: Markdown skill for OpenClaw-style skill registries
- verification.report.json: static verification result
- inputs.example.json: extracted example values

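
As a small illustration of the bundle layout, the following sketch checks whether a directory listing contains all four files. The file names are taken from the list above; `missingBundleFiles` is a hypothetical helper, not part of the Skillforge CLI.

```typescript
// File names come from the README; the helper itself is an assumption.
const BUNDLE_FILES = [
  "skill.contract.json",
  "SKILL.md",
  "verification.report.json",
  "inputs.example.json",
] as const;

// Return the expected bundle files that are absent from a directory listing.
function missingBundleFiles(present: readonly string[]): string[] {
  const seen = new Set(present);
  return BUNDLE_FILES.filter((name) => !seen.has(name));
}
```

In Node, the listing could come from `fs.readdirSync` on a bundle directory such as `generated-skills/fix-flaky-auth-test`. An empty result would mean the bundle has every expected file.
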
npm run check
npm test
npm run build

- adapter plugins for more agent trace formats
- richer policy engines for command risk classification
- replay-backed verification harnesses
- registry publishing and trust receipts