Turn code and docs into instructions AI agents can actually follow.
Skill Forge analyzes your code repositories, documentation, and developer discourse to build verified instruction files for AI agents. Every instruction links back to a specific file and line in the source it was compiled from.
If SKF fixes your agent's API guesses, give it a ⭐ — it helps others find this tool. If it saved you an afternoon, grab me a coffee ☕ — it helps me keep forging.
You ask an AI agent to use a library. It invents function names that don't exist. It guesses parameter types. You paste documentation into the context — it still gets details wrong. You write instructions by hand — they go stale the moment the code changes.
This isn't an edge case. It's the default experience.
For the full story behind SKF, read Hallucination has a line number on Medium.
Without SKF — your agent guesses:

```python
import cognee

# Agent hallucinates: sync call, wrong parameter name, missing await
results = cognee.search("What does Cognee do?", mode="graph")
```

With SKF — your agent reads the verified skill:

```python
import cognee

# Agent follows the skill instruction:
# `search(query_text: str, query_type: SearchType = GRAPH_COMPLETION) -> List[SearchResult]`
# [AST:cognee/api/v1/search/search.py:L27]
results = await cognee.search(
    query_text="What does Cognee do?",
    query_type=cognee.SearchType.GRAPH_COMPLETION
)
```

The skill told the agent the real function name, the real parameters, and that the call is async — all traced to the exact source line. This example is from the real oms-cognee skill in oh-my-skills — SKF's reference output. The Verifying a Skill section below shows how to walk the citation chain yourself.
Linux, Windows, and macOS supported (platform details). Requires Node.js >= 22, Python >= 3.10, and uv (Python package runner).
```shell
npx bmad-module-skill-forge install
```

You'll be prompted for project name, output folders, and IDE configuration. When the install completes, open your IDE and invoke `@Ferris SF` to confirm Ferris is reachable. Ferris reports your detected tools and capability tier. See the docs for other install methods.
- Set up your environment: `@Ferris SF` (Setup Forge) — detects your tools and sets your capability tier
- Generate your first skill: `@Ferris QS <package-name>` (Quick Skill) — creates a verified skill in under a minute
- Full quality path: `@Ferris forge <your-library>` chains Brief → Create → Test → Export automatically — or run manually: `@Ferris BS` → clear session → `@Ferris CS` for maximum control
Tip: Start a fresh conversation before each workflow, or use pipeline mode to chain them automatically. SKF workflows load significant context; clearing between them prevents interference.
See the workflows docs for all available workflows, pipeline aliases, and headless mode.
- You use AI agents to write code and they keep guessing API calls wrong
- You maintain a library and want to ship official, verified instruction files so AI agents use your API correctly
- You manage a codebase with many dependencies and want a consolidated "stack skill" that teaches your agent how all the pieces fit together
- You use a SaaS API or closed-source tool with no public code — SKF can generate skills from documentation alone
- You need different skills for different use cases from the same target — compile multiple skills with different scopes from one repo or doc set (e.g., a core API skill and a migration guide skill)
A skeptical reader is probably already considering one of these alternatives:
| | Skill Forge | MCP doc servers | Hand-edited .cursorrules | awesome-* lists |
|---|---|---|---|---|
| Reproducible from source | AST extraction + pinned commit | varies; opaque | whatever you wrote | none |
| Version-pinned & immutable | yes — per-version directories | runtime-dependent | rots silently | no |
| Audit trail | `provenance-map.json` + test + evidence | depends on server | none | none |
| Runtime cost | zero (markdown + JSON) | a running process | zero | zero |
| Lifecycle tooling | rename, drop, update, export transactions | varies | file surgery | none |
| Falsifiable | yes — three steps, 60 seconds | rarely | no | no |
The others aren't bad. They solve different problems. SKF solves exactly one: the claim your agent is reading about a library was true at a specific commit on a specific day, and you can prove it in under a minute.
SKF extracts real function signatures, types, and patterns from code, docs, and developer discourse — every instruction links to the exact file and line it came from. On top of that foundation:
- Version-pinned — skills are stored per-version, so updating to v2.0 doesn't replace your v1.x skill. Compatible with `skills.sh` and `npx skills`
- Lifecycle tooling — rename skills and drop deprecated versions without manual file surgery. Destructive operations are transactional.
- Follows an open standard — skills comply with the agentskills.io spec and work across Claude, Cursor, Copilot, and other AI agents
Every skill ships two files — `SKILL.md` (the full instruction set, loaded on trigger) and `context-snippet.md` (an 80–120 token always-on index injected into `CLAUDE.md`/`AGENTS.md`/`.cursorrules`). Why both? Per Vercel's agent evals, passive context achieves a 100% pass rate vs. 79% for active skills loaded alone (see Skill Model → Dual-Output Strategy).
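To make the always-on half concrete, here is a minimal sketch of how a `context-snippet.md` could be kept in sync inside an `AGENTS.md`-style config file. The marker strings and the function name are hypothetical illustrations, not SKF's actual injection mechanism:

```python
from pathlib import Path

# Hypothetical markers delimiting the managed region of the config file
BEGIN = "<!-- skf:context-snippet -->"
END = "<!-- /skf:context-snippet -->"


def inject_snippet(config: Path, snippet: str) -> None:
    """Insert or refresh the snippet between markers, leaving the rest of
    the file (the user's own instructions) untouched."""
    text = config.read_text() if config.exists() else ""
    block = f"{BEGIN}\n{snippet.strip()}\n{END}"
    if BEGIN in text and END in text:
        # Replace only the previously injected region
        head, rest = text.split(BEGIN, 1)
        _, tail = rest.split(END, 1)
        text = head + block + tail
    else:
        # First injection: append a marked block
        text = text.rstrip() + ("\n\n" if text else "") + block + "\n"
    config.write_text(text)
```

Idempotent marker-delimited injection like this is why the snippet can be regenerated on every skill update without clobbering hand-written agent instructions.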
You can falsify any AST citation in an SKF-compiled skill in under a minute:
1. Open the skill's `provenance-map.json` — find your symbol; read its `source_file` and `source_line`.
2. Open the skill's `metadata.json` — read `source_commit` and `source_repo`.
3. Jump to the upstream repo at that commit, open that file, find that line. The signature in `SKILL.md` should match the one you're reading.
If it doesn't, that's a bug — open an issue and SKF will republish with a new commit SHA and a new provenance map. Falsifiability isn't a feature; it's the whole deal.
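The audit above can be sketched as a few lines of Python that turn the two JSON files into a pinned permalink. The field names come from the recipe above; the example repo URL, commit SHA, and dict shapes are hypothetical stand-ins:

```python
def citation_url(provenance_entry: dict, metadata: dict) -> str:
    """Build a commit-pinned GitHub permalink for one symbol's citation,
    using source_file/source_line from provenance-map.json and
    source_repo/source_commit from metadata.json."""
    return (
        f"{metadata['source_repo']}/blob/"
        f"{metadata['source_commit']}/"
        f"{provenance_entry['source_file']}"
        f"#L{provenance_entry['source_line']}"
    )


# Hypothetical entries, shaped like the cognee example earlier in this README
entry = {"source_file": "cognee/api/v1/search/search.py", "source_line": 27}
meta = {
    "source_repo": "https://github.com/topoteretes/cognee",
    "source_commit": "abc1234",
}
print(citation_url(entry, meta))
# → https://github.com/topoteretes/cognee/blob/abc1234/cognee/api/v1/search/search.py#L27
```

Opening that URL and comparing the signature against `SKILL.md` is the entire audit.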
Reference output: oh-my-skills — four Deep-tier skills compiled by SKF (cocoindex, cognee, Storybook v10, uitripled), each shipping its full audit trail in-repo. Scores range from 99.0% to 99.49%. Every claim walks to an upstream line in under 60 seconds. Serves as both the worked example for this section and ongoing proof that the pipeline does what it says.
Workflows end with a health check that can file bug or friction reports as GitHub issues (auto-deduped by fingerprint — re-reporting is safe). Please let workflows run to completion, or open an issue directly. Full details →
The docs are organized into three buckets — Why (start here), Try (do stuff), and Reference (look things up):
Why
- Why Skill Forge? — The JTBD pitch, persona router, and the honest anti-pitch
- Verifying a Skill — The 60-second audit recipe and scoring formula
Try
- Getting Started — Install, first skill, prereqs, and config
- How It Works — Plain-English walkthrough of one skill being built, end to end
- Examples — Real-world scenarios with full command transcripts
- Workflows — All 14 workflows with commands and connection diagrams
Reference
- Concepts — Seven load-bearing terms: provenance, confidence tiers, drift, and more
- Architecture — Runtime flow, 7 tools, workspace artifacts, security, and the design decisions behind them
- Skill Model — Capability tiers, confidence tiers, output format, dual-output strategy, ownership model
- Agents — Ferris: the single AI agent that runs every SKF workflow
- BMAD Synergy — How SKF pairs with BMAD CORE phases and optional modules (TEA, BMB, GDS, CIS)
- Troubleshooting — Common errors (forge setup, ecosystem checks, tier confidence) and how to resolve them
SKF builds on these excellent open-source tools:
| Tool | Role in SKF |
|---|---|
| agentskills.io | Skill specification and ecosystem standard |
| GitHub CLI | Source code access and repository intelligence (all tiers) |
| ast-grep | AST-based structural code extraction (Forge/Forge+/Deep tiers) |
| ast-grep MCP | MCP server for memory-efficient AST queries (recommended) |
| cocoindex-code | Semantic code search and file discovery pre-ranking (Forge+ tier) |
| QMD | Local hybrid search engine for knowledge indexing (Deep tier) |
| skill-check | Skill validation, auto-fix, quality scoring, and security scanning |
| Snyk Agent Scan | Security scanning for prompt injection and data exposure (optional) |
| tessl | Content quality review, actionability scoring, and AI judge evaluation |
| BMAD Method | Agent-workflow framework that SKF extends as a module |
See CONTRIBUTING.md for guidelines.
MIT License — see LICENSE for details.
Skill Forge (SKF) — A standalone BMAD module for agent skill compilation.
See CONTRIBUTORS.md for contributor information.