Problem
ktext init currently scans README, Makefile, package manifests, ADRs, directory structure, and CONTRIBUTING.md — but not agent instruction files like CLAUDE.md, AGENTS.md, .cursorrules, or copilot-instructions.md.
These files often contain exactly the kind of high-signal content that belongs in CONTEXT.yaml: constraints, conventions, architectural decisions. Ignoring them means users who already maintain such files must re-enter that information manually.
This also directly addresses the common objection "I already have a CLAUDE.md" — ktext doesn't replace it, it absorbs it, structures it, scores it, and makes it portable across tools.
Proposed behaviour
During ktext init, the scanner should:
- Detect the presence of any of these files:
  - CLAUDE.md
  - AGENTS.md
  - .cursorrules
  - .github/copilot-instructions.md
  - .cursor/rules
  - .windsurfrules
  - .clinerules
- Read their content and attempt to extract candidates for:
  - constraints: lines containing must, never, always, avoid, prohibit, etc.
  - conventions: lines with action verbs (use, run, write, call, etc.)
  - decisions: sections that explain why something was chosen
- Surface extracted candidates in the interactive review, using the same flow as other discovered values: the user accepts or rewrites each one
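To make the proposal concrete, here is a rough sketch of the detection and extraction steps in Python. It is not ktext's actual scanner: the function names, the regexes, and the line-level treatment of decisions (the proposal describes section-level detection) are all assumptions made for illustration. Only the file names and keyword lists come from the list above.

```python
import re
from pathlib import Path

# File names taken from the proposal; everything else below is hypothetical.
AGENT_FILES = [
    "CLAUDE.md",
    "AGENTS.md",
    ".cursorrules",
    ".github/copilot-instructions.md",
    ".cursor/rules",
    ".windsurfrules",
    ".clinerules",
]

# Keyword heuristics (assumed): the lists mirror the examples in the proposal.
CONSTRAINT_RE = re.compile(r"\b(must|never|always|avoid|prohibit\w*)\b", re.IGNORECASE)
CONVENTION_RE = re.compile(r"^(use|run|write|call)\b", re.IGNORECASE)
# Simplification: "decisions" are detected per line via rationale words,
# not per section as the proposal describes.
DECISION_RE = re.compile(r"\b(because|why|chose|chosen|instead of)\b", re.IGNORECASE)


def find_agent_files(repo_root):
    """Return the known agent instruction paths that exist under repo_root."""
    root = Path(repo_root)
    return [name for name in AGENT_FILES if (root / name).exists()]


def clean(line):
    """Strip whitespace and leading markdown bullet/heading markers."""
    return line.strip().lstrip("-*# ").strip()


def classify_line(line):
    """Return 'constraints', 'decisions', 'conventions', or None for one line."""
    text = clean(line)
    if not text:
        return None
    if CONSTRAINT_RE.search(text):
        return "constraints"
    if DECISION_RE.search(text):
        return "decisions"
    if CONVENTION_RE.match(text):
        return "conventions"
    return None


def extract_candidates(text):
    """Group the lines of an instruction file into candidate buckets."""
    candidates = {"constraints": [], "conventions": [], "decisions": []}
    for line in text.splitlines():
        kind = classify_line(line)
        if kind:
            candidates[kind].append(clean(line))
    return candidates
```

A line that matches several heuristics lands in the first matching bucket only. That is a deliberate simplification, since the interactive review lets the user move or rewrite anything that was misfiled.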
Test case
The Backstage repo has an AGENTS.md file. That would be a good real-world test of whether the extraction produces useful candidates vs noise.
Notes
- Extraction doesn't need to be perfect — the interactive review is the safety net
- Should log which files were found during the "Scanning repo..." output
- Low-quality or very short instruction files should still be processed — the scorer will penalize vague entries
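
As a small illustration of the logging note, the scan output could list each detected file under the existing "Scanning repo..." line. The function name and the exact message wording below are hypothetical, not ktext's real output format.

```python
def format_scan_log(found_files):
    """Build scan log lines; the message wording is an assumed format."""
    lines = ["Scanning repo..."]
    if not found_files:
        lines.append("  agent instruction files: none found")
    for name in found_files:
        lines.append(f"  found agent instruction file: {name}")
    return lines


if __name__ == "__main__":
    for line in format_scan_log(["AGENTS.md", ".cursorrules"]):
        print(line)
```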