juxt · mattford63 · Apr 20, 2026
diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
@@ -14,6 +14,7 @@
     "./skills/allium",
     "./skills/distill",
     "./skills/elicit",
+    "./skills/impact",
     "./skills/propagate",
     "./skills/tend",
     "./skills/weed"

diff --git a/README.md b/README.md
@@ -65,13 +65,16 @@ Allium provides five skills, an entry point and two autonomous agents.
 | `/propagate <optional constraints>` (or `/allium:propagate`) | Generate tests from a spec. |
 | `/tend <optional constraints>` (or `/allium:tend`) | Targeted changes to existing specs. |
 | `/weed <optional constraints>` (or `/allium:weed`) | Find and fix divergences between spec and code. |
+| `/impact <spec>` (or `/allium:impact`) | Build the spec↔code impact map that `distill`, `weed` and `propagate` read to avoid re-discovering the mapping on every invocation. Python is supported today via `pyright-lsp`; additional languages are added by dropping an adapter into `skills/impact/adapters/`. |
 
 How skills appear depends on your editor. Some show the fully qualified form (`/allium:weed`), others show the short form (`/weed`), and some support both. If one form isn't recognised, try the other. Skills also auto-trigger when you open or edit `.allium` files.
 
 Tend and weed are also available as autonomous **agents** that run in their own context, keeping Allium syntax out of your main session. Claude Code picks up agents from `agents/`, Copilot from `.github/agents/`. How editors discover skills and agents is still settling; we make these available in the most portable formats we can and expect to consolidate as conventions stabilise. If your editor doesn't pick something up, [raise an issue](https://github.com/juxt/allium/issues).
 
 For larger codebases, distillation and other ambitious tasks may need several passes to capture everything. Consider an iterative approach like the [Ralph Wiggum loop](https://ghuntley.com/ralph/), repeating until there's nothing further to do.
 
+The [`impact` skill](skills/impact/SKILL.md) makes each pass yield better results, not just faster ones. Grep finds names that match; the LSP-backed map resolves actual references, catching implementations text search misses — methods reached through polymorphism, framework dependency injection or dynamic dispatch. Its `unmapped` section makes "no implementation found" a finding rather than an oversight. And because `weed`, `distill` and `propagate` read one persisted mapping instead of each re-discovering it per run, their conclusions stay consistent across a long loop rather than drifting against one another.
+
 ## Why not just point the LLM at the code?
 
 Within a session, meaning drifts: by prompt ten or twenty, the model is pattern-matching on its own outputs rather than the original intent. Across sessions, knowledge evaporates entirely. Modern LLMs navigate codebases effectively, but the limitation appears when you need to distinguish what the code *does* from what it *should do*. Code captures implementation, including bugs and expedient decisions. The model treats all of it as intended behaviour.

diff --git a/TODO.md b/TODO.md
@@ -0,0 +1,6 @@
+# repo-impact-map branch
+
+1. ~~Don't try to combine grep-based searching and impact-map searching into a single command. Instead, let the invoker run either one or both.~~ Addressed: weed/propagate/distill default to grep + read; map mode is opt-in via an explicit user phrase ("use the impact map", "in map mode", "via impact"). See the `## Map mode` appendix in each affected SKILL.md and the `### Opting in` section of `skills/allium/references/impact-map.md`.
+2. When building the spec->code mapping, also build a code-only mapping (this would help code generation without polluting context with specs, if required).
+3. Implement a decent test suite on the master branch before making any more changes.
+4. Implement a harness for development....
diff --git a/hooks/hooks.json b/hooks/hooks.json
@@ -1,4 +1,13 @@
 {
+  "permissions": {
+    "allow": [
+      "Read(//home/matt/.claude/**)",
+      "Edit(~/.claude/skills/code-map/**)"
+    ],
+    "additionalDirectories": [
+      "/home/matt/.claude/skills/code-map"
+    ]
+  },
   "hooks": {
     "PostToolUse": [
       {

diff --git a/skills/allium/references/impact-map.md b/skills/allium/references/impact-map.md
@@ -0,0 +1,284 @@
+# Impact map reference
+
+The impact map is a JSON artifact produced by the [`impact` skill](../../impact/SKILL.md) that links Allium spec constructs to implementation code symbols. This document defines the schema and the integration contract that other Allium skills read.
+
+## File location
+
+One JSON file per spec, under the target project's `.allium/impact/` directory:
+
+```text
+<project>/
+  spec.allium
+  .allium/
+    impact/
+      .gitignore         # contains "*", committed so the directory is tracked but contents are not
+      spec.json
+```
+
+The directory is a gitignored cache. The skill writes `.gitignore` on first build. Do not commit impact map files.
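+
+As a minimal sketch of the layout above, a consumer could locate the map for a given spec as follows. It assumes the map file is named after the spec's stem (`spec.allium` gives `spec.json`, as the tree suggests); the helper name `impact_map_path` is hypothetical and not part of any skill.
+
+```python
+from pathlib import Path
+
+
+def impact_map_path(spec_path: str) -> Path:
+    """Return where the map for a spec is expected to live (assumed naming)."""
+    spec = Path(spec_path)
+    # Assumption: <project>/.allium/impact/<spec-stem>.json, beside the spec.
+    return spec.parent / ".allium" / "impact" / f"{spec.stem}.json"
+
+
+print(impact_map_path("project/spec.allium").as_posix())
+# prints "project/.allium/impact/spec.json"
+```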
+
+## Top-level schema
+
+```json
+{
+  "spec": "spec.allium",
+  "language": "python",
+  "commit": "<git-sha-at-build-time>",
+  "built_at": "<ISO-8601 timestamp>",
+  "adapter_version": "python-v1",
+  "nodes": { ... },
+  "links": [ ... ],
+  "call_edges": [ ... ],
+  "unmapped": { "spec": [ ... ], "code": [ ... ] }
+}
+```
+
+| Field | Meaning |
+| ----- | ------- |
+| `spec` | Filename of the `.allium` file this map covers. |
+| `language` | Target language of the implementation (e.g. `python`, `typescript`). When multiple adapters are active, this is an array. |
+| `commit` | Git SHA of the target project at build time. Used by `refresh` mode to detect stale entries. |
+| `built_at` | ISO-8601 timestamp of the build. Informational. |
+| `adapter_version` | Which language adapter and adapter version produced the map. Bump when an adapter's rules change materially. |
+| `nodes` | Keyed by node ID. Every `spec:` or `code:` reference in `links` or `call_edges` resolves to a node. |
+| `links` | Cross-side edges (spec vs. code). This is the primary thing other skills read. |
+| `call_edges` | Same-side edges on the code graph (caller to callee). Used by `propagate` for state-machine action maps and by the "blast radius" query. |
+| `unmapped` | Spec nodes with no confirmed code match, and code symbols with no confirming spec link. Load-bearing for `weed`. |
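+
+The sketch below shows one way a consumer might sanity-check a loaded map against the table above. Treating every listed field as required is an assumption made for illustration, and the helper names `missing_fields` and `languages` are hypothetical.
+
+```python
+import json
+
+# Field names taken from the table above; "required" is an assumption.
+TOP_LEVEL_FIELDS = (
+    "spec", "language", "commit", "built_at",
+    "adapter_version", "nodes", "links", "call_edges", "unmapped",
+)
+
+
+def missing_fields(impact_map: dict) -> list[str]:
+    """List the top-level fields that a loaded map lacks."""
+    return [name for name in TOP_LEVEL_FIELDS if name not in impact_map]
+
+
+def languages(impact_map: dict) -> list[str]:
+    """Normalise `language`, which is an array when several adapters ran."""
+    value = impact_map["language"]
+    return list(value) if isinstance(value, list) else [value]
+
+
+sample = json.loads('{"spec": "spec.allium", "language": "python"}')
+print(missing_fields(sample))
+print(languages(sample))
+```
+
+Normalising `language` to a list up front spares later code from branching on the single-adapter and multi-adapter shapes.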
+
+## Nodes
+
+Every node has a stable string ID: `spec:<name>` or `code:<fqn>`. IDs are the only way other skills reference nodes; do not rely on array positions.
+
+### Spec node
+
+```json
+"spec:Candidacy": {
+  "kind": "entity",
+  "file": "interview.allium",
+  "line": 42
+}
+```
+
+`kind` ∈ `entity`, `variant`, `value_type`, `enum`, `rule`, `trigger`, `surface`, `contract`, `invariant`, `config`, `default`, `actor`.
+
+### Code node
+
+```json
+"code:interview.services.candidacy.create_candidacy": {
+  "kind": "function",
+  "file": "src/interview/services/candidacy.py",
+  "line": 87,
+  "fqn": "interview.services.candidacy.create_candidacy"
+}
+```
+
+`kind` ∈ `function`, `method`, `class`, `module`, `decorator`, `constant`, `type_alias`. The set is language-agnostic; the adapter maps LSP symbol kinds onto it.
+
+`fqn` is the fully-qualified name as LSP reports it, with the language's native separator (`.` in Python, `/` in Go with the package path, etc.). Use it for cross-file identification.
+
+## Links
+
+```json
+{
+  "from": "spec:Candidacy",
+  "to": "code:interview.services.candidacy.create_candidacy",
+  "via": "name-match+hover",
+  "confidence": "high",
+  "rejected_candidates": []
+}
+```
+
+| Field | Meaning |
+| ----- | ------- |
+| `from` | A `spec:` node ID. |
+| `to` | A `code:` node ID. |
+| `via` | How the link was proven. Enumerated below. |
+| `confidence` | `high`, `medium`, `low`. Per the adapter's confidence heuristic. |
+| `rejected_candidates` | Other `code:` IDs that were plausible but lost the disambiguation. Empty when the match was unambiguous. |
+
+### `via` values
+
+- `name-match+hover` — symbol-index hit (via `documentSymbol` walk or `workspaceSymbol` if available) confirmed by LSP `hover` docstring/type. Strongest automatic signal.
+- `name-match+single` — the symbol-index walk returned exactly one candidate; no secondary signal needed.
+- `name-match+ambiguous` — multiple candidates survived; this link is one of several recorded at low confidence.
+- `surface-decorator` — matched via framework entry-point pattern from the adapter (e.g. Flask route decorator).
+- `docstring-ref` — code docstring explicitly references the spec construct by name.
+- `manual` — hand-curated; the skill never writes this. Reserved for user annotation.
+
+Other skills must treat `via` values they don't recognise as opaque rather than erroring.
+
+## Call edges
+
+```json
+{
+  "caller": "code:...create_candidacy",
+  "callee": "code:...persist",
+  "cross_module": false
+}
+```
+
+Only edges within the project root are recorded. `cross_module` is true when caller and callee live in different packages within the project (in the worked example, `interview.api` and `interview.services`); `propagate` uses this to identify integration-test candidates.
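+
+This document does not define the "blast radius" query precisely. One plausible reading, sketched here as an assumption, is the set of code nodes that transitively call a changed symbol. The function names `blast_radius` and `integration_candidates` are hypothetical.
+
+```python
+from collections import defaultdict, deque
+
+
+def blast_radius(call_edges: list[dict], changed: str) -> set[str]:
+    """Return every code node that directly or transitively calls `changed`."""
+    callers_of = defaultdict(set)
+    for edge in call_edges:
+        callers_of[edge["callee"]].add(edge["caller"])
+
+    seen: set[str] = set()
+    queue = deque([changed])
+    while queue:
+        node = queue.popleft()
+        for caller in callers_of[node]:
+            if caller not in seen:  # also guards against call cycles
+                seen.add(caller)
+                queue.append(caller)
+    return seen
+
+
+def integration_candidates(call_edges: list[dict]) -> list[dict]:
+    """Edges flagged `cross_module`, which `propagate` treats as candidates."""
+    return [edge for edge in call_edges if edge["cross_module"]]
+
+
+edges = [
+    {"caller": "code:a", "callee": "code:b", "cross_module": True},
+    {"caller": "code:b", "callee": "code:c", "cross_module": False},
+]
+print(sorted(blast_radius(edges, "code:c")))  # prints ['code:a', 'code:b']
+```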
+
+## Unmapped
+
+```json
+"unmapped": {
+  "spec": [
+    { "id": "spec:Rule.ReassignOnDecline", "reason": "no-workspace-symbol-match" }
+  ],
+  "code": [
+    { "id": "code:interview.legacy.old_flow.handle", "reason": "no-link" }
+  ]
+}
+```
+
+`reason` is a short tag, not free-form prose. Values:
+
+- `no-workspace-symbol-match` — the adapter generated variants, no LSP result.
+- `low-confidence-only` — candidates existed but all fell below the adapter's confidence floor.
+- `no-link` — code symbol found during traversal with no confirming link back to any spec node.
+- `out-of-scope` — deliberately excluded by the adapter's exclusion rules (test files, migrations, etc.).
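+
+Because `reason` is a fixed tag rather than prose, unmapped entries can be grouped mechanically. The sketch below counts entries per side and reason; `summarise_unmapped` is a hypothetical helper, not something the `impact` skill provides.
+
+```python
+from collections import Counter
+
+
+def summarise_unmapped(unmapped: dict) -> dict[str, Counter]:
+    """Count unmapped entries per side ("spec", "code") and reason tag."""
+    return {
+        side: Counter(entry["reason"] for entry in unmapped.get(side, []))
+        for side in ("spec", "code")
+    }
+
+
+example = {
+    "spec": [{"id": "spec:Rule.ReassignOnDecline",
+              "reason": "no-workspace-symbol-match"}],
+    "code": [{"id": "code:interview.legacy.old_flow.handle",
+              "reason": "no-link"}],
+}
+summary = summarise_unmapped(example)
+print(dict(summary["spec"]))  # prints {'no-workspace-symbol-match': 1}
+```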
+
+## Integration contract
+
+This section is the compatibility promise between the `impact` skill and its consumers. Other skills read the JSON directly; the impact skill never renegotiates the schema without bumping `adapter_version`.
+
+### Reading the map
+
+Consumers MUST:
+
+- Resolve every node reference through the `nodes` table. Do not parse IDs into fields.
+- Treat unknown `via` values as opaque — record the link but do not rely on its provenance.
+- Treat unknown `kind` values the same way. New kinds may appear as Allium evolves.
+- Respect the `unmapped` section. A spec node in `unmapped.spec` has no implementation candidate; treat that as a finding, not a bug.
+- Read the linked code. The map points consumers at code; it does not replace reading it. A `link` tells you *where* the implementation lives, not *whether the implementation satisfies the spec construct's clauses* — that's the consumer's job.
+
+Consumers MUST NOT:
+
+- Write to the map. Only the `impact` skill produces it.
+- Invent links. If the map says a spec node is unmapped, the consumer does not silently "find" a match.
+- Cross-reference maps from different specs. Each map is scoped to one `.allium` file.
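+
+A minimal sketch of a consumer following the reading rules above: endpoints are looked up in `nodes` rather than parsed out of the ID, and a link with an unrecognised `via` is kept but flagged. The name `resolve_links` and the `via_known` flag are assumptions made for illustration.
+
+```python
+# `via` values enumerated earlier in this document.
+KNOWN_VIA = {
+    "name-match+hover", "name-match+single", "name-match+ambiguous",
+    "surface-decorator", "docstring-ref", "manual",
+}
+
+
+def resolve_links(impact_map: dict) -> list[dict]:
+    """Pair each link's endpoints with their records from `nodes`."""
+    nodes = impact_map["nodes"]
+    resolved = []
+    for link in impact_map["links"]:
+        resolved.append({
+            "spec": nodes[link["from"]],  # looked up, never parsed from the ID
+            "code": nodes[link["to"]],
+            "via": link["via"],
+            # Unknown provenance is recorded but flagged as not to be relied on.
+            "via_known": link["via"] in KNOWN_VIA,
+            "confidence": link["confidence"],
+        })
+    return resolved
+
+
+tiny_map = {
+    "nodes": {
+        "spec:Candidacy": {"kind": "entity", "file": "interview.allium", "line": 3},
+        "code:interview.models.Candidacy": {
+            "kind": "class", "file": "src/interview/models.py", "line": 2,
+            "fqn": "interview.models.Candidacy",
+        },
+    },
+    "links": [{
+        "from": "spec:Candidacy", "to": "code:interview.models.Candidacy",
+        "via": "name-match+single", "confidence": "high",
+        "rejected_candidates": [],
+    }],
+}
+print(resolve_links(tiny_map)[0]["code"]["file"])  # prints src/interview/models.py
+```
+
+The function only tells a consumer where to look; checking whether the code satisfies the spec construct still means opening the file.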
+
+### Opting in
+
+Consumer skills do not auto-invoke this map. Their default flows (`weed`, `propagate`, `distill`) use grep + read correlation, just as they did before the impact map existed. The user (or the user's prompt to the consumer skill) selects map mode explicitly — typically by saying "use the impact map", "in map mode" or "via impact" in the request.
+
+The presence of `.allium/impact/<spec>.json` is **not** by itself an opt-in signal for `weed`, `propagate` or `distill` — they ignore it unless the user asks. The exception is `tend`, whose pre-rename orphan-link warning fires defensively whenever a map exists; that's a one-shot warning, not a gate.
+
+The map's value is on demand; its absence is not a bug for any consumer.
+
+### When to rebuild
+
+The `impact` skill decides when a rebuild is needed. Consumers request a rebuild by invoking `impact` in `refresh` mode (cheap) or `build` mode (full). Typical triggers — all conditional on the user having opted into map mode for the consumer skill in question:
+
+- Map mode `weed` run — refresh first if the map exists, build if it does not.
+- Map mode `propagate` run — refresh first if the map exists, build if it does not.
+- Map mode `distill` on a spec with an existing skeleton — build.
+- After a large refactor — build (user-initiated, ahead of the next map-mode consumer run).
+- When map mode `weed` reports a surprising volume of divergences, suggesting the map is stale — refresh.
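+
+The first three triggers above are mechanical enough to sketch as a small decision function. This is an illustration under stated assumptions, not the `impact` skill's actual logic: `impact_mode` and its parameters are hypothetical, and the user-initiated triggers (large refactors, suspected staleness) are left out.
+
+```python
+from typing import Optional
+
+
+def impact_mode(consumer: str, map_mode: bool, map_exists: bool,
+                has_skeleton: bool = True) -> Optional[str]:
+    """Pick the `impact` mode to request before a consumer run, if any."""
+    if not map_mode:
+        return None  # default grep + read flow; the map is not consulted
+    if consumer in ("weed", "propagate"):
+        return "refresh" if map_exists else "build"
+    if consumer == "distill" and has_skeleton:
+        return "build"
+    return None
+
+
+print(impact_mode("weed", map_mode=True, map_exists=True))    # prints refresh
+print(impact_mode("weed", map_mode=False, map_exists=True))   # prints None
+```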
+
+### Graceful degradation
+
+This section applies once the user has opted into map mode and the map cannot be built or refreshed. The map is an optimisation, not a prerequisite. When the `impact` skill returns `degraded: true` (no language adapter matches the project, the required LSP plugin is missing, or the LSP is installed but not indexing), consumers MUST fall back to their default (grep + read) flow rather than refusing the work. Consumers should:
+
+- Note the degradation reason to the user once, not on every step.
+- Proceed with the default flow as they would have without map mode requested.
+- Not write a stub or partial JSON to `.allium/impact/` — only the `impact` skill produces map files, and it writes nothing when degraded.
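+
+The fallback above might look like this in a consumer. Only the `degraded` flag comes from this document; the `reason` key, the `choose_flow` name and the notice wording are assumptions made for illustration.
+
+```python
+def choose_flow(impact_result: dict, notices: list[str]) -> str:
+    """Choose the flow for a map-mode run given the `impact` skill's result."""
+    if impact_result.get("degraded"):
+        # Assumption: the result carries a short `reason` string.
+        reason = impact_result.get("reason", "unspecified")
+        notice = f"impact map unavailable: {reason}"
+        if notice not in notices:  # tell the user once, not on every step
+            notices.append(notice)
+        return "grep+read"  # the default flow; nothing is written to disk
+    return "map"
+
+
+notices: list[str] = []
+degraded = {"degraded": True, "reason": "no language adapter matches"}
+print(choose_flow(degraded, notices))  # prints grep+read
+print(choose_flow(degraded, notices))  # prints grep+read
+print(len(notices))                    # prints 1
+```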
+
+### Versioning
+
+`adapter_version` bumps when:
+
+- A language adapter's name-variant or confidence rules change.
+- A new `via` value is added.
+- A new `kind` is added.
+
+Consumers may log a warning if they see an `adapter_version` they don't recognise, but must still read the map. Forward compatibility is the goal.
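+
+A sketch of that forward-compatible behaviour, assuming the consumer keeps its own set of versions it has been tested against. `KNOWN_ADAPTER_VERSIONS`, `check_adapter_version` and the logger name are hypothetical.
+
+```python
+import logging
+
+# Assumption: versions this consumer was written against.
+KNOWN_ADAPTER_VERSIONS = {"python-v1"}
+
+log = logging.getLogger("allium.impact")
+
+
+def check_adapter_version(impact_map: dict) -> bool:
+    """Warn on an unfamiliar `adapter_version`; never refuse the map."""
+    version = impact_map.get("adapter_version")
+    known = version in KNOWN_ADAPTER_VERSIONS
+    if not known:
+        log.warning("unrecognised adapter_version %r; reading the map anyway",
+                    version)
+    return known
+
+
+print(check_adapter_version({"adapter_version": "python-v1"}))  # prints True
+```
+
+The return value is informational only: the caller carries on reading the map whether it is `True` or `False`.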
+
+## Worked example
+
+Given this spec fragment:
+
+```allium
+-- interview.allium
+
+entity Candidacy {
+    status: pending | active | closed
+}
+
+rule ScheduleInterview {
+    when: SchedulerTriggered(candidacy)
+    requires: candidacy.status = active
+    ensures: Interview.created(candidacy: candidacy, status: scheduled)
+}
+
+surface CandidateAPI {
+    provides: ScheduleInterview
+}
+```
+
+And this Python implementation:
+
+```python
+# src/interview/models.py
+class Candidacy:
+    status: str  # "pending" | "active" | "closed"
+
+# src/interview/services/scheduler.py
+def schedule_interview(candidacy: Candidacy) -> Interview:
+    """Witnesses rule ScheduleInterview."""
+    assert candidacy.status == "active"
+    return Interview.create(candidacy=candidacy, status="scheduled")
+
+# src/interview/api/routes.py
+@router.post("/candidacies/{id}/interviews")
+def create_interview(id: str):
+    return schedule_interview(get_candidacy(id))
+```
+
+The produced map:
+
+```json
+{
+  "spec": "interview.allium",
+  "language": "python",
+  "commit": "abc123",
+  "built_at": "2026-04-20T10:00:00Z",
+  "adapter_version": "python-v1",
+  "nodes": {
+    "spec:Candidacy":          { "kind": "entity",  "file": "interview.allium", "line": 3 },
+    "spec:Rule.ScheduleInterview": { "kind": "rule", "file": "interview.allium", "line": 7 },
+    "spec:Surface.CandidateAPI":   { "kind": "surface", "file": "interview.allium", "line": 13 },
+    "code:interview.models.Candidacy": {
+      "kind": "class", "file": "src/interview/models.py", "line": 2,
+      "fqn": "interview.models.Candidacy"
+    },
+    "code:interview.services.scheduler.schedule_interview": {
+      "kind": "function", "file": "src/interview/services/scheduler.py", "line": 2,
+      "fqn": "interview.services.scheduler.schedule_interview"
+    },
+    "code:interview.api.routes.create_interview": {
+      "kind": "function", "file": "src/interview/api/routes.py", "line": 2,
+      "fqn": "interview.api.routes.create_interview"
+    }
+  },
+  "links": [
+    { "from": "spec:Candidacy", "to": "code:interview.models.Candidacy",
+      "via": "name-match+single", "confidence": "high", "rejected_candidates": [] },
+    { "from": "spec:Rule.ScheduleInterview",
+      "to": "code:interview.services.scheduler.schedule_interview",
+      "via": "docstring-ref", "confidence": "high", "rejected_candidates": [] },
+    { "from": "spec:Surface.CandidateAPI",
+      "to": "code:interview.api.routes.create_interview",
+      "via": "surface-decorator", "confidence": "high", "rejected_candidates": [] }
+  ],
+  "call_edges": [
+    { "caller": "code:interview.api.routes.create_interview",
+      "callee": "code:interview.services.scheduler.schedule_interview",
+      "cross_module": true }
+  ],
+  "unmapped": { "spec": [], "code": [] }
+}
+```
+
+A `weed` run in **map mode** (the user asked for it) reads this map, sees every spec node linked and no unmapped code, and reports no structural divergences. It then reads the rule's `requires` (`status = active`) and compares it to the code (`assert candidacy.status == "active"`). The map does not change that check, but in map mode it takes `weed` straight to the right file in one hop instead of grepping. The default `weed` flow does not consult this map; it greps for `Candidacy` and `ScheduleInterview` and reads the matches.
diff --git a/skills/distill/SKILL.md b/skills/distill/SKILL.md
@@ -314,11 +314,19 @@ The presence of multiple implementations suggests the variation itself is a doma
 
 ## Distillation process
 
+### Step 0: Build the impact map (opt-in)
+
+If the user has explicitly directed you to use the impact map ("use the impact map", "in map mode", "via impact") **and** a spec skeleton already exists (even just entity names), invoke the [`impact` skill](../impact/SKILL.md) in `build` mode first. The resulting `.allium/impact/<spec>.json` gives you a head start: existing links tell you which code symbols already correspond to spec constructs, and `unmapped.code` is your candidate pool for new spec nodes to extract.
+
+If the user has not opted in, or you are distilling from a completely empty spec (no skeleton), skip this step. Distillation works with grep + read alone — that's the default.
+
+If the user opted in but the impact skill returns `degraded: true` (no adapter for this language, or the target LSP is unavailable), note the reason once and proceed without a map. Come back to building a map in a later pass if the language adapter gap is worth filling.
+
 ### Step 1: Map the territory
 
 Before extracting any specification, understand the codebase structure:
 
-1. **Identify entry points.** API routes, CLI commands, message handlers, scheduled jobs.
+1. **Identify entry points.** API routes, CLI commands, message handlers, scheduled jobs. In map mode, the impact map's surface links and `call_edges` entry points are your starting list; otherwise, grep for framework decorators / route definitions and read the matches.
 2. **Find the domain models.** Usually in `models/`, `entities/`, `domain/`.
 3. **Locate business logic.** Services, use cases, handlers.
 4. **Note external integrations.** What third parties does it talk to?