diff --git a/canon/decisions/models-do-not-mutate-canon.md b/canon/decisions/models-do-not-mutate-canon.md
index 74ff2c95..43124556 100644
--- a/canon/decisions/models-do-not-mutate-canon.md
+++ b/canon/decisions/models-do-not-mutate-canon.md
@@ -6,41 +6,50 @@ exposure: nav
 tier: 1
 voice: neutral
 stability: stable
-tags: ["canon", "decisions", "models", "mutation", "governance"]
+tags: ["canon", "decisions", "models", "mutation", "governance", "stewardship", "delegation", "authority", "epoch-10"]
 relevance: decision
 execution_posture: governing
+date: 2026-06-26
 ---

 # Models Do Not Mutate Canon

-> Models may analyze and report on Canon, but may not edit it.
+> Models do not mutate the operator's canon. A model may hold delegated, bounded, revocable stewardship over a sub-scope — governing within it, never above it, never over itself.

 ## Description

-This decision records that AI models (LLMs, agents, assistants) are not permitted to directly edit Canon content. Models may read, analyze, summarize, and report on Canon. They may draft proposed changes. But the act of mutation—writing changes to Canon files—requires human review and approval. This preserves Canon's role as stable, human-governed truth.
+This decision records that AI models (LLMs, agents, assistants) are not permitted to directly edit the operator's Canon. Models may read, analyze, summarize, and report on Canon. They may draft proposed changes. The act of mutating the operator's Canon—writing changes to its files—requires the operator's review and approval. This preserves Canon's role as stable, human-governed truth.
+
+This decision now also names the one axis along which a model may carry standing: **delegated stewardship over a bounded sub-scope.** The operator may grant a steward (human or model) authority over a sub-scope's governance — a delegated canon distinct from the operator's own. The steward governs within that scope and nowhere else.
This is not an exception to human-governed truth; it is a downward delegation of it, and it is revocable from above. (See *Delegated Stewardship* below, and `klappy://canon/principles/rulebook-transfer` for the authority-flow asymmetry this rests on.)

 ## Operating Constraints

-- MUST NOT allow models to write changes directly to Canon files
+- MUST NOT allow models to write changes directly to the operator's Canon files
 - MUST allow models to read, analyze, summarize, and report on Canon
-- MUST allow models to draft proposed changes for human review
-- MUST require human review and approval for all Canon mutations
-- MUST treat Canon as human-governed truth, not generated artifact
-
 ---
+- MUST allow models to draft proposed changes for review
+- MUST require operator review and approval for all mutations to the operator's Canon
+- MUST treat the operator's Canon as human-governed truth, not generated artifact
+- MUST confine any delegated steward's authority to its granted sub-scope — never upward, never over its own boundary, never over its own proposals
+- MUST keep every scope boundary owned by the granting tier, not by the steward it bounds
+- MUST keep every delegation revocable from above, with revocation recorded as an attributed act
+- MUST grant a delegated scope only to a tier with the capacity to steward it; stewardship capacity is not uniform across tiers, and a tier that can execute may still be unable to steward (see `klappy://canon/principles/rulebook-transfer`)

 ## Defaults

-- Models draft, humans commit
-- When a model detects a Canon error, report it rather than fix it
-- Treat any model attempt to edit Canon as a boundary violation
+- Models draft, humans commit — for the operator's Canon
+- Within a delegated sub-scope, the steward ratifies; across any scope boundary upward, it does not
+- When a model detects a Canon error above its scope, report it rather than fix it
+- Treat any model attempt to edit Canon above its scope, widen its own
scope, or ratify its own proposal as a boundary violation
 - Prefer slower Canon updates over model-driven drift

 ---

 ## Failure Modes

-- **Direct Mutation**: Model writes to Canon files, bypassing human review
+- **Direct Mutation**: Model writes to the operator's Canon files, bypassing review
+- **Upward Reach**: A delegated steward edits a scope above its own
+- **Self-Promotion**: A tier ratifies its own proposal or widens its own scope (a cycle of length one)
+- **Boundary Capture**: A steward edits the definition of its own scope, breaking downward-only authority
 - **Subtle Drift**: Well-meaning model edits introduce gradual inaccuracy
 - **Accountability Gap**: No human responsible for model-introduced changes
 - **Authority Erosion**: Canon becomes "just another generated file" when models edit freely
@@ -50,10 +59,12 @@ This decision records that AI models (LLMs, agents, assistants) are not permitte

 ## Verification

-- No commits to Canon files have model as author without human approval
-- Canon changes are traceable to human decisions
-- Models produce drafts and reports, not direct mutations
-- Boundary is enforced in tooling and process, not just policy
+- No commits to the operator's Canon files have a model as author without operator approval
+- Canon changes are traceable to an attributed ratification by a tier with authority over that scope
+- No standing-bearing row was ratified by the same tier that authored it
+- Scope boundaries are changed only by the granting tier; revocations are recorded
+- Models produce drafts and reports, not direct mutations of canon above their scope
+- The boundary is enforced in tooling and process, not just policy

 ---

@@ -61,9 +72,9 @@ This decision records that AI models (LLMs, agents, assistants) are not permitte

 ## Decision

-Models may not mutate Canon.
+Models may not mutate the operator's Canon. A model may hold delegated, bounded, revocable stewardship over a sub-scope.
-Specifically:
+Actions on the **operator's Canon**:

 | Action | Permitted |
 |--------|-----------|
@@ -74,25 +85,25 @@ Specifically:
 | Draft proposed changes | ✓ Yes |
 | Write changes to Canon files | ✗ No |

-## Status
-
-**Active**
+Actions on a **delegated sub-scope** (granted from above):

-## Context
+| Action | Permitted |
+|--------|-----------|
+| Govern / ratify within the granted scope | ✓ Yes (within scope only) |
+| Edit a scope above its own | ✗ No |
+| Widen or redefine its own scope boundary | ✗ No |
+| Ratify its own proposal | ✗ No |
+| Grant itself authority | ✗ No |

-Canon exists to preserve stable, shared truth across this program. Its value depends on:
+## Status

-- Careful curation
-- Intentional change
-- Human accountability
+**Active.** Refined 2026-06-26 to add the delegated-stewardship axis; the core prohibition on mutating the operator's Canon is unchanged.

-Models are powerful tools for analysis and drafting. However, models:
+## Context

-- Optimize for plausibility, not correctness
-- Cannot be held accountable for mistakes
-- May introduce subtle drift through well-meaning edits
+Canon exists to preserve stable, shared truth across this program. Its value depends on careful curation, intentional change, and human accountability. Models are powerful tools for analysis and drafting, but they optimize for plausibility over correctness, cannot be held accountable for mistakes, and may introduce subtle drift through well-meaning edits. Allowing models to directly mutate the operator's Canon would erode the trust boundary that makes Canon useful.

-Allowing models to directly mutate Canon would erode the trust boundary that makes Canon useful.
+The delegated-stewardship axis does not weaken that boundary. Authority flows downward only: the operator grants a bounded scope, the steward governs within it, and the grant is revocable. A steward never reaches up into the operator's Canon and never authors the definition of its own scope.
Accountability is preserved because every ratification and every revocation is attributed and recorded.

 ## Alternatives Considered

@@ -102,36 +113,45 @@ Allowing models to directly mutate Canon would erode the trust boundary that mak

 ### 2. Models may edit Canon with approval workflow

-**Rejected for now.** An approval workflow could work, but adds complexity. The simpler rule—no model mutation—is clearer and easier to enforce.
+**Accepted in narrow form.** Models draft; ratification is an attributed act by a tier with authority over the scope. For the operator's Canon, that tier is the operator. The earlier rejection ("adds complexity") stands for any workflow that lets a model commit to the operator's Canon; it does not stand against attributed downward ratification within a delegated scope.

-### 3. Models may edit Tier 3 but not Tier 1-2
+### 3. Models may edit Tier 3 but not Tier 1–2

-**Rejected.** This creates a confusing boundary. The rule should be simple: Canon does not get edited by models.
+**Rejected — and distinct from delegated stewardship.** Alternative 3 proposed graduated edit-rights *inside the operator's single Canon* (tiers of the operator's own truth). That remains rejected: the operator's Canon has no model-editable tier. Delegated stewardship is a different axis — a *separate, bounded sub-scope* granted downward, not a tier of the operator's Canon. The operator's Canon stays flatly non-model-mutable; a steward's canon is its own jurisdiction, not a privileged slice of the operator's.
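The delegated sub-scope rules in the Decision tables can be read as a small authorization check. The sketch below is illustrative only: `Scope`, `has_authority`, `may_ratify`, `may_change_boundary`, and the tier names are hypothetical and do not describe any existing tooling. It assumes that a tier has authority over a scope when it stewards that scope or any scope above it, which is one way to model "authority flows downward only".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    """Hypothetical model of a governed scope."""
    name: str
    steward: str                      # tier that governs within this scope
    parent: Optional["Scope"] = None  # granting scope; None for the operator's Canon


def has_authority(tier: str, scope: Scope) -> bool:
    """True if `tier` stewards `scope` or any scope above it (downward-only authority)."""
    current: Optional[Scope] = scope
    while current is not None:
        if current.steward == tier:
            return True
        current = current.parent
    return False


def may_ratify(ratifier: str, author: str, target: Scope) -> bool:
    """Ratification needs authority over the target and a ratifier other than the author."""
    if ratifier == author:
        return False  # self-promotion: a cycle of length one
    return has_authority(ratifier, target)


def may_change_boundary(tier: str, scope: Scope) -> bool:
    """Only the granting side may redefine or revoke a scope."""
    if scope.parent is None:
        # Assumption: the operator owns the boundary of the root Canon.
        return tier == scope.steward
    return has_authority(tier, scope.parent)


# Hypothetical tiers and scopes, for illustration only.
operator_canon = Scope(name="operator-canon", steward="operator")
sub_scope = Scope(name="delegated-sub-scope", steward="model-steward", parent=operator_canon)
```

Under these assumptions, `may_ratify("model-steward", "drafting-model", sub_scope)` is allowed, while the same steward ratifying against `operator_canon`, ratifying its own proposal, or changing the boundary of `sub_scope` is refused.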
 ## Consequences

 ### Enables

-- Canon remains human-governed
-- Changes to Canon are intentional and traceable
-- Models can still provide value through analysis and drafting
-- Clear boundary for model behavior
+- The operator's Canon remains human-governed
+- Self-building loops can close without unravelling, because authority stays acyclic
+- Delegated stewards can govern bounded sub-scopes without reaching upward
+- Changes are intentional, attributed, and traceable
+- Models still provide value through analysis, drafting, and in-scope stewardship

 ### Prevents

-- Subtle drift from well-meaning model edits
+- Subtle drift from well-meaning model edits to the operator's Canon
+- Upward reach, self-promotion, and boundary capture
 - Accountability gaps
 - Canon becoming "just another generated file"

 ### Costs

-- Slower Canon updates (requires human action)
-- Models cannot self-correct Canon errors they detect
-- Human bottleneck for Canon maintenance
+- Slower updates to the operator's Canon (requires operator action)
+- Delegation and revocation must be tracked as attributed acts
+- A model cannot self-correct an error in Canon above its scope

 ---

+## Refinement note (2026-06-26)
+
+This document extends the original binary decision (*models may analyze and report on Canon, but may not edit it*) with the delegated-stewardship axis introduced by `klappy://canon/principles/rulebook-transfer`. The original rule is preserved unchanged for the operator's Canon. The refinement adds only the downward-delegation axis and its guards. Landing mechanism (in-place amendment vs a superseding decision) and the `stability` setting are the operator's call; this draft sets `stability: semi_stable` to reflect active refinement.
+
 ## See Also

+- [Rulebook Transfer](klappy://canon/principles/rulebook-transfer) — the authority-flow asymmetry this rests on
+- [Everything Is a Project](klappy://docs/planning/kb-data-model) — governance is a role on a scope
+- [ODD Is a Value-Grounded Epistemic OS](klappy://canon/constraints/odd-is-epistemic-os-not-values) — authority is a governance-layer axis, not an epistemic-core one
 - [Epistemic Obligation and Document Tiers](/canon/definitions/epistemic-obligation-and-document-tiers.md)
 - [Constraints](/canon/constraints/README.md) — AI as Accelerator, Not Authority
diff --git a/canon/principles/rulebook-transfer.md b/canon/principles/rulebook-transfer.md
new file mode 100644
index 00000000..92c28636
--- /dev/null
+++ b/canon/principles/rulebook-transfer.md
@@ -0,0 +1,89 @@
+---
+uri: klappy://canon/principles/rulebook-transfer
+title: "Rulebook Transfer — Prompt Over Code, Bounded by Articulability"
+audience: canon
+exposure: nav
+tier: 1
+voice: neutral
+stability: stable
+tags: ["canon", "principle", "epoch-10", "agentic", "the-loop", "prompt-over-code", "discernment", "articulability", "stewardship", "delegation-ladder", "kirigami", "rulebook-transfer"]
+epoch: E0010
+date: 2026-06-26
+derives_from: "canon/principles/prompt-over-code.md, canon/principles/discernment-layer.md, canon/principles/skills-are-procedure-not-judgment.md, canon/decisions/models-do-not-mutate-canon.md, canon/principles/code-claims-require-code-observation.md, canon/principles/verification-requires-fresh-context.md"
+complements: "writings/shifting-bottlenecks-climbing-ladders.md (operator-side delegation ladder), writings/how-you-lead-is-what-you-build.md (delegation as graduation), docs/planning/kb-data-model.md (governance is a role on a scope), canon/principles/symmetric-participation.md, canon/constraints/odd-is-epistemic-os-not-values.md, kirigami contributor/custody/synthesized model, P0 cross-model reconstruction-fidelity sweep"
+governs: "How discernment transfers down
model tiers; how a self-building loop surfaces candidate governance without unravelling"
+status: active
+target_repo: "outcomes-driven-development"
+---
+
+# Rulebook Transfer — Prompt Over Code, Bounded by Articulability
+
+> Discernment a frontier model performs once can be written down as a rulebook a lesser model runs — but only the part that can be written down transfers, only adjacent tiers can hand it off cleanly, and authority over what becomes canon flows one way: downward, never up, never reflexive.
+
+## The principle
+
+Discernment a frontier model performs once can be crystallized into an explicit lens — a rulebook — that a lesser, ultimately local model executes without holding equivalent intelligence. Only the *articulable* portion of the discernment transfers. Tacit judgment stays in the frontier weights. A lens goes local in proportion to how expressible its cut-rules are.
+
+## Mechanism — prompt over code, applied to the fold
+
+This is [Prompt Over Code](klappy://canon/principles/prompt-over-code) turned on the act of discernment itself. The frontier model spends its judgment once, writing the lens; the rulebook is the prompt; the cheaper model runs the prompt. The cheaper model is not asked to be intelligent — it is asked to be faithful to something intelligent that already ran. Execution moves down-tier, and where the rulebook is explicit enough, onto local hardware.
+
+## The bound — why articulability
+
+The transfer is lossy, and it is lossiest at the hard cases. A rulebook captures the typical, median call well. The ambiguous tail — the place where frontier judgment earned its keep — is captured worst, and judgment that is irreducibly tacit does not reduce to writable rules at all. A lens whose cut-rules can be written down ships down-tier and to-device first; a lens whose discernment stays tacit remains frontier-bound or escalates.
Articulability is a property to measure, not assume, and it is the criterion that ranks which lenses move local first.
+
+This bound is the same line [Skills Are Procedure, Not Judgment](klappy://canon/principles/skills-are-procedure-not-judgment) already draws. That principle holds that a skill encodes the procedure and never the verdict; the verdict stays judgment. Rulebook Transfer is the generalization: what crosses a tier boundary is the articulable procedure, and what does not cross is the verdict. The expressible is portable; the tacit is not.
+
+## The recursion — and why you cannot skip rungs
+
+The transfer is not two layers. It is n-tier, and the operation is identical whichever tier sits on top: an upper tier distills its discernment into a rulebook the next tier runs. A human writing a rulebook a frontier model runs is the same move as a frontier model writing a lens a local model runs. "Systems that build systems" is this one operation permitted to run more than once.
+
+But each hop bridges only a bounded capability gap. Skip too many tiers and the rulebook silently assumes tacit context the executor does not have — and the loss does not degrade gently, it collapses, because the part that failed to transfer is exactly the unwritten judgment the bottom tier most needed. So transfer is a ladder of adjacent steps, not a cliff. The design variable is **step size** — how large a capability gap a single rulebook can bridge — and step size is itself bounded by articulability across *that specific gap*. The wider the gap, the more of the discernment must be made explicit; past some width the tacit residual is too large to write down, and the hop fails. "We cannot skip too much or it becomes impractical" is this constraint, stated exactly.
+
+## Two capacities: execution and stewardship
+
+The chain depends on two different capacities, and conflating them hides the real limit.
+
+- **Execution capacity** — run a rulebook from above faithfully.
This extends far down the tiers.
+- **Stewardship capacity** — author a rulebook *for the tier below*, hold a bounded domain, recognize what falls outside it, and escalate the rest. This is scarcer, and extends less far.
+
+The delegation ladder therefore has fewer rungs than the execution ladder. This mirrors any human organization: nearly everyone can execute competently within their lane; far fewer can manage, delegate, and judge what is out of their lane. So the transferability question is not a single number — "how far does it go?" — but two tests applied at each rung: can this tier (a) run a rulebook from above, and (b) author one for the tier below? A tier may pass (a) and fail (b) — a faithful executor that cannot steward. The ladder is mapped rung by rung, not assumed.
+
+This is the agent-side mirror of the operator-side ladder in [Shifting Bottlenecks, Climbing Ladders](klappy://writings/shifting-bottlenecks-climbing-ladders): *capability gets you to the next rung; trust, harness, and delegation maturity decide whether you can stand on it.* The same sentence governs both sides of the handoff.
+
+## The loop closes because authority flows one way
+
+A loop that surfaces its own governance unravels when authority forms a cycle: when something downstream can rewrite the rules upstream of it, the system begins ratifying itself, and early mistakes entrench. The guarantee against that is a single asymmetry — **information flows in every direction; authority flows in one.** Proposals travel up, down, and sideways freely. Authority travels downward only, and never forms a cycle. The loop is safe to close in the information dimension precisely because it stays acyclic in the authority dimension.
+
+The governance rules follow as corollaries of that asymmetry:
+
+- **Induction proposes; it does not ratify.** A regularity observed across many decisions is evidence *for* a candidate principle, not a principle.
It enters advisory only — in kirigami terms, custody `synthesized`, with named grounds, carrying no standing.
+- **No tier promotes itself.** A tier cannot ratify its own proposal, widen its own scope, or write rules about itself. Reflexive authority is a cycle of length one, forbidden for the same reason longer cycles are. ("Agents never self-promote" is this corollary, not a separate axiom.)
+- **No tier writes upward.** A tier cannot edit the canon of a tier above it. This preserves [Models Do Not Mutate Canon](klappy://canon/decisions/models-do-not-mutate-canon): a model never edits the operator's sovereign canon.
+- **Authority is delegated, bounded, and revocable.** A tier that has the capacity to steward may hold authority over a sub-scope granted from above — its own canon, within its jurisdiction — and may ratify within it. It never owns the definition of its own boundary, and the grant can be revoked from above. Governance is [a role on a scope](klappy://docs/planning/kb-data-model), not a possession.
+
+Authority is a role-overlay above the wire, not a wire primitive: [Symmetric Participation](klappy://canon/principles/symmetric-participation) keeps every peer identical at the wire, and jurisdiction is metadata above it. It is sited at the governance layer, not the epistemic core — [ODD does not define authority](klappy://canon/constraints/odd-is-epistemic-os-not-values), and this principle adds that axis where "governance is a role" already lives, rather than inside the epistemic OS.
+
+The verdict — promotion across a scope boundary — is never encoded into a lower tier. This is `Skills Are Procedure, Not Judgment` restated once more: encode the procedure, never the verdict.
+
+## Evidence, bounded
+
+Two layers of evidence, each scoped honestly.
+
+**Execution transfer** has probe support.
The fold-fidelity probe had three models (Haiku, Sonnet, Opus) fold the same corpus under the same lens; all three passed the faithfulness gate, with tier-1 pivotal agreement near 80–90%. This supports down-tier *execution* at the cloud tier. It is not support for the *device* tier; that is the hypothesis the P0 cross-model reconstruction-fidelity sweep is built to test, and P0 is post-build.
+
+**Stewardship transfer** has been observed at exactly one rung: a frontier-tier model holding delegated authority over real repositories in an operator-approved working agreement, proposing downward and escalating out-of-scope decisions upward for approval. That is N=1, and it sits at the *top* rung — the least surprising place for stewardship to hold and the least informative about whether it transfers downward. The multi-rung chain, in which each tier stewards the one below, is unproven.
+
+A further hazard sharpens both: an articulated rulebook can be fluent and still not describe what its author actually did. A confabulated rulebook reads well and reproduces badly. Fidelity is therefore never inferred from how good the rulebook reads; it is measured by testing that the lesser tier makes the same cuts — consistent with [Code Claims Require Code Observation](klappy://canon/principles/code-claims-require-code-observation): the claim is verified against observation, not against its own prose.
+
+## Open questions
+
+- **Star or chain?** Does stewardship capacity re-instantiate at each tier (a chain — every tier stewards the one below), or does it concentrate at the frontier (a star — one steward authoring for everyone)? The single observed rung does not distinguish them. This is the dominant empirical question for the whole model.
+- **Fixed or teachable?** Is stewardship capacity a fixed property of a model tier, or can a tier be brought up a rung with the right harness — the graduation arc of [How You Lead Is What You Build](klappy://writings/how-you-lead-is-what-you-build) applied to a model rather than a person? The answer decides whether the ladder is discovered or built.
+
+## Design implications
+
+- **Cascade, not replacement.** The lesser model folds the confident median and escalates the ambiguous tail. Temporality routes: fresh and typical rows stay local; stale or edge rows escalate to a frontier pass.
+- **Articulability ranks the roadmap.** Lenses move to-device in order of how much of their discernment is expressible.
+- **Build the ladder rung by rung.** Hand stewardship only between adjacent tiers, and test each rung's two capacities separately before relying on it. Do not assume a rung that has not been observed.
+- **Authorship is frontier; execution is local.** The substrate carries both the rulebook and the folded output in one shape, so the frontier-judgment → local-execution handoff has somewhere to live. The substrate stays blind; the lens carries the flavor ([Vodka Architecture](klappy://canon/principles/vodka-architecture)).
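The cascade and the "same cuts" fidelity test described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `Row`, `route`, `cut_agreement`, the confidence score, and both thresholds are hypothetical names and values chosen for the example. The principle itself does not specify how confidence or staleness is measured.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Row:
    """Hypothetical row awaiting a fold."""
    row_id: str
    local_confidence: float  # assumed: the local model's confidence in its own cut, 0..1
    age_days: float          # assumed: days since the row was last folded


def route(row: Row, min_confidence: float = 0.8, max_age_days: float = 30.0) -> str:
    """Return "local" for fresh, confident rows and "frontier" for stale or ambiguous ones."""
    if row.age_days > max_age_days:
        return "frontier"  # stale rows escalate
    if row.local_confidence < min_confidence:
        return "frontier"  # the ambiguous tail escalates
    return "local"         # the confident median stays local


def cut_agreement(local_cuts: Dict[str, str], frontier_cuts: Dict[str, str]) -> float:
    """Fraction of shared rows on which the local tier made the same cut as the frontier tier."""
    shared = local_cuts.keys() & frontier_cuts.keys()
    if not shared:
        raise ValueError("no rows in common to compare")
    same = sum(1 for key in shared if local_cuts[key] == frontier_cuts[key])
    return same / len(shared)
```

A rung would be trusted for execution only when `cut_agreement` on held-out rows clears whatever bar the operator sets. The default thresholds here are placeholders, not measured values.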