@@ -0,0 +1,107 @@
---
catalogue_id: F31
title: "ADR scope-reality divergence — batch frame over-scopes due to missing pre-dispatch source verification"
family: F1-Sediment (verification-gap sub-form)
severity: P1
status: ratified_2026-05-16
empirical_project: Cobrust v0.3.0 Phase F.3 sprint
cobrust_local_id: F27-candidate (adr-scope-reality-divergence.md)
date_ratified: 2026-05-16
second_corroborator: confirmed (P9-A + P9-B independent rediscovery before audit)
---

# F31 — ADR scope-reality divergence

## Symptoms

A Phase batch ADR (or any strategic-altitude planning document) cites "work
needed across crates X / Y / Z" without source-code verification. Sub-agents
dispatched against the ADR re-discover at spike time that the work is already
partly or fully shipped. They either:

- Pivot scope on the branch (organic recovery — correct, but wastes spike-time)
- Implement redundantly (regression — incorrect, and wasted dispatch cost)

Two or more sub-agents independently re-discover the same divergence,
confirming that the gap is structural (not an individual agent oversight).

## Root cause

Strategic-altitude authorship is required to write a coherent batch frame. But
the same altitude is inherently lossy on local source state. The author models
the codebase's state from memory/prior ADRs rather than live `grep` evidence,
and memory lags behind recent sprint merges.

This is F1-family: the rule "verify before scoping" exists as common sense but
has no enforcement gate at ADR-authorship time. Without an explicit
"pre-dispatch verification commit", the gap propagates to every sub-agent
dispatched against the stale scope.

## SOP fix — ADR pre-dispatch source-code verification gate

Add this gate whenever a batch-frame ADR is authored:

**Phase 1 — source verification (mandatory, in the same commit as the ADR)**:
1. Run at least 3 representative `grep -nE` calls against the cited crates
(symbol search, not file-existence check).
2. For each claimed "work needed", record the grep result in ADR §"Verification"
section: either "not found — gap confirmed" or "found at `file::symbol` —
gap already shipped".
3. Scope sub-ADRs (sub-sprints) only for confirmed gaps.

**Phase 2 — sub-dispatch (once Phase 1 committed)**:
Dispatch sub-agents with a reference to the verification commit. The sub-agent's
§"Done means" criteria must start with "confirm gap still exists at HEAD SHA".

The two-phase pattern ensures the verification is co-located with the ADR,
visible to every future reader, and forms a diff-based audit trail.
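
As a rough illustration of the Phase 1 gate, the sketch below classifies each claimed "work needed" symbol as either a confirmed gap or already shipped, and keeps only confirmed gaps for sub-ADR scoping. It is a minimal sketch under stated assumptions: `verify_claim`, `scope_confirmed_gaps`, and the in-memory `(path, contents)` pairs are hypothetical stand-ins for real `grep -nE` runs against the cited crates, not part of the Cobrust tooling.

```rust
/// Outcome of checking one claimed "work needed" item against the source.
#[derive(Debug, PartialEq)]
pub enum Verification {
    /// Symbol search found nothing: the gap is real and may be scoped.
    GapConfirmed,
    /// Symbol already exists: the work has at least partly shipped.
    AlreadyShipped { location: String },
}

/// Searches in-memory `(path, contents)` pairs for `symbol`.
/// Hypothetical stand-in for a `grep -nE` symbol search.
pub fn verify_claim(symbol: &str, sources: &[(&str, &str)]) -> Verification {
    for (path, contents) in sources {
        if contents.contains(symbol) {
            return Verification::AlreadyShipped {
                location: format!("{path}::{symbol}"),
            };
        }
    }
    Verification::GapConfirmed
}

/// Keeps only the claims whose gap is confirmed; these are the only
/// ones that should receive a sub-ADR.
pub fn scope_confirmed_gaps<'a>(
    claims: &[&'a str],
    sources: &[(&str, &str)],
) -> Vec<&'a str> {
    claims
        .iter()
        .copied()
        .filter(|symbol| verify_claim(symbol, sources) == Verification::GapConfirmed)
        .collect()
}
```

The `Verification` value for each claim is what would be recorded in the ADR's §"Verification" section; a plain substring match is cruder than a real symbol search and is used here only to keep the sketch self-contained.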

## Evidence

Cobrust ADR-0050 Phase F.3 batch, 2026-05-16 (SHA `891d235`):

- ADR scoped 5 P0 features as "work needed".
- Pre-impl audit (read-only opus `afe53e8f`) + two independent P9 spike commits
  (`1998dbe`, `909811f`) found that 3/5 features were already substantially shipped
  at HEAD `30cf2b2`:
  - `break`/`continue`: fully shipped end-to-end (lexer → AST → MIR → Cranelift).
  - `for`-loop protocol: operational over `list[i64]` + `list[str]` since ADR-0044.
  - `f64`: 80% shipped; remaining gap was a D2-sonnet scope, not D4-opus-1-week.
- Only `Str`-ownership debt (ADR-0050c) and `dict` (Wave 3) survived as honestly
  large work.
- Batch estimate revised from 4-5 weeks to 2-3 weeks after correction.
- Two redundant re-discoveries (P9-A + P9-B) before the dedicated pre-impl audit
confirmed the pattern is structural, not accidental.

## Counter-pattern

Instead of:
```
Write batch ADR → dispatch sub-agents → sub-agents re-discover scope
```

Use:
```
Write batch ADR frame → run 3+ verification greps → amend ADR with results
→ dispatch sub-agents against verified gaps only
```

The pre-dispatch source verification gate converts a passive documentation
convention into an active confirmation step.

## Cross-references

- F34 (numeric-anchor degradation) — sibling; F31 is a scope-gap at ADR
authorship time; F34 is a symbol-anchor gap at doc maintenance time.
- F32 (wave-2 cascade discovery deficit) — downstream: when F31 produces
over-scoped sub-ADRs, F32-style cascade bugs surface during the impl sprint
because the over-scoped design didn't enumerate all consumers.
- Cobrust finding: `docs/agent/findings/adr-scope-reality-divergence.md`
- Cobrust ADR: `docs/agent/adr/0050-phase-f3-language-completeness-batch.md`
§"Amendment 2026-05-16"

## Status

Ratified 2026-05-16. Two-phase gate adopted in Cobrust CTO runbook for all
batch-frame ADR authorship. Second-corroborator requirement satisfied by
independent P9-A + P9-B rediscoveries.
@@ -0,0 +1,116 @@
---
catalogue_id: F32
title: "PAIR pattern impl gap under single-layer sub-agent architecture"
family: methodology-gap (PAIR-topology sub-form)
severity: P0 (methodology integrity)
status: ratified_2026-05-16
empirical_project: Cobrust v0.3.0 Phase F.3 sprint
cobrust_local_id: F28-candidate (adsd-pair-pattern-impl-gap.md)
date_ratified: 2026-05-16
second_corroborator: structural (platform inspection confirms no Agent tool in sub-agents)
---

# F32 — PAIR pattern impl gap under single-layer sub-agent architecture

## Symptoms

ADSD §"Dev/test pair pattern" prescribes "P9 spawns P7-TEST first, then P9
reviews the corpus, then P9 spawns P7-DEV". A project adopts this ceremony but
sub-agents (P9-level) do not have the `Agent` tool — the platform only exposes
it to the top-level orchestrator (P10/main session).

Result: the PAIR ceremony is **structurally unimplementable as written** on
single-layer platforms. Sub-agents either:

- Silently ignore the instruction and perform TEST + DEV as a single-Opus pass
(ceremonial PAIR, same-agent bias retained)
- Write sequential "phases" within their own context (still single agent,
bias not eliminated)
- Send a message back to the orchestrator requesting a new dispatch
(workable, but high coordination overhead)

The same-agent bias ADSD designed PAIR to prevent is present even when the
PAIR ceremony is nominally followed.

## Root cause

ADSD's PAIR pattern was written assuming a multi-layer agent architecture where
P9 can recursively dispatch P7-tier agents. Under Claude Code (and any other
single-layer orchestration platform), sub-agents have a constrained tool set
that excludes the Agent/dispatch tool. The PAIR ceremony cannot be implemented
by the sub-agent itself.

Same-agent bias: when one agent writes both the failing tests and the
implementation, the tests tend to mirror the author's mental model rather than
independently probe the spec. Constitution §6 "test-first" is honored in form
but not in spirit.

## SOP fix — P10-direct PAIR dispatch

On single-layer platforms, the orchestrator (P10/main session) MUST dispatch
the TEST and DEV agents itself, as two separate calls in sequence:

1. **P10 dispatches TEST agent**: "Write failing test corpus only; forbidden
to write impl; report `[TEST-CORPUS-READY]` with file paths + assertion
counts + `cargo test` fail count."
2. **P10 reviews TEST corpus** (~5-10 min): coverage / spec-faithfulness /
edge cases. Sends amendment message if needed.
3. **P10 dispatches DEV agent** with TEST's commit SHA + corpus paths as
**required input**. DEV implements until `cargo test` reports 0 failures.
4. **P10 verifies** that all gates are green, then merges.

**When NOT to use P10-direct PAIR** (P9 single-Opus is fine):
- ADR-authoring sprints (doc-only, no impl)
- Strategic decomposition where there's no impl yet
- Doc-only edits, runbook updates, frontmatter stamps
- Pre-impl audits (read-only is correct)

**Coordination overhead trade-off**: P10-direct PAIR costs ~2× dispatch
ceremony per sprint. For load-bearing sprints (contract-bearing public API,
novel semantics, multi-crate refactor) the methodological guarantee is worth
the cost. For trivial sprints (D1 well-scoped doc fix), single-sonnet is fine.
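
The ordering constraint in steps 1-3 can be sketched as a small gate: the DEV brief can only be built from a TEST report, and only after the P10 review. This is an illustrative sketch, not the runbook's implementation; `TestCorpusReport`, `DevBrief`, `PairGateError`, and `prepare_dev_brief` are hypothetical names, and treating a corpus with zero failing tests as an error is an assumption drawn from the "write failing test corpus only" instruction.

```rust
/// What the TEST agent reports with `[TEST-CORPUS-READY]`.
/// Field names are illustrative, not a Cobrust schema.
#[derive(Debug, Clone, PartialEq)]
pub struct TestCorpusReport {
    pub commit_sha: String,
    pub corpus_paths: Vec<String>,
    pub failing_tests: usize,
}

/// Brief handed to the DEV agent; it cannot be built without a TEST report.
#[derive(Debug, Clone, PartialEq)]
pub struct DevBrief {
    pub test_commit_sha: String,
    pub corpus_paths: Vec<String>,
}

/// Reasons the orchestrator must not dispatch DEV yet.
#[derive(Debug, PartialEq)]
pub enum PairGateError {
    EmptyCorpus,
    /// Assumption: a test-first corpus must fail before any impl exists.
    NoFailingTests,
    ReviewPending,
}

/// Builds the DEV brief only when the TEST corpus exists, fails as
/// expected, and has been reviewed by the orchestrator (P10).
pub fn prepare_dev_brief(
    report: &TestCorpusReport,
    p10_review_done: bool,
) -> Result<DevBrief, PairGateError> {
    if report.corpus_paths.is_empty() {
        return Err(PairGateError::EmptyCorpus);
    }
    if report.failing_tests == 0 {
        return Err(PairGateError::NoFailingTests);
    }
    if !p10_review_done {
        return Err(PairGateError::ReviewPending);
    }
    Ok(DevBrief {
        test_commit_sha: report.commit_sha.clone(),
        corpus_paths: report.corpus_paths.clone(),
    })
}
```

Making the TEST commit SHA a required field of `DevBrief` mirrors the "required input" rule: a DEV dispatch without a TEST corpus cannot be expressed at all.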

## Evidence

Cobrust Phase F.3 Wave 1, 2026-05-16:

- `cto_operations_runbook.md` §"Dev/test pair pattern" prescribed
"P9-spawns-P7-TEST-then-P7-DEV".
- 3 P9 Opus sprint dispatches with full PAIR ceremony in the prompt.
- Tool surface inspection confirmed: P9 sub-agents have no `Agent` tool.
- 2/3 sprints (P9-A break/continue `1998dbe`, P9-B for-loop `909811f`)
executed as single-Opus contract-seal + corpus. The PAIR ceremony was
ceremonial — no double-blind separation achieved.
- 1/3 sprint (P9-C dict design `8466433`) was ADR-only and didn't need PAIR.
- The user surfaced this gap on 2026-05-16 during Wave 1 dispatch review.
- Runbook updated 2026-05-16 to mark P9-PAIR as structurally invalid and
replace it with P10-direct PAIR for D1-D3 / D5 sprints.

## Platform dependency note

This failure mode is specific to platforms that do not expose the Agent/dispatch
tool to sub-agents. On platforms that support recursive agent dispatch (e.g.
AutoGen, CrewAI, future Claude Code multi-layer), P9 can dispatch P7 as
originally written. ADSD methodology should declare PAIR's
implementation-layer responsibility explicitly per platform tier:

- Multi-layer platform: P9 dispatches P7-TEST + P7-DEV as written.
- Single-layer platform: Orchestrator (P10) directly dispatches TEST + DEV;
P9 layer reserved for ADR-authoring + strategic decomposition.
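
The per-tier rule above amounts to a two-way lookup, sketched below. `PlatformTier`, `Dispatcher`, and `pair_dispatcher` are hypothetical names introduced only for illustration; ADSD does not define such an API.

```rust
/// Whether sub-agents can themselves dispatch further agents.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PlatformTier {
    MultiLayer,
    SingleLayer,
}

/// The layer responsible for dispatching the TEST and DEV agents.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Dispatcher {
    P9,
    P10,
}

/// Picks the layer that owns PAIR dispatch for a given platform tier.
pub fn pair_dispatcher(tier: PlatformTier) -> Dispatcher {
    match tier {
        // Recursive dispatch available: PAIR works as originally written.
        PlatformTier::MultiLayer => Dispatcher::P9,
        // No Agent tool in sub-agents: the orchestrator must dispatch.
        PlatformTier::SingleLayer => Dispatcher::P10,
    }
}
```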

## Cross-references

- F33 (agent self-disciplinary rule skip) — F32 is the structural reason PAIR
discipline breaks: the rule is there, but the agent physically cannot execute it.
- F36 (TEST corpus exit-0 claim drift) — downstream: even when P10-direct PAIR
runs, F36 shows the TEST corpus clean-claim itself needs independent re-verification.
- Cobrust finding: `docs/agent/findings/adsd-pair-pattern-impl-gap.md`
- Cobrust memory: `feedback_adsd_pair_pattern_impl_gap.md`
- ADR reference: `docs/agent/adr/0050-phase-f3-language-completeness-batch.md`
§"Amendment 2026-05-16" §A7

## Status

Ratified 2026-05-16. P10-direct PAIR pattern adopted in Cobrust CTO runbook
for all D1-D3 / D5 sprints. Cobrust Phase F.3 Wave 2 + Wave 3 used P10-direct
PAIR as the new standard pattern.
@@ -0,0 +1,117 @@
---
catalogue_id: F33
title: "Predicate-flip cascade discovery deficit — F29-style enumeration misses latent consumers"
family: cascade-discovery-gap
severity: P2 (methodology integrity)
status: ratified_2026-05-16
empirical_project: Cobrust v0.3.0 Phase F.3 Wave 2 (ADR-0050c Str-ownership)
cobrust_local_id: F30-candidate (predicate-flip-cascade-discovery-deficit.md)
date_ratified: 2026-05-16
second_corroborator: audit teammate a15e69b315007f341 (post-Wave-2)
---

# F33 — Predicate-flip cascade discovery deficit

## Symptoms

A sub-ADR proposes flipping a shared MIR / codegen / type-system predicate
(e.g. `is_copy_type(Ty) → bool`, `is_drop_eligible(Ty) → bool`,
`is_pointer_type(Ty) → bool`). An F29-style §"Consequences" enumeration
captures direct consumers (all call sites of the predicate found via
static `grep`).

During implementation, the DEV agent surfaces additional cascade bugs that
the enumeration missed. These are **latent consumers** — code paths that
existed in the codebase but were unreachable under the old predicate state.
Recovery wall-time scales with the latent-consumer set size, not the
direct-consumer set size.

Signature symptom: the DEV dispatch stalls or runs significantly over time
budget while triaging cascade bugs serially.

## Root cause

F29-style static `grep` enumeration finds call sites — places in the code
that call the predicate function. It cannot enumerate:

1. **Placeholder-returning stubs** that were safe under the old predicate
(e.g. `lower_constant(Str)` returning `0`: harmless while Str was treated
as Copy, a wrong placeholder once Str becomes non-Copy).
2. **Dispatch sites with IR-level type witnesses** that no longer correlate
with MIR type after the flip (e.g. f-string holes dispatching on `i64`
Cranelift value-type because Str pointers happen to be `i64` in IR).
3. **Bookkeeping calls** whose errors were masked under the old predicate
(e.g. `set_param_count` with an off-by-one that still produced correct output
while the predicate gated off non-Copy local enumeration).

All three classes are invisible to static symbol-search enumeration. They are
only discoverable via runtime test-failure analysis after the predicate flips.
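
The toy model below illustrates the first class (a placeholder-returning stub) and why a call-site search misses it. It is a deliberately simplified sketch, not Cobrust's code: the function names echo the ones cited above, but every signature, the `flipped` parameter, and `emit_drop` are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Ty {
    I64,
    Str,
}

/// Shared predicate. `flipped` models the proposed change:
/// before the flip Str counts as Copy, after it does not.
pub fn is_copy_type(ty: Ty, flipped: bool) -> bool {
    match ty {
        Ty::I64 => true,
        Ty::Str => !flipped,
    }
}

/// Direct consumer: it calls the predicate, so a call-site grep finds it.
pub fn needs_drop(ty: Ty, flipped: bool) -> bool {
    !is_copy_type(ty, flipped)
}

/// Latent consumer: it never calls the predicate, so grep cannot find it.
/// The `0` placeholder for Str was harmless while nothing dropped a Str.
pub fn lower_constant(ty: Ty, value: i64) -> i64 {
    match ty {
        Ty::I64 => value,
        Ty::Str => 0, // placeholder pointer
    }
}

/// Toy drop step: dropping a non-Copy value whose lowered pointer is the
/// placeholder `0` is the cascade bug that only appears after the flip.
pub fn emit_drop(ty: Ty, lowered: i64, flipped: bool) -> Result<(), String> {
    if needs_drop(ty, flipped) && lowered == 0 {
        return Err("drop of placeholder pointer".to_string());
    }
    Ok(())
}
```

Before the flip, `emit_drop(Ty::Str, lower_constant(Ty::Str, 42), false)` succeeds; after it, the same call fails even though `lower_constant` itself was never edited and never mentions the predicate.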

## SOP fix — shadow-flip dry-run workflow

Every predicate-flip ADR must mandate a "shadow-flip dry-run" during design:

1. **Land the flip behind a feature flag** in the design-only ADR commit
(e.g. `#[cfg(predicate_flip_NN)]` or a runtime config toggle).
2. **Run `cargo test --workspace`** with the flag ON against the current
HEAD corpus.
3. **Classify each new failure**:
   - Direct-consumer (enumerated in §"Consequences"): expected.
   - Latent-consumer (new, not enumerated): add to §"Consequences addendum".
   - Genuine semantic breakage from the flip: note in ADR, fix design or scope.
4. **Enumerate all latent consumers** in a §"Consequences addendum" before
removing the flag.
5. The pre-flag baseline + post-flag baseline diff IS the complete F29
enumeration; the pre-impl audit verifies completeness.
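
Step 1 could look roughly like the sketch below, which gates the flipped predicate behind a compile-time flag so the same HEAD can be tested with the flip off and on. This is one possible shape under stated assumptions: `shadow_flip_str` is a hypothetical Cargo feature name that would need to be declared in the crate's manifest, and the `Ty` enum and predicate signature are simplified stand-ins.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Ty {
    I64,
    Str,
}

/// Shared predicate with the flip gated behind a compile-time flag.
/// `shadow_flip_str` is a hypothetical feature name.
pub fn is_copy_type(ty: Ty) -> bool {
    match ty {
        Ty::I64 => true,
        // Flag off: old behaviour, Str is treated as Copy.
        // Flag on: shadow-flipped behaviour, Str becomes non-Copy.
        Ty::Str => !cfg!(feature = "shadow_flip_str"),
    }
}
```

Using `cfg!` rather than `#[cfg]` keeps both arms type-checked in every build, so the flag-off build cannot silently rot while the dry-run is in progress.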

**Cost/benefit**: the shadow-flip takes a few hours, roughly doubling
design-ADR effort, but pays back ~10× in impl wall-time. A latent consumer
found at design time costs 1 line of doc; one found during implementation
costs about 1 hour of debugging.
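
The classification in step 3 can be partly mechanised once each failure has been attributed to a consumer. The sketch below assumes that attribution has already been done by hand and that consumers are identified by name; `ShadowFlipReport` and `classify_new_failures` are hypothetical names. It separates direct from latent consumers only; deciding whether a failure is genuine semantic breakage still needs human judgement.

```rust
use std::collections::BTreeSet;

/// Result of diffing the flag-off and flag-on test runs.
#[derive(Debug, Default, PartialEq)]
pub struct ShadowFlipReport {
    /// New failures already listed in the ADR's "Consequences" section.
    pub direct: Vec<String>,
    /// New failures not listed: candidates for the addendum.
    pub latent: Vec<String>,
}

/// Classifies consumers that fail only with the flag on.
/// Failures present in the baseline run are ignored as pre-existing.
pub fn classify_new_failures(
    baseline_failures: &BTreeSet<String>,
    flagged_failures: &BTreeSet<String>,
    enumerated: &BTreeSet<String>,
) -> ShadowFlipReport {
    let mut report = ShadowFlipReport::default();
    // `difference` yields entries in `flagged_failures` that are absent
    // from `baseline_failures`, in sorted order.
    for consumer in flagged_failures.difference(baseline_failures) {
        if enumerated.contains(consumer) {
            report.direct.push(consumer.clone());
        } else {
            report.latent.push(consumer.clone());
        }
    }
    report
}
```

Everything that lands in `latent` belongs in the §"Consequences addendum" before the flag is removed.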

## Evidence

Cobrust ADR-0050c Wave 2, 2026-05-16:

- ADR-0050c §"Consequences" enumerated **27 direct consumers** via thorough
pre-impl audit.
- Wave 2 DEV agent (`a2056acb07469204f`) surfaced **7 additional latent
  consumers** as cascade bugs:
  - `lower_constant(Str)` returning `0` pointer sentinel (M9-era stub)
  - f-string hole dispatch on `i64` Cranelift type
  - `set_param_count` off-by-one
  - 4 additional Wave-2 cascade fixes (per merge `aca5d87`)
- **Miss rate: 26%** (7 latent consumers missed relative to the 27 enumerated,
  7/27; measured against all 34 consumers, the enumeration missed about 21%).
- List[str] DEV recovery agent stalled at 600s mid-investigation; cascade
bugs surfaced serially over ~5h recovery wall-time.
- A shadow-flip dry-run during ADR-0050c design could have surfaced all 7
within 1-2h, allowing the impl PAIR DEV to start with a complete enumeration.

## Pattern signal

Watch for F33 when:
1. A sub-ADR proposes flipping a **shared predicate** (a function returning
`bool` that gates MIR / codegen / type-check behavior on type or value shape).
2. The §"Consequences" enumeration uses **static `grep`** of call sites rather
than **runtime-observed** consumer behavior.
3. The codebase has multiple eras of code (e.g. M9 stubs, earlier-phase paths,
compiler extension surfaces) where different eras gated off the predicate
differently.

## Cross-references

- F31 (ADR scope-reality divergence) — F33 extends F31's "verify-at-HEAD"
discipline to "verify-under-shadow-flip".
- F34 (wave-2 cascade discovery deficit) — third instance corroboration of
same pattern (narrower domain: method-dispatch infrastructure).
- Cobrust finding:
`docs/agent/findings/predicate-flip-cascade-discovery-deficit.md`
- Cobrust ADR: `docs/agent/adr/0050c-str-ownership.md`
- Latent consumer findings:
`docs/agent/findings/lower-constant-str-zero-pointer-m9-stub.md`,
`docs/agent/findings/fstring-hole-mir-type-dispatch.md`

## Status

Ratified 2026-05-16. Shadow-flip dry-run workflow added to Cobrust CTO runbook
as mandatory for all predicate-flip sub-ADRs. Post-Wave-2 audit teammate
`a15e69b315007f341` confirmed the 26% miss rate as the second corroborator.