1 change: 1 addition & 0 deletions docs/dev/README.md
@@ -120,6 +120,7 @@ A task moves to validation after implementation is complete. The work here is to
- **What it produces:** the auditor works on a separate throwaway checkout (never the implementation worktree) and never mutates the deliverable. It tries to REFUTE the validation — construct an adversarial edit that the deliverable's own tests should catch, and confirm they do. A test that stays green under an edit that breaks the claim is a hole. Findings come in two tiers — `Material:` (a real correctness or test-strength hole, e.g. an assertion that green-lights a regression) and `Polish:` (non-blocking). "Refuted nothing material" is itself a valid, recorded outcome.
- **How it is recorded:** material findings route back through the normal validation→implementation feedback flow (a `### Feedback Cycles` entry naming the audit and its adversarial edit); the gate is not presented as clean until they are closed. A clean audit is noted in the gate's reviewer-findings block (or a one-line "detached audit: no material findings").
- **Why:** the audit catches the class of hole where the test passes but would also pass on a broken future edit — which validation, trusting its own green suite, cannot see. Real catches on the record: #262 (`binary-absent-fo-bootstrap`) — validation passed correct prose, the audit then found two test-strength holes in `contract_gate_test.go` (a `strings.Count(...) > 0` check that skipped on zero mentions, and a bare `strings.Contains` satisfied by a negated disclaimer); `1x` (`code-cleanups-0193`) AC-6 and `external-tracker-checkpoint` AC-6 (a self-referential "verified by review of this entity's own section" that can never fail); `7h` (`release-notes-local-summary`) AC-3 (validation passed, the audit found the tag-cut folded the notes block into the tag subject instead of the body). This is read-only refutation, not a second implementation pass.
- **Instruction-file read quarantine:** tests do not read prompt or instruction files except in `internal/contractlint`, and that package is limited to structural checks: reference closure, frontmatter validity, structural absence, dedup, and similar machine-checkable properties. Prose-grep is banned: a test that asserts a skill, contract, agent file, or this workflow README contains its own wording proves only that the wording is present. Code-bound prose checks are banned too: a prose-to-code consistency lint is not a behavior test, and it must never substitute for running the behavior. If a deleted prose/code-bound read exposed an untested behavior that still matters, record the owed behavior test instead of keeping the read. The boundary guard fails on instruction-file reads outside the quarantine; high-stakes validation still uses detached adversarial audit to refute whether the remaining behavior tests would catch a broken edit.
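
The two test-strength holes named in the **Why** bullet can be illustrated with a small sketch. This is a hypothetical reconstruction of the hole shape only, not the code from `contract_gate_test.go`; the function name `weakCheck` and the sample strings are assumptions made for illustration. It shows a check that stays green under two adversarial edits, which is exactly what the detached audit looks for.

```go
package main

import (
	"fmt"
	"strings"
)

// weakCheck is a HYPOTHETICAL check with both hole shapes described above:
// it only asserts when the keyword is mentioned at least once, and the
// assertion itself is a bare substring match.
func weakCheck(doc string) bool {
	if strings.Count(doc, "bootstrap") > 0 {
		return strings.Contains(doc, "bootstrap is required")
	}
	return true // zero mentions: the assertion is skipped and the check passes
}

func main() {
	original := "The bootstrap is required before first run."
	// Adversarial edit 1: delete every mention of the keyword.
	deleted := "Nothing relevant is said here."
	// Adversarial edit 2: negate the claim while keeping the matched tokens.
	negated := "It is not the case that bootstrap is required."

	fmt.Println("original passes:", weakCheck(original))
	fmt.Println("deleted passes: ", weakCheck(deleted)) // green on a broken edit
	fmt.Println("negated passes: ", weakCheck(negated)) // green on a broken edit
}
```

Both broken inputs pass, so under the audit's rules each would be recorded as a `Material:` finding. Under the quarantine policy the remedy is a test that runs the behavior, not a stricter grep.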

### `done`

149 changes: 149 additions & 0 deletions internal/contractlint/boundary_guard_control_test.go
@@ -0,0 +1,149 @@
// ABOUTME: Mutation control for the boundary guard — a planted out-of-quarantine
// ABOUTME: instruction read must RED the detector; a non-instruction read must not.
package contractlint

import (
"go/parser"
"go/token"
"os"
"path/filepath"
"testing"
)

// TestBoundaryGuardDetectsAPlantedInstructionRead is the guard's mutation control:
// the guard is the boundary oracle, so it must be demonstrated to RED on the exact
// shape it bans (a test that reads an instruction file) and stay GREEN on a read
// that is not an instruction file. Without this, the guard could silently degrade
// to a no-op (e.g. the detector predicate inverted) and pass vacuously. It parses
// synthetic source through the same detector the real sweep uses.
func TestBoundaryGuardDetectsAPlantedInstructionRead(t *testing.T) {
dir := t.TempDir()

// A planted out-of-quarantine instruction read: reads a skill body and inspects
// it. The retired marker model would have let this pass with a declaration; the
// guard bans it regardless.
instructionRead := `package fixture
func TestReadsInstruction(t *T) {
data, _ := os.ReadFile("../../skills/first-officer/references/first-officer-shared-core.md")
if strings.Contains(string(data), "x") { _ = data }
}
`
// A non-instruction read: reads a JSON manifest (the binary parses it) and a
// generated state entity. Neither is an instruction surface, so the guard must
// not flag it.
nonInstructionRead := `package fixture
func TestReadsManifest(t *T) {
a, _ := os.ReadFile("../../.agents/plugins/marketplace.json")
b, _ := os.ReadFile(filepath.Join(stateDir, "entity", "index.md"))
_ = a
_ = b
}
`
nonInstructionWalk := `package fixture
func TestWalksFixtureMarkdown(t *T) {
filepath.WalkDir("testdata", func(path string, d DirEntry, err error) error {
if strings.HasSuffix(path, ".md") {
_ = path
}
return nil
})
}
`
writeFixture(t, filepath.Join(dir, "instruction_read_test.go"), instructionRead)
if got := instructionReadingTestFiles(t, dir); !contains(got, "instruction_read_test.go") {
t.Fatalf("guard failed to flag a planted instruction-file read; flagged=%v", got)
}

writeFixture(t, filepath.Join(dir, "manifest_read_test.go"), nonInstructionRead)
writeFixture(t, filepath.Join(dir, "fixture_walk_test.go"), nonInstructionWalk)
got := instructionReadingTestFiles(t, dir)
if contains(got, "manifest_read_test.go") {
t.Fatalf("guard wrongly flagged a non-instruction read (json manifest / generated state entity); flagged=%v", got)
}
if contains(got, "fixture_walk_test.go") {
t.Fatalf("guard wrongly flagged a non-instruction markdown fixture walk; flagged=%v", got)
}
if !contains(got, "instruction_read_test.go") {
t.Fatalf("adding a non-instruction fixture must not stop the guard flagging the instruction one; flagged=%v", got)
}
}

// TestBoundaryGuardSweepFlagsAPlantedDir drives the directory sweep itself (not just
// the per-file detector) against a planted repo layout: a policed dir outside the
// quarantine that reads an instruction file must appear as an offender, and the
// quarantine dir's own instruction read must NOT.
func TestBoundaryGuardSweepFlagsAPlantedDir(t *testing.T) {
root := t.TempDir()

// A policed dir (not the quarantine) with an instruction read -> offender.
policedDir := filepath.Join(root, "internal", "hostneutrality")
if err := os.MkdirAll(policedDir, 0o755); err != nil {
t.Fatal(err)
}
writeFixture(t, filepath.Join(policedDir, "leak_test.go"), `package fixture
func TestLeak(t *T) {
data, _ := os.ReadFile("../../skills/ensign/references/ensign-shared-core.md")
_ = data
}
`)

// The quarantine dir with an instruction read -> NOT an offender (the legal path).
quarantineDir := filepath.Join(root, quarantinePkg)
if err := os.MkdirAll(quarantineDir, 0o755); err != nil {
t.Fatal(err)
}
writeFixture(t, filepath.Join(quarantineDir, "legal_read_test.go"), `package fixture
func TestLegalRead(t *T) {
data, _ := os.ReadFile("../../skills/commission/SKILL.md")
_ = data
}
`)
// An empty skills/integration dir (no test files, no reads): the sweep walks the
// whole planted root and must pass over it without flagging anything.
if err := os.MkdirAll(filepath.Join(root, "skills", "integration"), 0o755); err != nil {
t.Fatal(err)
}

offenders := sweepInstructionReadsOutsideQuarantine(t, root)
if !contains(offenders, filepath.Join("internal", "hostneutrality", "leak_test.go")) {
t.Fatalf("sweep failed to flag the out-of-quarantine instruction read; offenders=%v", offenders)
}
otherDir := filepath.Join(root, "cmd", "spacedock")
if err := os.MkdirAll(otherDir, 0o755); err != nil {
t.Fatal(err)
}
writeFixture(t, filepath.Join(otherDir, "leak_test.go"), `package fixture
func TestLeak(t *T) {
data, _ := os.ReadFile("../../skills/first-officer/SKILL.md")
_ = data
}
`)
offenders = sweepInstructionReadsOutsideQuarantine(t, root)
if !contains(offenders, filepath.Join("cmd", "spacedock", "leak_test.go")) {
t.Fatalf("sweep failed to flag an instruction read outside the historical packages; offenders=%v", offenders)
}
for _, o := range offenders {
if filepath.ToSlash(filepath.Dir(o)) == quarantinePkg {
t.Fatalf("sweep wrongly flagged the quarantine package's own read %q; the quarantine is the legal read path", o)
}
}
}

func contains(in []string, want string) bool {
for _, s := range in {
if s == want {
return true
}
}
return false
}

func writeFixture(t *testing.T, path, content string) {
t.Helper()
if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
t.Fatal(err)
}
}

// Keep go/parser and go/token referenced: this file imports them but does not
// otherwise use them, and an unused import is a compile error.
var _ = parser.ParseFile
var _ = token.NewFileSet
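
The per-function detector `directlyReadsInstructionFile` is not part of this diff. As a rough illustration of the kind of AST check the control test exercises, here is a minimal sketch under stated assumptions: it only recognizes `os.ReadFile` called with a string literal, and `looksLikeInstructionPath`, `funcReadsInstruction`, and `sourceReadsInstruction` are hypothetical names, not the repository's implementation. The real detector may cover more read shapes.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// looksLikeInstructionPath is an ASSUMED predicate: markdown under a skills/
// or agents/ directory counts as an instruction surface.
func looksLikeInstructionPath(p string) bool {
	if !strings.HasSuffix(p, ".md") {
		return false
	}
	return strings.Contains(p, "skills/") || strings.Contains(p, "agents/")
}

// funcReadsInstruction reports whether fn calls os.ReadFile with a string
// literal argument that looks like an instruction path.
func funcReadsInstruction(fn *ast.FuncDecl) bool {
	found := false
	ast.Inspect(fn, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok || len(call.Args) == 0 {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || sel.Sel.Name != "ReadFile" {
			return true
		}
		if pkg, ok := sel.X.(*ast.Ident); !ok || pkg.Name != "os" {
			return true
		}
		lit, ok := call.Args[0].(*ast.BasicLit)
		if !ok || lit.Kind != token.STRING {
			return true
		}
		if p, err := strconv.Unquote(lit.Value); err == nil && looksLikeInstructionPath(p) {
			found = true
			return false // stop descending once a read is found
		}
		return true
	})
	return found
}

// sourceReadsInstruction parses src and applies the detector to every function.
func sourceReadsInstruction(src string) (bool, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "fixture_test.go", src, 0)
	if err != nil {
		return false, err
	}
	for _, decl := range f.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok && funcReadsInstruction(fn) {
			return true, nil
		}
	}
	return false, nil
}

// Fixtures shaped like the ones planted by the control test.
const leak = `package fixture
func TestLeak(t *T) {
	data, _ := os.ReadFile("../../skills/ensign/references/ensign-shared-core.md")
	_ = data
}
`

const manifest = `package fixture
func TestManifest(t *T) {
	a, _ := os.ReadFile("../../.agents/plugins/marketplace.json")
	_ = a
}
`

func main() {
	for name, src := range map[string]string{"leak": leak, "manifest": manifest} {
		flagged, err := sourceReadsInstruction(src)
		fmt.Println(name, "flagged:", flagged, "err:", err)
	}
}
```

The fixtures parse without imports or a real `T` type because `go/parser` checks syntax only, which is why the control test can plant such minimal source.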
168 changes: 168 additions & 0 deletions internal/contractlint/boundary_guard_test.go
@@ -0,0 +1,168 @@
// ABOUTME: The instruction-file-read boundary guard — instruction/prompt files are
// ABOUTME: read in tests ONLY here in the quarantine package; this sweep bans them everywhere else.
package contractlint

import (
"go/ast"
"go/parser"
"go/token"
"os"
"path/filepath"
"strings"
"testing"
)

// READING INSTRUCTION/PROMPT FILES IN A TEST IS BANNED BY DEFAULT.
//
// An instruction file is a markdown surface the model ingests — a skill body
// (skills/**/SKILL.md), a contract/runtime reference (skills/**/references/*.md),
// or an agent definition (agents/*.md). The ONLY legal place a test reads one is
// THIS quarantine package, and only for a STRUCTURAL check — a defect a machine
// can see without reading prose for meaning:
//
// - a @-reference resolves to a real file on disk (ref-closure),
// - YAML frontmatter parses and declares its required keys (frontmatter-validity),
// - a retired path is ABSENT from the shipped surface (structural-absence),
// - a fact appears in exactly one source / a count holds (dedup),
// - a file clears a size floor (line-floor).
//
// What is BANNED, here and everywhere:
//
// - PROSE-GREP — asserting an instruction file contains (or lacks) its own prose.
// The file is its own source of truth, so the assertion is a tautology: it
// cannot fail for a real reason, and a meaning-inverting paraphrase keeps every
// grepped token. "The skill says X" is never proof the system DOES X.
// - CODE-BOUND-AS-BEHAVIOR-SUBSTITUTE — asserting an instruction file's prose
// matches a code value (a const, a router subcommand, a seam name). That is a
// consistency lint, not a behavior test. If the behavior matters there must be a
// BEHAVIOR test that RUNS it; the prose⟷code check never proved the behavior.
//
// Behavior is proven by RUNNING it — a live scenario, an offline command-level
// drive, a code-side invariant over real parsed source — never by reading prose.
// The reader-shape axis (an undeclared read hiding via an undiscovered read shape)
// is backstopped by the detached adversarial audit at every high-stakes gate, not
// by enumerating read shapes here.
//
// This guard is the polarity flip of the retired AC-3 marker sweep: that sweep let
// a test read prose if it declared a marker (a permission slip); this guard bans
// the read outright outside this package. There is no marker. There is no taint
// flow. A read outside the quarantine is a failure, full stop.

// quarantinePkg is the only package directory where an instruction-file read is
// legal. Paths are repo-relative (the guard walks from the repo root).
const quarantinePkg = "internal/contractlint"

// TestNoInstructionReadsOutsideQuarantine is the boundary guard, re-runnable
// offline. It parses every *_test.go under the repo and FAILS if any test FILE
// outside the quarantine package contains a function that reads an instruction
// file's content. The quarantine package itself is exempt — it is the single legal
// read path for the structural checks. The count of out-of-quarantine reader files
// must be zero.
func TestNoInstructionReadsOutsideQuarantine(t *testing.T) {
offenders := sweepInstructionReadsOutsideQuarantine(t, repoRoot(t))
for _, o := range offenders {
t.Errorf("%s reads an instruction file's content in a test — instruction/prompt files are read ONLY in %s for structural checks; behavior is proven by RUNNING it, never by reading prose (see the package doc)", o, quarantinePkg)
}
if len(offenders) > 0 {
t.Fatalf("boundary guard: %d file(s) read instruction content outside %s; the count must be zero", len(offenders), quarantinePkg)
}
}

// sweepInstructionReadsOutsideQuarantine returns the repo-relative paths of
// *_test.go files OUTSIDE the quarantine package that contain a function reading
// an instruction file's content. The logic is factored out of the test (it is
// unexported) so the guard's mutation control can drive it against a planted
// fixture directory.
func sweepInstructionReadsOutsideQuarantine(t *testing.T, repoRootDir string) []string {
t.Helper()
var offenders []string
err := filepath.WalkDir(repoRootDir, func(path string, d os.DirEntry, err error) error {
if err != nil {
return err
}
rel, relErr := filepath.Rel(repoRootDir, path)
if relErr != nil {
return relErr
}
if d.IsDir() {
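// Compare slash-separated paths so the multi-segment entries also match on
// platforms whose path separator is not "/".
switch filepath.ToSlash(rel) {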
case ".git", ".worktrees", "docs/dev/.spacedock-state", "vendor", quarantinePkg:
return filepath.SkipDir
}
return nil
}
if !strings.HasSuffix(d.Name(), "_test.go") {
return nil
}
fset := token.NewFileSet()
f, parseErr := parser.ParseFile(fset, path, nil, 0)
if parseErr != nil {
return parseErr
}
if fileReadsInstructionContent(f) {
offenders = append(offenders, rel)
}
return nil
})
if err != nil {
t.Fatalf("sweep instruction reads under %s: %v", repoRootDir, err)
}
return sortedUnique(offenders)
}

// instructionReadingTestFiles returns the base names of *_test.go files in dir
// that contain at least one function reading an instruction file's content.
func instructionReadingTestFiles(t *testing.T, dir string) []string {
t.Helper()
fset := token.NewFileSet()
entries, err := os.ReadDir(dir)
if err != nil {
t.Fatalf("read package dir %s: %v", dir, err)
}
var files []string
for _, ent := range entries {
name := ent.Name()
if ent.IsDir() || !strings.HasSuffix(name, "_test.go") {
continue
}
f, err := parser.ParseFile(fset, filepath.Join(dir, name), nil, 0)
if err != nil {
t.Fatalf("parse %s: %v", name, err)
}
if fileReadsInstructionContent(f) {
files = append(files, name)
}
}
return sortedUnique(files)
}

// fileReadsInstructionContent reports whether any function declared in f reads an
// instruction file's content.
func fileReadsInstructionContent(f *ast.File) bool {
for _, decl := range f.Decls {
fn, ok := decl.(*ast.FuncDecl)
if !ok {
continue
}
if directlyReadsInstructionFile(fn) {
return true
}
}
return false
}

func sortedUnique(in []string) []string {
seen := map[string]bool{}
var out []string
for _, s := range in {
if !seen[s] {
seen[s] = true
out = append(out, s)
}
}
for i := 1; i < len(out); i++ {
for j := i; j > 0 && out[j-1] > out[j]; j-- {
out[j-1], out[j] = out[j], out[j-1]
}
}
return out
}
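
For comparison, the same sort-and-deduplicate contract as `sortedUnique` can be sketched with the standard library's `sort.Strings`. This is an alternative shape under the assumption that adding a `sort` import is acceptable; `sortedUniqueStd` is a hypothetical name and is not part of this change.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedUniqueStd returns the distinct strings of in, in ascending order.
// It copies the input first so the caller's slice is left untouched.
func sortedUniqueStd(in []string) []string {
	out := append([]string(nil), in...)
	sort.Strings(out)
	n := 0
	for i, s := range out {
		// After sorting, duplicates are adjacent: keep the first of each run.
		if i == 0 || s != out[i-1] {
			out[n] = s
			n++
		}
	}
	return out[:n]
}

func main() {
	fmt.Println(sortedUniqueStd([]string{"b_test.go", "a_test.go", "b_test.go"}))
}
```

The hand-rolled insertion sort in the diff avoids the extra import and is adequate for the small offender lists the guard produces; the library version mainly reads more directly.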
13 changes: 13 additions & 0 deletions internal/contractlint/doc_test.go
@@ -0,0 +1,13 @@
// ABOUTME: Package-level policy for the instruction-file-read quarantine.
// ABOUTME: Structural checks live here; prose-grep and code-bound behavior substitutes do not.
package contractlint

// This package is the instruction-file-read quarantine for tests. Tests outside
// this package must not read skill, contract, runtime-adapter, or agent markdown.
// Tests inside it may read those files only for structural facts a machine can
// verify without interpreting prose: reference closure, frontmatter validity,
// structural absence, deduplication, line/count floors, and portability markers.
//
// Do not add prose-grep checks here. Do not add prose-to-code consistency checks
// as behavior substitutes. If behavior matters, test it by running the behavior;
// if no behavior test exists yet, delete the read and report the owed test.