diff --git a/.github/workflows/ts-persona-forbidden-strings-ratchet.yml b/.github/workflows/ts-persona-forbidden-strings-ratchet.yml new file mode 100644 index 000000000..9c1aebe72 --- /dev/null +++ b/.github/workflows/ts-persona-forbidden-strings-ratchet.yml @@ -0,0 +1,43 @@ +# Lane F PR-2 (PR #1091 followup) — TS Persona Forbidden-Strings Ratchet. +# +# Per-pattern monotonic-decrease ratchet for anti-patterns under +# src/system/user/server/. Fails on any growth of: +# - case-insensitive `fallback` mentions (Joel 2026-04-22 "fallbacks +# are ILLEGAL") +# - direct `new Adapter(` instantiation (bypasses #1066/#1074 +# ModelRequirement → ResolvedModel resolver) +# - `process.env.*API_KEY` reads (cloud-key lookup belongs in Rust +# provider registry, per Codex's #1077 boundary) +# +# Fast: shell + python only. Independent gate from compile + Rust build. + +name: ts-persona-forbidden-strings-ratchet + +on: + pull_request: + branches: [canary, main] + paths: + - 'src/system/user/server/**/*.ts' + - 'scripts/ratchets/ts-persona-forbidden-strings-baseline.json' + - 'scripts/ratchets/check-ts-persona-forbidden-strings.sh' + - '.github/workflows/ts-persona-forbidden-strings-ratchet.yml' + push: + branches: [canary, main] + +jobs: + ratchet: + name: ts-persona-forbidden-strings-ratchet + runs-on: ubuntu-latest + timeout-minutes: 5 + steps: + - uses: actions/checkout@v4 + with: + ref: ${{ github.event.pull_request.head.sha || github.sha }} + fetch-depth: 1 + + - name: Run ratchet check + run: bash scripts/ratchets/check-ts-persona-forbidden-strings.sh + + - name: Print per-pattern occurrences on failure + if: failure() + run: bash scripts/ratchets/check-ts-persona-forbidden-strings.sh --verbose || true diff --git a/docs/architecture/TS-PERSONA-COGNITION-RATCHET.md b/docs/architecture/TS-PERSONA-COGNITION-RATCHET.md index 3b7e68e5c..213145eb3 100644 --- a/docs/architecture/TS-PERSONA-COGNITION-RATCHET.md +++ b/docs/architecture/TS-PERSONA-COGNITION-RATCHET.md @@ -82,17 +82,35 @@ bash scripts/ratchets/check-ts-persona-cognition.sh --verbose Prints the per-file LOC table so you see which file changed and by how much. +## Companion gate: forbidden-strings ratchet + +`scripts/ratchets/check-ts-persona-forbidden-strings.sh` (PR #1091 +followup) runs the same monotonic-decrease shape on per-pattern grep +counts under the same surface. Tracked patterns: + +- **`fallback_mention`** (case-insensitive): per Joel's no-fallbacks + rule (2026-04-22, "fallbacks have ruined this project ... they are + ILLEGAL"). The WORD count is a proxy for conceptual presence — even + comments saying "no fallback here" count. +- **`direct_adapter_instantiation`**: matches `new Adapter(`. + TS surface should request providers from the registry / admission + layer (Rust resolver, #1066/#1074), not instantiate adapters directly. +- **`direct_api_key_env_read`**: matches `process.env.*API_KEY`. Cloud + API key lookup belongs in the Rust provider registry (Codex's #1077 + boundary), NOT the TS surface. Currently 0 — the ratchet locks that in. + +Same workflow shape (`.github/workflows/ts-persona-forbidden-strings-ratchet.yml`), +same `--update-baseline` / `--verbose` modes. Per-pattern baselines live +in `scripts/ratchets/ts-persona-forbidden-strings-baseline.json` with +inline rationale per pattern. + ## Out of scope (followups) -- **Forbidden-strings check**: detect `"fallback"`, direct adapter - instantiation, or other anti-patterns Joel has flagged. Per #1084 - Lane F success criteria. Will land as a separate gate next to this - one. - **Verb-shape detection**: identify cognition VERBS (e.g., `shouldRespond`, `scoreRelevance`) being added in TS even when total LOC drops. Heuristic, harder to define rigorously — lower priority - than the LOC ratchet which catches the gross case. -- **Pre-commit hook integration**: today's gate is CI-only. Adding to + than the LOC + forbidden-strings ratchets which catch the gross cases. +- **Pre-commit hook integration**: today's gates are CI-only. Adding to pre-commit would catch growth before push, faster signal. Reserve - for after the LOC ratchet has been live for ~1 week so we know the + for after the ratchets have been live for ~1 week so we know the shape isn't going to oscillate. diff --git a/scripts/ratchets/check-ts-persona-forbidden-strings.sh b/scripts/ratchets/check-ts-persona-forbidden-strings.sh new file mode 100755 index 000000000..19a76add6 --- /dev/null +++ b/scripts/ratchets/check-ts-persona-forbidden-strings.sh @@ -0,0 +1,178 @@ +#!/bin/bash +# check-ts-persona-forbidden-strings.sh — Lane F PR-2 ratchet (PR #1091 followup). +# +# Per-pattern monotonic-decrease ratchet for anti-patterns in the TS +# persona surface (src/system/user/server/). Mirrors PR #1091's LOC +# ratchet shape but counts grep matches per regex instead of total +# lines. +# +# Per Joel's no-fallbacks rule + the Rust-first alpha contract (PR #1070, +# ALPHA-GAP-ANALYSIS.md): the TS surface must shed cloud-key env reads, +# direct adapter instantiation, and the WORD `fallback` over time. The +# Rust provider registry + resolver own these concerns (#1066, #1074, +# #1077, #1089). +# +# Modes: +# ./check-ts-persona-forbidden-strings.sh # check + report; exit 0/1 +# ./check-ts-persona-forbidden-strings.sh --update-baseline # update + commit-ready +# ./check-ts-persona-forbidden-strings.sh --verbose # print per-pattern occurrences + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)" +BASELINE_FILE="$SCRIPT_DIR/ts-persona-forbidden-strings-baseline.json" +SURFACE_DIR="$REPO_ROOT/src/system/user/server" + +YELLOW='\033[1;33m' +GREEN='\033[0;32m' +RED='\033[0;31m' +NC='\033[0m' + +UPDATE_BASELINE=0 +VERBOSE=0 +for arg in "$@"; do + case "$arg" in + --update-baseline) UPDATE_BASELINE=1 ;; + --verbose|-v) VERBOSE=1 ;; + --help|-h) + echo "Usage: $0 [--update-baseline] [--verbose]" + echo " Default: check current per-pattern counts against baseline; exit non-zero on any growth." + echo " --update-baseline: rewrite baseline_count for each pattern to current (use after legitimate removal)." + echo " --verbose: print first 5 occurrences per pattern." + exit 0 + ;; + *) + echo -e "${RED}Unknown arg: $arg${NC}" >&2 + exit 2 + ;; + esac +done + +if [[ ! -d "$SURFACE_DIR" ]]; then + echo -e "${RED}ERROR: surface directory not found: $SURFACE_DIR${NC}" >&2 + exit 2 +fi + +if [[ ! -f "$BASELINE_FILE" ]]; then + echo -e "${RED}ERROR: baseline file not found: $BASELINE_FILE${NC}" >&2 + exit 2 +fi + +# Count occurrences of one pattern across the surface (excluding tests). +count_pattern() { + local regex="$1" + local case_insensitive="$2" + local grep_flags="-rEoI --include=*.ts --exclude=*.test.ts --exclude=*.spec.ts" + if [[ "$case_insensitive" == "true" ]]; then + grep_flags="$grep_flags -i" + fi + # `|| true` — grep returns 1 on zero matches, which is a valid count. + grep $grep_flags "$regex" "$SURFACE_DIR" 2>/dev/null | wc -l | tr -d ' ' || true +} + +# Read pattern config from JSON in shell-friendly tabular form. +PATTERN_DATA=$(python3 - "$BASELINE_FILE" <<'PYEOF' +import json, sys +with open(sys.argv[1]) as f: + data = json.load(f) +for p in data["patterns"]: + print("\t".join([ + p["id"], + p["regex"], + "true" if p.get("case_insensitive", False) else "false", + str(p["baseline_count"]), + ])) +PYEOF +) + +ANY_GROWTH=0 +RESULTS=() +while IFS=$'\t' read -r id regex ci baseline; do + current=$(count_pattern "$regex" "$ci") + delta=$((current - baseline)) + RESULTS+=("$id|$baseline|$current|$delta") + if [[ "$delta" -gt 0 ]]; then + ANY_GROWTH=1 + fi +done <<< "$PATTERN_DATA" + +if [[ "$VERBOSE" -eq 1 ]]; then + echo -e "${YELLOW}━━ TS persona-forbidden-strings (per-pattern occurrences, top 5) ━━${NC}" + while IFS=$'\t' read -r id regex ci baseline; do + echo -e "${YELLOW}# $id baseline=$baseline${NC}" + grep_flags="-rEnI --include=*.ts --exclude=*.test.ts --exclude=*.spec.ts" + if [[ "$ci" == "true" ]]; then grep_flags="$grep_flags -i"; fi + grep $grep_flags "$regex" "$SURFACE_DIR" 2>/dev/null | head -5 || echo " (no matches)" + echo "" + done <<< "$PATTERN_DATA" +fi + +if [[ "$UPDATE_BASELINE" -eq 1 ]]; then + CURRENT_SHA=$(git -C "$REPO_ROOT" rev-parse --short HEAD 2>/dev/null || echo "unknown") + CURRENT_ISO=$(date -u +"%Y-%m-%dT%H:%MZ") + python3 - "$BASELINE_FILE" "$CURRENT_SHA" "$CURRENT_ISO" "${RESULTS[@]}" <<'PYEOF' +import json, sys +path, sha, iso = sys.argv[1], sys.argv[2], sys.argv[3] +results = {} +for entry in sys.argv[4:]: + pid, baseline, current, delta = entry.split("|") + results[pid] = int(current) +with open(path) as f: + data = json.load(f) +for p in data["patterns"]: + if p["id"] in results: + p["baseline_count"] = results[p["id"]] +data["_baseline_anchored_at_canary"] = sha +data["_anchored_at_iso"] = iso +with open(path, "w") as f: + json.dump(data, f, indent=2) + f.write("\n") +PYEOF + echo -e "${GREEN}✓ baseline updated to current counts:${NC}" + for r in "${RESULTS[@]}"; do + IFS='|' read -r id baseline current delta <<< "$r" + echo " $id: $baseline → $current (delta $delta)" + done + echo " Commit: git add $BASELINE_FILE" + exit 0 +fi + +if [[ "$ANY_GROWTH" -eq 1 ]]; then + echo -e "${RED}━━ ❌ TS persona-forbidden-strings RATCHET FAILED ━━${NC}" >&2 + echo "" >&2 + for r in "${RESULTS[@]}"; do + IFS='|' read -r id baseline current delta <<< "$r" + if [[ "$delta" -gt 0 ]]; then + echo -e "${RED} ❌ $id: baseline=$baseline current=$current delta=+$delta${NC}" >&2 + elif [[ "$delta" -lt 0 ]]; then + echo -e "${GREEN} ✓ $id: baseline=$baseline current=$current delta=$delta (shrunk)${NC}" >&2 + else + echo -e "${YELLOW} · $id: baseline=$baseline current=$current (held)${NC}" >&2 + fi + done + echo "" >&2 + echo " Per Joel's no-fallbacks rule + Rust-first alpha contract (PR #1070)," >&2 + echo " the TS persona surface must shed these patterns over time. Provider" >&2 + echo " resolution + admission belong in Rust (workers/continuum-core/src/cognition/," >&2 + echo " workers/continuum-core/src/persona/), NOT in TS." >&2 + echo "" >&2 + echo " Options:" >&2 + echo " 1. Move the pattern occurrence Rust-side." >&2 + echo " 2. Refactor it out (rename, restructure) so the TS surface stops mentioning it." >&2 + echo " 3. If your PR also REMOVES occurrences elsewhere AND net is flat-or-down for" >&2 + echo " this pattern, the ratchet should already be passing for that pattern. Run" >&2 + echo " this script with --verbose to see what's left." >&2 + exit 1 +fi + +echo -e "${GREEN}✓ TS persona-forbidden-strings ratchet held:${NC}" +for r in "${RESULTS[@]}"; do + IFS='|' read -r id baseline current delta <<< "$r" + if [[ "$delta" -lt 0 ]]; then + echo -e "${GREEN} ✓ $id: baseline=$baseline current=$current delta=$delta (shrunk — run --update-baseline post-merge to lock in)${NC}" + else + echo " · $id: baseline=$baseline current=$current" + fi +done +exit 0 diff --git a/scripts/ratchets/ts-persona-forbidden-strings-baseline.json b/scripts/ratchets/ts-persona-forbidden-strings-baseline.json new file mode 100644 index 000000000..33f3db659 --- /dev/null +++ b/scripts/ratchets/ts-persona-forbidden-strings-baseline.json @@ -0,0 +1,36 @@ +{ + "_doc": "Lane F PR-2 (PR #1091 followup) \u2014 TS Persona Forbidden-Strings Ratchet. Tracks anti-pattern grep counts under src/system/user/server/. Per-pattern baseline; PR fails if any count GROWS. Mirrors the monotonic-decrease shape of ts-persona-cognition-baseline.json (PR #1091).", + "_to_lower_baseline": "After a PR that legitimately removes occurrences of a tracked pattern, run: bash scripts/ratchets/check-ts-persona-forbidden-strings.sh --update-baseline && git add scripts/ratchets/ts-persona-forbidden-strings-baseline.json && commit", + "_paths_glob_relative_to_repo_root": [ + "src/system/user/server/**/*.ts" + ], + "_excludes": [ + "*.test.ts", + "*.spec.ts" + ], + "_baseline_anchored_at_canary": "83513e6bd", + "_anchored_at_iso": "2026-05-11T21:31Z", + "patterns": [ + { + "id": "fallback_mention", + "regex": "fallback", + "case_insensitive": true, + "baseline_count": 83, + "rationale": "Joel 2026-04-22: 'fallbacks have ruined this project ... they are ILLEGAL.' Counts every occurrence including comments \u2014 a comment saying 'no fallback here' counts because the WORD shouldn't be normalized in the persona surface. Currently 83 \u2014 the ratchet's job is to push that to zero over time. Direct anti-pattern matches (silent-fallback branches) are caught by code review; the WORD count is a proxy for the conceptual presence." + }, + { + "id": "direct_adapter_instantiation", + "regex": "new [A-Z][a-zA-Z]*Adapter\\(", + "case_insensitive": false, + "baseline_count": 12, + "rationale": "TS persona surface should request providers from the registry/admission layer (Rust resolver), not instantiate adapters directly. Direct `new AnthropicAdapter()` / `new LlamaCppAdapter()` etc. bypasses the ModelRequirement \u2192 ResolvedModel path my Lane C #1066/#1074 work shipped. Currently 12 \u2014 should drop as adapter wiring moves to the Rust runtime." + }, + { + "id": "direct_api_key_env_read", + "regex": "process\\.env\\.[A-Z_]*API_KEY", + "case_insensitive": false, + "baseline_count": 0, + "rationale": "TS surface must NOT read cloud API keys directly from env \u2014 the Rust provider registry owns that lookup (per Codex's #1077 Rust persona model boundary). Currently 0 (clean) \u2014 the ratchet locks this in. Any PR that adds `process.env.OPENAI_API_KEY` style reads in the persona surface fails CI." + } + ] +}