feat(batch): pre-flight liveness gate (Theme 5 Layer 4) #324

Draft
mitwilli-create wants to merge 1 commit into main from feat/theme5-l4-preflight-liveness-2026-05-29

@mitwilli-create
Owner

Summary

Implements Theme 5 Layer 4 from the 2026-05-29 task-audit. Wires a pre-flight liveness gate into batch-runner-batches.mjs::phaseSubmit that filters out URLs marked status='expired' in data/liveness-cache.json before the expensive Playwright JD-fetch loop runs. Saves an estimated $0.50-1 per URL on dead postings.

Implementation

  • Reads data/liveness-cache.json once per submit
  • For each item in the LIMIT × TIERS slice: if its cache entry is expired, skip it, dequeue it from batch/triage-advance.tsv, and archive the dequeued row
  • Appends a per-URL audit row to .claude/audit/<DATE>/dead-url-skip-<DATE>.jsonl (gitignored — contains personal pipeline URLs)
  • Bypass: --skip-liveness-gate (emergency only)
  • Fails open: cache missing or read fails → gate is a no-op, batch proceeds unchanged
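
The gate logic in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `parseLivenessCache` and `applyLivenessGate`, the `item.url` field, and the assumption that the cache is a JSON object keyed by URL are all hypothetical. Reading the file is left to the caller; passing `null` (for a missing or unreadable cache) makes the gate a no-op.

```javascript
// Hypothetical helpers; names and data shapes are assumptions, not taken from the PR.

// Parse the raw cache text. Returns null on any problem so the caller fails open.
function parseLivenessCache(rawText) {
  if (typeof rawText !== 'string') return null;
  try {
    const parsed = JSON.parse(rawText);
    const isPlainObject =
      parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed);
    return isPlainObject ? parsed : null;
  } catch {
    return null; // malformed JSON: treat as "no cache"
  }
}

// Split items into those to submit and those the cache already marks expired.
function applyLivenessGate(items, cache, { skipGate = false } = {}) {
  // Bypass flag or unusable cache: the gate is a no-op.
  if (skipGate || cache === null) {
    return { kept: items, expired: [] };
  }
  const kept = [];
  const expired = [];
  for (const item of items) {
    const entry = cache[item.url]; // assumed shape: { status: '...' }
    if (entry && entry.status === 'expired') {
      expired.push(item);
    } else {
      kept.push(item); // unknown URLs are kept, never dropped
    }
  }
  return { kept, expired };
}
```

Keeping URLs that are absent from the cache is a deliberate choice in this sketch: only a positive `expired` mark removes an item, so a stale or partial cache cannot silently shrink the batch.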

What this defends against

The 2026-05-26 batch incident burned ~$0.50-1/URL of Playwright fetch work on URLs the liveness-cache already knew were dead. The existing post-fetch dequeue at line ~720 caught them AFTER the spend; this gate catches them BEFORE.

Combined with the existing post-fetch terminal-failure dequeue (PR #308) and the expired-status mark in data/liveness-cache.json (populated by check-liveness.mjs + the launchd liveness-sweep plist), this completes the liveness-aware batch flow:

liveness-sweep.plist (nightly)
  └─→ check-liveness.mjs
        └─→ data/liveness-cache.json (URL → status)
              ├─→ THIS PR: pre-flight gate filters BEFORE submit
              └─→ existing: post-fetch dequeue marks terminal errors

Locked decisions (interview 2026-05-29)

  • Q1: Layer 4 pre-flight + Layer 5 cost-trace audit (Layer 5 follows in a separate PR)
  • Production-surface file → bucket-B DRAFT per Q4 of the chain's locked decisions

Test plan

  • node --check batch-runner-batches.mjs passes
  • node batch-runner-batches.mjs --help prints status with no errors
  • Empty cache: gate is a no-op, batch proceeds as before
  • Cache with expired URLs in triage-advance: filtered out, dequeued, audit log written
  • --skip-liveness-gate flag bypasses the gate (prints BYPASSED line)
  • .claude/audit/<DATE>/dead-url-skip-<DATE>.jsonl is gitignored (verified via git check-ignore)

Layer 5 follows separately

Layer 5 (cost-trace truthfulness audit) is a standalone script (scripts/audit-cost-trace.mjs) and ships as its own PR for clean review.

🤖 Generated with Claude Code

…ayer 4)

Closes Theme 5 Layer 4 from the 2026-05-29 task-audit. Adds a pre-flight
filter to batch-runner-batches.mjs::phaseSubmit that reads
data/liveness-cache.json BEFORE the expensive Playwright JD-fetch loop
and skips URLs already marked status='expired'.

Why this matters
  The 2026-05-26 batch incident burned ~$0.50-1/URL of Playwright fetch
  work on URLs the liveness-cache already knew were dead. The existing
  post-fetch dequeue (line ~720) caught them AFTER the spend; this gate
  catches them BEFORE.

Implementation
  - Reads data/liveness-cache.json once per submit.
  - For each item in the scopedItems × LIMIT × TIERS slice, looks up the URL.
  - If entry.status === 'expired', adds the URL to the cacheExpiredFromGate set
    and filters it out of the items list.
  - Logs each skip to .claude/audit/<DATE>/dead-url-skip-<DATE>.jsonl
    (append-only, gitignored, contains per-URL ts + reason + cache_ts).
  - Inline dequeue from batch/triage-advance.tsv + archive to
    batch/triage-advance-expired/<DATE>-liveness-gate-N.tsv so the gate
    is idempotent (next run doesn't re-attempt).
  - All errors are non-fatal (the gate fails open — proceeds without
    filtering rather than blocking the whole batch).
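
The audit-row and dequeue steps above could look something like the sketch below. It is illustrative only: the function names, the `reason` string, the `cacheEntry.ts` field, and the rule that a row matches when any tab-separated field equals an expired URL are assumptions, since the commit does not show the TSV layout or the cache entry shape. Both helpers are pure; the caller would append the audit line to the JSONL file and write the two TSV texts back to disk.

```javascript
// Hypothetical helpers; field names and the row-matching rule are assumptions.

// Build one JSONL audit line (ts + reason + cache_ts) for a skipped URL.
function buildAuditRow(url, cacheEntry, now = new Date()) {
  const row = {
    ts: now.toISOString(),
    url,
    reason: 'liveness-cache-expired', // assumed reason label
    cache_ts: cacheEntry.ts ?? null,  // assumed name of the cache timestamp field
  };
  return JSON.stringify(row) + '\n';
}

// Split queue text into rows that stay and rows that move to the archive.
function dequeueExpiredRows(tsvText, expiredUrls) {
  const remaining = [];
  const archived = [];
  for (const line of tsvText.split('\n')) {
    if (line === '') continue; // skip blank lines, including the trailing one
    const isExpired = line.split('\t').some((field) => expiredUrls.has(field));
    (isExpired ? archived : remaining).push(line);
  }
  const join = (rows) => (rows.length > 0 ? rows.join('\n') + '\n' : '');
  return { remaining: join(remaining), archived: join(archived) };
}
```

Running `dequeueExpiredRows` a second time on its own `remaining` output archives nothing, which is the idempotence property the commit message describes.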

Bypass
  --skip-liveness-gate (CLI flag, emergency only). Logs the bypass.

Safety
  - Empty filter result early-exits cleanly ("nothing to batch") rather
    than submitting a zero-row payload.
  - No external API calls; pure local-file read + filter.
  - Gracefully degrades when liveness-cache.json is missing (skips gate,
    behavior unchanged from pre-fix).
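
As a rough sketch of how the bypass flag and the empty-result early exit fit together, assuming a hypothetical `runSubmitPhase` wrapper with `submit` and `log` injected by the caller (none of these names come from the commit, and the log wording is illustrative apart from "BYPASSED" and "nothing to batch"):

```javascript
// Hypothetical orchestration; the real phaseSubmit is not shown in this commit.
function runSubmitPhase({ argv, items, expiredUrls, submit, log }) {
  const bypass = argv.includes('--skip-liveness-gate');
  let toSubmit = items;

  if (bypass) {
    // Emergency path: record the bypass and process every URL.
    log('liveness gate BYPASSED (--skip-liveness-gate)');
  } else {
    toSubmit = items.filter((item) => !expiredUrls.has(item.url));
  }

  // Exit cleanly instead of submitting a zero-row payload.
  if (toSubmit.length === 0) {
    log('nothing to batch');
    return { submitted: 0 };
  }

  submit(toSubmit);
  return { submitted: toSubmit.length };
}
```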

Test plan
  - node --check passes.
  - --help still prints status correctly.
  - With empty cache: gate is a no-op, batch proceeds normally.
  - With cache containing expired URLs that ARE in triage-advance: those
    URLs are removed from items, dequeued, and logged; the rest submit.
  - --skip-liveness-gate: prints BYPASSED line, processes all URLs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>