fix(flapping-plists): sweep 4 flapping launchd plists: TCC wrapper drop + dashboard-dependency inline + exit-1 misclassification #336
Merged
mitwilli-create merged 1 commit on May 29, 2026
…per drop + dashboard-dependency inline + exit-1 misclassification
Four plists were flapping for distinct reasons. Same-PR sweep so the
launchd surface goes 6 → 2 (buttons-smoke + scan-email-poll need
follow-up investigation, not addressed here).
1. network-database-build (exit 126 daily since 2026-05-23)
- Drop /bin/bash + cron-run.sh wrapper. macOS Tahoe TCC blocks bash
from reading scripts under ~/Documents/ even when bash is allowed.
Node invocation works because /Users/.../node lives outside the
protected tree. Matches bug-intake-mapper / pipeline-health /
health-column-liveness pattern.
2. phase-B-prime-daily (exit 1 daily since 2026-05-25)
- Inline `node scripts/build-dashboard.mjs` at start of main() so
_CONTACTS_DATA is populated regardless of when the morning
rebuild fires. Idempotent. Override via SKIP_DASHBOARD_PRELOAD=1.
3. pipeline-health-check.mjs (false-positive flapping on 2+ concurrent
Claude sessions — the new normal)
- Env-configurable PIPELINE_HEALTH_MAX_ORCHESTRATORS (default 1).
- process.exit(0) always. The JSON file IS the signal per the
top-of-file comment intent.
4. health-column-liveness.mjs (false-positive flapping when
data-coverage < 90% — a valid signal, wrong protocol)
- process.exit(0) on unhealthy. Keep exit 2 for true FATAL.
- The JSON at data/health-column-coverage.json is the signal.
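
The exit-code policy in items 3 and 4 can be sketched roughly as below. This is a minimal illustration, not the code from the PR: the helper names (`maxOrchestrators`, `buildReport`, `exitCodeFor`) and the report fields are hypothetical, and the fallback to the default on an invalid value is an assumption. Only `PIPELINE_HEALTH_MAX_ORCHESTRATORS`, its default of 1, exit 0 on unhealthy, and exit 2 for FATAL come from the description above.

```javascript
// Hypothetical helpers illustrating the policy; names are not from the PR.

// Read the orchestrator threshold from the environment (default 1).
// Assumption: missing, non-numeric, or < 1 values fall back to the default.
function maxOrchestrators(env = process.env) {
  const parsed = Number.parseInt(env.PIPELINE_HEALTH_MAX_ORCHESTRATORS ?? "", 10);
  return Number.isInteger(parsed) && parsed >= 1 ? parsed : 1;
}

// Build the report object that a script would serialize to its JSON signal file.
function buildReport(pids, env = process.env) {
  const limit = maxOrchestrators(env);
  return { orchestrators: pids.length, limit, healthy: pids.length <= limit };
}

// Map an outcome to an exit code: an unhealthy data signal is still exit 0,
// and only a true script failure (FATAL) returns 2.
function exitCodeFor(outcome) {
  return outcome.fatal ? 2 : 0;
}

// Two concurrent orchestrators with the limit raised to 2.
const report = buildReport([101, 102], { PIPELINE_HEALTH_MAX_ORCHESTRATORS: "2" });
console.log(JSON.stringify(report), exitCodeFor({ fatal: false }));
```

Keeping the threshold parsing and the exit-code mapping in small pure functions makes both easy to check without launchd in the loop.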
AGENTS.md adds two bug-class entries:
- launchd-bash-wrapper-tahoe-tcc-block
- launchd-exit-1-misclassified-as-flapping-on-data-signals
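
On the consumer side, the second bug class implies that a monitor should read the JSON file rather than infer health from the exit code. A rough sketch under stated assumptions: `classifyRun`, the `healthy` field, and the returned labels are hypothetical and are not taken from the repository's scripts.

```javascript
// Hypothetical monitor-side classifier; the name, the `healthy` field,
// and the labels are illustrative assumptions.
function classifyRun(exitCode, signalText) {
  // A non-zero exit is reserved for a true script failure.
  if (exitCode !== 0) return "script-failure";

  let signal;
  try {
    signal = JSON.parse(signalText);
  } catch {
    // The script exited cleanly but its signal file is not valid JSON.
    return "signal-unreadable";
  }

  // The file, not the exit code, says whether the data is healthy.
  return signal && signal.healthy === true ? "healthy" : "unhealthy-data";
}

console.log(classifyRun(0, '{"healthy": false}'));
```

With this split, an unhealthy data signal no longer counts as a job failure, which is the misclassification the bug class describes.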
Smoke-tested:
- plutil -lint OK on all 4 plists
- node --check OK on all 3 .mjs files
- pipeline-health exits 0 with JSON signal preserved
- health-column-liveness exits 0 with JSON signal preserved
- PIPELINE_HEALTH_MAX_ORCHESTRATORS env knob honored
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Closes the R3 residual from the 2026-05-29 chain handover: 4 of 6 flapping launchd plists fixed in one same-PR sweep. Two remaining (`buttons-smoke`, `scan-email-poll`) need investigation; not addressed here.

Three failure modes, four fixes:

| Plist | Failure mode | Fix |
|---|---|---|
| `network-database-build` | macOS Tahoe TCC blocks `/bin/bash` from reading scripts under `~/Documents/` | Drop the `cron-run.sh` wrapper; invoke node directly. Matches `bug-intake-mapper` pattern. |
| `phase-B-prime-daily` | `_CONTACTS_DATA` not populated when 03:30 PT job fires before morning rebuild | Inline `node scripts/build-dashboard.mjs` at start of `main()`. Override via `SKIP_DASHBOARD_PRELOAD=1`. |
| `pipeline-health` | Hardcoded `pids.length > 1` orchestrator threshold | Env-configurable `PIPELINE_HEALTH_MAX_ORCHESTRATORS` (default 1) + `process.exit(0)` always. The JSON file IS the signal per top-of-file comment. |
| `health-column-liveness` | False-positive flapping when data coverage < 90% (a valid signal, wrong protocol) | `process.exit(0)` on unhealthy. Keep `exit 2` for true FATAL. JSON at `data/health-column-coverage.json` is the signal. |

AGENTS.md additions
Two new bug-class entries:
- `### Bug class: launchd-bash-wrapper-tahoe-tcc-block`: full doc of why bash hits TCC and node doesn't, with safe-pattern XML.
- `### Bug class: launchd-exit-1-misclassified-as-flapping-on-data-signals`: a generalizable pattern. When a script writes its real signal to a JSON file AND returns an exit code, treat the file as the signal. Reserve non-zero for true script failure. Companion pattern: env-configurable thresholds replace hardcoded "expected 0 or 1" assertions.

Smoke tests (all pass)
Test plan
- `launchctl kickstart -k gui/$(id -u)/com.mitchell.career-ops.network-database-build` → exit 0 expected
- `launchctl kickstart -k gui/$(id -u)/com.mitchell.career-ops.phase-B-prime-daily` → exit 0 expected (cost ~$2.50 if run during work day)
- `launchctl kickstart -k gui/$(id -u)/com.mitchell.career-ops.pipeline-health` → exit 0 expected
- `launchctl kickstart -k gui/$(id -u)/com.mitchell.career-ops.health-column-liveness` → exit 0 expected
- `node scripts/agents/system-maintainer.mjs --health` → should show 2 flapping instead of 6
- `launchctl setenv PIPELINE_HEALTH_MAX_ORCHESTRATORS 2` if concurrent-session orchestrators are routine

Out of scope (follow-ups)
- `buttons-smoke`: 1/14 assertion failure ("batch-runner dry-run reports expected queue size"). This is a code investigation, not a crash. File a separate PR.
- `scan-email-poll`: produces no `.err`/`.out` logs at all, likely a plist misconfiguration or a job stub that never starts. Investigation needed.

🤖 Generated with Claude Code