Skip to content

TheColliery/CoalBoard

Repository files navigation

⚖️ CoalBoard

A coal board governs operations and resolves disputes for the mines — this one is the board for the work where a single mistake is catastrophic.

A diverse-lens consensus & debate board for error-not-allowed work — on a critical task, with your consent, blind epistemic lenses debate in parallel, a judge synthesizes on verified inputs, and you sign off before anything touches your files.

version license status

Contributing · Changelog · Security · Privacy · Releases

Part of TheColliery — siblings: CoalMine (quality canaries) · CoalTipple (model/effort routing).


⚖️ What it is

On an error-not-allowed task — security/crypto, a DB/financial migration, high-precision math/physics — and only with your consent, CoalBoard convenes a board:

  • three epistemic lenses debate the task in parallel, blind to each other (so they don't anchor): an empirical lens that grounds claims in live, cross-referenced sources; a formal lens that reasons from logic and proof; and a "show-me" skeptic that turns every doubt into a concrete evidence-demand ("show the date", "show it actually runs");
  • a judge synthesizes — on verified inputs, never on which answer sounds best;
  • on a deadlock, an independent out-of-frame solver re-derives the answer blind and breaks the tie by agreement;
  • every proposed change is staged to .coalboard/proposed/ (reports land in .coalboard/reports/) and you sign off before a single live file changes.

Its auto-trigger stays on the critical slice (off ~90%, cost-disciplined) — but you can manually convene it (say "convene the board" in chat, or run the /coalboard:coalboard plugin command) on any hard problem worth several diverse perspectives, in any domain — code, docs, math, research, translation, legal — not just code. Always behind a consent gate.

🛡️ What it guarantees (and what it doesn't)

CoalBoard is NASA-inspired in structure (redundancy + design-diversity + human-in-the-loop + trigger-only-on-critical) — not in numbers. It honestly guarantees two things:

  1. Bounded cost — a solo agent thrashing on a hard bug is an unbounded token-bleed; the board converges (single-turn, judge-final, human-escape), so its cost is high but predictable and capped. Pay a known premium to cap the tail.
  2. Zero-breakage — staging + propose-not-execute means the live workspace is never touched until verified and approved (a side-effect — a run migration, an API call — is gated behind your approval, never executed during the debate).

Both guarantees are contract-enforced — the board's staging discipline + your sign-off — not an OS sandbox; a skill cannot OS-enforce. The human gate is the load-bearing node.

It improves correctness; it does not claim a defect rate or a reliability figure (an LLM ensemble is probabilistic, not formally proven — and 10⁻⁹ is unverifiable by any system). It gets more accurate as the underlying models improve, for free — the structure is model-agnostic.

📊 Benchmark

Headline: with the board 10/10 (100% consistent) vs an un-primed strong solo ~13/20 (~65%) on 5 error-not-allowed tasks (Claude Code, Opus-class; 2026-06-19).

With the board vs without, on a fixed set of error-not-allowed tasks (each a known gold + a subtle trap a single pass ships), measured 2026-06-19. The board makes the rigor automatic where a casual pass is inconsistent — forced precision, live grounding, completeness — and the lift is larger on a weaker model. Full method, per-task scoring, and the honest-ceiling finding live in the series records: TheColliery/.github/benchmarks/CoalBoard.

Honest scope: small, dated samples; the board improves correctness, it does not prove a defect rate. A board whose lenses share one model still shares that model's blind spot (the honest ceiling). The honest sell is bounded cost + zero-breakage, not a reliability number.

🚀 Install

Claude Code — one command (also enables hook auto-activation + the cheap-lenses / premium-judge cost discount, both CC-only):

claude plugin marketplace add TheColliery/CoalBoard
claude plugin install coalboard@coalboard

Other concurrent-subagent platforms (Cursor, Codex, Copilot, Amp, Goose, … — design-supported, unverified) — the board is a plain skill: point your agent at skills/coalboard/SKILL.md (the contract is platform-neutral; it convenes via your platform's native subagent tool). There is no one-command installer, and the conductor hook + cost-tiering are CC-only. The DEBATE structure is cross-agent by design, but it is VERIFIED on Claude Code only — every actuatable artifact here (installer, hook, cost-tiering) is CC-specific, so treat the named platforms as supported-not-yet-proven and re-verify subagent support on yours.

⚙️ Configure

Everything is tunable in .coalboard.json (global ~/.claude/ overlaid by project .claude/). The headline dial is rigorrelaxed | standard | high | nasa — a preset that sets the board's strictness; any individual key overrides it. (nasa = maximum paranoia: trust nothing, the human signs off — not a 10⁻⁹ claim.) Key groups: activation (coalboardMode, criticalPaths, triggerConfidence), the board (lenses, consensusThreshold, maxRounds), verify (qaStrictness, sastCommand, applyConsent), and self-update. See the skill contract for the full set. A fully-commented template ships at platform-configs/.coalboard.json — copy it to ~/.claude/.coalboard.json (or your project's .claude/.coalboard.json) and edit.

🧭 Part of TheColliery

CoalBoard is the consensus & debate board of the mining series, alongside CoalMine (quality canaries) and CoalTipple (model/effort routing). Install one and it stands alone; install all and they compose without conflict. Shared doctrine: Phoenix-13 hooks (zero-dependency, no network, fail-silent, no child processes, deterministic), single-source-of-truth config schemas, and a strict no-overkill discipline. Series doctrine: TheColliery/.github.

Zero-dependency, offline, no API keys.


📄 License

MIT License. See LICENSE.

About

Consensus & debate board for error-not-allowed work — diverse lenses verify before it ships. Beta. Part of TheColliery.

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors