CoalBoard is verified under the same framework as CoalMine and CoalTipple: the execution hook follows the Phoenix-13 commandments, the build is reproducible from source, and the design is security-first.
Open an issue at github.com/TheColliery/CoalBoard, or request a private channel for sensitive PoC logs. We investigate promptly.
All commits and release tags are SSH-signed (gpg.format=ssh); GitHub renders the Verified badge.
Verify locally:

    echo "* ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIEtqTWGKhX1Dk9nZP8ns13Wl5zsO1Cz3VlTS6m1p2fP9" > coalboard_signers
    git config gpg.ssh.allowedSignersFile ./coalboard_signers
    git verify-commit HEAD && git tag -v "$(git describe --tags --abbrev=0)"

The clean `plugin/` distribution is generated from source by `node scripts/build-plugin.mjs`; `node scripts/verify.mjs` checks that the dist is in sync, the manifest is valid, and the config schema is well-formed. `node scripts/test.mjs` runs the zero-dependency unit and hermetic-hook tests. It is the canonical runner; the `node --test <glob>` form is avoided because it breaks on Node 24 with MODULE_NOT_FOUND.
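As a rough illustration of what a "dist is in sync" check can look like, the sketch below compares a freshly generated file set against the committed one. It is not the real `scripts/verify.mjs`: the function name `diffDist`, the status labels, and the file paths are all hypothetical, and the real script may compare hashes or use a different strategy.

```javascript
// Hypothetical sketch of a dist-in-sync check; not the real scripts/verify.mjs.
// Both arguments are Maps from a relative file path to that file's text content.
function diffDist(generated, committed) {
  const problems = [];
  // Every generated file must exist in the committed dist with identical content.
  for (const [path, content] of generated) {
    if (!committed.has(path)) {
      problems.push({ path, status: "missing-from-dist" });
    } else if (committed.get(path) !== content) {
      problems.push({ path, status: "stale" });
    }
  }
  // The committed dist must not carry files the build no longer produces.
  for (const path of committed.keys()) {
    if (!generated.has(path)) {
      problems.push({ path, status: "unexpected-in-dist" });
    }
  }
  return problems;
}

// Usage with made-up file names and contents.
const generated = new Map([
  ["plugin/a.js", "x"],
  ["plugin/b.js", "y"],
]);
const committed = new Map([
  ["plugin/a.js", "x"],
  ["plugin/b.js", "old"],
  ["plugin/c.js", "z"],
]);
const problems = diffDist(generated, committed);
console.log(problems);
// problems lists plugin/b.js as "stale" and plugin/c.js as "unexpected-in-dist"
```

An empty result would mean the committed dist matches the build output; any entry is a reason to fail verification.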
Last scan: CoalBoard v1.0.12 dist (commit c67d90b), on 2026-06-20, with NVIDIA SkillSpector v2.2.3 (self-reported: the tool ships no tagged releases; the version is the uvx-from-git HEAD). Re-scan is event-driven on a NEW SkillSpector version (maintainer-commanded), NOT per CoalBoard release: the static analyzers are stable (unchanged since 2026-05-11), so a CoalBoard content bump does not change what these (non-MCP) rules read. The 2026-06-20 re-scan of the v1.0.12 dist CONFIRMS this: same SkillSpector v2.2.3, same all-false-positive verdict, the findings unchanged in class from the earlier v1.0.1 scan.
Read the score in context. The static stage scored 100/100; the LLM semantic stage was rate-limited (HTTP 429), so the scan fell back to static-only analysis, which is pattern-match-based and false-positive-prone (it flags strings without the skill-contract context). Every finding was verified as a false positive; re-run the semantic stage when the limit clears for a context-aware score. The verifications:
| finding(s) | why it is a false positive |
|---|---|
| RA1 ×6: self-modification (self-update) | The series self-update is consent-gated: the hook only schedules (never networks), and the agent offers the platform's own `claude plugin update`. The skill never rewrites its own files. |
| TM1: tool-parameter abuse (`rm -rf`) | The matched text instructs the agent to lint-for and skip-flag `rm -rf` in reviewed code. That is a defensive rule, the opposite of using it. |
| RA2: session persistence (`.coalboard/proposed/`) | The staging dir is the propose-not-execute safety mechanism (nothing reaches live until the human approves), not attacker persistence. |
| EA2: autonomous decision | The cited snippet itself mandates "default ASK, via the question-box"; the human consent gate is present. |
| EA3 ×2: scope creep ("not limited to") | The manual `/coalboard` breadth is the deliberate, consent-gated two-scope design (auto narrow, manual broad). |
| SQP-1/2, SDI-2 ×2 | `/coalboard` is a manual command (not accidentally triggerable); "no executable code" is wrong (`hooks/coalboard-conductor.js` ships in the dist); the git tag-check is a read-only, by-design lookup. |
(Exact per-category counts shift slightly between static runs. The 2026-06-20 re-scan returned 11 findings: RA1×6 · EA3×2 · EA2 · RA2 · TM1; the categories above and the all-false-positive verdict are stable. The SQP/SDI row was an LLM-semantic-stage finding from an earlier run; the 2026-06-20 static pass, run while the LLM stage was HTTP-429 rate-limited, did not raise those findings.) This matches the family baseline: all three skills now score 100/100 static. CoalMine and CoalTipple rose from 58/100 and 0/100 respectively after they shipped consent-gated Self-Updating, which the static RA1 self-modification rule flags as the same false positive seen here, so the verdict is all-false-positive across the family. The report JSON is not shipped.
`hooks/coalboard-conductor.js` is advise-only and Phoenix-pure: zero dependencies (Node builtins only), no network, no child processes, fail-silent (exits 0 on any error; never crashes the host), and it emits only on the two sanctioned channels. It detects and injects; it never spawns workers, networks, or applies anything. Its stdin parse is guarded against non-object input.
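The guarded parse and the fail-silent behavior described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the shipped hook: `parseHookInput`, `runHook`, and the `handle` callback are hypothetical names, and the real hook's input shape and output channels are not shown here.

```javascript
// Hypothetical sketch; not the shipped hooks/coalboard-conductor.js.

// Returns the parsed payload only when it is a plain JSON object, else null.
function parseHookInput(raw) {
  try {
    const value = JSON.parse(raw);
    // JSON.parse also accepts null, numbers, strings, and arrays; reject those.
    if (value === null || typeof value !== "object" || Array.isArray(value)) {
      return null;
    }
    return value;
  } catch {
    return null; // malformed JSON: stay silent instead of throwing
  }
}

// Runs a handler on valid input and always reports exit code 0 (fail-silent).
function runHook(raw, handle) {
  try {
    const input = parseHookInput(raw);
    if (input !== null) handle(input);
  } catch {
    // Any handler error is swallowed so the host is never crashed.
  }
  return 0;
}

// Usage: a throwing handler still yields exit code 0.
const exitCode = runHook('{"event":"example"}', () => {
  throw new Error("boom");
});
console.log(exitCode); // prints 0
```

In a real hook the `raw` string would come from stdin and the return value would be assigned to `process.exitCode`; both are left out so the sketch stays free of I/O.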
The board is built so that reviewing untrusted work cannot harm the host:
- The work under review is DATA, never instructions: the lens prompts never obey an injected "approve this".
- Propose, never execute: the lenses emit a diff to `.coalboard/proposed/`; the board itself runs no side-effect. A real side-effect (a migration, an API call, a deploy) fires only at the human-approved apply, with a warning, and is never auto-retried.
- Verify is contract-isolated, NOT OS-sandboxed: a skill cannot OS-sandbox; the judge runs reviewed checks in the staging dir with a pre-run lint (banned modules / `rm -rf` / network), no real DB/network where possible, and a disposable VM for genuinely hostile code. Contract-enforced + the judge's discipline, not an OS guarantee. External SAST is optional, never a hard requirement.
- Secrets are scrubbed: credential patterns are scrubbed from anything logged or displayed (best-effort defense-in-depth, contract-enforced; NOT a guarantee a secret is caught; the staging + read-only-worker boundary is the real protection). `scripts/lib/secrets.mjs` is the reference scrubber + its test target; it is a DEV file, not part of the shipped plugin runtime.
- No human, no apply: a non-interactive (cron/headless) run is report-only; the human consent gate is the load-bearing safety node.
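The pre-run lint and the secret scrubbing in the list above can be sketched as two small pattern-based helpers. Everything here is an assumption for illustration: the rule list, the credential patterns, and the names `lintCheck` and `scrubSecrets` are hypothetical and do not reflect the actual contents of `scripts/lib/secrets.mjs` or the judge's real lint rules.

```javascript
// Hypothetical rules; the real lint list and scrubber patterns are not given in this document.
const BANNED = [
  { name: "rm -rf", pattern: /\brm\s+-(?:rf|fr)\b/ },
  { name: "child_process module", pattern: /\bchild_process\b/ },
  {
    name: "network module",
    pattern: /require\(\s*["'](?:node:)?(?:https?|net|dgram)["']\s*\)/,
  },
];

// Returns the names of every banned rule the reviewed source trips.
function lintCheck(source) {
  return BANNED.filter((rule) => rule.pattern.test(source)).map((rule) => rule.name);
}

// Illustrative credential shapes only; a best-effort list, never exhaustive.
const SECRET_PATTERNS = [
  /AKIA[0-9A-Z]{16}/g, // shape of an AWS access key ID
  /ghp_[A-Za-z0-9]{36}/g, // shape of a GitHub personal access token
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
];

// Replaces every match of every pattern with a fixed placeholder.
function scrubSecrets(text) {
  return SECRET_PATTERNS.reduce((out, re) => out.replace(re, "[REDACTED]"), text);
}

// Usage: AKIAIOSFODNN7EXAMPLE is the placeholder key from AWS documentation.
console.log(lintCheck("rm -rf /tmp/build")); // prints [ 'rm -rf' ]
console.log(scrubSecrets("key=AKIAIOSFODNN7EXAMPLE done")); // prints key=[REDACTED] done
```

Both helpers are plain pattern matching, which is why the list treats them as defense-in-depth: a lint can be evaded and a scrubber can miss an unfamiliar credential format, so the staging boundary and the human consent gate remain the real protection.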