
Epistemic confidence Phase A — honest confidence scoring#8

Merged
msitarzewski merged 1 commit into main from epistemic-confidence-phase-a on Feb 18, 2026

Conversation

@msitarzewski
Owner

Confidence now reflects inherent uncertainty of the question domain, not just challenge quality. Rigor (renamed from old confidence) measures challenge genuineness [0.5–1.0]; confidence = min(domain_cap, rigor) where domain caps are factual=0.95, technical=0.90, creative=0.85, judgment=0.80, strategic=0.70. Adds calibration module (ECE metric), duh calibration CLI, GET /api/calibration endpoint, and calibration dashboard in web UI. Full-stack propagation of rigor field across ORM, handlers, CLI, API, WebSocket, MCP, and frontend (47 source files + 5 memory-bank files). 1586 Python + 126 Vitest = 1712 tests passing.
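The scoring rule described above can be sketched in a few lines. This is an illustrative Python sketch, not the project's actual API: the names `DOMAIN_CAPS` and `score_confidence` are hypothetical, but the caps and the `min(domain_cap, rigor)` rule follow the description.

```python
# Illustrative sketch of the Phase A scoring rule (names are hypothetical).
# Rigor measures challenge genuineness in [0.5, 1.0]; confidence is rigor
# capped by the inherent uncertainty of the question domain.

DOMAIN_CAPS = {
    "factual": 0.95,
    "technical": 0.90,
    "creative": 0.85,
    "judgment": 0.80,
    "strategic": 0.70,
}

def score_confidence(domain: str, rigor: float) -> float:
    """Return min(domain_cap, rigor), with rigor clamped to [0.5, 1.0]."""
    rigor = min(1.0, max(0.5, rigor))       # rigor lives in [0.5, 1.0]
    return min(DOMAIN_CAPS[domain], rigor)

print(score_confidence("strategic", 0.95))  # domain cap binds: 0.7
print(score_confidence("factual", 0.6))     # rigor binds: 0.6
```

Note the asymmetry this creates: a strategic question can never score above 0.70 no matter how genuine the challenge, while a factual question's score is usually limited by rigor itself.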

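The calibration module's ECE metric can also be sketched. The standard Expected Calibration Error bins predictions by confidence and averages the gap between each bin's accuracy and its mean confidence, weighted by bin size; the function below is a minimal self-contained version and is not the project's implementation.

```python
# Minimal sketch of Expected Calibration Error (ECE), hypothetical code.
# Bin predictions by confidence, then sum |accuracy - mean confidence|
# over bins, weighted by the fraction of samples in each bin.

def expected_calibration_error(confidences, correct, n_bins=10):
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # half-open bins (lo, hi], with 0.0 included in the first bin
        idx = [i for i, c in enumerate(confidences)
               if lo < c <= hi or (b == 0 and c == lo)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        ece += (len(idx) / n) * abs(accuracy - avg_conf)
    return ece

# Perfectly calibrated toy data: confidence 0.75, empirical accuracy 3/4.
print(expected_calibration_error([0.75] * 4, [1, 1, 1, 0]))  # → 0.0
```

An ECE of 0 means stated confidence matches observed accuracy in every bin, which is what an "honest confidence scoring" dashboard would track over time.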
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@msitarzewski merged commit 253476d into main on Feb 18, 2026
3 checks passed