
Epistemic confidence Phase A — honest confidence scoring#8

Merged
msitarzewski merged 1 commit into main from epistemic-confidence-phase-a on Feb 18, 2026

Conversation

@msitarzewski
Owner

Confidence now reflects inherent uncertainty of the question domain, not just challenge quality. Rigor (renamed from old confidence) measures challenge genuineness [0.5–1.0]; confidence = min(domain_cap, rigor) where domain caps are factual=0.95, technical=0.90, creative=0.85, judgment=0.80, strategic=0.70. Adds calibration module (ECE metric), duh calibration CLI, GET /api/calibration endpoint, and calibration dashboard in web UI. Full-stack propagation of rigor field across ORM, handlers, CLI, API, WebSocket, MCP, and frontend (47 source files + 5 memory-bank files). 1586 Python + 126 Vitest = 1712 tests passing.
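The scoring rule described above can be sketched in a few lines. This is an illustrative Python sketch, not the project's actual API: the names `DOMAIN_CAPS` and `score_confidence` are hypothetical, but the caps and the `min(domain_cap, rigor)` rule follow the description.

```python
# Illustrative sketch of the Phase A scoring rule (names are hypothetical).
# Rigor measures challenge genuineness in [0.5, 1.0]; confidence is rigor
# capped by the inherent uncertainty of the question domain.

DOMAIN_CAPS = {
    "factual": 0.95,
    "technical": 0.90,
    "creative": 0.85,
    "judgment": 0.80,
    "strategic": 0.70,
}

def score_confidence(domain: str, rigor: float) -> float:
    """Return min(domain_cap, rigor), with rigor clamped to [0.5, 1.0]."""
    rigor = min(1.0, max(0.5, rigor))       # rigor lives in [0.5, 1.0]
    return min(DOMAIN_CAPS[domain], rigor)

print(score_confidence("strategic", 0.95))  # domain cap binds: 0.7
print(score_confidence("factual", 0.6))     # rigor binds: 0.6
```

Note the asymmetry this creates: a strategic question can never score above 0.70 no matter how genuine the challenge, while a factual question's score is usually limited by rigor itself.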

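The calibration module's ECE metric can also be sketched. The standard Expected Calibration Error bins predictions by confidence and averages the gap between each bin's accuracy and its mean confidence, weighted by bin size; the function below is a minimal self-contained version and is not the project's implementation.

```python
# Minimal sketch of Expected Calibration Error (ECE), hypothetical code.
# Bin predictions by confidence, then sum |accuracy - mean confidence|
# over bins, weighted by the fraction of samples in each bin.

def expected_calibration_error(confidences, correct, n_bins=10):
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # half-open bins (lo, hi], with 0.0 included in the first bin
        idx = [i for i, c in enumerate(confidences)
               if lo < c <= hi or (b == 0 and c == lo)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        ece += (len(idx) / n) * abs(accuracy - avg_conf)
    return ece

# Perfectly calibrated toy data: confidence 0.75, empirical accuracy 3/4.
print(expected_calibration_error([0.75] * 4, [1, 1, 1, 0]))  # → 0.0
```

An ECE of 0 means stated confidence matches observed accuracy in every bin, which is what an "honest confidence scoring" dashboard would track over time.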
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@msitarzewski merged commit 253476d into main on Feb 18, 2026
3 checks passed