The Marked Bench: a versioned contradiction-detection benchmark for AI reasoning evaluation.
leaderboard schema-validation ai-safety reasoning contradiction-detection multihop-reasoning ai-evaluation result-card explanation-evaluation ai-benchmark reasoning-evaluation benchmark-submissions external-submissions benchmark-governance submission-evidence conformance-report benchmark-standard
Updated May 31, 2026 - Python