fixup! Add Claude Sonnet 4.6 to benchmark — now best no-tools model a…

3caf19f
Merged

PolicyBench v2: AI tax/benefit benchmark #2
