Adding Passed Threshold to Flags#121

Merged
xzrderek merged 1 commit into main from derekx/adding-ci on Aug 26, 2025

@xzrderek
Contributor

Now I can do things like:
pytest eval_protocol/benchmarks/test_gpqa.py --ep-success-threshold 0.8 --ep-se-threshold 0.01 --ep-summary-json gpqa-test.json --ep-max-rows 5 --ep-num-runs 2
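
For readers wondering what a pass/fail gate with a success threshold and a standard-error threshold could look like, here is a minimal sketch. The PR does not spell out how the flags are evaluated, so this is an assumption, not the eval_protocol implementation: the helper `passes_thresholds` is hypothetical, and it assumes "success" means the mean score reaches the success threshold and the standard error of the mean stays at or below the SE threshold.

```python
import math
from statistics import mean, stdev


def passes_thresholds(scores, success_threshold, se_threshold=None):
    """Hypothetical gate (not eval_protocol's code).

    Passes when the mean score is at least success_threshold and, if
    se_threshold is given, the standard error of the mean is at most
    se_threshold. Returns (passed, mean, standard_error).
    """
    avg = mean(scores)
    # Standard error of the mean; treated as 0.0 when only one score exists.
    se = stdev(scores) / math.sqrt(len(scores)) if len(scores) > 1 else 0.0
    passed = avg >= success_threshold
    if se_threshold is not None:
        passed = passed and se <= se_threshold
    return passed, avg, se


if __name__ == "__main__":
    # Example values mirroring the flags above (0.8 and 0.01); scores are made up.
    print(passes_thresholds([0.82, 0.81, 0.83, 0.82], 0.8, 0.01))
```

Under these assumptions, a run with a high mean but widely scattered scores would still fail, which is the point of pairing the two thresholds.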

@xzrderek xzrderek merged commit 1a9d702 into main Aug 26, 2025
1 check passed
@xzrderek xzrderek deleted the derekx/adding-ci branch August 26, 2025 08:04