Merged
        return 1
    if len(selected) != 1:
        print("Error: Please select exactly one evaluation test for 'local-test'.")
        return 1
Bug: Non-interactive --yes fails when multiple tests exist.
When --yes is used without --entry and multiple tests exist, _prompt_select returns all tests (because non_interactive=True), so the len(selected) != 1 check always rejects the selection and the command returns 1. The error message does not tell users to pass --entry, which makes the --yes flag unusable in multi-test scenarios. The function should either fail earlier with a helpful message stating that --entry is required, or handle the non-interactive case differently.
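One way to fail earlier is sketched below. This is a minimal illustration, not the code in this PR: resolve_single_test, its parameters, and the prompt callable are hypothetical names standing in for the selector logic, and tests are represented as plain path::function strings.

```python
from typing import Callable, Optional, Sequence


def resolve_single_test(
    tests: Sequence[str],
    entry: Optional[str],
    non_interactive: bool,
    prompt: Callable[[Sequence[str]], Sequence[str]],
) -> str:
    """Return exactly one test id, or raise ValueError with an actionable message.

    Hypothetical helper: `prompt` stands in for the interactive selector.
    """
    if not tests:
        raise ValueError("No evaluation tests were discovered.")

    if entry is not None:
        # An explicit --entry always wins over prompting.
        matches = [t for t in tests if t == entry]
        if len(matches) != 1:
            raise ValueError(
                f"--entry {entry!r} matched {len(matches)} tests; expected exactly one."
            )
        return matches[0]

    if len(tests) == 1:
        # Nothing to choose between, so --yes is safe here.
        return tests[0]

    if non_interactive:
        # Fail before the generic "select exactly one" check and say how to fix it.
        raise ValueError(
            f"Found {len(tests)} evaluation tests; --yes cannot choose between them. "
            "Pass --entry <path>::<function> to pick one."
        )

    selected = prompt(tests)
    if len(selected) != 1:
        raise ValueError("Please select exactly one evaluation test for 'local-test'.")
    return selected[0]
```

With this shape, the multi-test plus --yes case produces a message that names --entry instead of the generic selection error.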
Note
Adds a local-test CLI to run evaluation tests locally or in Docker with extra flags, updates EvaluationRow.created_at to UTC, tweaks upload prompts, and adds comprehensive tests.
- local-test command in eval_protocol/cli.py and cli_commands/local_test.py to run a selected evaluation test via pytest.
- Takes --entry (path or path::function) or uses the selector; enforces single selection.
- Looks for a Dockerfile; runs in Docker if one is present (or on the host with --ignore-docker).
- Supports --docker-build-extra and --docker-run-extra; mounts project/logs and maps user IDs.
- cli_commands/upload.py: prompts now say "Select this test?" and "Enter the number to select:".
- models.py: EvaluationRow.created_at now defaults to datetime.now(timezone.utc) (UTC timestamp).
- tests/test_cli_local_test.py covers host/Docker execution, the multiple-Dockerfiles error, extra flag passing, selector behavior, and path normalization.
Written by Cursor Bugbot for commit 9b476dc. This will update automatically on new commits.
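For readers unfamiliar with the created_at change, the sketch below shows the general pattern of a timezone-aware UTC default. Row is a hypothetical dataclass used only for illustration; the real EvaluationRow in models.py may be defined differently.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Row:
    # Hypothetical stand-in for EvaluationRow; only the timestamp field is shown.
    # default_factory runs once per instance, so each row gets its own timestamp.
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


row = Row()
# The value is timezone-aware, so its ISO string carries an explicit +00:00 offset.
print(row.created_at.isoformat())
```

An aware timestamp avoids ambiguity when rows created on hosts in different time zones are compared or serialized.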