change create rft command to use the selector #325
    # Stub selector to return the single test; stub upload and polling
    import eval_protocol.cli_commands.upload as upload_mod

    monkeypatch.setattr(upload_mod, "_prompt_select", lambda tests, non_interactive=False: tests[:1])
Bug: Imported Functions Evade Monkeypatch
The monkeypatch targets upload_mod._prompt_select, but create_rft.py imports _prompt_select directly into its namespace with from .upload import _prompt_select. This means the monkeypatch won't affect the function that create_rft_command actually calls. The patch should target cr._prompt_select instead to properly mock the imported function.
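The name-binding behavior behind this bug can be reproduced in isolation. The sketch below is only an illustration: fake_upload and fake_create_rft are hypothetical stand-in modules built in memory (not the real eval_protocol modules), and unittest.mock.patch.object is used in place of pytest's monkeypatch so the snippet runs on its own.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for upload.py: the real selector would prompt the user.
upload_mod = types.ModuleType("fake_upload")
exec(
    "def _prompt_select(tests, non_interactive=False):\n"
    "    raise RuntimeError('interactive prompt reached')\n",
    upload_mod.__dict__,
)
sys.modules["fake_upload"] = upload_mod

# Hypothetical stand-in for create_rft.py: it copies the name into its own
# namespace at import time, mirroring `from .upload import _prompt_select`.
cr = types.ModuleType("fake_create_rft")
exec(
    "from fake_upload import _prompt_select\n"
    "def create_rft_command(tests):\n"
    "    return _prompt_select(tests)\n",
    cr.__dict__,
)


def stub(tests, non_interactive=False):
    """Non-interactive replacement that returns only the first test."""
    return tests[:1]


def patched_call_succeeds(target_module):
    """Patch `_prompt_select` on target_module, then call the command."""
    with mock.patch.object(target_module, "_prompt_select", stub):
        try:
            return cr.create_rft_command(["a", "b"]) == ["a"]
        except RuntimeError:
            # The original selector ran, so the patch had no effect.
            return False


# Patching the defining module leaves the consumer's copy untouched.
patch_on_upload_works = patched_call_succeeds(upload_mod)
# Patching the consuming module replaces the name the command looks up.
patch_on_consumer_works = patched_call_succeeds(cr)
```

In the test under review, the equivalent change would be to pass the create_rft module (referred to as cr in the comment above) as the first argument to monkeypatch.setattr.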
Bug: Trust Explicit Evaluator IDs for RFT.
When a user explicitly provides --evaluator-id for an existing evaluator and the evaluator check fails (network error or evaluator doesn't exist yet), the code errors out if multiple tests exist and the evaluator ID doesn't match any discovered test. This prevents users from creating RFT jobs for existing evaluators that were uploaded with custom IDs or from different projects. The code should trust the explicitly provided evaluator_id instead of requiring it to match a discovered test.
eval_protocol/cli_commands/create_rft.py#L420-L429
python-sdk/eval_protocol/cli_commands/create_rft.py
Lines 420 to 429 in 480fe2a
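One possible shape for the suggested behavior is sketched below. It is not the code in create_rft.py: resolve_evaluator_id and its parameters are hypothetical names chosen for illustration, and the only rule it encodes is that an explicitly supplied ID takes precedence over whatever test discovery finds.

```python
from typing import Optional, Sequence


def resolve_evaluator_id(
    explicit_id: Optional[str],
    discovered_ids: Sequence[str],
) -> str:
    """Pick the evaluator ID to use for the RFT job (hypothetical helper)."""
    # An explicit --evaluator-id is trusted as-is, even if it matches no
    # locally discovered test (custom IDs, evaluators from other projects).
    if explicit_id:
        return explicit_id
    # Without an explicit ID, fall back to discovery, which is only
    # unambiguous when exactly one test exists.
    if len(discovered_ids) == 1:
        return discovered_ids[0]
    raise ValueError(
        f"Cannot infer evaluator ID from {len(discovered_ids)} discovered tests; "
        "pass --evaluator-id or select a test."
    )
```

With this ordering, a failed evaluator lookup no longer forces the explicit ID to match a discovered test; the error path is reserved for the case where no ID was given and discovery is ambiguous.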
Bug: Auto-Extraction Forgets User's Test Choice
When auto-extracting dataset JSONL without --dataset-jsonl, the code re-discovers all tests and only extracts if exactly one test exists. However, if the user selected a specific test via the selector earlier (when --evaluator-id was not provided) from multiple tests, this logic fails because it doesn't use the originally selected test. The selected test information from lines 334-356 is not preserved, causing dataset extraction to fail when multiple tests exist even though the user already selected one.
eval_protocol/cli_commands/create_rft.py#L483-L506
python-sdk/eval_protocol/cli_commands/create_rft.py
Lines 483 to 506 in 480fe2a
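A minimal sketch of carrying the selection forward is shown below. DiscoveredTest and pick_test_for_extraction are hypothetical names, not identifiers from the SDK; the idea is only that the test chosen in the selector is kept and consulted first, with re-discovery used as a fallback.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DiscoveredTest:
    """Hypothetical record of an evaluation test: source file and function."""
    file_path: str
    func_name: str


def pick_test_for_extraction(
    selected: Optional[DiscoveredTest],
    discovered: Sequence[DiscoveredTest],
) -> Optional[DiscoveredTest]:
    """Choose which test to extract the dataset JSONL from."""
    # Prefer the test the user already picked in the selector.
    if selected is not None:
        return selected
    # Otherwise only a single discovered test is unambiguous.
    if len(discovered) == 1:
        return discovered[0]
    # Ambiguous or empty: the caller should ask for --dataset-jsonl.
    return None
```

Under this assumption, dataset extraction succeeds with multiple discovered tests as long as a selection was made earlier in the same command.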
Note
Refactors create rft to use the upload selector, adds robust dataset inference (dataloader/input_dataset or builder), simplifies selection UX, and removes evaluator trace persistence.
- … _prompt_select) to pick exactly one evaluation test; derives evaluator_id and entry accordingly.
- … _resolve_selected_test to map evaluator_id → source file and function.
- … input_dataset, then auto-detect and materialize a dataset builder.
- … ACTIVE before proceeding.
- … __all__.
Written by Cursor Bugbot for commit 2d75acf. This will update automatically on new commits.