Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
```python
from eval_protocol.models import EvaluationRow, Message
from {module} import {func} as _ep_test


def evaluate(messages: List[Dict[str, Any]], ground_truth: Optional[Union[str, List[Dict[str, Any]]]] = None, tools=None, **kwargs):
```
Package test code in uploaded evaluator

The generated TS-mode snippet imports the evaluation test from the user's module (`from {module} import {func}`), but `create_evaluation` uploads only this single file as `python_code_to_evaluate`. When the evaluator runs on Fireworks, the referenced module is not present in that environment, so every uploaded evaluator fails immediately with `ModuleNotFoundError` unless the user's entire project is already installed remotely. The upload command needs to embed the test source (e.g., via `inspect.getsource`) or package the module alongside the snippet.
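One way to make the snippet self-contained is to inline the test's source with `inspect.getsource` before uploading, as suggested above. A minimal sketch of that idea (`embed_test_source` is a hypothetical helper, not part of this PR; the `_ep_test` alias mirrors the name the generated snippet expects):

```python
import inspect
import textwrap


def embed_test_source(func) -> str:
    """Return the function's full source plus an alias, so the uploaded
    snippet no longer needs `from {module} import {func}` at runtime."""
    source = textwrap.dedent(inspect.getsource(func))
    return f"{source}\n_ep_test = {func.__name__}\n"


# Example: a local evaluation test we want to inline rather than import.
def sample_test(row):
    return row

snippet = embed_test_source(sample_test)
# `snippet` now contains the definition of sample_test plus the alias line.
```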
```python
self._model_base_url = model_base_url
if os.getenv("EP_REMOTE_ROLLOUT_PROCESSOR_BASE_URL"):
    self._remote_base_url = os.getenv("EP_REMOTE_ROLLOUT_PROCESSOR_BASE_URL")
self._model_base_url = model_base_url
```
Do we need the option to override this via an env var as well? I'd guess not, since it's going to be fixed to something like https://api.fireworks.ai/inference/v1/chat/completions.
Oh, it seems we do need the override option (e.g., overriding to https://tracing.fireworks.ai/project_id/xxxxxx).
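One possible precedence for that override, sketched below: an environment variable beats the explicit constructor argument, which beats the fixed default. The variable name `EP_MODEL_BASE_URL` and the helper are assumptions for illustration, not the PR's actual identifiers:

```python
import os

# Assumed default; the PR pins the model URL to something like this.
DEFAULT_MODEL_BASE_URL = "https://api.fireworks.ai/inference/v1/chat/completions"


def resolve_model_base_url(explicit=None):
    """Env var override > explicit argument > fixed default."""
    # EP_MODEL_BASE_URL is a hypothetical variable name for illustration.
    return os.getenv("EP_MODEL_BASE_URL") or explicit or DEFAULT_MODEL_BASE_URL


assert resolve_model_base_url() == DEFAULT_MODEL_BASE_URL

# Overriding, e.g. to route requests through the tracing proxy:
os.environ["EP_MODEL_BASE_URL"] = "https://tracing.fireworks.ai/project_id/xxxxxx"
assert resolve_model_base_url("ignored").startswith("https://tracing.fireworks.ai/")
```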
```python
    return code, file_name, qualname


def _generate_ts_mode_code(test: DiscoveredTest) -> tuple[str, str]:
```
```python
try:
    result = create_evaluation(
        evaluator_id=evaluator_id,
        python_code_to_evaluate=code,
```
For the uploaded code here, should we just upload the user's full code base with a selected pyargs entry point? I can create a new backend endpoint for it.
Screen.Recording.2025-09-30.at.5.04.16.PM.mov
Screen.Recording.2025-09-30.at.6.04.39.PM.mov
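The "upload the full code base with a selected entry point" idea could look roughly like this on the client side. A sketch only: `package_project`, the `ENTRYPOINT` manifest name, and the zip layout are all assumptions here, and the backend endpoint does not exist yet:

```python
import io
import tempfile
import zipfile
from pathlib import Path


def package_project(root: str, entry: str) -> bytes:
    """Zip every .py file under `root`, recording `entry` (e.g.
    "my_eval:evaluate") in a hypothetical ENTRYPOINT manifest so the
    backend knows which function to run."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(Path(root).rglob("*.py")):
            zf.writestr(path.relative_to(root).as_posix(), path.read_text())
        zf.writestr("ENTRYPOINT", entry)
    return buf.getvalue()


# Demo on a throwaway project directory.
project = tempfile.mkdtemp()
(Path(project) / "my_eval.py").write_text(
    "def evaluate(messages, **kwargs):\n    return 1.0\n"
)
archive = package_project(project, "my_eval:evaluate")
names = zipfile.ZipFile(io.BytesIO(archive)).namelist()
```

Uploading an archive like this would sidestep the `ModuleNotFoundError` problem entirely, since the evaluator's imports resolve against the packaged tree instead of the remote environment.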