
eval protocol upload #241

Merged
benjibc merged 1 commit into main from eval_protocol_upload on Oct 2, 2025

Conversation

@benjibc (Contributor) commented Oct 1, 2025

Screen.Recording.2025-09-30.at.5.04.16.PM.mov
Screen.Recording.2025-09-30.at.6.04.39.PM.mov

@benjibc benjibc force-pushed the eval_protocol_upload branch from a8a8a9a to 1cd89b6 on October 1, 2025 04:23

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting

Comment on lines +251 to +254
from eval_protocol.models import EvaluationRow, Message
from {module} import {func} as _ep_test

def evaluate(messages: List[Dict[str, Any]], ground_truth: Optional[Union[str, List[Dict[str, Any]]]] = None, tools=None, **kwargs):


P1: Package test code in uploaded evaluator

The generated TS-mode snippet imports the evaluation test from the user’s module (from {module} import {func}) but create_evaluation only uploads this single file as python_code_to_evaluate. When the evaluator runs on Fireworks, the referenced module is not present in that environment, so every uploaded evaluator fails immediately with ModuleNotFoundError unless the user’s entire project is already installed remotely. The upload command needs to embed the test source (e.g., via inspect.getsource) or package the module alongside the snippet.
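One way to make the uploaded file self-contained, as the review suggests (a sketch, not this PR's actual implementation): inline the discovered test's source with inspect.getsource instead of importing it from the user's module.

```python
import inspect
import textwrap

def embed_test_source(func) -> str:
    """Return a snippet that inlines the test function's source so the
    uploaded evaluator carries the test body itself, rather than doing
    `from {module} import {func}` against a module that does not exist
    in the remote environment."""
    src = textwrap.dedent(inspect.getsource(func))
    return src + f"\n_ep_test = {func.__name__}\n"

# A toy stand-in for a discovered evaluation test.
def my_eval_test(row):
    return row

snippet = embed_test_source(my_eval_test)
# The generated file now contains `def my_eval_test(...)` verbatim,
# so executing it remotely needs no user package installed.
print("def my_eval_test" in snippet)
```

This covers a single self-contained function; tests that themselves import project helpers would still need the full-package upload discussed further down.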


self._model_base_url = model_base_url
if os.getenv("EP_REMOTE_ROLLOUT_PROCESSOR_BASE_URL"):
    self._remote_base_url = os.getenv("EP_REMOTE_ROLLOUT_PROCESSOR_BASE_URL")
self._model_base_url = model_base_url
Collaborator


Do we need an option to override this via env var as well? I guess not, right? (Since it's going to be fixed to something like https://api.fireworks.ai/inference/v1/chat/completions.)

Collaborator


Oh, seems like we do need the override option (to point at something like https://tracing.fireworks.ai/project_id/xxxxxx).
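The precedence being discussed can be sketched as follows (class and default URL are illustrative, not this PR's exact code): the constructor argument is the default, and the EP_REMOTE_ROLLOUT_PROCESSOR_BASE_URL env var overrides it when set.

```python
import os

class RemoteRolloutProcessor:
    """Sketch: env var wins over the constructor default, e.g. to point
    the remote base URL at a tracing endpoint instead of inference."""

    def __init__(self, model_base_url="https://api.fireworks.ai/inference/v1/chat/completions"):
        self._model_base_url = model_base_url
        env_url = os.getenv("EP_REMOTE_ROLLOUT_PROCESSOR_BASE_URL")
        # Explicit env var override takes precedence over the default.
        self._remote_base_url = env_url if env_url else model_base_url

os.environ["EP_REMOTE_ROLLOUT_PROCESSOR_BASE_URL"] = "https://tracing.example/project_id/1"
p = RemoteRolloutProcessor()
print(p._remote_base_url)  # the env var value, not the inference default
```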

    return code, file_name, qualname


def _generate_ts_mode_code(test: DiscoveredTest) -> tuple[str, str]:
Collaborator


What is TS mode?

try:
    result = create_evaluation(
        evaluator_id=evaluator_id,
        python_code_to_evaluate=code,
Collaborator


For the uploaded code here, should we just upload their full code base with a selected pyargs entry point? I can create a new backend endpoint for it.

@benjibc benjibc merged commit 33185d7 into main Oct 2, 2025
7 checks passed
@benjibc benjibc deleted the eval_protocol_upload branch October 2, 2025 02:10
