Harden /evaluate error handling and remove mutable kwargs default #438
Open
bezzchen wants to merge 1 commit into eval-protocol:main from
.gitignore
  eval-protocol
+ _pytest_deps/
+ .test_deps/
eval_protocol/generic_server.py
  exit(1)

- print(f"Starting server for reward function: {args.import_string} on http://{args.host}:{args.port}")
+ logger.info("Starting server for reward function: %s on http://%s:%s", args.import_string, args.host, args.port)
Missing logging configuration silences startup info messages
Medium Severity
The if __name__ == "__main__" block replaces print() with logger.info() but never calls logging.basicConfig(). Python's root logger defaults to WARNING level, so logger.info() messages — like "Successfully loaded reward function" and "Starting server for reward function" — are silently dropped. Other __main__ blocks in this project (e.g., gcp_tools.py, platform_api.py) include a logging.basicConfig() call. The logger.error() calls still appear via Python's lastResort handler, but operators lose the confirmation that the server started successfully.
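One possible fix, sketched under assumptions: configure the root logger before the first logger.info() call in the __main__ block. The logger name, the helper functions, the format string, and the placeholder arguments below are illustrative and not taken from the PR; only logging.basicConfig() and the log message itself come from the discussion above.

```python
import logging

logger = logging.getLogger("generic_server_example")  # hypothetical logger name


def configure_logging(stream=None) -> None:
    """Attach a root handler at INFO so logger.info() output is emitted."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        stream=stream,  # None falls back to stderr, the basicConfig default
        force=True,  # replace handlers installed earlier (Python 3.8+)
    )


def announce_startup(import_string: str, host: str, port: int) -> None:
    # Same lazy %-style arguments as the logger.info() call in the diff.
    logger.info(
        "Starting server for reward function: %s on http://%s:%s",
        import_string,
        host,
        port,
    )


if __name__ == "__main__":
    configure_logging()
    # Placeholder values; the real script reads these from parsed CLI args.
    announce_startup("my_module.my_reward", "127.0.0.1", 8080)
```

Without the configure_logging() call, the INFO record is filtered out because the root logger's effective level is WARNING. Whether force=True is appropriate depends on whether the host application has already configured logging.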


This update hardens eval_protocol/generic_server.py by removing the shared mutable default for EvaluationRequest.kwargs (changed from {} to None) and replacing print(...) calls with structured logging. The broad /evaluate exception handler now logs full stack traces server-side while returning a safe, non-leaking HTTP 500 error message to clients. Unit tests in tests/test_generic_server.py were updated/added accordingly and verified passing.
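The error-handling pattern described above can be sketched without any web framework. This is a stdlib-only stand-in, not the PR's code: handle_evaluate, GENERIC_ERROR, the logger name, and the (status, body) return shape are all hypothetical names chosen for illustration.

```python
import logging
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("evaluate_example")  # hypothetical logger name

GENERIC_ERROR = "Internal server error during evaluation."  # illustrative wording


def handle_evaluate(
    reward_fn: Callable[..., Any], payload: Dict[str, Any]
) -> Tuple[int, Dict[str, Any]]:
    """Run reward_fn and map any failure to a sanitized 500 response."""
    try:
        result = reward_fn(**payload)
    except Exception:
        # logger.exception logs at ERROR level and appends the current
        # traceback, so the details stay in the server logs only.
        logger.exception("Unhandled error in /evaluate")
        return 500, {"detail": GENERIC_ERROR}
    return 200, {"result": result}
```

The key design choice is that the exception text never reaches the response body: the client sees a fixed message, while the operator gets the full stack trace in the log.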
Note
Low Risk
Low risk: behavior changes are limited to safer error responses/logging and a Pydantic default fix, with corresponding unit test updates.
Overview
Hardens eval_protocol/generic_server.py by replacing print statements with structured logging, including stack traces server-side while returning a non-leaking generic 500 message from /evaluate.
Fixes EvaluationRequest.kwargs to default to None (avoiding a shared mutable {}) and updates/adds tests to assert the new default and the sanitized error message. Also extends .gitignore to exclude local test dependency directories (_pytest_deps/, .test_deps/).
Written by Cursor Bugbot for commit 115c765. This will update automatically on new commits.
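A minimal sketch of the None-default pattern, using a plain dataclass as a stand-in for the real Pydantic model. EvaluationRequestSketch, its messages field, and effective_kwargs are hypothetical; only the kwargs field and its None default come from the summary above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class EvaluationRequestSketch:  # hypothetical stand-in, not the real model
    messages: List[Dict[str, Any]]  # assumed field, for illustration only
    kwargs: Optional[Dict[str, Any]] = None  # None instead of a shared {}


def effective_kwargs(request: EvaluationRequestSketch) -> Dict[str, Any]:
    """Return a fresh dict per call so callers never share state."""
    return dict(request.kwargs) if request.kwargs is not None else {}
```

Code that consumes the request normalizes None to an empty dict at the point of use, so a mutation made while handling one request cannot show up in another.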