Skip to content

Harden /evaluate error handling and remove mutable kwargs default#438

Open
bezzchen wants to merge 1 commit intoeval-protocol:mainfrom
bezzchen:error-handling
Open

Harden /evaluate error handling and remove mutable kwargs default#438
bezzchen wants to merge 1 commit intoeval-protocol:mainfrom
bezzchen:error-handling

Conversation

@bezzchen
Copy link

@bezzchen bezzchen commented Mar 20, 2026

This update hardens eval_protocol/generic_server.py by removing the shared mutable default for EvaluationRequest.kwargs (changed from {} to None) and replacing print(...) calls with structured logging. The broad /evaluate exception handler now logs full stack traces server-side while returning a safe, non-leaking HTTP 500 error message to clients. Unit tests in tests/test_generic_server.py were updated/added accordingly and verified passing.


Note

Low Risk
Low risk: behavior changes are limited to safer error responses/logging and a Pydantic default fix, with corresponding unit test updates.

Overview
Hardens eval_protocol/generic_server.py by replacing print statements with structured logging, including stack traces server-side while returning a non-leaking generic 500 message from /evaluate.

Fixes EvaluationRequest.kwargs to default to None (avoiding a shared mutable {}) and updates/adds tests to assert the new default and the sanitized error message. Also extends .gitignore to exclude local test dependency directories (_pytest_deps/, .test_deps/).

Written by Cursor Bugbot for commit 115c765. This will update automatically on new commits. Configure here.

@bezzchen
Copy link
Author

@xzrderek

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

eval-protocol
_pytest_deps/
.test_deps/
.test_deps/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate .test_deps/ entry in .gitignore

Low Severity

.test_deps/ appears twice on consecutive lines in .gitignore. One of the duplicate entries can be removed.

Fix in Cursor Fix in Web

exit(1)

print(f"Starting server for reward function: {args.import_string} on http://{args.host}:{args.port}")
logger.info("Starting server for reward function: %s on http://%s:%s", args.import_string, args.host, args.port)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing logging configuration silences startup info messages

Medium Severity

The if __name__ == "__main__" block replaces print() with logger.info() but never calls logging.basicConfig(). Python's root logger defaults to WARNING level, so logger.info() messages — like "Successfully loaded reward function" and "Starting server for reward function" — are silently dropped. Other __main__ blocks in this project (e.g., gcp_tools.py, platform_api.py) include a logging.basicConfig() call. The logger.error() calls still appear via Python's lastResort handler, but operators lose the confirmation that the server started successfully.

Additional Locations (1)
Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant