Coding Env improvements for further development #235

albertodepaola · 2025-12-04T22:06:34Z

🎯 Summary

This PR adds in-repo development support for the coding_env environment and fixes the reward computation bug that prevented the transform pipeline from evaluating code and assigning rewards.

Testing

Running the example script starts the environment and prints the reward for each step:

$ python ./examples/local_coding_env.py
============================================================
CodingEnv.from_docker_image() Test
============================================================

Creating client from Docker image...
  CodingEnv.from_docker_image('coding-env:latest')

✓ Client created and container started!

Testing the environment:
------------------------------------------------------------

1. Reset:
   stdout: 
   stderr: 
   exit_code: 0
   State: episode_id=35c24391-037d-4d25-bd0b-85ee048de8fe, step_count=0

2. Execute Python code:
   1. Code: print('Hello, World!')...
      → stdout: Hello, World!
      → exit_code: 0
      → reward: 0.1
...

Running reward tests before the fix:

$ pytest ./tests/envs/test_python_codeact_rewards.py
...
=========== 29 failed, 2 passed in 0.46s ===========

After the fix:

$ pytest ./tests/envs/test_python_codeact_rewards.py
=========================================================================== test session starts ===========================================================================
platform darwin -- Python 3.12.9, pytest-8.4.2, pluggy-1.5.0
rootdir: /Users/betodepaola/projects/OpenEnv
configfile: pyproject.toml
plugins: asyncio-1.2.0, anyio-4.11.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 31 items                                                                                                                                                        

tests/envs/test_python_codeact_rewards.py ...............................                                                                                           [100%]

===========  31 passed in 0.36s ===========

✨ Features & Enhancements

Added dual-mode import support for both standalone (PyPI) and in-repo development
- Client, server, models, and transforms now support both import paths
- Allows developers to work on coding_env within the OpenEnv monorepo without publishing to PyPI
Updated Dockerfile for in-repo build mode
- Supports BUILD_MODE argument for flexible Docker builds
- Enables development workflow without PyPI package cycles
Updated package dependencies
- Bumped the version in pyproject.toml for both the env and the main project, because the Docker container termination logic was updated
- Fixed relative imports for .models module
Enhanced README with development instructions
- Added setup guides for in-repo development workflow
Improved container cleanup logic
- Properly removes containers on exit to prevent resource leaks
- Fixes issue where containers would accumulate after testing
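
The dual-mode import support described above can be sketched as a fallback import: try one module path, and fall back to the other if it is not installed. This is a minimal sketch, not the code from this PR. The helper name `import_first_available` and the two module paths in the comment are assumptions made for illustration.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_first_available(candidates: Sequence[str]) -> ModuleType:
    """Return the first module in `candidates` that can be imported.

    Hypothetical helper: tries each dotted module path in order and
    raises ImportError only if none of them is importable.
    """
    last_error = None
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            last_error = exc
    raise ImportError(f"none of {list(candidates)} could be imported") from last_error


# Assumed usage for coding_env (both paths are illustrative guesses):
#   models = import_first_available(["coding_env.models", "envs.coding_env.models"])

# Runnable demonstration with the standard library: the first path does not
# exist, so the helper falls back to the second one.
fallback = import_first_available(["not_a_real_package_xyz", "json"])
print(fallback.__name__)
```

A plain `try: ... except ImportError: ...` around two `from ... import ...` statements achieves the same effect inline; the helper only makes the order of preference explicit.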

🐛 Bug Fixes

Reward Computation Fix

Fixed reward calculation in python_codeact_env.py
- Added metadata={"last_code": action.code} to CodeObservation in step() method
- Enables transform pipeline (CodeSafetyTransform and CodeQualityTransform) to evaluate code
- Root cause: Transforms require code in metadata["last_code"] to calculate rewards, but it was never being set
- Impact: Rewards now correctly computed instead of returning None
  - Safe + concise code (≤100 chars): reward = 0.1
  - Safe + verbose code (>100 chars): reward = 0.0
  - Dangerous patterns (import os, eval(), etc.): reward = -1.0
  - Syntax errors: reward = -0.2
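
To make the root cause concrete, here is a simplified sketch of a reward step that reads the code from `metadata["last_code"]`. It is not the actual CodeSafetyTransform or CodeQualityTransform. The `Observation` class, the `apply_reward` function, the pattern list, and the way the four cases are combined are all assumptions, chosen only to reproduce the reward values listed above.

```python
import ast
import re
from dataclasses import dataclass, field
from typing import Optional

# Only the two patterns named in this PR; the real list is not shown here.
DANGEROUS_PATTERNS = (r"\bimport\s+os\b", r"\beval\s*\(")


@dataclass
class Observation:
    """Stand-in for CodeObservation (assumed shape, for illustration only)."""
    stdout: str = ""
    exit_code: int = 0
    reward: Optional[float] = None
    metadata: dict = field(default_factory=dict)


def apply_reward(obs: Observation) -> Observation:
    """Assign a reward based on the code stored in metadata["last_code"]."""
    code = obs.metadata.get("last_code")
    if code is None:
        # Without the metadata there is nothing to evaluate, so the
        # reward stays None. This mirrors the behavior before the fix.
        return obs
    if any(re.search(pattern, code) for pattern in DANGEROUS_PATTERNS):
        obs.reward = -1.0
        return obs
    try:
        ast.parse(code)
    except SyntaxError:
        obs.reward = -0.2
        return obs
    # Safe, syntactically valid code: small bonus when it is concise.
    obs.reward = 0.1 if len(code) <= 100 else 0.0
    return obs


code = "print('Hello, World!')"
before_fix = apply_reward(Observation(stdout="Hello, World!"))
after_fix = apply_reward(Observation(stdout="Hello, World!", metadata={"last_code": code}))
print(before_fix.reward, after_fix.reward)
```

The two calls at the end differ only in whether `last_code` is present in the metadata, which is the one-line change this PR makes in `step()`.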

Test Script Enhancement

Added reward output to local_coding_env.py example
- Displays result.reward for both successful executions and error scenarios
- Provides visibility into the reward computation for debugging and verification
- Helps users confirm the reward system is functioning correctly
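
A rough sketch of the kind of output formatting involved is shown below. It is not the code from local_coding_env.py. The `format_step` helper is hypothetical, and the result shape (`result.observation.stdout`, `result.observation.exit_code`, `result.reward`) is assumed from the sample output in the Testing section; a `SimpleNamespace` stands in for the real step result so the snippet runs without Docker.

```python
from types import SimpleNamespace


def format_step(result) -> str:
    """Render one step result, making a missing reward visible."""
    obs = result.observation
    # A reward of None means the transform pipeline did not compute one.
    reward = "None (not computed)" if result.reward is None else str(result.reward)
    return "\n".join(
        [
            f"      → stdout: {obs.stdout.strip()}",
            f"      → exit_code: {obs.exit_code}",
            f"      → reward: {reward}",
        ]
    )


# Stand-in for the object returned by the environment's step call (assumed shape).
fake_result = SimpleNamespace(
    observation=SimpleNamespace(stdout="Hello, World!\n", exit_code=0),
    reward=0.1,
)
print(format_step(fake_result))
```

Printing an explicit marker for `None` makes a regression of the reward bug easy to spot when running the example by hand.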

albertodepaola added 6 commits December 2, 2025 15:25

Adding support for in-repo development with the coding env

6064aa9

updating package version and imports for relative .models import

34d1da9

Updating readme, still needs work

96ba203

Adding proper cleanup logic to remove the container every time

f73c70a

fixing code not being passed to transform for reward calculation

b63e61f

cleanup readme from useless hallucinations

e55a2a8

meta-cla bot added the CLA Signed label Dec 4, 2025

albertodepaola added 3 commits December 4, 2025 15:21

Adding unit tests with pytests, requires package install

386d433

Adding dev dependency to toml file for source installation

26928dc

updating readme with pytest dev only dependency. Bumping minor version.

469cf1a

albertodepaola mentioned this pull request Dec 8, 2025

Openenv integration beto HamidShojanazeri/agentbeats#1

Open
