This document describes the Docker-based testing infrastructure for the MBARC Atari project.
The testing infrastructure uses:
- Docker for reproducible, isolated test environments
- uv for fast, reliable Python dependency management
- pytest for test execution and organization
- Python 3.12 with modern dependencies
docker build -f Dockerfile.test -t mbarc-test .
First build time: ~5-8 minutes (downloads dependencies)
Subsequent builds: ~30 seconds (uses Docker cache)
docker run --rm mbarc-test
# Run only fast tests (exclude slow tests)
docker run --rm mbarc-test pytest tests/ -m "not slow" -v
# Run only integration tests
docker run --rm mbarc-test pytest tests/ -m "integration" -v
# Run a specific test file
docker run --rm mbarc-test pytest tests/integration/test_training_poc.py -v
# Run a specific test function
docker run --rm mbarc-test pytest tests/integration/test_training_poc.py::test_pytorch_installation -v
# Start an interactive shell in the container
docker run --rm -it mbarc-test /bin/bash
# Inside the container, you can run:
pytest tests/ -v # Run tests
pytest tests/ -v -s # Run tests with print output
pytest tests/ -k "pytorch" # Run tests matching pattern
pytest tests/ --lf # Run only last failed tests
python -m mbarc --help # Run the application
If you want to test local changes without rebuilding the Docker image:
docker run --rm -v $(pwd):/app mbarc-test pytest tests/ -v
Note: This mounts your local code into the container, so changes are immediately reflected.
mbarc_atari/
├── Dockerfile.test # Docker image for testing
├── pyproject.toml # Project metadata and dependencies (uv format)
├── pytest.ini # Pytest configuration
├── .dockerignore # Files to exclude from Docker build
├── tests/
│   ├── __init__.py
│   ├── conftest.py # Shared pytest fixtures
│   └── integration/
│       ├── __init__.py
│       └── test_training_poc.py # Integration tests
├── mbarc/ # Source code
└── atari_utils/ # Utilities
These tests validate that all dependencies work correctly together:
- ✅ test_pytorch_installation - PyTorch installation and basic operations
- ✅ test_gymnasium_installation - Gymnasium with Atari support
- ✅ test_create_simple_environment - CartPole environment creation
- ✅ test_create_atari_environment - Atari (Pong) environment creation
- ✅ test_opencv_installation - OpenCV operations
- ✅ test_core_imports - MBARC module imports
- ✅ test_torch_and_numpy_compatibility - PyTorch/NumPy interop
- ✅ test_simple_environment_rollout - Complete environment rollout
- ✅ test_torch_basic_neural_network - Neural network creation
- ✅ test_dependencies_versions - Dependency version checks
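As an illustration of what a check like test_dependencies_versions could do, here is a minimal sketch that compares installed package versions against minimums. The minimum versions are the ones listed in the dependency changes later in this document; the helper names `parse_version` and `check_versions` are hypothetical and are not taken from the project's test file.

```python
from importlib import metadata

# Minimum versions as stated in this document's dependency list.
MINIMUM_VERSIONS = {
    "gymnasium": "0.29.0",
    "torch": "2.1.0",
    "numpy": "1.24.0",
}


def parse_version(text):
    """Turn '2.1.0+cu121' into (2, 1, 0), ignoring local and pre-release suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '2.1' compares equal to '2.1.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_versions(minimums=MINIMUM_VERSIONS):
    """Return {package: problem} for packages that are missing or too old."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if parse_version(installed) < parse_version(minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems
```

A test could then simply assert that `check_versions()` returns an empty dictionary. The hand-rolled parser is deliberately simple; a real test suite might prefer a dedicated version-parsing library.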
Available pytest fixtures:
- test_device - Returns CPU or CUDA device for testing
- minimal_config - Minimal configuration for MBARC components
- temp_model_dir - Temporary directory for model saving/loading
- set_random_seeds - Auto-applied fixture for reproducible tests
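The logic behind fixtures such as test_device and set_random_seeds might look roughly like the sketch below. This is an assumption about how they could work, not the contents of the project's conftest.py: the helper names `resolve_device` and `seed_everything` are hypothetical, and reading the choice from the DEVICE environment variable is inferred from the `-e DEVICE=cuda` example in this document.

```python
import os
import random


def resolve_device(requested=None, cuda_available=False):
    """Return 'cuda' only when it is both requested and available, else 'cpu'.

    `requested` defaults to the DEVICE environment variable (assumed to be
    how the container receives the choice via `-e DEVICE=cuda`).
    """
    if requested is None:
        requested = os.environ.get("DEVICE", "cpu")
    if requested.lower().startswith("cuda") and cuda_available:
        return "cuda"
    return "cpu"


def seed_everything(seed=0):
    """Seed Python's RNG, plus NumPy and PyTorch when they are importable.

    Returns the list of libraries that were seeded.
    """
    random.seed(seed)
    seeded = ["random"]
    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        seeded.append("torch")
    except ImportError:
        pass
    return seeded
```

In a conftest.py, a fixture could wrap `resolve_device(cuda_available=torch.cuda.is_available())`, and an auto-applied fixture could call `seed_everything` before each test. Falling back to CPU when CUDA is requested but unavailable keeps the suite runnable on machines without a GPU.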
Use markers to categorize and select tests:
# Available markers
pytest -m "integration" # Run integration tests only
pytest -m "slow" # Run slow tests only
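pytest -m "not slow" # Skip slow tests
docker run --rm mbarc-test pytest tests/ --cov=mbarc --cov=atari_utils --cov-report=html
The coverage report is generated in the htmlcov/ directory inside the container. Because --rm removes the container on exit, mount a volume if you need to keep the report.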
# Show local variables in traceback
docker run --rm mbarc-test pytest tests/ -v --tb=long -l
# Drop into debugger on failure
docker run --rm -it mbarc-test pytest tests/ --pdb
# Show print statements
docker run --rm mbarc-test pytest tests/ -v -s
# Override default timeout (600s)
docker run --rm mbarc-test pytest tests/ --timeout=300
# Disable timeout for debugging
docker run --rm -it mbarc-test pytest tests/ --timeout=0
# Use CUDA device (if available in Docker)
docker run --rm --gpus all -e DEVICE=cuda mbarc-test
# Set custom environment variables
docker run --rm -e CUSTOM_VAR=value mbarc-test pytest tests/ -v
name: Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build test image
        run: docker build -f Dockerfile.test -t mbarc-test .
      - name: Run tests
        run: docker run --rm mbarc-test pytest tests/ -v --tb=short
      - name: Run tests with coverage
        run: |
          docker run --rm -v $(pwd)/coverage:/app/coverage mbarc-test \
            pytest tests/ --cov=mbarc --cov=atari_utils --cov-report=xml:/app/coverage/coverage.xml
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage/coverage.xml

- gym==0.15.7 → Now using gymnasium>=0.29.0
- torch==1.7.1 → Now using torch>=2.1.0
- numpy==1.16.4 → Now using numpy>=1.24.0
- Python 3.7 → Now using Python 3.12
This means:
- The existing MBARC code will not run without modifications to support the new gymnasium API
- The integration tests validate the infrastructure (Docker, uv, pytest, dependencies)
- To run the actual training code, you'll need to update the codebase to use gymnasium's API
To make the existing code work with modern dependencies:
- Replace gym imports with gymnasium
- Update environment creation: gym.make("PongNoFrameskip-v4") → gym.make("ALE/Pong-v5")
- Update step API: obs, reward, done, info → obs, reward, terminated, truncated, info
- Update reset API: obs = env.reset() → obs, info = env.reset()
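The step and reset changes listed above can be sketched as follows. To stay runnable without any dependencies, the sketch uses a hypothetical stand-in class, `CountdownEnv`, that only mimics the gymnasium method signatures; `rollout` and `legacy_step` are illustrative helper names and are not part of MBARC or gymnasium.

```python
def rollout(env, policy, max_steps=1000, seed=None):
    """Run one episode with the five-value step API and return the total reward."""
    # New reset API: returns (obs, info) instead of obs alone.
    obs, info = env.reset(seed=seed)
    total_reward = 0.0
    for _ in range(max_steps):
        # New step API: `done` is split into `terminated` and `truncated`.
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        if terminated or truncated:
            break
    return total_reward


def legacy_step(env, action):
    """Adapter for code not yet migrated: collapse five values into the old four."""
    obs, reward, terminated, truncated, info = env.step(action)
    return obs, reward, terminated or truncated, info


class CountdownEnv:
    """Hypothetical stand-in for a gymnasium environment (not a library class)."""

    def __init__(self, length=3):
        self.length = length
        self.remaining = length

    def reset(self, seed=None, options=None):
        self.remaining = self.length
        return self.remaining, {}

    def step(self, action):
        self.remaining -= 1
        terminated = self.remaining == 0
        return self.remaining, 1.0, terminated, False, {}
```

With the real library, `env` would come from the environment creation call shown in the list. Collapsing `terminated` and `truncated` into a single flag, as `legacy_step` does, is convenient during migration but discards the distinction between an episode that ended and one that was cut off, so training code that bootstraps value targets may want to keep the flags separate. The v5 Atari environments may also use different defaults (for example, frame skipping) than the NoFrameskip-v4 variants, so it is worth checking the environment documentation when switching IDs.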
If you prefer to develop without Docker:
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or via pip
pip install uv
# Install all dependencies
uv sync
# Activate virtual environment
source .venv/bin/activate # Linux/macOS
# or
.venv\Scripts\activate # Windows
# Run tests
pytest tests/ -v
# Clean Docker cache and rebuild
docker system prune -a # Caution: removes all unused images, stopped containers, and build cache, not only this project's
docker build --no-cache -f Dockerfile.test -t mbarc-test .
# Increase timeout or run without slow tests
docker run --rm mbarc-test pytest tests/ -m "not slow" --timeout=1200
The Dockerfile automatically installs Atari ROMs via gymnasium[atari,accept-rom-license]. If you see ROM errors:
# Rebuild the image to ensure ROMs are installed
docker build --no-cache -f Dockerfile.test -t mbarc-test .
If you see import errors for mbarc or atari_utils:
# Make sure the project is installed in editable mode
docker run --rm -it mbarc-test /bin/bash
uv pip install -e .
Typical runtimes (on modern hardware):
- Docker build (first time): ~5-8 minutes
- Docker build (cached): ~30 seconds
- Full test suite: ~2-3 minutes
- Fast tests only (-m "not slow"): ~30 seconds
- Add more tests - Expand test coverage for specific components
- Update codebase - Migrate code to gymnasium API
- Add unit tests - Create tests/unit/ for individual component testing
- CI/CD integration - Set up automated testing on pull requests