# Clone and install in editable mode with dev dependencies
git clone https://github.com/sacredvoid/alignrl.git
cd alignrl
pip install -e ".[dev]"
This installs pytest, ruff, and mypy alongside the base package. You do not need GPU dependencies to run tests or linting.
# Run all tests
pytest
# Run a specific test file
pytest tests/test_rewards.py
# Run with verbose output
pytest -v
Tests live in tests/ and use pytest. GPU-dependent tests are skipped automatically if CUDA is not available.
The test suite covers:
- Reward functions (answer extraction, math verification, format checking)
- Config loading and validation
- Result type serialization
- Runner initialization (not full training, since that requires a GPU)
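To give a feel for the CPU-only tests listed above, here is a minimal sketch of an answer-extraction test. The extract_answer helper and the <answer>...</answer> tag format are assumptions made for illustration; the real functions in rewards.py may use different names and a different answer format.

```python
import re
from typing import Optional

# Hypothetical stand-in defined inline so the sketch runs on its own.
# The actual helper in alignrl's rewards.py may differ.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(completion: str) -> Optional[str]:
    """Return the text inside the last <answer> tag, or None if there is none."""
    matches = ANSWER_RE.findall(completion)
    return matches[-1].strip() if matches else None


# pytest collects any function whose name starts with test_ and treats a
# failing assert as a test failure. None of these need a GPU.
def test_extract_answer_finds_tagged_value():
    assert extract_answer("some reasoning <answer> 42 </answer>") == "42"


def test_extract_answer_returns_none_without_tag():
    assert extract_answer("no final answer here") is None


def test_extract_answer_uses_last_tag():
    assert extract_answer("<answer>1</answer> then <answer>2</answer>") == "2"
```

Saved under tests/, a file like this would be picked up by a plain pytest run.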
The project uses Ruff for linting and formatting.
# Check for lint errors
ruff check .
# Auto-fix what can be fixed
ruff check . --fix
# Format code
ruff format .
Configuration is in pyproject.toml:
- Target: Python 3.10
- Line length: 100
- Enabled rules: E, F, I, N, UP, B, SIM, TCH
Type checking with mypy:
mypy src/alignrl
alignrl/
  src/alignrl/       # Package source
    config.py        # BaseTrainConfig (Pydantic)
    sft.py           # SFTConfig, SFTRunner
    grpo.py          # GRPOConfig, GRPORunner
    dpo.py           # DPOConfig, DPORunner
    eval.py          # EvalConfig, EvalRunner, compare_stages
    inference.py     # InferenceConfig, ModelServer
    rewards.py       # Reward functions for GRPO
    demo.py          # Gradio comparison UI
    cli.py           # CLI entry point
    types.py         # TrainResult, EvalResult, Trainer protocol
  configs/           # YAML configs for each training stage
  docs/              # Documentation and GitHub Pages dashboard
  examples/          # Runnable example scripts
  notebooks/         # Colab-ready Jupyter notebooks
  tests/             # Test suite
  pyproject.toml     # Build config and dependencies
To add a new post-training technique (e.g., KTO, SPIN, or ORPO):
- Create the module at src/alignrl/<technique>.py. Follow the existing pattern:
  - A config class extending BaseTrainConfig with technique-specific fields
  - A runner class with train() -> TrainResult, save(), and load() methods
  - The runner should implement the Trainer protocol from alignrl.types
- Register it in the CLI. Add a new branch in cmd_train() in cli.py:
      elif stage == "your_technique":
          from alignrl.your_technique import YourConfig, YourRunner
          config = YourConfig.from_yaml(config_path)
          runner = YourRunner(config)
  Then add the stage name to the choices list in the argparser.
- Add a YAML config in configs/your_technique.yaml.
- Write tests in tests/test_your_technique.py. At minimum, test config construction and validation. If you can test the runner logic without a GPU, do that too.
- Add an example script in examples/.
- Add a notebook in notebooks/ if the technique warrants a walkthrough.
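The module-and-tests pattern described above might look roughly like the sketch below. It is self-contained, so BaseTrainConfig, TrainResult, and Trainer are redefined here as simplified stand-ins (plain dataclasses rather than Pydantic models); in a real module you would import them from alignrl.config and alignrl.types. Every field name, the beta parameter, and the path arguments on save() and load() are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


# --- Stand-ins (assumed shapes; the real alignrl classes may differ) ---
@dataclass
class BaseTrainConfig:
    model_name: str = "your-base-model"  # hypothetical field
    output_dir: str = "outputs"          # hypothetical field


@dataclass
class TrainResult:
    output_dir: str                      # hypothetical field
    metrics: dict = field(default_factory=dict)


class Trainer(Protocol):
    def train(self) -> TrainResult: ...
    def save(self, path: str) -> None: ...
    def load(self, path: str) -> None: ...


# --- What src/alignrl/your_technique.py could contain ---
@dataclass
class YourConfig(BaseTrainConfig):
    beta: float = 0.1  # hypothetical technique-specific field

    def __post_init__(self) -> None:
        # Validate at construction time so bad configs fail early
        if self.beta <= 0:
            raise ValueError("beta must be positive")


class YourRunner:
    def __init__(self, config: YourConfig) -> None:
        self.config = config
        self.checkpoint_path: Optional[str] = None

    def train(self) -> TrainResult:
        # Real training would go here; the sketch only reports the config value
        return TrainResult(self.config.output_dir, {"beta": self.config.beta})

    def save(self, path: str) -> None:
        self.checkpoint_path = path  # placeholder: remember where we "saved"

    def load(self, path: str) -> None:
        self.checkpoint_path = path  # placeholder: remember what we "loaded"


# --- What tests/test_your_technique.py could contain (no GPU needed) ---
def test_config_rejects_non_positive_beta():
    try:
        YourConfig(beta=0.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for beta=0.0")


def test_runner_returns_train_result():
    runner: Trainer = YourRunner(YourConfig(beta=0.5))
    assert runner.train().metrics == {"beta": 0.5}
```

Annotating the runner as Trainer in the second test lets mypy check that it structurally satisfies the protocol, with no inheritance required.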
- Fork the repo and create a branch from main
- Make your changes
- Run ruff check . and ruff format . to ensure code style compliance
- Run pytest to make sure tests pass
- Write a clear PR description explaining what changed and why
- Submit the PR against main
Keep PRs focused. One feature or fix per PR. If your change touches multiple modules, explain the connection in the PR description.