Thank you for considering contributing to this research project! This document outlines the collaborative coding standards and workflow for contributing to this repository.
## Table of Contents
- Code of Conduct
- Getting Started
- Development Workflow
- Pull Request Process
- Verification Requirements
- Coding Standards
- Commit Message Guidelines
## Code of Conduct
We are committed to providing a welcoming and inclusive environment for all contributors. Please:
- Be respectful and considerate in all interactions
- Welcome newcomers and help them get started
- Focus on constructive feedback
- Respect differing viewpoints and experiences
- Accept responsibility and apologize for mistakes
## Getting Started
- Fork the repository and clone your fork locally
- Set up your development environment:

      uv venv
      source .venv/bin/activate
      uv pip install -e ".[dev]"
      pre-commit install

- Create a new branch for your feature or bugfix:

      git checkout -b feature/your-feature-name

## Development Workflow
- Ensure you're working on the latest code:

      git checkout main
      git pull origin main

- Create a new branch from main:

      git checkout -b feature/descriptive-name

- Write clean, documented code following our coding standards
- Add type hints to all function signatures
- Write or update tests for your changes
- Update documentation if you're changing functionality
- Run tests frequently to catch issues early:

      pytest tests/

Pre-commit hooks run automatically on every commit, but you can also run the checks manually:

    # Run all pre-commit checks
    pre-commit run --all-files

    # Run specific checks
    black src experiments tests
    isort src experiments tests
    flake8 --config .dev-config/.flake8 src experiments tests
    mypy --config-file .dev-config/mypy.ini src experiments
    pylint --rcfile .dev-config/.pylintrc src experiments
    pytest tests/

- Push your branch to your fork:

      git push origin feature/your-feature-name

- Open a Pull Request on GitHub with a clear title and description
- Fill out the PR template completely (see below)
## Pull Request Process
Every pull request MUST include the following sections:

````markdown
## Description
[Provide a clear and concise description of your changes]

## Motivation and Context
[Why is this change required? What problem does it solve?]

## Type of Change
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Experiment/research code

## Related Issues
[Link to related issues, e.g., "Closes #123" or "Related to #456"]

## Verification
**This section is MANDATORY for all PRs involving code changes.**

### Testing Performed
[Describe what testing you performed]

Example:
- Unit tests added/updated: [list test files]
- Integration tests run: [describe]
- Manual testing performed: [describe]

### Sample Data Runs
**For research/experiment code, you MUST provide evidence of runs on sample data.**

Provide:
1. **Command used to run**:
   ```bash
   python experiments/my_experiment.py --config example
   ```
2. **Sample output/results**:
   [Paste relevant output, metrics, or logs]
3. **Artifacts** (if applicable):
   - Attach screenshots of plots/visualizations
   - Link to W&B run: [wandb-run-url]
   - Configuration used: [link to config file or paste inline]
4. **Performance metrics** (if applicable):
   - Runtime: X seconds/minutes
   - Memory usage: Y GB
   - Accuracy/Loss: [relevant metrics]

### Environment
- Python version: [e.g., 3.9]
- Operating System: [e.g., Ubuntu 22.04, macOS 13]
- Key dependencies: [if relevant to the change]
````

Before submitting your PR, ensure:
- My code follows the project's coding standards
- I have added type hints to all function signatures
- I have written/updated tests that prove my fix/feature works
- All tests pass locally (`pytest tests/`)
- I have updated the documentation accordingly
- My changes generate no new warnings from linters
- I have added docstrings to all new functions/classes (Google style)
- I have included the Verification section with sample data runs (if applicable)
- I have run the code on sample/test data and verified it works
- Pre-commit hooks pass without errors
### Review Process
1. **Automated checks** will run via GitHub Actions
2. **At least one reviewer** must approve your PR
3. **Address all comments** from reviewers promptly
4. **Update your PR** based on feedback
## Verification Requirements
### For All Code Changes
- All tests must pass
- Code coverage should not decrease
- All linters must pass without errors
- Pre-commit hooks must succeed
### For Research/Experiment Code
**MANDATORY: You must demonstrate that your code runs successfully on sample data.**
This requirement ensures:
- The code is functional and not just syntactically correct
- Results are reproducible
- Other researchers can validate your approach
- The implementation matches the intended algorithm/method
#### What to Include
1. **Minimal working example**: Show the simplest way to run your code
2. **Sample dataset**: Use a small, representative dataset (can be synthetic)
3. **Expected behavior**: Document what the output should look like
4. **Screenshots/plots**: Visual evidence of results (if applicable)
5. **Metrics**: Quantitative results or validation metrics
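
For the sample dataset, a small seeded synthetic set is often enough. The following is a minimal sketch using only the standard library, assuming a two-class toy problem. The helper name `make_synthetic_samples` and its parameters are hypothetical and not part of this repository.

```python
import random
from typing import List, Tuple


def make_synthetic_samples(
    num_samples: int = 100, num_features: int = 4, seed: int = 0
) -> Tuple[List[List[float]], List[int]]:
    """Generate a small, reproducible two-class dataset (hypothetical helper)."""
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    features: List[List[float]] = []
    labels: List[int] = []
    for index in range(num_samples):
        label = index % 2  # alternate classes so both are equally represented
        center = 1.0 if label == 1 else -1.0
        features.append([rng.gauss(center, 0.5) for _ in range(num_features)])
        labels.append(label)
    return features, labels


features, labels = make_synthetic_samples(num_samples=100, seed=0)
```

Because the generator is seeded, reviewers who rerun the same call should get identical data, which makes the reported metrics easier to compare.
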
#### Example Verification Section
````markdown
## Verification

### Testing Performed
Added unit tests in `tests/test_new_model.py` covering:
- Model initialization
- Forward pass
- Loss calculation
- Parameter updates

All tests pass:
```bash
$ pytest tests/test_new_model.py -v
tests/test_new_model.py::test_model_init PASSED
tests/test_new_model.py::test_forward_pass PASSED
tests/test_new_model.py::test_loss_calculation PASSED
tests/test_new_model.py::test_parameter_update PASSED
```

### Sample Data Runs
Ran the new model on synthetic MNIST-like data (100 samples):

Command:
```bash
python experiments/train_new_model.py data.num_samples=100 training.epochs=5
```

Output:
```
Epoch 1/5: Loss=0.892, Accuracy=0.645
Epoch 2/5: Loss=0.543, Accuracy=0.782
Epoch 3/5: Loss=0.321, Accuracy=0.891
Epoch 4/5: Loss=0.198, Accuracy=0.934
Epoch 5/5: Loss=0.145, Accuracy=0.956
```

W&B Run: https://wandb.ai/project/runs/abc123

Configuration Used:
```yaml
model:
  name: "new_model_v1"
  hidden_size: 128
  dropout: 0.1
training:
  learning_rate: 0.001
  batch_size: 32
  epochs: 5
```

Performance:
- Training time: 2.3 minutes
- Memory usage: ~500MB
- Final validation accuracy: 95.6%

### Environment
- Python 3.10
- Ubuntu 22.04
- PyTorch 2.0.1
- CUDA 11.8
````

## Coding Standards
### Python Style
- **Line length**: Maximum 100 characters
- **Formatting**: Use `black` (automatically enforced)
- **Import sorting**: Use `isort` (automatically enforced)
- **Style guide**: Follow PEP 8 (enforced by `flake8`)
### Type Hints
All functions should include type hints:
```python
from typing import Any, Dict, List, Optional


def process_data(
    data: List[float],
    config: Dict[str, Any],
    threshold: Optional[float] = None,
) -> Dict[str, float]:
    """Process the input data according to configuration.

    Args:
        data: List of numerical values to process.
        config: Configuration dictionary with processing parameters.
        threshold: Optional threshold value for filtering.

    Returns:
        Dictionary containing processed results and metadata.
    """
    # Implementation here
    pass
```

### Docstrings
Use Google-style docstrings for all public functions and classes:

```python
from typing import Dict

import numpy as np


def calculate_metrics(predictions: np.ndarray, targets: np.ndarray) -> Dict[str, float]:
    """Calculate evaluation metrics for predictions.

    This function computes various metrics including accuracy, precision,
    recall, and F1 score for the given predictions and targets.

    Args:
        predictions: Array of predicted values with shape (n_samples,).
        targets: Array of ground truth values with shape (n_samples,).

    Returns:
        Dictionary containing metric names as keys and computed values.
        Keys include: 'accuracy', 'precision', 'recall', 'f1_score'.

    Raises:
        ValueError: If predictions and targets have different shapes.

    Example:
        >>> preds = np.array([0, 1, 1, 0])
        >>> targets = np.array([0, 1, 0, 0])
        >>> metrics = calculate_metrics(preds, targets)
        >>> print(metrics['accuracy'])
        0.75
    """
    # Implementation here
    pass
```

### Testing
- Write tests for all new functionality
- Aim for >80% code coverage
- Use descriptive test names
- Follow the Arrange-Act-Assert pattern

```python
import numpy as np

# calculate_metrics is the function from the docstring example above.


def test_calculate_metrics_with_perfect_predictions() -> None:
    """Test metric calculation when all predictions are correct."""
    # Arrange
    predictions = np.array([0, 1, 1, 0])
    targets = np.array([0, 1, 1, 0])

    # Act
    metrics = calculate_metrics(predictions, targets)

    # Assert
    assert metrics['accuracy'] == 1.0
    assert metrics['precision'] == 1.0
    assert metrics['recall'] == 1.0
```

### Experiment Tracking
For research experiments:
- Use Hydra for configuration management
- Log all experiments to W&B
- Include configuration files in `experiments/configs/`
- Document hyperparameters and their expected ranges
- Save model checkpoints and artifacts
- Log random seeds for reproducibility
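
Seeds are easier to log when a single helper both sets and returns them. The following is a minimal sketch using only the standard library. The name `seed_everything` is hypothetical. A real experiment would presumably also seed NumPy and PyTorch and pass the returned dictionary to the experiment logger.

```python
import os
import random
from typing import Dict, Optional


def seed_everything(seed: Optional[int] = None) -> Dict[str, int]:
    """Seed Python's RNG and return the seed so it can be logged (hypothetical helper)."""
    if seed is None:
        # Draw a fresh seed from the OS so unseeded runs can still be reproduced later.
        seed = int.from_bytes(os.urandom(4), "big")
    random.seed(seed)
    return {"seed": seed}


run_metadata = seed_everything(1234)
first_draw = random.random()

# Re-seeding with the logged value reproduces the same random stream.
seed_everything(run_metadata["seed"])
second_draw = random.random()
```

Returning the seed, rather than only setting it, means a run started without an explicit seed can still be reproduced from its logs.
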
## Commit Message Guidelines
Write clear, concise commit messages in the following format:

    <type>(<scope>): <subject>

    <body>

    <footer>

Use one of these types:
- feat: New feature
- fix: Bug fix
- docs: Documentation changes
- style: Code style changes (formatting, missing semicolons, etc.)
- refactor: Code refactoring
- test: Adding or updating tests
- chore: Maintenance tasks

Example of a feature commit:

    feat(experiments): add new attention mechanism for transformer

    Implemented multi-head attention with configurable number of heads
    and dropout. Includes unit tests and sample experiment.

    Related to #42

Example of a bug fix commit:

    fix(data): correct normalization in preprocessing pipeline

    The normalization was using incorrect mean/std values. Updated to
    use per-channel statistics for image data.

    Fixes #38

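
The header format can be checked mechanically before committing. The following is a minimal sketch, not a hook shipped with this repository. The names `COMMIT_TYPES`, `HEADER_PATTERN`, and `is_valid_header` are hypothetical, and treating the scope as optional is an assumption.

```python
import re

# Types listed in this guide; the scope is treated as optional here (an assumption).
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")
TYPE_ALTERNATION = "|".join(COMMIT_TYPES)
HEADER_PATTERN = re.compile(
    rf"^(?P<type>{TYPE_ALTERNATION})(?:\((?P<scope>[^()\s]+)\))?: (?P<subject>\S.*)$"
)


def is_valid_header(header: str) -> bool:
    """Return True if the first line of a commit message matches the format."""
    return HEADER_PATTERN.match(header) is not None


valid = is_valid_header("feat(experiments): add new attention mechanism for transformer")
invalid = is_valid_header("added attention")
```

Only the first line of the message is checked. The body and footer are free-form, so this sketch does not validate them.
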
## Questions
If you have questions about contributing:
- Check existing documentation
- Search closed issues and PRs
- Open a new issue with the "question" label
- Reach out to maintainers
## Recognition
All contributors will be acknowledged in the project documentation. Thank you for helping improve this research project!