Reusable building blocks for running LLM pre-training and post-training experiments with modern tooling and documentation.
- Modular package layout with `pretrain_llm` and `posttrain_llm` subpackages
- Reusable trainer packages: `sft_trainer` (Supervised Fine-Tuning) and `dpo_trainer` (Direct Preference Optimization)
- Hydra-ready configs and Typer CLI entry points for reproducible experiments (see the sketch after this list)
- Apple Silicon & Linux friendly environment managed by `mamba` and editable installs
- Developer experience powered by `ruff`, `black`, `isort`, `mypy`, `pytest`, and `pre-commit`
- Comprehensive docs with tutorials, explainers, and research notes
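The CLI entry points follow the standard Typer pattern. Below is a minimal, hypothetical sketch of how a `train` command such as the one behind `python -m sft_trainer` is typically wired up; the option names and defaults are illustrative, and the real command structure lives in the package itself.

```python
# Hypothetical Typer-based entry point (NOT the repository's actual __main__.py);
# it only illustrates the `python -m <package> train ...` pattern.
import typer

app = typer.Typer(help="Example trainer CLI")

@app.command()
def train(
    model: str = typer.Option(..., help="Model name or path"),
    dataset: str = typer.Option(..., help="Dataset name on the Hugging Face Hub"),
    peft: str = typer.Option("lora", help="PEFT method to apply"),
) -> None:
    """Kick off a training run with the given model, dataset, and PEFT method."""
    typer.echo(f"Training {model} on {dataset} with PEFT={peft}")

if __name__ == "__main__":
    app()  # placing this call in __main__.py enables `python -m <package> ...`
```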
The `sft_trainer` package provides Supervised Fine-Tuning with PEFT support (LoRA, DoRA, QLoRA, etc.).
```bash
# CLI usage
python -m sft_trainer train --model HuggingFaceTB/SmolLM2-135M --dataset banghua/DL-SFT-Dataset --peft lora
```

```python
# Python API
from sft_trainer import SFTTrainerWrapper, PEFTConfig

trainer = SFTTrainerWrapper(
    model_name="HuggingFaceTB/SmolLM2-135M",
    dataset_name="banghua/DL-SFT-Dataset",
    peft_config=PEFTConfig.from_preset("lora_default"),
)
trainer.train()
```

The `dpo_trainer` package provides Direct Preference Optimization for alignment without reward models.
```bash
# CLI usage
python -m dpo_trainer train --model HuggingFaceTB/SmolLM2-135M-Instruct --dataset banghua/DL-DPO-Dataset

# Identity shift training
python -m dpo_trainer identity-shift --model HuggingFaceTB/SmolLM2-135M-Instruct --original-name Qwen --new-name "Deep Qwen"
```

```python
# Python API
from dpo_trainer import DPOTrainerWrapper, build_identity_shift_dataset

trainer = DPOTrainerWrapper(
    model_name="HuggingFaceTB/SmolLM2-135M-Instruct",
    dataset_name="banghua/DL-DPO-Dataset",
)
trainer.train()
```

Repository layout:

```
llm-lab/
├── src/llm_lab/              # Python package (installed in editable mode)
│   ├── pretrain_llm/         # Pre-training utilities
│   └── posttrain_llm/        # Post-training + alignment utilities
├── pretrain_llm/             # Pre-training lessons and notebooks
│   ├── Lesson_1-3.ipynb      # Pre-training tutorials
│   └── docs/                 # Pre-training documentation
├── posttrain_llm/            # Post-training packages and lessons
│   ├── sft_trainer/          # SFT package with PEFT support
│   ├── dpo_trainer/          # DPO package for preference optimization
│   ├── llm_eval/             # LLM evaluation utilities
│   ├── L3/, L5/, L7/         # Lesson notebooks
│   └── M1/                   # Module 1 materials
├── examples/                 # Example scripts and workflows
├── tests/                    # Pytest-based test suite
├── docs/                     # Documentation portal
│   ├── setup/                # Installation guides
│   ├── workflows/            # Document workflows
│   └── LLM/                  # LLM research notes
├── dev/                      # Development notes and explainers
├── pyproject.toml            # Package metadata and tooling config
├── environment.yml           # Mamba environment specification
└── requirements.txt          # Pip installation manifest
```
- Install the environment with mamba:

  ```bash
  mamba env create -f environment.yml
  mamba activate llm-lab
  ```

- Install the package in editable mode:

  ```bash
  pip install -e .[dev]
  ```

- Run the test suite:

  ```bash
  pytest
  ```

- Try a trainer package:

  ```bash
  # Test DPO trainer CLI
  python -m dpo_trainer --help

  # Test SFT trainer CLI
  python -m sft_trainer --help
  ```
Complete documentation is available in the `docs/` directory:
- Documentation Portal - Main entry point for all documentation
- Quick Start - Get up and running in 5 minutes
- Setup Guides - Installation, environment, LaTeX, dependencies
- Workflows - Document creation, markdown→PDF conversion
- LLM Research - Technical notes on architectures, memory mechanisms, training
- SFT Trainer - Supervised Fine-Tuning with PEFT
- DPO Trainer - Direct Preference Optimization
- DPO Explainer - Technical deep-dive into DPO
- DPO for Computational Biology - DPO applications in biology
- Pre-training Guide - LLM pre-training tutorials
The `sft_trainer` package provides:

- Full fine-tuning and PEFT methods (LoRA, DoRA, QLoRA, VeRA, AdaLoRA, IA3, Prompt/Prefix Tuning); a LoRA sketch follows this list
- CLI and Python API
- HuggingFace integration
- GPU/MPS/CPU support
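As a rough illustration of what a preset such as `lora_default` could expand to, here is a minimal LoRA setup written directly against the Hugging Face `peft` library; the rank, scaling, and target modules are assumptions for illustration, not the package's actual defaults.

```python
# Minimal LoRA sketch using Hugging Face `peft` directly
# (illustrative values, not sft_trainer's actual preset).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")
lora = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()         # only the adapter weights are trainable
```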
The `dpo_trainer` package provides:

- Direct Preference Optimization training
- Identity shift training (change the model's self-identification)
- Custom preference dataset builders (see the preference-pair sketch after this list)
- Model comparison utilities
- CLI commands: `train`, `identity-shift`, `test`, `compare`, `info`
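For orientation, the snippet below sketches the preference-pair layout commonly used for DPO training. The `prompt`/`chosen`/`rejected` column names follow TRL's convention and are an assumption about what `build_identity_shift_dataset` produces; the example row is invented for illustration, not taken from the bundled dataset.

```python
# Sketch of a DPO preference pair in the TRL-style "prompt"/"chosen"/"rejected"
# layout; the row is illustrative, not from banghua/DL-DPO-Dataset.
from datasets import Dataset

pairs = Dataset.from_list([
    {
        "prompt": "Who are you?",
        "chosen": "I am Deep Qwen, a helpful assistant.",   # preferred response
        "rejected": "I am Qwen, a helpful assistant.",      # dispreferred response
    },
])
print(pairs[0])
```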
- Browse `examples/` for runnable scripts
- Try the trainer packages with `python -m sft_trainer --help` or `python -m dpo_trainer --help`
- Follow setup guides in `docs/setup/` for detailed installation
- Read technical content in `docs/LLM/` for research notes
- Explore lesson notebooks in `pretrain_llm/` and `posttrain_llm/`