This repo contains a single training script `train.py` that:
- Loads a base model (e.g. Qwen3)
- Streams `AI-MO/NuminaMath-CoT`
- Injects `<|jeton|>` tokens before each solution paragraph (optionally merging short paragraphs into larger blocks)
- Trains with NLL + LeJEPA-style latent losses (SIGReg if `lejepa` is installed)
- Uses Adafactor with micro-batch size 1 (and optional gradient accumulation)
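The `<|jeton|>` injection step can be sketched as follows. Note that the helper name `inject_jetons` and the character-count merge rule are illustrative assumptions, not `train.py`'s actual API:

```python
# Sketch of the <|jeton|> injection described above; the merge threshold
# (min_chars) is a hypothetical stand-in for train.py's merging logic.
JETON = "<|jeton|>"

def inject_jetons(solution: str, min_chars: int = 0) -> str:
    """Prefix each solution paragraph with a <|jeton|> token.

    Paragraphs shorter than `min_chars` are merged into the following
    block before the token is inserted (assumed merge rule).
    """
    paragraphs = [p for p in solution.split("\n\n") if p.strip()]
    merged, buffer = [], ""
    for p in paragraphs:
        buffer = f"{buffer}\n\n{p}" if buffer else p
        if len(buffer) >= min_chars:
            merged.append(buffer)
            buffer = ""
    if buffer:  # a trailing short paragraph joins the last block
        if merged:
            merged[-1] = f"{merged[-1]}\n\n{buffer}"
        else:
            merged.append(buffer)
    return "".join(JETON + p for p in merged)

text = "Step one.\n\nStep two.\n\nDone."
print(inject_jetons(text))  # three paragraphs -> three <|jeton|> markers
```

With `min_chars=0` every paragraph keeps its own token; a large threshold collapses the whole solution into one `<|jeton|>` block.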
Install deps (once):

```shell
/venv/main/bin/pip install -U pip
/venv/main/bin/pip install -r requirements.txt
```

Run:
```shell
/venv/main/bin/python train.py \
  --model_name_or_path "unsloth/Qwen3-30B-A3B" \
  --output_dir "./out" \
  --max_steps 1000 \
  --grad_accum 1
```

Run the tests with:

```shell
/venv/main/bin/python -m pytest
```

Notes:
- `--load_in_4bit` is supported for loading, but full fine-tuning quantized weights generally won't work without adapters. This script is written for full fine-tuning.
- If `lejepa` can't be imported, `train.py` uses a lightweight isotropy regularizer as a fallback.