```
██╗      ██████╗  ██████╗██╗     ██╗
██║     ██╔═══██╗██╔════╝██║     ██║
██║     ██║   ██║██║     ██║     ██║
██║     ██║   ██║██║     ██║     ██║
███████╗╚██████╔╝╚██████╗███████╗██║
╚══════╝ ╚═════╝  ╚═════╝╚══════╝╚═╝
```
Fine-tune LLMs locally with AI-optimized defaults
by t21.dev
LoCLI makes fine-tuning LLMs accessible to developers. Just point it at your dataset and go.
- Multiple Model Families - Llama, Mistral, Qwen, Phi from HuggingFace
- LoRA & QLoRA - Fine-tune on consumer GPUs (6GB+ VRAM)
- AI-Optimized Defaults - Analyzes your dataset and suggests hyperparameters
- Interactive CLI - Guided step-by-step setup
- Export Options - LoRA adapters, merged models, GGUF for Ollama
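For context on the export formats: a LoRA adapter is the small set of trained weight deltas, while a merged model folds those deltas back into the base weights so the result runs without PEFT. With the PEFT library that merge looks roughly like this (a generic sketch, not LoCLI's internal code; the model name and paths are placeholders):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, attach the trained LoRA adapter, then merge the
# adapter weights into the base weights to produce a standalone model.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
model = PeftModel.from_pretrained(base, "./output/my-model")  # adapter dir
merged = model.merge_and_unload()                             # fold deltas in
merged.save_pretrained("./output/my-model-merged")
```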
```bash
# Clone the repository
git clone https://github.com/t21dev/locli.git
cd locli

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

Optional extras:

```bash
# For AI-powered suggestions (requires OpenAI API key)
pip install openai

# For training charts and visualizations
pip install matplotlib

# For GGUF export
pip install llama-cpp-python
```

Note: The default OpenAI model is `gpt-4.1-mini`. Make sure you have access to this model enabled in your OpenAI developer account. You can change the model in `.env` by setting `OPENAI_MODEL=gpt-4o` or another supported model.
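For reference, an AI-powered suggestion feature like this amounts to a single chat-completion call with the dataset stats in the prompt. A minimal sketch (the `suggest_hyperparameters` helper and prompt wording are illustrative, not LoCLI's actual internals):

```python
import os

from openai import OpenAI  # pip install openai

def suggest_hyperparameters(dataset_summary: str) -> str:
    """Ask the configured OpenAI model for hyperparameter suggestions.

    Hypothetical helper for illustration; LoCLI's real prompt differs.
    """
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    model = os.environ.get("OPENAI_MODEL", "gpt-4.1-mini")
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a fine-tuning assistant."},
            {"role": "user", "content": f"Suggest LoRA hyperparameters for:\n{dataset_summary}"},
        ],
    )
    return response.choices[0].message.content

print(suggest_hyperparameters("1,200 chat examples, avg 350 tokens, 7B target model"))
```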
Copy `.env.example` to `.env` and configure:

```bash
cp .env.example .env
```

Edit `.env`:

```
# For gated models (Llama, etc.)
HF_TOKEN=hf_your_token_here

# For AI-powered suggestions (optional)
OPENAI_API_KEY=sk-your_key_here
```

```bash
# Start the training wizard
python app.py train
```

The interactive wizard guides you through:
- Step 1: Dataset → Enter path, validate, show stats
- Step 2: Model → Choose HuggingFace model
- Step 3: Method → LoRA or QLoRA (auto-recommended)
- Step 4: Parameters → AI-suggested or custom
- Step 5: Output → Choose output directory
- Summary → Review and start training
All commands are interactive and will prompt for required inputs:
```bash
python app.py train          # Training wizard
python app.py analyze        # Analyze dataset & get suggestions
python app.py export         # Export model (LoRA/merged/GGUF)
python app.py test           # Interactive chat with trained model
python app.py stats         # View training metrics and charts
python app.py models list    # List supported model families
python app.py models search  # Search HuggingFace models
python app.py models info    # Show model details & VRAM requirements
python app.py info           # Check GPU, VRAM, CUDA status
```

| Model Size | Method | Min VRAM |
|---|---|---|
| 3B | QLoRA | 4GB |
| 3B | LoRA | 8GB |
| 7B | QLoRA | 6GB |
| 7B | LoRA | 14GB |
| 13B | QLoRA | 10GB |
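To see where your own GPU falls in this table, you can query free VRAM with PyTorch before choosing a method (a rough sketch; the thresholds simply mirror the table above):

```python
import torch

# Report free VRAM and the setups from the table above that fit in it.
if torch.cuda.is_available():
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    free_gb = free_bytes / 1024**3
    print(f"Free VRAM: {free_gb:.1f} GB of {total_bytes / 1024**3:.1f} GB")
    if free_gb >= 14:
        print("Fits: LoRA on 7B (and everything below)")
    elif free_gb >= 10:
        print("Fits: QLoRA on 13B")
    elif free_gb >= 8:
        print("Fits: LoRA on 3B")
    elif free_gb >= 6:
        print("Fits: QLoRA on 7B")
    elif free_gb >= 4:
        print("Fits: QLoRA on 3B")
    else:
        print("Below the 4 GB minimum; consider a smaller model")
else:
    print("No CUDA GPU detected")
```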
LoCLI supports JSONL files with these formats:
Chat format (recommended):
{"messages": [{"role": "system", "content": "You are helpful."}, {"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi!"}]}Instruction format:
{"instruction": "Write a greeting", "output": "Hello!"}Completion format:
{"prompt": "Hello", "completion": "World"}A sample dataset is included: sample.jsonl
Create `locli.yaml` for custom defaults:

```yaml
lora:
  r: 16
  lora_alpha: 32
training:
  learning_rate: 2e-4
  num_epochs: 3
  batch_size: 4
```

After training completes, LoCLI automatically saves training metrics and can generate visualizations.
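The `lora` values in the YAML above correspond to the standard PEFT `LoraConfig` fields. A sketch of that mapping, assuming LoCLI builds on PEFT as most LoRA/QLoRA tooling does (`target_modules` and `lora_dropout` here are illustrative, not LoCLI defaults):

```python
from peft import LoraConfig

# Roughly what the locli.yaml values above translate to in PEFT.
lora_config = LoraConfig(
    r=16,               # rank of the low-rank update matrices
    lora_alpha=32,      # scaling factor (effective scale = alpha / r)
    lora_dropout=0.05,  # illustrative; not set in the YAML above
    target_modules=["q_proj", "v_proj"],  # common choice for Llama-style models
    task_type="CAUSAL_LM",
)
```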
Chat with your trained model to evaluate its responses:
```bash
python app.py test
# → Enter path to trained model (e.g., ./output/my-model)
# → Start chatting! Type 'exit' to quit
```

View training metrics and generate charts:

```bash
python app.py stats
# → Enter path to training output directory
# → View loss curves, learning rate schedule, and summary
```

Charts generated (requires `pip install matplotlib`):
- `loss_chart.png` - Training loss over steps with eval loss overlay
- `learning_rate_chart.png` - Learning rate schedule visualization
- `epoch_loss_chart.png` - Average loss per epoch
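If you prefer to plot metrics yourself, similar charts can be drawn from a Hugging Face Trainer-style `trainer_state.json` log. A sketch assuming LoCLI's output directory contains such a file (adjust the path and keys if its log format differs):

```python
import json

import matplotlib.pyplot as plt

# Assumes a HF Trainer-style log; LoCLI's exact file layout may differ.
with open("./output/my-model/trainer_state.json", encoding="utf-8") as f:
    history = json.load(f)["log_history"]

steps = [e["step"] for e in history if "loss" in e]
losses = [e["loss"] for e in history if "loss" in e]

plt.plot(steps, losses, label="train loss")
plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.savefig("loss_chart.png")
```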
- Python 3.10+
- NVIDIA GPU with CUDA support
- 4GB+ VRAM (QLoRA with 3B models) / 6GB+ for 7B models
The default pip install may install CPU-only PyTorch. For GPU training, install PyTorch with CUDA:
```bash
# Check if CUDA is working
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

# If False, reinstall PyTorch with CUDA
pip uninstall torch torchvision torchaudio -y
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```

For RTX 40 series, use CUDA 12.4 (cu124). For older GPUs, use cu118 or cu121.
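If you are unsure which wheel to pick, the GPU's compute capability is the deciding factor. A quick check (the wheel notes in the comments are a rough guide, not an exhaustive table):

```python
import torch

# Print the compute capability so you can match it to a CUDA wheel.
# Rough guide: sm_89 (RTX 40 series) -> cu124; older cards -> cu118/cu121;
# sm_120 (RTX 50 series) -> nightly cu128, per the section below.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"{torch.cuda.get_device_name(0)}: sm_{major}{minor}")
else:
    print("CUDA not available; check your PyTorch install")
```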
RTX 50 Series (5070, 5080, 5090): These GPUs use the new Blackwell architecture and require a PyTorch nightly build with CUDA 12.8+:

```bash
pip uninstall torch torchvision torchaudio -y

# Install torch only (torchvision/torchaudio not needed for LLM training)
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128
```

Verify it works:

```bash
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
```

If you still get "no kernel image available" errors:
The RTX 50 series (Blackwell/SM_120) is very new, and CUDA kernel support is still being added to PyTorch. If nightly builds don't work:

1. Check for newer nightlies - support is being actively added:

   ```bash
   pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128 --upgrade
   ```

2. Verify CUDA version - the RTX 50 series requires CUDA 12.8+:

   ```bash
   nvcc --version
   nvidia-smi
   ```

3. Check PyTorch build info:

   ```bash
   python -c "import torch; print(torch.__version__); print(torch.version.cuda)"
   ```

4. Temporary workaround - use CPU training (slow but works): the tool will detect missing CUDA and offer CPU fallback (see the sketch after this list).

5. Wait for stable release - PyTorch stable releases with full Blackwell support are expected in 2025.
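The CPU fallback in step 4 boils down to standard PyTorch device selection, a pattern like this sketch (LoCLI handles it internally):

```python
import torch

# Use CUDA when working kernels are available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Training on: {device}")  # CPU works but is much slower
```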
Gated models require HuggingFace authentication. If you get 401 or 403 errors:

- 403 Forbidden: Token exists but lacks permissions → Create a new token with read access
- 401 Unauthorized: Token invalid/missing → Re-run `huggingface-cli login`
Step 1: Create a Fine-Grained Token
Go to https://huggingface.co/settings/tokens and create a new token:
- Select "Fine-grained token"
- Name it (e.g., "llama-access")
- Under Permissions, select: Read access to contents of all public gated repos you can access
- Click "Create token"
- Copy the token (starts with `hf_`)
Step 2: Accept Meta's License
Visit https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct and click "Agree and access repository" button.
Step 3: Login with Your Token
```bash
huggingface-cli login
# When prompted, paste your token (it won't show as you type)
```

Step 4: Verify Login

```bash
huggingface-cli whoami
# Should show your HF username
```

Step 5: Add Token to .env
```
HF_TOKEN=hf_your_token_here
```

Verification Checklist:

- Visited meta-llama repo page and clicked "Agree"
- Generated fine-grained token with gated repo read access
- Ran `huggingface-cli login` with new token
- Confirmed `huggingface-cli whoami` shows your username
- Set `HF_TOKEN` in `.env` file
- If still failing, try clearing cache: `rm -rf ~/.cache/huggingface/`
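You can also verify the token programmatically, which helps distinguish an invalid token from a missing one. A small sketch using `huggingface_hub` (installed alongside `huggingface-cli`):

```python
import os

from huggingface_hub import whoami
from huggingface_hub.utils import HfHubHTTPError

# Check the token from .env (or the cached login) against the HF API.
try:
    info = whoami(token=os.environ.get("HF_TOKEN"))
    print(f"Authenticated as: {info['name']}")
except HfHubHTTPError as err:
    print(f"Token check failed: {err}")  # 401 -> invalid or expired token
```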
```bash
# Run tests
pip install pytest
pytest tests/ -v

# Run linting
pip install ruff
ruff check src tests
```

DocSet Gen transforms documentation into LLM training datasets: generate JSONL training data from your docs, then fine-tune with LoCLI.
```bash
# Generate dataset from docs
docset-gen ./docs --output training_data.jsonl

# Fine-tune with LoCLI
python app.py train
# → Enter training_data.jsonl when prompted
```

MIT License - see LICENSE
Created by @TriptoAfsin | t21.dev
