
LLM

Complete implementations of large language models, including all sub-components, plus training and fine-tuning implementations (e.g., LoRA and QLoRA).

Repository Structure

finetuning/
├── lora/              # LoRA implementation & intuition
└── qlora/             # QLoRA implementation & intuition

models/
├── gpt/               # GPT-1 style implementation
└── llama/             # LLaMA-1/2 implementation
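The core idea behind finetuning/lora/ can be sketched in a few lines: LoRA freezes the base weight matrix W and learns only a low-rank update (alpha/r)·B·A, with B initialized to zeros so training starts from the pretrained model exactly. The helper names below are illustrative, not the repo's actual API:

```python
def matmul(A, B):
    """Multiply two matrices represented as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_weight(W, A, B, alpha=16, r=2):
    """Effective weight: W + (alpha / r) * B @ A.

    W stays frozen; only the small factors A (r x d_in) and
    B (d_out x r) would receive gradients during fine-tuning.
    """
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, BA)]

# With B zero-initialized, the adapted weight equals W at step 0.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5]]             # r x d_in, with r = 1
B = [[0.0], [0.0]]           # d_out x r, zero-initialized
assert lora_weight(W, A, B, alpha=1, r=1) == W
```

Because r is much smaller than the weight dimensions, only the A and B factors need to be stored and updated, which is what makes LoRA (and its quantized variant QLoRA) cheap to train.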

What's Implemented

GPT (models/gpt/):

  • Multi-head self-attention with causal masking
  • Learned positional embeddings
  • LayerNorm, feedforward blocks
  • Training loop with loss estimation
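The causal masking in the first bullet can be sketched in plain Python (single head, no batching; function names are illustrative, not the repo's code):

```python
import math

def causal_attention(Q, K, V):
    """Scaled dot-product attention where position i only attends to
    positions j <= i (the causal mask). Q, K, V are lists of vectors,
    one per sequence position."""
    d = len(Q[0])
    out = []
    for i, q in enumerate(Q):
        # Only keys at positions 0..i are visible; later positions are masked.
        scores = [sum(a * b for a, b in zip(q, K[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        m = max(scores)                       # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]       # softmax over the visible positions
        out.append([sum(w * V[j][c] for j, w in enumerate(weights))
                    for c in range(len(V[0]))])
    return out
```

The first position can only attend to itself, so its output is always exactly V[0]; a full multi-head block would run this per head on projected slices and concatenate the results.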

LLaMA (models/llama/):

  • Multi-head attention with Rotary Position Embeddings (RoPE)
  • RMSNorm (instead of LayerNorm)
  • SwiGLU feedforward network
  • Top-p sampling for generation
  • SentencePiece tokenizer
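Top-p (nucleus) sampling from the list above can be sketched as follows: keep the smallest set of highest-probability tokens whose cumulative mass reaches p, then sample only within that set. Names are illustrative, not the repo's actual generate.py code:

```python
import random

def top_p_sample(probs, p=0.9, rng=None):
    """Sample a token index from `probs` (a probability distribution),
    restricted to the nucleus: the top tokens whose cumulative
    probability first reaches p."""
    rng = rng or random.Random()
    # Sort token indices by descending probability.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= p:      # nucleus is complete
            break
    # Sample from the kept set, implicitly renormalized by `total`.
    r = rng.random() * total
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

With a low p the tail of unlikely tokens is cut off entirely, which avoids the occasional nonsense tokens that plain temperature sampling can emit.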

Usage

GPT:

cd models/gpt
python train.py

LLaMA:

cd models/llama
python generate.py

Default Configurations

Parameter        GPT    LLaMA
Embedding dim    384    4096
Hidden dim       -      11008
Heads            6      32
Layers           6      32
Context length   256    2048
Dropout          0.2    0.0
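The defaults above can be collected into a small config object; the field names here are illustrative and may not match the repo's actual variables:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelConfig:
    n_embd: int               # embedding dimension
    n_head: int               # attention heads
    n_layer: int              # transformer blocks
    block_size: int           # context length
    dropout: float
    n_hidden: Optional[int] = None  # FFN hidden dim (LLaMA's SwiGLU size)

# Defaults from the table above (hypothetical names, values from the README).
GPT_CONFIG = ModelConfig(n_embd=384, n_head=6, n_layer=6,
                         block_size=256, dropout=0.2)
LLAMA_CONFIG = ModelConfig(n_embd=4096, n_head=32, n_layer=32,
                           block_size=2048, dropout=0.0, n_hidden=11008)
```

Note the scale gap: the GPT defaults describe a small trainable-from-scratch model, while the LLaMA defaults match the 7B-class architecture.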

References