Clean and comprehensible implementation of the Llama architecture.
Built on top of the ARENA 2.0 transformer implementation.
- Rotary positional embeddings
- Test that the grouped multi-query attention implementation is correct
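
As a rough illustration of the rotary positional embeddings item above, here is a minimal standard-library sketch. It is not taken from this repository's code. The function names `rope` and `dot`, the pairing of adjacent elements, and the base of 10000 are assumptions chosen for the example. It shows the property that makes rotary embeddings useful: the dot product of a rotated query and key depends only on their relative offset.

```python
import math


def rope(x, position, base=10000.0):
    """Apply a rotary embedding to one head vector (hypothetical helper).

    Assumes adjacent elements (x[0], x[1]), (x[2], x[3]), ... form the
    2D pairs that get rotated; other implementations split the vector
    into halves instead.
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("head dimension must be even")
    out = []
    for i in range(0, d, 2):
        # Pair i/2 rotates by position * base^(-i/d).
        theta = position * base ** (-i / d)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        out.append(x[i] * cos_t - x[i + 1] * sin_t)
        out.append(x[i] * sin_t + x[i + 1] * cos_t)
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [0.5, -1.0, 2.0, 0.0]
    # Both pairs of positions have the same offset (2), so the scores match.
    near = dot(rope(q, 3), rope(k, 1))
    far = dot(rope(q, 10), rope(k, 8))
    print(abs(near - far) < 1e-9)
```

A check like the one in the `__main__` block can double as a unit test for a tensor-based version: shifting the query and key positions by the same amount should leave the attention scores unchanged.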