A new model and implementation to reduce VRAM usage on transformer models.
Reduces the VRAM usage of GPT2-XL by 25%. GPT2-XL (float32) can then be run with PyTorch on Colab or on your own GPU.
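
To get a feel for the numbers, here is a rough back-of-the-envelope sketch of float32 weight memory. The parameter count (about 1.5 billion for GPT2-XL) is an approximation assumed for illustration, and `weight_memory_gib` is a hypothetical helper, not part of this library. The 25% figure above refers to overall VRAM usage; the sketch applies it to the weights alone only to show the scale involved.

```python
# Rough estimate of weight memory for a float32 model.
# Assumption: GPT2-XL has roughly 1.5 billion parameters (approximate figure).
GPT2_XL_PARAMS = 1_500_000_000
BYTES_PER_FLOAT32 = 4


def weight_memory_gib(num_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Return the memory needed to store the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


baseline = weight_memory_gib(GPT2_XL_PARAMS)
# Illustrative only: apply the 25% reduction to the weight memory.
reduced = baseline * (1 - 0.25)

print(f"baseline weights: {baseline:.2f} GiB")
print(f"after a 25% reduction: {reduced:.2f} GiB")
```

Activations, optimizer state, and framework overhead come on top of this, so actual VRAM usage will be higher than the weight-only estimate.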
Always install the library from PyPI:
pip install recursers

- Re-implement recurser for other models
- Enable MPS acceleration on Mac
- Retraining: training a recurser model differs slightly from the usual training procedure.
Karpathy's elegant GPT implementation
https://github.com/karpathy/nanoGPT
Hugging Face's library
https://github.com/huggingface/transformers