Implementation of plug-and-play attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
Updated Jan 7, 2024 - Python
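LongNet's core mechanism is dilated attention: the sequence is split into segments, and within each segment only every r-th position participates in attention, so cost grows roughly linearly with sequence length. The sketch below is illustrative only, not the repository's code; `dilated_attention` and its parameters are assumed names, and the full method additionally mixes several segment/dilation configurations across heads so that every position is covered.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dilated_attention(q, k, v, segment_len, dilation):
    """Single-head, single-branch sketch of LongNet-style dilated attention.

    q, k, v: arrays of shape (seq_len, d). The sequence is split into
    segments of length `segment_len`; within each segment, only every
    `dilation`-th position attends (and is attended to). Positions not
    selected by this branch are left as zeros here; the real method
    combines multiple dilation branches to cover them.
    """
    seq_len, d = q.shape
    out = np.zeros_like(v)
    for start in range(0, seq_len, segment_len):
        end = min(start + segment_len, seq_len)
        idx = np.arange(start, end, dilation)          # sparsified positions
        scores = q[idx] @ k[idx].T / np.sqrt(d)        # attention within branch
        out[idx] = softmax(scores) @ v[idx]
    return out
```

With `dilation=1` this reduces to ordinary segment-wise attention; larger dilations trade per-segment coverage for longer effective reach at the same cost.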
HRM-sMoE LLM training toolkit.
Long Context, Less Focus: A Scaling Gap in LLMs Revealed through Privacy and Personalization
This repository is not intended to be the next 'library' for text summarization. Instead, it is an educational resource that provides insight into the inner workings of text summarization.
Algorithms for extending the context windows of LLMs at a smaller scale
Investigating Why the Effective Context Length of LLMs Falls Short (Based on STRING, ICLR 2025)