https://arxiv.org/pdf/2402.19427.pdf
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models