Flash Attention and native PyTorch weights #5

@shaltielshmid

Description

I saw Flash Attention on the TODO list, so I wanted to bring the announcement here to your attention.

Two packages were announced there:

1] Loading model weights saved using the PyTorch format / safetensors format (including handling for HuggingFace's sharding)

2] Flash Attention - self explanatory :)
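For context on point 1: HuggingFace shards large checkpoints into multiple `.safetensors` files plus a `model.safetensors.index.json` whose `weight_map` maps each tensor name to the shard file containing it. Below is a minimal sketch (not the announced package's actual implementation) of the grouping step a loader needs so that each shard is opened only once; the index contents here are made-up sample data.

```python
import json
from collections import defaultdict

# Made-up example of HuggingFace's sharded-checkpoint index format
# ("model.safetensors.index.json"): weight_map maps tensor name -> shard file.
index_json = """
{
  "metadata": {"total_size": 16},
  "weight_map": {
    "embed.weight": "model-00001-of-00002.safetensors",
    "lm_head.weight": "model-00002-of-00002.safetensors"
  }
}
"""

def group_by_shard(index: dict) -> dict:
    """Invert the weight_map: group tensor names by the shard file that
    holds them, so a loader opens each shard file exactly once."""
    groups = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        groups[shard_file].append(tensor_name)
    return dict(groups)

plan = group_by_shard(json.loads(index_json))
for shard, names in sorted(plan.items()):
    print(shard, names)
```

A real loader would then read each shard (e.g. with the `safetensors` library) and copy only the tensors listed for that shard into the model's state dict.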
