Bug with some LLMs #93

@Mjall23

The new 2.0.3 version is really great! Before, I was getting around 3 t/s with the Qwen3.5 2b model; now I'm getting 5.2 t/s with the same model. Thanks to your optimizations, there's a significant increase in speed!
However, before the update I could also load the Gemma 2 2b model. Now I can't: the app crashes whenever I try to load it.
@Siddhesh2377
