assert attn_channel_group_size % num_key_value_groups == 0 and (attn_channel_group_size // num_key_value_groups) % head_size == 0 #37

@aqe670

I am applying MultiPruner to qwen2.5-0.5b, which has num_attention_heads: 14, num_key_value_heads: 2, and head_size: 64. That gives num_key_value_groups = 14 / 2 = 7, so the assertion in the title requires attn_channel_group_size to be a multiple of 7 × 64 = 448. With 14 × 64 = 896 attention channels in total, setting attn_channel_group_size to 448 leaves only two groups, which seems very coarse. Could having so few groups significantly affect the pruning results? Are there alternative approaches for models with this head configuration?
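For reference, here is a minimal sketch of the arithmetic behind the numbers above. It only re-implements the assertion from the title; the helper names are hypothetical and not part of MultiPruner, and the requirement that a group size evenly divide the total attention width (num_attention_heads * head_size) is my assumption, not something I have confirmed in the code.

```python
def num_key_value_groups(num_attention_heads: int, num_key_value_heads: int) -> int:
    """Number of query heads that share one key/value head (grouped-query attention)."""
    if num_attention_heads % num_key_value_heads != 0:
        raise ValueError("num_attention_heads must be divisible by num_key_value_heads")
    return num_attention_heads // num_key_value_heads


def satisfies_assertion(attn_channel_group_size: int, kv_groups: int, head_size: int) -> bool:
    """Mirror of the assertion quoted in the issue title."""
    return (
        attn_channel_group_size % kv_groups == 0
        and (attn_channel_group_size // kv_groups) % head_size == 0
    )


def valid_group_sizes(num_attention_heads: int, num_key_value_heads: int, head_size: int) -> list[int]:
    """Group sizes that pass the assertion.

    Assumption (not taken from MultiPruner): a group size must also evenly
    divide the total attention width, num_attention_heads * head_size.
    """
    kv_groups = num_key_value_groups(num_attention_heads, num_key_value_heads)
    total_channels = num_attention_heads * head_size
    return [
        size
        for size in range(1, total_channels + 1)
        if total_channels % size == 0 and satisfies_assertion(size, kv_groups, head_size)
    ]


# qwen2.5-0.5b values quoted in this issue
sizes = valid_group_sizes(num_attention_heads=14, num_key_value_heads=2, head_size=64)
total = 14 * 64
for size in sizes:
    print(f"attn_channel_group_size={size} -> {total // size} group(s)")
```

Under that assumption, the only sizes that pass are 448 (two groups) and 896 (one group), so the granularity cannot be made finer without relaxing the assertion itself.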
