Performance issue #11

Description

@SnifferCaptain

In the LLM inference example, encoding is abnormally slow. In v10:
[Info] encoding length: 28, decoding length: 183, encoding speed: 33.8989 tokens/s, decoding speed: 31.8915 tokens/s
context length: 211/8192 tokens
In v8, however, encoding is noticeably faster (although v8's output is not quite correct):
[Info] encoding length: 27, decoding length: 222, encoding speed: 52.5056 tokens/s, decoding speed: 23.3753 tokens/s
context length: 249/8192 tokens
