Naive brainstorm: accept length simulator #63

@fzyzcjy

Description

WARN: I have not studied speculative decoding in detail, so this is just a naive brainstorm and I may be totally wrong!

Currently, it seems we report accuracy and loss on eval data. However, what we really care about is the accept length.

Therefore, it would be great to have an API that simulates accept length for various EAGLE configurations at once. It could be called automatically on train/val data, and maybe also exposed as a regular function that users can call directly.

From my naive view, we could first compute the outputs of the draft model, and then use very quick calculations (e.g. a sliding window over the output token ids) for each configuration to get the accept length. EDIT: I briefly read EAGLE 3 and realized that it uses different hidden states, so we may need to rerun the draft model for each config. Even so, that may be lightweight compared to running full experiments with an inference engine.
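To make the sliding-window idea concrete, here is a minimal sketch, not an existing API of this project. It assumes greedy verification and a single draft chain (no tree), and that `draft_ids[i]` is the draft model's teacher-forced prediction for the position where the target model produced `target_ids[i]`. The function name `simulate_accept_length` and the `depths` parameter are hypothetical. It also ignores the EAGLE 3 caveat from the EDIT above (deeper draft steps see different hidden states), so treat the numbers as a rough estimate.

```python
from typing import Dict, Sequence


def simulate_accept_length(
    draft_ids: Sequence[int],
    target_ids: Sequence[int],
    depths: Sequence[int],
) -> Dict[int, float]:
    """Estimate mean tokens emitted per verification step for each chain depth.

    Hypothetical helper: assumes greedy verification, a single draft chain,
    and teacher-forced draft predictions aligned with the target tokens.
    """
    if len(draft_ids) != len(target_ids):
        raise ValueError("draft_ids and target_ids must have the same length")

    # match[i] is True when the draft token at position i would be accepted.
    match = [d == t for d, t in zip(draft_ids, target_ids)]
    n = len(match)
    result: Dict[int, float] = {}

    for depth in depths:
        pos, steps = 0, 0
        while pos < n:
            # Count the leading run of accepted draft tokens, capped at `depth`.
            accepted = 0
            while accepted < depth and pos + accepted < n and match[pos + accepted]:
                accepted += 1
            # Each step also emits one token from the target model itself.
            pos += accepted + 1
            steps += 1
        # The last step is truncated at the end of the sequence, so the
        # total number of emitted tokens is exactly n.
        result[depth] = n / steps if steps else 0.0
    return result


if __name__ == "__main__":
    # Toy data: the draft disagrees with the target at one position only.
    target = [1, 2, 3, 4, 5, 6]
    draft = [1, 2, 3, 9, 5, 6]
    print(simulate_accept_length(draft, target, depths=[0, 1, 3]))
```

The returned number counts the target model's own token in every step, so a depth of 0 gives 1.0. Other conventions exclude that token; subtract 1 to compare. Since a deeper chain never lowers this estimate, picking a configuration would still need some cost model for the extra draft steps, which this sketch does not include.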

This has three use cases from my naive view: (1) We get the e2e metric we really care about during training, at almost no extra cost, which may help training a bit. (2) We can find out which config may be best without having to test each and every EAGLE configuration, which is time-consuming. (3) This may be useful as a lightweight simulator for other scenarios I am interested in.

Potential drawback: I do not know whether the error introduced by the inference engine will be so large that this number becomes inaccurate.


Labels: enhancement, help wanted
