Naive brainstorm: accept length simulator

WARN: I have not studied spec (speculative decoding) in detail, so this is just a naive brainstorm and I may be totally wrong!

Currently, it seems we report accuracy and loss on eval data. However, what we really care about is the accept length.

Therefore, it would be great to have an API that simulates accept length for various EAGLE configurations at once. We could call it automatically on train/val data, and maybe also expose it as a normal function for users to call directly.

~~From my naive view, we could first compute the outputs of the draft model, and then use very quick calculations (e.g. a sliding window over the output token ids) for each configuration to get the accept length.~~ EDIT: I briefly read EAGLE 3 and realized that the hidden states differ, so we may need to rerun the draft model for each config. That should still be lightweight compared with running full experiments through an inference engine.
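To make the core calculation concrete, here is a minimal sketch of the per-step accept length, assuming a single chain of draft tokens (no tree) and greedy verification, where a draft token is accepted only if it equals the target model's argmax at that position. The names `accept_length` and `mean_accept_length` are hypothetical, not an existing API, and this counts accepted draft tokens only (some conventions also add the bonus token from the target model).

```python
from typing import Iterable, Sequence, Tuple


def accept_length(draft_tokens: Sequence[int], target_tokens: Sequence[int]) -> int:
    """Count the leading draft tokens that match the target's greedy tokens.

    Assumption: chain drafting with greedy verification, so acceptance
    stops at the first mismatch.
    """
    accepted = 0
    for draft, target in zip(draft_tokens, target_tokens):
        if draft != target:
            break
        accepted += 1
    return accepted


def mean_accept_length(steps: Iterable[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average accept length over (draft_tokens, target_tokens) pairs."""
    lengths = [accept_length(draft, target) for draft, target in steps]
    return sum(lengths) / len(lengths) if lengths else 0.0


# Toy data: three verification steps with a draft depth of 3.
steps = [
    ([5, 8, 2], [5, 8, 9]),  # first two draft tokens match
    ([7, 1, 4], [7, 1, 4]),  # all three match
    ([3, 3, 3], [6, 3, 3]),  # first token already mismatches
]
print(mean_accept_length(steps))
```

Tree-style drafting or sampling-based verification would need a different acceptance rule, so this only covers the simplest case.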

From my naive view, this has a few use cases: (1) We would see the e2e metric we really care about during training, at almost no extra cost, which may help training a bit. (2) We could find out which config is likely the best without having to test each and every EAGLE configuration, which is time consuming. (3) It may be useful as a lightweight simulator for other scenarios I am interested in.

Potential drawback: I don't know whether the error relative to a real inference engine would be large enough to make this number inaccurate.

Naive brainstorm: accept length simulator #63