
Token estimation #172

Open
saimouu wants to merge 4 commits into main from feature/token-estimation

Conversation

saimouu (Collaborator) commented Apr 2, 2026

Add token estimation, with separate estimates for input and output tokens.

Estimation runs automatically once papers have been uploaded, so it is not model dependent. Input tokens are counted with the o200k_base encoding, which is used by gpt-4o among other models. Other encodings could be considered, and it might be possible to make the estimation model dependent, though tiktoken mainly targets OpenAI models.

A rough output estimate per paper is calculated as

# Overhead + overall decision + per-criterion output
paper_output = 50 + 30 + (num_criteria * 15)

which seems to yield decent results.

The estimates are multiplied by a 1.10 safety buffer, since overestimating is safer than underestimating.
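The scheme above can be sketched roughly as follows. The function names, the `buffered` helper, and the list-of-strings input are illustrative assumptions, not the PR's actual code:

```python
# Minimal sketch of the estimation described in this PR; names are
# illustrative, not taken from the actual implementation.
BUFFER = 1.10  # safety factor: better to estimate a bit high than too low


def estimate_input_tokens(paper_texts):
    """Count input tokens with tiktoken's o200k_base encoding (used by gpt-4o)."""
    import tiktoken  # assumes tiktoken is installed

    enc = tiktoken.get_encoding("o200k_base")
    return sum(len(enc.encode(text)) for text in paper_texts)


def estimate_output_tokens(num_papers, num_criteria):
    """Rough output estimate: overhead + overall decision + per-criterion output."""
    per_paper = 50 + 30 + num_criteria * 15
    return num_papers * per_paper


def buffered(tokens):
    """Apply the 1.10 safety buffer to a raw token estimate."""
    return round(tokens * BUFFER)
```

For example, one paper scored against four criteria would get an output estimate of 50 + 30 + 4 * 15 = 140 tokens before buffering.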

Resolves #171


