Currently, we use `set_langchain_cache()` to cache completions, which does not work with vLLM for some reason and offers little control.
I would propose supporting a caching mechanism for judge completions, e.g. when calling `annotate_battles` (https://github.com/OpenEuroLLM/JudgeArena/blob/main/judgearena/evaluate.py#L261), which would cache judge annotations in files like `{judge_arena_dir}/cache/db/{benchmark}/{judge}.db` (this naming would let the user easily delete some entries manually).
For the storage, we could use SQLite, which ships with the Python standard library and would not require any additional dependency.
For the schema, we could use "benchmark", "instruction_id", "model_a", "model_b", "judge" as keys, return the entry from the db on a hit, and generate otherwise.
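To make the key structure concrete, here is a minimal sketch of the DDL; the table name `annotations` and the column set (matching the entry proposed below) are illustrative assumptions, not a final schema:

```python
# Illustrative sketch only: "annotations" and the column names are
# assumptions, not a final schema. The composite primary key mirrors
# the proposed lookup keys.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS annotations (
    benchmark        TEXT NOT NULL,
    instruction_id   TEXT NOT NULL,
    model_a          TEXT NOT NULL,
    model_b          TEXT NOT NULL,
    judge            TEXT NOT NULL,
    judge_input      TEXT,
    judge_completion TEXT,
    date             TEXT,
    PRIMARY KEY (benchmark, instruction_id, model_a, model_b, judge)
)
"""
```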
We could store the following, which would allow returning the completion from the cache and performing small analyses:
```python
from dataclasses import dataclass


@dataclass
class AnnotationEntry:
    # Lookup keys
    benchmark: str
    instruction_id: str
    model_a: str
    model_b: str
    judge: str
    # Cached payload
    judge_input: str
    judge_completion: str
    date: str
```
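As a sketch of how the hit-or-generate path could look, reusing `CREATE_TABLE` from the schema sketch and `AnnotationEntry` above; `get_or_annotate` and its `generate` callback are hypothetical names, not existing JudgeArena API:

```python
# Hypothetical cache wrapper, not the actual JudgeArena API.
import sqlite3
from dataclasses import astuple
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def get_or_annotate(
    judge_arena_dir: str,
    benchmark: str,
    instruction_id: str,
    model_a: str,
    model_b: str,
    judge: str,
    judge_input: str,
    generate: Callable[[str], str],  # calls the judge model on a cache miss
) -> AnnotationEntry:
    # One db file per (benchmark, judge), so users can delete entries by hand.
    path = Path(judge_arena_dir) / "cache" / "db" / benchmark / f"{judge}.db"
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    conn.execute(CREATE_TABLE)
    row = conn.execute(
        "SELECT judge_input, judge_completion, date FROM annotations"
        " WHERE benchmark=? AND instruction_id=? AND model_a=?"
        " AND model_b=? AND judge=?",
        (benchmark, instruction_id, model_a, model_b, judge),
    ).fetchone()
    if row is not None:
        # Cache hit: return the stored completion without calling the judge.
        entry = AnnotationEntry(benchmark, instruction_id, model_a, model_b,
                                judge, *row)
    else:
        # Cache miss: generate the completion and persist it.
        entry = AnnotationEntry(
            benchmark, instruction_id, model_a, model_b, judge,
            judge_input, generate(judge_input),
            datetime.now(timezone.utc).isoformat(),
        )
        conn.execute("INSERT INTO annotations VALUES (?,?,?,?,?,?,?,?)",
                     astuple(entry))
        conn.commit()
    conn.close()
    return entry
```

One file per (benchmark, judge) also means stale entries can be pruned with a plain `rm`, without touching other judges' caches.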
Any thoughts @ErlisLushtaku @kargibora ?