Accepted by ICLR 2026
This repository contains the code used to evaluate the KV-cache–based methods proposed in our ICLR 2026 paper, including MTEB evaluation, KV-CoE inference, and KV-based classification.
We follow the official MTEB evaluation protocol without modification.
Set up the environment according to the official MTEB instructions.
After the environment is ready, run `custom_model.py`; the other baselines follow the same naming pattern with different suffixes.
The script evaluates our KV-cache–based setup by exposing KV representations as embeddings within the MTEB framework.
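As a rough illustration of how KV representations can be exposed as embeddings, the sketch below wraps a toy "KV encoder" in the minimal model interface MTEB expects (an `encode(sentences, ...)` method returning one vector per sentence). The `KVEmbedder` class, the deterministic stand-in for real KV states, and the mean pooling are assumptions for illustration, not the repository's actual implementation:

```python
import numpy as np

class KVEmbedder:
    """Illustrative stand-in: exposes pooled KV-cache states as embeddings.

    A real implementation would run an LLM forward pass and pool the
    key/value states of a chosen layer; here a deterministic random
    projection stands in for the model so the sketch is self-contained.
    """

    def __init__(self, dim: int = 64):
        self.dim = dim

    def _fake_kv_states(self, sentence: str) -> np.ndarray:
        # Pretend (seq_len, dim) KV states; seeded per sentence for determinism.
        seed = sum(ord(c) for c in sentence)
        rng = np.random.default_rng(seed)
        seq_len = max(1, len(sentence.split()))
        return rng.standard_normal((seq_len, self.dim))

    def encode(self, sentences, **kwargs) -> np.ndarray:
        # MTEB-style model interface: list of strings -> (n, dim) array.
        return np.stack(
            [self._fake_kv_states(s).mean(axis=0) for s in sentences]
        )

model = KVEmbedder()
embs = model.encode(["hello world", "kv cache as embedding"])
print(embs.shape)  # (2, 64)
```

With such a wrapper, the standard MTEB runner can score the KV-based embeddings exactly like any other embedding model.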
Create the environment using `requirements.txt`.
Run `Scripts/llm_infer.sh` to perform LLM inference based on KV-based representations.
Run `Scripts/llm_eval.sh` to evaluate the performance.
The KVClassifier pipeline consists of the following steps:
- Set up the environment using `requirements.txt`.
- Run `prep_fast_slow_thinking_results.py` to generate baseline fast- and slow-thinking results.
- Run `prep_kv_classfier_training_data.py` to construct the training dataset for KVClassifier. The distribution of difficulty labels can be inspected using `count_difficulty.py`.
- Run `train_kv_classifier.py` to train the KVClassifier.
- Run `eval_kv_classifier_classification.py`, followed by `parse_kv_classifier_results_classification.py`, to evaluate the KVClassifier in the classification setting.
- Run `eval_kv_classifier_generative.py`, followed by `parse_kv_classifier_results_generative.py`, to evaluate the KVClassifier in the generative setting.
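The pipeline above can be sketched in miniature as follows. Everything here is an illustrative assumption rather than the repository's code: the pooled KV features, the synthetic "easy vs. hard" data, the tiny logistic-regression classifier, and the `route` helper that mimics the generative setting by picking fast or slow thinking per prompt:

```python
import numpy as np

rng = np.random.default_rng(0)

def kv_features(kv_states: np.ndarray) -> np.ndarray:
    """Pool per-token KV states (seq_len, dim) into one feature vector.
    Mean + max pooling is an illustrative choice."""
    return np.concatenate([kv_states.mean(axis=0), kv_states.max(axis=0)])

def make_example(hard: bool):
    # Synthetic stand-in data: "hard" prompts get larger-magnitude KV states.
    scale = 2.0 if hard else 0.5
    kv = scale * rng.standard_normal((int(rng.integers(5, 20)), 8))
    return kv_features(kv), float(hard)

pairs = [make_example(i % 2 == 0) for i in range(200)]
X = np.array([p[0] for p in pairs])
y = np.array([p[1] for p in pairs])

# Tiny logistic-regression "KVClassifier" trained by gradient descent.
w = np.zeros(X.shape[1])
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y                      # dL/dlogits for log loss
    w -= 0.1 * X.T @ grad / len(y)
    b -= 0.1 * grad.mean()

def route(kv_states: np.ndarray) -> str:
    """Generative-setting use: choose fast or slow thinking per prompt."""
    p = 1.0 / (1.0 + np.exp(-(kv_features(kv_states) @ w + b)))
    return "slow_thinking" if p > 0.5 else "fast_thinking"

acc = ((1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
print(f"train accuracy: {acc:.2f}")
```

The classification setting corresponds to measuring the classifier's accuracy directly; the generative setting corresponds to using `route` to dispatch each prompt and scoring the resulting generations.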
The core implementation of KVClassifier is provided in `kv_classfier.py`.
@inproceedings{
xing2026beyond,
title={Beyond Speedup - Utilizing {KV} Cache for Sampling and Reasoning},
author={Xing, Zeyu and Li, Xing and Zhen, Hui-Ling and Yuan, Mingxuan and Pan, Sinno Jialin},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=GUhmiJaAzv}
}