Skip to content

brownkat6/reasoning-scheduling

Repository files navigation

Predictive Scheduling for Efficient Inference-Time Reasoning in Large Language Models

Brief current file directory

gen_datasets.py - pulls data from huggingface and saves it in the Dynasor data directory. Run once and never again for each of the datasets we want to add.

mlp_datagen.py - generates X, y for our prediction task. E.g. y is the early stopping probabilities of generating the correct answer if we terminate after W, 2W, .... tokens. X is the predictors we want to use, e.g. the hidden states and potentially other data. This script generates X and y for given datasets and saves it to netscratch.

generate_data.sh - sbatch script that batches out the data generation jobs with mlp_datagen.py`

mlp_train.py loads in data, trains in MLP, and saves the MLP

mlp.py - defines the MLP class

Dynasor/benchmark/TokenDeprivation/run_adaptive.py - script that loads in MLP predictions and/or oracle data and records the tokens per accuracy given adaptive greedy strategy of allocating more size/reasoning budget to queries that will benefit from it more

run.sh - runs benchmark tokens vs accuracy experiment for non-adaptive allocation strategy

run_adaptive.sh - runs benchmark tokens vs accuracy experiment for adaptive allocation strategy using MLP predictions

run_adaptive_oracle.sh - runs benchmark tokens vs accuracy experiment for adaptive allocation strategy using oracle ground truth for early stopping correctness probabilities (using the same data used to train the MLP)

vis_adaptive.py - creates visualization for tokens vs accuracy experiment for non-adaptive vs adaptive vs oracle

NOTE: most of dynasor vllm server scheduling layer is in Dynasor.dynasor.cli/*. Read and understand this. We will likely want to adapt/reuse large portions of this code later to write our own scheduling layer.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors