
LLM Sonification

Turns the internal hidden states of GPT-2 into music. As the model processes each word in a prompt, its layer activations are projected into a low-dimensional space and mapped to MIDI notes — giving an auditory window into how the model "thinks."

How it works

Part 1 — Learning the activation space

Before sonifying anything, the model is run over a reference corpus (~1000 diverse sentences) to understand the range of its internal representations.

  1. part1_extract.py runs GPT-2 inference on the corpus and saves the hidden states (activations from each of the 13 layers) to disk.
  2. part1_reduce.py fits a PCA transform on those activations — 16 principal components per layer. These transforms define the "map" of the model's activation space.

The fitted PCA objects and corpus projections are saved to data/pca/ and reused for all future sonifications.
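The two steps above amount to fitting one independent PCA per layer. A minimal sketch, assuming activations are collected as `(n_tokens, hidden_size)` arrays; the variable names here are illustrative and not the actual `part1_reduce.py` API:

```python
# Sketch of the per-layer PCA fit described above (illustrative names,
# not the real script's API). Each of the 13 layers gets its own
# 16-component PCA fitted on corpus activations.
import numpy as np
from sklearn.decomposition import PCA

N_LAYERS = 13      # embedding output + 12 transformer blocks
N_COMPONENTS = 16  # principal components kept per layer
HIDDEN_SIZE = 768  # GPT-2 small hidden dimension

rng = np.random.default_rng(0)
# Stand-in for real corpus activations: one (n_tokens, hidden_size)
# array per layer, gathered by running GPT-2 over the corpus.
corpus_acts = [rng.normal(size=(500, HIDDEN_SIZE)) for _ in range(N_LAYERS)]

pcas = []
for layer_acts in corpus_acts:
    pca = PCA(n_components=N_COMPONENTS)
    pca.fit(layer_acts)
    pcas.append(pca)

# Projecting a new activation yields a 16-dim point in the learned space.
projected = pcas[0].transform(corpus_acts[0][:1])
print(projected.shape)  # (1, 16)
```

Saving the fitted `pcas` list to `data/pca/` (e.g. with `joblib`) is what lets Parts 2 and 3 reuse the same map without refitting.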

Part 2 — Sonifying a prompt (CLI)

part2_sonify.py takes a text prompt, runs inference, projects the hidden states through the pre-fitted PCA transforms, and maps the trajectories to MIDI notes — one per word. Three melodic voices (flute, clarinet, strings) track activations from layers 0, 6, and 12 respectively, over a bass drone. The result is rendered to a WAV file via fluidsynth.

python part2_sonify.py --prompt "The forest was silent." --soundfont data/GeneralUser-GS.sf2
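One plausible way to turn a projected activation into a pitch, as a sketch: normalize a PCA component value against the range seen in the Part 1 corpus, then quantize to a note on a pentatonic grid so the melody stays consonant. The helper below is hypothetical; the actual note logic lives in `part2_sonify.py`.

```python
# Hypothetical activation-to-pitch mapping (illustrative only).
# A PCA component value is normalized against the corpus range from
# Part 1, then snapped to a pentatonic scale note.
PENTATONIC = [0, 2, 4, 7, 9]  # semitone offsets within an octave

def value_to_midi(value, lo, hi, base_note=60, octaves=2):
    """Map value in [lo, hi] to a MIDI note on a pentatonic grid."""
    # Normalize to [0, 1], clipping outliers beyond the corpus range.
    t = max(0.0, min(1.0, (value - lo) / (hi - lo)))
    # All candidate notes across the allowed octaves, low to high.
    notes = [base_note + 12 * o + p for o in range(octaves) for p in PENTATONIC]
    return notes[round(t * (len(notes) - 1))]

# Corpus range of, say, component 0 at layer 6: [-3, 3] (made-up numbers).
print(value_to_midi(-3.0, -3.0, 3.0))  # 60 (middle C)
print(value_to_midi(3.0, -3.0, 3.0))   # 81
```

With one such call per word per voice, the three melodic voices trace layers 0, 6, and 12 through the same kind of mapping.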

Part 3 — Web app

part3_app.py is a Flask web app that exposes the same pipeline through a browser UI. The model and PCA artifacts are loaded once at startup. You enter a prompt, click Sonify, and the page plays the resulting audio while highlighting each word in sync with playback.

Web app screenshot

python part3_app.py --soundfont data/GeneralUser-GS.sf2
# then open http://localhost:5000
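The load-once-at-startup pattern the app uses can be sketched with Flask as below. This is a minimal illustration, not the real `part3_app.py`: the route name, JSON shape, and `ARTIFACTS` stand-ins are assumptions, and the real app would run inference and return audio rather than a word count.

```python
# Minimal sketch of the Flask load-once pattern (illustrative; the real
# part3_app.py loads GPT-2 and the fitted PCA artifacts here instead).
from flask import Flask, jsonify, request

app = Flask(__name__)

# Heavy artifacts are loaded a single time at import, not per request.
ARTIFACTS = {"model": "gpt2", "pca_dir": "data/pca/"}  # stand-ins

@app.route("/sonify", methods=["POST"])
def sonify():
    prompt = request.get_json(force=True).get("prompt", "")
    # Real app: run inference, project through PCA, render MIDI -> WAV,
    # and return the audio plus per-word timing for synced highlighting.
    return jsonify({"prompt": prompt, "n_words": len(prompt.split())})

if __name__ == "__main__":
    app.run(port=5000)
```

Because the module-level load runs exactly once, every request reuses the same model and PCA objects, which keeps per-prompt latency down to inference plus synthesis.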

Setup

Install dependencies into a Python environment with CUDA-enabled PyTorch:

pip install -r requirements.txt
# also required (install separately):
conda install -c conda-forge fluidsynth
pip install pretty_midi pyfluidsynth

You also need a General MIDI soundfont at data/GeneralUser-GS.sf2 (not tracked by git).

Run Part 1 scripts once to build the PCA artifacts before using Part 2 or 3:

python build_corpus.py
python part1_extract.py
python part1_reduce.py
