Turns the internal hidden states of GPT-2 into music. As the model processes each word in a prompt, its layer activations are projected into a low-dimensional space and mapped to MIDI notes — giving an auditory window into how the model "thinks."
Before sonifying anything, the model is run over a reference corpus (~1000 diverse sentences) to understand the range of its internal representations.
part1_extract.py runs GPT-2 inference on the corpus and saves the hidden states (activations from each of the 13 layers) to disk.
part1_reduce.py fits a PCA transform on those activations, keeping 16 principal components per layer. These transforms define the "map" of the model's activation space.
The fitted PCA objects and corpus projections are saved to data/pca/ and reused for all future sonifications.
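To make the per-layer PCA step concrete, here is a minimal sketch that fits 16 components per layer and projects activations through them. It is not the code in part1_reduce.py: it uses plain NumPy SVD, and the synthetic array, the hidden size of 768, and the helper names fit_pca and project are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(activations: np.ndarray, n_components: int = 16):
    """Fit PCA on a (n_samples, hidden_dim) matrix via SVD.

    Returns (mean, components); components has shape
    (n_components, hidden_dim) with orthonormal rows.
    """
    mean = activations.mean(axis=0)
    centered = activations - mean
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(activations: np.ndarray, mean: np.ndarray, components: np.ndarray):
    """Project activations into the fitted low-dimensional space."""
    return (activations - mean) @ components.T

# Synthetic stand-in for real hidden states (assumption):
# 13 layers, 200 token vectors each, 768 dimensions.
rng = np.random.default_rng(0)
fake_hidden = rng.normal(size=(13, 200, 768))

# One independent PCA fit per layer.
pca_per_layer = [fit_pca(layer_acts) for layer_acts in fake_hidden]

mean0, comps0 = pca_per_layer[0]
projected0 = project(fake_hidden[0], mean0, comps0)  # shape (200, 16)
```

Fitting once on the corpus and reusing the same mean and components for every prompt is what keeps trajectories from different prompts comparable.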
part2_sonify.py takes a text prompt, runs inference, projects the hidden states through the pre-fitted PCA transforms, and maps the trajectories to MIDI notes — one per word. Three melodic voices (flute, clarinet, strings) track activations from layers 0, 6, and 12 respectively, over a bass drone. The result is rendered to a WAV file via fluidsynth.
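One way to turn a projected trajectory into pitches is sketched below: a single PCA component per word is normalized against its range over the reference corpus and snapped to a scale. The pentatonic scale, the note range, and the function names are hypothetical choices for illustration; the mapping used by part2_sonify.py may differ.

```python
import numpy as np

# Hypothetical choice: pitch classes of the C major pentatonic scale.
PENTATONIC = (0, 2, 4, 7, 9)

def build_scale(low: int, high: int, pitch_classes=PENTATONIC):
    """List every MIDI note in [low, high] whose pitch class is in the scale."""
    return [n for n in range(low, high + 1) if n % 12 in pitch_classes]

def trajectory_to_notes(values, lo: float, hi: float, scale):
    """Map one PCA component value per word onto notes of the scale.

    lo and hi are the component's range over the reference corpus;
    values outside that range are clipped to the lowest or highest note.
    """
    values = np.asarray(values, dtype=float)
    unit = np.clip((values - lo) / (hi - lo), 0.0, 1.0)
    idx = np.round(unit * (len(scale) - 1)).astype(int)
    return [scale[i] for i in idx]

scale = build_scale(60, 84)  # two octaves starting at MIDI note 60
notes = trajectory_to_notes([-2.0, 0.0, 1.2, 5.0], lo=-2.0, hi=2.0, scale=scale)
print(notes)  # [60, 72, 79, 84]
```

Normalizing against the corpus range rather than the prompt's own range means an unusually large activation sounds unusually high, instead of every prompt spanning the full pitch range.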
python part2_sonify.py --prompt "The forest was silent." --soundfont data/GeneralUser-GS.sf2

part3_app.py is a Flask web app that exposes the same pipeline through a browser UI. The model and PCA artifacts are loaded once at startup. You enter a prompt, click Sonify, and the page plays the resulting audio while highlighting each word in sync with playback.
python part3_app.py --soundfont data/GeneralUser-GS.sf2
# then open http://localhost:5000

Install dependencies into a Python environment with CUDA-enabled PyTorch:
pip install -r requirements.txt
# also required (install separately):
conda install -c conda-forge fluidsynth
pip install pretty_midi pyfluidsynth

You also need a General MIDI soundfont at data/GeneralUser-GS.sf2 (not tracked by git).
Run Part 1 scripts once to build the PCA artifacts before using Part 2 or 3:
python build_corpus.py
python part1_extract.py
python part1_reduce.py
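Since Parts 2 and 3 depend on files that are not in the repository, a small preflight check can save a confusing failure later. The helper below is a hypothetical addition, not part of the project; it only checks that data/pca/ is non-empty and that the soundfont exists, because the exact artifact filenames are not specified here.

```python
from pathlib import Path

def missing_artifacts(root="."):
    """Return a list of human-readable problems; an empty list means ready."""
    root = Path(root)
    problems = []

    pca_dir = root / "data" / "pca"
    # The Part 1 scripts are expected to populate this directory.
    if not pca_dir.is_dir() or not any(pca_dir.iterdir()):
        problems.append("data/pca/ is missing or empty; run the Part 1 scripts")

    # The soundfont is not tracked by git and must be supplied separately.
    if not (root / "data" / "GeneralUser-GS.sf2").is_file():
        problems.append("data/GeneralUser-GS.sf2 not found")

    return problems

for problem in missing_artifacts():
    print(problem)
```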