CSGY 6923 Project: Music Generation using Transformer-based Models

Overview

This repository contains the code and configuration for the CSGY 6923 project investigating scaling laws in symbolic music generation using decoder-only Transformer models (nanoGPT architecture adapted for ABC notation).

The project includes a custom ETL pipeline for MIDI-to-ABC conversion, a family of scaled Transformer models, and scripts for training, evaluation, and music generation.

1. Environment Setup

Critical Requirement: Training and full generation require a machine with a high-memory GPU (at least 40 GB of VRAM is recommended for the XL model).

Tested Hardware: NVIDIA A100 / H100 / H200 GPU instances (on a DigitalOcean GPU cluster).
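To check the VRAM requirement before starting a run, a small script along these lines may help. It is a minimal sketch, not part of the repository: the function names are hypothetical, and the 40,000 MiB threshold is only a rough stand-in for "40 GB", since the total reported for a nominal 40 GB card varies slightly by driver. It relies on the `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` query.

```python
import subprocess

# Assumption: a rough MiB equivalent of the 40 GB recommendation above.
MIN_VRAM_MIB = 40_000


def parse_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one memory.total value (MiB) per non-empty line, one line per GPU."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for total memory per GPU; return [] if it is unavailable."""
    cmd = ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_vram_mib(result.stdout)


def has_enough_vram(vram_mib: list[int], minimum: int = MIN_VRAM_MIB) -> bool:
    """True if at least one GPU meets the minimum."""
    return any(v >= minimum for v in vram_mib)
```

Calling `has_enough_vram(query_vram_mib())` returns False both when no GPU is large enough and when `nvidia-smi` is not installed.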

1.1 Dependencies and Installation

Clone the Repository:

git clone https://github.com/arrxy/MusicGenerator.git
cd MusicGenerator

Set Up Git LFS:

Install Git LFS with your package manager.

Homebrew (macOS):

brew install git-lfs

Debian/Ubuntu:

sudo apt-get install git-lfs

Then enable Git LFS and pull the LFS-tracked files:

git lfs install
git lfs pull
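If `git lfs pull` was skipped, LFS-tracked files remain small text pointers that begin with `version https://git-lfs.github.com/spec/v1`. The sketch below detects that case. It assumes, without confirmation from this README, that the `.pt` checkpoints in the repository root are the LFS-tracked files; the function names are hypothetical.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file is still an LFS pointer rather than real content."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def unpulled_checkpoints(repo_root: Path) -> list[Path]:
    """List *.pt files directly under repo_root that are still LFS pointers."""
    return sorted(p for p in repo_root.glob("*.pt") if is_lfs_pointer(p))
```

An empty result from `unpulled_checkpoints(Path("."))` suggests the checkpoints were downloaded in full.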

Install midi2abc: This command-line tool is required for the data conversion pipeline. The installation process varies by OS; on Debian/Ubuntu it ships in the abcmidi package (sudo apt-get install abcmidi).

Install Python Dependencies: This project uses uv for dependency management.

# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh
# Sync dependencies from pyproject.toml
uv sync
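Before moving on, it can be useful to confirm that the command-line tools from this section are on the PATH. The following is a minimal sketch using only the standard library; the tool list is inferred from the steps above and the function name is hypothetical.

```python
import shutil

# Executables the setup steps above rely on (inferred from this README).
REQUIRED_TOOLS = ("git", "git-lfs", "midi2abc", "uv")


def missing_tools(tools=REQUIRED_TOOLS) -> list[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from `missing_tools()` means every tool was found.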

2. Data Preparation Pipeline

2.1 Download and Extraction

Download the Lakh MIDI Dataset (LMD): The project relies on the full LMD dataset. Download and extract it:

# Download the dataset (using the provided link)
wget https://colinraffel.com/projects/lmd/Lakh_MIDI_Dataset.zip
unzip Lakh_MIDI_Dataset.zip

Create Directory Structure:

mkdir -p data/raw
mkdir -p data/processed

Place Data: Move the extracted LMD files into the data/raw directory.
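To confirm the data landed where the pipeline expects it, the MIDI files under data/raw can be counted. This is a minimal sketch; the accepted suffixes and the function name are assumptions, and the actual pipeline may discover files differently.

```python
from pathlib import Path

# Assumption: MIDI files use one of these extensions (matched case-insensitively).
MIDI_SUFFIXES = {".mid", ".midi"}


def find_midi_files(raw_dir: Path) -> list[Path]:
    """Recursively collect MIDI files under raw_dir, sorted for reproducibility."""
    return sorted(
        p for p in raw_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in MIDI_SUFFIXES
    )
```

For example, `len(find_midi_files(Path("data/raw")))` should be non-zero once the dataset is in place.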

2.2 Tokenization and Preprocessing

A custom Python script (main.py) handles MIDI conversion, cleaning, filtering, and tokenization, and produces the final vocab.json and tokenized .bin files used for training.
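The tokenization step can be illustrated as follows. This is a sketch under stated assumptions, not the repository's implementation: it assumes a character-level vocabulary (as in nanoGPT's character-level examples) and 16-bit token IDs in the .bin file, and the function names are hypothetical. The real tokenizer is described in the project report.

```python
import json
from array import array
from pathlib import Path


def build_vocab(texts: list[str]) -> dict[str, int]:
    """Map every distinct character in the corpus to an integer ID."""
    chars = sorted(set("".join(texts)))
    return {ch: i for i, ch in enumerate(chars)}


def encode(text: str, stoi: dict[str, int]) -> list[int]:
    """Convert a string to token IDs; raises KeyError on unseen characters."""
    return [stoi[ch] for ch in text]


def write_outputs(tokens: list[int], stoi: dict[str, int], out_dir: Path, name: str = "train") -> None:
    """Write vocab.json and <name>.bin (unsigned 16-bit IDs, native byte order)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "vocab.json").write_text(json.dumps(stoi), encoding="utf-8")
    with open(out_dir / f"{name}.bin", "wb") as f:
        array("H", tokens).tofile(f)
```

Sixteen-bit IDs are enough for any vocabulary of up to 65,536 symbols and halve the file size compared with 32-bit IDs.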

Estimated Time: This process takes approximately 2-5 minutes on the target machine.

uv run python main.py

(The main.py script executes the entire Extract-Transform-Load pipeline as described in the report, including tokenization and train/val/test split generation.)

3. Training and Evaluation

3.1 Training the Best Model (XL)

The primary training script is train_transformer.py. This script is configured to train the XL model (n_layer=12, n_embd=768) based on the scaling study results.

uv run python train_transformer.py

(Ensure the script configuration matches the "XL" settings and targets the optimal token budget documented in the project report, which gives the figure in billions of tokens.)
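For a sense of scale, the XL configuration (n_layer=12, n_embd=768) can be plugged into the common 12 * n_layer * n_embd^2 rule of thumb for GPT-style models. This is an approximation under stated assumptions: a standard block with four attention projections and a 4x-wide MLP, ignoring biases, layer norms, and embedding tables (which depend on the vocabulary and context length, not stated here). The function name is hypothetical.

```python
def approx_transformer_params(n_layer: int, n_embd: int) -> int:
    """Non-embedding estimate: 4*d^2 (attention) + 8*d^2 (MLP) weights per layer."""
    return 12 * n_layer * n_embd ** 2


xl_params = approx_transformer_params(n_layer=12, n_embd=768)
print(f"{xl_params:,}")  # 84,934,656
```

The exact count reported by the training script will differ somewhat because of the terms this estimate leaves out.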

3.2 Generating Music Samples

After training and saving the final checkpoint (ckpt_XL_extended.pt), use the following scripts for music generation:

A. Unconditional Generation (From Scratch)

This generates music starting only from the start token.

uv run python generate.py

B. Conditional Generation (Continuation/Prompting)

This script requires defining an initial sequence (ABC prefix) inside the script (generate_continuation.py) to guide the model's output.

# First, open the script and edit the initial sequence prompt:
# nano generate_continuation.py
# Then run the script:
uv run python generate_continuation.py
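An ABC prefix is a short header plus the opening notes of a tune. The sketch below shows one possible prompt and a check that every character is in the vocabulary before encoding. It assumes a character-level vocabulary loaded from vocab.json as a character-to-ID mapping; the variable and function names are hypothetical and may not match those in generate_continuation.py.

```python
# Hypothetical prompt: tune index, meter, unit note length, key, then one bar.
ABC_PREFIX = "X:1\nM:4/4\nL:1/8\nK:C\nCDEF GABc|"


def encode_prompt(prefix: str, stoi: dict[str, int]) -> list[int]:
    """Encode an ABC prefix, failing loudly on out-of-vocabulary characters."""
    unknown = sorted({ch for ch in prefix if ch not in stoi})
    if unknown:
        raise ValueError(f"characters not in vocabulary: {unknown}")
    return [stoi[ch] for ch in prefix]
```

Validating the prompt up front avoids a less informative lookup error deep inside the generation loop.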

Note: For generation, refer to the configuration variables (MODEL_SIZE, CHECKPOINT_PATH, MAX_NEW_TOKENS, TEMPERATURE, etc.) located within the respective Python scripts (generate.py or generate_continuation.py).
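The effect of a temperature setting can be seen in a small standalone example. This is a generic illustration of temperature-scaled softmax sampling in pure Python, not the code used by generate.py; the function names are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: list[float], temperature: float, rng: random.Random) -> int:
    """Draw one token index from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Temperatures below 1 concentrate probability on the highest-scoring token, which tends to give more repetitive output; temperatures above 1 flatten the distribution and increase variety at the cost of more invalid ABC.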
