andrewbelles/compressive-augmentation

Compressive Sensing on Mel-Spectrogram Representations for FMA Music

Data And Preprocessing

The expected dataset is FMA-small: 8 top-level genres with 1,000 tracks per genre (8,000 tracks in total). The preprocessing path downsamples the audio to 22.05 kHz, computes 64-bin mel-spectrograms, applies log scaling, and then min-max normalizes each track independently.
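The last two preprocessing steps can be sketched in a few lines of NumPy. This is an illustration only, not the code in preprocess/mel.py: the function name log_minmax, the dB-style log, the eps floor, and the [0, 1] target range are all assumptions, and the input is a random stand-in for a real 64-bin power mel-spectrogram.

```python
import numpy as np

def log_minmax(mel_power, eps=1e-10):
    """Log-scale a (n_mels, n_frames) power mel-spectrogram, then
    min-max normalize it over the whole track (hypothetical helper)."""
    # Floor at eps so silent bins do not produce log(0).
    log_mel = 10.0 * np.log10(np.maximum(mel_power, eps))
    lo, hi = log_mel.min(), log_mel.max()
    if hi - lo < 1e-12:
        # A constant spectrogram has no range to normalize; return zeros.
        return np.zeros_like(log_mel)
    return (log_mel - lo) / (hi - lo)

# Stand-in data: 64 mel bins by 100 frames of non-negative "power" values.
rng = np.random.default_rng(0)
mel = rng.random((64, 100)) ** 2
norm = log_minmax(mel)
```

Because the minimum and maximum are taken per track, every track ends up on the same scale regardless of its loudness.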

Download metadata:

bash -v preprocess/meta.sh

Download and extract FMA-small:

bash -v preprocess/small.sh

Generate mel tensors:

python -m preprocess.mel -d preprocess/data/fma_small

Add --sample-images to write a few preview spectrograms to preprocess/images/.

Citation

This dataset was made possible by the work of Defferrard et al. If you use FMA downstream, credit the original dataset authors:

Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson.
"FMA: A Dataset for Music Analysis"
18th International Society for Music Information Retrieval Conference (ISMIR), 2017.
Official FMA GitHub Repository

About

Compressive-sensing operators give direct control over view alignment through the measurement ratio. This experiment tests whether sensing operators can induce enough augmentation for effective view-based learning on audio signals, compared against traditional augmentation methods.
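As a rough illustration of the idea, the sketch below draws two views of the same spectrogram patch with independent random Gaussian sensing matrices. It assumes a dense Gaussian operator and a hypothetical helper named sense; the operators actually used in this repository may differ. The measurement ratio m/n sets how many measurements each view keeps.

```python
import numpy as np

def sense(x, ratio, rng):
    """Return y = Phi @ vec(x) for a random Gaussian Phi of shape (m, n),
    where m is about ratio * n (hypothetical helper, assumed operator)."""
    n = x.size
    m = max(1, int(round(ratio * n)))
    # Scale by 1/sqrt(m) so measurement energy roughly matches signal energy.
    phi = rng.standard_normal((m, n)) / np.sqrt(m)
    return phi @ x.ravel()

rng = np.random.default_rng(0)
patch = rng.random((64, 32))       # stand-in for a normalized mel patch
view_a = sense(patch, 0.25, rng)   # first view, fresh operator
view_b = sense(patch, 0.25, rng)   # second view, independently drawn operator
```

Lower ratios discard more of the signal and so produce views that are less aligned with each other; higher ratios keep the views closer to the original patch.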
