mcPear/voice_style_transfer_decompression

Voice recordings decompression with neural style transfer

This is a modification of Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations, adapted to the decompression problem.

Preprocess

The model is trained on the CSTR VCTK Corpus. First run change_bit_rate.sh from the VCTK-Corpus directory to compress the wavs to 8 kbps mp3. Then run mp3_to_wav.sh to convert the mp3s back to wav. The second step can be omitted if you can generate spectrograms directly from mp3.
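The two steps could be sketched with ffmpeg as below. The script names come from this repo, but the exact commands and the VCTK-Corpus directory layout used here are assumptions; the repo's own scripts may differ.

```shell
# Hedged sketch of the two preprocessing steps, assuming ffmpeg is installed
# and the corpus sits under VCTK-Corpus/wav48/<speaker>/.
for f in VCTK-Corpus/wav48/*/*.wav; do
  [ -f "$f" ] || continue                         # skip when the corpus is absent
  mp3="${f%.wav}.mp3"
  # step 1 (change_bit_rate.sh): compress each wav to 8 kbps mp3
  ffmpeg -y -loglevel error -i "$f" -b:a 8k "$mp3"
  # step 2 (mp3_to_wav.sh): decode the mp3 back to wav for spectrogram extraction
  ffmpeg -y -loglevel error -i "$mp3" "${f%.wav}_8k.wav"
done
```

The `_8k.wav` suffix is only an illustrative naming choice to keep the degraded copies next to the originals.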

Feature extraction, training, testing

These steps are similar to the base repo, but have been moved to Jupyter notebooks.

WaveNet synthesis

Download the pretrained model (found in the AutoVC repo) and move it to the implementation directory. I use code from r9y9's wavenet_vocoder for spectrogram generation and synthesis.

About

Use neural style transfer models to decompress mp3 utterances from 8 kbps compression (poor-quality style) to the original (good-quality style).
