mcPear/voice_style_transfer_decompression

Voice recordings decompression with neural style transfer

This is a modification of Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations, adapted to the decompression problem.

Preprocess

The model is trained on the CSTR VCTK Corpus. First run change_bit_rate.sh from the VCTK-Corpus directory to compress the wavs to 8 kbps mp3. Then run mp3_to_wav.sh to convert the mp3s back to wav. The second step can be omitted if you can generate spectrograms directly from mp3.
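The two steps could be sketched with ffmpeg as below. The script names come from this repo, but the exact commands and the VCTK-Corpus directory layout used here are assumptions; the repo's own scripts may differ.

```shell
# Hedged sketch of the two preprocessing steps, assuming ffmpeg is installed
# and the corpus sits under VCTK-Corpus/wav48/<speaker>/.
for f in VCTK-Corpus/wav48/*/*.wav; do
  [ -f "$f" ] || continue                         # skip when the corpus is absent
  mp3="${f%.wav}.mp3"
  # step 1 (change_bit_rate.sh): compress each wav to 8 kbps mp3
  ffmpeg -y -loglevel error -i "$f" -b:a 8k "$mp3"
  # step 2 (mp3_to_wav.sh): decode the mp3 back to wav for spectrogram extraction
  ffmpeg -y -loglevel error -i "$mp3" "${f%.wav}_8k.wav"
done
```

The `_8k.wav` suffix is only an illustrative naming choice to keep the degraded copies next to the originals.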

Feature extraction, training, testing

These steps are similar to the base repo, but have been moved to Jupyter notebooks.

WaveNet synthesis

Download the pretrained model (found in the AutoVC repo) and move it to the implementation directory. I use code from r9y9's wavenet_vocoder for spectrogram generation and synthesis.

About

Use neural style transfer models to decompress mp3 utterances from 8 kbps compression (poor-quality style) to the original (good-quality style).
