This project uses wav2vec2 to supervise the generation of audio with target emotion vectors.
- Put your training files in ./train_dataset/*.wav
- Preprocess the wav files into mel-spectrograms by running preprocess.py
- Download the wav2vec2 model
- Modify the training options in train.py
- Run train.py
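
To make the preprocessing step concrete, here is a minimal pure-Python sketch of turning audio samples into a log-mel spectrogram. It is not the code in preprocess.py: the function names, frame size, hop length, and number of mel bands are all assumptions chosen for illustration, and a real pipeline would normally use an optimized audio library rather than the naive DFT shown here.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the (HTK-style) mel scale."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build triangular filters as a list of n_mels rows of n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    # n_mels + 2 edge frequencies, evenly spaced on the mel scale up to Nyquist.
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive O(n^2) DFT (demo only)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_spectrogram(samples, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each a list of n_mels log-mel energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(samples) - n_fft + 1, hop):
        power = power_spectrum(samples[start:start + n_fft])
        mel = [sum(w * p for w, p in zip(row, power)) for row in bank]
        frames.append([math.log(v + 1e-10) for v in mel])
    return frames


# Usage: a 1 kHz test tone sampled at 8 kHz (synthetic data, not a training file).
tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(256)]
frames = log_mel_spectrogram(tone, sample_rate=8000)
print(len(frames), "frames x", len(frames[0]), "mel bands")
```

The tiny frame size keeps the naive DFT fast enough to run anywhere; speech models typically use much larger windows and more mel bands.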
wav2vec2 pretrained model link

Demo code to download and extract the model:
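
```python
import os
import audeer

# Archive containing the pretrained emotion wav2vec2 model
url = 'https://zenodo.org/record/6221127/files/w2v2-L-robust-12.6bc4a7fd-1.1.0.zip'
model_path = './model.zip'

# Download the archive to model_path
audeer.download_url(
    url,
    model_path,
    verbose=True,
)

# Extract the archive into the current directory
audeer.extract_archive(
    model_path,
    ".",
    verbose=True,
)
```

All code and the pretrained emotion wav2vec2 model are from:

The emotion vector has three dimensions: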
- valence (the pleasantness of a stimulus)
- arousal (the intensity of emotion provoked by a stimulus)
- dominance (the degree of control exerted by a stimulus)
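
As an illustration of how a target emotion vector built from these three dimensions might be represented and compared against a model's prediction, here is a small sketch. The `EmotionTarget` class and `emotion_mse` helper are hypothetical names, not part of this repository. The 0 to 1 value range, the arousal/dominance/valence ordering, and the use of mean squared error as the supervision signal are assumptions; check them against the model you actually download and against train.py.

```python
from dataclasses import dataclass

# Assumed ordering of the model's outputs; verify against the downloaded model.
DEFAULT_ORDER = ("arousal", "dominance", "valence")


@dataclass(frozen=True)
class EmotionTarget:
    """Hypothetical container for one target emotion vector."""
    valence: float
    arousal: float
    dominance: float

    def __post_init__(self):
        # Assumption: each dimension is expressed on a 0..1 scale.
        for name in ("valence", "arousal", "dominance"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")

    def as_vector(self, order=DEFAULT_ORDER):
        """Return the dimensions as a list in the given order."""
        return [getattr(self, name) for name in order]


def emotion_mse(target, predicted, order=DEFAULT_ORDER):
    """Mean squared error between a target and a predicted emotion vector."""
    expected = target.as_vector(order)
    if len(predicted) != len(expected):
        raise ValueError("predicted vector has the wrong length")
    return sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(expected)


# Usage: a pleasant, fairly energetic, neutral-dominance target.
target = EmotionTarget(valence=0.8, arousal=0.6, dominance=0.5)
print(target.as_vector())
print(emotion_mse(target, [0.5, 0.5, 0.5]))
```

Keeping the ordering in one constant makes it harder to mix up dimensions silently when the target and the prediction come from different parts of the code.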