- get the scipy whl and the numpy blas whl
TODO:
- make init script nicer for someone else to use
- tests for learning.py (very important)
- produce a graph at the end for the test data
- Dockerise so others can develop easily (at the moment there is no practical way to test scientific Python packages on Travis unless Docker is used)
- Turn the TODO list into GitHub issues
- make the fpython stuff nicer to use (it currently lives in the bin folder of the env, and setup needs to add it to that folder)
- think of ways of making the experiments even more lightweight
- 2 methods - spectrograms and raw waveform.
- http://cs231n.stanford.edu/reports/Cs_231n_paper.pdf
- https://papers.nips.cc/paper/3674-unsupervised-feature-learning-for-audio-classification-using-convolutional-deep-belief-networks.pdf
- 20ms window size, 10ms overlap
- PCA whitening (80 components)
- http://research.microsoft.com/pubs/230136/IS140441.PDF
- http://benanne.github.io/2014/08/05/spotify-cnns.html
- used mel scale for frequency and logarithmic amplitude scaling
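
A rough sketch of the spectrogram method noted above, combining the 20 ms window / 10 ms overlap framing with mel-scaled frequency and logarithmic amplitude. It reads "10ms overlap" as a 10 ms hop between frames; the Hann window, the HTK-style mel formula, the 40 mel bands and all function names are assumptions made for illustration, not settings taken from the linked papers.

```python
import numpy as np


def hz_to_mel(hz):
    # HTK-style mel formula (assumption: other mel variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_mel_spectrogram(signal, sample_rate, window_ms=20.0, overlap_ms=10.0, n_mels=40):
    """Log-amplitude mel spectrogram of a 1-D numpy array, shape (n_frames, n_mels)."""
    win = int(round(sample_rate * window_ms / 1000.0))
    # Consecutive frames share overlap_ms of audio, so the hop is window minus overlap.
    hop = win - int(round(sample_rate * overlap_ms / 1000.0))
    n_frames = 1 + (len(signal) - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(win)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_power = power @ mel_filterbank(n_mels, win, sample_rate).T
    # Small constant avoids log(0) for empty bands.
    return np.log(mel_power + 1e-10)
```

For one second of 16 kHz audio this gives 99 frames of 40 mel bands each; the signal must be at least one window long.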
https://www.youtube.com/watch?v=qv6UVOQ0F44
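
The PCA whitening step from the notes above (80 components) could look roughly like this. It is a minimal numpy sketch: the function names, the `eps` regulariser, and the idea of feeding it rows of spectrogram features are assumptions, not details confirmed by the linked paper.

```python
import numpy as np


def fit_pca_whitening(X, n_components=80, eps=1e-5):
    """Fit PCA whitening on X of shape (n_samples, n_features).

    Returns the feature mean, the top principal directions and the
    per-component scale factors.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    # eigh returns eigenvalues in ascending order, so pick the largest ones.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    # eps (an assumed value) keeps the scale finite for tiny eigenvalues.
    scale = 1.0 / np.sqrt(eigvals[order] + eps)
    return mean, components, scale


def apply_pca_whitening(X, mean, components, scale):
    """Project onto the kept components and rescale each to unit variance."""
    return (X - mean) @ components * scale
```

Fit on the training features only, then apply the same `mean`, `components` and `scale` to the test data so both sets share one projection.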
- RML emotion database: promising, although maybe a bit small; could go well with another corpus.
- Montreal Affective Voices corpus: emotional vocalizations (laughing, crying etc.)
- RAVDESS: currently in use
- Vera am Mittag German Audio-Visual Spontaneous Speech Database.
- Estonian Emotional Speech Corpus http://peeter.eki.ee:5000/
- Buckeye corpus: long conversations in American English, not much metadata
- VoxForge: lots of recordings, none of it about the users' emotions/stress levels, so it is unlikely to have much range.
- Santa Barbara Corpus of Spoken American English: detailed, but with no stress/emotion.
- Spoken Language Corpora at the Research Center on Multilingualism: same as the UCSB corpus
- The Spoken Turkish Corpus at METU Ankara: also an ordinary corpus
- Spoken Corpus Klient with the Corp-Oral Corpus at ILTEC Lisbon
- Simmortel Speech Recognition Corpus for Indian English and Hindi