It seems that in datasets/datatset.py, AudioVisualDataset also expects "landmarks" for each video, which I guess refers to lip landmarks. However, I did not see any description of how to obtain the landmarks for the CREMA-D videos. Could you please explain how to obtain the audio encoding, how to organize the dataset folder structure, and how to include the landmarks in the training process?