Multimodal Language Analysis Datasets


Multimodal Dataset

Summary of the paper: Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph

Shortcomings of Previous Datasets

  • Diversity in the training samples:
    • Previously proposed datasets for multimodal language are generally small in size due to the difficulty of data acquisition and the cost of annotation.
    • Diversity in training samples is crucial for comprehensive multimodal language studies because the underlying distribution is complex.
  • Variety in the topics:
    • Models trained on only a few topics generalize poorly, since language and nonverbal behaviors change with the impression a topic leaves on the speaker's internal mental state.
  • Diversity of speakers:
    • Speaking styles are highly idiosyncratic.
  • Variety in annotations:
    • Having multiple labels to predict allows for studying the relations between the labels.
    • A variety of labels also enables multi-task learning, which has shown excellent performance in past research; a minimal sketch of such a setup follows this list.
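
To make the multi-task point concrete, here is a minimal PyTorch sketch: one shared encoder with a separate head per label type. The GRU encoder, the feature dimensions, and the equal loss weighting are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    """Shared encoder with one head per label type (sentiment + emotions)."""
    def __init__(self, in_dim=300, hidden=128, n_emotions=6):
        super().__init__()
        self.encoder = nn.GRU(in_dim, hidden, batch_first=True)
        self.sentiment_head = nn.Linear(hidden, 1)         # continuous sentiment
        self.emotion_head = nn.Linear(hidden, n_emotions)  # multi-label logits

    def forward(self, x):                  # x: (batch, seq_len, in_dim)
        _, h = self.encoder(x)             # h: (1, batch, hidden)
        h = h.squeeze(0)
        return self.sentiment_head(h), self.emotion_head(h)

# Joint objective: a regression loss for sentiment plus a multi-label
# classification loss for emotions, summed with equal weights here.
model = MultiTaskModel()
x = torch.randn(4, 20, 300)                      # toy batch
sent_true = torch.empty(4, 1).uniform_(-3, 3)
emo_true = torch.randint(0, 2, (4, 6)).float()
sent_pred, emo_logits = model(x)
loss = nn.functional.l1_loss(sent_pred, sent_true) \
     + nn.functional.binary_cross_entropy_with_logits(emo_logits, emo_true)
loss.backward()
```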

Dataset Comparison

CMU-MOSI 2016

  • 2,199 opinion video clips, each annotated with sentiment in the range [-3, 3]; the usual discretization of this continuous score is sketched below.
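
Papers on CMU-MOSI and CMU-MOSEI typically report both binary and 7-class accuracy derived from this continuous score. A minimal sketch of that discretization follows; the exact convention (e.g., whether 0 counts as positive, or whether neutral clips are dropped for the binary task) varies between papers, so treat the thresholds as assumptions.

```python
def sentiment_to_7class(score: float) -> int:
    """Round a continuous [-3, 3] sentiment score to one of 7 classes (0..6)."""
    clipped = max(-3.0, min(3.0, score))
    return int(round(clipped)) + 3

def sentiment_to_binary(score: float) -> int:
    """Collapse the same score to binary sentiment (1 = non-negative)."""
    return int(score >= 0)

assert sentiment_to_7class(2.4) == 5
assert sentiment_to_binary(-0.6) == 0
```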

(*) CMU-MOSEI (this paper)

  • contains 23,453 annotated video segments from 1,000 distinct speakers and 250 topics.
  • contains manual transcriptions aligned with the audio at the phoneme level.

ICT-MMMO 2013

  • consists of online social review videos annotated at the video level for sentiment.

YouTube 2011

  • contains videos from YouTube that span a wide range of product reviews and opinions.

MOUD 2013

  • consists of product review videos in Spanish. Each video consists of multiple segments, each labeled as positive, negative, or neutral in sentiment.

IEMOCAP 2008

  • consists of 151 videos of recorded dialogues, with 2 speakers per session for a total of 302 videos across the dataset.
  • each segment is annotated for 9 emotions as well as valence, arousal, and dominance.

Common Baselines

  • MFN: Memory Fusion Network (2018)
  • MARN: Multi-attention Recurrent Network (2018b)
  • TFN: Tensor Fusion Network (2017)
  • MV-LSTM: Multi-View LSTM (2016)
  • EF-LSTM: Early Fusion LSTM (2013); a sketch of this simplest baseline follows the list.
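
EF-LSTM is the simplest of these baselines: it concatenates the per-timestep features of all three modalities and runs a single LSTM over the fused sequence, while the other models replace this naive concatenation with learned fusion mechanisms. A minimal PyTorch sketch is below; the feature dimensions (GloVe-, COVAREP-, and Facet-like) are assumptions, not the exact values used in the paper's experiments.

```python
import torch
import torch.nn as nn

class EarlyFusionLSTM(nn.Module):
    """Early fusion: concatenate time-aligned modality features, then run
    one LSTM over the fused sequence and predict from the final state."""
    def __init__(self, text_dim=300, audio_dim=74, visual_dim=35, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(text_dim + audio_dim + visual_dim, hidden,
                            batch_first=True)
        self.out = nn.Linear(hidden, 1)   # e.g., sentiment regression

    def forward(self, text, audio, visual):
        # All three inputs: (batch, seq_len, dim), assumed time-aligned.
        fused = torch.cat([text, audio, visual], dim=-1)
        _, (h, _) = self.lstm(fused)      # h: (1, batch, hidden)
        return self.out(h[-1])

pred = EarlyFusionLSTM()(torch.randn(2, 30, 300),
                         torch.randn(2, 30, 74),
                         torch.randn(2, 30, 35))   # -> (2, 1)
```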
