The CG-MER Dyadic Multimodal Dataset for Spontaneous French Conversations: Annotation, Analysis and Assessment Benchmark
Article: Paper in an international or national peer-reviewed journal
Emotion recognition is crucial for enhancing human-computer interaction systems. However, the development of robust methodologies for emotion recognition in French is hindered by the scarcity of labeled, interactive multimodal datasets. In this work, we outline the acquisition and annotation procedures and provide an evaluation benchmark for the Card Game-based Multimodal Emotion Recognition (CG-MER) dataset, which we designed to capture spontaneous emotional expressions in French conversations. The dataset comprises approximately ten hours of video recordings of dyadic interactions between 20 French participants (11 male, 9 female) engaged in a card game, with natural expressions conveyed through facial cues, speech, and gestures. Unlike existing corpora, CG-MER provides refined annotations across all three modalities, enabling a detailed investigation of emotion dynamics and their associated gestures in a French-speaking context. Additionally, we establish baseline results using state-of-the-art models for each modality and propose a standardized evaluation protocol, facilitating future comparative studies on multimodal emotion recognition.