Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

ArXiv Preprint

Facial expression recognition is important for various purpose such as emotion detection, mental health analysis, and human-machine interaction. In facial expression recognition, incorporating audio information along with still images can provide a more comprehensive understanding of an expression state. This paper presents the Multi-modal facial expression recognition methods for Affective Behavior in-the-wild (ABAW) challenge at CVPR 2023. We propose a Modal Fusion Module (MFM) to fuse audio-visual information. The modalities used are image and audio, and features are extracted based on Swin Transformer to forward the MFM. Our approach also addresses imbalances in the dataset through data resampling in training dataset and leverages the rich modal in a single frame using dynmaic data sampling, leading to improved performance.

Jun-Hwa Kim, Namho Kim, Chee Sun Won

2023-03-15