In Computers in Biology and Medicine
Emotion recognition is a key component of human-computer interaction technology, for which the facial electromyogram (fEMG) is an important physiological modality. Recently, deep-learning-based emotion recognition using fEMG signals has drawn increased attention. However, the difficulty of extracting effective features and the demand for large-scale training data are two dominant factors that restrict recognition performance. In this paper, a novel spatio-temporal deep forest (STDF) model is proposed to classify three categories of discrete emotions (neutral, sadness, and fear) using multi-channel fEMG signals. The feature extraction module extracts spatio-temporal features of fEMG signals using a combination of 2D frame sequences and multi-grained scanning. Meanwhile, a cascade forest-based classifier is designed to provide optimal structures for different scales of training data by automatically adjusting the number of cascade layers. The proposed model and five comparison methods were evaluated on our in-house fEMG dataset, which included three discrete emotions and three channels of fEMG electrodes, with a total of twenty-seven subjects. Experimental results demonstrate that the proposed STDF model achieves the best recognition performance, with an average accuracy of 97.41%. Moreover, when the training data are reduced to 50%, the average accuracy of the STDF model drops by only about 5%. Our proposed model offers an effective solution for practical applications of fEMG-based emotion recognition.
Xu Muhua, Cheng Juan, Li Chang, Liu Yu, Chen Xun
2023-Feb-24
Deep forest, Emotion recognition, Multi-channel facial electromyogram (fEMG), Spatio-temporal features
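The two mechanisms named in the abstract, multi-grained scanning and a cascade whose depth adapts to the training data, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sliding_windows` and `grow_cascade`, the window size, the stride, the `patience` parameter, and the toy accuracy values are all assumptions made for illustration. The sketch assumes the general deep forest recipe, in which windows are slid over the raw input and the cascade stops growing once validation accuracy no longer improves.

```python
from typing import Callable, List, Sequence


def sliding_windows(signal: Sequence[Sequence[float]],
                    window: int,
                    stride: int = 1) -> List[List[List[float]]]:
    """Cut a (channels x samples) signal into overlapping time windows.

    Every window keeps all channels, so cross-channel (spatial) and
    temporal structure stay together. Scanning at several window sizes
    would give the "multi-grained" views (sizes here are hypothetical).
    """
    n_samples = len(signal[0])
    return [
        [list(channel[start:start + window]) for channel in signal]
        for start in range(0, n_samples - window + 1, stride)
    ]


def grow_cascade(layer_accuracy: Callable[[int], float],
                 max_layers: int = 10,
                 patience: int = 1) -> int:
    """Return how many cascade layers to keep.

    `layer_accuracy(depth)` is a hypothetical callback that trains the
    cascade up to `depth` layers and returns validation accuracy.
    Growth stops once accuracy has not improved for `patience` layers.
    """
    best_acc, best_depth = float("-inf"), 0
    for depth in range(1, max_layers + 1):
        acc = layer_accuracy(depth)
        if acc > best_acc:
            best_acc, best_depth = acc, depth
        elif depth - best_depth >= patience:
            break
    return best_depth


# Toy three-channel signal with six samples per channel (made-up values).
toy_signal = [
    [0, 1, 2, 3, 4, 5],
    [10, 11, 12, 13, 14, 15],
    [20, 21, 22, 23, 24, 25],
]
windows = sliding_windows(toy_signal, window=4, stride=1)

# Made-up validation accuracies per depth, only to exercise the stopping rule.
toy_accuracies = [0.80, 0.90, 0.89, 0.95]
kept_layers = grow_cascade(lambda depth: toy_accuracies[depth - 1], max_layers=4)
```

In a real pipeline the callback would train forests on the scanned features. The stopping rule is what lets the depth of the model follow the amount of training data instead of being fixed in advance.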