In Circuits, Systems, and Signal Processing
This paper presents a deep learning-based analysis and classification of cold speech, that is, speech produced by a person who has the common cold. The common cold is a viral infectious disease that affects the throat and the nose. Since speech is produced when the vocal tract linearly filters the excitation source signal, its attributes are altered when a cold affects the throat and the nose. The proposed study attempts to develop a deep learning-based classification model that can accurately predict from a person's speech whether or not they have a cold. Cold-related information is captured from the speech signal using Mel-frequency cepstral coefficients (MFCC) and linear predictive coding (LPC). Data imbalance is handled using the SMOTE-Tomek links sampling strategy. A deep learning-based model is then trained on the MFCC and LPC features and used to categorize cold speech. The performance of the deep learning-based method is compared to logistic regression, random forest, and gradient boosted tree classifiers. The proposed model is less complex and uses a smaller feature set while giving results comparable to other state-of-the-art methods. The proposed method gives an unweighted average recall (UAR) of 67.71%, higher than the benchmark OpenSMILE SVM result of 64%. If successful, the study will yield a noninvasive method for cold detection, which can be further extended to detect other speech-affecting pathologies.
Deb Suman, Warule Pankaj, Nair Amrita, Sultan Haider, Dash Rahul, Krajewski Jarek
2022-Oct-03
Cold speech, Deep neural network, Gradient boosted trees, LPC, MFCC, Random forest
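
The abstract reports its results as UAR on an imbalanced dataset. As a minimal sketch of how that metric is commonly computed (the mean of per-class recalls, with every class weighted equally), the following may help. The function name and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls, weighting each class equally."""
    totals = defaultdict(int)  # number of true samples per class
    hits = defaultdict(int)    # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (1 = cold, 0 = healthy); not data from the paper.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # a classifier that always says "healthy"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, UAR={uar:.2f}")
```

In this toy case the majority-class predictor reaches high accuracy while its UAR stays at chance level, which is why UAR is a natural choice when one class is much rarer than the other.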