In Frontiers in Public Health
Machine learning is about finding patterns and making predictions from raw data. In this study, we aimed to achieve two goals by utilizing the modern logistic regression model as both a statistical tool and a classifier. First, we analyzed the associations between Major Depressive Episode with Severe Impairment (MDESI) in adolescents and a list of broadly defined sociodemographic characteristics. Second, and ultimately, we used the findings from the logistic model to identify potential MDESI cases, with the logistic model serving as a classifier (i.e., a predictive mechanism). Data on adolescents aged 12-17 years who participated in the National Survey on Drug Use and Health (NSDUH), 2011-2017, were pooled and analyzed. The logistic regression model revealed that, compared with males and adolescents aged 12-13, females and those in the age groups of 14-15 and 16-17 had a higher risk of MDESI. Blacks and Asians had a lower risk of MDESI than Whites. Living in a single-parent household, having less authoritative parents, and having negative school experiences further increased adolescents' risk of having MDESI. The predictive model successfully identified 66% of the MDESI cases (recall rate) and correctly classified 72% of all cases, MDESI and MDESI-free alike (accuracy rate), in the training data set. The rates of both recall and accuracy remained about the same (66% and 72%, respectively) using the test data. Results from this study confirmed that the logistic model, when used as a classifier, can identify potential cases of MDESI in adolescents with acceptable recall and reasonable accuracy rates. The algorithmic identification of adolescents at risk for depression may improve prevention and intervention.
Chiu I-Ming, Lu Wenhua, Tian Fangming, Hart Daniel
National Survey on Drug Use and Health, logistic regression model/classifier, machine learning, major depressive episode with severe impairment, recall/accuracy rate
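The recall and accuracy rates reported in the abstract can be illustrated with a minimal sketch of a logistic model used as a classifier. Everything specific here is an assumption made for illustration and does not come from the study: the coefficients, the intercept, the 0.5 threshold, the two binary predictors, and the toy labels are all hypothetical, and the function names are invented for this example.

```python
import math


def predict_proba(features, coefficients, intercept):
    """Logistic model: P(y = 1 | x) = 1 / (1 + exp(-(b0 + b . x)))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, coefficients, intercept, threshold=0.5):
    """Turn the predicted probability into a 0/1 label (threshold is assumed)."""
    return 1 if predict_proba(features, coefficients, intercept) >= threshold else 0


def recall_and_accuracy(y_true, y_pred):
    """Recall = TP / (TP + FN); accuracy = (TP + TN) / N."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    positives = tp + fn
    recall = tp / positives if positives else float("nan")
    accuracy = (tp + tn) / len(pairs)
    return recall, accuracy


# Hypothetical coefficients and toy data, NOT estimates or records from the study.
coefficients = [1.2, 0.8]
intercept = -1.0
rows = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
y_true = [1, 1, 0, 0, 0, 1]

y_pred = [classify(row, coefficients, intercept) for row in rows]
recall, accuracy = recall_and_accuracy(y_true, y_pred)
print(f"recall={recall:.2f} accuracy={accuracy:.2f}")
```

Recall counts only the true cases that the classifier catches, while accuracy counts correct decisions over all cases, which is why the abstract reports the two rates separately.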