In The Science of the total environment
With the rapid industrialization of fast-developing countries, air pollution is rising at an alarming rate and has become a public health concern. This study examines the effect of air pollution on patients' hospital visits for respiratory diseases, particularly Acute Respiratory Infections (ARI). Data on outpatient hospital visits, air pollution and meteorological parameters were collected from March 2018 to October 2021. Eight machine learning algorithms (Random Forest, K-Nearest Neighbors regression, Linear regression, LASSO regression, Decision Tree Regressor, Support Vector Regression, XGBoost and a 5-layer Deep Neural Network) were applied to analyse daily air pollutants and outpatient visits for ARI. The models were evaluated using 5-fold cross-validation. The data were randomly divided into test and training sets in a ratio of 1:2, respectively. Among the eight models, the Random Forest model performed best, with R² = 0.606 without lag and 0.608 with a 1-day lag on ARI patients, and R² = 0.872 without lag and 0.871 with a 1-day lag on total patients. None of the eight models performed well with the lag effect on the ARI patient dataset, but all performed better on the total patient dataset. Thus, the study did not find any significant association between ARI patients and ambient air pollution, which it attributes to the intermittent availability of data during the COVID-19 period. This study gives insight into developing machine learning programs for risk prediction that can be used for predictive analytics of several other diseases apart from ARI, such as heart disease and other respiratory diseases.
Ravindra Khaiwal, Bahadur Samsher Singh, Katoch Varun, Bhardwaj Sanjeev, Kaur-Sidhu Maninder, Gupta Madhu, Mor Suman
ARI, Air pollution, Machine learning programs, Random forest regression, Risk prediction
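The evaluation steps named in the abstract (a 1-day lag, a random 1:2 test/training split, 5-fold cross-validation and the R² score) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names, the `(pollutant_reading, visit_count)` row layout, the fixed seed and the contiguous, unshuffled folds are all assumptions made for illustration, and no model is fitted here.

```python
import random


def add_lag(rows, lag):
    """Pair each day's visit count with the pollutant reading from `lag` days earlier.

    `rows` is a list of (pollutant_reading, visit_count) tuples in date order
    (a hypothetical layout, not the study's actual data format).
    """
    if lag == 0:
        return list(rows)
    return [(rows[i - lag][0], rows[i][1]) for i in range(lag, len(rows))]


def split_test_train(n, test_fraction=1 / 3, seed=0):
    """Randomly split indices 0..n-1 into test and training sets (1:2 by default)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed is an assumption, for repeatability
    n_test = round(n * test_fraction)
    return idx[:n_test], idx[n_test:]


def k_fold_indices(n, k=5):
    """Yield (train, validation) index lists for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, validation
        start += size


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In a workflow of this kind, a regressor would be fitted on each training fold and `r2_score` computed on the matching validation fold, once for the unlagged rows and once for `add_lag(rows, 1)`.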