In Marine Pollution Bulletin
Dissolved oxygen (DO) is an important indicator that environmental engineers and ecological scientists use to assess river health. This study evaluates the reliability of four feature selection algorithms, namely Boruta, genetic algorithm (GA), multivariate adaptive regression splines (MARS), and extreme gradient boosting (XGBoost), for selecting the best-suited predictors from the applied water quality (WQ) parameters. It also compares four tree-based predictive models, namely random forest (RF), conditional random forests (cForest), RANdom forest GEneRator (Ranger), and XGBoost, for predicting changes in DO in the Klang River, Malaysia. The full feature set comprises 15 WQ parameters from monitoring-site data and 7 hydrological components from remote sensing data. According to the applied statistical evaluators, all predictive models performed well with the features selected by the XGBoost and MARS algorithms. Among all applied predictive models, the XGBoost model performed best when the features were selected by the MARS and XGBoost algorithms, with coefficient of determination (R²) values of 0.84 and 0.85, respectively, whereas the Boruta-XGBoost model showed only marginal performance in this scenario.
Tiyasha Tiyasha, Tung Tran Minh, Bhagat Suraj Kumar, Tan Mou Leong, Jawad Ali H, Mohtar Wan Hanna Melini Wan, Yaseen Zaher Mundher
Artificial intelligence, Dissolved oxygen, Feature selection, Remote sensing data, Surface water quality
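
The abstract compares models by their coefficient of determination (R²). As a minimal sketch of how such a score can be computed, the Python function below uses the common definition R² = 1 - SS_res / SS_tot. The abstract does not say which R² formulation the authors used, so this definition is an assumption. The function name `r_squared` and the DO values are hypothetical, chosen only for illustration, and are not taken from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) < 2:
        raise ValueError("at least two observations are required")

    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: squared errors of the model predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviations of observations from their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical DO values in mg/L, for illustration only (not study data).
observed_do = [5.0, 6.0, 7.0, 8.0]
predicted_do = [5.5, 6.0, 6.5, 8.0]
print(round(r_squared(observed_do, predicted_do), 3))
```

With this definition, a score of 1 means the predictions match the observations exactly, and a score near 0 means the model does no better than predicting the observed mean.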