In Journal of Contaminant Hydrology
Predicting in-stream water quality is necessary to support decisions on protecting healthy waterbodies and restoring impaired ones. Data-driven modeling is an efficient technique for supporting such efforts. Our objective was to determine whether in-stream concentrations of contaminants, namely nutrients (total phosphorus [TP] and total nitrogen [TN]), total suspended solids (TSS), dissolved oxygen (DO), and fecal coliform bacteria (FC), can be predicted satisfactorily using machine learning (ML) algorithms based on publicly available datasets. To achieve this objective, we evaluated four modeling scenarios that differ in their required inputs: publicly available datasets (e.g., land use/land cover), antecedent conditions, and additional in-stream water quality observations (e.g., pH and turbidity). We implemented five ML algorithms (Support Vector Machines, Random Forest [RF], eXtreme Gradient Boost [XGB], ensemble RF-XGB, and Artificial Neural Network [ANN]) and demonstrated our modeling framework on an inland stream, Bullfrog Creek, located near Tampa, Florida. The results showed that, while including additional water quality drivers improved overall model performance for all target constituents, TP, TN, DO, and TSS could still be predicted satisfactorily using only publicly available datasets (Nash-Sutcliffe efficiency [NSE] > 0.75 and percent bias [PBIAS] < 10%), whereas FC could not (NSE < 0.49 and PBIAS > 25%). Additionally, antecedent conditions slightly improved predictions and reduced predictive uncertainty, particularly when paired with other water quality observations (6.9% increase in NSE for FC, and 2.7% for TP, TN, DO, and TSS). Model performance for all water quality constituents was also comparable between wet and dry seasons, suggesting minimal season-dependence of the predictions (< 4% difference in NSE and < 10% difference in PBIAS).
Our developed modeling framework is generic and can serve as a complementary tool for monitoring and predicting in-stream water quality constituents.
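The abstract names the two performance metrics, NSE and PBIAS, without giving their formulas. The sketch below computes them using their commonly cited definitions. It is not taken from the paper: the function names `nse` and `pbias`, the sign convention for PBIAS (positive when the model underestimates), and the sample values are all assumptions made for illustration.

```python
from statistics import fmean


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = fmean(observed)
    # Sum of squared model errors.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def pbias(observed, simulated):
    """Percent bias, assuming the convention 100 * sum(obs - sim) / sum(obs)."""
    total_error = sum(o - s for o, s in zip(observed, simulated))
    return 100.0 * total_error / sum(observed)


# Hypothetical concentrations (illustrative only, not data from the study).
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.1, 1.9, 3.2, 3.9]

print(f"NSE = {nse(obs, sim):.3f}, PBIAS = {pbias(obs, sim):.2f}%")
```

Under these definitions, thresholds such as NSE > 0.75 and |PBIAS| < 10% can be checked directly against the returned values. The PBIAS thresholds in the abstract are presumably magnitudes, which is why the absolute value is used here.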
Adedeji Itunu C, Ahmadisharaf Ebrahim, Sun Yanshuo
In-stream water quality, Machine learning, Seasonality, Uncertainty quantification