In Microbiology spectrum
The use of water contaminated with Salmonella for produce production contributes to foodborne disease burden. To reduce human health risks, there is a need for novel, targeted approaches for assessing the pathogen status of agricultural water. We investigated the utility of water microbiome data for predicting Salmonella contamination of streams used to source water for produce production. Grab samples were collected from 60 New York streams in 2018 and tested for Salmonella. Separately, DNA was extracted from the samples and used for Illumina shotgun metagenomic sequencing. Reads were trimmed and used to assign taxonomy with Kraken2. Conditional forest (CF), regularized random forest (RRF), and support vector machine (SVM) models were implemented to predict Salmonella contamination. Model performance was assessed using 10-fold cross-validation repeated 10 times to quantify area under the curve (AUC) and Kappa score. CF models outperformed the other two algorithms based on AUC (0.86, CF; 0.81, RRF; 0.65, SVM) and Kappa score (0.53, CF; 0.41, RRF; 0.12, SVM). The taxa that were most informative for accurately predicting Salmonella contamination based on CF were compared to taxa identified by ALDEx2 as being differentially abundant between Salmonella-positive and -negative samples. CF and differential abundance tests both identified Aeromonas salmonicida (variable importance [VI] = 0.012) and Aeromonas sp. strain CA23 (VI = 0.025) as the two most informative taxa for predicting Salmonella contamination. Our findings suggest that microbiome-based models may provide an alternative to or complement existing water monitoring strategies. Similarly, the informative taxa identified in this study warrant further investigation as potential indicators of Salmonella contamination of agricultural water. IMPORTANCE Understanding the associations between surface water microbiome composition and the presence of foodborne pathogens, such as Salmonella, can facilitate the identification of novel indicators of Salmonella contamination. This study assessed the utility of microbiome data and three machine learning algorithms for predicting Salmonella contamination of Northeastern streams. The research reported here both expanded the knowledge on the microbiome composition of surface waters and identified putative novel indicators (i.e., Aeromonas species) for Salmonella in Northeastern streams. These putative indicators warrant further research to assess whether they are consistent indicators of Salmonella contamination across regions, waterways, and years not represented in the data set used in this study. Validated indicators identified using microbiome data may be used as targets in the development of rapid (e.g., PCR-based) detection assays for the assessment of microbial safety of agricultural surface waters.
Chung Taejung, Yan Runan, Weller Daniel L, Kovac Jasna
Salmonella, indicators, machine learning, microbiome, surface water, water safety