
Radiology

Predicting Hypoperfusion Lesion and Target Mismatch in Stroke from Diffusion-weighted MRI Using Deep Learning.

In Radiology ; h5-index 91.0

BACKGROUND : Perfusion imaging is important to identify a target mismatch in stroke but requires contrast agents and postprocessing software.

PURPOSE : To use a deep learning model to predict the hypoperfusion lesion in stroke and identify patients with a target mismatch profile from diffusion-weighted imaging (DWI) and clinical information alone, using perfusion MRI as the reference standard.

MATERIALS AND METHODS : Imaging data sets of patients with acute ischemic stroke with baseline perfusion MRI and DWI were retrospectively reviewed from multicenter data available from 2008 to 2019 (Imaging Collaterals in Acute Stroke, Diffusion and Perfusion Imaging Evaluation for Understanding Stroke Evolution 2, and University of California, Los Angeles stroke registry). For perfusion MRI, rapid processing of perfusion and diffusion software automatically segmented the hypoperfusion lesion (time to maximum, ≥6 seconds) and ischemic core (apparent diffusion coefficient [ADC], ≤620 × 10⁻⁶ mm²/sec). A three-dimensional U-Net deep learning model was trained using baseline DWI, ADC, National Institutes of Health Stroke Scale score, and stroke symptom sidedness as inputs, with the union of hypoperfusion and ischemic core segmentation serving as the ground truth. Model performance was evaluated using the Dice score coefficient (DSC). Target mismatch classification based on the model was compared with that of the clinical-DWI mismatch approach defined by the DAWN trial by using the McNemar test.

RESULTS : Overall, 413 patients (mean age, 67 years ± 15 [SD]; 207 men) were included for model development and primary analysis using fivefold cross-validation (247, 83, and 83 patients in the training, validation, and test sets, respectively, for each fold). The model predicted the hypoperfusion lesion with a median DSC of 0.61 (IQR, 0.45-0.71). The model identified patients with target mismatch with a sensitivity of 90% (254 of 283; 95% CI: 86, 93) and specificity of 77% (100 of 130; 95% CI: 69, 83) compared with the clinical-DWI mismatch sensitivity of 50% (140 of 281; 95% CI: 44, 56) and specificity of 89% (116 of 130; 95% CI: 83, 94) (P < .001 for all).

CONCLUSION : A three-dimensional U-Net deep learning model predicted the hypoperfusion lesion from diffusion-weighted imaging (DWI) and clinical information and identified patients with a target mismatch profile with higher sensitivity than the clinical-DWI mismatch approach.

ClinicalTrials.gov registration nos. NCT02225730, NCT01349946, NCT02586415. © RSNA, 2022. Online supplemental material is available for this article. See also the editorial by Kallmes and Rabinstein in this issue.
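As a rough illustration of the two kinds of metric reported above, the sketch below computes a Dice score coefficient for a pair of toy voxel masks and recomputes the model's sensitivity and specificity from the counts given in the abstract. The function names and the toy masks are hypothetical and are not taken from the study's code.

```python
def dice_score(pred, truth):
    """Dice score coefficient between two sets of voxel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def sensitivity(tp, fn):
    """Fraction of true positives among all actual positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp)


# Hypothetical toy masks (voxel coordinates), for illustration only.
predicted = {(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)}
reference = {(0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0)}
toy_dsc = dice_score(predicted, reference)  # 2 * 3 / (4 + 5)

# Counts reported in the abstract for the deep learning model.
model_sens = sensitivity(tp=254, fn=283 - 254)
model_spec = specificity(tn=100, fp=130 - 100)

print(f"toy DSC = {toy_dsc:.3f}")
print(f"sensitivity = {model_sens:.1%}, specificity = {model_spec:.1%}")
```

Representing masks as coordinate sets keeps the example dependency-free; real segmentation volumes would normally be handled as arrays.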

Yu Yannan, Christensen Soren, Ouyang Jiahong, Scalzo Fabien, Liebeskind David S, Lansberg Maarten G, Albers Gregory W, Zaharchuk Greg

2022-Dec-06

General

Know Before You Go: Data-Driven Beach Water Quality Forecasting.

In Environmental science & technology ; h5-index 132.0

Forecasting environmental hazards is critical in preventing or building resilience to their impacts on human communities and ecosystems. Environmental data science is an emerging field that can be harnessed for forecasting, yet more work is needed to develop methodologies that can leverage increasingly large and complex data sets for decision support. Here, we design a data-driven framework that can, for the first time, forecast bacterial standard exceedances at marine beaches with 3 days lead time. Using historical data sets collected at two California sites, we train nearly 400 forecast models using statistical and machine learning techniques and test forecasts against predictions from both a naive "persistence" model and a baseline nowcast model. Overall, forecast models are found to have similar sensitivities and specificities to the persistence model, but significantly higher areas under the ROC curve (a metric distinguishing a model's ability to effectively parse classes across decision thresholds), suggesting that forecasts can provide enhanced information beyond past observations alone. Forecast model performance at all lead times was similar to that of nowcast models. Together, results suggest that integrating the forecasting framework developed in this study into beach management programs can enable better public notification and aid in proactive pollution and health risk management.
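The distinction drawn above between threshold metrics and the area under the ROC curve can be sketched with a small example. The labels, persistence values, and forecast probabilities below are made up for illustration and do not come from the study; the pairwise AUC function is a generic formulation, not the authors' implementation.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = bacterial standard exceeded on that day.
observed = [0, 0, 0, 0, 1, 1, 0, 1]
# Persistence model: repeat the previous observation (binary output only).
persistence = [0, 0, 1, 0, 1, 0, 0, 1]
# Forecast model: a probability of exceedance for each day.
forecast = [0.1, 0.2, 0.6, 0.3, 0.9, 0.4, 0.2, 0.7]

# At a 0.5 threshold the forecast makes the same calls as persistence,
# so sensitivity and specificity are identical for the two models.
forecast_calls = [1 if p >= 0.5 else 0 for p in forecast]

auc_persistence = roc_auc(observed, persistence)
auc_forecast = roc_auc(observed, forecast)
print(f"AUC persistence = {auc_persistence:.3f}, AUC forecast = {auc_forecast:.3f}")
```

In this toy case the graded forecast scores rank the exceedance days better than the binary persistence output does, which is the kind of extra information the AUC captures.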

Searcy Ryan T, Boehm Alexandria B

2022-Dec-06

data-driven models, machine learning, water quality forecasting

General

Large Data Set-Driven Machine Learning Models for Accurate Prediction of the Thermoelectric Figure of Merit.

In ACS applied materials & interfaces ; h5-index 147.0

The figure of merit (zT) is a key parameter for measuring the performance of thermoelectric materials. The prediction of zT values via machine learning has emerged as a promising method for exploring high-performance materials. However, machine learning-based predictions still suffer from unsatisfactory accuracy, which is related to the size of the data set, the hyperparameters of the models, and the quality of the data. In this work, 5038 data entries for thermoelectric materials were selected, and several regression models were generated to predict zT values. The large data set-driven light gradient boosting (LGB) model with 57 features achieved excellent accuracy, with a coefficient of determination (R²) of 0.959, a root mean squared error (RMSE) of 0.094, a mean absolute error (MAE) of 0.057, and a correlation coefficient (R) of 0.979. Owing to the large size of the data set, the prediction accuracy exceeds that of most reported zT predictions via machine learning. The "ME Lattice Parameter" was verified as the most important feature in the zT prediction. Furthermore, nine potential candidates were screened out from among one million data entries. This study addresses the problem of data set size, adjusts the hyperparameters of the models, uses feature engineering to improve data quality, and provides an efficient strategy for wide-ranging screening of promising materials.
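For readers less familiar with the four regression metrics quoted above, here is a minimal sketch of how they are commonly defined. The zT values in the example are invented for illustration, and the function name is hypothetical; the study's own evaluation code may differ.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE, and the Pearson correlation coefficient R."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)             # total sum of squares
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "r": cov / math.sqrt(ss_tot * ss_pred),
    }


# Hypothetical measured and predicted zT values.
zt_measured = [0.2, 0.5, 0.9, 1.4]
zt_predicted = [0.3, 0.4, 1.0, 1.3]
metrics = regression_metrics(zt_measured, zt_predicted)
print({name: round(value, 3) for name, value in metrics.items()})
```

Note that R² and R measure different things: R² penalizes any systematic offset between prediction and measurement, whereas R only reflects how linearly the two are related.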

Li Yi, Zhang Jingzi, Zhang Ke, Zhao Mengkun, Hu Kailong, Lin Xi

2022-Dec-06

data-driven, machine learning, the figure of merit, thermoelectric materials, zT prediction

General

Structural Analysis and Prediction of Hematotoxicity Using Deep Learning Approaches.

In Journal of chemical information and modeling

Hematotoxicity has become a serious but overlooked toxicity in drug discovery. However, only a few in silico models have been reported for the prediction of hematotoxicity. In this study, we constructed a high-quality dataset comprising 759 hematotoxic compounds and 1623 nonhematotoxic compounds and then established a series of classification models based on a combination of seven machine learning (ML) algorithms and nine molecular representations. The results based on two data partitioning strategies and applicability domain (AD) analysis illustrate that the best prediction model, based on Attentive FP, yielded a balanced accuracy (BA) of 72.6% and an area under the receiver operating characteristic curve (AUC) of 76.8% for the validation set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition, compared with existing filtering rules and models, our model achieved the highest BA value of 67.5% for the external validation set. Additionally, the Shapley additive explanations (SHAP) and atom heatmap approaches were utilized to discover the important features and structural fragments related to hematotoxicity, which could offer helpful guidance for detecting undesired positive substances. Furthermore, matched molecular pair analysis (MMPA) and a representative substructure derivation technique were employed to further characterize and investigate the transformation principles and distinctive structural features of hematotoxic chemicals. We believe that the novel graph-based deep learning algorithms and insightful interpretation presented in this study can be used as a trustworthy and effective tool to assess hematotoxicity in the development of new drugs.
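Balanced accuracy is a natural choice for a data set with roughly twice as many negatives as positives. The sketch below uses the class sizes reported above (759 hematotoxic, 1623 nonhematotoxic) and a hypothetical trivial classifier that labels every compound nonhematotoxic; the function names are illustrative only.

```python
def accuracy(tp, fn, tn, fp):
    """Plain accuracy: fraction of all compounds classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity, so each class counts equally."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Hypothetical trivial classifier: predicts "nonhematotoxic" for everything.
# Class sizes are the ones reported for the full data set.
tp, fn = 0, 759    # all hematotoxic compounds are missed
tn, fp = 1623, 0   # all nonhematotoxic compounds are correct

trivial_acc = accuracy(tp, fn, tn, fp)
trivial_ba = balanced_accuracy(tp, fn, tn, fp)
print(f"accuracy = {trivial_acc:.3f}, balanced accuracy = {trivial_ba:.3f}")
```

The trivial classifier looks passable on plain accuracy purely because of class imbalance, while its balanced accuracy sits at chance level, which is why BA values such as those reported above are the more informative figure here.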

Long Teng-Zhi, Shi Shao-Hua, Liu Shao, Lu Ai-Ping, Liu Zhao-Qian, Li Min, Hou Ting-Jun, Cao Dong-Sheng

2022-Dec-06

Pathology

Machine Learning To Stratify Methicillin-Resistant Staphylococcus aureus Risk among Hospitalized Patients with Community-Acquired Pneumonia.

In Antimicrobial agents and chemotherapy ; h5-index 79.0

Methicillin-resistant Staphylococcus aureus (MRSA) is an uncommon but serious cause of community-acquired pneumonia (CAP). A lack of validated MRSA CAP risk factors can result in overuse of empirical broad-spectrum antibiotics. We sought to develop robust machine learning models predicting the risk of MRSA CAP in a population-based sample of hospitalized patients with CAP admitted to either a tertiary academic center or a community teaching hospital. Cases were CAP patients with MRSA isolated from blood or respiratory cultures within 72 h of admission; controls did not have MRSA CAP. The Classification Tree Analysis algorithm was used for model development. Model predictions were evaluated in sensitivity analyses. A total of 21 of 1,823 patients (1.2%) developed MRSA within 72 h of admission. MRSA risk was higher among patients admitted to the intensive care unit (ICU) in the first 24 h who required mechanical ventilation than among ICU patients who did not require ventilatory support (odds ratio [OR], 8.3; 95% confidence interval [CI], 2.4 to 32). MRSA risk was lower among patients admitted to ward units than among those admitted to the ICU (OR, 0.21; 95% CI, 0.07 to 0.56) and lower among ICU patients without a history of antibiotic use in the last 90 days than among ICU patients with antibiotic use in the last 90 days (OR, 0.03; 95% CI, 0.002 to 0.59). The final machine learning model was highly accurate (receiver operating characteristic [ROC] area = 0.775) in training and jackknife validity analyses. We identified a relatively simple machine learning model that predicted MRSA risk in hospitalized patients with CAP within 72 h postadmission.
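The odds ratios above compare MRSA risk between two patient groups. As a minimal sketch of how such a figure is derived from a 2×2 table, the example below uses invented counts (the abstract does not report the underlying tables) and a Wald-type confidence interval on the log odds ratio, which is an assumption and may not be the interval method the authors used.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: exposed cases      b: exposed non-cases
    c: unexposed cases    d: unexposed non-cases
    All four cells must be non-zero for this approximation.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald (Woolf) approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


# Hypothetical counts, for illustration only.
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(a=12, b=48, c=6, d=192)
print(f"OR = {odds_ratio:.1f} (95% CI, {ci_low:.1f} to {ci_high:.1f})")
```

With very small cell counts, as is likely with only 21 MRSA cases, this approximation becomes unreliable and exact methods are usually preferred.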

Rhodes Nathaniel J, Rohani Roxane, Yarnold Paul R, Pawlowski Anna E, Malczynski Michael, Qi Chao, Sutton Sarah H, Zembower Teresa R, Wunderink Richard G

2022-Dec-06

MRSA infection, antibiotic stewardship, community-acquired pneumonia, machine learning, predictive model

General

Development of a Peripheral Blood Transcriptomic Gene Signature to Predict Bronchopulmonary Dysplasia.

In American journal of physiology. Lung cellular and molecular physiology

BACKGROUND : Bronchopulmonary dysplasia (BPD) is the most common lung disease of extreme prematurity, yet mechanisms that associate with or identify neonates with increased susceptibility for BPD are largely unknown. Combining artificial intelligence with gene expression data is a novel approach that may assist in better understanding mechanisms underpinning BPD.

OBJECTIVE : Develop an early peripheral blood transcriptomic signature that can predict preterm neonates at risk for developing BPD.

METHODS : Secondary analysis of whole blood microarray data from 97 very low birth weight neonates on day of life 5 was performed. BPD was defined as positive pressure ventilation or oxygen requirement at 28 days of age. Participants were randomly assigned to a training (70%) or testing (30%) cohort. Four gene-centric machine learning models were built, and their discriminatory abilities were compared to those of gestational age or birthweight.

RESULTS : Neonates with BPD (n=62) exhibited a lower median gestational age (26.0 weeks vs. 30.0 weeks, p<0.01) and birthweight (800 grams vs. 1,280 grams, p<0.01) compared to non-BPD neonates. From an initial pool (33,252 genes/patient), 4,523 genes exhibited a false discovery rate (FDR) <1%. The areas under the receiver operating characteristic curve (AUCs) for predicting BPD utilizing gestational age or birthweight were 87.8% and 87.2%, respectively. The machine learning models yielded AUCs ranging between 85.8% and 96.1%. Pathways integral to T cell development and differentiation were most associated with BPD.

CONCLUSIONS : A derived 5-gene whole blood signature can accurately predict BPD in the first week of life.
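The gene filtering step in the results relies on a false discovery rate cutoff of 1%. The abstract does not name the FDR procedure, so the sketch below assumes the widely used Benjamini-Hochberg step-up method; the p-values and the function name are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return indices of tests kept under the Benjamini-Hochberg procedure.

    Sort p-values ascending, find the largest rank k with
    p_(k) <= (k / m) * fdr, and keep every test ranked at or below k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * fdr:
            cutoff_rank = rank  # keep updating: the largest passing rank wins
    return sorted(order[:cutoff_rank])


# Hypothetical per-gene p-values (one per gene), for illustration only.
gene_p_values = [0.0001, 0.03, 0.0015, 0.2, 0.004, 0.9]
selected_genes = benjamini_hochberg(gene_p_values, fdr=0.01)
print("genes passing FDR < 1%:", selected_genes)
```

Applied to tens of thousands of genes, a filter of this kind limits the expected share of false positives among the retained genes, at the cost of discarding genes with weaker evidence.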

Moreira Alvaro, Tovar Miriam, Smith Alisha M, Lee Grace C, Meunier Justin A, Cheema Zoya, Moreira Axel, Winter Caitlyn, Mustafa Shamimunisa B, Seidner Steven R, Findley Tina Oak, Garcia Joe G N, Thébaud Bernard, Kwinta Przemko, Ahuja Sunil K

2022-Dec-06

artificial intelligence, bronchopulmonary dysplasia, prediction, preterm neonate, transcriptome