Doctor Penguin

General

Determination of conditioning factors for mapping nickel contamination susceptibility in groundwater in Kanchanaburi Province, Thailand, using random forest and maximum entropy.

In Environmental geochemistry and health
Groundwater pollution from nickel (Ni) has been a severe concern in Kanchanaburi Province, Thailand. Recent assessments revealed that the Ni concentration in groundwater, particularly in urban areas, often exceeded the permissible limit. The challenge for groundwater agencies is therefore to delineate regions with high susceptibility to Ni contamination. In this study, a novel modeling approach was applied to a dataset of 117 groundwater samples collected from Kanchanaburi Province between April and July 2021. Twenty site-specific variables were initially considered as factors influencing Ni contamination. The Random Forest (RF) algorithm with the Recursive Feature Elimination (RFE) function was used to select the fourteen most influential variables. These variables were then used as input features to train a Maximum Entropy (ME) model that delineated Ni contamination susceptibility with high confidence (Area Under the Curve (AUC) validation value of 0.845). Ten input variables (altitude, geology, land use, slope, soil type, distance to industrial areas, distance to mining areas, electrical conductivity, oxidation-reduction potential, and groundwater depth) were found to best explain the spatial variation of Ni contamination in the areas of very high (95.47 km²) and high (86.65 km²) susceptibility. This study devises a novel machine learning approach to identify the conditioning factors and map Ni contamination susceptibility in groundwater, providing a baseline dataset and reliable methods for developing a sustainable groundwater management strategy.
Thanh Nguyen Ngoc, Chotpantarat Srilert, Ha Nam-Thang, Trung Nguyen H

2023-Mar-07

Groundwater, Maximum entropy, Nickel, Random forest, Recursive feature elimination, Thailand
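
The AUC reported for the nickel susceptibility model can be read as the probability that a randomly chosen contaminated sample receives a higher susceptibility score than a randomly chosen uncontaminated one. The following is a minimal sketch of that calculation, not the authors' code: the function name, labels, and scores are hypothetical and chosen only for illustration.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical validation samples: 1 = Ni above the limit, 0 = below it.
labels = [0, 0, 1, 1]
# Hypothetical susceptibility scores produced by some model.
scores = [0.10, 0.40, 0.35, 0.80]

print(auc_from_scores(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.845 should be read.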

General

Linear Binary Classifier to Predict Bacterial Biofilm Formation on Polyacrylates.

In ACS applied materials & interfaces ; h5-index 147.0
Bacterial infections are increasingly problematic due to the rise of antimicrobial resistance. Consequently, the rational design of materials naturally resistant to biofilm formation is an important strategy for preventing medical device-associated infections. Machine learning (ML) is a powerful method to find useful patterns in complex data from a wide range of fields. Recent reports showed how ML can reveal strong relationships between bacterial adhesion and the physicochemical properties of polyacrylate libraries. These studies used robust and predictive nonlinear regression methods that had better quantitative prediction power than linear models. However, as nonlinear models' feature importance is a local rather than global property, these models were hard to interpret and provided limited insight into the molecular details of material-bacteria interactions. Here, we show that combining interpretable mass spectral molecular ions and chemoinformatic descriptors with a linear binary classification model of the attachment of three common nosocomial pathogens to a library of polyacrylates can provide improved guidance for the design of more effective pathogen-resistant coatings. Relevant features from each model were analyzed and correlated with easily interpretable chemoinformatic descriptors to derive a small set of rules that give the model features tangible meaning and elucidate relationships between structure and function. The results show that the attachment of Pseudomonas aeruginosa and Staphylococcus aureus can be robustly predicted by chemoinformatic descriptors, suggesting that the obtained models can predict the attachment response to polyacrylates to identify anti-attachment materials to synthesize and test in the future.
Contreas Leonardo, Hook Andrew L, Winkler David A, Figueredo Grazziela, Williams Paul, Laughton Charles A, Alexander Morgan R, Williams Philip M

2023-Mar-07

bacterial attachment, classification, healthcare-associated infections, machine learning, polyacrylates
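
The interpretability argument in this abstract rests on the fact that a linear classifier has one weight per feature that applies to every material. The sketch below illustrates this with a small logistic regression trained by gradient descent. The abstract does not say which linear classifier was used, so logistic regression is an assumption, and the two descriptors and all data values are hypothetical.

```python
import math


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b of a logistic regression by batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of attachment
            err = p - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, xi):
    """Return 1 (high attachment) if the linear score is positive, else 0."""
    return 1 if b + sum(wj * xj for wj, xj in zip(w, xi)) > 0 else 0


# Hypothetical descriptors per polymer: (hydrophobicity, ring_content), scaled to 0..1.
X = [(1.0, 0.0), (0.9, 0.2), (0.8, 0.1), (0.2, 0.9), (0.1, 0.8), (0.0, 1.0)]
# Hypothetical labels: 1 = high bacterial attachment, 0 = low attachment.
y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(X, y)
print("weights:", w, "bias:", b)
```

In this toy data the sign of each weight states, for the whole library at once, whether a descriptor pushes toward or away from attachment, which is the kind of global reading a nonlinear model does not offer directly.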

Cardiology

A Machine Learning Approach to Developing an Accurate Prediction of Maximal Heart Rate During Exercise Testing in Apparently Healthy Adults.

In Journal of cardiopulmonary rehabilitation and prevention

PURPOSE : Maximal heart rate (HRmax) continues to be an important measure of adequate effort during an exercise test. The aim of this study was to improve the accuracy of HRmax prediction using a machine learning (ML) approach.

METHODS : We used a sample from the Fitness Registry of the Importance of Exercise National Database, which included 17 325 apparently healthy individuals (81% males) who performed a maximal cardiopulmonary exercise test. Two standard formulas for HRmax prediction were tested: Formula1 = 220 - age (yr), root-mean-squared error (RMSE) 21.9, relative root-mean-squared error (RRMSE) 1.1; and Formula2 = 209.3 - 0.72 × age (yr), RMSE 22.7 and RRMSE 1.1. For ML model prediction, we used age, weight, height, resting HR, and systolic and diastolic blood pressure. The following ML algorithms to predict HRmax were applied: lasso regression (LR), neural networks (NN), support vector machine (SVM) and random forests (RF). An evaluation was performed using cross-validation and by computing the RMSE and RRMSE, Pearson correlation, and Bland-Altman plots. The best predictive model was explained with Shapley Additive Explanations (SHAP).

RESULTS : The HRmax for the cohort was 162 ± 20 bpm. All ML models improved HRmax prediction and reduced RMSE and RRMSE compared with Formula1 (LR: 20.2%, NN: 20.4%, SVM: 22.2%, and RF: 24.7%). The predictions of all algorithms significantly correlated with HRmax (r = 0.49, 0.51, 0.54, 0.57, respectively; P < .001). Bland-Altman analysis demonstrated lower bias and 95% CI for all ML models in comparison with standard equations. The SHAP explanation showed a high impact of all selected variables.

CONCLUSIONS : Machine learning, particularly the RF model, improved prediction of HRmax using readily available measures. This approach should be considered for clinical application to refine HRmax prediction.

Cundrič Larsen, Bosnić Zoran, Kaminsky Leonard A, Myers Jonathan, Peterman James E, Markovic Vidan, Arena Ross, Popović Dejana

2023-Mar-08
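
The two standard HRmax formulas and the RMSE used to score them in the cardiology study above are simple enough to sketch directly. The ages and measured heart rates below are hypothetical and do not come from the registry; RRMSE is left out because the abstract does not define how it was normalized.

```python
import math


def hrmax_formula1(age_years):
    """Formula1 from the abstract: 220 - age (yr)."""
    return 220.0 - age_years


def hrmax_formula2(age_years):
    """Formula2 from the abstract: 209.3 - 0.72 x age (yr)."""
    return 209.3 - 0.72 * age_years


def rmse(predicted, observed):
    """Root-mean-squared error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical participants: age in years and measured HRmax in bpm.
ages = [25, 40, 55, 70]
measured = [192, 178, 170, 150]

for name, formula in (("Formula1", hrmax_formula1), ("Formula2", hrmax_formula2)):
    predictions = [formula(a) for a in ages]
    print(name, "RMSE:", round(rmse(predictions, measured), 1))
```

The same `rmse` function could score any model's predictions, which is how the ML models and the two formulas are placed on one scale.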

Radiology

Autonomous Chest Radiograph Reporting Using AI: Estimation of Clinical Impact.

In Radiology ; h5-index 91.0
Background Automated interpretation of normal chest radiographs could alleviate the workload of radiologists. However, the performance of such an artificial intelligence (AI) tool compared with clinical radiology reports has not been established. Purpose To perform an external evaluation of a commercially available AI tool for (a) the number of chest radiographs autonomously reported, (b) the sensitivity for AI detection of abnormal chest radiographs, and (c) the performance of AI compared with that of the clinical radiology reports. Materials and Methods In this retrospective study, consecutive posteroanterior chest radiographs from adult patients in four hospitals in the capital region of Denmark were obtained in January 2020, including images from emergency department patients, in-hospital patients, and outpatients. Three thoracic radiologists labeled chest radiographs in a reference standard based on chest radiograph findings into the following categories: critical, other remarkable, unremarkable, or normal (no abnormalities). AI classified chest radiographs as high confidence normal (normal) or not high confidence normal (abnormal). Results A total of 1529 patients were included for analysis (median age, 69 years [IQR, 55-69 years]; 776 women), with 1100 (72%) classified by the reference standard as having abnormal radiographs, 617 (40%) as having critical abnormal radiographs, and 429 (28%) as having normal radiographs. For comparison, clinical radiology reports were classified based on the text and insufficient reports excluded (n = 22). The sensitivity of AI was 99.1% (95% CI: 98.3, 99.6; 1090 of 1100 patients) for abnormal radiographs and 99.8% (95% CI: 99.1, 99.9; 616 of 617 patients) for critical radiographs. Corresponding sensitivities for radiologist reports were 72.3% (95% CI: 69.5, 74.9; 779 of 1078 patients) and 93.5% (95% CI: 91.2, 95.3; 558 of 597 patients), respectively. Specificity of AI, and hence the potential autonomous reporting rate, was 28.0% of all normal posteroanterior chest radiographs (95% CI: 23.8, 32.5; 120 of 429 patients), or 7.8% (120 of 1529 patients) of all posteroanterior chest radiographs. Conclusion Of all normal posteroanterior chest radiographs, 28% were autonomously reported by AI with a sensitivity for any abnormalities higher than 99%. This corresponded to 7.8% of the entire posteroanterior chest radiograph production. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Park in this issue.
Plesner Louis L, Müller Felix C, Nybing Janus D, Laustrup Lene C, Rasmussen Finn, Nielsen Olav W, Boesen Mikael, Andersen Michael B

2023-Mar-07
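
The headline percentages in the chest radiograph abstract above follow directly from the patient counts it reports. The short sketch below recomputes them; the helper name is hypothetical, and the confidence intervals are not reproduced because the abstract does not state which interval method was used.

```python
def percent(numerator, denominator):
    """Return numerator / denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


# Counts taken from the abstract.
ai_sens_abnormal = percent(1090, 1100)     # AI sensitivity, any abnormality
ai_sens_critical = percent(616, 617)       # AI sensitivity, critical findings
report_sens_abnormal = percent(779, 1078)  # radiologist reports, any abnormality
report_sens_critical = percent(558, 597)   # radiologist reports, critical findings
ai_specificity = percent(120, 429)         # normal radiographs the AI cleared
autonomous_share = percent(120, 1529)      # share of all radiographs

print(ai_sens_abnormal, ai_sens_critical)
print(report_sens_abnormal, report_sens_critical)
print(ai_specificity, autonomous_share)
```

Seen this way, the tool is tuned so that almost no abnormal radiograph is labeled normal, and the price is that only about a quarter of truly normal radiographs are cleared autonomously.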

Radiology

Deep Learning for Head and Neck CT Angiography: Stenosis and Plaque Classification.

In Radiology ; h5-index 91.0
Background Studies have rarely investigated stenosis detection from head and neck CT angiography scans because accurate interpretation is time consuming and labor intensive. Purpose To develop an automated convolutional neural network-based method for accurate stenosis detection and plaque classification in head and neck CT angiography images and compare its performance with that of radiologists. Materials and Methods A deep learning (DL) algorithm was constructed and trained with use of head and neck CT angiography images that were collected retrospectively from four tertiary hospitals between March 2020 and July 2021. CT scans were partitioned into training, validation, and independent test sets at a ratio of 7:2:1. An independent test set of CT angiography scans was collected prospectively between October 2021 and December 2021 in one of the four tertiary centers. Stenosis grade categories were as follows: mild stenosis (<50%), moderate stenosis (50%-69%), severe stenosis (70%-99%), and occlusion (100%). The stenosis diagnosis and plaque classification of the algorithm were compared with the ground truth of consensus by two radiologists (with more than 10 years of experience). The performance of the models was analyzed in terms of accuracy, sensitivity, specificity, and areas under the receiver operating characteristic curve. Results There were 3266 patients (mean age ± SD, 62 years ± 12; 2096 men) evaluated. The consistency between radiologists and the DL-assisted algorithm on plaque classification was 85.6% (320 of 374 cases [95% CI: 83.2, 88.6]) on a per-vessel basis. Moreover, the artificial intelligence model assisted in visual assessment, such as increasing confidence in the degree of stenosis. This reduced the time needed for diagnosis and report writing of radiologists from 28.8 minutes ± 5.6 to 12.4 minutes ± 2.0 (P < .001). Conclusion A deep learning algorithm for head and neck CT angiography interpretation accurately determined vessel stenosis and plaque classification and had equivalent diagnostic performance when compared with experienced radiologists. © RSNA, 2023 Supplemental material is available for this article.
Fu Fan, Shan Yi, Yang Guang, Zheng Chao, Zhang Miao, Rong Dongdong, Wang Ximing, Lu Jie

2023-Mar-07
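
The stenosis grading scheme and the 7:2:1 data partition described in the CT angiography abstract above can be written down as two small functions. This is an illustrative sketch only: the function names are hypothetical, fractional values between 69% and 70% are assumed to fall in the moderate grade, and 0% narrowing is assumed to be reported as mild because the abstract lists no separate category for it.

```python
def stenosis_grade(percent_narrowing):
    """Map a percent diameter narrowing to the grade categories in the abstract."""
    if not 0 <= percent_narrowing <= 100:
        raise ValueError("stenosis must be between 0 and 100 percent")
    if percent_narrowing == 100:
        return "occlusion"
    if percent_narrowing >= 70:
        return "severe"
    if percent_narrowing >= 50:
        return "moderate"
    return "mild"


def split_7_2_1(n_scans):
    """Return (training, validation, test) counts for a 7:2:1 partition.

    Any remainder from integer division is assigned to the test set.
    """
    training = n_scans * 7 // 10
    validation = n_scans * 2 // 10
    test = n_scans - training - validation
    return training, validation, test


for value in (30, 50, 69, 70, 99, 100):
    print(value, stenosis_grade(value))

# Hypothetical scan count, used only to show the proportions.
print(split_7_2_1(1000))
```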

General

A data set of earthquake bulletin and seismic waveforms for Ghana obtained by deep learning.

In Data in brief
The Ghana Digital Seismic Network (GHDSN), with six broadband sensors, operated in southern Ghana for two years (2012-2014). The recorded dataset was processed for simultaneous event detection and phase picking by a Deep Learning (DL) model, the EQTransformer tool. Here, the detected earthquakes are presented together with supporting data, waveforms (including P and S arrival phases), and an earthquake bulletin. The bulletin includes the 559 arrival times (292 P and 267 S phases) and waveforms of the 73 local earthquakes in SEISAN format. The supporting data encompass the preliminary crustal velocity models obtained from the joint inversion analysis of the detected hypocentral parameters. They comprise a 6-layer model of the crustal velocity (Vp and Vp/Vs ratio), the incident time sequence, a statistical analysis of the detected earthquakes, the hypocentral parameters analyzed and relocated with the updated crustal velocity model, and a graphic representation of them as a 3D live figure highlighting the seismogenic depth of the region. This dataset has a unique appeal for earth science specialists to analyze and reprocess the detected waveforms and characterize the seismogenic sources and active faults in Ghana. The metadata and waveforms have been deposited at the Mendeley Data repository [1].
Mohammadigheymasi Hamzeh, Tavakolizadeh Nasrin, Matias Luís, Mousavi S Mostafa, Moradichaloshtori Yahya, Mousavirad Seyed Jalaleddin, Fernandes Rui

2023-Apr

Deep learning, Earthquake waveforms, Live Matlab figures, Seismic catalog