Doctor Penguin

Public Health

Development and validation of risk prediction models for large for gestational age infants using logistic regression and two machine learning algorithms.

In Journal of diabetes

BACKGROUND : Large for gestational age (LGA) is one of the adverse outcomes during pregnancy that endangers the life and health of mothers and offspring. We aimed to establish prediction models for LGA at late pregnancy.

METHODS : Data were obtained from an established cohort of 1285 pregnant Chinese women. LGA was defined as a birth weight above the 90th percentile of the Chinese birth weight distribution for newborns of the same sex and gestational age. Women with gestational diabetes mellitus (GDM) were classified into three subtypes according to indexes of insulin sensitivity and insulin secretion. Models were established by logistic regression and by decision tree and random forest algorithms, and were validated on an internal validation set.

RESULTS : A total of 139 newborns were diagnosed as LGA after birth. The logistic regression model, which consisted of eight commonly used clinical indicators (including lipid profile) and GDM subtypes, had an area under the curve (AUC) of 0.760 (95% confidence interval [CI] 0.706-0.815) in the training set and 0.748 (95% CI 0.659-0.837) in the internal validation set. For the prediction models established by the two machine learning algorithms, which included all the variables, the training set and the internal validation set had AUCs of 0.813 (95% CI 0.786-0.839) and 0.779 (95% CI 0.735-0.824) for the decision tree model, and 0.854 (95% CI 0.831-0.877) and 0.808 (95% CI 0.766-0.850) for the random forest model.

CONCLUSION : We established and validated three LGA risk prediction models to identify pregnant women at high risk of LGA at the early stage of the third trimester. The models showed good predictive power and could guide early prevention strategies.

Wang Ning, Guo Haonan, Jing Yingyu, Zhang Yifan, Sun Bo, Pan Xingyan, Chen Huan, Xu Jing, Wang Mengjun, Chen Xi, Song Lin, Cui Wei

2023-Mar-08

gestational diabetes mellitus, heterogeneity, large for gestational age, lipid profile, prediction models
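The abstract above reports each model's AUC with a 95% confidence interval but does not say how the intervals were obtained. The following is a minimal sketch of one common approach, assuming a percentile bootstrap: the AUC is computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and the interval comes from resampling. The function names, the toy labels, and the toy scores are illustrative assumptions, not data or code from the study.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    auc(labels, scores)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Hypothetical toy data: 1 = LGA, 0 = not LGA; scores are predicted risks.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.9]
print(auc(labels, scores))  # 13 of 15 pairs ranked correctly, about 0.867
print(bootstrap_ci(labels, scores))
```

With only eight cases the bootstrap interval is very wide, which is the same effect that makes the validation-set intervals in the abstract wider than the training-set ones.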

General

Automatic Segmentation of the Left Atrium from Computed Tomography Angiography Images.

In Annals of biomedical engineering ; h5-index 52.0

The left atrial appendage (LAA) causes 91% of thrombi in atrial fibrillation patients, a potential harbinger of stroke. Leveraging computed tomography angiography (CTA) images, radiologists interpret the left atrium (LA) and LAA geometries to stratify stroke risk. Nevertheless, accurate LA segmentation remains a time-consuming task with high inter-observer variability. Binary masks of the LA and their corresponding CTA images were used to train and test a 3D U-Net to automate LA segmentation. One model was trained using the entire unified-image-volume, while a second model was trained on regional patch-volumes, which were run for inference and then assimilated back into the full volume. The unified-image-volume U-Net achieved median Dice similarity coefficients (DSCs) of 0.92 and 0.88 for the train and test sets, respectively; the patch-volume U-Net achieved median DSCs of 0.90 and 0.89 for the train and test sets, respectively. This indicates that the unified-image-volume and patch-volume U-Net models captured up to 88% and 89% of the LA/LAA boundary's regional complexity, respectively. Additionally, the results indicate that the LA/LAA were fully captured in most of the predicted segmentations. By automating the segmentation process, our deep learning model can expedite LA/LAA shape analysis, informing stratification of stroke risk.

Kazi Amaan, Betko Sage, Salvi Anish, Menon Prahlad G

2023-Mar-08

Deep learning, Machine learning, Medical imaging, U-Net
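The segmentation results above are reported as median DSC values. As a rough illustration of what that metric measures, here is a minimal sketch that computes the Dice similarity coefficient, 2|A∩B| / (|A| + |B|), between a predicted and a reference binary mask stored as nested lists. The helper names and the tiny 2x2x2 masks are hypothetical and are not taken from the paper.

```python
from statistics import median


def flatten(volume):
    """Flatten an arbitrarily nested list (for example a 3D mask) into a flat list of voxels."""
    if isinstance(volume, (list, tuple)):
        return [v for sub in volume for v in flatten(sub)]
    return [volume]


def dice(pred, truth):
    """Dice similarity coefficient between two binary masks of the same shape."""
    p = [bool(v) for v in flatten(pred)]
    t = [bool(v) for v in flatten(truth)]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of voxels")
    overlap = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical 2x2x2 masks: 4 voxels each, 3 of them shared.
truth = [[[0, 1], [1, 1]], [[0, 0], [1, 0]]]
pred = [[[0, 1], [1, 0]], [[0, 0], [1, 1]]]
print(dice(pred, truth))  # 2 * 3 / (4 + 4) = 0.75

# A median over per-case scores, as in the abstract's summary statistic.
print(median([dice(pred, truth), dice(truth, truth)]))  # 0.875
```

A DSC of 1.0 means the two masks coincide exactly and 0.0 means they share no voxels, so a median of 0.88 to 0.89 on the test set indicates substantial but not perfect overlap for the typical case.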

General

Diagnostic accuracy of AI in orthodontic extraction decisions: "Are we ready to let Mr. Data run our Enterprise?" A commentary on a systematic review.

In Evidence-based dentistry

OBJECTIVE : To collect evidence on the ability of artificial intelligence programs to accurately make extraction decisions in orthodontic treatment planning.

DATA SOURCES : The authors electronically searched the following databases: PubMed/MEDLINE, EMBASE, LILACS, Web of Science, Scopus, LIVIVO, Computers & Applied Science, ACM Digital Library, Compendex, Open Grey, Google Scholar, and ProQuest Dissertation and Thesis.

STUDY SELECTION : Three independent reviewers collected the following data: number of cases of extraction and non-extraction, number of experts in orthodontics and their years of experience, number of variables used in the index model test, type of artificial intelligence and algorithms, accuracy outcomes, the three highest variable ranks weighted in the computational model, and the main conclusion.

DATA EXTRACTION AND SYNTHESIS : Risk of bias was assessed using the QUADAS-2 checklist for AI, and certainty of evidence was evaluated using GRADE.

RESULTS : After 2 phases of screening by 3 independent reviewers, 6 studies met the inclusion criteria for the final review. The AI programs used by the included studies were as follows: ensemble learning/random forest, artificial neural network/multilayer perceptron, machine learning/back propagation and machine learning/feature vectors. All studies showed an unclear risk of bias for patient selection. Two studies had high risk of bias in the index test, while two others presented an unclear risk of bias in the diagnostic test. Meta-analysis of the pooled data yielded an accuracy of 0.87 across all studies.

CONCLUSIONS : The authors conclude that AI's ability to predict extractions is promising but should be interpreted with caution.

Thirumoorthy Soumya

2023-Mar-08

General

Human-machine collaboration for improving semiconductor process development.

In Nature ; h5-index 368.0

One of the bottlenecks to building semiconductor chips is the increasing cost required to develop chemical plasma processes that form the transistors and memory storage cells^1,2. These processes are still developed manually using highly trained engineers searching for a combination of tool parameters that produces an acceptable result on the silicon wafer^3. The challenge for computer algorithms is the availability of limited experimental data owing to the high cost of acquisition, making it difficult to form a predictive model with accuracy to the atomic scale. Here we study Bayesian optimization algorithms to investigate how artificial intelligence (AI) might decrease the cost of developing complex semiconductor chip processes. In particular, we create a controlled virtual process game to systematically benchmark the performance of humans and computers for the design of a semiconductor fabrication process. We find that human engineers excel in the early stages of development, whereas the algorithms are far more cost-efficient near the tight tolerances of the target. Furthermore, we show that a strategy using both human designers with high expertise and algorithms in a human first-computer last strategy can reduce the cost-to-target by half compared with only human designers. Finally, we highlight cultural challenges in partnering humans with computers that need to be addressed when introducing artificial intelligence in developing semiconductor processes.

Kanarik Keren J, Osowiecki Wojciech T, Lu Yu Joe, Talukder Dipongkar, Roschewsky Niklas, Park Sae Na, Kamon Mattan, Fried David M, Gottscho Richard A

2023-Mar-08

Pathology

Preparing pathological data to develop an artificial intelligence model in the nonclinical study.

In Scientific reports ; h5-index 158.0

Artificial intelligence (AI)-based analysis has recently been adopted in the examination of histological slides via the digitization of glass slides using a digital scanner. In this study, we examined how varying the staining color tone and magnification level of a dataset affects AI model predictions on hematoxylin and eosin stained whole slide images (WSIs). The WSIs of liver tissues with fibrosis were used as an example, and three different datasets (N20, B20, and B10) were prepared with different color tones and magnifications. Using these datasets, we built five models by training the Mask R-CNN algorithm on single or mixed datasets drawn from N20, B20, and B10. We evaluated model performance using the test sets of the three datasets. The models trained with mixed datasets (models B20/N20 and B10/B20), which combine different color tones or magnifications, performed better than the models trained on a single dataset. The superior performance of the mixed models was also evident in the actual prediction results on the test images. We suggest that training the algorithm with various staining color tones and multi-scaled image datasets would better support consistently high performance in predicting pathological lesions of interest.

Hwang Ji-Hee, Lim Minyoung, Han Gyeongjin, Park Heejin, Kim Yong-Bum, Park Jinseok, Jun Sang-Yeop, Lee Jaeku, Cho Jae-Woo

2023-Mar-08

General

Improved weed mapping in corn fields by combining UAV-based spectral, textural, structural, and thermal measurements.

In Pest management science

BACKGROUND : Spatially explicit weed information is critical for controlling weed infestation and reducing corn yield losses. The development of unmanned aerial vehicle (UAV) based remote sensing presents an unprecedented opportunity for efficient, timely weed mapping. Spectral, textural, and structural measurements have been used for weed mapping, while thermal measurements (e.g., canopy temperature, CT) have seldom been considered. In this study, we quantified the optimal combination of spectral, textural, structural, and CT measurements based on different machine learning algorithms for weed mapping.

RESULTS : (1) CT improved weed mapping accuracies as complementary information for spectral, textural, and structural features (up to 5% and 0.051 improvements in overall accuracy (OA) and Macro-F1, respectively); (2) the fusion of textural, structural, and thermal features achieved the best performance in weed mapping (OA = 96.4%, Macro-F1 = 0.964), followed by the fusion of structural and thermal features (OA = 93.6%, Macro-F1 = 0.936); (3) the Support Vector Machine-based model achieved the best performance in weed mapping, with 3.5% and 7.1% improvements in OA and 0.036 and 0.071 in Macro-F1, respectively, compared to the best models of Random Forest and Naïve Bayes Classifier.

CONCLUSION : Thermal measurement can complement other types of remote sensing measurements and improve the weed mapping accuracy within the data fusion framework. Importantly, integrating textural, structural, and thermal features achieved the best performance for weed mapping. Our study provides a novel method for weed mapping using UAV-based multi-source remote sensing measurements, which is critical for ensuring crop production in precision agriculture.

Xu Binyuan, Meng Ran, Chen Gengshen, Liang Linlin, Lv Zhengang, Zhou Longfei, Sun Rui, Zhao Feng, Yang Wanneng

2023-Mar-08

Machine learning, Multi-source remote sensing, Smart agriculture, Thermal feature, Weed mapping
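The weed mapping results above are summarized with overall accuracy (OA) and Macro-F1. The sketch below shows how those two metrics are commonly computed from predicted and reference class labels: OA is the share of correctly labeled samples, and Macro-F1 is the unweighted mean of the per-class F1 scores. The class names and labels are hypothetical, chosen only for illustration, since the abstract does not list the mapped classes.

```python
def overall_accuracy(y_true, y_pred):
    """Share of samples whose predicted class matches the reference class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so rare classes count as much as common ones."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical reference and predicted labels for six pixels.
y_true = ["weed", "weed", "weed", "corn", "corn", "soil"]
y_pred = ["weed", "weed", "corn", "corn", "corn", "soil"]
print(overall_accuracy(y_true, y_pred))  # 5 of 6 correct
print(macro_f1(y_true, y_pred))          # mean of 0.8, 0.8 and 1.0
```

Reporting both metrics is useful when classes are imbalanced, because OA can stay high even if a minority class such as weeds is mapped poorly, whereas Macro-F1 drops.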