Doctor Penguin

Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

General

Combining pairwise structural similarity and deep learning interface contact prediction to estimate protein complex model accuracy in CASP15.

In bioRxiv : the preprint server for biology
Estimating the accuracy of quaternary structural models of protein complexes and assemblies (EMA) is important for predicting quaternary structures and applying them to studying protein function and interaction. The pairwise similarity between structural models has proven useful for estimating the quality of protein tertiary structural models, but it has rarely been applied to predicting the quality of quaternary structural models. Moreover, the pairwise similarity approach often fails when many structural models are of low quality and similar to each other. To address the gap, we developed a hybrid method (MULTICOM_qa) combining a pairwise similarity score (PSS) and an interface contact probability score (ICPS) based on the deep learning inter-chain contact prediction for estimating protein complex model accuracy. It blindly participated in the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 and ranked first out of 24 predictors in estimating the global accuracy of assembly models. The average per-target correlation coefficient between the model quality scores predicted by MULTICOM_qa and the true quality scores of the models of CASP15 assembly targets is 0.66. The average per-target ranking loss in using the predicted quality scores to rank the models is 0.14. It was able to select good models for most targets. Moreover, several key factors (i.e., target difficulty, model sampling difficulty, skewness of model quality, and similarity between good/bad models) for EMA are identified and analyzed. The results demonstrate that combining the multi-model method (PSS) with the complementary single-model method (ICPS) is a promising approach to EMA. The source code of MULTICOM_qa is available at https://github.com/BioinfoMachineLearning/MULTICOM_qa .
Cheng Jianlin, Roy Raj Shekhor, Liu Jian, Giri Nabin, Guo Zhiye

2023-Mar-12
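
The two headline numbers in the abstract above (average per-target correlation and average per-target ranking loss) can be illustrated with a minimal Python sketch. It assumes the common definition of ranking loss, namely the true quality of the best available model minus the true quality of the model ranked first by the predictor; the function names and the toy scores are hypothetical and are not taken from MULTICOM_qa.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranking_loss(predicted, true):
    """True score of the best model minus true score of the top-predicted model."""
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top]


def per_target_average(targets):
    """Average correlation and ranking loss over targets.

    `targets` maps a target name to (predicted_scores, true_scores).
    """
    corrs = [pearson(p, t) for p, t in targets.values()]
    losses = [ranking_loss(p, t) for p, t in targets.values()]
    return sum(corrs) / len(corrs), sum(losses) / len(losses)


# Hypothetical toy data: two targets with three models each.
toy_targets = {
    "target_A": ([0.2, 0.5, 0.9], [0.3, 0.4, 0.8]),
    "target_B": ([0.6, 0.4, 0.7], [0.9, 0.5, 0.6]),
}
mean_corr, mean_loss = per_target_average(toy_targets)
print(f"mean correlation = {mean_corr:.3f}, mean ranking loss = {mean_loss:.3f}")
```

In the toy data, the predictor picks the best model for target_A (loss 0) but not for target_B (loss 0.3), which is why both metrics are averaged per target rather than pooled.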

General

A multiscale functional map of somatic mutations in cancer integrating protein structure and network topology.

In bioRxiv : the preprint server for biology
A major goal of cancer biology is to understand the mechanisms underlying tumorigenesis driven by somatically acquired mutations. Existing computational approaches focus on either scoring the pathogenicity of mutations or characterizing their effects at specific scales. Here, we established a unified computational framework, NetFlow3D, that systematically maps the multiscale mechanistic effects of somatic mutations in cancer. The establishment of NetFlow3D hinges upon the Human Protein Structurome, a complete repository we first compiled that incorporates the 3D structures of every single protein as well as the binding interfaces for all known PPIs in humans. The vast majority of 3D structural information was resolved by recent deep learning algorithms. By applying NetFlow3D to 415,017 somatic protein-altering mutations in 5,950 TCGA tumors across 19 cancer types, we identified 1,656 intra- and 3,343 inter-protein 3D clusters of mutations throughout the Human Protein Structurome, of which ~50% would not have been found if using only experimentally-determined protein structures. These 3D clusters have converging effects on 377 cellular subnetworks. Compared to canonical PPI network analyses, NetFlow3D achieved a 5.5-fold higher statistical power for identifying significantly dysregulated subnetworks. The majority of identified subnetworks were previously obscured by the overwhelming background noise of non-clustered passenger mutations, including portions of non-canonical PRC1, mediator complex, MCM2-7 complex, neddylation of cullins, complement system, TRiC, etc. NetFlow3D and our pan-cancer results can be accessed from http://netflow3d.yulab.org. This work shows that mapping how individual mutations act across scales requires the integration of their local spatial organization on protein structures and their global topological organization in the PPI network.
Zhang Yingying, Leung Alden K, Qiu Tian, Li Le, Zhang Junke, Wierbowski Shayne, Booth James, Yu Haiyuan

2023-Mar-07

General

Fully-automated sarcopenia assessment in head and neck cancer: development and external validation of a deep learning pipeline.

In medRxiv : the preprint server for health sciences

PURPOSE : Sarcopenia is an established prognostic factor in patients diagnosed with head and neck squamous cell carcinoma (HNSCC). The quantification of sarcopenia assessed by imaging is typically achieved through the skeletal muscle index (SMI), which can be derived from cervical neck skeletal muscle (SM) segmentation and cross-sectional area. However, manual SM segmentation is labor-intensive, prone to inter-observer variability, and impractical for large-scale clinical use. To overcome this challenge, we have developed and externally validated a fully-automated image-based deep learning (DL) platform for cervical vertebral SM segmentation and SMI calculation, and evaluated the relevance of this with survival and toxicity outcomes.

MATERIALS AND METHODS : 899 patients diagnosed as having HNSCC with CT scans from multiple institutes were included, with 335 cases utilized for training, 96 for validation, 48 for internal testing and 393 for external testing. Ground truth single-slice segmentations of SM at the C3 vertebra level were manually generated by experienced radiation oncologists. To develop an efficient method of segmenting the SM, a multi-stage DL pipeline was implemented, consisting of a 2D convolutional neural network (CNN) to select the middle slice of C3 section and a 2D U-Net to segment SM areas. The model performance was evaluated using the Dice Similarity Coefficient (DSC) as the primary metric for the internal test set, and for the external test set the quality of automated segmentation was assessed manually by two experienced radiation oncologists. The L3 skeletal muscle area (SMA) and SMI were then calculated from the C3 cross sectional area (CSA) of the auto-segmented SM. Finally, established SMI cut-offs were used to perform further analyses to assess the correlation with survival and toxicity endpoints in the external institution with univariable and multivariable Cox regression.

RESULTS : DSCs for the validation set (n = 96) and internal test set (n = 48) were 0.90 (95% CI: 0.90 - 0.91) and 0.90 (95% CI: 0.89 - 0.91), respectively. The predicted CSA is highly correlated with the ground-truth CSA in both validation (r = 0.99, p < 0.0001) and test sets (r = 0.96, p < 0.0001). In the external test set (n = 377), 96.2% of the SM segmentations were deemed acceptable by consensus expert review. Predicted SMA and SMI values were highly correlated with the ground-truth values, with Pearson r ≥ 0.99 (p < 0.0001) for both the female and male patients in all datasets. Sarcopenia was associated with worse OS (HR 2.05 [95% CI 1.04 - 4.04], p = 0.04) and longer PEG tube duration (median 162 days vs. 134 days, HR 1.51 [95% CI 1.12 - 2.08], p = 0.006) in multivariate analysis.

CONCLUSION : We developed and externally validated a fully-automated platform whose output strongly correlates with imaging-assessed sarcopenia in patients with H&N cancer and is associated with survival and toxicity outcomes. This study constitutes a significant stride towards the integration of sarcopenia assessment into decision-making for individuals diagnosed with HNSCC.

SUMMARY STATEMENT : In this study, we developed and externally validated a deep learning model to investigate the impact of sarcopenia, defined as the loss of skeletal muscle mass, on patients with head and neck squamous cell carcinoma (HNSCC) undergoing radiotherapy. We demonstrated an efficient, fully-automated deep learning pipeline that can accurately segment C3 skeletal muscle area, calculate cross-sectional area, and derive a skeletal muscle index to diagnose sarcopenia from a standard of care CT scan. In multi-institutional data, we found that pre-treatment sarcopenia was associated with significantly reduced overall survival and an increased risk of adverse events. Given the increased vulnerability of patients with HNSCC, the assessment of sarcopenia prior to radiotherapy may aid in informed treatment decision-making and serve as a predictive marker for the necessity of early supportive measures.

Ye Zezhong, Saraf Anurag, Ravipati Yashwanth, Hoebers Frank, Zha Yining, Zapaishchykova Anna, Likitlersuang Jirapat, Tishler Roy B, Schoenfeld Jonathan D, Margalit Danielle N, Haddad Robert I, Mak Raymond H, Naser Mohamed, Wahid Kareem A, Sahlsten Jaakko, Jaskari Joel, Kaski Kimmo, Mäkitie Antti A, Fuller Clifton D, Aerts Hugo J W L, Kann Benjamin H

2023-Mar-06
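
The Dice Similarity Coefficient used as the primary metric in the abstract above is defined as twice the overlap of two masks divided by the sum of their sizes. The following is a minimal Python sketch of that definition, together with a cross-sectional area computed from a pixel count; the function names, the toy masks, and the isotropic pixel spacing are assumptions made for illustration and do not come from the authors' pipeline.

```python
def dice(mask_a, mask_b):
    """Dice Similarity Coefficient between two binary 2D masks (lists of rows)."""
    a = [bool(v) for row in mask_a for v in row]
    b = [bool(v) for row in mask_b for v in row]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2 * intersection / total


def cross_sectional_area_cm2(mask, pixel_spacing_mm):
    """Area of the segmented region, assuming square pixels of the given size."""
    n_pixels = sum(1 for row in mask for v in row if v)
    return n_pixels * pixel_spacing_mm ** 2 / 100.0  # 100 mm^2 per cm^2


# Hypothetical 2x3 masks standing in for a predicted and a manual segmentation.
predicted = [[0, 1, 1],
             [0, 1, 0]]
manual = [[0, 1, 0],
          [0, 1, 1]]
print(f"DSC = {dice(predicted, manual):.3f}")
print(f"CSA = {cross_sectional_area_cm2(predicted, 1.0):.2f} cm^2")
```

Here the two masks share 2 of their 3 foreground pixels each, so the DSC is 2 * 2 / (3 + 3). Converting the C3 area to an L3 estimate and then to SMI requires the published conversion formula and patient height, which this sketch deliberately leaves out.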

General

Microbiome Preterm Birth DREAM Challenge: Crowdsourcing Machine Learning Approaches to Advance Preterm Birth Research.

In medRxiv : the preprint server for health sciences
Globally, every year about 11% of infants are born preterm, defined as a birth prior to 37 weeks of gestation, with significant and lingering health consequences. Multiple studies have related the vaginal microbiome to preterm birth. We present a crowdsourcing approach to predict: (a) preterm or (b) early preterm birth from 9 publicly available vaginal microbiome studies representing 3,578 samples from 1,268 pregnant individuals, aggregated from raw sequences via an open-source tool, MaLiAmPi. We validated the crowdsourced models on novel datasets representing 331 samples from 148 pregnant individuals. From 318 DREAM challenge participants we received 148 and 121 submissions for our two separate prediction sub-challenges with top-ranking submissions achieving bootstrapped AUROC scores of 0.69 and 0.87, respectively. Alpha diversity, VALENCIA community state types, and composition (via phylotype relative abundance) were important features in the top performing models, most of which were tree based methods. This work serves as the foundation for subsequent efforts to translate predictive tests into clinical practice, and to better understand and prevent preterm birth.
Golob Jonathan L, Oskotsky Tomiko T, Tang Alice S, Roldan Alennie, Chung Verena, Ha Connie W Y, Wong Ronald J, Flynn Kaitlin J, Parraga-Leo Antonio, Wibrand Camilla, Minot Samuel S, Andreoletti Gaia, Kosti Idit, Bletz Julie, Nelson Amber, Gao Jifan, Wei Zhoujingpeng, Chen Guanhua, Tang Zheng-Zheng, Novielli Pierfrancesco, Romano Donato, Pantaleo Ester, Amoroso Nicola, Monaco Alfonso, Vacca Mirco, Angelis Maria De, Bellotti Roberto, Tangaro Sabina, Kuntzleman Abigail, Bigcraft Isaac, Techtmann Stephen, Bae Daehun, Kim Eunyoung, Jeon Jongbum, Joe Soobok, Theis Kevin R, Ng Sherrianne, Lee Li Yun S, Bennett Phillip R, MacIntyre David A, Stolovitzky Gustavo, Lynch Susan V, Albrecht Jake, Gomez-Lopez Nardhy, Romero Roberto, Stevenson David K, Aghaeepour Nima, Tarca Adi L, Costello James C, Sirota Marina

2023-Mar-09
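
The abstract above reports bootstrapped AUROC scores for the top submissions. As a rough illustration of what such a score involves, the Python sketch below computes AUROC as the probability that a positive sample outscores a negative one (ties counted as one half) and averages it over bootstrap resamples. The challenge's actual resampling scheme is not described in the abstract, so resampling individual samples with replacement is an assumption, as are the function names and the toy data.

```python
import random


def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc(labels, scores, n_boot=1000, seed=0):
    """Mean AUROC over bootstrap resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUROC is undefined there.
        if 0 < sum(resampled_labels) < n:
            values.append(auroc(resampled_labels, [scores[i] for i in idx]))
    return sum(values) / len(values)


# Hypothetical toy data: 1 = preterm, 0 = term, with model scores.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(f"AUROC = {auroc(toy_labels, toy_scores):.3f}")
print(f"bootstrapped AUROC = {bootstrap_auroc(toy_labels, toy_scores):.3f}")
```

Because the challenge data contain several samples per pregnant individual, a faithful bootstrap would probably resample at the level of individuals rather than samples; the sketch ignores that grouping.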

Radiology

A Computed Tomography-Based Radiomics Analysis of Low-Energy Proximal Femur Fractures in the Elderly Patients.

In Current radiopharmaceuticals

INTRODUCTION : Low-energy proximal femur fractures in elderly patients result from factors such as osteoporosis and falls. These fractures impose high economic and social costs. In this study, we aimed to build predictive models by applying machine learning (ML) methods to radiomics features to predict low-energy proximal femur fractures.

METHODS : Computed tomography scans of 40 patients (mean ± standard deviation of age = 71 ± 6) with low-energy proximal femur fractures (before a fracture occurs) and 40 individuals (mean ± standard deviation of age = 73 ± 7) as a control group were included. The regions of interest (ROIs), including neck, trochanteric, and intertrochanteric, were drawn manually. The combinations of 25 classification methods and 8 feature selection methods were applied to radiomics features extracted from the ROIs. Accuracy and the area under the receiver operating characteristic curve (AUC) were used to assess ML models' performance.

RESULTS : AUC and accuracy values ranged from 0.408 to 1 and 0.697 to 1, respectively. Three classification methods, including multilayer perceptron (MLP), sequential minimal optimization (SMO), and stochastic gradient descent (SGD), in combination with the feature selection method, SVM attribute evaluation (SAE), exhibited the highest performance in the neck (AUC= 0.999, 0.971 and 0.971, respectively; accuracy = 0.988, 0.988, and 0.988, respectively) and the trochanteric (AUC = 1, 1 and 1, respectively; accuracy = 1, 1 and 1, respectively) regions. The same methods demonstrated the highest performance for the combination of the 3 ROIs' features (AUC = 1, 1 and 1, respectively; accuracy =1, 1 and 1, respectively). In the intertrochanteric region, the combination methods, MLP+SAE, SMO+SAE, and SGD+SAE, as well as the combination of the SAE method and logistic regression (LR) classification method exhibited the highest performance (AUC= 1, 1, 1 and 1, respectively; accuracy= 1, 1, 1 and 1, respectively).

CONCLUSION : Applying machine learning methods to radiomics features is a powerful tool to predict low-energy proximal femur fractures. The results of this study should be verified by further research on larger datasets.

Mohammadi Seyed Mohammad, Moniri Samir, Mohammadhoseini Payam, Hanafi Mohammad Ghasem, Farasat Maryam, Cheki Mohsen

2023-Mar-21

Radiomics, computed tomography, low-energy fracture, machine learning, osteoporosis, proximal femur

General

COVID-19 diagnosis: A comprehensive review of pre-trained deep learning models based on feature extraction algorithm.

In Results in engineering
Due to the rapid rise of COVID-19, clinical specialists are looking for fast, faultless diagnosis strategies to restrict the spread of COVID while attempting to lessen the computational complexity. In this way, swift diagnosis techniques for COVID-19 with high precision can offer valuable aid to clinical specialists. The RT-PCR test is an expensive and tedious COVID diagnosis technique in practice. Medical imaging, in the form of chest X-ray radiography, is a feasible way to diagnose COVID-19 and get around the shortcomings of RT-PCR. Through a variety of Deep Transfer-learning models, this research investigates the potential of Artificial Intelligence-based early diagnosis of COVID-19 via chest X-ray radiographs. With 10,192 normal and 3,616 COVID chest X-ray radiographs, the Deep Transfer-learning models are optimized to further the accurate diagnosis. The chest X-ray radiographs undergo a data augmentation phase before developing a modified dataset to train the Deep Transfer-learning models. The Deep Transfer-learning architectures are trained using the extracted features from the Feature Extraction stage. During training, the classification of chest X-ray radiographs based on feature extraction algorithm values is converted into a feature label set containing the classified image data with a feature string value representing the number of edges detected after edge detection. The feature label set is further tested with the SVM, KNN, NN, Naive Bayes and Logistic Regression classifiers to audit the quality metrics of the proposed model. The quality metrics include accuracy, precision, F1 score, recall and AUC. Inception-V3 dominates the six Deep Transfer-learning models, according to the assessment results, with a training accuracy of 84.79% and a loss function of 2.4%. The performance of Cubic SVM was superior to that of the other SVM classifiers, with an AUC score of 0.99, precision of 0.983, recall of 0.8977, accuracy of 95.8%, and F1 score of 0.9384. Cosine KNN fared better than the other KNN classifiers with an AUC score of 0.95, precision of 0.974, recall of 0.777, accuracy of 90.8%, and F1 score of 0.864. Wide NN fared better than the other NN classifiers with an AUC score of 0.98, precision of 0.975, recall of 0.907, accuracy of 95.5%, and F1 score of 0.939. According to the findings, SVM classifiers topped other classifiers in terms of performance indicators like accuracy, precision, recall, F1-score, and AUC. The SVM classifiers reported better mean optimal scores compared to other classifiers. The performance assessment metrics indicate that the proposed methodology can aid in preliminary COVID diagnosis.
Poola Rahul Gowtham, Pl Lahari, Y Siva Sankar

2023-Jun

Boundary tracing, Covid diagnosis, Deep transfer-learning, Medical imaging, Neural network models and classifiers
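
The F1 scores quoted in the abstract above follow from the reported precision and recall, since F1 is their harmonic mean. The short Python sketch below recomputes them as a consistency check; the dictionary holds only the figures stated in the abstract, and the function name is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as stated in the abstract.
reported = {
    "Cubic SVM": (0.983, 0.8977, 0.9384),
    "Cosine KNN": (0.974, 0.777, 0.864),
    "Wide NN": (0.975, 0.907, 0.939),
}

for name, (precision, recall, reported_f1) in reported.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed F1 = {computed:.4f}, reported F1 = {reported_f1}")
```

The recomputed values agree with the reported ones to within about 0.001, which is consistent with the precision and recall figures having been rounded before publication.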