Doctor Penguin

Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

General

Neuropsychiatric Symptoms and Commonly Used Biomarkers of Alzheimer's Disease: A Literature Review from a Machine Learning Perspective.

In Journal of Alzheimer's disease : JAD
There is a growing interest in the application of machine learning (ML) in Alzheimer's disease (AD) research. However, neuropsychiatric symptoms (NPS), frequent in subjects with AD, mild cognitive impairment (MCI), and other related dementias have not been analyzed sufficiently using ML methods. To portray the landscape and potential of ML research in AD and NPS studies, we present a comprehensive literature review of existing ML approaches and commonly studied AD biomarkers. We conducted PubMed searches with keywords related to NPS, AD biomarkers, machine learning, and cognition. We included a total of 38 articles in this review after excluding some irrelevant studies from the search results and including 6 articles based on a snowball search from the bibliography of the relevant studies. We found a limited number of studies focused on NPS with or without AD biomarkers. In contrast, multiple statistical machine learning and deep learning methods have been used to build predictive diagnostic models using commonly known AD biomarkers. These mainly included multiple imaging biomarkers, cognitive scores, and various omics biomarkers. Deep learning approaches that combine these biomarkers or multi-modality datasets typically outperform single-modality datasets. We conclude ML may be leveraged to untangle the complex relationships of NPS and AD biomarkers with cognition. This may potentially help to predict the progression of MCI or dementia and develop more targeted early intervention approaches based on NPS.
Shah Jay, Rahman Siddiquee Md Mahfuzur, Krell-Roesch Janina, Syrjanen Jeremy A, Kremers Walter K, Vassilaki Maria, Forzani Erica, Wu Teresa, Geda Yonas E

2023-Mar-03

Alzheimer’s disease, cognition, deep learning, machine learning, neuropsychiatric symptoms

Cardiology

A multi-use deep learning method for CITE-seq and single-cell RNA-seq data integration with cell surface protein prediction and imputation.

In Nature machine intelligence
CITE-seq, a single-cell multi-omics technology that measures RNA and protein expression simultaneously in single cells, has been widely applied in biomedical research, especially in immune-related disorders and other diseases such as influenza and COVID-19. Despite the proliferation of CITE-seq, it is still costly to generate such data. Although data integration can increase information content, this raises computational challenges. First, combining multiple datasets is prone to batch effects that need to be addressed. Second, it is difficult to combine multiple CITE-seq datasets because the protein panels in different datasets may only partially overlap. Integrating multiple CITE-seq and single-cell RNA-seq (scRNA-seq) datasets is important because this allows the utilization of as much data as possible to uncover cell population heterogeneity. To overcome these challenges, we present sciPENN, a multi-use deep learning approach that supports CITE-seq and scRNA-seq data integration, protein expression prediction for scRNA-seq, protein expression imputation for CITE-seq, quantification of prediction and imputation uncertainty, and cell type label transfer from CITE-seq to scRNA-seq. Comprehensive evaluations spanning multiple datasets demonstrate that sciPENN outperforms other current state-of-the-art methods.
Lakkis Justin, Schroeder Amelia, Su Kenong, Lee Michelle Y Y, Bashore Alexander C, Reilly Muredach P, Li Mingyao

2022-Nov

CITE-seq, deep learning, protein prediction, single-cell RNA-seq, single-cell multi-omics

General

Integrative analysis of ferroptosis regulators for clinical prognosis based on deep learning and potential chemotherapy sensitivity of prostate cancer.

In Precision clinical medicine
Exploring useful prognostic markers and developing a robust prognostic model for patients with prostate cancer are crucial for clinical practice. We applied a deep learning algorithm to construct a prognostic model and proposed the deep learning-based ferroptosis score (DLF_score) for the prediction of prognosis and potential chemotherapy sensitivity in prostate cancer. Based on this prognostic model, there was a statistically significant difference in the disease-free survival probability between patients with high and low DLF_score in The Cancer Genome Atlas (TCGA) cohort (P < 0.0001). In the validation cohort GSE116918, we observed a result consistent with the training set (P = 0.02). Additionally, functional enrichment analysis showed that DNA repair, RNA splicing signaling, organelle assembly, and regulation of centrosome cycle pathways might regulate prostate cancer through ferroptosis. Meanwhile, the prognostic model we constructed also had application value in predicting drug sensitivity. Using AutoDock, we predicted some potential drugs that could be used for prostate cancer treatment.
Guo Tuanjie, Yuan Zhihao, Wang Tao, Zhang Jian, Tang Heting, Zhang Ning, Wang Xiang, Chen Siteng

2023-Mar

deep learning, drug sensitivity, ferroptosis, prognosis, prostate cancer

General

Deep belief network-based approach for detecting Alzheimer's disease using the multi-omics data.

In Computational and structural biotechnology journal
Alzheimer's disease (AD) is the most uncertain form of dementia in terms of its underlying mechanism. AD does not have a vital genetic factor to relate to. In the past, there were no reliable techniques and methods to identify the genetic risk factors associated with AD. Most of the data available were from brain images. However, recently, there have been drastic advancements in high-throughput techniques in bioinformatics. This has led to focused research on discovering AD-causing genetic risk factors. Recent analysis has resulted in considerable prefrontal cortex data with which classification and prediction models can be developed for AD. We have developed a Deep Belief Network-based prediction model using DNA Methylation and Gene Expression Microarray Data, with High Dimension Low Sample Size (HDLSS) issues. To overcome the HDLSS challenge, we performed a two-layer feature selection considering the biological aspects of the features as well. In the two-layered feature selection approach, first the differentially expressed genes and differentially methylated positions are identified, then the two datasets are combined using the Jaccard similarity measure. As the second step, an ensemble-based feature selection approach is implemented to further narrow down the gene selection. The results show that the proposed feature selection technique outperforms the existing commonly used feature selection techniques, such as Support Vector Machine Recursive Feature Elimination (SVM-RFE) and Correlation-based Feature Selection (CBS). Furthermore, the Deep Belief Network-based prediction model performs better than the widely used Machine Learning models. Also, the multi-omics dataset shows promising results compared to single-omics data.
Mahendran Nivedhitha, Vincent P M Durai Raj

2023

Alzheimer's disease, DNA Methylation, Deep Belief Network, Feature Selection, Gene Expression, Multi-omics
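
The abstract above mentions combining differentially expressed genes and differentially methylated positions using the Jaccard similarity measure, but it does not say exactly how the measure is applied. As a minimal sketch of the measure itself (not the authors' pipeline), the Python below computes |A ∩ B| / |A ∪ B| for two gene lists. The function name, the gene identifiers, and the assumption that methylated positions have already been mapped to genes are all hypothetical.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections of hashable items."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention chosen for this sketch: two empty sets score 0.0.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical inputs: genes flagged as differentially expressed, and genes
# associated with differentially methylated positions (illustrative names only).
deg_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
dmp_genes = ["GENE_B", "GENE_C", "GENE_E"]

similarity = jaccard_similarity(deg_genes, dmp_genes)

# Genes supported by both omics layers; one plausible way to "combine" the
# two lists, stated here as an assumption rather than the paper's method.
shared_genes = sorted(set(deg_genes) & set(dmp_genes))

print(f"Jaccard similarity: {similarity:.2f}")
print("Shared genes:", shared_genes)
```

A score near 1 would mean the two omics layers flag nearly the same genes, while a score near 0 would mean they flag mostly different ones.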

General

Redesigning plant specialized metabolism with supervised machine learning using publicly available reactome data.

In Computational and structural biotechnology journal
The immense structural diversity of products and intermediates of plant specialized metabolism (specialized metabolites) makes them rich sources of therapeutic medicine, nutrients, and other useful materials. With the rapid accumulation of reactome data that can be accessible on biological and chemical databases, along with recent advances in machine learning, this review sets out to outline how supervised machine learning can be used to design new compounds and pathways by exploiting the wealth of said data. We will first examine the various sources from which reactome data can be obtained, followed by explaining the different machine learning encoding methods for reactome data. We then discuss current supervised machine learning developments that can be employed in various aspects to help redesign plant specialized metabolism.
Lim Peng Ken, Julca Irene, Mutwil Marek

2023

Encoding reactome data, Neural-network encoders, Plant specialized metabolism, Predicting enzyme promiscuity, Predicting reaction-feasibility, Reactome data-mining, Retrobiosynthesis, Supervised machine learning

Oncology

Recurrence risk stratification for locally advanced cervical cancer using multi-modality transformer network.

In Frontiers in oncology

OBJECTIVES : Recurrence risk evaluation is clinically significant for patients with locally advanced cervical cancer (LACC). We investigated the ability of a transformer network to stratify recurrence risk in LACC based on computed tomography (CT) and magnetic resonance (MR) images.

METHODS : A total of 104 patients with pathologically diagnosed LACC between July 2017 and December 2021 were enrolled in this study. All patients underwent CT and MR scanning, and their recurrence status was identified by biopsy. We randomly divided patients into a training cohort (48 cases, non-recurrence: recurrence = 37: 11), a validation cohort (21 cases, non-recurrence: recurrence = 16: 5), and a testing cohort (35 cases, non-recurrence: recurrence = 27: 8), from which we extracted 1989, 882 and 315 patches for the model's development, validation and evaluation, respectively. The transformer network consisted of three modality fusion modules to extract multi-modality and multi-scale information, and a fully-connected module to perform recurrence risk prediction. The model's prediction performance was assessed by six metrics, including the area under the receiver operating characteristic curve (AUC), accuracy, F1-score, sensitivity, specificity and precision. Univariate analyses with the F-test and t-test were conducted for statistical analysis.
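
The six metrics named in the METHODS paragraph can be illustrated with a short, self-contained Python sketch. It uses the standard definitions from confusion-matrix counts and a pairwise (Mann-Whitney) estimate of AUC. The function names, labels, and scores below are hypothetical and are not data or code from the study.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, F1, sensitivity, specificity and precision from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # recall on recurrence cases
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


def auc_from_scores(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = recurrence, 0 = non-recurrence.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.7, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = auc_from_scores(y_true, scores)
print(metrics, auc)
```

Unlike the other five metrics, AUC is computed from the continuous scores and so does not depend on the chosen threshold.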

RESULTS : The proposed transformer network is superior to conventional radiomics methods and other deep learning networks in the training, validation and testing cohorts. In particular, in the testing cohort, the transformer network achieved the highest AUC of 0.819 ± 0.038, while four conventional radiomics methods and two deep learning networks achieved AUCs of 0.680 ± 0.050, 0.720 ± 0.068, 0.777 ± 0.048, 0.691 ± 0.103, 0.743 ± 0.022 and 0.733 ± 0.027, respectively.

CONCLUSIONS : The multi-modality transformer network showed promising performance in recurrence risk stratification of LACC and may be used as an effective tool to help clinicians make clinical decisions.

Wang Jian, Mao Yixiao, Gao Xinna, Zhang Yu

2023

cervical cancer, deep learning, multi-modality data, recurrence risk stratification, transformer network