General General

Systematic integration of machine learning algorithms to develop immune escape-related signatures to improve clinical outcomes in lung adenocarcinoma patients.

In Frontiers in immunology ; h5-index 100.0

BACKGROUND : Immune escape has recently emerged as one of the barriers to the efficacy of immunotherapy in lung adenocarcinoma (LUAD). However, the clinical significance and function of immune escape markers in LUAD remain largely unclear.

METHODS : In this study, we constructed a stable and accurate immune escape score (IERS) by systematically integrating 10 machine learning algorithms. We further investigated the clinical significance, functional status, TME interactions, and genomic alterations of different IERS subtypes to explore potential mechanisms. In addition, we validated the most important variable in the model through cellular experiments.

RESULTS : The IERS is an independent risk factor for overall survival and outperforms traditional clinical variables and published molecular signatures. IERS-based risk stratification applies well to LUAD patients. In addition, high IERS is associated with stronger tumor proliferation and immunosuppression, whereas low-IERS tumors show abundant lymphocyte infiltration and active immune responses. Finally, patients with high IERS are more sensitive to first-line chemotherapy for LUAD, while patients with low IERS are more sensitive to immunotherapy.

CONCLUSION : IERS may serve as a promising clinical tool to improve risk stratification and clinical management of individual LUAD patients and may enhance the understanding of immune escape.

Wang Ting, Huang Lin, Zhou Jie, Li Lu

2023

immune checkpoint inhibitors, immune escape, immunotherapy, lung adenocarcinoma, machine learning (ML)

General General

Identification of genes related to immune enhancement caused by heterologous ChAdOx1-BNT162b2 vaccines in lymphocytes at single-cell resolution with machine learning methods.

In Frontiers in immunology ; h5-index 100.0

The widely used ChAdOx1 nCoV-19 (ChAd) vector and BNT162b2 (BNT) mRNA vaccines have been shown to induce robust immune responses. Recent studies demonstrated that people who received one dose of ChAdOx1 and one dose of BNT mounted better immune responses than people who received two homologous doses of ChAdOx1 or of BNT. However, how heterologous vaccines function has not been extensively investigated. In this study, single-cell RNA sequencing data from three classes of samples (volunteers vaccinated with heterologous ChAdOx1-BNT, homologous ChAd-ChAd, or homologous BNT-BNT, profiled 7 days after vaccination) were divided into three types of immune cells (3654 B, 8212 CD4+ T, and 5608 CD8+ T cells). To identify differences in gene expression induced in each cell type by the different vaccination strategies, multiple advanced feature selection methods (max-relevance and min-redundancy, Monte Carlo feature selection, least absolute shrinkage and selection operator, light gradient boosting machine, and permutation feature importance) and classification algorithms (decision tree and random forest) were integrated into a computational framework. The feature selection methods ranked gene features by importance, yielding multiple gene lists. These lists were fed into incremental feature selection, incorporating decision tree and random forest, to extract essential genes and classification rules and to build efficient classifiers. Highly ranked genes include PLCG2, whose differential expression is important to the B cell immune pathway and is positively correlated with immune cells, such as CD8+ T cells, and B2M, which is associated with thymic T cell differentiation. This study contributes to a mechanistic explanation of why a heterologous ChAdOx1-BNT vaccination schedule elicits a stronger immune response than two doses of either BNT or ChAdOx1, offering a theoretical foundation for vaccine modification.
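
The incremental feature selection step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a gene ranking is already available and, to stay dependency-free, swaps in a nearest-centroid classifier scored by leave-one-out accuracy in place of the decision tree and random forest the authors used. All function names and the toy data are hypothetical, not the authors' code.

```python
from statistics import mean


def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def loo_accuracy(X, y):
    """Leave-one-out accuracy of the stand-in classifier."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def incremental_feature_selection(X, y, ranked_features):
    """Score the top-k ranked features for k = 1..n; return (best_k, scores)."""
    scores = []
    for k in range(1, len(ranked_features) + 1):
        cols = ranked_features[:k]
        X_k = [[row[c] for c in cols] for row in X]
        scores.append(loo_accuracy(X_k, y))
    best_k = scores.index(max(scores)) + 1  # smallest k that reaches the best score
    return best_k, scores


# Toy data (made up): column 0 separates the classes, columns 1 and 2 are noise.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 1.0, 0.9],
    [0.9, 3.0, 0.4],
    [3.0, 4.0, 0.3],
    [3.2, 2.0, 0.8],
    [2.9, 6.0, 0.5],
]
y = ["A", "A", "A", "B", "B", "B"]
ranked = [0, 2, 1]  # hypothetical ranking from a feature selection method

best_k, scores = incremental_feature_selection(X, y, ranked)
print(best_k, scores)
```

Choosing the smallest k that reaches the best score favors compact gene sets; other tie-breaking rules are equally defensible.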

Li Jing, Huang FeiMing, Ma QingLan, Guo Wei, Feng KaiYan, Huang Tao, Cai Yu-Dong

2023

ChAdOx1-BNT162b2 vaccine, immune, lymphocyte, machine learning, scRNA-seq profile

Surgery Surgery

Construction of a lipid metabolism-related risk model for hepatocellular carcinoma by single cell and machine learning analysis.

In Frontiers in immunology ; h5-index 100.0

Hepatocellular carcinoma (HCC) is one of the most common cancers. Numerous studies have shown a relationship between abnormal lipid metabolism-related genes (LMRGs) and malignancies. Most studies, however, examined a single LMRG, which limits their clinical application value. This study aims to develop a novel LMRG prognostic model for HCC patients and to study its utility for predictive, preventive, and personalized medicine. We used a single-cell RNA sequencing (scRNA-seq) dataset and the TCGA dataset of HCC samples and discovered differentially expressed LMRGs between primary and metastatic HCC patients. By using the least absolute shrinkage and selection operator (LASSO) regression machine learning algorithm, we constructed a risk prognosis model with six LMRGs (AKR1C1, CYP27A1, CYP2C9, GLB1, HMGCS2, and PLPP1). The risk prognosis model was further validated in an external cohort from ICGC. We also constructed a nomogram that could accurately predict overall survival in HCC patients based on cancer status and LMRGs. Further investigation of the association between the LMRG model and somatic tumor mutational burden (TMB), tumor immune infiltration, and biological function was performed. We found that the most frequent somatic mutations in the LMRG high-risk group were CTNNB1, TTN, TP53, ALB, MUC16, and PCLO. Moreover, naïve CD8+ T cells, common myeloid progenitors, endothelial cells, granulocyte-monocyte progenitors, hematopoietic stem cells, M2 macrophages, and plasmacytoid dendritic cells were significantly correlated with the LMRG high-risk group. Finally, gene set enrichment analysis showed that RNA degradation, spliceosome, and lysosome pathways were associated with the LMRG high-risk group. For the first time, we used scRNA-seq and bulk RNA-seq to construct an LMRG-related risk score model, which may provide insights into more effective treatment strategies for predictive, preventive, and personalized medicine of HCC patients.
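
Once LASSO has selected genes and coefficients, a risk model of this kind is commonly applied as a weighted sum of expression values, with patients split into high- and low-risk groups at a cutoff such as the cohort median. The sketch below assumes that form. The abstract does not give the formula, the coefficients, or the cutoff, so every number here is a placeholder and the patient data are invented.

```python
from statistics import median

# PLACEHOLDER coefficients: NOT taken from the paper. Signs and magnitudes
# are illustrative only; the six gene names are the ones listed in the abstract.
COEFS = {
    "AKR1C1": 0.10,
    "CYP27A1": -0.20,
    "CYP2C9": -0.15,
    "GLB1": 0.25,
    "HMGCS2": -0.10,
    "PLPP1": 0.05,
}


def risk_score(expression, coefs=COEFS):
    """Linear risk score: sum of coefficient * expression over the model genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(patients, coefs=COEFS):
    """Score each patient and split at the median (an assumed cutoff)."""
    scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, cutoff, groups


# Invented expression profiles for four hypothetical patients.
ZERO = dict.fromkeys(COEFS, 0.0)
patients = {
    "p1": dict.fromkeys(COEFS, 1.0),
    "p2": {**ZERO, "GLB1": 4.0},
    "p3": {**ZERO, "CYP27A1": 2.0},
    "p4": {**ZERO, "AKR1C1": 5.0},
}

scores, cutoff, groups = stratify(patients)
print(groups)
```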

Mou Lisha, Pu Zuhui, Luo Yongxiang, Quan Ryan, So Yunhu, Jiang Hui

2023

GSEA, Machine learning, TMB, hepatocellular carcinoma, immune microenvironment, lipid metabolism, prediction model, scRNA-seq

General General

A simple array integrating machine learning for identification of flavonoids in red wines.

In RSC advances

Bioactive flavonoids, the major ingredients of red wines, have been proven to prevent atherosclerosis and cardiovascular disease due to their anti-inflammatory and anti-oxidant activity. However, flavonoids have proven challenging to identify, even when multiple approaches are combined. Here, a simple array was constructed to detect flavonoids by employing phenylboronic acid modified perylene diimide derivatives (PDIs). Through multiple non-specific interactions (hydrophilic, hydrophobic, charged, aromatic, hydrogen-bonded and reversible covalent interactions) with flavonoids, the fluorescence of PDIs can be modulated, and variations in intensity can be used to create fingerprints of flavonoids. This array successfully discriminated 14 flavonoids of diverse structures and concentrations with 100% accuracy, based on patterns in fluorescence intensity modulation, via optimized machine learning algorithms. The array further demonstrated parallel detection of 8 red wines of different types and origins with high accuracy, revealing the potential of the sensor array for detecting food mixtures.

Qin Jiaojiao, Wang Hao, Xu Yu, Shi Fangfang, Yang Shijie, Huang Hui, Liu Jun, Stewart Callum, Li Linxian, Li Fei, Han Jinsong, Wu Wenwen

2023-Mar-14

General General

Large-Scale metabolomics: Predicting biological age using 10,133 routine untargeted LC-MS measurements.

In Aging cell ; h5-index 58.0

Untargeted metabolomics is the study of all detectable small molecules, and in geroscience, metabolomics has shown great potential to describe biological age, a complex trait impacted by many factors. Unfortunately, sample sizes are often too small to achieve adequate power and to minimize potential biases caused by, for example, demographic factors. In this study, we present the analysis of biological age in ~10,000 routine toxicological blood measurements. The untargeted screening samples, obtained from ultra-high pressure liquid chromatography-quadrupole time-of-flight mass spectrometry (UHPLC-QTOF), cover more than 300 batches and more than 30 months, lack pooled quality controls, lack controlled sample collection, and have previously only been used in small-scale studies. To overcome experimental effects, we developed and tested a custom neural network model and compared it with existing prediction methods. Overall, the neural network was able to predict the chronological age with an RMSE of 5.88 years (r² = 0.63), improving upon the 6.15 years achieved by existing normalization methods. We used the feature importance algorithm Shapley Additive exPlanations (SHAP) to identify compounds related to biological age. Most importantly, the model returned known aging markers such as kynurenine, indole-3-aldehyde, and acylcarnitines along with a potential novel aging marker, cyclo(Leu-Pro). Our results validate the association of tryptophan and acylcarnitine metabolism with aging in a highly uncontrolled large-scale sample. Also, we have shown that by using robust computational methods it is possible to deploy large LC-MS datasets for metabolomics studies to reduce the risk of bias and empower aging studies.
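
For readers unfamiliar with the two error metrics quoted above, the following minimal sketch shows how RMSE and a coefficient of determination are usually computed. It assumes the reported value is the coefficient of determination; the abstract does not say whether it is that or a squared Pearson correlation. The ages and predictions are invented.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (years here)."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Invented chronological ages and model predictions.
ages = [30, 40, 50, 60]
predicted = [35, 35, 55, 55]

error_years = rmse(ages, predicted)
fit = r_squared(ages, predicted)
print(error_years, fit)
```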

Lassen Johan K, Wang Tingting, Nielsen Kirstine L, Hasselstrøm Jørgen B, Johannsen Mogens, Villesen Palle

2023-Mar-19

accelerated aging, big data, inflammaging, machine learning, metabolomics, molecular biology of aging, tryptophan metabolism

General General

Blood RNA alternative splicing events as diagnostic biomarkers for infectious disease.

In Cell reports methods

Assays detecting blood transcriptome changes are studied for infectious disease diagnosis. Blood-based RNA alternative splicing (AS) events, which have not been well characterized in pathogen infection, have potential normalization and assay platform stability advantages over gene expression for diagnosis. Here, we present a computational framework for developing AS diagnostic biomarkers. Leveraging a large prospective cohort of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and whole-blood RNA sequencing (RNA-seq) data, we identify a major functional AS program switch upon viral infection. Using an independent cohort, we demonstrate the improved accuracy of AS biomarkers for SARS-CoV-2 diagnosis compared with six reported transcriptome signatures. We then optimize a subset of AS-based biomarkers to develop microfluidic PCR diagnostic assays. This assay achieves nearly perfect test accuracy (61/62 = 98.4%) using a naive principal component classifier, significantly more accurate than a gene expression PCR assay in the same cohort. Therefore, our RNA splicing computational framework enables a promising avenue for host-response diagnosis of infection.
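
The abstract does not define its "naive principal component classifier". One plausible reading, sketched below as an assumption, projects each sample onto the first principal component of the assay measurements and thresholds that score. The power-iteration PCA, the midpoint threshold, the function names, and the toy measurements are all illustrative choices rather than the authors' method. The last line only reproduces the arithmetic behind the reported 61/62 accuracy.

```python
def first_principal_component(rows, iterations=200):
    """Return (column means, unit PC1) via power iteration on the covariance matrix.

    Assumes the all-ones start vector is not orthogonal to PC1.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v


def fit_pc1_classifier(rows, labels):
    """Fit a threshold on PC1 scores; labels are 1 (infected) or 0 (uninfected)."""
    means, pc1 = first_principal_component(rows)

    def project(row):
        return sum((x - m) * w for x, m, w in zip(row, means, pc1))

    pos = [project(r) for r, y in zip(rows, labels) if y == 1]
    neg = [project(r) for r, y in zip(rows, labels) if y == 0]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    sign = 1.0 if mean_pos > mean_neg else -1.0  # PC1 direction is arbitrary

    def predict(row):
        return int(sign * (project(row) - threshold) > 0)

    return predict


# Invented two-marker measurements: three infected, three uninfected samples.
samples = [[2.0, 1.9], [2.2, 2.1], [1.8, 2.0], [0.1, 0.2], [0.0, -0.1], [0.3, 0.1]]
labels = [1, 1, 1, 0, 0, 0]

predict = fit_pc1_classifier(samples, labels)
accuracy = sum(predict(r) == y for r, y in zip(samples, labels)) / len(labels)

# Arithmetic check of the figure quoted in the abstract: 61 of 62 correct.
reported_accuracy = round(61 / 62 * 100, 1)
print(accuracy, reported_accuracy)
```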

Zhang Zijun, Sauerwald Natalie, Cappuccio Antonio, Ramos Irene, Nair Venugopalan D, Nudelman German, Zaslavsky Elena, Ge Yongchao, Gaitas Angelo, Ren Hui, Brockman Joel, Geis Jennifer, Ramalingam Naveen, King David, McClain Micah T, Woods Christopher W, Henao Ricardo, Burke Thomas W, Tsalik Ephraim L, Goforth Carl W, Lizewski Rhonda A, Lizewski Stephen E, Weir Dawn L, Letizia Andrew G, Sealfon Stuart C, Troyanskaya Olga G

2023-Feb-27

RNA splicing, SARS-CoV-2, diagnostic biomarker, host response assays, infectious disease, viral infection