Doctor Penguin

General

A data-efficient deep learning tool for scRNA-Seq label transfer in neuroscience.

In bioRxiv : the preprint server for biology
Large single-cell RNA datasets have contributed to unprecedented biological insight. Often, these take the form of cell atlases and serve as a reference for automating cell labeling of newly sequenced samples. Yet, classification algorithms have lacked the capacity to accurately annotate cells, particularly in complex datasets. Here we present SIMS (Scalable, Interpretable Machine Learning for Single-Cell), an end-to-end data-efficient machine learning pipeline for discrete classification of single-cell data that can be applied to new datasets with minimal coding. We benchmarked SIMS against common single-cell label transfer tools and demonstrated that it performs as well as or better than state-of-the-art algorithms. We then use SIMS to classify cells in one of the most complex tissues: the brain. We show that SIMS classifies cells of the adult cerebral cortex and hippocampus at a remarkably higher accuracy than state-of-the-art single-cell classifiers. This accuracy is maintained in trans-sample label transfers of the adult human cerebral cortex. We then apply SIMS to classify cells in the developing brain and demonstrate a high level of accuracy at predicting neuronal subtypes, even in periods of fate refinement. Finally, we apply SIMS to single-cell datasets of cortical organoids to predict cell identities in previously unclassified cells and to uncover genetic variations in the developmental trajectories of organoids derived from different pluripotent stem cell lines. Altogether, we show that SIMS is a versatile and robust tool for cell-type classification from single-cell datasets.
Lehrer Julian, Gonzalez-Ferrer Jesus, Haussler David, Teodorescu Mircea, Jonsson Vanessa D, Mostajo-Radji Mohammed A

2023-Mar-01

General

Logic-based mechanistic machine learning on high-content images reveals how drugs differentially regulate cardiac fibroblasts.

In bioRxiv : the preprint server for biology

Fibroblasts are essential regulators of extracellular matrix deposition following cardiac injury. These cells exhibit highly plastic responses in phenotype during fibrosis in response to environmental stimuli. Here, we test whether and how candidate anti-fibrotic drugs differentially regulate measures of cardiac fibroblast phenotype, which may help identify treatments for cardiac fibrosis. We conducted a high content microscopy screen of human cardiac fibroblasts treated with 13 clinically relevant drugs in the context of TGFβ and/or IL-1β, measuring phenotype across 137 single-cell features. We used the phenotypic data from our high content imaging to train a logic-based mechanistic machine learning model (LogiMML) for fibroblast signaling. The model predicted how pirfenidone and Src inhibitor WH-4-023 reduce F-actin assembly and F-actin stress fiber formation, respectively. Validating the LogiMML model prediction that PI3K partially mediates the effects of Src inhibition, we found that PI3K inhibition reduces F-actin fiber formation and procollagen I production in human cardiac fibroblasts. In this study, we establish a modeling approach combining the strengths of logic-based network models and regularized regression models, and apply this approach to predict mechanisms that mediate the differential effects of drugs on fibroblasts, revealing Src inhibition acting via PI3K as a potential therapy for cardiac fibrosis.

SIGNIFICANCE : Cardiac fibrosis is a dysregulation of the normal wound healing response, resulting in excessive scarring and cardiac dysfunction. As cardiac fibroblasts primarily regulate this process, we explored how candidate anti-fibrotic drugs alter the fibroblast phenotype. We identify a set of 137 phenotypic features that change in response to drug treatments. Using a new computational modeling approach termed logic-based mechanistic machine learning, we predict how pirfenidone and Src inhibition affect the regulation of the phenotypic features F-actin assembly and F-actin stress fiber formation. We also show that inhibition of PI3K reduces F-actin fiber formation and procollagen I production in human cardiac fibroblasts, supporting a role for PI3K as a mechanism by which Src inhibition may suppress fibrosis.

Nelson Anders R, Christiansen Steven L, Naegle Kristen M, Saucerman Jeffrey J

2023-Mar-02

General

Pretraining strategies for effective promoter-driven gene expression prediction.

In bioRxiv : the preprint server for biology
Advances in gene delivery technologies are enabling rapid progress in molecular medicine, but require precise expression of genetic cargo in desired cell types. This is predominantly achieved via a regulatory DNA sequence called a promoter; however, only a handful of cell type-specific promoters are known. Efficiently designing compact promoter sequences with a high density of regulatory information by leveraging machine learning models would therefore be broadly impactful for fundamental research and direct therapeutic applications. However, models of expression from such compact promoter sequences are lacking, despite the recent success of deep learning in modelling expression from endogenous regulatory sequences. Despite the lack of large datasets measuring promoter-driven expression in many cell types, data from a few well-studied cell types or from endogenous gene expression may provide relevant information for transfer learning, which has not yet been explored in this setting. Here, we evaluate a variety of pretraining tasks and transfer strategies for modelling cell type-specific expression from compact promoters and demonstrate the effectiveness of pretraining on existing promoter-driven expression datasets from other cell types. Our approach is broadly applicable for modelling promoter-driven expression in any data-limited cell type of interest, and will enable the use of model-based optimization techniques for promoter design for gene delivery applications. Our code and data are available at https://github.com/anikethjr/promoter_models.
Reddy Aniketh Janardhan, Herschl Michael H, Kolli Sathvik, Lu Amy X, Geng Xinyang, Kumar Aviral, Hsu Patrick D, Levine Sergey, Ioannidis Nilah M

2023-Feb-27

General

NMRQNet: a deep learning approach for automatic identification and quantification of metabolites using Nuclear Magnetic Resonance (NMR) in human plasma samples.

In bioRxiv : the preprint server for biology
Nuclear Magnetic Resonance is a powerful platform that reveals the metabolomics profiles within biofluids or tissues and contributes to personalized treatments in medical practice. However, data volume and complexity hinder the exploration of NMR spectra. In addition, the lack of fast and accurate computational tools that can handle the automatic identification and quantification of essential metabolites from NMR spectra slows the wide application of these techniques in clinical settings. We present NMRQNet, a deep-learning-based pipeline for automatic identification and quantification of dominant metabolite candidates within human plasma samples. The estimated relative concentrations could be further applied in statistical analysis to extract the potential biomarkers. We evaluate our method on multiple plasma samples, including species from mice to humans, curated using three anticoagulants, covering healthy and patient conditions in neurological disorders, greatly expanding the metabolomics analytical space in plasma. NMRQNet accurately reconstructed the original spectra and obtained significantly better quantification results than the earlier computational methods. NMRQNet also proposed relevant metabolite biomarkers that could potentially explain the risk factors associated with the condition. NMRQNet, with improved prediction performance, highlights the limitations in the existing approaches and has shown strong application potential for future metabolomics disease studies using plasma samples.
Wang Wanli, Ma Li-Hua, Maletic-Savatic Mirjana, Liu Zhandong

2023-Mar-02

Radiology

Beyond diagnosis: is there a role for radiomics in prostate cancer management?

In European radiology experimental
The role of imaging in pretreatment staging and management of prostate cancer (PCa) is constantly evolving. In the last decade, there has been an ever-growing interest in radiomics as an image analysis approach able to extract objective quantitative features that are missed by the human eye. However, most PCa radiomics studies have focused on cancer detection and characterisation. With this narrative review we aimed to provide a synopsis of the recently proposed potential applications of radiomics for PCa with a management-based approach, focusing on primary treatments with curative intent and active surveillance as well as highlighting recurrent disease after primary treatment. Current evidence is encouraging, with radiomics and artificial intelligence appearing as feasible tools to aid physicians in planning PCa management. However, the lack of external independent datasets for validation and prospectively designed studies casts a shadow on the reliability and generalisability of radiomics models, delaying their translation into clinical practice.

Key points
• Artificial intelligence solutions have been proposed to streamline prostate cancer radiotherapy planning.
• Radiomics models could improve risk assessment for radical prostatectomy patient selection.
• Delta-radiomics appears promising for the management of patients under active surveillance.
• Radiomics might outperform current nomograms for prostate cancer recurrence risk assessment.
• Reproducibility of results, methodological and ethical issues must still be faced before clinical implementation.
Stanzione Arnaldo, Ponsiglione Andrea, Alessandrino Francesco, Brembilla Giorgio, Imbriaco Massimo

2023-Mar-13

Artificial intelligence, Clinical decision-making, Prostatic neoplasms, Radiomics, Reproducibility of results

Public Health

An anti-infodemic virtual center for the Americas.

In Revista panamericana de salud publica = Pan American journal of public health
The Pan American Health Organization/World Health Organization (PAHO/WHO) Anti-Infodemic Virtual Center for the Americas (AIVCA) is a project led by the Department of Evidence and Intelligence for Action in Health, PAHO and the Center for Health Informatics, PAHO/WHO Collaborating Center on Information Systems for Health, at the University of Illinois, with the participation of PAHO staff and consultants across the region. Its goal is to develop a set of tools, pairing AI with human judgment, to help ministries of health and related health institutions respond to infodemics. Public health officials will learn about emerging threats detected by the center and get recommendations on how to respond. The virtual center is structured with three parallel teams: detection, evidence, and response. The detection team will employ a mixture of advanced search queries, machine learning, and other AI techniques to sift through more than 800 million new public social media posts per day to identify emerging infodemic threats in both English and Spanish. The evidence team will use the EasySearch federated search engine backed by AI, PAHO's knowledge management team, and the Librarian Reserve Corps to identify the most relevant authoritative sources. The response team will use a design approach to communicate recommended response strategies based on behavioural science, storytelling, and information design approaches.
Brooks Ian, D’Agostino Marcelo, Marti Myrna, McDowell Kate, Mejia Felipe, Betancourt-Cravioto Miguel, Gatzke Lisa, Hicks Elaine, Kyser Rebecca, Leicht Kevin, Pereira Dos Santos Eliane, Saw Jessica Jia-Wen, Tomio Ailin, Garcia Saiso Sebastian

2023

Americas, COVID-19, Public Health Informatics, artificial intelligence, communication, social media