Public Health

Artificial intelligence accuracy assessment in NO2 concentration forecasting of metropolises air.

In Scientific reports ; h5-index 158.0

Air quality is a major concern worldwide, and nitrogen dioxide (NO2) is one of the pollutants with a significant effect on human health and the environment. This study compared regression analysis and a neural network model for predicting NO2 concentrations in the air of the Tehran metropolis. Data were collected over one year in the urban area of Tehran and analyzed using multi-linear regression (MLR) and multilayer perceptron (MLP) neural networks. Meteorological parameters, urban traffic data, urban green space information, and time parameters were used as inputs to forecast the daily concentration of NO2 in the air. The results demonstrate that artificial neural network modeling (R2 = 0.89, RMSE = 0.32) yields more accurate predictions than MLR analysis (R2 = 0.81, RMSE = 13.151). According to the sensitivity analysis of the model, park area, average green space area, and a one-day time delay are the crucial parameters influencing the NO2 concentration of the air. Artificial neural network models could be a powerful, effective, and suitable tool for analyzing and modeling complex, non-linear relationships among environmental variables, such as those involved in forecasting air pollution. The establishment of green spaces plays a significant role in NO2 reduction, with an even greater influence than traffic volume.

Shams Seyedeh Reyhaneh, Jahani Ali, Kalantary Saba, Moeinaddini Mazaher, Khorasani Nematollah
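
The abstract above compares the two models by R2 and RMSE. As a minimal sketch of how these two metrics are usually computed (the observations and predictions below are hypothetical and are not taken from the study), in plain Python:

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical daily NO2 observations and two hypothetical sets of predictions.
observed = [40.0, 50.0, 60.0, 70.0]
model_a = [42.0, 48.0, 61.0, 69.0]  # closer fit
model_b = [45.0, 45.0, 65.0, 65.0]  # looser fit

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(name, round(r_squared(observed, pred), 3), round(rmse(observed, pred), 3))
```

RMSE carries the units of the target variable, so two RMSE values are directly comparable only when both are computed on the same scale (for example, both on raw concentrations or both on normalized values). R2 is unitless and does not have this restriction.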


Surgery

MethylationToActivity: a deep-learning framework that reveals promoter activity landscapes from DNA methylomes in individual tumors.

In Genome biology ; h5-index 114.0

Although genome-wide DNA methylomes have demonstrated their clinical value as reliable biomarkers for tumor detection, subtyping, and classification, their direct biological impacts at the individual gene level remain elusive. Here we present MethylationToActivity (M2A), a machine learning framework that uses convolutional neural networks to infer promoter activities based on H3K4me3 and H3K27ac enrichment, from DNA methylation patterns for individual genes. Using publicly available datasets in real-world test scenarios, we demonstrate that M2A is highly accurate and robust in revealing promoter activity landscapes in various pediatric and adult cancers, including both solid and hematologic malignant neoplasms.

Williams Justin, Xu Beisi, Putnam Daniel, Thrasher Andrew, Li Chunliang, Yang Jun, Chen Xiang


Convolutional neural network, DNA methylation, Histone modifications, Transfer learning

General

DeepLPI: a multimodal deep learning method for predicting the interactions between lncRNAs and protein isoforms.

In BMC bioinformatics

BACKGROUND : Long non-coding RNAs (lncRNAs) regulate diverse biological processes via interactions with proteins. Since the experimental methods to identify these interactions are expensive and time-consuming, many computational methods have been proposed. Although these computational methods have achieved promising prediction performance, they neglect the fact that a gene may encode multiple protein isoforms and different isoforms of the same gene may interact differently with the same lncRNA.

RESULTS : In this study, we propose a novel method, DeepLPI, for predicting the interactions between lncRNAs and protein isoforms. Our method uses sequence and structure data to extract intrinsic features and expression data to extract topological features. To combine these different data, we adopt a hybrid framework that integrates a multimodal deep learning neural network and a conditional random field. To overcome the lack of known interactions between lncRNAs and protein isoforms, we apply a multiple instance learning (MIL) approach. In our experiment on the human lncRNA-protein interactions in the NPInter v3.0 database, DeepLPI improved the prediction performance by 4.7% in terms of AUC and 5.9% in terms of AUPRC over the state-of-the-art methods. Our further correlation analyses between interacting lncRNAs and protein isoforms also showed that their co-expression information helped predict the interactions. Finally, we give some examples where DeepLPI outperformed the other methods in predicting mouse lncRNA-protein interactions and novel human lncRNA-protein interactions.

CONCLUSION : Our results demonstrated that the use of isoforms and MIL contributed significantly to the improvement of performance in predicting lncRNA and protein interactions. We believe that such an approach would find more applications in predicting other functional roles of RNAs and proteins.

Shaw Dipan, Chen Hao, Xie Minzhu, Jiang Tao
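
The DeepLPI abstract above relies on multiple instance learning because interaction labels are known for genes rather than for individual isoforms. The following is a minimal sketch of the standard MIL assumption only (a bag is positive if at least one of its instances is positive, so the bag score is the maximum instance score); it is not the paper's model, and the gene names, scores, and the 0.5 threshold are hypothetical.

```python
def bag_score(instance_scores):
    """Score a bag as the maximum of its instance scores (standard MIL assumption)."""
    if not instance_scores:
        raise ValueError("a bag needs at least one instance")
    return max(instance_scores)

def predict_bags(bags, threshold=0.5):
    """Map each bag name to (bag score, predicted label)."""
    results = {}
    for name, scores in bags.items():
        score = bag_score(scores)
        results[name] = (score, score >= threshold)
    return results

# Hypothetical isoform-level interaction scores for one lncRNA against two genes.
# Each gene is a bag; each isoform of that gene is an instance.
bags = {
    "geneA": [0.10, 0.85, 0.30],  # one isoform scores high, so the bag is positive
    "geneB": [0.20, 0.15],        # no isoform scores high, so the bag is negative
}
predictions = predict_bags(bags)
print(predictions)
```

Under this assumption a gene-level label can supervise isoform-level scores without stating which isoform is responsible for the interaction.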


General

MIScnn: a framework for medical image segmentation with convolutional neural networks and deep learning.

In BMC medical imaging

BACKGROUND : The increased availability and usage of modern medical imaging induced a strong need for automatic medical image segmentation. Still, current image segmentation platforms do not provide the required functionalities for plain setup of medical image segmentation pipelines. Already implemented pipelines are commonly standalone software, optimized on a specific public data set. Therefore, this paper introduces the open-source Python library MIScnn.

IMPLEMENTATION : The aim of MIScnn is to provide an intuitive API allowing fast building of medical image segmentation pipelines including data I/O, preprocessing, data augmentation, patch-wise analysis, metrics, a library with state-of-the-art deep learning models and model utilization like training, prediction, as well as fully automatic evaluation (e.g. cross-validation). Similarly, high configurability and multiple open interfaces allow full pipeline customization.

RESULTS : Running a cross-validation with MIScnn on the Kidney Tumor Segmentation Challenge 2019 data set (multi-class semantic segmentation with 300 CT scans) resulted in a powerful predictor based on the standard 3D U-Net model.

CONCLUSIONS : With this experiment, we could show that the MIScnn framework enables researchers to rapidly set up a complete medical image segmentation pipeline by using just a few lines of code. The source code for MIScnn is available in the Git repository: .

Müller Dominik, Kramer Frank


Biomedical image segmentation, Computer aided diagnosis, Deep learning, Medical image analysis, Open-source framework, U-Net
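
The MIScnn abstract above mentions fully automatic evaluation by cross-validation. As a rough illustration of what a k-fold split does (this is a generic sketch in plain Python, not the MIScnn API; the function name and the sample and fold counts are hypothetical):

```python
def k_fold_indices(n_samples, k):
    """Return k (train, validation) index-list pairs that partition range(n_samples)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds

# Hypothetical example: 10 samples split into 3 folds.
folds = k_fold_indices(10, 3)
for train, validation in folds:
    print(len(train), validation)
```

Each sample appears in exactly one validation fold, so every scan contributes to the evaluation once. The sketch keeps samples in their original order; in practice the indices are usually shuffled before splitting.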

General

MegaR: an interactive R package for rapid sample classification and phenotype prediction using metagenome profiles and machine learning.

In BMC bioinformatics

BACKGROUND : Diverse microbiome communities drive biogeochemical processes and the evolution of animals in their ecosystems. Many microbiome projects have demonstrated the power of using metagenomics to understand the structures of microbiomes and the factors influencing their function in their environments. To characterize the effects of microbiome composition on human health, diseases, and even ecosystems, one must first understand the relationship between microbes and their environment in different samples. Running machine learning models on metagenomic sequencing data is encouraged for this purpose, but building an appropriate machine learning model for every diverse metagenomic dataset is not an easy task.

RESULTS : We introduce MegaR, an R Shiny package and web application, to build an unbiased machine learning model effortlessly with interactive visual analysis. MegaR employs taxonomic profiles from either whole metagenome sequencing or 16S rRNA sequencing data to develop machine learning models and classify the samples into two or more categories. It provides various options for model fine-tuning throughout the analysis pipeline, such as data processing, multiple machine learning techniques, model validation, and unknown sample prediction, which can be used to achieve the highest prediction accuracy possible for any given dataset while still maintaining a user-friendly experience.

CONCLUSIONS : Metagenomic sample classification and phenotype prediction are important, particularly when applied as a diagnostic method for identifying and predicting microbe-related human diseases. MegaR provides various interactive visualizations that let users build an accurate machine learning model without difficulty. Unknown sample prediction with a properly trained model using MegaR will help researchers identify sample properties with a fast turnaround time.

Dhungel Eliza, Mreyoud Yassin, Gwak Ho-Jin, Rajeh Ahmad, Rho Mina, Ahn Tae-Hyuk


Machine learning, Metagenomics, Phenotype prediction, R-package, Sample classification

Oncology

Machine learning analysis using 77,044 genomic and transcriptomic profiles to accurately predict tumor type.

In Translational oncology

Cancer of Unknown Primary (CUP) occurs in 3-5% of patients when standard histological diagnostic tests are unable to determine the origin of metastatic cancer. Typically, a CUP diagnosis is treated empirically and has very poor outcomes, with median overall survival of less than one year. Gene expression profiling alone has been used to identify the tissue of origin but struggles with low neoplastic percentage in metastatic sites, which is where identification is often most needed. MI GPSai, a Genomic Prevalence Score, uses DNA sequencing and whole transcriptome data coupled with machine learning to aid in the diagnosis of cancer. The algorithm was trained on genomic data from 34,352 cases and genomic and transcriptomic data from 23,137 cases, and was validated on 19,555 cases. MI GPSai predicted the tumor type in the labeled data set with an accuracy of over 94% on 93% of cases while deliberating among 21 possible categories of cancer. When the second highest prediction is also considered, the accuracy increases to 97%. Additionally, MI GPSai rendered a prediction for 71.7% of CUP cases. Pathologist evaluation of discrepancies between the submitted diagnosis and MI GPSai predictions resulted in a change of diagnosis 41.3% of the time. MI GPSai provides clinically meaningful information in a large proportion of CUP cases, and inclusion of MI GPSai in clinical routine could improve diagnostic fidelity. Moreover, all genomic markers essential for therapy selection are assessed in this assay, maximizing the clinical utility for patients within a single test.

Abraham Jim, Heimberger Amy B, Marshall John, Heath Elisabeth, Drabick Joseph, Helmstetter Anthony, Xiu Joanne, Magee Daniel, Stafford Phillip, Nabhan Chadi, Antani Sourabh, Johnston Curtis, Oberley Matthew, Korn Wolfgang Michael, Spetzler David
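
The MI GPSai abstract above reports accuracy on the subset of cases for which a prediction was rendered, and a higher accuracy when the second highest prediction is also counted. The sketch below shows one common way to compute such figures (top-k accuracy with abstention below a confidence threshold). It is an assumption about the general technique, not the paper's procedure; the class names, scores, and the 0.5 threshold are hypothetical.

```python
def top_k_accuracy(score_rows, true_labels, k=1, min_confidence=0.0):
    """Return (accuracy, coverage).

    A case counts as "called" only if its highest class score reaches
    min_confidence; accuracy is computed over called cases only, and
    coverage is the fraction of all cases that were called.
    """
    called = 0
    correct = 0
    for scores, truth in zip(score_rows, true_labels):
        ranked = sorted(scores, key=scores.get, reverse=True)
        if scores[ranked[0]] < min_confidence:
            continue  # abstain: no prediction rendered for this case
        called += 1
        if truth in ranked[:k]:
            correct += 1
    coverage = called / len(true_labels)
    accuracy = correct / called if called else 0.0
    return accuracy, coverage

# Hypothetical per-class scores for four cases and their true tumor types.
score_rows = [
    {"colon": 0.90, "lung": 0.07, "breast": 0.03},
    {"colon": 0.30, "lung": 0.60, "breast": 0.10},
    {"colon": 0.20, "lung": 0.10, "breast": 0.70},
    {"colon": 0.40, "lung": 0.35, "breast": 0.25},  # top score below 0.5
]
true_labels = ["colon", "colon", "breast", "lung"]

top1 = top_k_accuracy(score_rows, true_labels, k=1, min_confidence=0.5)
top2 = top_k_accuracy(score_rows, true_labels, k=2, min_confidence=0.5)
print(top1, top2)
```

Reporting accuracy together with coverage makes clear that the accuracy figure applies only to the cases the classifier was confident enough to call.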