
General

Machine learning analysis of DNA methylation in a hypoxia-immune model of oral squamous cell carcinoma.

In International immunopharmacology

BACKGROUND : Hypoxia status and immunity are related to the development and prognosis of oral squamous cell carcinoma (OSCC). Here, we constructed a hypoxia-immune model to explore its upstream mechanism and identify potential CpG sites.

METHODS : The hypoxia-immune model was developed and validated by the iCluster algorithm. LASSO, SVM-RFE and GA-ANN were used to screen CpG sites correlated with the hypoxia-immune microenvironment.

RESULTS : We found seven hypoxia-immune related CpG sites. LASSO had the best classification performance among the three machine learning algorithms.

CONCLUSION : We explored the clinical significance of the hypoxia-immune model and found seven hypoxia-immune related CpG sites by multiple machine learning algorithms. This model and candidate CpG sites may have clinical applications to predict the hypoxia-immune microenvironment.

Zeng Hao, Luo Meng, Chen Linyan, Ma Xinyu, Ma Xuelei


DNA methylation, Hypoxia, Machine learning, Oral squamous cell carcinoma, Tumor immune microenvironment
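
As a rough illustration of how an L1 penalty screens features (the idea behind the LASSO step named in the abstract above), here is a minimal sketch in plain Python. It fits a linear LASSO by coordinate descent on made-up data; the function names, the toy matrix, and the penalty value `lam=0.5` are all assumptions for illustration, and the authors' actual pipeline (data, loss, software) is not described in the abstract.

```python
# Toy illustration of L1 (LASSO) feature selection via coordinate descent.
# All data and names here are hypothetical; this is not the authors' pipeline.

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=50):
    """Minimise (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            coef[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return coef


# Three mutually orthogonal toy "CpG" features measured on four samples.
X = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
# The response depends strongly on feature 0 and only weakly on feature 1.
y = [2.0 * row[0] + 0.1 * row[1] for row in X]

coef = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, c in enumerate(coef) if c != 0.0]
print(coef, selected)
```

With this penalty the weak and irrelevant features are shrunk exactly to zero and only feature 0 is kept, which is the property that makes LASSO usable as a screening step.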

General

Predicting the consequences of accidents involving dangerous substances using machine learning.

In Ecotoxicology and environmental safety ; h5-index 67.0

A new dimension of learning lessons from hazardous events involving dangerous substances is considered, relying on the availability of representative data and the significant evolution of a wide range of machine learning tools. The importance of such a dimension lies in the possibility of predicting the associated nature of damages without imposing any unrealistic simplifications or restrictions. To provide the best possible modeling framework, several implementations are tested using logistic regression, decision trees, neural networks, support vector machines, naive Bayes classifiers and random forests to forecast the occurrence of the human, environmental and material consequences of industrial accidents based on the EU Major Accident Reporting System's records. Many performance metrics are estimated to select the most suitable model in each treated case. The obtained results show the distinctive ability of random forests and neural networks to predict the occurrence of specific consequences of accidents in industrial installations, with an obvious exception concerning the performance of neural networks when the datasets involved are highly unbalanced.

Chebila Mourad


Industrial accidents, Machine learning, Neural networks, Performance metrics, Random forests
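
The abstract above mentions both "many performance metrics" and highly unbalanced datasets. The sketch below, on invented labels, suggests why a single metric such as accuracy can mislead in that setting: a classifier that never predicts the rare consequence still scores high accuracy. The 5-in-100 class ratio and the function name `binary_metrics` are assumptions for illustration and are not taken from the paper.

```python
# Toy example: accuracy versus recall and balanced accuracy on unbalanced labels.
# The class ratio and the "always negative" classifier are hypothetical.

def binary_metrics(y_true, y_pred):
    """Return accuracy, recall, specificity and balanced accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
    }


# 5 accidents with the consequence of interest, 95 without it.
y_true = [1] * 5 + [0] * 95
# A degenerate model that always predicts "no consequence".
y_pred = [0] * 100

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Here accuracy is 0.95 while recall is 0.0 and balanced accuracy is 0.5, so the model is no better than chance at finding the rare class.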

Pathology

Non-Muscle-Invasive Bladder Carcinoma with Respect to Basal Versus Luminal Keratin Expression.

In International journal of molecular sciences ; h5-index 102.0

Non-muscle-invasive bladder cancer (NMIBC) consists of transcriptional subtypes that are distinguishable from those of muscle-invasive cancer. We aimed to identify genetic signatures of NMIBC related to basal (K5/6) and luminal (K20) keratin expression. Based on immunohistochemical staining, papillary high-grade NMIBC was classified into K5/6-only (K5/6High-K20Low), K20-only (K5/6Low-K20High), double-high (K5/6High-K20High), and double-low (K5/6Low-K20Low) groups (n = 4 per group). Differentially expressed genes identified between each group using RNA sequencing were subjected to functional enrichment analyses. A public dataset was used for validation. Machine learning algorithms were implemented to predict our samples against UROMOL subtypes. Transcriptional investigation demonstrated that the K20-only group was enriched in the cell cycle, proliferation, and progression gene sets, and this result was also observed in the public dataset. The K5/6-only group was closely regulated by basal-type gene sets and showed activated invasive or adhesive functions. The double-high group was enriched in cell cycle arrest, macromolecule biosynthesis, and FGFR3 signaling. The double-low group moderately expressed genes related to cell cycle and macromolecule biosynthesis. All K20-only group tumors were classified as UROMOL "class 2" by the machine learning algorithms. K5/6 and K20 expression levels indicate the transcriptional subtypes of NMIBC. The K5/6Low-K20High expression is a marker of high-risk NMIBC.

Jung Minsun, Jang Insoon, Kim Kwangsoo, Moon Kyung Chul


biomarkers, gene expression profiling, keratin-20, keratin-5/6, non-muscle-invasive bladder cancer, urinary bladder neoplasms

Radiology

Integrative analysis for COVID-19 patient outcome prediction.

In Medical image analysis

While image analysis of chest computed tomography (CT) for COVID-19 diagnosis has been intensively studied, little work has been performed for image-based patient outcome prediction. Management of high-risk patients with early intervention is key to lowering the fatality rate of COVID-19 pneumonia, as a majority of patients recover naturally. Therefore, an accurate prediction of disease progression with baseline imaging at the time of the initial presentation can help in patient management. Rather than using only size and volume information of pulmonary abnormalities and features obtained through deep learning based image segmentation, here we combine radiomics of lung opacities and non-imaging features from demographic data, vital signs, and laboratory findings to predict the need for intensive care unit (ICU) admission. To our knowledge, this is the first study that uses holistic information of a patient including both imaging and non-imaging data for outcome prediction. The proposed methods were thoroughly evaluated on datasets separately collected from three hospitals, one in the United States, one in Iran, and another in Italy, with a total of 295 patients with reverse transcription polymerase chain reaction (RT-PCR) assay positive COVID-19 pneumonia. Our experimental results demonstrate that adding non-imaging features can significantly improve the performance of prediction to achieve AUC up to 0.884 and sensitivity as high as 96.1%, which can be valuable to provide clinical decision support in managing COVID-19 patients. Our methods may also be applied to other lung diseases including but not limited to community acquired pneumonia. The source code of our work is available at

Chao Hanqing, Fang Xi, Zhang Jiajin, Homayounieh Fatemeh, Arru Chiara D, Digumarthy Subba R, Babaei Rosa, Mobin Hadi K, Mohseni Iman, Saba Luca, Carriero Alessandro, Falaschi Zeno, Pasche Alessio, Wang Ge, Kalra Mannudeep K, Yan Pingkun


Artificial intelligence, COVID-19, Chest CT, Outcome prediction
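
The abstract above reports AUC and sensitivity. For readers less familiar with these metrics, the following sketch computes both from predicted risk scores. The scores, labels, and the 0.5 threshold are invented for illustration and are unrelated to the study's data; AUC is computed with the standard pairwise (Mann-Whitney) definition.

```python
# Toy computation of AUC and sensitivity for a binary outcome (e.g. ICU admission).
# Scores, labels and the decision threshold are hypothetical.

def auc(scores, labels):
    """Probability that a random positive case outscores a random negative case."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def sensitivity(scores, labels, threshold):
    """Fraction of positive cases whose score reaches the threshold."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    detected = sum(1 for s in positives if s >= threshold)
    return detected / len(positives)


labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

auc_value = auc(scores, labels)
sens_value = sensitivity(scores, labels, threshold=0.5)
print(auc_value, sens_value)
```

On this toy data, 11 of the 12 positive-negative pairs are ranked correctly (AUC 11/12), while only 2 of the 3 positive cases clear the threshold (sensitivity 2/3). The two metrics answer different questions, which is why the abstract reports both.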

Pathology

PAIP 2019: Liver cancer segmentation challenge.

In Medical image analysis

Pathology Artificial Intelligence Platform (PAIP) is a free research platform in support of pathological artificial intelligence (AI). The main goal of the platform is to construct a high-quality pathology learning data set that will allow greater accessibility. The PAIP Liver Cancer Segmentation Challenge, organized in conjunction with the Medical Image Computing and Computer Assisted Intervention Society (MICCAI 2019), is the first image analysis challenge to apply PAIP datasets. The goal of the challenge was to evaluate new and existing algorithms for automated detection of liver cancer in whole-slide images (WSIs). Additionally, this year's PAIP attempted to address potential future problems of AI applicability in clinical settings. In the challenge, participants were asked to use analytical data and statistical metrics to evaluate the performance of automated algorithms in two different tasks: Task 1 involved Liver Cancer Segmentation and Task 2 involved Viable Tumor Burden Estimation. Performance on the two tasks was strongly correlated: teams that performed well on Task 1 also performed well on Task 2. After evaluation, we summarized the algorithms of the top 11 teams. We then gave pathological implications on the easily predicted images for cancer segmentation and the challenging images for viable tumor burden estimation. Out of the 231 participants of the PAIP challenge datasets, a total of 64 submissions were made by 28 teams. The submitted algorithms predicted the automatic segmentation of liver cancer in WSIs with a score of 0.78. The PAIP challenge was created in an effort to combat the lack of research that has been done to address liver cancer using digital pathology. It remains unclear how the AI algorithms created during the challenge can affect clinical diagnoses. However, the dataset and evaluation metrics provided have the potential to aid the development and benchmarking of cancer diagnosis and segmentation.

Kim Yoo Jung, Jang Hyungjoon, Lee Kyoungbun, Park Seongkeun, Min Sung-Gyu, Hong Choyeon, Park Jeong Hwan, Lee Kanggeun, Kim Jisoo, Hong Wonjae, Jung Hyun, Liu Yanling, Rajkumar Haran, Khened Mahendra, Krishnamurthi Ganapathy, Yang Sen, Wang Xiyue, Han Chang Hee, Kwak Jin Tae, Ma Jianqiang, Tang Zhe, Marami Bahram, Zeineh Jack, Zhao Zixu, Heng Pheng-Ann, Schmitz Rüdiger, Madesta Frederic, Rösch Thomas, Werner Rene, Tian Jie, Puybareau Elodie, Bovio Matteo, Zhang Xiufeng, Zhu Yifeng, Chun Se Young, Jeong Won-Ki, Park Peom, Choi Jinwook


Challenge, Digital pathology, Liver cancer, Segmentation, Tumor burden
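
The abstract above does not name the evaluation metric behind the reported score or define tumor burden precisely. As a hedged sketch, the code below assumes an overlap score of the Jaccard (intersection over union) type for the segmentation task, and assumes viable tumor burden is the ratio of viable tumor area to whole tumor area. Both definitions, the function names, and the tiny binary masks are illustrative assumptions, not details confirmed by the text.

```python
# Toy segmentation metrics on small binary masks (1 = tumor pixel, 0 = background).
# The metric choices and masks are assumptions, not the challenge's official code.

def jaccard_index(pred_mask, true_mask):
    """Intersection over union of two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p == 1 and t == 1:
                intersection += 1
            if p == 1 or t == 1:
                union += 1
    return intersection / union if union else 1.0


def viable_tumor_burden(viable_mask, whole_tumor_mask):
    """Ratio of viable tumor area to whole tumor area (assumed definition)."""
    viable_area = sum(sum(row) for row in viable_mask)
    whole_area = sum(sum(row) for row in whole_tumor_mask)
    return viable_area / whole_area if whole_area else 0.0


predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [1, 1, 0]]
jaccard = jaccard_index(predicted, reference)

whole_tumor = [[1, 1, 1],
               [1, 1, 0]]
viable_tumor = [[0, 1, 1],
                [0, 0, 0]]
burden = viable_tumor_burden(viable_tumor, whole_tumor)
print(jaccard, burden)
```

In this example the masks share 2 pixels out of 4 in their union (Jaccard 0.5), and 2 of the 5 tumor pixels are viable (burden 0.4). Real whole-slide images would be processed in tiles rather than as nested lists.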

General

Predicting the progression of mild cognitive impairment using machine learning: A systematic, quantitative and critical review.

In Medical image analysis

We performed a systematic review of studies focusing on the automatic prediction of the progression of mild cognitive impairment to Alzheimer's disease (AD) dementia, and a quantitative analysis of the methodological choices impacting performance. This review included 172 articles, from which 234 experiments were extracted. For each of them, we reported the data set used, the feature types, the algorithm type, performance and potential methodological issues. The impact of these characteristics on the performance was evaluated using multivariate mixed-effect linear regressions. We found that using cognitive, fluorodeoxyglucose-positron emission tomography or potentially electroencephalography and magnetoencephalography variables significantly improved predictive performance compared to not including them, whereas including other modalities, in particular T1 magnetic resonance imaging, did not show a significant effect. The good performance of cognitive assessments calls into question the wide use of imaging for predicting the progression to AD and advocates for exploring further fine domain-specific cognitive assessments. We also identified several methodological issues, including the absence of a test set, or its use for feature selection or parameter tuning in nearly a fourth of the papers. Other issues, found in 15% of the studies, cast doubts on the relevance of the method to clinical practice. We also highlight that short-term predictions are likely not to be better than predicting that subjects stay stable over time. These issues highlight the importance of adhering to good practices for the use of machine learning as a decision support system for clinical practice.

Ansart Manon, Epelbaum Stéphane, Bassignana Giulia, Bône Alexandre, Bottani Simona, Cattai Tiziana, Couronné Raphaël, Faouzi Johann, Koval Igor, Louis Maxime, Thibeau-Sutre Elina, Wen Junhao, Wild Adam, Burgos Ninon, Dormont Didier, Colliot Olivier, Durrleman Stanley


Alzheimer’s disease, Automatic prediction, Cognition, Mild cognitive impairment, Progression, Quantitative review