Internal Medicine

Predictors of mortality and endoscopic intervention in patients with upper gastrointestinal bleeding in the intensive care unit.

In Gastroenterology report

Background: The outcomes of patients undergoing esophagogastroduodenoscopy (EGD) in the intensive care unit (ICU) for upper gastrointestinal bleeding (UGIB) are not well described. Our aims were to determine predictors of 30-day mortality and endoscopic intervention, and to assess the utility of existing clinical-prediction tools for UGIB in this population.

Methods: Patients hospitalized in an ICU between 2008 and 2015 who underwent EGD were identified using a validated machine-learning algorithm. Logistic regression was used to determine factors associated with 30-day mortality and endoscopic intervention. Area under the receiver operating characteristic curve (AUROC) analysis was used to evaluate how well established UGIB scoring systems predict mortality and endoscopic intervention in patients who presented to the hospital with UGIB.

Results: A total of 606 patients underwent EGD for UGIB while admitted to an ICU. The median age of the cohort was 62 years and 55.9% were male. Multivariate analysis revealed that predictors associated with 30-day mortality included American Society of Anesthesiologists (ASA) class (odds ratio [OR] 4.1, 95% confidence interval [CI] 2.2-7.9), Charlson score (OR 1.2, 95% CI 1.0-1.3), and duration from hospital admission to EGD (OR 1.04, 95% CI 1.01-1.07). Rockall, Glasgow-Blatchford, and AIMS65 scores were poorly predictive of endoscopic intervention (AUROC: 0.521, 0.514, and 0.540, respectively) and in-hospital mortality (AUROC: 0.510, 0.568, and 0.506, respectively).

Conclusions: Predictors associated with 30-day mortality include ASA classification, Charlson score, and duration in the hospital prior to EGD. Existing risk tools are poorly predictive of clinical outcomes, which highlights the need for a more accurate risk-stratification tool to predict the benefit of intervention within the ICU population.

Rao Vijaya L, Gupta Nina, Swei Eric, Wagner Thomas, Aronsohn Andrew, Reddy K Gautham, Sengupta Neil


esophagogastroduodenoscopy, intensive care unit, upper gastrointestinal bleeding
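
To make the AUROC figures in this abstract easier to read, here is a minimal sketch of how an AUROC can be computed from a risk score and a binary outcome. It uses the rank interpretation of AUROC: the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it, with ties counted as one half. The function name `auroc` and the toy scores are illustrative assumptions, not data or code from the study.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive and negative cases.

    scores: risk-score values (higher means higher predicted risk)
    labels: 1 if the outcome occurred, 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six patients, three with the outcome.
toy_scores = [1, 2, 3, 4, 5, 6]
toy_labels = [0, 0, 1, 0, 1, 1]
toy_auc = auroc(toy_scores, toy_labels)

# A score that is identical for every patient carries no information.
flat_auc = auroc([3, 3, 3, 3], [0, 1, 0, 1])
```

On this scale 0.5 corresponds to a score that discriminates no better than chance, which is why values such as 0.51 to 0.57 are described as poorly predictive.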

Radiology

Differences in cohort study data affect external validation of artificial intelligence models for predictive diagnostics of dementia - lessons for translation into clinical practice.

In The EPMA journal

Artificial intelligence (AI) approaches offer a great opportunity for individualized, pre-symptomatic disease diagnosis, which plays a key role in the context of personalized, predictive, and finally preventive medicine (PPPM). However, to translate PPPM into clinical practice, it is of utmost importance that AI-based models are carefully validated. The validation process comprises several steps, one of which is testing the model on patient-level data from an independent clinical cohort study. However, recruitment criteria can bias statistical analysis of cohort study data and impede model application beyond the training data. To evaluate whether and how data from independent clinical cohort studies differ from each other, this study systematically compares the datasets collected from two major dementia cohorts, namely, the Alzheimer's Disease Neuroimaging Initiative (ADNI) and AddNeuroMed. The comparison was conducted at the level of individual features and revealed significant differences between the two cohorts. Such systematic deviations can hamper the generalizability of results that are based on a single cohort dataset. Despite the identified differences, validation of a previously published, ADNI-trained model for prediction of personalized dementia risk scores on 244 AddNeuroMed subjects was successful: external validation resulted in a high prediction performance, with an area under the receiver operating characteristic curve above 80% up to 6 years before dementia diagnosis. Propensity score matching identified a subset of patients from AddNeuroMed that showed significantly smaller demographic differences to ADNI. For these patients, an even higher prediction performance was achieved, which demonstrates the influence that systematic differences between cohorts can have on validation results.
In conclusion, this study exposes challenges in external validation of AI models on cohort study data and is one of the rare cases in the neurology field in which such external validation was performed. The presented model represents a proof of concept that reliable models for personalized predictive diagnostics are feasible, which, in turn, could lead to adequate disease prevention and thereby enable the PPPM paradigm in the dementia field.

Birkenbihl Colin, Emon Mohammad Asif, Vrooman Henri, Westwood Sarah, Lovestone Simon, Hofmann-Apitius Martin, Fröhlich Holger


Alzheimer’s disease, Artificial intelligence, Bioinformatics, Cohort comparison, Cohort data, Data science, Dementia, Digital clinic, Disease modeling, Disease risk prediction, Health data, Individualized patient profiling, Interdisciplinary, Machine learning, Medical data, Model performance, Model validation, Multiprofessional, Neurodegeneration, Precision medicine, Predictive preventive personalized medicine (3 PM/PPPM), Propensity score matching, Risk modeling, Sampling bias, Survival analysis, Translational medicine
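
The propensity score matching step mentioned in this abstract can be illustrated with a small sketch. The abstract does not describe the matching procedure, so everything below is an assumption: propensity scores are taken as already estimated (in practice often by logistic regression on demographic covariates), matching is greedy 1:1 nearest-neighbour without replacement, and a caliper limits how far apart a matched pair may be. The names `greedy_match`, `reference_scores`, and `pool_scores` are hypothetical.

```python
def greedy_match(reference_scores, pool_scores, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on a propensity score.

    For each reference subject (in input order), pick the closest
    still-unmatched pool subject, provided the score difference is
    within the caliper. Returns (reference_index, pool_index) pairs.
    """
    available = set(range(len(pool_scores)))
    pairs = []
    for i, score in enumerate(reference_scores):
        if not available:
            break
        # Break distance ties by the lower index so the result is deterministic.
        j = min(available, key=lambda k: (abs(pool_scores[k] - score), k))
        if abs(pool_scores[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)  # matching without replacement
    return pairs


# Hypothetical scores: two reference-cohort subjects, four candidates.
reference_scores = [0.30, 0.60]
pool_scores = [0.10, 0.32, 0.58, 0.95]
matched = greedy_match(reference_scores, pool_scores)
```

Pool subjects that find no partner within the caliper are left out, which is how matching yields a subset that resembles the reference cohort more closely than the full pool does.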

General

HSMA_WOA: A hybrid novel Slime mould algorithm with whale optimization algorithm for tackling the image segmentation problem of chest X-ray images.

In Applied soft computing

Recently, the novel coronavirus disease COVID-19 has spread worldwide, starting in China, and has caused many deaths. Many attempts have been made to identify COVID-19 infection. One of them is the use of X-ray images to detect the effect of COVID-19 on infected persons. According to X-ray analysis, COVID-19 can cause bilateral pulmonary parenchymal ground-glass and consolidative pulmonary opacities, sometimes with a rounded morphology and a peripheral lung distribution. Unfortunately, determining from X-ray images alone whether a person is infected with COVID-19 is very hard. X-ray images can be classified using machine learning techniques to determine whether a person is severely infected, mildly infected, or not infected. To improve the classification accuracy, the region of interest within the image that contains the features of COVID-19 must be extracted. This problem is called the image segmentation problem (ISP). Many techniques have been proposed to tackle ISP. The most commonly used, owing to its simplicity, speed, and accuracy, is threshold-based segmentation. This paper proposes a new hybrid approach based on thresholding to tackle ISP for COVID-19 chest X-ray images by integrating a novel meta-heuristic algorithm, the slime mould algorithm (SMA), with the whale optimization algorithm to maximize Kapur's entropy. The performance of the integrated SMA has been evaluated on 12 chest X-ray images with threshold levels up to 30 and compared with five algorithms, namely the Lshade algorithm, the whale optimization algorithm (WOA), the FireFly algorithm (FFA), the Harris-hawks algorithm (HHA), and the salp swarm algorithm (SSA), as well as the standard SMA.
The experimental results demonstrate that the proposed algorithm outperforms the standard SMA under Kapur's entropy on all the metrics used, and that the standard SMA, in turn, performed better than the other algorithms in the comparison on all the metrics.

Abdel-Basset Mohamed, Chang Victor, Mohamed Reda


COVID-19, Image segmentation problem, Kapur’s entropy, Slime mould algorithm (SMA), Whale optimization algorithm, X-ray images
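
The objective that the hybrid algorithm maximizes, Kapur's entropy, can be sketched in a few lines. The code below is an illustrative assumption about the standard formulation, not the authors' implementation: a grey-level histogram is split at the given thresholds, and the objective is the sum of the Shannon entropies of the normalized classes. The exhaustive search in `best_single_threshold` only covers the single-threshold case; the function names and the toy histogram are hypothetical.

```python
import math


def kapur_entropy(hist, thresholds):
    """Sum of class entropies for a histogram split at the given thresholds.

    A threshold t separates bins [.., t-1] from bins [t, ..].
    """
    total = sum(hist)
    probs = [h / total for h in hist]
    bounds = [0] + sorted(thresholds) + [len(hist)]
    score = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        weight = sum(probs[lo:hi])
        if weight <= 0:
            continue  # an empty class contributes nothing
        # Entropy of the class after renormalizing its probabilities.
        score -= sum((p / weight) * math.log(p / weight)
                     for p in probs[lo:hi] if p > 0)
    return score


def best_single_threshold(hist):
    """Exhaustive search over all single thresholds (bi-level case only)."""
    return max(range(1, len(hist)), key=lambda t: kapur_entropy(hist, [t]))


# Hypothetical 4-bin histogram with equal counts.
toy_hist = [5, 5, 5, 5]
toy_score = kapur_entropy(toy_hist, [2])
toy_best = best_single_threshold(toy_hist)
```

With many thresholds the number of candidate combinations grows combinatorially, so exhaustive search quickly becomes impractical; that is the usual motivation for handing this objective to a meta-heuristic search.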

General

Time series forecasting of Covid-19 using deep learning models: India-USA comparative case study.

In Chaos, solitons, and fractals

Covid-19 is a highly contagious disease that has almost frozen the world along with its economy. Its human-to-human and surface-to-human transmission has pushed the world into a catastrophic phase. In this study, our aim is to predict the future course of the novel coronavirus in order to reduce its impact. We propose a deep-learning-based comparative analysis of Covid-19 cases in India and the USA. Datasets of confirmed cases and deaths are considered. Recurrent neural network (RNN) based variants of long short-term memory (LSTM), namely Stacked LSTM, Bi-directional LSTM and Convolutional LSTM, are used to design the proposed methodology and forecast Covid-19 cases one month ahead. Convolutional LSTM outperformed the other two models and predicted Covid-19 cases with high accuracy and very low error for all four datasets of both countries. Upward and downward trends in the forecasted cases are also visualized graphically, which should help researchers and policy makers reduce mortality and morbidity by steering the course of Covid-19 in the right direction.

Shastri Sourabh, Singh Kuljeet, Kumar Sachin, Kour Paramjit, Mansotra Vibhakar


Covid-19, Deep learning, Forecasting, LSTM, Recurrent neural networks, Time series
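
Before a case-count series can be fed to an LSTM-style model, it is usually reframed as supervised pairs of past values and a future target. The abstract does not say how the inputs were prepared, so the sketch below shows one common framing as an assumption: a sliding window of `n_lags` past observations predicts the value `horizon` steps ahead. The function name `make_windows` and the toy series are hypothetical.

```python
def make_windows(series, n_lags, horizon=1):
    """Turn a univariate series into (input window, target) pairs.

    Each input is n_lags consecutive observations; the target is the
    observation `horizon` steps after the end of the window.
    """
    inputs, targets = [], []
    for start in range(len(series) - n_lags - horizon + 1):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags + horizon - 1])
    return inputs, targets


# Hypothetical daily counts, not data from the study.
toy_series = [10, 12, 15, 19, 24]
toy_inputs, toy_targets = make_windows(toy_series, n_lags=3)
```

A longer horizon, such as the one-month-ahead forecast described above, can be handled either by raising `horizon` or by feeding one-step predictions back into the window; which of these the authors used is not stated.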

Public Health

Spatio-temporal estimation of the daily cases of COVID-19 in worldwide using random forest machine learning algorithm.

In Chaos, solitons, and fractals

The novel coronavirus pandemic, which has negatively affected public health in social, psychological and economic terms, spread across the whole world in a short period of 6 months. However, the rate of increase in cases was not the same in every country. The measures implemented by countries changed the daily rate of spread of the disease, as reflected in changes in the number of daily cases. In this study, the performance of the Random Forest (RF) machine learning algorithm was investigated in estimating near-future case numbers for 190 countries, and the estimates were mapped in comparison with the actual confirmed cases. The confirmed cases between 23/01/2020 and 17/06/2020 were divided into 3 main sub-datasets for the random forest model: training sub-data, testing sub-data (interpolation data) and estimating sub-data (extrapolation data). For the testing sub-data, R² values of the RF model estimates ranged between 0.843 and 0.995 (mean R² = 0.959) and RMSE values between 141.76 and 526.18 (mean RMSE = 259.38); for the estimating sub-data, R² values ranged between 0.690 and 0.968 (mean R² = 0.914) and RMSE values between 549.73 and 2500.79 (mean RMSE = 909.37). These results show that the random forest machine learning algorithm performs well in estimating near-future case numbers in an epidemic such as the novel coronavirus, which breaks out suddenly and spreads rapidly.

Yeşilkanat Cafer Mert


COVID-19, Estimating, Machine learning, Mapping, Random forest
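
The two error measures reported in this abstract can be written out explicitly. The sketch below assumes the usual definitions: RMSE as the square root of the mean squared error, and R² as the coefficient of determination (one minus the ratio of residual to total sum of squares). The abstract does not state which R² definition was used, and the case counts are invented for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and estimated values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily case counts and model estimates.
toy_actual = [100, 200, 300, 400]
toy_predicted = [110, 190, 310, 390]
toy_rmse = rmse(toy_actual, toy_predicted)
toy_r2 = r_squared(toy_actual, toy_predicted)
```

RMSE is expressed in the same unit as the data (cases per day here), so it grows with the size of the outbreak, while R² is unitless; this is worth keeping in mind when comparing the interpolation and extrapolation figures across countries.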

General

PreLnc: An Accurate Tool for Predicting lncRNAs Based on Multiple Features.

In Genes

Accumulating evidence indicates that long non-coding RNAs (lncRNAs) have certain similarities with messenger RNAs (mRNAs) and are associated with numerous important biological processes, which creates a need for methods to distinguish the two. A variety of machine-learning-based methods have been developed to identify lncRNAs, providing important basic data for subsequent studies. However, many tools lack scalability, versatility and balance, and some rely on genome sequence and annotation. In this paper, we propose a convenient and accurate tool, "PreLnc", which uses high-confidence lncRNA and mRNA transcripts to build prediction models through feature selection and classifiers. The false discovery rate (FDR) adjusted P-value and the Z-value were used to analyze the tri-nucleotide composition of transcripts of different species. The experiment shows that there were significant differences in RNA transcripts among plants, which may be related to evolutionary conservation and the fact that plants have been under evolutionary pressure for a longer time than animals. In combination with the Pearson correlation coefficient, we use the incremental feature selection (IFS) method and a comparison of multiple classifiers to build the model. Finally, the balanced random forest was used to construct the classifier, and PreLnc obtained 91.09% accuracy for 349,186 transcripts of animals and plants. In addition, on standard performance measurements, PreLnc performed better than other prediction tools.

Cao Lei, Wang Yupeng, Bi Changwei, Ye Qiaolin, Yin Tongming, Ye Ning


feature selection, lncRNA, prediction, tri-nucleotide
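
The tri-nucleotide composition feature mentioned in this abstract can be illustrated as follows. This is a sketch under assumptions, not PreLnc's code: tri-nucleotides are counted in overlapping windows with a step of one base, RNA input is normalized by mapping U to T, windows containing ambiguous bases are skipped, and counts are converted to frequencies over the 64 possible tri-nucleotides. The function name and the toy sequence are hypothetical.

```python
from itertools import product


def trinucleotide_composition(seq):
    """Frequency of each of the 64 tri-nucleotides in a transcript.

    Overlapping windows (step 1) are an assumption of this sketch;
    windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper().replace("U", "T")
    counts = {"".join(p): 0 for p in product("ACGT", repeat=3)}
    total = 0
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    # Avoid division by zero for sequences shorter than 3 bases.
    return {k: (c / total if total else 0.0) for k, c in counts.items()}


# Hypothetical toy transcript: windows are ATG, TGA, GAT, ATG, TGA.
toy_comp = trinucleotide_composition("AUGAUGA")
```

The resulting 64-value vector is the kind of fixed-length input on which feature selection and a classifier such as a random forest can operate, regardless of transcript length.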