
Oncology

Evaluating upfront high-dose consolidation after R-CHOP for follicular lymphoma by clinical and genetic risk models.

In Blood advances

High-dose therapy and autologous stem cell transplantation (HDT/ASCT) is an effective salvage treatment for eligible patients with follicular lymphoma (FL) and early progression of disease (POD). Since the introduction of rituximab, HDT/ASCT is no longer recommended in first remission. We here explored whether consolidative HDT/ASCT improved survival in defined subgroups of previously untreated patients. We report survival analyses of 431 patients who received frontline rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP) for advanced FL, and were randomized to receive consolidative HDT/ASCT. We performed targeted genotyping of 157 diagnostic biopsies, and calculated genotype-based risk scores. HDT/ASCT improved failure-free survival (FFS; hazard ratio [HR], 0.8, P = .07; as-treated: HR, 0.7, P = .04), but not overall survival (OS; HR, 1.3, P = .27; as-treated: HR, 1.4, P = .13). High-risk cohorts identified by FL International Prognostic Index (FLIPI), and the clinicogenetic risk models m7-FLIPI and POD within 24 months-prognostic index (POD24-PI) comprised 27%, 18%, and 22% of patients. HDT/ASCT did not significantly prolong FFS in high-risk patients as defined by FLIPI (HR, 0.9; P = .56), m7-FLIPI (HR, 0.9; P = .91), and POD24-PI (HR, 0.8; P = .60). Similarly, OS was not significantly improved. Finally, we used a machine-learning approach to predict benefit from HDT/ASCT by genotypes. Patients predicted to benefit from HDT/ASCT had longer FFS with HDT/ASCT (HR, 0.4; P = .03), but OS did not reach statistical significance. Thus, consolidative HDT/ASCT after frontline R-CHOP did not improve OS in unselected FL patients and subgroups selected by genotype-based risk models.

Alig Stefan, Jurinovic Vindi, Shahrokh Esfahani Mohammad, Haebe Sarah, Passerini Verena, Hellmuth Johannes C, Gaitzsch Erik, Keay William, Tahiri Natyra, Zoellner Anna, Rosenwald Andreas, Klapper Wolfram, Stein Harald, Feller Alfred, Ott German, Staiger Annette M, Horn Heike, Hansmann Martin L, Pott Christiane, Unterhalt Michael, Schmidt Christian, Dreyling Martin, Alizadeh Ash A, Hiddemann Wolfgang, Hoster Eva, Weigert Oliver


Internal Medicine

Increasing tendency of urine protein is a risk factor for rapid eGFR decline in patients with CKD: A machine learning-based prediction model by using a big database.

In PloS one ; h5-index 176.0

Artificial intelligence is increasingly being adopted in medical fields to predict various outcomes. In particular, chronic kidney disease (CKD) is problematic because it often progresses to end-stage kidney disease. However, the trajectories of kidney function depend on individual patients. In this study, we propose a machine learning-based model to predict the rapid decline in kidney function among CKD patients by using a big hospital database constructed from the information of 118,584 patients derived from the electronic medical records system. The database included the estimated glomerular filtration rate (eGFR) of each patient, recorded at least twice over a period of 90 days. The data of 19,894 patients (16.8%) were observed to satisfy the CKD criteria. We characterized the rapid decline of kidney function by a decline of 30% or more in the eGFR within a period of two years and classified the available patients into two groups: those exhibiting rapid eGFR decline and those exhibiting non-rapid eGFR decline. Following this, we constructed predictive models based on two machine learning algorithms. Longitudinal laboratory data including urine protein, blood pressure, and hemoglobin were used as covariates. We used longitudinal statistics with a baseline corresponding to 90-, 180-, and 360-day windows prior to the baseline point. The longitudinal statistics included the exponentially smoothed average (ESA), where the weight was defined to be 0.9*(t/b), where t denotes the number of days prior to the baseline point and b denotes the decay parameter. In this study, b was taken to be 7 (7-day ESA). We used logistic regression (LR) and random forest (RF) algorithms based on Python code with the scikit-learn library for model creation. The areas under the curve for LR and RF were 0.71 and 0.73, respectively. The 7-day ESA of urine protein ranked within the first two places in terms of importance according to both models. Further, other features related to urine protein were likely to rank higher than the rest. The LR and RF models revealed that the degree of urine protein, especially if it exhibited an increasing tendency, served as a prominent risk factor associated with rapid eGFR decline.
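As a rough illustration of the exponentially smoothed average described in this abstract, the sketch below computes a weighted mean of past measurements. The abstract prints the weight as 0.9*(t/b); the code assumes the intended form is the exponential decay 0.9 raised to the power t/b, so that older measurements count less, and that the weights are normalized by their sum. Both points are assumptions, as are the function name `esa` and the sample values, none of which come from the paper.

```python
def esa(values, days_before_baseline, b=7.0, base=0.9):
    """Weighted mean of measurements, with weight base ** (t / b).

    values: measurements (e.g. urine protein), one per visit.
    days_before_baseline: t for each measurement, in days before baseline.
    b: decay parameter (the abstract uses b = 7).
    The exponential form and the normalization are assumptions.
    """
    weights = [base ** (t / b) for t in days_before_baseline]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)


# Hypothetical measurements taken 14, 7, and 0 days before baseline.
measurements = [1.0, 2.0, 3.0]
days = [14, 7, 0]
print(esa(measurements, days))
```

With these weights (0.81, 0.9, and 1.0), the most recent measurement contributes most, so a rising series yields an average above the plain mean.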

Inaguma Daijo, Kitagawa Akimitsu, Yanagiya Ryosuke, Koseki Akira, Iwamori Toshiya, Kudo Michiharu, Yuzawa Yukio


Ophthalmology

Association between metabolic risk factors and optic disc cupping identified by deep learning method.

In PloS one ; h5-index 176.0

PURPOSE : This study aims to investigate the correlation between metabolic risk factors and optic disc cupping and the development of glaucoma.

METHODS : This is a retrospective, cross-sectional study of patients over 20 years old who underwent health screening examinations. Intraocular pressure (IOP), fundus photographs, body mass index (BMI), waist circumference (WC), serum triglycerides, serum HDL cholesterol (HDL-C), serum LDL cholesterol (LDL-C), systolic blood pressure (BP), diastolic BP, and serum HbA1c were obtained to analyse the correlation between metabolic risk factors and glaucoma. An eye with glaucomatous optic neuropathy (GON) was defined as having an optic disc with either a vertical cup-to-disc ratio (VCDR) ≥ 0.7 or a VCDR difference ≥ 0.2 between the right and left eyes, with VCDR measured by a deep learning approach.
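The GON definition above amounts to a two-part threshold rule, which can be sketched as follows. This is a minimal illustration applied at the subject level (either eye may meet the ratio cutoff), which is an assumption about how the rule was applied; the function name `is_gon` and the example values are hypothetical, and the deep learning step that measures VCDR is not reproduced here.

```python
def is_gon(vcdr_right, vcdr_left, ratio_cutoff=0.7, asymmetry_cutoff=0.2):
    """Flag glaucomatous optic neuropathy from two VCDR measurements.

    Returns True when either eye has VCDR >= ratio_cutoff, or when the
    two eyes differ by >= asymmetry_cutoff.
    """
    # Round the difference so floating-point error does not hide a
    # value that sits exactly on the cutoff (e.g. 0.6 - 0.4).
    asymmetry = round(abs(vcdr_right - vcdr_left), 2)
    large_cup = max(vcdr_right, vcdr_left) >= ratio_cutoff
    return large_cup or asymmetry >= asymmetry_cutoff


# Hypothetical subjects.
print(is_gon(0.70, 0.65))  # meets the ratio cutoff
print(is_gon(0.60, 0.40))  # meets the asymmetry cutoff
print(is_gon(0.50, 0.45))  # meets neither
```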

RESULTS : The study comprised 15,585 subjects, of whom 877 were diagnosed with GON. In univariate analyses, age, BMI, systolic BP, diastolic BP, WC, triglyceride, LDL-C, HbA1c, and IOP were significantly and positively correlated with VCDR in the optic nerve head. In stepwise multiple linear regression with these factors as independent variables, age, BMI, systolic BP, HbA1c, and IOP showed a positive correlation with VCDR. In multivariate logistic analyses of risk factors and GON, higher age (odds ratio [OR], 1.054; 95% confidence interval [CI], 1.046-1.063), male gender (OR, 0.730; 95% CI, 0.609-0.876), obesity (OR, 1.267; 95% CI, 1.065-1.507), and diabetes (OR, 1.575; 95% CI, 1.214-2.043) remained significantly correlated with GON.

CONCLUSIONS : Among the metabolic risk factors, obesity and diabetes, as well as older age and male gender, are risk factors for developing GON. Glaucoma screening examinations should be considered in populations with these risk factors.

Shin Jonghoon, Kang Min Seung, Park Keunheung, Lee Jong Soo


Public Health

Prediction of hepatitis E using machine learning models.

In PloS one ; h5-index 176.0

BACKGROUND : Accurate and reliable predictions of infectious disease can be valuable to public health organizations that plan interventions to decrease or prevent disease transmission. A great variety of models have been developed for this task. However, for different data series, the performance of these models varies. Hepatitis E, as an acute liver disease, has been a major public health problem. Which model is more appropriate for predicting the incidence of hepatitis E? In this paper, three different methods are used and the performance of the three methods is compared.

METHODS : Autoregressive integrated moving average (ARIMA), support vector machine (SVM), and long short-term memory (LSTM) recurrent neural network models were adopted and compared. ARIMA was implemented in Python with statsmodels. SVM was implemented in MATLAB with the libSVM library. We built the LSTM ourselves with Keras, a deep learning library. To tackle the problem of overfitting caused by limited training samples, we adopted dropout and regularization strategies in our LSTM model. Experimental data were the monthly incidence and number of cases of hepatitis E from January 2005 to December 2017 in Shandong province, China. We selected data from July 2015 to December 2017 to validate the models, and the rest was used as the training set. Three metrics were applied to compare the performance of the models: root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE).
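For reference, the three error metrics named above can be written out directly. The sketch below uses their standard textbook definitions in plain Python; the function names and the sample numbers are illustrative only and are not taken from the study's code or data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    relative = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative) / len(relative)


# Hypothetical monthly case counts and forecasts.
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
print(rmse(observed, forecast), mae(observed, forecast), mape(observed, forecast))
```

RMSE penalizes large misses more heavily than MAE, while MAPE expresses error relative to the observed value, which is why the abstract reports all three.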

RESULTS : After analyzing the data, we chose ARIMA(1, 1, 1) and ARIMA(3, 1, 2) as the prediction models for monthly incidence and number of cases, respectively. Cross-validation and grid search were used to optimize the parameters of the SVM. The penalty coefficient C and kernel function parameter g were set to 8 and 0.125 for incidence prediction, and to 22 and 0.01 for prediction of the number of cases. The LSTM had 4 nodes. The dropout and L2 regularization parameters were set to 0.15 and 0.001, respectively. The RMSE values for ARIMA, SVM, and LSTM were 0.022, 0.0204, and 0.01 for incidence prediction, and 22.25, 20.0368, and 11.75 for prediction of the number of cases. The MAPE values were 23.5%, 21.7%, and 15.08% for incidence prediction, and 23.6%, 21.44%, and 13.6% for prediction of the number of cases. The MAE values were 0.018, 0.0167, and 0.011 for incidence prediction, and 18.003, 16.5815, and 9.984 for prediction of the number of cases.

CONCLUSIONS : Comparing ARIMA, SVM, and LSTM, we found that the nonlinear models (SVM, LSTM) outperformed the linear model (ARIMA). LSTM obtained the best performance on all three metrics: RMSE, MAPE, and MAE. Hence, LSTM is the most suitable of the three for predicting the monthly incidence and number of cases of hepatitis E.

Guo Yanhui, Feng Yi, Qu Fuli, Zhang Li, Yan Bingyu, Lv Jingjing


General

Phenotypic clustering of heart failure with preserved ejection fraction reveals different rates of hospitalization.

In Journal of cardiovascular medicine (Hagerstown, Md.)

AIMS : Approximately 50% of patients with heart failure have preserved (≥50%) ejection fraction (HFpEF). Improved understanding of the phenotypic heterogeneity of HFpEF might facilitate development of targeted therapies and interventions.

METHODS : This retrospective study characterized a cohort of patients with HFpEF based on similar clinical profiles and evaluated 1-year heart failure related hospitalization. Enrolment, medical and pharmacy data were used to identify patients newly diagnosed with heart failure enrolled in a Medicare Advantage Prescription Drug or commercial healthcare plan. To identify only those patients with HFpEF, we used natural language processing techniques of ejection fraction values abstracted from a linked free-text clinical notes data source. The study population comprised 1515 patients newly identified with HFpEF between 1 January 2011 and 31 December 2015.

RESULTS : Using unsupervised machine learning, we identified three distinguishable patient clusters representing different phenotypes: cluster-1 patients had the lowest prevalence of heart failure comorbidities and highest mean age; cluster-2 patients had higher prevalence of metabolic syndrome and pulmonary disease, despite younger mean age; and cluster-3 patients had higher prevalence of cardiac arrhythmia and renal disease. Cluster-3 had the highest 1-year heart failure related hospitalization rates. In within-cluster analysis, prior use of diuretics (cluster-1 and cluster-2) and age (cluster-2 and cluster-3) were associated with 1-year heart failure related hospitalization. Combination therapy was associated with decreased 1-year heart failure related hospitalization in cluster-1.

CONCLUSION : This study demonstrated that clustering can be used to characterize subgroups of patients with newly identified HFpEF and to assess differences in heart failure related hospitalization rates at 1 year, and it suggests that patient subtypes may respond differently to treatments or interventions.

Casebeer Adrianne, Horter Libby, Hayden Jennifer, Simmons Jeff, Evers Thomas


Radiology

Impact of radiomics on prostate cancer detection: a systematic review of clinical applications.

In Current opinion in urology

PURPOSE OF REVIEW : To systematically review the current literature to assess the role of radiomics in the detection and evaluation of prostate cancer (PCa).

RECENT FINDINGS : Radiomics involves the high-throughput extraction of radiologic features from clinical imaging, using a panel of sophisticated data-characterization algorithms to make an objective and quantitative determination of diagnoses and clinical characteristics. Radiomics evaluation of existing clinical images would increase their clinical value in many cancer management pathways, including PCa. However, a consensus on the implementation of radiomics has not been established across different sites, delaying its implementation in clinical practice. There are many potential advantages to radiomics. The ability to extract features from existing clinical imaging is one such advantage. A second is the empiric nature of the analysis. The third lies in the application of new technologies, such as machine learning, to be able to evaluate large quantities of data to make clinical conclusions. In this systematic review, we identify publications regarding the role of radiomics in PCa detection and evaluation. Many of these studies noted that radiomics, when incorporated into predictive models, had an advantageous impact on detection of PCa, clinically significant PCa, and extracapsular extension. This may assist in individualized decision making not only for diagnosis of PCa, but also for surveillance and surgical planning. With additional validation in large sample sizes, and randomized, multicenter studies using a consensus driven methodology, radiomics has the potential to alter the landscape of PCa detection and management, necessitating further prospective randomized investigation.

SUMMARY : Radiomics is a promising new field, allowing for high-throughput analysis of imaging features for PCa detection and evaluation. These features can be extracted from existing data; therefore, the potential for future study is immense.

Sugano Dordaneh, Sanford Daniel, Abreu Andre, Duddalwar Vinay, Gill Inderbir, Cacciamani Giovanni E