Surgery

Transcriptomes of the tumor-adjacent normal tissues are more informative than tumors in predicting recurrence in colorectal cancer patients.

In Journal of translational medicine

BACKGROUND : Previous investigations of transcriptomic signatures of cancer patient survival and post-therapy relapse have focused on tumor tissue. In contrast, here we show that, in colorectal cancer (CRC), transcriptomes derived from normal tissues adjacent to tumors (NATs) are better predictors of relapse.

RESULTS : Using the transcriptomes of paired tumor and NAT specimens from 80 Korean CRC patients retrospectively determined to be in recurrence or nonrecurrence states, we found that, when comparing recurrent with nonrecurrent samples, NATs exhibit a greater number of differentially expressed genes (DEGs) than tumors. Training two prognostic elastic net-based machine learning models (NAT-based and tumor-based) in our Samsung Medical Center (SMC) cohort, we found that the NAT-based model performed better in predicting survival when applied to the tumor-derived transcriptomes of an independent cohort of 450 COAD patients in TCGA. Furthermore, the compositions of tumor-infiltrating immune cells in NATs were found to have better prognostic capability than those in tumors. We also confirmed through Cox regression analysis that, in both the SMC-CRC and TCGA-COAD cohorts, a greater proportion of genes exhibited a significant hazard ratio when the NAT-derived transcriptome was used than when the tumor-derived transcriptome was used.

CONCLUSIONS : Taken together, our results strongly suggest that NAT-derived transcriptomes and immune cell composition of CRC are better predictors of patient survival and tumor recurrence than the primary tumor.
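
The elastic net models mentioned in the results combine an L1 (lasso) and an L2 (ridge) penalty, then are applied to an independent cohort. The abstract does not give the loss function, hyperparameters, or preprocessing, so the following is only a minimal sketch under stated assumptions: a squared-error elastic net fitted by coordinate descent on synthetic data. The names `elastic_net`, `make_cohort`, `lam`, and `alpha` are hypothetical and are not taken from the paper.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the update)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||^2).
    No intercept is fitted; features and target are assumed centered."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    residual = list(y)  # equals y - X @ coef while coef is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * coef[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                coef[j] = new
    return coef


def make_cohort(n, seed):
    """Synthetic stand-in for a cohort: 5 features, only the first 2 informative."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(n)]
    y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y


def predict(coef, X):
    return [sum(c * v for c, v in zip(coef, row)) for row in X]


# Fit on a "training cohort", then score an independent "external cohort".
train_X, train_y = make_cohort(400, seed=0)
external_X, external_y = make_cohort(200, seed=1)
coef = elastic_net(train_X, train_y, lam=0.5, alpha=0.5)
predictions = predict(coef, external_X)
external_mse = sum((p - t) ** 2
                   for p, t in zip(predictions, external_y)) / len(external_y)
print("coefficients:", [round(c, 3) for c in coef])
print("external MSE:", round(external_mse, 3))
```

In this toy setting the L1 term pushes the uninformative coefficients to or near zero while the L2 term shrinks the informative ones. A survival or recurrence outcome would normally call for a Cox or logistic loss in place of the squared error used here.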

Kim Jinho, Kim Hyunjung, Lee Min-Seok, Lee Heetak, Kim Yeon Jeong, Lee Woo Yong, Yun Seong Hyeon, Kim Hee Cheol, Hong Hye Kyung, Hannenhalli Sridhar, Cho Yong Beom, Park Donghyun, Choi Sun Shim

2023-Mar-21

Colorectal cancer, Elastic net-based machine learning, Normal tissues adjacent to tumors, Recurrence, Tumor-infiltrating immune cells

Pathology

Use of machine learning-based integration to develop a monocyte differentiation-related signature for improving prognosis in patients with sepsis.

In Molecular medicine (Cambridge, Mass.)

BACKGROUND : Although significant advances have been made in intensive care medicine and antibacterial treatment, sepsis is still a common disease with high mortality. The condition of sepsis patients changes rapidly, and each hour of delay in the administration of appropriate antibiotic treatment can lead to a 4-7% increase in fatality. Therefore, early diagnosis and intervention may help improve the prognosis of patients with sepsis.

METHODS : We obtained single-cell sequencing data from 12 patients. This included 14,622 cells from four patients with bacterial infectious sepsis and eight patients with sepsis admitted to the ICU for various other reasons. Monocyte differentiation trajectories were analyzed using the "monocle" software, and differentiation-related genes were identified. Based on the expression of differentiation-related genes, 99 machine-learning combinations of prognostic signatures were obtained, and risk scores were calculated for all patients. The "scissor" software was used to associate high-risk and low-risk patients with individual cells. The "cellchat" software was used to demonstrate the regulatory relationships between high-risk and low-risk cells in a cellular communication network. The diagnostic value and prognostic predictive value of Enah/Vasp-like (EVL) were determined. Clinical validation of the results was performed with 40 samples. The "CBNplot" software, based on Bayesian network inference, was used to construct EVL regulatory networks.

RESULTS : We systematically analyzed three cell states during monocyte differentiation. The differential analysis identified 166 monocyte differentiation-related genes. Among the 99 machine-learning combinations of prognostic signatures constructed, the Lasso + CoxBoost signature with 17 genes showed the best prognostic prediction performance. The highest percentage of high-risk cells was found in state one. Cell communication analysis demonstrated regulatory networks between high-risk and low-risk cell subpopulations and other immune cells. We then determined the diagnostic and prognostic value of EVL stabilization in multiple external datasets. Experiments with clinical samples demonstrated the accuracy of this analysis. Finally, Bayesian network inference revealed potential network mechanisms of EVL regulation.

CONCLUSIONS : Monocyte differentiation-related prognostic signatures based on the Lasso + CoxBoost combination were able to accurately predict the prognostic status of patients with sepsis. In addition, low EVL expression was associated with poor prognosis in sepsis.

Ning Jingyuan, Sun Keran, Wang Xuan, Fan Xiaoqing, Jia Keqi, Cui Jinlei, Ma Cuiqing

2023-Mar-20

EVL, Machine learning, Prognosis, Sepsis, Single cell

General

Intensive care photoplethysmogram datasets and machine-learning for blood pressure estimation: Generalization not guarantied.

In Frontiers in physiology

The large MIMIC waveform dataset, sourced from intensive care units, has been used extensively for the development of photoplethysmography (PPG) based blood pressure (BP) estimation algorithms. Yet, because the data comes from patients in severe conditions, often under the effect of drugs, it is regularly noted that the relationship between BP and PPG signal characteristics may be anomalous, a claim that we investigate here. A sample of 12,000 records from the MIMIC waveform dataset was compared with the 219 records of the PPG-BP dataset, an alternative public dataset obtained under controlled experimental conditions. The distributions of systolic and diastolic BP data and of 31 PPG pulse morphological features were first compared between datasets. Then, the correlation between features and BP, as well as between the features themselves, was analysed. Finally, regression models were trained on each dataset and validated against the other. Statistical analysis showed significant (p < 0.001) differences between the datasets in diastolic BP and in 20 out of 31 features when adjusting for heart rate differences. The eight features showing the highest rank correlation (ρ > 0.40) with systolic BP in PPG-BP all displayed muted correlation levels (ρ < 0.10) in MIMIC. Regression tests showed baseline predictive power twice as high with PPG-BP as with MIMIC. Cross-dataset regression displayed a practically complete loss of predictive power for all models. The differences between the MIMIC and PPG-BP datasets exposed in this study suggest that BP estimation models based on the MIMIC dataset have reduced predictive power on the general population.
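
The two analyses described in this abstract, rank correlation between a feature and BP, and regression trained on one dataset and validated on the other, can be illustrated with a small sketch. It is not the authors' pipeline: the data are synthetic, a single hypothetical feature stands in for the 31 morphological features, and the names `make_dataset`, `controlled`, and `icu` are assumptions made purely for illustration.

```python
import random
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def r_squared(model, x, y):
    """Coefficient of determination of a fitted line on (x, y)."""
    slope, intercept = model
    my = mean(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


def make_dataset(n, slope, noise, seed):
    """Synthetic (feature, systolic BP) pairs; all values are made up."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 1) for _ in range(n)]
    y = [120 + slope * a + rng.gauss(0, noise) for a in x]
    return x, y


# Hypothetical stand-ins: a strong feature-BP link vs. a weak, noisy one.
controlled = make_dataset(200, slope=40.0, noise=5.0, seed=1)
icu = make_dataset(200, slope=2.0, noise=15.0, seed=2)

model = fit_line(*controlled)          # train on one dataset
within = r_squared(model, *controlled) # score on the same dataset
across = r_squared(model, *icu)        # score on the other dataset
print("rho controlled:", round(spearman(*controlled), 2))
print("rho icu:", round(spearman(*icu), 2))
print("R^2 within:", round(within, 2), "R^2 across:", round(across, 2))
```

When the feature-to-BP relationship differs between the two synthetic datasets, the cross-dataset R² falls well below the within-dataset value and can become negative, which is the kind of loss of predictive power the abstract reports.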

Weber-Boisvert Guillaume, Gosselin Benoit, Sandberg Frida

2023

BP estimation, PPG datasets, PPG-BP, UCI, blood pressure estimation, intensive care datasets, mimic, photoplethysmography

General

Using machine learning to identify early predictors of adolescent emotion regulation development.

In Journal of research on adolescence : the official journal of the Society for Research on Adolescence

As 20% of adolescents develop emotion regulation difficulties, it is important to identify their early predictors. Using the machine learning algorithm SEM-forests, we ranked the importance of 87 candidate variables assessed at age 13 in predicting quadratic latent trajectory models of emotion regulation development from age 14 to 18. Participants were 497 Dutch families. Results indicated that the most important predictors were individual differences (e.g., in personality), aspects of relationship quality and conflict behaviors with parents and peers, and internalizing and externalizing problems. Relatively less important were demographics, bullying, delinquency, substance use, and specific parenting practices, although negative parenting practices ranked higher than positive ones. We discuss implications for theory and interventions, and present an open source risk assessment tool, ERRATA.

Van Lissa Caspar J, Beinhauer Lukas, Branje Susan, Meeus Wim H J

2023-Mar-20

adolescence, emotion regulation, machine learning, random forests, theory formation

General

Obstacles to effective model deployment in healthcare.

In Journal of bioinformatics and computational biology

Despite an exponential increase in publications on clinical prediction models over recent years, the number of models deployed in clinical practice remains fairly limited. In this paper, we identify common obstacles that impede effective deployment of prediction models in healthcare, and investigate their underlying causes. We observe a key underlying cause behind most obstacles: the improper development and evaluation of prediction models. Inherent heterogeneities in clinical data complicate the development and evaluation of clinical prediction models. Many of these heterogeneities are unreported, either because they are deemed irrelevant or because of privacy concerns. We provide real-life examples where failure to handle heterogeneities in clinical data, or sources of bias, led to the development of erroneous models. The purpose of this paper is to familiarize modeling practitioners with common sources of bias and heterogeneity in clinical data, both of which have to be dealt with to ensure proper development and evaluation of clinical prediction models. Proper model development and evaluation, together with complete and thorough reporting, are important prerequisites for a prediction model to be effectively deployed in healthcare.

Chan Wei Xin, Wong Limsoon

2023-Mar-18

Clinical prediction models, deployment, machine learning

General

Asynchrony rescues statistically optimal group decisions from information cascades through emergent leaders.

In Royal Society open science

It is usually assumed that information cascades are most likely to occur when an early but incorrect opinion spreads through the group. Here, we analyse models of confidence-sharing in groups and reveal the opposite result: simple but plausible models of naive-Bayesian decision-making exhibit information cascades when group decisions are synchronous; however, when group decisions are asynchronous, the early decisions reached by Bayesian decision-makers tend to be correct and dominate the group consensus dynamics. Thus, early decisions actually rescue the group from making errors, rather than contributing to them. We explore the likely realism of our assumed decision-making rule with reference to the evolution of mechanisms for aggregating social information, and known psychological and neuroscientific mechanisms.

Reina Andreagiovanni, Bose Thomas, Srivastava Vaibhav, Marshall James A R

2023-Mar

Bayesian brain, collective decision-making, emergent leaders, information cascades