Pathology

Characterizing Deep Gaussian Processes via Nonlinear Recurrence Systems

ArXiv Preprint

Recent advances in Deep Gaussian Processes (DGPs) show their potential for more expressive representations than traditional Gaussian Processes (GPs). However, DGPs exhibit a pathology: their learning capacity drops significantly as the number of layers increases. In this paper, we present a new analysis of DGPs that studies their corresponding nonlinear dynamic systems to explain the issue. Existing work reports the pathology for the squared exponential kernel function. We extend the investigation to four types of common stationary kernel functions. The recurrence relations between layers are derived analytically, providing a tighter bound and the rate of convergence of the dynamic systems. We demonstrate our findings with a number of experimental results.

Anh Tong, Jaesik Choi

2020-10-19
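
The layer-to-layer recurrence described in the abstract above can be illustrated with a small Monte Carlo sketch. This is not the authors' analytical derivation. It assumes a DGP with zero-mean layers, one-dimensional outputs, and a squared exponential kernel, and it uses only the standard fact that, given the previous layer's values at two inputs, the difference of the next layer's values is Gaussian with variance 2(variance - k). The function name and parameter defaults are hypothetical.

```python
import math
import random


def simulate_squared_distance(num_layers, z0, variance=1.0, lengthscale=2.0,
                              num_samples=5000, seed=0):
    """Track the mean squared distance between two inputs through DGP layers.

    Assumption (illustrative only): each layer is a zero-mean GP with a
    squared exponential kernel and a one-dimensional output.
    """
    rng = random.Random(seed)
    z = [z0] * num_samples          # squared distance at the input layer
    means = [z0]
    for _ in range(num_layers):
        new_z = []
        for prev in z:
            # Var[f(u) - f(v)] = 2 * (k(u, u) - k(u, v)) for the SE kernel.
            var_diff = 2.0 * variance * (
                1.0 - math.exp(-prev / (2.0 * lengthscale ** 2)))
            d = rng.gauss(0.0, math.sqrt(var_diff))
            new_z.append(d * d)
        z = new_z
        means.append(sum(z) / num_samples)
    return means


means = simulate_squared_distance(num_layers=10, z0=1.0)
for layer, m in enumerate(means):
    print(f"layer {layer}: mean squared distance = {m:.3e}")
```

With these assumed settings the ratio variance / lengthscale² is below one, so the mean squared distance shrinks from layer to layer, which is one way to picture the collapse the abstract calls a pathology.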

General

A four-methylated LncRNA signature predicts survival of osteosarcoma patients based on machine learning.

In Genomics

Risk stratification using prognostic markers facilitates clinical decision-making in treatment of osteosarcoma (OS). In this study, we performed a comprehensive analysis of DNA methylation and transcriptome data from OS patients to establish an optimal methylated lncRNA signature for determining OS patient prognosis. The original OS datasets were downloaded from the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) database. Univariate, Lasso, and machine learning algorithm-iterative Lasso Cox regression analyses were used to establish a methylated lncRNA signature that significantly correlated with OS patient survival. The validity of this signature was verified with Kaplan-Meier curves and receiver operating characteristic (ROC) curves. We established a four-methylated lncRNA signature that can predict OS patient survival (verified in an independent cohort [GSE39055]). Kaplan-Meier analysis showed that the signature can distinguish between the survival of high- and low-risk patients. ROC analysis corroborated this finding and revealed that the signature had higher prediction accuracy than known biomarkers. Kaplan-Meier analysis of the clinical subgroup showed that the signature's prognostic ability was independent of clinicopathological factors. The four-methylated lncRNA signature is an independent prognostic biomarker of OS.

Deng Yajun, Yuan Wenhua, Ren Enhui, Wu Zuolong, Zhang Guangzhi, Xie Qiqi

2020-Oct-15

Biomarker, DNA methylation, LncRNA, Osteosarcoma, Prognosis
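
The Kaplan-Meier curves mentioned in the abstract above are built from the product-limit estimator. The sketch below shows that estimator on made-up follow-up data. It is not the authors' pipeline, and the function name and toy values are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_probability), ...].

    times:  follow-up time per patient
    events: 1 if the event (death) was observed, 0 if censored
    """
    survival = 1.0
    curve = []
    # Walk through the distinct times at which at least one event occurred.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy cohort: five patients, two of them censored.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t = {t}: S(t) = {s:.2f}")
```

Comparing a high-risk and a low-risk group then amounts to running the estimator once per group and plotting the two step functions.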

General

Explainable Automated Fact-Checking for Public Health Claims

ArXiv Preprint

Fact-checking is the task of verifying the veracity of claims by assessing their assertions against credible evidence. The vast majority of fact-checking studies focus exclusively on political claims. Very little research explores fact-checking for other topics, specifically subject matters for which expertise is required. We present the first study of explainable fact-checking for claims which require specific expertise. For our case study we choose the setting of public health. To support this case study we construct a new dataset, PUBHEALTH, of 11.8K claims accompanied by journalist-crafted, gold-standard explanations (i.e., judgments) to support the fact-check labels for claims. We explore two tasks: veracity prediction and explanation generation. We also define and evaluate, with humans and computationally, three coherence properties of explanation quality. Our results indicate that, by training on in-domain data, gains can be made in explainable, automated fact-checking for claims which require specific expertise.

Neema Kotonya, Francesca Toni

2020-10-19

General

Atypical myelinogenesis and reduced axon caliber in the Scn1a variant model of Dravet syndrome: An electron microscopy pilot study of the developing and mature mouse corpus callosum.

In Brain research

Dravet Syndrome (DS) is a genetic neurodevelopmental disease. Recurrent severe seizures begin in infancy and co-morbidities follow, including developmental delay and cognitive and behavioral dysfunction. A majority of DS patients have an SCN1A heterozygous gene mutation. This mutation causes a loss-of-function in inhibitory neurons, initiating seizure onset. We have investigated whether the sodium channelopathy may result in structural changes in the DS model independent of seizures. Morphometric analyses of axons within the corpus callosum were completed at P16 and P50 in Scn1a heterozygote KO male mice and their age-matched wild-type littermates. Trainable machine learning algorithms were used to examine electron microscopy images of ∼400 myelinated axons per animal, per genotype, including myelinated axon cross-section area, frequency distribution and g-ratios. Pilot data for Scn1a heterozygote KO mice demonstrate that the average axon caliber was reduced in developing and adult mice. Qualitative analysis also shows micro-features marking altered myelination at P16 in the DS model, with myelin out-folding and myelin debris within phagocytic cells. The data indicate that, in the absence of behavioral seizures, the factors that governed a shift toward small-caliber axons at P16 persist in the adult Scn1a heterozygote KO corpus callosum. The pilot study provides a basis for future meta-analysis that will enable robust estimates of the effects of the sodium channelopathy on axon architecture. We propose that early therapeutic strategies in DS could help minimize the effect of sodium channelopathies, beyond the impact of overt seizures, and therefore achieve better long-term treatment outcomes.

Richards Kay, Jancovski Nikola, Hanssen Eric, Connelly Alan, Petrou Steve

2020-Oct-15

General

Efficient Estimation and Evaluation of Prediction Rules in Semi-Supervised Settings under Stratified Sampling

ArXiv Preprint

In many contemporary applications, large amounts of unlabeled data are readily available while labeled examples are limited. There has been substantial interest in semi-supervised learning (SSL), which aims to leverage unlabeled data to improve estimation or prediction. However, current SSL literature focuses primarily on settings where labeled data is selected randomly from the population of interest. Non-random sampling, while posing additional analytical challenges, is highly applicable to many real-world problems. Moreover, no SSL methods currently exist for estimating the prediction performance of a fitted model under non-random sampling. In this paper, we propose a two-step SSL procedure for evaluating a prediction rule derived from a working binary regression model based on the Brier score and overall misclassification rate under stratified sampling. In step I, we impute the missing labels via weighted regression with nonlinear basis functions to account for nonrandom sampling and to improve efficiency. In step II, we augment the initial imputations to ensure the consistency of the resulting estimators regardless of the specification of the prediction model or the imputation model. The final estimator is then obtained with the augmented imputations. We provide asymptotic theory and numerical studies illustrating that our proposals outperform their supervised counterparts in terms of efficiency gain. Our methods are motivated by electronic health records (EHR) research and validated with a real data analysis of an EHR-based study of diabetic neuropathy.

Jessica Gronsbell, Molei Liu, Lu Tian, Tianxi Cai

2020-10-19
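
The two accuracy measures named in the abstract above, the Brier score and the overall misclassification rate, can be written down directly for a fully labeled sample. The sketch below shows only these plain supervised definitions. It is not the two-step semi-supervised estimator the authors propose, and the function names, the 0.5 threshold, and the toy values are assumptions for illustration.

```python
def brier_score(y_true, p_hat):
    """Mean squared difference between predicted probability and 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


def misclassification_rate(y_true, p_hat, threshold=0.5):
    """Share of cases where the thresholded prediction disagrees with the label."""
    errors = sum(1 for y, p in zip(y_true, p_hat)
                 if (1 if p >= threshold else 0) != y)
    return errors / len(y_true)


# Hypothetical labels and predicted probabilities from a binary regression model.
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]
print("Brier score:", brier_score(labels, probs))
print("Misclassification rate:", misclassification_rate(labels, probs))
```

Under stratified sampling, these simple averages would need to be reweighted or otherwise adjusted, which is the gap the paper's procedure is designed to address.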