Public Health

Integrative analysis of long extracellular RNAs reveals a detection panel of noncoding RNAs for liver cancer.

In Theranostics

Rationale: Long extracellular RNAs (exRNAs) in plasma can be profiled by new sequencing technologies, even at low abundance. However, cancer-related exRNAs and their variations remain understudied.
Methods: We investigated different variations (i.e., differential expression, alternative splicing, alternative polyadenylation, and differential editing) in diverse long exRNA species (e.g., long noncoding RNAs and circular RNAs) using 79 plasma exosomal RNA-seq (exoRNA-seq) datasets of multiple cancer types. We then integrated 53 exoRNA-seq datasets and 65 self-profiled cell-free RNA-seq (cfRNA-seq) datasets to identify recurrent variations in liver cancer patients. We further combined TCGA tissue RNA-seq datasets and validated biomarker candidates by RT-qPCR in an independent cohort of more than 100 plasma samples. Finally, we used machine learning models to identify a signature of 3 noncoding RNAs for the detection of liver cancer.
Results: We found that the different types of RNA variations identified from exoRNA-seq data were enriched in pathways related to tumorigenesis and metastasis, immunity, and metabolism, suggesting that cancer signals can be detected from long exRNAs. Subsequently, we identified more than 100 recurrent variations in plasma from liver cancer patients by integrating exoRNA-seq and cfRNA-seq datasets. From these datasets, 5 significantly up-regulated long exRNAs were confirmed by TCGA data and validated by RT-qPCR in an independent cohort. When machine learning models combined two of these validated circular and structured RNAs (SNORD3B-1, circ-0080695) with a miRNA (miR-122) as a panel to distinguish liver cancer patients from healthy donors, the average cross-validation AUROC was 89.4%. The selected 3-RNA panel successfully detected 79.2% of AFP-negative samples and 77.1% of early-stage liver cancer samples in the testing and validation sets.
Conclusions: Our study revealed that different types of cancer-related RNA variations can be detected in plasma and identified a 3-RNA detection panel for liver cancer, especially for AFP-negative and early-stage patients.
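
For readers unfamiliar with the metric, the following is a minimal sketch of how an average cross-validation AUROC such as the one reported above could be computed from per-fold classifier scores. The function names, the fold layout, and the toy scores are assumptions made for illustration; the abstract does not describe the authors' actual models or pipeline.

```python
def auroc(labels, scores):
    """Rank-based AUROC: the probability that a randomly chosen positive
    sample scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_cv_auroc(folds):
    """Average the AUROC over held-out folds given as (labels, scores) pairs."""
    return sum(auroc(labels, scores) for labels, scores in folds) / len(folds)


# Toy held-out predictions (made-up numbers): 1 = patient, 0 = healthy donor.
folds = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),  # 3 of 4 pairs ranked correctly
    ([1, 0, 1, 0], [0.8, 0.2, 0.7, 0.3]),  # all 4 pairs ranked correctly
]
print(mean_cv_auroc(folds))  # 0.875
```

In a real analysis the scores would come from a classifier trained on the other folds, with the panel's RNA measurements as its inputs.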

Zhu Yumin, Wang Siqi, Xi Xiaochen, Zhang Minfeng, Liu Xiaofan, Tang Weina, Cai Peng, Xing Shaozhen, Bao Pengfei, Jin Yunfan, Zhao Weihao, Chen Yinghui, Zhao Huanan, Jia Xiaodong, Lu Shanshan, Lu Yinying, Chen Lei, Yin Jianhua, Lu Zhi John


RNA biomarker, cancer, circular RNA, extracellular RNA, liquid biopsy, noncoding RNA

Oncology

Research progress of radiation-induced hypothyroidism in head and neck cancer.

In Journal of Cancer

This paper reviews the factors associated with hypothyroidism after radiotherapy in patients with head and neck cancer, with the aim of helping to prevent radiation-induced hypothyroidism and reduce its incidence. Hypothyroidism is a common complication after radiotherapy in these patients: the higher the radiation dose to the thyroid and pituitary gland, the higher the incidence of hypothyroidism, and the incidence gradually increases with longer follow-up. Intensity-modulated radiotherapy should limit the dose to the thyroid, which would reduce the incidence of hypothyroidism. In addition, the risk factors for hypothyroidism include small thyroid volume, female sex, and previous neck surgery. The incidence of radiation-induced hypothyroidism in head and neck cancer is related to the radiation dose, radiotherapy technique, thyroid volume, sex, and age. A prospective, large-sample study with long-term follow-up should be carried out to establish a model of the normal tissue complications likely to be related to radiation-induced hypothyroidism.

Zhou Ling, Chen Jia, Tao Chang-Juan, Chen Ming, Yu Zhong-Hua, Chen Yuan-Yuan


Head and Neck Cancer, Hypothyroidism, Radiotherapy

General

Predicting Rice Heading Date Using an Integrated Approach Combining a Machine Learning Method and a Crop Growth Model.

In Frontiers in genetics ; h5-index 62.0

Accurate prediction of heading date under various environmental conditions is expected to facilitate decision-making in cultivation management and the breeding of new cultivars adapted to the environment. Days to heading (DTH) is a complex trait known to be controlled by multiple genes and genotype-by-environment interactions. Crop growth models (CGMs) have been widely used to predict the phenological development of a plant in a given environment; however, they usually require substantial experimental data to calibrate the model parameters. The parameters are mostly genotype-specific and are thus usually estimated separately for each cultivar. We propose an integrated approach that uses a machine learning model to link genotype marker data with the genotype-specific developmental parameters of a CGM, allowing heading date prediction for a new genotype in a new environment. To estimate the parameters, we implemented a Bayesian approach with an advanced Markov chain Monte Carlo algorithm, the differential evolution adaptive Metropolis, and conducted the estimation using a large amount of data on heading date and environmental variables. The data comprised sowing and heading dates of 112 cultivars/lines tested at 7 locations for 14 years and the corresponding environmental variables (day length and daily temperature). We compared the predictive accuracy for DTH among the proposed approach, a CGM, and a single machine learning model. The results showed that the extreme learning machine (one of the implemented machine learning models) was superior to the CGM for the prediction of a tested genotype in a tested location. The proposed approach outperformed the machine learning method in the prediction of an untested genotype in an untested location.
We also evaluated the potential of the proposed approach for predicting the distribution of DTH in 103 F2 segregating populations derived from crosses between a common parent, Koshihikari, and 103 cultivars/lines. The results showed a high correlation coefficient (ca. 0.8) between the observed and predicted 10th, 50th, and 90th percentiles of the DTH distribution. In this study, the integration of a machine learning model and a CGM was better able to predict the heading date of a new rice cultivar in an untested environment.
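
To make the idea of genotype-specific CGM parameters concrete, here is a minimal sketch of a phenology model driven by the two environmental inputs the abstract names (daily temperature and day length). The parameter set, the linear response functions, and all numbers are assumptions chosen for illustration and are not the model used in the paper. In the integrated approach described above, a machine learning model would predict parameters like these from marker data for a genotype that has never been grown.

```python
from dataclasses import dataclass


@dataclass
class GenotypeParams:
    """Hypothetical genotype-specific parameters (not taken from the paper)."""
    base_temp: float              # degC; no development below this temperature
    critical_daylength: float     # hours; longer days slow development
    daylength_sensitivity: float  # fractional slowdown per hour of excess day length
    thermal_requirement: float    # accumulated units needed to reach heading


def days_to_heading(params, daily_temp, daily_daylength):
    """Accumulate daily development from sowing and return the first day on which
    the requirement is met, or None if heading is not reached within the series."""
    progress = 0.0
    for day, (temp, daylength) in enumerate(zip(daily_temp, daily_daylength), start=1):
        thermal = max(0.0, temp - params.base_temp)
        excess = max(0.0, daylength - params.critical_daylength)
        photo_factor = max(0.0, 1.0 - params.daylength_sensitivity * excess)
        progress += thermal * photo_factor
        if progress >= params.thermal_requirement:
            return day
    return None


# Made-up cultivar and two made-up 30-day environments.
cultivar = GenotypeParams(base_temp=10.0, critical_daylength=13.0,
                          daylength_sensitivity=0.5, thermal_requirement=80.0)
print(days_to_heading(cultivar, [20.0] * 30, [13.0] * 30))  # 8
print(days_to_heading(cultivar, [20.0] * 30, [14.0] * 30))  # 16
```

The same cultivar heads later under longer days, which is the kind of genotype-by-environment behavior a CGM can express and a purely statistical model has to learn from data.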

Chen Tai-Shen, Aoike Toru, Yamasaki Masanori, Kajiya-Kanegae Hiromi, Iwata Hiroyoshi


Markov chain Monte-Carlo, bayesian inference, crop growth model, differential evolution adaptive metropolis, machine learning

General

Machine Learning Prediction of Crossbred Pig Feed Efficiency and Growth Rate From Single Nucleotide Polymorphisms.

In Frontiers in genetics ; h5-index 62.0

This research assessed the ability of a Support Vector Machine (SVM) regression model to predict pig crossbred (CB) performance from various sources of phenotypic and genotypic information, with the aim of improving crossbreeding performance at reduced genotyping cost. Data consisted of average daily gain (ADG) and residual feed intake (RFI) records and genotypes of 5,708 purebred (PB) boars and 5,007 CB pigs. Prediction models were fitted using individual PB genotypes and phenotypes (trn.1); genotypes of PB sires and the average of CB records per PB sire (trn.2); and individual CB genotypes and phenotypes (trn.3). The average of CB offspring records was the trait to be predicted from the PB sire's genotype using cross-validation. Single nucleotide polymorphisms (SNPs) were ranked based on their Spearman rank correlation with the trait. Subsets with an increasing number (from 50 to 2,000) of the most informative SNPs were used as predictor variables in SVM. Prediction performance was the median of the Spearman correlation (SC, interquartile range in brackets) between observed and predicted phenotypes in the testing set. The best predictive performances were obtained when sire phenotypic information was included in trn.1 (0.22 [0.03] for RFI with SVM and 250 SNPs, and 0.12 [0.05] for ADG with SVM and 500-1,000 SNPs) or when trn.3 was used (0.29 [0.16] with genomic best linear unbiased prediction (GBLUP) for RFI, and 0.15 [0.09] for ADG with just 50 SNPs). Animals from the last two generations were assigned to the testing set and the remaining animals to the training set. A PB individual's own phenotype and genotype improved the prediction of CB offspring performance of young animals for ADG but not for RFI. The highest SC was 0.34 [0.21] and 0.36 [0.22] for RFI and ADG, respectively, with SVM and 50 SNPs. Training on CB data gave an SC of 0.34 [0.19] with GBLUP and 0.28 [0.18] with SVM and 250 SNPs for RFI, and 0.34 [0.15] with SVM and 500 SNPs for ADG.
Results suggest that PB candidates could be evaluated for CB performance with SVM and low-density SNP chip panels after collecting their own RFI or ADG performances, or even earlier, after being genotyped, using a reference population of CB animals.
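
The SNP pre-selection step described above can be sketched as follows. This is an illustrative pure-Python version: the data layout (a dictionary from SNP name to allele counts per animal), the function names, and the choice to rank by the absolute value of the correlation are assumptions, since the abstract only states that SNPs were ranked by their Spearman rank correlation with the trait.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def top_snps(genotypes, phenotype, k):
    """Keep the k SNPs with the largest |Spearman correlation| with the trait."""
    ranked = sorted(genotypes,
                    key=lambda snp: abs(spearman(genotypes[snp], phenotype)),
                    reverse=True)
    return ranked[:k]


# Made-up allele counts (0/1/2) for five animals and a made-up trait.
genotypes = {
    "snp_a": [0, 1, 2, 2, 1],
    "snp_b": [1, 1, 1, 1, 1],  # monomorphic, carries no information
    "snp_c": [2, 0, 1, 0, 2],
}
phenotype = [1.0, 2.0, 3.5, 3.0, 2.2]
print(top_snps(genotypes, phenotype, 2))  # ['snp_a', 'snp_c']
```

To avoid an optimistic bias, a ranking like this should be computed on the training animals only, and the selected subset then passed to the prediction model.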

Tusell Llibertat, Bergsma Rob, Gilbert Hélène, Gianola Daniel, Piles Miriam


crossbred, genomic prediction, machine learning, pigs, single nucleotide polymorphism, support vector machine

General

Accuracy of a Smartphone-Based Object Detection Model, PlantVillage Nuru, in Identifying the Foliar Symptoms of the Viral Diseases of Cassava-CMD and CBSD.

In Frontiers in plant science

Nuru is a deep learning object detection model for diagnosing plant diseases and pests, developed as a public good by PlantVillage (Penn State University), FAO, IITA, CIMMYT, and others. It provides a simple, inexpensive, and robust means of conducting in-field diagnosis without requiring an internet connection. Diagnostic tools that do not require the internet are critical for rural settings, especially in Africa, where internet penetration is very low. An investigation was conducted in East Africa to evaluate the effectiveness of Nuru as a diagnostic tool by comparing the ability of Nuru, cassava experts (researchers trained on cassava pests and diseases), agricultural extension officers, and farmers to correctly identify symptoms of cassava mosaic disease (CMD), cassava brown streak disease (CBSD), and the damage caused by cassava green mites (CGM). The diagnostic capability of Nuru and of the assessed individuals was determined by inspecting cassava plants and by using the cassava symptom recognition assessment tool (CaSRAT) to score images of cassava leaves based on the symptoms present. Nuru diagnosed symptoms of cassava diseases with higher accuracy (65% in 2020) than the agricultural extension officers (40-58%) and farmers (18-31%). Nuru's accuracy in diagnosing cassava disease and pest symptoms in the field was enhanced significantly by increasing the number of leaves assessed to six leaves per plant (74-88%). Two weeks of practical use of Nuru provided a slight increase in the diagnostic skill of extension officers, suggesting that a longer period of field experience with Nuru might result in significant improvements.
Overall, these findings suggest that Nuru can be an effective tool for in-field diagnosis of cassava diseases and has the potential to be a quick and cost-effective means of disseminating knowledge from researchers to agricultural extension officers and farmers, particularly on the identification of disease symptoms and their management practices.

Mrisho Latifa M, Mbilinyi Neema A, Ndalahwa Mathias, Ramcharan Amanda M, Kehs Annalyse K, McCloskey Peter C, Murithi Harun, Hughes David P, Legg James P


Africa, Kenya, Tanzania, cassava brown streak disease, cassava mosaic disease, e-extension services, image recognition systems, mobile applications for agriculture

General

An integrated approach based on artificial intelligence and novel meta-heuristic algorithms to predict demand for dairy products: a case study.

In Network (Bristol, England)

This research addresses the prediction of dairy product demand (DPD). Because dairy products have a short shelf life, accurate information about their future demand is important. The main contribution of this research is an integrated framework based on statistical tests, time-series neural networks, and MLP, ANFIS, and SVR models improved with novel meta-heuristic algorithms, in order to obtain the best prediction of DPD in Iran. First, a series of economic and social indicators that appear to affect the demand for dairy products is identified. Then, the ineffective indicators are eliminated by using the Pearson correlation coefficient, and the statistically significant variables are determined. Because regression alone cannot predict this demand adequately, artificial intelligence tools including MLP, ANFIS, and SVR are implemented and improved with the help of novel meta-heuristic algorithms such as grey wolf optimization (GWO), invasive weed optimization (IWO), cultural algorithm (CA), and particle swarm optimization (PSO). The designed hybrid method is used to predict the DPD in Iran by using data from 2013 to 2017. The highly accurate results confirm that the proposed hybrid methods can improve the prediction of the demand for various products.
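
The indicator-screening step can be illustrated with a small sketch. It assumes a fixed cutoff on the absolute Pearson correlation as a stand-in for the significance test mentioned above; the cutoff value, the function names, the indicator names, and the data are all hypothetical and do not come from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def screen_indicators(indicators, demand, min_abs_r=0.5):
    """Keep indicators whose |r| with demand reaches the (assumed) cutoff.
    Returns a dict mapping each kept indicator name to its correlation."""
    kept = {}
    for name, series in indicators.items():
        r = pearson(series, demand)
        if abs(r) >= min_abs_r:
            kept[name] = r
    return kept


# Made-up yearly demand and candidate indicators, for illustration only.
demand = [10, 12, 14, 16, 18]
indicators = {
    "population": [1, 2, 3, 4, 5],  # rises with demand
    "price": [5, 4, 3, 2, 1],       # falls as demand rises
    "noise": [3, 1, 4, 1, 5],       # weakly related, screened out
}
kept = screen_indicators(indicators, demand)
print(sorted(kept))  # ['population', 'price']
```

The surviving indicators would then serve as inputs to the MLP, ANFIS, or SVR models, whose settings the meta-heuristic algorithms tune.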

Goli Alireza, Khademi-Zare Hasan, Tavakkoli-Moghaddam Reza, Sadeghieh Ahmad, Sasanian Mazyar, Malekalipour Kordestanizadeh Ramina


Artificial intelligence, demand prediction, novel meta-heuristic algorithm, regression, time series neural network