Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

In Frontiers in genetics ; h5-index 62.0

Genomic sequence mutations can be pathogenic in both germline and somatic cells. Several authors have observed that often the same genes are involved in cancer when mutated in somatic cells and in genetic diseases when mutated in the germline. Recent advances in high-throughput sequencing techniques have provided us with large databases of both types of mutations, allowing us to investigate this issue in a systematic way. Hence, we applied a machine learning based framework to this problem, comparing multiple models. The models achieved significant predictive power as shown by both cross-validation and their application to recently discovered gene/phenotype associations not used for training. We found that genes characterized by high frequency of somatic mutations in the most common cancers and ancient evolutionary age are most likely to be involved in abnormal phenotypes and diseases. These results suggest that the combination of tolerance for mutations at the cell viability level (measured by the frequency of somatic mutations in cancer) and functional relevance (demonstrated by evolutionary conservation) are the main predictors of disease genes. Our results thus confirm the deep relationship between pathogenic mutations in somatic and germline cells, provide new insight into the common origin of cancer and genetic diseases, and can be used to improve the identification of new disease genes.

Draetta Edoardo Luigi, Lazarević Dejan, Provero Paolo, Cittaro Davide

2022

cancer, gene prioritization, human disease, machine learning, mutations