In Frontiers in Bioengineering and Biotechnology
The broad spectrum of clinical manifestations in patients with intellectual disability (ID), the heterogeneity of ID genetic variation, and the diversity of phenotypic variation represent major challenges for ID diagnosis. Using a manually curated, systematically phenotyped cohort of 3803 patients with ID, we identified 704 pathogenic genes, 3848 pathogenic sites, and 2075 standard phenotypes to characterize the underlying molecular perturbations and their phenotypic impact. We found a positive correlation between the number of phenotypes and the number of patients, which revealed their extreme heterogeneity, as well as the relative contribution of multiple determinants to the heterogeneity of ID phenotypes. Nevertheless, despite the extreme heterogeneity in phenotypes, ID genes showed a specific bias in mutation types, and the top 44 genes ranked by number of patients accounted for 39.9% of all patients. More interestingly, enriched co-occurrent phenotypes and co-occurrent phenotype networks for each gene showed potential for prioritizing ID genes and further demonstrated the convergence of ID phenotypes. We then established a machine learning predictor, IDpred, for predicting ID pathogenic genes. In 10-fold cross-validation, IDpred achieved an AUC of 0.978, demonstrating the robustness and reliability of the tool. In addition, we built IDminer (http://126.96.36.199:3100/IDminer/), the most comprehensive database of a phenotyped ID cohort to date, which includes the curated ID data and the integrated IDpred tool for both clinical and experimental researchers. IDminer serves as an important resource with a user-friendly interface to help researchers investigate ID data, and it provides important implications for the diagnosis and pathogenesis of developmental disorders of cognition.
Wang Yan, Zhu Li-Na, Ma Xiu-Wei, Yang Fang, Xu Xi-Lin, Yang Yao, Yang Xiao, Peng Wei, Zhang Wan-Qiao, Liang Jin-Yu, Zhu Wei-Dong, Jiang Tai-Jiao, Zhang Xin-Lei, Feng Zhi-Chun
co-occurrent phenotype, gene-focused networks, intellectual disability, machine learning, pathogenic genes prediction, phenotypic convergence
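
The abstract reports an AUC from 10-fold cross-validation but does not describe how it was computed. The following is a minimal sketch of one common way to obtain such a figure: scores are collected for every held-out sample across the folds, then a single rank-based AUC is computed. The function names (`auc`, `k_fold_indices`, `cross_validated_auc`), the toy midpoint classifier, and the toy data are all hypothetical illustrations; they are not the IDpred model, its features, or its evaluation code.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive sample has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auc(X, y, fit, score, k=10, seed=0):
    """Train on k-1 folds, score the held-out fold, and compute one AUC
    over the pooled out-of-fold scores."""
    out_of_fold = [None] * len(y)
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            out_of_fold[i] = score(model, X[i])
    return auc(y, out_of_fold)


# Hypothetical stand-in classifier on a single numeric feature:
# the score is the signed distance from the midpoint of the class means.
def fit_midpoint(xs, ys):
    pos = [x for x, label in zip(xs, ys) if label == 1]
    neg = [x for x, label in zip(xs, ys) if label == 0]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    sign = 1.0 if mean_pos >= mean_neg else -1.0
    return (mean_pos + mean_neg) / 2.0, sign


def score_midpoint(model, x):
    midpoint, sign = model
    return sign * (x - midpoint)


# Toy, well-separated data (not from the paper).
toy_X = [float(v) for v in range(20)] + [float(v) for v in range(100, 120)]
toy_y = [0] * 20 + [1] * 20
toy_auc = cross_validated_auc(toy_X, toy_y, fit_midpoint, score_midpoint, k=10)
```

Pooling out-of-fold scores before computing the AUC is one design choice; averaging per-fold AUCs is another, and the abstract does not say which was used.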