In Cell Biochemistry and Function
The rapid transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, weakened the global economy and severely strained health infrastructure. Molecular modeling of the novel coronavirus is promising and provides further evidence on practical therapeutic options. This article proposes a machine-learning framework for identifying potential COVID-19 transcriptomic signatures. The transcriptomic data comprise immune-related genes collected from multiple tissues (blood, nasal, and buccal) under accession number GSE183071. Extensive bioinformatics analysis was carried out to identify potential candidate markers, including differential expression analysis, protein interaction analysis, gene ontology, and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment studies. The overlap analysis found SERPING1, the gene that encodes the glycosylated plasma protein C1-INH, in all three datasets. An immuno-informatics study was then conducted on the C1-INH protein. 5DU3, the protein identifier of C1-INH, was retrieved to assess antigenicity, major histocompatibility complex (MHC) class I and II binding epitopes, allergenicity, toxicity, and immunogenicity. Peptides were then screened against the vaccine-design criteria based on these metrics. The drug-gene interaction study reported that Rhucin is strongly associated with SERPING1. HSIC-Lasso (Hilbert-Schmidt independence criterion-least absolute shrinkage and selection operator), a model-free biomarker selection technique, was employed to identify genes having a nonlinear relationship with the target class. Supervised machine learning models were trained on the selected gene subset and evaluated by leave-one-out cross-validation. Explainable artificial intelligence techniques were used to interpret the models.
Sekaran Karthik, Polachirakkal Varghese Rinku, Gnanasambandan R, Karthik G, Ramya I, George Priya Doss C
2022-Dec-14
C1-inhibitor, COVID-19, Shapley additive explanations, druggability assessment, explainable artificial intelligence, gene expression, immune-related genes, machine learning
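The leave-one-out cross-validation mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the nearest-centroid classifier stands in for whichever supervised models the authors used, the function names are hypothetical, and the six two-feature samples are made-up toy values, not expression data from GSE183071.

```python
from statistics import mean


def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean distance)."""
    centroids = {}
    for label in set(train_y):
        rows = [row for row, y in zip(train_X, train_y) if y == label]
        # Column-wise mean of the training rows belonging to this class.
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, fit on the rest, and report the fraction predicted correctly."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy, made-up data: rows are samples, columns are two hypothetical gene features.
X = [
    [5.1, 0.9], [4.8, 1.2], [5.3, 1.0],   # labeled "case"
    [1.0, 4.9], [1.3, 5.2], [0.8, 4.7],   # labeled "control"
]
y = ["case", "case", "case", "control", "control", "control"]

accuracy = leave_one_out_accuracy(X, y)
print(f"Leave-one-out accuracy: {accuracy:.2f}")
```

Leave-one-out is a common choice when the number of samples is small, since every sample serves as a test case exactly once while the model is fitted on all the others.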