In Current Genomics
Background : A novel coronavirus emerged and spread rapidly worldwide, and the World Health Organization declared a pandemic on March 11, 2020. Coronaviruses have attracted much attention because they can cause a wide range of infectious diseases in humans, from mild to severe. Predicting the lethality of a human coronavirus is key to estimating viral toxicity and to informing treatment.
Methods : We developed an alignment-free framework that uses machine learning for ultra-fast and highly accurate prediction of the lethality of human-adapted coronaviruses from genomic sequences. We performed extensive experiments with six different feature transformation and machine learning algorithms, combined with digital signal processing, to infer the lethality of possible future novel coronaviruses from existing strains.
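As a rough illustration of what an alignment-free, signal-processing-based feature transformation can look like, the following minimal sketch maps a nucleotide sequence to a numeric signal and takes the magnitude spectrum of its discrete Fourier transform as a fixed-length feature vector. The purine/pyrimidine mapping, the function names `encode_sequence` and `spectrum_features`, and the `n_points` parameter are assumptions made for this example; they are not taken from the study, which does not specify its encodings here.

```python
import numpy as np

# Hypothetical numeric mapping (an assumption for illustration only):
# purines (A, G) -> -1, pyrimidines (C, T/U) -> +1, any other symbol -> 0.
PURINE_PYRIMIDINE = {"A": -1.0, "G": -1.0, "C": 1.0, "T": 1.0, "U": 1.0}


def encode_sequence(seq):
    """Turn a nucleotide string into a 1-D numeric signal."""
    return np.array(
        [PURINE_PYRIMIDINE.get(base, 0.0) for base in seq.upper()],
        dtype=float,
    )


def spectrum_features(seq, n_points=32768):
    """Return the DFT magnitude spectrum of the encoded sequence.

    The signal is zero-padded or truncated to n_points, so genomes of
    different lengths yield feature vectors of the same size without
    any sequence alignment.
    """
    signal = encode_sequence(seq)
    return np.abs(np.fft.rfft(signal, n=n_points))


if __name__ == "__main__":
    # Toy sequence, not real genomic data.
    features = spectrum_features("ACGTACGTTTGACC", n_points=32)
    print(features.shape)
```

Feature vectors produced this way have the same length for every genome, so they could be passed to any standard classifier; which classifiers and encodings work best is an empirical question that this sketch does not address.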
Results : Tested on SARS-CoV, MERS-CoV and SARS-CoV-2 datasets, the models achieve an average prediction accuracy of 96.7%. We also provide a preliminary analysis validating the effectiveness of our models on other human coronaviruses. The framework achieves high prediction performance while remaining alignment-free and relying on RNA sequences alone, without genome annotations or specialized biological knowledge.
Conclusion : The results demonstrate that, for any novel human coronavirus strain, this framework can offer a reliable real-time estimate of its lethality.
Yin Rui, Luo Zihan, Kwoh Chee Keong
Coronavirus, SARS-CoV, alignment-free, genomic nucleotide, lethality inference, machine learning