In ACS Omega
Virus classification has been a persistent concern in virology and epidemiology for decades. Machine learning can address this problem by predicting the novel coronavirus from its sequence. We propose a machine learning-based novel coronavirus prediction technique, called COVID-Predictor, in which 1000 sequences of SARS-CoV-1, MERS-CoV, SARS-CoV-2, and other viruses are used to train a Naive Bayes classifier so that it can classify unknown sequences of these viruses. The model was validated with 10-fold cross-validation and compared against other machine learning techniques. COVID-Predictor outperformed them, achieving an average accuracy of 99.7% on an unseen validation set of viruses. The same pre-trained model underlies a web-based application to which sequences of unknown viruses can be uploaded to predict the novel coronavirus.
Sarkar Jnanendra Prasad, Saha Indrajit, Ghosh Nimisha, Maity Debasree, Plewczynski Dariusz
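
The general idea in the abstract, a Naive Bayes classifier trained on labeled sequences and checked with 10-fold cross-validation, can be illustrated with a minimal sketch. This is not the COVID-Predictor implementation: the abstract does not state how sequences are turned into features, so the overlapping 3-mer counts, the Laplace smoothing, the class labels `class_A` and `class_B`, and the randomly generated toy sequences below are all assumptions made purely for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a sequence (assumed feature choice)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


class KmerNaiveBayes:
    """Multinomial Naive Bayes over k-mer counts with Laplace smoothing."""

    def __init__(self, k=3, alpha=1.0):
        self.k = k
        self.alpha = alpha

    def fit(self, sequences, labels):
        self.class_counts = Counter(labels)      # sequences per class
        self.kmer_totals = defaultdict(Counter)  # k-mer counts per class
        self.vocab = set()                       # all k-mers seen in training
        for seq, label in zip(sequences, labels):
            counts = kmer_counts(seq, self.k)
            self.kmer_totals[label].update(counts)
            self.vocab.update(counts)
        return self

    def predict(self, seq):
        counts = kmer_counts(seq, self.k)
        n_train = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            total = sum(self.kmer_totals[label].values())
            # Log prior plus smoothed log likelihood of each observed k-mer.
            score = math.log(self.class_counts[label] / n_train)
            for kmer, count in counts.items():
                if kmer in self.vocab:  # ignore k-mers never seen in training
                    p = (self.kmer_totals[label][kmer] + self.alpha) / (
                        total + self.alpha * vocab_size
                    )
                    score += count * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_validate(sequences, labels, folds=10, k=3, seed=0):
    """Return overall accuracy across `folds` held-out folds."""
    n = len(sequences)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    correct = 0
    for f in range(folds):
        test_idx = set(order[f::folds])
        train_x = [sequences[i] for i in range(n) if i not in test_idx]
        train_y = [labels[i] for i in range(n) if i not in test_idx]
        model = KmerNaiveBayes(k).fit(train_x, train_y)
        correct += sum(model.predict(sequences[i]) == labels[i] for i in test_idx)
    return correct / n


def make_toy_data(n_per_class=20, length=60, seed=0):
    """Generate placeholder sequences: class_A is A/T-rich, class_B is G/C-rich."""
    rng = random.Random(seed)
    weights = {"class_A": [4, 1, 1, 4], "class_B": [1, 4, 4, 1]}
    sequences, labels = [], []
    for _ in range(n_per_class):
        for label, w in weights.items():
            sequences.append("".join(rng.choices("ACGT", weights=w, k=length)))
            labels.append(label)
    return sequences, labels


sequences, labels = make_toy_data()
accuracy = cross_validate(sequences, labels, folds=10)
print(f"10-fold accuracy on toy data: {accuracy:.3f}")
```

The toy accuracy says nothing about real viral sequences; the sketch only shows how per-class k-mer statistics, a log-probability decision rule, and held-out folds fit together.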