In Computers in biology and medicine
Neurotoxins are a class of proteins that have a significant damaging effect on nerve tissue. Neurotoxins are classified into presynaptic neurotoxins and postsynaptic neurotoxins, and accurate identification of neurotoxins plays a key role in drug development. In this study, 90 presynaptic neurotoxins and 165 postsynaptic neurotoxins were classified. The features of the presynaptic and postsynaptic neurotoxin sequences were extracted using the AutoProp feature extraction method and feature selection was performed using the maximum relevance maximum distance (MRMD) program, Finally, only two features were retained to achieve 84.7% classification accuracy. Moreover, it was found that the two retained features were present in the conserved sites and motifs of presynaptic neurotoxins and could represent the critical structures of presynaptic neurotoxins. This method demonstrates that using a few key features to classify proteins can effectively identify critical protein structures.
Wan Hao, Liu Qing, Ju Ying
2022-Nov-30
Biological property, Few features, Machine learning, Neurotoxin, Reduced amino acid alphabet, Support vector machine