In Bioinformatics Advances
MOTIVATION : Antimicrobial peptides (AMPs) are increasingly being used in the development of new therapeutic drugs in areas such as cancer therapy and hypertension. They are also seen as an alternative to antibiotics as bacterial resistance becomes more common. Wet-laboratory experimental identification, however, is both time-consuming and costly, so in silico models are now commonly used to screen new AMP candidates.
RESULTS : This paper proposes a novel approach for creating model inputs: pre-trained language models produce contextualized embeddings that represent the amino acids within each peptide sequence, and a convolutional neural network is then trained on these embeddings as the classifier. The results were validated on two datasets: one previously used in AMP prediction research, and a larger independent dataset created for this paper. Predictive accuracies of 93.33% and 88.26% were achieved, respectively, outperforming previous state-of-the-art classification models.
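The two-stage pipeline described in the results (per-residue embeddings, then a convolutional classifier) can be illustrated with a minimal sketch. It is not the code from the repository: the embedding step is a fixed lookup table standing in for a pre-trained language model, so it is not contextualized, and the weights are random and untrained. All names and sizes (`embed_peptide`, `cnn_score`, `EMBED_DIM`, `MAX_LEN`, `KERNEL`, `N_FILTERS`) are hypothetical and chosen only for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
EMBED_DIM = 8    # assumption: tiny vectors; real language-model embeddings are far larger
MAX_LEN = 30     # assumption: fixed input length, shorter peptides are zero-padded
KERNEL = 3       # assumption: convolution window of 3 residues
N_FILTERS = 4    # assumption: number of convolution filters


def embed_peptide(seq, seed=0):
    """Stand-in for a pre-trained language model: one fixed vector per residue.

    A real model would return contextualized embeddings, where the vector for
    a residue depends on its neighbours. This lookup table does not do that.
    """
    rng = random.Random(seed)
    table = {aa: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for aa in AMINO_ACIDS}
    rows = [table[aa] for aa in seq[:MAX_LEN]]
    rows += [[0.0] * EMBED_DIM] * (MAX_LEN - len(rows))  # zero-pad to MAX_LEN
    return rows


def init_params(seed=1):
    """Random, untrained weights; a trained classifier would learn these."""
    rng = random.Random(seed)
    filters = [
        [[rng.gauss(0, 0.1) for _ in range(EMBED_DIM)] for _ in range(KERNEL)]
        for _ in range(N_FILTERS)
    ]
    dense = [rng.gauss(0, 0.1) for _ in range(N_FILTERS)]
    return filters, dense


def cnn_score(x, filters, dense):
    """1D convolution + ReLU + global max pooling + sigmoid output."""
    pooled = []
    for f in filters:
        best = 0.0  # starting at 0 applies the ReLU floor before pooling
        for start in range(len(x) - KERNEL + 1):
            act = sum(
                f[k][d] * x[start + k][d]
                for k in range(KERNEL)
                for d in range(EMBED_DIM)
            )
            best = max(best, act)
        pooled.append(best)
    logit = sum(w * p for w, p in zip(dense, pooled))
    return 1.0 / (1.0 + math.exp(-logit))  # value in (0, 1)


filters, dense = init_params()
example = "GIGKFLHSAKKFGKAFVGEIMNS"  # example sequence, used only for illustration
score = cnn_score(embed_peptide(example), filters, dense)
```

Because the weights are untrained, `score` carries no predictive meaning here; the sketch only shows how a peptide sequence becomes a fixed-size matrix of embeddings and how a convolutional layer reduces that matrix to a single probability-like output.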
AVAILABILITY AND IMPLEMENTATION : All code is available at https://github.com/williamdee1/LMPred_AMP_Prediction.
SUPPLEMENTARY INFORMATION : Supplementary data are available at Bioinformatics Advances online.
Dee William
2022