In Bioinformatics (Oxford, England)
MOTIVATION: Anticancer peptides (ACPs) have recently emerged as effective anticancer drugs in cancer therapy. Machine-learning-based predictors have been developed to identify ACPs and achieve satisfactory performance. However, existing methods rely on experience-based feature engineering, which restricts the representation ability of the models and adapts poorly to different data. This limits further improvement in predictive performance and weakens the robustness of the models. To alleviate these problems, we propose a novel deep-learning-based predictor named ACPred-LAF, built on a multi-sense and multi-scaled embedding algorithm that automatically learns and extracts contextual sequence characteristics of ACPs.
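As a rough illustration of what "multi-scaled" could mean for a peptide sequence, the minimal sketch below splits a sequence into overlapping k-mers at several lengths. This is an assumption made for illustration only: the abstract does not specify how ACPred-LAF defines its scales or senses, and the function names `kmer_tokens` and `multi_scale_tokens` and the default scales `(1, 2, 3)` are hypothetical, not taken from the ACPred-LAF code.

```python
from typing import Dict, List, Sequence


def kmer_tokens(sequence: str, k: int) -> List[str]:
    """Split a peptide into overlapping k-mers (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def multi_scale_tokens(
    sequence: str, scales: Sequence[int] = (1, 2, 3)
) -> Dict[int, List[str]]:
    """Tokenize one peptide at several k-mer lengths.

    Assumption: each "scale" is one k-mer length. In a learnable-embedding
    setup, each scale's tokens would be mapped to trainable vectors; that
    step is not shown here.
    """
    return {k: kmer_tokens(sequence, k) for k in scales}


if __name__ == "__main__":
    # Example peptide string chosen only for demonstration.
    for scale, tokens in multi_scale_tokens("FLPKLL").items():
        print(scale, tokens)
```

In a learned-embedding model, each token at each scale would index into a trainable embedding table, so the representation is fitted to the data rather than fixed by hand-crafted descriptors.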
RESULTS: Through a comparative feature analysis, we demonstrate that our learnable and self-adaptive embedding features capture discriminative information better than hand-crafted features, which improves ACP prediction performance. In addition, benchmarking comparisons show that ACPred-LAF outperforms the state-of-the-art methods on both existing benchmark datasets and our newly constructed dataset. We further validate the robustness of the model via a data interference experiment. To avoid potential evaluation bias, we construct a new ACP benchmark dataset named ACP-Mixed by integrating existing datasets. We expect this dataset to serve as a gold-standard benchmark in this field. To facilitate the use of our model, we develop a web server that implements ACPred-LAF.
AVAILABILITY: ACPred-LAF and the newly constructed benchmark dataset ACP-Mixed are open-source collaborative initiatives available in the GitHub repository (https://github.com/TearsWaiting/ACPred-LAF). A web server implementing ACPred-LAF can be accessed at http://server.malab.cn/ACPred-LAF.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
He Wenjia, Wang Yu, Cui Lizhen, Su Ran, Wei Leyi