Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

LBCE-XGB: A XGBoost Model for Predicting Linear B-Cell Epitopes Based on BERT Embeddings.

In Interdisciplinary sciences, computational life sciences

Accurately detecting linear B-cell epitopes (BCEs) makes great sense in vaccine design, immunodiagnostic test, antibody production, disease prevention and treatment. Wet-lab experiments for determining linear BCEs are both expensive and laborious, which are not able to meet the recognition needs of modern massive protein sequence data. Instead, computational methods can efficiently identify linear BCEs with low cost. Although several computational methods are available, the performance is still not satisfactory. Thus, we propose a new method, LBCE-XGB, to forecast linear BCEs based on XGBoost algorithm. To represent the biological information concealed in peptide sequences, the embeddings of the residues were obtained from a pre-trained domain-specific BERT model. In addition, the other five types of attributes comprising amino acid composition, amino acid antigenicity scale were also extracted. The best feature combination was determined according to the cross-validation results. Against the models developed by other deep learning and machine learning algorithms, LBCE-XGB achieves the top performance with an AUROC of 0.845 for fivefold cross-validation. The results on the independent test set show that our model attains an AUROC of 0.838 which is substantially higher than other state-of-the-art methods. The outcomes indicate that the representations of BERT could be an effective feature in predicting linear BCEs and we believe that LBCE-XGB could be a useful medium for detecting linear B cell epitopes with high accuracy and low cost.

Liu Yufeng, Liu Yinbo, Wang Shuyu, Zhu Xiaolei

2023-Jan-16

BERT, Linear B cell epitope, Machine learning, Natural language processing, XGBoost

17 Jan 2023

LBCE-XGB: A XGBoost Model for Predicting Linear B-Cell Epitopes Based on BERT Embeddings.

Weekly Summary