In Computers in Biology and Medicine

Cannabinoid receptors, members of the G protein-coupled receptor (GPCR) family, are involved in various physiological functions. Cannabinoid receptor subtype 2 (CB2), which is mainly distributed in the periphery, is a crucial therapeutic target for anti-epileptic, anti-inflammatory, and anti-fibrotic treatment and for the regulation of bone metabolism, and it regulates these physiological functions without psychiatric side effects. Machine learning methods for predicting biophysical properties have recently attracted much attention, and their successful application usually depends heavily on an appropriate representation of the compounds. In this study, we comprehensively evaluated the performance of descriptor-based models (including XGBoost, Random Forest, and KNN) and two graph-based models (D-MPNN and MolMap) for the prediction of CB2 regulators, and found that XGBoost offers outstanding performance for both regression and classification tasks. Thirteen molecular fingerprints and 12 descriptors, as well as their combinations, were then screened; AvalonFP + AtomPairFP + RDkitFP + MorganFP and AtomPairFP + MorganFP + AvalonFP were the optimum combinations for the regression task (R² increased to 0.667) and the classification task (AUC-ROC increased to 0.933), respectively. Specifically, the best XGBoost regression model with the optimum features outperforms Mizera's QSAR model on the dataset developed by Mizera (R² of 0.664 versus 0.62). XGBoost also achieves the best performance on the external validation set, with an AUC-ROC of 0.917; by comparison, MolMap and D-MPNN reach only 0.912 and 0.898, respectively. The Shapley additive explanation (SHAP) method was used to interpret the models, and feature importances are shown for both the regression and classification tasks. The XGBoost model equipped with the essential combination of molecular fingerprints presented in this paper may provide valuable clues for designing novel CB2 ligands and for developing models that predict other properties.
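The fingerprint-combination screening described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy bit vectors, and the placeholder score `varying_column_fraction` are all assumptions made for illustration. In a real workflow the score function would presumably train a model such as XGBoost on the concatenated features and return a cross-validated R² or AUC-ROC.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical layout: fingerprint name -> one bit vector per molecule.
Blocks = Dict[str, List[List[int]]]
Matrix = List[List[int]]


def concat_blocks(blocks: Blocks, names: Sequence[str]) -> Matrix:
    """Join the named fingerprint blocks column-wise, one row per molecule."""
    n_rows = len(blocks[names[0]])
    return [
        [bit for name in names for bit in blocks[name][i]]
        for i in range(n_rows)
    ]


def screen_combinations(
    blocks: Blocks,
    score_fn: Callable[[Matrix], float],
    max_size: int,
) -> List[Tuple[float, Tuple[str, ...]]]:
    """Score every combination of up to max_size blocks, best score first."""
    results = []
    for size in range(1, max_size + 1):
        for names in combinations(sorted(blocks), size):
            features = concat_blocks(blocks, names)
            results.append((score_fn(features), names))
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def varying_column_fraction(matrix: Matrix) -> float:
    """Placeholder score (an assumption, not a model metric):
    the share of columns that are not constant across molecules."""
    n_cols = len(matrix[0])
    varying = sum(
        1 for j in range(n_cols) if len({row[j] for row in matrix}) > 1
    )
    return varying / n_cols


# Made-up 2-bit fingerprints for two molecules, for demonstration only.
toy_blocks = {
    "AvalonFP": [[1, 0], [1, 0]],
    "MorganFP": [[0, 1], [1, 0]],
}

for score, names in screen_combinations(toy_blocks, varying_column_fraction, 2):
    print(" + ".join(names), round(score, 3))
```

Exhaustive enumeration grows quickly with the number of fingerprints, so `max_size` caps the combination size; whether the study used an exhaustive or a stepwise search is not stated in the abstract.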

Zhou Hao, Shan Mengyi, Qin Lu-Ping, Cheng Gang

2022-Nov-30

CB2, Fingerprints combination, In silico prediction, Machine learning, XGBoost