Respiratory sensitization is considered an important toxicological endpoint because of the severe risk it poses to human health. In recent decades, a large proportion of sensitization events have been caused by low molecular weight (< 1000) respiratory sensitizers. However, there is currently no widely accepted test method that can identify prospective low molecular weight respiratory sensitizers. Here, we modeled low molecular weight respiratory sensitizers and investigated their molecular basis using a high-quality data set containing 136 respiratory sensitizers and 518 nonsensitizers. We built a number of classification models using OCHEM tools and developed a consensus model from the ten best individual models. The consensus model showed good predictive ability, with balanced accuracies of 0.78 on fivefold cross-validation and 0.85 on external validation. Readers can use the model to predict the respiratory sensitization of organic compounds. We also evaluated the effect of several molecular properties on respiratory sensitization; the results indicated that these properties differ significantly between respiratory sensitizers and nonsensitizers. Furthermore, 14 privileged substructures responsible for respiratory sensitization were identified. We hope the models and findings will prove useful for environmental risk assessment.
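As an illustration of the two quantities named above, the following minimal sketch shows how a consensus prediction and a balanced accuracy could be computed. It assumes the consensus is a plain average of the individual models' predicted probabilities and that a 0.5 threshold separates the classes; the abstract states neither, and the function names and toy values are hypothetical rather than taken from the study or from OCHEM.

```python
from statistics import mean


def consensus_probability(model_probs):
    """Average the sensitizer probabilities predicted by individual models.

    Assumption: simple unweighted averaging; the source does not state
    the aggregation rule used for its consensus model.
    """
    return mean(model_probs)


def consensus_label(model_probs, threshold=0.5):
    """Return 1 (sensitizer) if the averaged probability reaches the threshold."""
    return 1 if consensus_probability(model_probs) >= threshold else 0


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = sensitizer)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = sum(1 for t in y_true if t == 0)
    sensitivity = tp / n_pos  # fraction of sensitizers correctly identified
    specificity = tn / n_neg  # fraction of nonsensitizers correctly identified
    return (sensitivity + specificity) / 2


# Toy data (hypothetical, not from the study).
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
print(balanced_accuracy(y_true, y_pred))
```

Balanced accuracy is a natural metric for a data set like this one, where nonsensitizers outnumber sensitizers roughly four to one: plain accuracy would reward a model that labels everything as a nonsensitizer, whereas balanced accuracy weights both classes equally.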

Cui Xueyan, Yang Rui, Li Siwen, Liu Juan, Wu Qiuyun, Li Xiao


Consensus model, Machine learning, Molecular property, Respiratory sensitizer, Structural alert