In Journal of chemical information and modeling
In this study, a framework for the prediction of thermophysical properties based on transfer learning from existing estimation models is explored. The predictive capabilities of conventional group-contribution methods and traditional machine-learning approaches depend heavily on the availability and uncertainty of experimental datasets. Through a pretraining scheme that leverages the knowledge established by other estimation methods, improved prediction models for thermophysical properties can be obtained after fine-tuning the networks with more accurate experimental data. As our experiments show for the case of critical properties of compounds, this pipeline not only improves the performance of the models on commonly found organic structures but can also help them generalize to less explored areas of chemical space where experimental data is scarce, such as inorganics and heavier organic compounds. Transfer learning from data generated by estimation models also allows graph-based deep learning models to create more flexible molecular features over a larger chemical space, which leads to improved predictive capabilities and can give insights into the relationship between molecular structures and thermophysical properties. The generated molecular features can distinguish differences in behavior between isomers without the need for additional parameters. The approach is also more robust to outliers in experimental datasets.
Hormazabal Rodrigo S, Kang Jeong Won, Park Kiho, Yang Dae Ryook
2022-Oct-31
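
The pretrain-then-fine-tune pipeline described in the abstract can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the data are synthetic, the model is a plain linear regressor rather than the graph-based networks used in the study, and the names `estimation_model`, `true_property`, `train`, and the chosen learning rates are hypothetical. It only shows the order of operations: fit on plentiful labels produced by an existing estimator, then continue training from those weights on a few more accurate measurements.

```python
import random

random.seed(0)  # fixed seed so the synthetic data are reproducible


def predict(w, x):
    """Linear model: bias w[0] plus a weighted sum of the descriptors."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def mse(w, data):
    """Mean squared error of weights w on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def train(w, data, lr, epochs):
    """Full-batch gradient descent on the MSE; returns a new weight list."""
    w = list(w)  # copy so the caller's starting weights are left untouched
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            grad[0] += 2.0 * err / n
            for j, xj in enumerate(x):
                grad[j + 1] += 2.0 * err * xj / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def true_property(x):
    """Hypothetical stand-in for an accurate experimental measurement."""
    return 1.0 + 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[2]


def estimation_model(x):
    """Hypothetical existing estimator: close to the truth but biased."""
    return 1.2 + 1.8 * x[0] - 0.9 * x[1] + 0.5 * x[2]


def sample(n):
    """Draw n synthetic descriptor vectors with three features in [0, 1)."""
    return [[random.random() for _ in range(3)] for _ in range(n)]


# Plentiful, cheap labels from the estimator vs. a few accurate measurements.
pretrain_data = [(x, estimation_model(x)) for x in sample(200)]
exp_data = [(x, true_property(x)) for x in sample(5)]

w_init = [0.0] * 4
# Step 1: pretrain on the estimator's predictions.
w_pre = train(w_init, pretrain_data, lr=0.1, epochs=500)
# Step 2: fine-tune on experimental data, starting from the pretrained weights.
w_ft = train(w_pre, exp_data, lr=0.05, epochs=100)

print("experimental MSE before fine-tuning:", mse(w_pre, exp_data))
print("experimental MSE after fine-tuning: ", mse(w_ft, exp_data))
```

The second call to `train` starts from `w_pre` instead of `w_init`, which is the whole point of the scheme: the small experimental set only has to correct the estimator's bias rather than learn the mapping from scratch. The smaller learning rate and shorter schedule in the fine-tuning step are a common choice, not something taken from the study.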