In ACS Applied Materials & Interfaces; h5-index 147.0
The figure of merit (zT) is a key parameter for measuring the performance of thermoelectric materials. The prediction of zT values via machine learning has recently emerged as a promising method for exploring high-performance materials. However, machine learning-based predictions still suffer from unsatisfactory accuracy, which is related to the size of the data set, the hyperparameters of the models, and the quality of the data. In this work, 5038 data points on thermoelectric materials were selected, and several regression models were built to predict zT values. A light gradient boosting (LGB) model with 57 features, trained on this large data set, achieved excellent accuracy: a coefficient of determination (R²) of 0.959, a root mean squared error (RMSE) of 0.094, a mean absolute error (MAE) of 0.057, and a correlation coefficient (R) of 0.979. Owing to the large size of the data set, the prediction accuracy exceeds that of most reported machine learning-based zT predictions. The "ME Lattice Parameter" was verified as the most important feature in the zT prediction. Furthermore, nine potential candidates were screened out from among one million data points. This study solves the problem of data set size, adjusts the hyperparameters of the models, uses feature engineering to improve data quality, and provides an efficient strategy for wide-ranging screening of promising materials.
Li Yi, Zhang Jingzi, Zhang Ke, Zhao Mengkun, Hu Kailong, Lin Xi
2022-Dec-06
data-driven, machine learning, the figure of merit, thermoelectric materials, zT prediction
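
The abstract reports four accuracy metrics for the zT regression: R², RMSE, MAE, and the correlation coefficient R. As a reading aid, the following is a minimal sketch of how these quantities are commonly defined. It is not the authors' code; the function name `regression_metrics` and the sample zT values are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and Pearson R for two equal-length sequences.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n

    # Residual and total sums of squares (total is taken around the mean of y_true).
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    # Pieces needed for the Pearson correlation coefficient.
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    if ss_tot == 0 or ss_pred == 0:
        raise ValueError("R^2 and R are undefined for constant inputs")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "r": cov / math.sqrt(ss_tot * ss_pred),
    }


# Made-up zT values, used only to exercise the function.
measured = [0.2, 0.5, 0.9, 1.4]
predicted = [0.25, 0.45, 1.0, 1.3]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

Note that R² and R measure different things: R² penalizes any systematic offset between predicted and measured values, whereas R only captures how linearly the two are related, so R² never exceeds R² of a perfect linear refit and the two are reported separately in the abstract.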