In Ying yong sheng tai xue bao = The journal of applied ecology
We explored the application of different feature mining methods combined with generalized boosted regression models in digital soil mapping. Environmental covariates were selected by two feature selection methods, namely recursive feature elimination and selection by filtering. Using either the original environmental covariates or the selected optimal variable combination as independent variables, soil pH prediction models for Anhui Province were established and mapped with the generalized boosted regression model and the random forest model. The results showed that both feature mining methods effectively improved the accuracy of soil pH prediction for the generalized boosted regression model and the random forest model while reducing dimensionality. On the validation set, the prediction accuracy of the generalized boosted regression model was slightly lower than that of the random forest model. On the training set, the accuracy of the generalized boosted regression model was much higher than that of the random forest model, with greater explanatory power and a better overall effect. The main parameters of the random forest model, ntree and mtry, had limited effect on model performance. In contrast, different parameters and their combinations affected the prediction accuracy of the generalized boosted regression model and should therefore be tuned before modeling. Spatial mapping showed that soil pH in Anhui Province follows a pattern of "acidic in the south, alkaline in the north".
Wang Shi-Hang, Lu Hong-Liang, Zhao Ming-Song, Zhou Ling-Mei
Anhui Province, feature mining, generalized boosted regression models, machine learning, random forest, soil pH
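The workflow summarized in the abstract (filter-based covariate selection, then a boosted regression model whose parameters are tuned before modeling) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the data are synthetic, the correlation threshold is arbitrary, the boosting uses simple regression stumps rather than the authors' software, and the parameter names `n_trees` and `shrinkage` are hypothetical stand-ins for whatever was tuned in the study.

```python
import math
import random

random.seed(0)

def avg(values):
    return sum(values) / len(values)

def rmse(y_true, y_pred):
    return math.sqrt(avg([(a - b) ** 2 for a, b in zip(y_true, y_pred)]))

def pearson(x, y):
    mx, my = avg(x), avg(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def filter_select(X, y, threshold=0.3):
    """Selection by filtering: keep columns with |Pearson r| >= threshold."""
    columns = list(zip(*X))
    return [j for j, col in enumerate(columns) if abs(pearson(col, y)) >= threshold]

def take(X, cols):
    return [[row[j] for j in cols] for row in X]

def fit_stump(X, resid):
    """Find the single split that minimizes squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, resid) if row[j] <= t]
            right = [r for row, r in zip(X, resid) if row[j] > t]
            ml, mr = avg(left), avg(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]

def fit_gbm(X, y, n_trees, shrinkage):
    """Gradient boosting for squared loss: each stump fits current residuals."""
    base = avg(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        j, t, ml, mr = fit_stump(X, resid)
        stumps.append((j, t, ml, mr))
        pred = [p + shrinkage * (ml if row[j] <= t else mr) for row, p in zip(X, pred)]
    return base, shrinkage, stumps

def predict(model, X):
    base, shrinkage, stumps = model
    return [base + shrinkage * sum(ml if row[j] <= t else mr for j, t, ml, mr in stumps)
            for row in X]

# Synthetic stand-in data (not the study's data): six covariates,
# of which only the first two influence the simulated pH.
def make_row():
    x = [random.uniform(-1, 1) for _ in range(6)]
    return x, 6.5 + 1.0 * x[0] - 0.8 * x[1] + random.gauss(0, 0.2)

rows = [make_row() for _ in range(120)]
X_train, y_train = [x for x, _ in rows[:80]], [y for _, y in rows[:80]]
X_val, y_val = [x for x, _ in rows[80:]], [y for _, y in rows[80:]]

# Select covariates on the training set only, then tune on the validation set.
selected = filter_select(X_train, y_train)
Xs_train, Xs_val = take(X_train, selected), take(X_val, selected)

results = {}
for n_trees in (20, 60):
    for shrinkage in (0.05, 0.3):
        model = fit_gbm(Xs_train, y_train, n_trees, shrinkage)
        results[(n_trees, shrinkage)] = rmse(y_val, predict(model, Xs_val))

best_params = min(results, key=results.get)
best_model = fit_gbm(Xs_train, y_train, *best_params)
train_rmse = rmse(y_train, predict(best_model, Xs_train))
baseline_rmse = rmse(y_train, [avg(y_train)] * len(y_train))

print("selected covariates:", selected)
for params, score in sorted(results.items()):
    print("n_trees=%d shrinkage=%.2f validation RMSE=%.3f" % (params + (score,)))
print("best parameters:", best_params)
```

Covariates are selected using the training split alone, so the validation RMSE is not biased by the selection step. The small grid only illustrates why parameter combinations should be compared before modeling; it is not a recommendation of specific values.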