Influence of Sample Size, Model Selection, and Land Use on Prediction Accuracy of Soil Properties

Samira Safaee, Zamir Libohova, Eileen J. Kladivko, Andrew Brown, Edwin Winzeler, Quentin Read, Shams Rahmani, Kabindra Adhikari

Geoderma Regional (2024)

Abstract

Digital soil mapping (DSM) uses models that integrate field and laboratory data with environmental factors to predict soils and soil properties. The accuracy of predictions depends on the models used, the data collected, and the environmental factors. This study assesses the influence of sampling density and distribution, covariates, and modeling approach on the prediction accuracy of soil organic matter (SOM) and cation exchange capacity (CEC) at three sites in Indiana (ACRE, DPAC, and SEPAC) with different management intensities and sampling designs. Ordinary Kriging (OK) and three machine learning models, Cubist (CB), Random Forest (RF), and Regression Kriging (RK), were used. The coefficient of determination (R²), root mean square error (RMSE), mean square error (MSE), concordance coefficient (ρc), and bias were used for the accuracy assessment. The accuracy of the predictions was influenced by the site, sampling density, model type, soil property, and their interactions. Site was the single largest source of significant variation, followed by sampling density and model type, for both SOM and CEC. ACRE, with multiple fields and complex management practices, had a higher average RMSE and a wider range of RMSE for SOM than SEPAC and DPAC, which have uniform management. At ACRE, as the number of samples increased from 36 (6 points/ha) to 66 (12 points/ha), the RMSE decreased from 2.75 to 0.85 for SOM and from 17.38 to 3.61 for CEC, but it did not change with further increases up to 146 samples. At SEPAC and DPAC, the RMSE decreased only slightly at sampling densities above 5 points/ha and 1-2 points/ha, respectively (68 and 43 samples, respectively). Based on cross-validation, all models performed poorly for SOM, with R² varying from 0.13 to 0.38, while for CEC the model performance varied widely, with R² from 0.11 to 0.64. Prediction accuracy was higher for CEC than for SOM at all sites. Overall, RF performed best and OK performed worst for both SOM and CEC. The mean R² values across all sites were 0.35 (SOM) and 0.51 (CEC) for RF and 0.19 (SOM) and 0.17 (CEC) for OK. At ACRE, OK performed worse than the other models for both SOM and CEC, with only slight differences among the other models, while at SEPAC and DPAC there were only slight differences among all models. Spatial predictions from CB, RF, and RK were more detailed and conformed more closely to soil landscape models than those from OK. The spatial differences between sampling densities for predicted SOM and CEC were greater in lower elevation areas than in higher elevation areas. The results from this study demonstrate that the selection of a modeling approach is site-specific and depends on sampling density, soil properties, and their interactions.
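
The accuracy measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `accuracy_metrics` is hypothetical, R² is computed here as 1 - SSE/SST (the study may instead report the squared correlation), the concordance coefficient is assumed to be Lin's concordance correlation coefficient, and bias is assumed to be the mean of predicted minus observed values.

```python
import math


def accuracy_metrics(observed, predicted):
    """Return R2, RMSE, MSE, concordance, and bias for paired values.

    Illustrative only; the definitions below are assumptions, not taken
    from the paper. Observed values must not all be identical.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Prediction errors, defined as predicted minus observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    bias = sum(errors) / n

    # Population (divide-by-n) variances and covariance.
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted)) / n

    # R2 as 1 - SSE/SST; this can be negative for poor predictions.
    r2 = 1.0 - mse / var_obs

    # Lin's concordance correlation coefficient.
    ccc = 2.0 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2)

    return {"r2": r2, "rmse": rmse, "mse": mse, "ccc": ccc, "bias": bias}


if __name__ == "__main__":
    # Made-up numbers for demonstration, not data from the study.
    print(accuracy_metrics([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

In a cross-validation setting, such a function would typically be applied to the held-out observations and their predictions in each fold, with the results then averaged.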

Keywords

Digital soil mapping (DSM), Soil organic matter (SOM), Cation exchange capacity (CEC), Spatial distribution, Machine learning, Terrain attributes, Management, Complexity
