Random sampling or geostatistical modelling? Choosing between design-based and model-based sampling strategies for soil (with discussion)

Authors :
Dick J. Brus
J.J. de Gruijter
Source :
Geoderma, 80(1/2), 1-59 (1997)
Publication Year :
1997
Publisher :
Elsevier BV, 1997.

Abstract

Classical sampling theory has been repeatedly identified with classical statistics, which assumes that data are identically and independently distributed. This explains the switch of many soil scientists from design-based sampling strategies, based on classical sampling theory, to the model-based approach, which is based on geostatistics. However, in design-based sampling, independence has a different meaning and is determined by the sampling design, whereas in the model-based approach it is determined by the postulated model for the process studied. Design-based strategies are therefore also valid in areas with autocorrelation. Design-based and model-based estimates of spatial means are compared in a simulation study on the basis of the design-based quality criteria. The simulated field consists of four homogeneous units that are realizations of models with different means, variances and variograms. Performance is compared for two sample sizes (140 and 1520) and two block sizes (8 × 6.4 km² and 1.6 × 1.6 km²). The two strategies are Stratified Simple Random Sampling combined with the Horvitz-Thompson estimator (STSI, t_HT), and Systematic Sampling combined with the block kriging predictor (SY, t_OK). Point estimates of spatial means by (SY, t_OK) were more accurate in all cases except the global mean (8 × 6.4 km² block) estimated from the small sample. In interval estimates, on the other hand, p-coverages were in general better with the design-based strategy, except when the number of sample points in the block was small. Factors that determine the effectiveness and efficiency of the two approaches are the type of request, the interest in objective estimates, the need for separate unique estimates of the estimation variance for all points or subregions, the interest in valid and accurate estimates of the estimation or prediction variance, the quality of the model, the autocorrelation between observation and prediction points, and the sample size. These factors will be assembled in a decision tree that can be helpful in choosing between the two approaches. Models can also be used in the design-based approach. They describe the population itself, whereas in the model-based approach they describe the data-generating processes. Errors in such models result in less accurate estimates, but the estimated accuracy is still valid.
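
To make the design-based strategy concrete, the following is a minimal sketch of stratified simple random sampling with the Horvitz-Thompson estimator of the mean, which for this design reduces to a stratum-size-weighted average of the stratum sample means. It is not the simulation from the paper: the population (four strata of 400 values with a smooth, autocorrelated trend), the sample size of 10 per stratum, and all names such as `stsi_mean_estimate` are assumptions chosen for illustration. The population is held fixed and only the sample is redrawn, so the sketch illustrates why autocorrelation in the field does not invalidate the design-based estimate or its variance estimate.

```python
import math
import random
import statistics


def stsi_mean_estimate(strata, n_per_stratum, rng):
    """Estimate a population mean from a stratified simple random sample.

    strata: list of lists, one list of values per stratum (hypothetical data).
    n_per_stratum: sample size per stratum (each at least 2).
    Returns (estimated mean, estimated sampling variance of that estimate).
    """
    total_size = sum(len(values) for values in strata)
    mean_hat = 0.0
    var_hat = 0.0
    for values, n_h in zip(strata, n_per_stratum):
        size_h = len(values)
        weight_h = size_h / total_size
        # Simple random sampling without replacement inside the stratum.
        sample = rng.sample(values, n_h)
        mean_hat += weight_h * statistics.fmean(sample)
        # Finite population correction for sampling without replacement.
        fpc = 1.0 - n_h / size_h
        var_hat += weight_h ** 2 * fpc * statistics.variance(sample) / n_h
    return mean_hat, var_hat


rng = random.Random(42)

# Assumed fixed population: four strata with different means and a smooth
# sinusoidal trend, so neighbouring values are strongly autocorrelated.
stratum_means = [10.0, 20.0, 30.0, 40.0]
strata = [
    [m + 3.0 * math.sin(i / 20.0) + rng.gauss(0.0, 0.5) for i in range(400)]
    for m in stratum_means
]
true_mean = statistics.fmean(v for values in strata for v in values)

# Repeat the sampling design many times over the same fixed population.
results = [stsi_mean_estimate(strata, [10, 10, 10, 10], rng) for _ in range(2000)]
estimates = [mean for mean, _ in results]

avg_estimate = statistics.fmean(estimates)               # close to true_mean
empirical_var = statistics.variance(estimates)           # spread over repeated samples
avg_var_hat = statistics.fmean(var for _, var in results)  # average estimated variance

print(f"true mean          : {true_mean:.3f}")
print(f"average estimate   : {avg_estimate:.3f}")
print(f"empirical variance : {empirical_var:.4f}")
print(f"mean est. variance : {avg_var_hat:.4f}")
```

Over repeated draws the average estimate should sit close to the true mean, and the average estimated variance should be close to the empirical variance of the estimates. Both properties follow from the randomization in the design rather than from any model of the field.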

Details

ISSN :
0016-7061
Volume :
80
Database :
OpenAIRE
Journal :
Geoderma
Accession number :
edsair.doi.dedup.....7017e0d2a7a77023cb87098689e1a599