Abstract
When measurements that fall below the limit of detection are reported as not detected, the data are referred to as censored. Non-recording of values below the limit of detection is common in soil science research, although modelling data affected by censoring can be problematic. This paper develops and tests a modified version of Spatial Simulated Annealing, called Simulated Annealing by Variogram and Histogram form (SAVH), for drawing values for censored points given a mixed set of observed and censored data. The algorithm aims to maximise the goodness of fit between the experimental and theoretical variograms (by allowing the parameters of the theoretical variogram to vary) while the imputed values are constrained to a target histogram form. In practice, the experimental histogram is estimated by transforming the available data (interval and exact observations) to quantiles and fitting a plausible distribution. The theoretical distribution of the data is used to constrain the variogram fitting. The proposed simulated annealing method is designed to find the optimal spatial arrangement of values, defined as the one giving the lowest errors in variogram fitting, histogram fitting and kriging prediction. The accuracy of the proposed method is assessed on a simulated data set in which the censored point values are known, and is compared with that of the Spatial Simulated Annealing algorithm. According to the results obtained, the SAVH approach can be recommended as a useful tool for the analysis of spatially distributed data with censoring.
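As a rough illustration of the idea described above, the following Python sketch imputes values below a detection limit by simulated annealing so that the experimental variogram of the completed data approaches a theoretical one. It is not the authors' SAVH implementation. The function names, the exponential variogram model, the fixed model parameters (the paper lets them vary), the uniform proposal on [0, LOD] standing in for the target histogram, and the geometric cooling schedule are all assumptions made for brevity.

```python
import numpy as np

def experimental_variogram(coords, values, bin_edges):
    """Method-of-moments semivariance averaged within each lag bin."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)  # each pair counted once
    d, sq = d[iu], sq[iu]
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for k in range(len(bin_edges) - 1):
        mask = (d >= bin_edges[k]) & (d < bin_edges[k + 1])
        if mask.any():
            gamma[k] = sq[mask].mean()
    return gamma

def exponential_model(h, nugget, psill, rang):
    """Assumed theoretical variogram (exponential form)."""
    return nugget + psill * (1.0 - np.exp(-h / rang))

def anneal_censored(coords, values, censored, lod, model_params, bin_edges,
                    n_iter=2000, t0=1.0, cooling=0.995, seed=0):
    """Impute censored entries in [0, lod) by minimising variogram misfit."""
    rng = np.random.default_rng(seed)
    mids = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    target = exponential_model(mids, *model_params)

    def cost(v):
        g = experimental_variogram(coords, v, bin_edges)
        ok = ~np.isnan(g)  # ignore empty lag bins
        return float(np.mean((g[ok] - target[ok]) ** 2))

    current = values.astype(float).copy()
    idx = np.flatnonzero(censored)
    if idx.size == 0:  # nothing to impute
        return current, cost(current)
    # Hypothetical stand-in for the target histogram: uniform below the LOD.
    current[idx] = rng.uniform(0.0, lod, size=idx.size)
    cur_cost = cost(current)
    best, best_cost = current.copy(), cur_cost
    temp = t0
    for _ in range(n_iter):
        proposal = current.copy()
        j = rng.choice(idx)  # perturb one censored point at a time
        proposal[j] = rng.uniform(0.0, lod)
        new_cost = cost(proposal)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if new_cost <= cur_cost or rng.random() < np.exp((cur_cost - new_cost) / temp):
            current, cur_cost = proposal, new_cost
            if cur_cost < best_cost:
                best, best_cost = current.copy(), cur_cost
        temp *= cooling  # geometric cooling (an assumption)
    return best, best_cost

# Synthetic demonstration data (invented for this sketch, not from the paper).
rng = np.random.default_rng(42)
coords = rng.uniform(0, 10, size=(40, 2))
values = rng.lognormal(mean=0.0, sigma=0.5, size=40)
lod = 0.7
censored = values < lod
bin_edges = np.linspace(0.0, 5.0, 6)
imputed, final_cost = anneal_censored(
    coords, values, censored, lod, (0.05, 0.25, 2.0), bin_edges, n_iter=500)
```

Observed values are never modified and only the censored entries are perturbed, so the result always respects the detection limit. A fuller version would also refit the variogram parameters during the search and draw proposals from a fitted distribution rather than a uniform one.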



Notes
The R project. www.r-project.org.
The Apulia Region. http://bdt.regione.puglia.it/home.html.
Acknowledgments
This study was supported by a fellowship from the Master and Back program financed by the Regional Sardinia Government, under agreement between the School of Geography, University of Southampton (UK) and the Dipartimento di Economia e Sistemi Arbori, University of Sassari (Italy). Thanks are due to the Apulia Regional Authority for Ecology and the Water Research Institute of the National Research Council for providing the data used in this study. Finally, the authors would like to acknowledge Dr. Edith Cheng at the University of Southampton, for inspiring the analysis and for helpful assistance.
Cite this article
Sedda, L., Atkinson, P.M., Barca, E. et al. Imputing censored data with desirable spatial covariance function properties using simulated annealing. J Geogr Syst 14, 265–282 (2012). https://doi.org/10.1007/s10109-010-0145-1
DOI: https://doi.org/10.1007/s10109-010-0145-1