Abstract
The paper considers the problem of testing for multiple outliers in a regression model and provides fast approximations to the null distribution of the minimum deletion residual used as a test statistic. Since direct simulation for each combination of number of observations and number of parameters is too time-consuming, methods using simple normal samples are described for approximating the pointwise distribution of the test statistic. One approximation is based on adjustments to the results of simple simulations. The other uses properties of order statistics from folded t distributions to reach significance levels beyond those available by simulation. Analyses of data with beta errors and of transformed data on survival times demonstrate the usefulness of including our bounds in graphical methods.
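To make the order-statistic idea concrete, the following is a minimal sketch, not the authors' procedure. It assumes, purely for illustration, that the statistic of interest behaves like the k-th smallest of n independent folded t variables (absolute values of Student t draws), and it estimates a pointwise quantile by brute-force Monte Carlo. The function name, its parameters, and the example values of n, k, and the degrees of freedom are all hypothetical, and the adjustments described in the abstract are not reproduced here.

```python
import math
import random


def folded_t_order_stat_quantile(n, k, df, level, n_sim=2000, seed=0):
    """Monte Carlo estimate of the `level` quantile of the k-th smallest
    (k is 1-based) of n independent |t_df| variables.

    Illustrative assumption only: independence and exact t margins.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sim):
        sample = []
        for _ in range(n):
            z = rng.gauss(0.0, 1.0)
            # Chi-square with df degrees of freedom is Gamma(shape=df/2, scale=2).
            chi2 = rng.gammavariate(df / 2.0, 2.0)
            # A t variate is z / sqrt(chi2 / df); folding takes its absolute value.
            sample.append(abs(z / math.sqrt(chi2 / df)))
        sample.sort()
        draws.append(sample[k - 1])
    draws.sort()
    # Empirical quantile: smallest draw with at least `level` of the mass below it.
    idx = min(n_sim - 1, max(0, math.ceil(level * n_sim) - 1))
    return draws[idx]


if __name__ == "__main__":
    # Hypothetical setting: n = 20 observations, 10th order statistic, 15 df.
    for level in (0.01, 0.5, 0.99):
        q = folded_t_order_stat_quantile(n=20, k=10, df=15, level=level)
        print(f"level {level}: estimated quantile {q:.3f}")
```

Repeating such a simulation for every combination of sample size and number of parameters is the cost the abstract describes as too time-consuming, which is the motivation for fast approximations.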
Cite this article
Riani, M., Atkinson, A.C. Fast calibrations of the forward search for testing multiple outliers in regression. ADAC 1, 123–141 (2007). https://doi.org/10.1007/s11634-007-0007-y