Abstract
This paper is devoted to highly robust statistical methods with applications to image analysis. The methods of the paper exploit the idea of implicit weighting, which is inspired by the highly robust least weighted squares regression estimator. We use a correlation coefficient based on implicit weighting of individual pixels as a highly robust similarity measure between two images. The reweighted least weighted squares estimator is considered as an alternative regression estimator with a clear interpretation. We apply implicit weighting to dimension reduction by means of robust principal component analysis. Highly robust methods are exploited in tasks of face localization and face detection in a database of 2D images. In this context we investigate a method for outlier detection and a filter for image denoising based on implicit weighting.




References
Arya, K.V., Gupta, P., Kalra, P.K., Mitra, P.: Image registration using robust M-estimators. Pattern Recognit. Lett. 28, 1957–1968 (2007)
Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997)
Böhringer, S., Vollmar, T., Tasse, C., Würtz, R.P., Gillessen-Kaesbach, G., Horsthemke, B., Wieczorek, D.: Syndrome identification based on 2D analysis software. Eur. J. Hum. Genet. 14, 1082–1089 (2006)
Chai, X., Shan, S., Chen, X., Gao, W.: Locally linear regression for pose-invariant face recognition. IEEE Trans. Image Process. 16(7), 1716–1725 (2007)
Chambers, J.M.: Software for Data Analysis: Programming with R. Springer, New York (2008)
Chen, J.-H., Chen, C.-S., Chen, Y.-S.: Fast algorithm for robust template matching with M-estimators. IEEE Trans. Signal Process. 51(1), 230–243 (2003)
Čížek, P.: Robust estimation with discrete explanatory variables. In: Härdle, W., Rönz, B. (eds.) COMPSTAT 2002, Proceedings in Computational Statistics, pp. 509–514. Physica-Verlag, Heidelberg (2002)
Čížek, P.: Semiparametrically weighted robust estimation of regression models. Comput. Stat. Data Anal. 55(1), 774–788 (2011)
Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR, pp. 886–893. IEEE Computer Society, Washington (2005)
Davies, P.L., Gather, U.: Breakdown and groups. Ann. Stat. 33(3), 977–1035 (2005)
Davies, P.L., Kovac, A.: Local extremes, runs, strings and multiresolution. Ann. Stat. 29(1), 1–65 (2001)
Donoho, D.L., Huber, P.J.: The notion of breakdown point. In: Bickel, P.J., Doksum, K., Hodges, J.L.J. (eds.) A Festschrift for Erich L. Lehmann, pp. 157–184. Wadsworth, Belmont (1983)
Ellis, S.P., Morgenthaler, S.: Leverage and breakdown in L1 regression. J. Am. Stat. Assoc. 87(417), 143–148 (1992)
Fidler, S., Skočaj, D., Leonardis, A.: Combining reconstructive and discriminative subspace methods for robust classification and regression by subsampling. IEEE Trans. Pattern Anal. Mach. Intell. 28(3), 337–350 (2006)
Franceschi, E., Odone, F., Smeraldi, F., Verri, A.: Finding objects with hypothesis testing. In: Proceedings of ICPR 2004, Workshop on Learning for Adaptable Visual Systems, Cambridge, 2004. IEEE Computer Society, Los Alamitos (2004)
Fried, R., Einbeck, J., Gather, U.: Weighted repeated median smoothing and filtering. J. Am. Stat. Assoc. 102(480), 1300–1308 (2007)
Gervini, D., Yohai, V.J.: A class of robust and fully efficient regression estimators. Ann. Stat. 30(2), 583–616 (2002)
Hájek, J., Šidák, Z., Sen, P.K.: Theory of Rank Tests, 2nd edn. Academic Press, San Diego (1999)
Härdle, W.K., Simar, L.: Applied Multivariate Statistical Analysis. Springer, Heidelberg (2007)
He, X., Portnoy, S.: Reweighted LS estimators converge at the same rate as the initial estimator. Ann. Stat. 20(4), 2161–2167 (1992)
Hillebrand, M., Müller, C.: Outlier robust corner-preserving methods for reconstructing noisy images. Ann. Stat. 35(1), 132–165 (2007)
Hotz, T., Marnitz, P., Stichtenoth, R., Davies, P.L., Kabluchko, Z., Munk, A.: Locally adaptive image denoising by a statistical multiresolution criterion. Preprint statistical regularization and qualitative constraints 8/2009, University of Göttingen (2009)
Huang, L.-L., Shimizu, A.: Combining classifiers for robust face detection. In: Lecture Notes in Computer Science, vol. 3972, pp. 116–121 (2006)
Hubert, M., Rousseeuw, P.J., van Aelst, S.: High-breakdown robust multivariate methods. Stat. Sci. 23(1), 92–119 (2008)
Kalina, J.: Asymptotic Durbin-Watson test for robust regression. Bull. Int. Stat. Inst. 62, 3406–3409 (2007)
Kalina, J.: Robust image analysis of faces for genetic applications. Eur. J. Biomed. Inform. 6(2), 6–13 (2010)
Kalina, J.: On multivariate methods in robust econometrics. Prague Econ. Pap. 21(1), 69–82 (2012)
Kleihorst, R.P.: Noise filtering of image sequences. Dissertation, Technical University Delft (1997)
Lin, Z., Davis, L.S., Doermann, D.S., DeMenthon, D.: Hierarchical part-template matching for human detection and segmentation. In: Proceedings of the Eleventh IEEE International Conference on Computer Vision ICCV 2007, pp. 1–8. IEEE Computer Society, Washington (2007)
Mairal, J., Elad, M., Sapiro, G.: Sparse representation for color image restoration. IEEE Trans. Image Process. 17(1), 53–69 (2008)
Maronna, R.A., Martin, R.D., Yohai, V.J.: Robust Statistics: Theory and Methods. Wiley, Chichester (2006)
Meer, P., Mintz, D., Rosenfeld, A., Kim, D.Y.: Robust regression methods for computer vision: A review. Int. J. Comput. Vis. 6(1), 59–70 (1991)
Müller, C.: Redescending M-estimators in regression analysis, cluster analysis and image analysis. Discuss. Math., Probab. Stat. 24(1), 59–75 (2004)
Naseem, I., Togneri, R., Bennamoun, M.: Linear regression for face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32(11), 2106–2112 (2010)
Pitas, I., Venetsanopoulos, A.N.: Nonlinear Digital Filters. Kluwer, Dordrecht (1990)
Plát, P.: The least weighted squares estimator. In: Antoch, J. (ed.) COMPSTAT 2004, Proceedings in Computational Statistics, pp. 1653–1660. Physica-Verlag, Heidelberg (2004)
Portilla, J., Strela, V., Wainwright, M.J., Simoncelli, E.P.: Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Trans. Image Process. 12(11), 1338–1351 (2003)
Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. Wiley, New York (1987)
Rousseeuw, P.J., van Driessen, K.: Computing LTS regression for large data sets. Data Min. Knowl. Discov. 12(1), 29–45 (2006)
Rowley, H., Baluja, S., Kanade, T.: Neural network-based face detection. IEEE Trans. Pattern Anal. Mach. Intell. 20(1), 23–38 (1998)
Salibián-Barrera, M.: The asymptotics of MM-estimators for linear regression with fixed designs. Metrika 63, 283–294 (2006)
Schettlinger, K., Fried, R., Gather, U.: Real time signal processing by adaptive repeated median filters. Int. J. Adapt. Control Signal Process. 24(5), 346–362 (2010)
Shevlyakov, G.L., Vilchevski, N.O.: Robustness in Data Analysis: Criteria and Methods. VSP, Utrecht (2002)
Tableman, M.: The influence functions for the least trimmed squares and the least trimmed absolute deviations estimators. Stat. Probab. Lett. 19, 329–337 (1994)
Torralba, A., Murphy, K.P., Freeman, W.T.: Sharing visual features for multiclass and multiview object detection. IEEE Trans. Pattern Anal. Mach. Intell. 29(5), 854–869 (2007)
Tuzel, O., Porikli, F., Meer, P.: Human detection via classification on Riemannian manifolds. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR 2007. IEEE Computer Society, Washington (2007)
Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vis. 57, 137–154 (2004)
Víšek, J.A.: The least weighted squares II. Consistency and asymptotic normality. Bull. Czech Econom. Soc. 9(16), 1–28 (2002)
Víšek, J.A.: Robust error-term-scale estimate. In: Nonparametrics and Robustness in Modern Statistical Inference and Time Series Analysis. Institute of Mathematical Statistics Collections, vol. 7, pp. 254–267 (2010)
Víšek, J.A.: Consistency of the least weighted squares under heteroscedasticity. Kybernetika 47(2), 179–206 (2011)
Wang, M., Lai, C.-H.: A Concise Introduction to Image Processing Using C++. CRC Press, Boca Raton (2008)
Wang, X., Tang, X.: Subspace analysis using random mixture models. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR 2005, pp. 574–580. IEEE Computer Society, Washington (2005)
Wong, Y., Sanderson, C., Lovell, B.C.: Regression based non-frontal face synthesis for improved identity verification. In: Jiang, X., Petkov, N. (eds.) Computer Analysis of Images and Patterns, pp. 116–124. Springer, Heidelberg (2010)
Wright, J., Yang, A.Y., Ganesh, A., Sastry, S.S., Ma, Y.: Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2009)
Yang, M.-H., Kriegman, D.J., Ahuja, N.: Detecting faces in images: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 24(1), 34–58 (2002)
Acknowledgements
This research was fully supported by project 1M06014 of the Ministry of Education, Youth and Sports of the Czech Republic. The author is grateful to two anonymous referees for valuable suggestions.
Appendix: Technical Details
Definition 9
(Weight function)
Let a function ψ: [0,1] → [0,1] be non-increasing and continuous on [0,1] with ψ(0) = 1 and ψ(1) = 0. Moreover, we assume that both one-sided derivatives of ψ exist at every point of (0,1) and are bounded by a common constant, and that ψ has a finite right derivative at 0 and a finite left derivative at 1. Then the function ψ is called a weight function.
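For illustration, the following Python sketch implements two functions satisfying Definition 9: the linear weight function ψ(t) = 1 − t and a trimmed linear variant. Both choices, the parameter τ, and the convention w_i = ψ(i/n) for generating weights are illustrative assumptions here; the paper's own example weights (7) and (8) are not reproduced above.

```python
import numpy as np

def psi_linear(t):
    """Linear weight function psi(t) = 1 - t.
    Satisfies Definition 9: non-increasing, continuous,
    psi(0) = 1, psi(1) = 0, bounded one-sided derivatives."""
    return 1.0 - np.asarray(t, dtype=float)

def psi_trimmed(t, tau=0.75):
    """Trimmed linear weight function: equal to 1 on [0, tau], then
    decreasing linearly to 0 at t = 1. The trimming constant tau is a
    hypothetical choice, not a value taken from the paper."""
    t = np.asarray(t, dtype=float)
    return np.clip((1.0 - t) / (1.0 - tau), 0.0, 1.0)

# One common convention: weights for n observations are generated as
# w_i = psi(i / n), i = 1, ..., n, and assigned to the ordered
# squared residuals (largest weight to the smallest residual).
n = 10
w = psi_trimmed(np.arange(1, n + 1) / n)
```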
Definition 10
(Least weighted squares with adaptive weights)
In the model (1), let \(\mathbf{b}_{0}\) denote an initial robust estimator of β and let \(\hat{\sigma}_{0}^{2}\) denote the corresponding initial robust estimator of \(\sigma^{2}\). Let \(F_{\chi}\) denote the distribution function of the \(\chi^{2}_{1}\) distribution. The least weighted squares estimator of β with adaptive weights is defined as
where
\(G_{n}\) is the empirical distribution function of \(u_{i}^{2}(\mathbf{b})\) and \(G_{n}^{0}\) is the empirical distribution function of \(u_{i}^{2}(\mathbf{b}_{0})\),
is used to avoid dividing by zero,
\(c = F_{\chi}^{-1}(q)\), and \(q \in [0.9999, 1)\) is a chosen constant.
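The displayed formulas of Definition 10 are not reproduced above. Purely as a sketch of the Gervini–Yohai style construction on which such adaptive weights build, the following Python code compares the empirical distribution function \(G_{n}^{0}\) of squared standardized residuals with \(F_{\chi}\) beyond the cutoff \(c = F_{\chi}^{-1}(q)\) and hard-rejects the estimated fraction of outliers; the details of the weight assignment are assumptions, not the paper's displayed definition.

```python
import numpy as np
from scipy.stats import chi2

def adaptive_hard_weights(residuals, sigma0, q=0.9999):
    """Sketch of a Gervini-Yohai style adaptive hard-rejection rule
    (assumed details; the paper's displayed formula is not shown here).

    residuals : residuals u_i(b_0) of an initial robust fit b_0
    sigma0    : initial robust estimate of the error scale
    q         : quantile defining the cutoff c = F_chi^{-1}(q)
    """
    n = len(residuals)
    t2 = (np.asarray(residuals, dtype=float) / sigma0) ** 2
    t = np.sort(t2)                      # ordered squared standardized residuals
    c = chi2.ppf(q, df=1)                # cutoff c = F_chi^{-1}(q)
    Fchi = chi2.cdf(t, df=1)             # F_chi evaluated at the ordered points
    Gn0 = np.arange(1, n + 1) / n        # G_n^0 evaluated at the ordered points
    # Estimated outlier fraction: largest positive gap F_chi - G_n^0 beyond c.
    dn = max(0.0, np.max(np.where(t >= c, Fchi - Gn0, 0.0)))
    k = n - int(np.floor(dn * n))        # number of observations retained
    # Observations with the k smallest squared residuals receive weight 1.
    w = np.zeros(n)
    w[np.argsort(t2)[:k]] = 1.0
    return w
```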
Assumptions \(\mathcal{A}\)
We assume a sequence of non-random vectors \(\{\mathbf{X}_{n}\}_{n=1}^{\infty}\) with values in \(\mathbb{R}^{p}\) and a sequence of independent and identically distributed random variables \(\{e_{n}\}_{n=1}^{\infty}\) with values in \(\mathbb{R}\), which form the model (1) for each n. The distribution function F(z) of the random error \(e_{1}\) is symmetric and absolutely continuous with a bounded density f(z), which is decreasing on \(\mathbb{R}^{+}\). The density is positive on \((-\infty, \infty)\) and its second derivative is bounded. Moreover,
Let
\[
\frac{1}{n} \sum_{i=1}^{n} \mathbf{X}_{i} \mathbf{X}_{i}^{T} \to \mathbf{Q}
\]
in probability, where \(\mathbf{Q}\) is a regular matrix. There exists a distribution function H(x) for \(x \in \mathbb{R}^{p}\) such that

Let \(B(\boldsymbol{\beta}, \delta) = \{\tilde{\boldsymbol{\beta}} \in \mathbb{R}^{p} : \|\tilde{\boldsymbol{\beta}} - \boldsymbol{\beta}\| < \delta\}\) for an arbitrary fixed δ > 0. For any compact set W with \(W \setminus B(\boldsymbol{\beta}, \delta) \neq \emptyset\), there exists \(\gamma_{\delta} > 0\) such that
Proof
(Theorem 1) Let us define
where \(\mathbf{u}^{LWS} = (u_{1}^{LWS}, \ldots, u_{n}^{LWS})^{T}\) are residuals of the least weighted squares estimator. We can express
and the statement follows from Theorem 4 for the linear regression context. □
Proof
(Theorem 2) This is a consequence of a general result of [50], where (19) is derived as a consistent estimator of \(\sigma^{2}\). The constant γ in (20) is independent of n and can be approximated by numerical integration as
using a partition \(-\infty < t_{1} < t_{2} < \cdots < t_{m} < \infty\) of the real line into intervals. A normal distribution \(N(0, \sigma^{2})\) of the errors is assumed; without loss of generality, we use the \(N(0,1)\) distribution in the numerical integration for the examples of weights in (7) and (8). □
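The integrand of (20) is not reproduced above. Purely to illustrate the partition-based numerical integration under the \(N(0,1)\) assumption, the following Python sketch approximates an integral of the form \(\int g(t)\varphi(t)\,dt\) by a midpoint rule over a partition of \([-T, T]\); the integrand g is a hypothetical placeholder, not the integrand of (20).

```python
import numpy as np
from scipy.stats import norm

def integrate_against_normal(g, T=8.0, m=10**5):
    """Approximate the integral of g(t) * phi(t) over the real line by a
    midpoint rule on the partition t_1 < ... < t_m of [-T, T], where phi
    is the N(0,1) density assumed in the proof of Theorem 2."""
    t = np.linspace(-T, T, m)
    mid = 0.5 * (t[:-1] + t[1:])   # midpoints of the partition intervals
    dt = np.diff(t)
    return np.sum(g(mid) * norm.pdf(mid) * dt)

# Sanity check with g(t) = t^2: the integral is the N(0,1) variance, 1.
print(integrate_against_normal(lambda t: t ** 2))   # approx 1.0
```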
Proof
(Theorem 3) The LWS estimator (5) in the model (27) is defined as the minimizer of \(\sum_{i=1}^{n} w_{i} u^{2}_{(i)}(\hat{\mu})\) over \(\hat{\mu} \in \mathbb{R}\) and over all permutations of the weights with magnitudes \(w_{1}, \ldots, w_{n}\). The location model (27) is a special case of linear regression, and therefore the solution has the form of a weighted mean \(\hat{\mu} = \sum_{i=1}^{n} w^{*}_{i} Y_{i}\), where \(w_{1}^{*}, \ldots, w_{n}^{*}\) are permuted values of \(w_{1}, \ldots, w_{n}\). Therefore, we can express the LWS estimator as
where the minimum is taken over all permutations of the weights with magnitudes \(w_{1}, \ldots, w_{n}\) and the notation
is used for the ordered coordinates of
However, (53) minimizes the weighted variance \(S^{2}_{w}(\mathbf{Y})\) of (28), which concludes the proof. □
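A minimal Python sketch of this characterization in the location model (27): the non-increasing weights are assigned to observations according to the ranks of their squared residuals and the weighted mean is iterated. This iterative reweighting is a heuristic local search suggested by the proof, not an algorithm stated in the paper.

```python
import numpy as np

def lws_location(y, w, n_iter=100):
    """Heuristic computation of the LWS estimate in the location model.

    y : observations Y_1, ..., Y_n
    w : non-increasing weights w_1 >= ... >= w_n generated by a
        weight function (see Definition 9)
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    mu = np.median(y)                    # robust starting point
    for _ in range(n_iter):
        # Ranks of squared residuals: the smallest residual gets rank 0
        # and hence the largest weight, realizing the implicit weighting.
        ranks = np.argsort(np.argsort((y - mu) ** 2))
        wi = w[ranks]                    # permuted weights w_i^*
        mu_new = np.sum(wi * y) / np.sum(wi)   # weighted mean
        if np.isclose(mu_new, mu):
            break
        mu = mu_new
    return mu
```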
Proof
(Theorem 4) The first part follows immediately from the asymptotic normality of \(\mathbf{b}^{LWS}\) proved in [37]. The independence between the numerator and the denominator is ensured asymptotically in probability, because \(\mathbf{b}^{LWS}\) is asymptotically equivalent in probability to \((\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{Y}\) and \((\mathbf{u}^{LWS})^{T}\mathbf{u}^{LWS}\) is asymptotically equivalent in probability to \(\mathbf{e}^{T}\mathbf{M}\mathbf{e}\) [49], where \(\mathbf{M} = \mathbf{I}_{n} - \mathbf{H}\), \(\mathbf{H} = \mathbf{X}(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\), and \(\mathbf{I}_{n}\) denotes the identity matrix of dimension n.
The second part follows from the asymptotic representation for the LWS estimator [49]. The asymptotic representation holds in the form

for n→∞, where
ψ is the weight function defining the LWS estimator and the coordinates of \(\boldsymbol{\eta} = (\eta_{1}, \ldots, \eta_{p})^{T}\) are of order \(o_{P}(1)\). Let us denote \(\boldsymbol{\tau} = -\frac{1}{\sqrt{n}}\mathbf{X}\boldsymbol{\eta}\), \(\boldsymbol{\kappa} = \mathbf{u}^{LWS} - \boldsymbol{\tau}\), and \(\Psi(\mathbf{e}) = (\psi_{1}(\mathbf{e}), \ldots, \psi_{n}(\mathbf{e}))^{T}\) with

and
It holds that \(\boldsymbol{\kappa} = \mathbf{M}\mathbf{e} + \boldsymbol{\phi}\) and
To complete the proof of the second part, we can repeat the steps of [26], where the test statistic of the Durbin-Watson test is considered. The residual sum of squares \((\mathbf{u}^{LWS})^{T}\mathbf{u}^{LWS}\) is asymptotically equivalent in probability to \(\boldsymbol{\kappa}^{T}\boldsymbol{\kappa}\) and to \(\mathbf{e}^{T}\mathbf{M}\mathbf{e}\), which is the test statistic computed from least squares residuals. At the same time, \(\mathbf{e}^{T}\mathbf{M}\mathbf{e}/\sigma^{2}\) follows the \(\chi^{2}_{n-p}\) distribution. The third part of the theorem is proven by analogous reasoning. □
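As an illustrative side check (not part of the proof), the claim that \(\mathbf{e}^{T}\mathbf{M}\mathbf{e}/\sigma^{2}\) follows the \(\chi^{2}_{n-p}\) distribution can be verified by simulation: the statistic should have mean n − p and variance 2(n − p). A minimal Python sketch with an arbitrary design and error scale:

```python
import numpy as np

# Monte Carlo check that e^T M e / sigma^2 follows chi-square with
# n - p degrees of freedom, where M = I_n - X (X^T X)^{-1} X^T is the
# projection onto the residual space of a fixed design X.
rng = np.random.default_rng(0)
n, p, sigma = 50, 3, 2.0
X = rng.normal(size=(n, p))
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
stats = np.array([e @ M @ e / sigma**2
                  for e in sigma * rng.normal(size=(10_000, n))])
print(stats.mean(), stats.var())   # approx n - p = 47 and 2(n - p) = 94
```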
Proof
(Corollary 1) Starting from (34), we exploit the scale invariance of the denominator. The variance \(\sigma^{2}_{\psi}\) is estimated consistently following [37], which yields (39). □