Abstract
This paper is devoted to highly robust statistical methods with applications to image analysis. The methods of the paper exploit the idea of implicit weighting, which is inspired by the highly robust least weighted squares regression estimator. We use a correlation coefficient based on implicit weighting of individual pixels as a highly robust similarity measure between two images. The reweighted least weighted squares estimator is considered as an alternative regression estimator with a clear interpretation. We apply implicit weighting to dimension reduction by means of robust principal component analysis. Highly robust methods are exploited in tasks of face localization and face detection in a database of 2D images. In this context we investigate a method for outlier detection and a filter for image denoising based on implicit weighting.




References
Arya, K.V., Gupta, P., Kalra, P.K., Mitra, P.: Image registration using robust M-estimators. Pattern Recognit. Lett. 28, 1957–1968 (2007)
Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997)
Böhringer, S., Vollmar, T., Tasse, C., Würtz, R.P., Gillessen-Kaesbach, G., Horsthemke, B., Wieczorek, D.: Syndrome identification based on 2D analysis software. Eur. J. Hum. Genet. 14, 1082–1089 (2006)
Chai, X., Shan, S., Chen, X., Gao, W.: Locally linear regression for pose-invariant face recognition. IEEE Trans. Image Process. 16(7), 1716–1725 (2007)
Chambers, J.M.: Software for Data Analysis: Programming with R. Springer, New York (2008)
Chen, J.-H., Chen, C.-S., Chen, Y.-S.: Fast algorithm for robust template matching with M-estimators. IEEE Trans. Signal Process. 51(1), 230–243 (2003)
Čížek, P.: Robust estimation with discrete explanatory variables. In: Härdle, W., Rönz, B. (eds.) COMPSTAT 2002, Proceedings in Computational Statistics, pp. 509–514. Physica-Verlag, Heidelberg (2002)
Čížek, P.: Semiparametrically weighted robust estimation of regression models. Comput. Stat. Data Anal. 55(1), 774–788 (2011)
Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR, pp. 886–893. IEEE Computer Society, Washington (2005)
Davies, P.L., Gather, U.: Breakdown and groups. Ann. Stat. 33(3), 977–1035 (2005)
Davies, P.L., Kovac, A.: Local extremes, runs, strings and multiresolution. Ann. Stat. 29(1), 1–65 (2001)
Donoho, D.L., Huber, P.J.: The notion of breakdown point. In: Bickel, P.J., Doksum, K., Hodges, J.L.J. (eds.) A Festschrift for Erich L. Lehmann, pp. 157–184. Wadsworth, Belmont (1983)
Ellis, S.P., Morgenthaler, S.: Leverage and breakdown in L1 regression. J. Am. Stat. Assoc. 87(417), 143–148 (1992)
Fidler, S., Skočaj, D., Leonardis, A.: Combining reconstructive and discriminative subspace methods for robust classification and regression by subsampling. IEEE Trans. Pattern Anal. Mach. Intell. 28(3), 337–350 (2006)
Franceschi, E., Odone, F., Smeraldi, F., Verri, A.: Finding objects with hypothesis testing. In: Proceedings of ICPR 2004, Workshop on Learning for Adaptable Visual Systems, Cambridge, 2004. IEEE Computer Society, Los Alamitos (2004)
Fried, R., Einbeck, J., Gather, U.: Weighted repeated median smoothing and filtering. J. Am. Stat. Assoc. 102(480), 1300–1308 (2007)
Gervini, D., Yohai, V.J.: A class of robust and fully efficient regression estimators. Ann. Stat. 30(2), 583–616 (2002)
Hájek, J., Šidák, Z., Sen, P.K.: Theory of Rank Tests, 2nd edn. Academic Press, San Diego (1999)
Härdle, W.K., Simar, L.: Applied Multivariate Statistical Analysis. Springer, Heidelberg (2007)
He, X., Portnoy, S.: Reweighted LS estimators converge at the same rate as the initial estimator. Ann. Stat. 20(4), 2161–2167 (1992)
Hillebrand, M., Müller, C.: Outlier robust corner-preserving methods for reconstructing noisy images. Ann. Stat. 35(1), 132–165 (2007)
Hotz, T., Marnitz, P., Stichtenoth, R., Davies, P.L., Kabluchko, Z., Munk, A.: Locally adaptive image denoising by a statistical multiresolution criterion. Preprint statistical regularization and qualitative constraints 8/2009, University of Göttingen (2009)
Huang, L.-L., Shimizu, A.: Combining classifiers for robust face detection. In: Lecture Notes in Computer Science, vol. 3972, pp. 116–121 (2006)
Hubert, M., Rousseeuw, P.J., van Aelst, S.: High-breakdown robust multivariate methods. Stat. Sci. 23(1), 92–119 (2008)
Kalina, J.: Asymptotic Durbin-Watson test for robust regression. Bull. Int. Stat. Inst. 62, 3406–3409 (2007)
Kalina, J.: Robust image analysis of faces for genetic applications. Eur. J. Biomed. Inform. 6(2), 6–13 (2010)
Kalina, J.: On multivariate methods in robust econometrics. Prague Econ. Pap. 21(1), 69–82 (2012)
Kleihorst, R.P.: Noise filtering of image sequences. Dissertation, Technical University Delft (1997)
Lin, Z., Davis, L.S., Doermann, D.S., DeMenthon, D.: Hierarchical part-template matching for human detection and segmentation. In: Proceedings of the Eleventh IEEE International Conference on Computer Vision ICCV 2007, pp. 1–8. IEEE Computer Society, Washington (2007)
Mairal, J., Elad, M., Sapiro, G.: Sparse representation for color image restoration. IEEE Trans. Image Process. 17(1), 53–69 (2008)
Maronna, R.A., Martin, R.D., Yohai, V.J.: Robust Statistics: Theory and Methods. Wiley, Chichester (2006)
Meer, P., Mintz, D., Rosenfeld, A., Kim, D.Y.: Robust regression methods for computer vision: A review. Int. J. Comput. Vis. 6(1), 59–70 (1991)
Müller, C.: Redescending M-estimators in regression analysis, cluster analysis and image analysis. Discuss. Math., Probab. Stat. 24(1), 59–75 (2004)
Naseem, I., Togneri, R., Bennamoun, M.: Linear regression for face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32(11), 2106–2112 (2010)
Pitas, I., Venetsanopoulos, A.N.: Nonlinear Digital Filters. Kluwer, Dordrecht (1990)
Plát, P.: The least weighted squares estimator. In: Antoch, J. (ed.) COMPSTAT 2004, Proceedings in Computational Statistics, pp. 1653–1660. Physica-Verlag, Heidelberg (2004)
Portilla, J., Strela, V., Wainwright, M.J., Simoncelli, E.P.: Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Trans. Image Process. 12(11), 1338–1351 (2003)
Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. Wiley, New York (1987)
Rousseeuw, P.J., van Driessen, K.: Computing LTS regression for large data sets. Data Min. Knowl. Discov. 12(1), 29–45 (2006)
Rowley, H., Baluja, S., Kanade, T.: Neural network-based face detection. IEEE Trans. Pattern Anal. Mach. Intell. 20(1), 23–38 (1998)
Salibián-Barrera, M.: The asymptotics of MM-estimators for linear regression with fixed designs. Metrika 63, 283–294 (2006)
Schettlinger, K., Fried, R., Gather, U.: Real time signal processing by adaptive repeated median filters. Int. J. Adapt. Control Signal Process. 24(5), 346–362 (2010)
Shevlyakov, G.L., Vilchevski, N.O.: Robustness in Data Analysis: Criteria and Methods. VSP, Utrecht (2002)
Tableman, M.: The influence functions for the least trimmed squares and the least trimmed absolute deviations estimators. Stat. Probab. Lett. 19, 329–337 (1994)
Torralba, A., Murphy, K.P., Freeman, W.T.: Sharing visual features for multiclass and multiview object detection. IEEE Trans. Pattern Anal. Mach. Intell. 29(5), 854–869 (2007)
Tuzel, O., Porikli, F., Meer, P.: Human detection via classification on Riemannian manifolds. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR 2007. IEEE Computer Society, Washington (2007)
Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vis. 57, 137–154 (2004)
Víšek, J.A.: The least weighted squares II. Consistency and asymptotic normality. Bull. Czech Econom. Soc. 9(16), 1–28 (2002)
Víšek, J.A.: Robust error-term-scale estimate. In: Nonparametrics and Robustness in Modern Statistical Inference and Time Series Analysis. Institute of Mathematical Statistics Collections, vol. 7, pp. 254–267 (2010)
Víšek, J.A.: Consistency of the least weighted squares under heteroscedasticity. Kybernetika 47(2), 179–206 (2011)
Wang, M., Lai, C.-H.: A Concise Introduction to Image Processing Using C++. CRC Press, Boca Raton (2008)
Wang, X., Tang, X.: Subspace analysis using random mixture models. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR 2005, pp. 574–580. IEEE Computer Society, Washington (2005)
Wong, Y., Sanderson, C., Lovell, B.C.: Regression based non-frontal face synthesis for improved identity verification. In: Jiang, X., Petkov, N. (eds.) Computer Analysis of Images and Patterns, pp. 116–124. Springer, Heidelberg (2010)
Wright, J., Yang, A.Y., Ganesh, A., Sastry, S.S., Ma, Y.: Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2009)
Yang, M.-H., Kriegman, D.J., Ahuja, N.: Detecting faces in images: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 24(1), 34–58 (2002)
Acknowledgements
This research was fully supported by project 1M06014 of the Ministry of Education, Youth and Sports of the Czech Republic. The author is grateful to two anonymous referees for valuable suggestions.
Appendix: Technical Details
Definition 9
(Weight function)
Let a function ψ: [0,1] → [0,1] be non-increasing and continuous on [0,1] with ψ(0) = 1 and ψ(1) = 0. Moreover, we assume that both one-sided derivatives of ψ exist at every point of (0,1) and are bounded by a common constant, and that ψ has a finite right derivative at 0 and a finite left derivative at 1. Then the function ψ is called a weight function.
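For illustration, the following Python sketch implements two functions satisfying Definition 9: the linear weight function ψ(t) = 1 − t and a trimmed linear variant. Both choices, the parameter τ, and the convention w_i = ψ(i/n) for generating weights are illustrative assumptions here; the paper's own example weights (7) and (8) are not reproduced above.

```python
import numpy as np

def psi_linear(t):
    """Linear weight function psi(t) = 1 - t.
    Satisfies Definition 9: non-increasing, continuous,
    psi(0) = 1, psi(1) = 0, bounded one-sided derivatives."""
    return 1.0 - np.asarray(t, dtype=float)

def psi_trimmed(t, tau=0.75):
    """Trimmed linear weight function: equal to 1 on [0, tau], then
    decreasing linearly to 0 at t = 1. The trimming constant tau is a
    hypothetical choice, not a value taken from the paper."""
    t = np.asarray(t, dtype=float)
    return np.clip((1.0 - t) / (1.0 - tau), 0.0, 1.0)

# One common convention: weights for n observations are generated as
# w_i = psi(i / n), i = 1, ..., n, and assigned to the ordered
# squared residuals (largest weight to the smallest residual).
n = 10
w = psi_trimmed(np.arange(1, n + 1) / n)
```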
Definition 10
(Least weighted squares with adaptive weights)
In the model (1), let \(\mathbf{b}_{0}\) denote an initial robust estimator of β and let \(\hat{\sigma}_{0}^{2}\) denote the corresponding initial robust estimator of \(\sigma^{2}\). Let \(F_{\chi}\) denote the distribution function of the \(\chi^{2}_{1}\) distribution. The least weighted squares estimator of β with adaptive weights is defined as
where
\(G_{n}\) is the empirical distribution function of \(u_{i}^{2}(\mathbf{b})\) and \(G_{n}^{0}\) is the empirical distribution function of \(u_{i}^{2}(\mathbf{b}_{0})\),
is used to avoid dividing by zero,
\(c = F_{\chi}^{-1}(q)\), and \(q \in [0.9999, 1)\) is a chosen constant.
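The displayed formulas of Definition 10 are not reproduced above. Purely as a sketch of the Gervini–Yohai style construction on which such adaptive weights build, the following Python code compares the empirical distribution function \(G_{n}^{0}\) of squared standardized residuals with \(F_{\chi}\) beyond the cutoff \(c = F_{\chi}^{-1}(q)\) and hard-rejects the estimated fraction of outliers; the details of the weight assignment are assumptions, not the paper's displayed definition.

```python
import numpy as np
from scipy.stats import chi2

def adaptive_hard_weights(residuals, sigma0, q=0.9999):
    """Sketch of a Gervini-Yohai style adaptive hard-rejection rule
    (assumed details; the paper's displayed formula is not shown here).

    residuals : residuals u_i(b_0) of an initial robust fit b_0
    sigma0    : initial robust estimate of the error scale
    q         : quantile defining the cutoff c = F_chi^{-1}(q)
    """
    n = len(residuals)
    t2 = (np.asarray(residuals, dtype=float) / sigma0) ** 2
    t = np.sort(t2)                      # ordered squared standardized residuals
    c = chi2.ppf(q, df=1)                # cutoff c = F_chi^{-1}(q)
    Fchi = chi2.cdf(t, df=1)             # F_chi evaluated at the ordered points
    Gn0 = np.arange(1, n + 1) / n        # G_n^0 evaluated at the ordered points
    # Estimated outlier fraction: largest positive gap F_chi - G_n^0 beyond c.
    dn = max(0.0, np.max(np.where(t >= c, Fchi - Gn0, 0.0)))
    k = n - int(np.floor(dn * n))        # number of observations retained
    # Observations with the k smallest squared residuals receive weight 1.
    w = np.zeros(n)
    w[np.argsort(t2)[:k]] = 1.0
    return w
```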
Assumptions \(\mathcal{A}\)
We assume a sequence of non-random vectors \(\{\mathbf{X}_{n}\}_{n=1}^{\infty}\) with values in \(\mathbb{R}^{p}\) and a sequence of independent and identically distributed random variables \(\{e_{n}\}_{n=1}^{\infty}\) with values in \(\mathbb{R}\), which form the model (1) for each n. The distribution function F(z) of the random error \(e_{1}\) is symmetric and absolutely continuous with a bounded density f(z), which is decreasing on \(\mathbb{R}^{+}\). The density is positive on \((-\infty, \infty)\) and its second derivative is bounded. Moreover,
Let
\[
\frac{1}{n} \sum_{i=1}^{n} \mathbf{X}_{i} \mathbf{X}_{i}^{T} \to \mathbf{Q}
\]
in probability, where \(\mathbf{Q}\) is a regular matrix. There exists a distribution function H(x) for \(x \in \mathbb{R}^{p}\) such that

Let \(B(\boldsymbol{\beta}, \delta) = \{\tilde{\boldsymbol{\beta}} \in \mathbb{R}^{p} : \|\tilde{\boldsymbol{\beta}} - \boldsymbol{\beta}\| < \delta\}\) for an arbitrary fixed δ > 0. For any compact set W with \(W \setminus B(\boldsymbol{\beta}, \delta) \neq \emptyset\), there exists \(\gamma_{\delta} > 0\) such that
Proof
(Theorem 1) Let us define
where \(\mathbf{u}^{LWS} = (u_{1}^{LWS}, \ldots, u_{n}^{LWS})^{T}\) are residuals of the least weighted squares estimator. We can express
and the statement follows from Theorem 4 for the linear regression context. □
Proof
(Theorem 2) This is a consequence of a general result of [50], where (19) is derived as a consistent estimator of \(\sigma^{2}\). The constant γ in (20) is independent of n and can be approximated by numerical integration as
using a partition \(-\infty < t_{1} < t_{2} < \cdots < t_{m} < \infty\) of the real line into intervals. A normal distribution \(N(0, \sigma^{2})\) of the errors is assumed; without loss of generality, we use the \(N(0,1)\) distribution in the numerical integration for the examples of weights in (7) and (8). □
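The integrand of (20) is not reproduced above. Purely to illustrate the partition-based numerical integration under the \(N(0,1)\) assumption, the following Python sketch approximates an integral of the form \(\int g(t)\varphi(t)\,dt\) by a midpoint rule over a partition of \([-T, T]\); the integrand g is a hypothetical placeholder, not the integrand of (20).

```python
import numpy as np
from scipy.stats import norm

def integrate_against_normal(g, T=8.0, m=10**5):
    """Approximate the integral of g(t) * phi(t) over the real line by a
    midpoint rule on the partition t_1 < ... < t_m of [-T, T], where phi
    is the N(0,1) density assumed in the proof of Theorem 2."""
    t = np.linspace(-T, T, m)
    mid = 0.5 * (t[:-1] + t[1:])   # midpoints of the partition intervals
    dt = np.diff(t)
    return np.sum(g(mid) * norm.pdf(mid) * dt)

# Sanity check with g(t) = t^2: the integral is the N(0,1) variance, 1.
print(integrate_against_normal(lambda t: t ** 2))   # approx 1.0
```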
Proof
(Theorem 3) The LWS estimator (5) in the model (27) is defined as the minimizer of \(\sum_{i=1}^{n} w_{i} u^{2}_{(i)}(\hat{\mu})\) over \(\hat{\mu} \in \mathbb{R}\) and over all permutations of the weights with magnitudes \(w_{1}, \ldots, w_{n}\). The location model (27) is a special case of linear regression, and therefore the solution has the form of a weighted mean \(\hat{\mu} = \sum_{i=1}^{n} w^{*}_{i} Y_{i}\), where \(w_{1}^{*}, \ldots, w_{n}^{*}\) are permuted values of \(w_{1}, \ldots, w_{n}\). Therefore, we can express the LWS estimator as
where the minimum is taken over all permutations of the weights with magnitudes \(w_{1}, \ldots, w_{n}\) and the notation
is used for the ordered coordinates of
However, (53) minimizes the weighted variance \(S^{2}_{w}(\mathbf{Y})\) of (28), which concludes the proof. □
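A minimal Python sketch of this characterization in the location model (27): the non-increasing weights are assigned to observations according to the ranks of their squared residuals and the weighted mean is iterated. This iterative reweighting is a heuristic local search suggested by the proof, not an algorithm stated in the paper.

```python
import numpy as np

def lws_location(y, w, n_iter=100):
    """Heuristic computation of the LWS estimate in the location model.

    y : observations Y_1, ..., Y_n
    w : non-increasing weights w_1 >= ... >= w_n generated by a
        weight function (see Definition 9)
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    mu = np.median(y)                    # robust starting point
    for _ in range(n_iter):
        # Ranks of squared residuals: the smallest residual gets rank 0
        # and hence the largest weight, realizing the implicit weighting.
        ranks = np.argsort(np.argsort((y - mu) ** 2))
        wi = w[ranks]                    # permuted weights w_i^*
        mu_new = np.sum(wi * y) / np.sum(wi)   # weighted mean
        if np.isclose(mu_new, mu):
            break
        mu = mu_new
    return mu
```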
Proof
(Theorem 4) The first part follows immediately from the asymptotic normality of \(\mathbf{b}^{LWS}\) proved in [37]. The independence between the numerator and the denominator is ensured asymptotically in probability, because \(\mathbf{b}^{LWS}\) is asymptotically equivalent in probability to \((\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{Y}\) and \((\mathbf{u}^{LWS})^{T}\mathbf{u}^{LWS}\) is asymptotically equivalent in probability to \(\mathbf{e}^{T}\mathbf{M}\mathbf{e}\) [49], where \(\mathbf{M} = \mathbf{I}_{n} - \mathbf{H}\), \(\mathbf{H} = \mathbf{X}(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\), and \(\mathbf{I}_{n}\) denotes the identity matrix of dimension n.
The second part follows from the asymptotic representation for the LWS estimator [49]. The asymptotic representation holds in the form

for n→∞, where
ψ is the weight function defining the LWS estimator and the coordinates of \(\boldsymbol{\eta} = (\eta_{1}, \ldots, \eta_{p})^{T}\) are of order \(o_{P}(1)\). Let us denote \(\boldsymbol{\tau} = -\frac{1}{\sqrt{n}}\mathbf{X}\boldsymbol{\eta}\), \(\boldsymbol{\kappa} = \mathbf{u}^{LWS} - \boldsymbol{\tau}\), and \(\Psi(\mathbf{e}) = (\psi_{1}(\mathbf{e}), \ldots, \psi_{n}(\mathbf{e}))^{T}\) with

and
It holds that \(\boldsymbol{\kappa} = \mathbf{M}\mathbf{e} + \boldsymbol{\phi}\) and
To complete the proof of the second part, we can repeat the steps of [26], where the test statistic of the Durbin-Watson test is considered. The residual sum of squares \((\mathbf{u}^{LWS})^{T}\mathbf{u}^{LWS}\) is asymptotically equivalent in probability to \(\boldsymbol{\kappa}^{T}\boldsymbol{\kappa}\) and to \(\mathbf{e}^{T}\mathbf{M}\mathbf{e}\), which is the test statistic computed from least squares residuals. At the same time, \(\mathbf{e}^{T}\mathbf{M}\mathbf{e}/\sigma^{2}\) follows the \(\chi^{2}_{n-p}\) distribution. The third part of the theorem is proven by analogous reasoning. □
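As an illustrative side check (not part of the proof), the claim that \(\mathbf{e}^{T}\mathbf{M}\mathbf{e}/\sigma^{2}\) follows the \(\chi^{2}_{n-p}\) distribution can be verified by simulation: the statistic should have mean n − p and variance 2(n − p). A minimal Python sketch with an arbitrary design and error scale:

```python
import numpy as np

# Monte Carlo check that e^T M e / sigma^2 follows chi-square with
# n - p degrees of freedom, where M = I_n - X (X^T X)^{-1} X^T is the
# projection onto the residual space of a fixed design X.
rng = np.random.default_rng(0)
n, p, sigma = 50, 3, 2.0
X = rng.normal(size=(n, p))
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
stats = np.array([e @ M @ e / sigma**2
                  for e in sigma * rng.normal(size=(10_000, n))])
print(stats.mean(), stats.var())   # approx n - p = 47 and 2(n - p) = 94
```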
Proof
(Corollary 1) Starting from (34), we exploit the scale invariance of the denominator. The variance \(\sigma^{2}_{\psi}\) is estimated consistently following [37], which yields (39). □