
Bias Reduction in Variational Regularization

Abstract

The aim of this paper is to introduce and study a two-step debiasing method for variational regularization. After solving the standard variational problem, the key idea is to add a consecutive debiasing step minimizing the data fidelity on an appropriate set, the so-called model manifold. The latter is defined by Bregman distances or infimal convolutions thereof, using the (uniquely defined) subgradient appearing in the optimality condition of the variational method. For particular settings, such as anisotropic \(\ell ^1\) and TV-type regularization, previously used debiasing techniques are shown to be special cases. The proposed approach is, however, easily applicable to a wider range of regularizations. The two-step debiasing is shown to be well-defined and to optimally reduce bias in a certain setting. In addition to visual and PSNR-based evaluations, different notions of bias and variance decompositions are investigated in numerical studies. The improvements offered by the proposed scheme are demonstrated, and its performance is shown to be comparable to optimal results obtained with Bregman iterations.


Notes

  1. http://r0k.us/graphics/kodak/.

References

  1. Benning, M., Burger, M.: Ground states and singular vectors of convex variational regularization methods. Methods Appl. Anal. 20, 295–334 (2013)

  2. Borwein, J., Zhu, Q.: Techniques of Variational Analysis. Springer Science & Business Media (2006)

  3. Bredies, K., Kunisch, K., Pock, T.: Total generalized variation. SIAM J. Imaging Sci. 3, 492–526 (2010)

  4. Bredies, K., Pikkarainen, H.: Inverse problems in spaces of measures. ESAIM Control Optim. Calc. Var. 19(1), 190–218 (2013)

  5. Burger, M.: Bregman distances in inverse problems and partial differential equations. In: Hiriart-Urruty, J., Korytowski, A., Maurer, H., Szymkat, M. (eds.) Advances in Mathematical Modeling, Optimization, and Optimal Control. Springer, New York (2016)

  6. Burger, M., Gilboa, G., Moeller, M., Eckardt, L., Cremers, D.: Spectral decompositions using one-homogeneous functionals. Preprint, arXiv:1601.02912 (2016)

  7. Burger, M., Gilboa, G., Osher, S., Xu, J.: Nonlinear inverse scale space methods. Commun. Math. Sci. 4, 178–212 (2006)

  8. Burger, M., Moeller, M., Benning, M., Osher, S.: An adaptive inverse scale space method for compressed sensing. Math. Comput. 82, 269–299 (2013)

  9. Burger, M., Osher, S.: Convergence rates of convex variational regularization. Inverse Probl. 20, 1411–1421 (2004)

  10. Burger, M., Osher, S.: A guide to the TV zoo. In: Level Set and PDE Based Reconstruction Methods in Imaging, pp. 1–70. Springer International Publishing (2013)

  11. Burger, M., Resmerita, E., He, L.: Error estimation for Bregman iterations and inverse scale space methods in image restoration. Computing 81, 109–135 (2007)

  12. Candes, E.J., Plan, Y.: A probabilistic and RIPless theory of compressed sensing. IEEE Trans. Inf. Theory 57(11), 7235–7254 (2011)

  13. Caselles, V., Chambolle, A., Novaga, M.: Total variation in imaging. In: Handbook of Mathematical Methods in Imaging, pp. 1016–1057. Springer Science & Business Media (2011)

  14. Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40, 120–145 (2011)

  15. Combettes, P.L., Salzo, S., Villa, S.: Regularized learning schemes in Banach space. Preprint, arXiv:1410.6847 (2014)

  16. Deledalle, C.A., Papadakis, N., Salmon, J.: On debiasing restoration algorithms: applications to total-variation and nonlocal-means. In: Scale Space and Variational Methods in Computer Vision 2015, Lecture Notes in Computer Science, vol. 9087, pp. 129–141 (2015)

  17. Deledalle, C.A., Papadakis, N., Salmon, J., Vaiter, S.: CLEAR: Covariant LEAst-Square Re-fitting with Applications to Image Restoration. Preprint, arXiv:1606.05158 (2016)

  18. Ekeland, I., Temam, R.: Convex Analysis and Variational Problems. SIAM, Philadelphia (1999)

  19. Esser, E., Zhang, X., Chan, T.: A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM J. Imaging Sci. 3(4), 1015–1046 (2010)

  20. Figueiredo, M., Nowak, R., Wright, S.: Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems. IEEE J. Sel. Top. Signal Process. 1(4), 586–598 (2007)

  21. Hintermüller, M., Papafitsoros, K., Rautenberg, C.: Analytical aspects of spatially adapted total variation regularisation. Preprint, arXiv:1609.01074 (2016)

  22. Lederer, J.: Trust, but verify: benefits and pitfalls of least-squares refitting in high dimensions. Preprint, arXiv:1306.0113v1 (2013)

  23. Min, J., Vonesch, C., Kirshner, H., Carlini, L., Olivier, N., Holden, S., Manley, S., Ye, J.C., Unser, M.: FALCON: fast and unbiased reconstruction of high-density super-resolution microscopy data. Sci. Rep. 4, 4577 (2014)

  24. Moeller, M., Brinkmann, E.-M., Burger, M., Seybold, T.: Color Bregman TV. SIAM J. Imaging Sci. 7(4), 2771–2806 (2014)

  25. Moeller, M., Burger, M.: Multiscale methods for polyhedral regularizations. SIAM J. Optim. 23, 1424–1456 (2013)

  26. Osher, S., Ruan, F., Xiong, J., Yao, Y., Yin, W.: Sparse recovery via differential inclusions. Appl. Comput. Harmon. Anal. 41(2), 436–469 (2016)

  27. Osher, S., Burger, M., Goldfarb, D., Xu, J., Yin, W.: An iterative regularization method for total variation-based image restoration. SIAM Multiscale Model. Simul. 4, 460–489 (2005)

  28. Pock, T., Cremers, D., Bischof, H., Chambolle, A.: An algorithm for minimizing the Mumford-Shah functional. In: IEEE 12th International Conference on Computer Vision 2009, pp. 1133–1140 (2009)

  29. Resmerita, E.: Regularization of ill-posed problems in Banach spaces: convergence rates. Inverse Probl. 21(4), 1303 (2005)

  30. Rockafellar, R.T.: Convex Analysis. Princeton Landmarks in Mathematics. Princeton University Press, Princeton (1997)

  31. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D Nonlinear Phenom. 60, 259–268 (1992)

  32. Scherzer, O., Groetsch, C.: Inverse scale space theory for inverse problems. In: Scale-Space 2001, Springer LNCS 2106, pp. 317–325 (2001)

  33. Shapiro, A.: On concepts of directional differentiability. J. Optim. Theory Appl. 66(3), 477–487 (1990)

  34. Schuster, T., Kaltenbacher, B., Hofmann, B., Kazimierski, K.S.: Regularization Methods in Banach Spaces. DeGruyter, Berlin (2012)

  35. Tadmor, E., Nezzar, S., Vese, L.: A multiscale image representation using hierarchical (BV, L2) decompositions. Multiscale Model. Simul. 2, 554–579 (2004)

Acknowledgements

This work was supported by ERC via Grant EU FP 7-ERC Consolidator Grant 615216 LifeInverse. MB acknowledges support by the German Science Foundation DFG via EXC 1003 Cells in Motion Cluster of Excellence, Münster, Germany.

Author information

Correspondence to Camille Sutour.

Appendix


We have included some examples and proofs in this Appendix in order not to interrupt the flow of the paper. These are in particular the proof of the isotropic shrinkage formula together with the calculation of the corresponding directional derivatives for isotropic and anisotropic shrinkage in Example 1, and the calculation of the infimal convolution of two \(\ell ^1\)-Bregman distances in Example 4.

1.1 Shrinkage

Let \(f \in \ell ^2(\mathbb {R}^d)\) be a vector-valued signal for \(d \in \mathbb {N}\). Then the solution of the isotropic shrinkage problem

$$\begin{aligned} u_\alpha (f) \in \mathop {\mathrm{arg\,min}}\limits _{u \in \ell ^1(\mathbb {R}^d)} \dfrac{1}{2} \Vert u-f\Vert _{\ell ^2(\mathbb {R}^d)}^2 + \alpha \Vert u\Vert _{\ell ^1(\mathbb {R}^d)} \end{aligned}$$

is given by the isotropic soft-thresholding

$$\begin{aligned}{}[u_\alpha (f)]_i = {\left\{ \begin{array}{ll} (1 - \frac{\alpha }{|f_i|}) f_i, &{} |f_i| > \alpha , \\ 0, &{} |f_i| \le \alpha . \end{array}\right. } \end{aligned}$$

Proof

We first point out that the structure of the objective allows us to exploit strong duality. Following [2, Theorem 4.4.3 and Lemma 4.3.1], strong duality holds if

$$\begin{aligned} \mathrm {dom}( \Vert \cdot \Vert _{\ell ^1(\mathbb {R}^d)} ) \cap \mathrm {cont}\left( \dfrac{1}{2} \Vert \cdot -f\Vert _{\ell ^2(\mathbb {R}^d)}^2 \right) \ne \emptyset . \end{aligned}$$

Since the \(\ell ^1(\mathbb {R}^d)\)-norm has full domain and the \(\ell ^2(\mathbb {R}^d)\)-norm is continuous everywhere, this is trivially fulfilled. Hence, by the dual definition of the \(\ell ^1(\mathbb {R}^d)\)-norm, we find

$$\begin{aligned}&\quad \min _{u \in \ell ^1(\mathbb {R}^d)} \dfrac{1}{2} \Vert u-f\Vert _{\ell ^2(\mathbb {R}^d)}^2 + \alpha \Vert u\Vert _{\ell ^1(\mathbb {R}^d)} \\&= \min _{u \in \ell ^1(\mathbb {R}^d)} \sup _{\begin{array}{c} r \in \ell ^{\infty }(\mathbb {R}^d) \\ \Vert r\Vert _{\ell ^{\infty }(\mathbb {R}^d)} \le \alpha \end{array}} \dfrac{1}{2} \Vert u-f\Vert _{\ell ^2(\mathbb {R}^d)}^2 + \langle r,u \rangle \\&= \sup _{\Vert r\Vert _{\ell ^{\infty }(\mathbb {R}^d)} \le \alpha } \min _{u \in \ell ^1(\mathbb {R}^d)} \dfrac{1}{2} \Vert u-f\Vert _{\ell ^2(\mathbb {R}^d)}^2 + \langle r,u \rangle , \end{aligned}$$

where we used strong duality to interchange the infimum and the supremum. We can explicitly compute the minimizer for u as \(u = f-r\) and hence

$$\begin{aligned}&\quad \sup _{\Vert r\Vert _{\ell ^{\infty }(\mathbb {R}^d)} \le \alpha } \min _{u \in \ell ^1(\mathbb {R}^d)} \dfrac{1}{2} \Vert u-f\Vert _{\ell ^2(\mathbb {R}^d)}^2 + \langle r,u \rangle \\&= \sup _{\Vert r\Vert _{\ell ^{\infty }(\mathbb {R}^d)} \le \alpha } - \dfrac{1}{2} \Vert r\Vert _{\ell ^2(\mathbb {R}^d)}^2 + \langle r,f \rangle . \end{aligned}$$

This supremum can be computed explicitly pointwise with the corresponding Lagrangian

$$\begin{aligned} \mathcal {L}(r_i, \lambda ) = - \dfrac{1}{2} |r_i|^2 + r_i \cdot f_i + \lambda (|r_i|^2 - \alpha ^2) \end{aligned}$$

with \(\lambda \le 0\). Note that both the objective function and the constraints are continuously differentiable and that Slater’s condition holds. Optimality with respect to \(r_i\) yields

$$\begin{aligned} f_i - r_i + 2 \lambda r_i = 0 \end{aligned}$$

and hence

$$\begin{aligned} r_i = \dfrac{f_i}{1- 2 \lambda }. \end{aligned}$$

We distinguish two cases:

If \(|r_i| = \alpha \), then \(\alpha (1- 2\lambda ) = |f_i|\) and

$$\begin{aligned} u_i = f_i - r_i = f_i - \frac{f_i}{1-2\lambda } = \left( 1 - \frac{\alpha }{|f_i|}\right) f_i. \end{aligned}$$

The nonpositivity of \(\lambda \) implies that \(|f_i| \ge \alpha \). In case \(|r_i| < \alpha \), we obtain that \(\lambda = 0\) and hence \(r_i = f_i\) and \(u_i = 0\) when \(|f_i| < \alpha \). Note that since \(f \in \ell ^2(\mathbb {R}^d)\) there exists a finite N such that \(|f_i| \le \alpha \) for all \(i > N\). Hence trivially \(u_\alpha (f) \in \ell ^1(\mathbb {R}^d)\) as \(\sum _{i \in \mathbb {N}} \vert \left[ u_\alpha (f)\right] _i \vert \) is a finite sum. This yields the assertion. \(\square \)

Remark

For \(d = 1\) and a square-summable sequence \(f \in \ell ^2\), we immediately obtain the anisotropic case: the solution to

$$\begin{aligned} u_\alpha \in \arg \min _{u \in \ell ^1} \dfrac{1}{2} \Vert u-f\Vert _{\ell ^2}^2 + \alpha \Vert u\Vert _{\ell ^1} \end{aligned}$$
(8.1)

for \(\alpha > 0\) is given by

$$\begin{aligned}{}[u_\alpha (f)]_i = {\left\{ \begin{array}{ll} f_i - \alpha ~ \mathrm {sign}(f_i), &{} |f_i| \ge \alpha \\ 0, &{} |f_i| < \alpha . \end{array}\right. } \end{aligned}$$
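
For illustration, the following is a minimal NumPy sketch of both thresholding rules; the function names and the array layout (one signal entry \(f_i \in \mathbb {R}^d\) per row) are our own choices and not part of the paper. For \(d = 1\) the isotropic rule reduces to the anisotropic one, which the final assertion checks.

```python
import numpy as np

def isotropic_shrinkage(f, alpha):
    """Isotropic soft-thresholding of an (N, d) array whose rows are the vectors f_i:
    each row is scaled by (1 - alpha/|f_i|) if |f_i| > alpha and set to zero otherwise."""
    norms = np.linalg.norm(f, axis=1, keepdims=True)                   # |f_i|
    return np.maximum(1.0 - alpha / np.maximum(norms, 1e-15), 0.0) * f

def anisotropic_shrinkage(f, alpha):
    """Componentwise (d = 1) soft-thresholding: f_i -> sign(f_i) * max(|f_i| - alpha, 0)."""
    return np.sign(f) * np.maximum(np.abs(f) - alpha, 0.0)

# For d = 1 both rules coincide.
f, alpha = np.array([2.0, -0.3, 0.7, -1.5]), 0.5
assert np.allclose(isotropic_shrinkage(f[:, None], alpha).ravel(),
                   anisotropic_shrinkage(f, alpha))
```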

Directional derivative: The computation of the directional derivative requires a little more work. At first, let us compute the directional derivative of the function \(F :\mathbb {R}^d\backslash \{0\} \rightarrow \mathbb {R}\), \(x \mapsto \frac{1}{|x|}\) in the direction \(g \in \mathbb {R}^d\). We define \(G :\mathbb {R}^d\backslash \{0\} \rightarrow \mathbb {R}\), \(x \mapsto \frac{1}{|x|^2}\) and calculate

$$\begin{aligned} \mathrm {d}G(x;g)&= \lim _{t \rightarrow 0^+} \frac{G(x+tg) - G(x)}{t} \\&= \lim _{t \rightarrow 0^+} \frac{1}{t} \left( \frac{1}{|x+tg|^2} - \frac{1}{|x|^2} \right) \\&= \lim _{t \rightarrow 0^+} \frac{1}{t} \left( \frac{|x|^2 - |x + tg|^2}{|x|^2 |x + tg|^2} \right) \\&= \lim _{t \rightarrow 0^+} \frac{1}{t} \left( \frac{- 2 t x \cdot g - t^2 |g|^2}{|x|^2 |x + tg|^2} \right) \\&= - 2 \frac{x \cdot g}{|x|^4}. \end{aligned}$$

Then, by the chain rule, we obtain

$$\begin{aligned} \mathrm {d}F(x;g)&= \mathrm {d}\sqrt{G}(x;g) = \frac{\mathrm {d}G(x;g)}{2 \sqrt{G(x)}} \\&= -2 \frac{x \cdot g}{|x|^4} \frac{|x|}{2} = - \frac{x \cdot g}{|x|^3}. \end{aligned}$$

Let us further define the projection of a vector \(x \in \mathbb {R}^d\) onto another vector \(y \in \mathbb {R}^d\backslash \{0\}\) as

$$\begin{aligned} \Pi _y (x) = \frac{y \cdot x}{|y|^2} y. \end{aligned}$$

We now have to compute

$$\begin{aligned}{}[\mathrm {d}u_\alpha (f;g)]_i = \lim _{t \rightarrow 0^+} \dfrac{1}{t} \big ( [u_\alpha (f + tg) ]_i - [u_\alpha (f) ]_i \big ) \end{aligned}$$

and we can distinguish four cases:

Let at first \(|f_i| > \alpha \). Then for t small enough we have \(|f_i + t g_i| > \alpha \) and hence

$$\begin{aligned}&\lim _{t \rightarrow 0^+} \dfrac{1}{t} \big ( [u_\alpha (f + tg) ]_i - [u_\alpha (f) ]_i \big ) \\ =&\lim _{t \rightarrow 0^+} \dfrac{1}{t} \left( \left( 1 - \frac{\alpha }{|f_i + t g_i|} \right) (f_i + t g_i)\right. \\&\quad \qquad \left. - \left( 1 - \frac{\alpha }{|f_i|} \right) f_i \right) \\ =&\lim _{t \rightarrow 0^+} \dfrac{1}{t} \left( f_i + t g_i - \alpha \frac{f_i + t g_i}{|f_i + t g_i|} - f_i + \alpha \frac{f_i}{|f_i|} \right) \\ =&\lim _{t \rightarrow 0^+} \dfrac{1}{t} \left( t g_i - \frac{\alpha t g_i}{|f_i + t g_i|} \right. \\&\quad \qquad \left. - \alpha f_i \left( \frac{1}{|f_i + t g_i|} - \frac{1}{|f_i|} \right) \right) \\ =&\ g_i - \alpha \frac{g_i}{|f_i|} + \alpha f_i \frac{f_i \cdot g_i}{|f_i|^3} \\ =&\ g_i + \frac{\alpha }{|f_i|} \left( \Pi _{f_i}(g_i) - g_i \right) . \end{aligned}$$

For \(|f_i| < \alpha \) and t small enough, we easily find \(|f_i + t g_i| < \alpha \) and hence

$$\begin{aligned}{}[\mathrm {d}u_\alpha (f;g)]_i = 0. \end{aligned}$$

In case \(|f_i| = \alpha \) we need to distinguish whether \(|f_i + t g_i| > \alpha \) or \(|f_i + t g_i| \le \alpha \) for arbitrarily small t. We hence compute

$$\begin{aligned}&\quad&|f_i + t g_i|&> \alpha \\&\Leftrightarrow \&|f_i + t g_i|^2&> \alpha ^2 \\&\Leftrightarrow \&|f_i|^2 + 2 t f_i \cdot g_i + t^2 |g_i|^2&> \alpha ^2 \\&\Leftrightarrow \&2 f_i \cdot g_i + t |g_i|^2&> 0, \end{aligned}$$

which for arbitrarily small t is true only if \(f_i \cdot g_i \ge 0\). Analogously, we find that \(|f_i + t g_i| < \alpha \) for small t is only true if \(f_i \cdot g_i < 0\).

Hence let now \(|f_i| = \alpha \) and \(f_i \cdot g_i \ge 0\). Then we obtain

$$\begin{aligned}{}[\mathrm {d}u_\alpha&(f;g)]_i = \lim _{t \rightarrow 0^+} \dfrac{1}{t} \big ( [u_\alpha (f + tg) ]_i \big ) \\&= \lim _{t \rightarrow 0^+} \dfrac{1}{t} \left( \left( 1- \frac{\alpha }{|f_i + t g_i|} \right) (f_i + t g_i) \right) . \end{aligned}$$

Using \(\alpha = |f_i|\), we find

$$\begin{aligned}&\lim _{t \rightarrow 0^+} \dfrac{|f_i|f_i}{t} \left( \frac{1}{|f_i|} - \frac{1}{|f_i + t g_i|} \right) + g_i - \frac{|f_i| g_i}{|f_i + t g_i|} \\&\qquad = |f_i|f_i \frac{f_i \cdot g_i}{|f_i|^3} \\&\qquad = \Pi _{f_i}(g_i). \end{aligned}$$

In the last case \(|f_i| = \alpha \) and \(f_i \cdot g_i < 0\), we find

$$\begin{aligned}{}[\mathrm {d}u_\alpha (f;g)]_i = \lim _{t \rightarrow 0^+} \dfrac{1}{t} \big ( [u_\alpha (f + tg) ]_i \big ) = 0. \end{aligned}$$

Summing up we have

$$\begin{aligned}&[\mathrm {d}u_\alpha (f; g)]_i \\&\quad = \left\{ \begin{array}{ll} {g_i + \frac{\alpha }{|f_i|} \left( \Pi _{f_i}(g_i) - g_i \right) ,} &{}\quad |f_i|> \alpha ,\\ 0, &{}\quad |f_i|< \alpha ,\\ \Pi _{f_i}(g_i),&{} \quad {|f_i|= \alpha ,\ f_i \cdot g_i > 0,} \\ 0, &{}\quad {|f_i|= \alpha ,\ f_i \cdot g_i \le 0.} \end{array}\right. \end{aligned}$$

It remains to show that

$$\begin{aligned} \left\| \frac{ u_\alpha (f + tg) - u_\alpha (f) }{t} - \mathrm {d}u_\alpha (f;g) \right\| _{\ell ^1(\mathbb {R}^d)} \rightarrow 0 \end{aligned}$$

for \(t \rightarrow 0^+\). Again, since \(f \in \ell ^2(\mathbb {R}^d)\), there exists \(N \in \mathbb {N}\) such that \(|f_i| < \alpha \) and hence \([\mathrm {d}u_\alpha (f;g)]_i = 0\) for all \(i > N\). Since \(g \in \ell ^2(\mathbb {R}^d)\) as well, after possibly enlarging \(N\) the difference quotient also vanishes for all \(i > N\) and all sufficiently small \(t\). Hence the above \(\ell ^1\)-norm is a finite sum and we trivially obtain convergence in \(\ell ^1(\mathbb {R}^d)\).

Remark

For \(d=1\) and \(f \in \ell ^2\) we obtain the anisotropic result:

$$\begin{aligned}{}[\mathrm {d}u_\alpha (f;g)]_i = {\left\{ \begin{array}{ll} g_i, &{} |f_i| > \alpha \\ 0, &{} |f_i| < \alpha \\ g_i, &{} |f_i| = \alpha , \mathrm {sign}(f_i) = \mathrm {sign}(g_i) \\ 0, &{} |f_i| = \alpha , \mathrm {sign}(f_i) \ne \mathrm {sign}(g_i), \end{array}\right. } \end{aligned}$$

where we mention that here \(\Pi _{f_i}(g_i) = g_i\).
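
As a numerical sanity check of this case distinction, the following sketch (our own construction, reusing the row-wise array layout from the previous sketch; generic random data avoids the boundary case \(|f_i| = \alpha \)) compares the closed-form directional derivative with a one-sided difference quotient.

```python
import numpy as np

def isotropic_shrinkage(f, alpha):
    norms = np.linalg.norm(f, axis=1, keepdims=True)
    return np.maximum(1.0 - alpha / np.maximum(norms, 1e-15), 0.0) * f

def directional_derivative(f, g, alpha):
    """Closed-form d u_alpha(f; g) from the case distinction above (rows of f and g are f_i, g_i)."""
    out = np.zeros_like(f)
    for i, (fi, gi) in enumerate(zip(f, g)):
        ni = np.linalg.norm(fi)
        proj = (fi @ gi) / ni**2 * fi if ni > 0 else 0.0 * fi   # projection of g_i onto f_i
        if ni > alpha:
            out[i] = gi + alpha / ni * (proj - gi)
        elif ni == alpha and fi @ gi > 0:
            out[i] = proj
        # in all remaining cases the derivative vanishes
    return out

rng = np.random.default_rng(0)
f, g = rng.normal(size=(6, 3)), rng.normal(size=(6, 3))
alpha, t = 0.8, 1e-7
fd = (isotropic_shrinkage(f + t * g, alpha) - isotropic_shrinkage(f, alpha)) / t
print(np.max(np.abs(fd - directional_derivative(f, g, alpha))))  # small, of order t
```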

Model manifold: The corresponding (isotropic) model manifold is given by

$$\begin{aligned} u \in \mathcal {M}_f^{\mathrm {G}}\Leftrightarrow u_i = {\left\{ \begin{array}{ll} v \in \mathbb {R}^d, &{} |f_i| > \alpha , \\ 0, &{} |f_i| < \alpha , \\ \lambda f_i,\ \lambda \ge 0, &{} |f_i| = \alpha . \end{array}\right. } \end{aligned}$$

Analogously to the anisotropic case discussed in Example 1, the model manifold allows for arbitrary entries, here even with arbitrary direction, whenever the magnitude \(|f_i|\) of the signal lies strictly above the threshold parameter \(\alpha \). As already discussed in Example 1, \(|f_i| = \alpha \) is the odd case of the three, since in contrast to \(|f_i| > \alpha \) it only allows for changes in the direction of the signal \(f_i\). If we exclude that case, we again find a linear derivative, hence a Gâteaux derivative and even a Fréchet derivative. Accordingly, isotropic shrinkage is the immediate generalization of anisotropic shrinkage, which is recovered as the special case \(d = 1\).

Summing up, the debiasing procedure on this model manifold again yields a hard-thresholding solution:

$$\begin{aligned}{}[\hat{u}(f)]_i = {\left\{ \begin{array}{ll} f_i, &{} |f_i| \ge \alpha , \\ 0, &{} |f_i|< \alpha . \end{array}\right. } \end{aligned}$$

Note that we again keep the signal unchanged exactly at the threshold.
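
A minimal sketch of this debiasing step for the denoising case (the function name is our own choice; it simply implements the hard-thresholding rule above and is not taken from the paper):

```python
import numpy as np

def debias_isotropic(f, alpha):
    """Debiasing step for isotropic shrinkage denoising: minimizing the data fidelity
    over the model manifold keeps f_i whenever |f_i| >= alpha and zeroes it otherwise."""
    norms = np.linalg.norm(f, axis=1, keepdims=True)  # |f_i| per row
    return np.where(norms >= alpha, f, 0.0)
```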

1.2 Infimal Convolution of \(\ell ^1\) Bregman Distances

Theorem 5

Let \(\Gamma :\ell ^2(\mathbb {R}^n) \rightarrow \ell ^1(\mathbb {R}^m)\) be linear and bounded and \(J(u) = \Vert \Gamma u \Vert _{\ell ^1(\mathbb {R}^m)}\) for \(m,n \in \mathbb {N}\). Let further \(q_\alpha \in \partial \Vert \cdot \Vert _{\ell ^1(\mathbb {R}^m)} (\Gamma u_\alpha )\) such that \(p_\alpha = \Gamma ^* q_\alpha \). Then

$$\begin{aligned} \mathrm {ICB}_{\ell ^1(\mathbb {R}^m)}^{q_{\alpha }}(\Gamma u, \Gamma u_\alpha ) \le \mathrm {ICB}_J^{p_{\alpha }}(u, u_\alpha ). \end{aligned}$$

Proof

$$\begin{aligned}&\mathrm {ICB}_J^{p_{\alpha }}(u, u_\alpha ) \\&\quad = \inf _{z \in \ell ^2(\mathbb {R}^n)} ~ D_J^{p_\alpha }(u-z,u_\alpha ) + D_J^{-p_\alpha }(z,-u_\alpha ) \\&\quad = \inf _{z \in \ell ^2(\mathbb {R}^n)} ~ \Vert \Gamma (u-z) \Vert _{\ell ^1(\mathbb {R}^m)} - \langle p_\alpha ,u-z \rangle \\&\quad \quad + \Vert \Gamma z\Vert _{\ell ^1(\mathbb {R}^m)} + \langle p_\alpha ,z \rangle \\&\quad = \inf _{z \in \ell ^2(\mathbb {R}^n)} ~ \Vert \Gamma (u-z) \Vert _{\ell ^1(\mathbb {R}^m)} - \langle q_\alpha ,\Gamma (u-z) \rangle \\&\quad \quad + \Vert \Gamma z\Vert _{\ell ^1(\mathbb {R}^m)} + \langle q_\alpha ,\Gamma z \rangle \\&\quad = \inf _{\Gamma z \in \ell ^1(\mathbb {R}^m)} \Vert \Gamma (u-z) \Vert _{\ell ^1(\mathbb {R}^m)} - \langle q_\alpha ,\Gamma (u-z) \rangle \\&\quad \quad + \Vert \Gamma z\Vert _{\ell ^1(\mathbb {R}^m)} + \langle q_\alpha ,\Gamma z \rangle \\&\quad \ge \inf _{w \in \ell ^1(\mathbb {R}^m)} ~ \Vert \Gamma u - w \Vert _{\ell ^1(\mathbb {R}^m)} - \langle q_\alpha ,\Gamma u -w \rangle \\&\quad \quad + \Vert w\Vert _{\ell ^1(\mathbb {R}^m)} + \langle q_\alpha ,w \rangle \\&\quad = \inf _{w \in \ell ^1(\mathbb {R}^m)} ~ D_{\ell ^1(\mathbb {R}^m)}^{q_\alpha }(\Gamma u - w, \Gamma u_\alpha ) \\&\quad \quad + D_{\ell ^1(\mathbb {R}^m)}^{-q_\alpha }(w, -\Gamma u_\alpha ) \\&\quad = \mathrm {ICB}_{\ell ^1(\mathbb {R}^m)}^{q_{\alpha }}(\Gamma u, \Gamma u_\alpha ). \end{aligned}$$

\(\square \)

Note that we get equality for surjective \(\Gamma \) in Theorem 5.

Theorem 6

Let \(v,u \in \ell ^1(\mathbb {R}^m)\) and \(q \in \partial \Vert v\Vert _{\ell ^1(\mathbb {R}^m)}\). Then

$$\begin{aligned} ICB _{\ell ^1(\mathbb {R}^m)}^q(u,v) = \sum _{i \in \mathbb {N}} G(u_i, q_i) \end{aligned}$$

with \(G :\mathbb {R}^m \times \mathbb {R}^m \rightarrow \mathbb {R}\) defined as

$$\begin{aligned} G(u_i, q_i) = {\left\{ \begin{array}{ll} |u_i| (1 - |\cos (\varphi _i)| |q_i|), &{} |q_i| < |\cos (\varphi _i)|, \\ |u_i| | \sin (\varphi _i)| \sqrt{1 - |q_i|^2}, &{} |q_i| \ge |\cos (\varphi _i)|. \end{array}\right. } \end{aligned}$$

where \(\varphi _i\) denotes the angle between \(u_i\) and \(q_i\), i.e., \(\cos (\varphi _i) |u_i| |q_i| = u_i \cdot q_i\) with \(\varphi _i := 0\) for \(q_i = 0\) or \(u_i = 0\).
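
A direct transcription of \(G\) into NumPy may be useful for reference; the function name, the restriction to finitely many entries stored row-wise, and the assumption \(|q_i| \le 1\) are our own choices and not part of the theorem.

```python
import numpy as np

def icb_l1(u, q):
    """Sum of G(u_i, q_i) over the rows of the (N, m) arrays u and q,
    assuming |q_i| <= 1 for every row (q is an l1 subgradient)."""
    total = 0.0
    for ui, qi in zip(u, q):
        nu, nq = np.linalg.norm(ui), np.linalg.norm(qi)
        # phi_i := 0 whenever u_i or q_i vanishes
        cos = (ui @ qi) / (nu * nq) if nu > 0 and nq > 0 else 1.0
        if nq < abs(cos):
            total += nu * (1.0 - abs(cos) * nq)
        else:
            total += nu * np.sqrt(max(1.0 - cos**2, 0.0)) * np.sqrt(max(1.0 - nq**2, 0.0))
    return total
```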

Proof

Let

$$\begin{aligned} f_1(u)&= D_{\ell ^1(\mathbb {R}^m)}^q(u,v) = \Vert u\Vert _{\ell ^1(\mathbb {R}^m)} - \langle q,u \rangle , \\ f_2(u)&= D_{\ell ^1(\mathbb {R}^m)}^{-q}(u,-v) = \Vert u \Vert _{\ell ^1(\mathbb {R}^m)} + \langle q, u \rangle . \end{aligned}$$

Since \((f_1 \Box f_2)^* = f_1^* + f_2^*\) and by the definition of the biconjugate, we know that

$$\begin{aligned} f_1 \Box f_2 \ge (f_1^* + f_2^*)^*. \end{aligned}$$
(1) We shall first compute the right-hand side. We have

    $$\begin{aligned} f_1^*(w)&= \iota _{B^{\infty }(1)}(w +q), \\ f_2^*(w)&= \iota _{B^{\infty }(1)}(w -q), \end{aligned}$$

where \(\iota _{B^{\infty }(1)}\) denotes the characteristic function of the \(\ell ^{\infty }(\mathbb {R}^m)\)-ball

$$\begin{aligned} B^{\infty }(1) = \big \{ w \in \ell ^{\infty }(\mathbb {R}^m) ~|~ \Vert w \Vert _{\ell ^{\infty }(\mathbb {R}^m)} \le 1 \big \}. \end{aligned}$$

Thus

$$\begin{aligned}&(f_1^* + f_2^*)^*(u) = \sup _{w \in \ell ^{\infty }(\mathbb {R}^m) } \langle u, w \rangle \\&\text { s.t.} ~ \Vert w+q\Vert _{\ell ^{\infty }(\mathbb {R}^m)} \le 1, \Vert w-q\Vert _{\ell ^{\infty }(\mathbb {R}^m)} \le 1. \end{aligned}$$

Taking into account the specific form of these constraints, we can carry out the computation pointwise, i.e.,

$$\begin{aligned} \sup _{w_i \in \mathbb {R}^m } u_i \cdot w_i ~ \text { s.t.} ~ |w_i + q_i| \le 1, |w_i - q_i| \le 1. \end{aligned}$$

From now on we drop the dependence on i for simplicity.

  • Let us first consider the case \(|q| = 1\). We immediately deduce that \(w = 0\) and \(u \cdot w = 0\).

  • Hence we assume \(|q| < 1\) from now on, and set up the corresponding Lagrangian

    $$\begin{aligned} \mathcal {L}(w, \lambda , \mu ) =&- w \cdot u + \lambda (|w - q|^2 -1) \nonumber \\&+ \mu (|w + q|^2 -1). \end{aligned}$$
    (8.2)

Both the objective functional and the constraints are differentiable, so every optimal point of (8.2) has to fulfill the four Karush–Kuhn–Tucker conditions, namely

$$\begin{aligned}&\dfrac{\partial }{\partial w} \mathcal {L}(w, \lambda , \mu ) = 0, \quad&\lambda (|w -q|^2-1) = 0, \\&\lambda , \mu \ge 0,&\mu (|w + q|^2 -1) = 0, \end{aligned}$$

Slater’s condition implies the existence of Lagrange multipliers for a KKT-point of (8.2). The first KKT-condition yields

$$\begin{aligned} -u + 2 \lambda (w-q) + 2 \mu (w + q) = 0. \end{aligned}$$
(8.3)
  • Let us first remark that the case \(u=0\) causes the objective function to vanish anyway, hence in the following \(u\ne 0\).

  • Then let us address the case \(q=0\) in which (8.3) yields

    $$\begin{aligned} u = 2 (\lambda + \mu ) w. \end{aligned}$$

    In case \(|w| = 1\) we find that \(2 (\lambda + \mu ) = |u|\), hence \(w = \frac{u}{|u|}\). We infer

    $$\begin{aligned} w \cdot u = \frac{u \cdot u}{|u|} = |u|. \end{aligned}$$

    Note that for \(|w| < 1\), we find that \(\lambda = \mu = 0\) and hence \(u = 0\).

  • If \(q \ne 0\), we can distinguish four cases: 1st case: \(|w -q|^2 <1, |w +q|^2 = 1\). Thus \(\lambda = 0\) and (8.3) yields

    $$\begin{aligned} u = 2 \mu (w +q). \end{aligned}$$

Since \(|w + q|^2 = 1\), we deduce \(\mu = |u|/2\), so

$$\begin{aligned} w = \dfrac{u}{|u|} - q \end{aligned}$$

and finally for the value of the objective function

$$\begin{aligned} w \cdot u = \left( \dfrac{u}{|u|} - q \right) \cdot u = |u| - q \cdot u. \end{aligned}$$

2nd case: \(|w +q|^2 <1, |w -q|^2 = 1\).

We analogously find

$$\begin{aligned} w \cdot u = |u| + q\cdot u. \end{aligned}$$

The first two cases thus occur, as can be seen by inserting \(w\) into the constraints, whenever

$$\begin{aligned} \left| \dfrac{u}{|u|} - 2q \right|< 1 \text { or } \left| \dfrac{u}{|u|} + 2q \right| < 1. \end{aligned}$$

We calculate

$$\begin{aligned}&\quad \left| \dfrac{u}{|u|} - 2q \right| ^2< 1 \\&\Leftrightarrow |q|^2< q \cdot \dfrac{u}{|u|} \\&\Leftrightarrow |q| < \cos (\varphi ). \end{aligned}$$

Hence \(q\cdot u >0\) and

$$\begin{aligned} |u| - q \cdot u = |u| -|q \cdot u|. \end{aligned}$$

In the second case, we analogously find

$$\begin{aligned} |q| < -\cos (\varphi ), \end{aligned}$$

hence \(q \cdot u <0\) and

$$\begin{aligned} |u| + q \cdot u = |u| -|q \cdot u|, \end{aligned}$$

so we may summarize the first two cases as

$$\begin{aligned} w \cdot u = |u| - |q \cdot u| = |u| (1 - |\cos (\varphi )| |q|), \end{aligned}$$

whenever \(|q| < |\cos (\varphi )|\).

3rd case: \(|w -q|^2 =1, |w +q|^2 = 1\).

At first we observe that from

$$\begin{aligned} |w +q |^2 = |w -q|^2 \end{aligned}$$

we may deduce that \(w \cdot q = 0\). Therefore we have

$$\begin{aligned} |w +q|^2 = 1 \Rightarrow |w| = \sqrt{1-|q|^2}. \end{aligned}$$

We multiply the optimality condition (8.3) by q and obtain

$$\begin{aligned}&\quad u \cdot q = 2\lambda (w -q) \cdot q + 2\mu (w +q) \cdot q \\&\Leftrightarrow u \cdot q = 2(\mu - \lambda )~ |q|^2 \\&\Leftrightarrow (\mu - \lambda ) = \frac{u}{2} \cdot \dfrac{q}{|q|^2}. \end{aligned}$$

Multiplying (8.3) by w yields

$$\begin{aligned} u \cdot w = 2 (\lambda + \mu ) |w|^2 \end{aligned}$$

and another multiplication of (8.3) by u yields

$$\begin{aligned} |u|^2&= 2 (\lambda +\mu ) w \cdot u + 2 (\mu -\lambda ) q\cdot u \\&= 4 (\lambda +\mu )^2 |w|^2 + \left( u\cdot \dfrac{q}{|q|} \right) ^2, \end{aligned}$$

where we inserted the previous results in the last two steps. We rearrange and find

$$\begin{aligned} 2 (\lambda + \mu ) = \sqrt{ |u|^2 - \left( u \cdot \dfrac{q}{|q|} \right) ^2} |w|^{-1}. \end{aligned}$$

Note that \(|w| > 0\) since \(|q| < 1\). This finally leads us to

$$\begin{aligned} u \cdot w&= 2 (\lambda + \mu ) |w|^2 \\&= \sqrt{ |u|^2 - \left( u \cdot \dfrac{q}{|q|} \right) ^2} |w| \\&= |u| \sqrt{\left( 1 - \left( \dfrac{u}{|u|} \cdot \dfrac{q}{|q|} \right) ^2\right) \left( 1 - |q|^2\right) } \\&= |u| \sqrt{\left( 1 - |\cos (\varphi )|^2\right) \left( 1 - |q|^2\right) } \\&= |u| |\sin (\varphi )| \sqrt{\left( 1 - |q|^2\right) }. \end{aligned}$$

4th case: \(|w -q|^2< 1, |w +q|^2 < 1\).

Here the first KKT-condition yields \(u =0\), which can only occur if the objective function \(w\cdot u\) vanishes anyway. Summing up, we have

$$\begin{aligned} (f_1^* + f_2^*)^*(u) = \sum _{i \in \mathbb {N}} G(u_i,q_i) \le \Vert u\Vert _{\ell ^1(\mathbb {R}^m)}. \end{aligned}$$
(2) It remains to show that

    $$\begin{aligned} (f_1 \Box f_2)(u)&= \inf _{z \in \ell ^1(\mathbb {R}^m)} \sum _{i \in \mathbb {N}} g_i(z_i) \\&\le (f_1^* + f_2^*)^*(u), \end{aligned}$$

where

$$\begin{aligned} g_i(z_i) = |u_i - z_i| + |z_i| - q_i \cdot (u_i - 2z_i) \ge 0. \end{aligned}$$

Again we need to distinguish four cases.

1st case: If \(|q_i| < \cos (\varphi _i)\), we have \(q_i \cdot u_i > 0\) and we can choose \(z_i = 0\) to obtain

$$\begin{aligned} g_i(z_i) = |u_i| - q_i \cdot u_i = |u_i| - |q_i \cdot u_i|. \end{aligned}$$

2nd case: Analogously if \(|q_i| < -\cos (\varphi _i)\), we have \(q_i \cdot u_i < 0\) and choose \(z_i = u_i\), thus

$$\begin{aligned} g_i(z_i) = |u_i| + q_i \cdot u_i = |u_i| - |q_i \cdot u_i|. \end{aligned}$$

3rd case: If \(|q_i| = 1\), we compute for \(z_i = \frac{u_i}{2} - \frac{c}{2}q_i\) , \(c > 0\),

$$\begin{aligned} g_i(z_i)&= \left| \frac{u_i}{2} + \frac{c}{2} q_i \right| + \left| \frac{u_i}{2} - \frac{c}{2} q_i \right| - c|q_i|^2 \\&=\frac{c}{2} \left( \left| q_i + \frac{u_i}{c} \right| + \left| q_i - \frac{u_i}{c} \right| - 2 \right) . \end{aligned}$$

Using a Taylor expansion around \(q_i\) we obtain

$$\begin{aligned} \left| q_i + \frac{u_i}{c} \right|&= |q_i| + \frac{q_i}{|q_i|} \cdot \frac{u_i}{c} + O (c^{-2}), \\ \left| q_i - \frac{u_i}{c} \right|&= |q_i| - \frac{q_i}{|q_i|} \cdot \frac{u_i}{c} + O (c^{-2}). \end{aligned}$$

Hence with \(|q_i| = 1\) we find

$$\begin{aligned} g_i(z_i) = \frac{c}{2} (2 |q_i| + O(c^{-2}) - 2) = O(c^{-1}) \rightarrow 0 \end{aligned}$$

for \(c \rightarrow \infty \). Hence for every \(\varepsilon \) there exists a \(c_i > 0\) such that \(g_i(z_i) \le \varepsilon / 2^i\).

4th case: Finally, if \(|q_i| \ge |\cos (\varphi _i)|\) and \(|q_i| < 1\), we pick \(z_i = 2 \lambda _i (w_i - q_i)\), with \(\lambda _i\) and \(w_i\) being the Lagrange multiplier and the dual variable from the above computation of \((f_1^* + f_2^*)^*\). It is easy to see that

$$\begin{aligned} g_i(z_i) = |u_i| | \sin (\varphi _i)| \sqrt{1 - |q_i|^2}. \end{aligned}$$

Hence we define \(z := (z_i)_i\) such that

$$\begin{aligned} z_i = {\left\{ \begin{array}{ll} 0, &{} \text { if } |q_i|< \cos (\varphi _i),\\ u_i, &{} \text { if } |q_i|< -\cos (\varphi _i),\\ \frac{u_i}{2} - \frac{c_i}{2}q_i, &{} \text { if } |q_i| = 1, \\ 2 \lambda _i (w_i - q_i) &{} \text { if } |q_i| \ge |\cos (\varphi _i)|, \\ &{} |q_i| < 1. \end{array}\right. } \end{aligned}$$

Let \(z^N\) denote z truncated at index \(N \in \mathbb {N}\), i.e.,

$$\begin{aligned} z_i^N = {\left\{ \begin{array}{ll} z_i, &{} \text { if } i \le N, \\ 0, &{} \text { else.} \end{array}\right. } \end{aligned}$$

Then trivially \(z^N \in \ell ^1(\mathbb {R}^m)\) and we compute

$$\begin{aligned}&(f_1 \Box f_2)(u) \le \sum _{i \in \mathbb {N}} g_i(z_i^N) \\&\quad \le \sum _{i = 1}^N \big ( G(u_i, q_i) + \frac{\varepsilon }{2^i} \big ) + \sum _{i=N+1}^\infty g_i(0) \\&\quad = \sum _{i = 1}^\infty G(u_i, q_i) + \sum _{i = 1}^N \frac{\varepsilon }{2^i}\\&\quad \quad + \sum _{i=N+1}^\infty \big ( |u_i| - q_i \cdot u_i - G(u_i,q_i) \big ) \\&\quad \le \sum _{i = 1}^\infty G(u_i, q_i) + \sum _{i = 1}^N \frac{\varepsilon }{2^i} + 3 \sum _{i=N+1}^\infty |u_i| \\&\quad \rightarrow \sum _{i = 1}^\infty G(u_i, q_i) + \varepsilon \end{aligned}$$

as \(N \rightarrow \infty \). Since \(\varepsilon > 0\) was arbitrary, this completes the proof. \(\square \)
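
The closed form can also be cross-checked numerically. The sketch below (our own construction, for \(m = 2\) and a single index \(i\)) minimizes \(g_i(z_i) = |u_i - z_i| + |z_i| - q_i \cdot (u_i - 2 z_i)\) over a grid of candidate \(z_i\) and compares the minimum with \(G(u_i, q_i)\); the two values agree only up to the grid resolution.

```python
import numpy as np

def G(u, q):
    """Closed form from Theorem 6 for a single pair (u_i, q_i) in R^m."""
    nu, nq = np.linalg.norm(u), np.linalg.norm(q)
    cos = (u @ q) / (nu * nq) if nu > 0 and nq > 0 else 1.0
    if nq < abs(cos):
        return nu * (1.0 - abs(cos) * nq)
    return nu * np.sqrt(max(1.0 - cos**2, 0.0)) * np.sqrt(max(1.0 - nq**2, 0.0))

rng = np.random.default_rng(1)
u = rng.normal(size=2)
q = rng.normal(size=2)
q *= min(1.0, 0.9 / np.linalg.norm(q))        # enforce |q_i| <= 0.9 < 1

# Brute-force minimization of g_i over a grid of candidate z_i.
grid = np.linspace(-8.0, 8.0, 801)
Z = np.stack(np.meshgrid(grid, grid), axis=-1).reshape(-1, 2)
vals = (np.linalg.norm(u - Z, axis=1) + np.linalg.norm(Z, axis=1)
        - (u - 2.0 * Z) @ q)
print(vals.min(), G(u, q))                     # should agree up to the grid resolution
```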


About this article

Cite this article

Brinkmann, EM., Burger, M., Rasch, J. et al. Bias Reduction in Variational Regularization. J Math Imaging Vis 59, 534–566 (2017). https://doi.org/10.1007/s10851-017-0747-z
