Abstract
Many problems in science and engineering give rise to linear systems of equations that are commonly referred to as large-scale linear discrete ill-posed problems. These problems arise, for instance, from the discretization of Fredholm integral equations of the first kind. The matrices that define these problems are typically severely ill-conditioned and may be rank-deficient. Because of this, the solution of linear discrete ill-posed problems may not exist or be very sensitive to perturbations caused by errors in the available data. These difficulties can be reduced by applying Tikhonov regularization. We describe a novel “approximate Tikhonov regularization method” based on constructing a low-rank approximation of the matrix in the linear discrete ill-posed problem by carrying out a few steps of the Arnoldi process. The iterative method so defined is transpose-free. Our work is inspired by a scheme by Donatelli and Hanke, whose approximate Tikhonov regularization method seeks to approximate a severely ill-conditioned block-Toeplitz matrix with Toeplitz-blocks by a block-circulant matrix with circulant-blocks. Computed examples illustrate the performance of our proposed iterative regularization method.
1 Introduction
We are interested in computing approximate solutions to linear least-squares problems of the form
\(\min _{\boldsymbol {x}\in \mathbb {R}^{n}} \|\boldsymbol {A}\boldsymbol {x}-\boldsymbol {b}^{\delta }\|,\)   (1)
where \(\boldsymbol {A} \in \mathbb {R}^{n \times n}\) is a square severely ill-conditioned or rank-deficient matrix, whose singular values “cluster” at the origin. Throughout, we let ∥⋅∥ denote the Euclidean norm. Least-squares problems with a matrix of this kind are commonly referred to as linear discrete ill-posed problems. They arise, for instance, by discretizing Fredholm integral equations of the first kind; see [7, 12]. We focus on applications to image deblurring, but the proposed method can also be used in other applications.
The vector bδ in (1) often represents measurements that are corrupted by error; we denote this error by e so that
\(\boldsymbol {b}^{\delta } = \boldsymbol {b} + \boldsymbol {e},\)
where b is the unknown error-free vector associated with bδ. We would like to determine the solution of minimal Euclidean norm, \(\boldsymbol {x}^{\dagger }\), of the unavailable least-squares problem
\(\min _{\boldsymbol {x}\in \mathbb {R}^{n}} \|\boldsymbol {A}\boldsymbol {x}-\boldsymbol {b}\|.\)   (2)
Since bδ is contaminated by error and the singular values of the matrix A cluster at the origin, the solution of (1) of minimal Euclidean norm typically is a poor approximation of \(\boldsymbol {x}^{\dagger }\). For this reason, we replace the problem (1) by a nearby problem whose solution is less sensitive to the error e in bδ. This replacement is known as regularization. The most popular and well-understood regularization method is due to Tikhonov. The simplest form of Tikhonov regularization replaces the least-squares problem (1) by the penalized least-squares problem
\(\min _{\boldsymbol {x}\in \mathbb {R}^{n}} \bigl \{ \|\boldsymbol {A}\boldsymbol {x}-\boldsymbol {b}^{\delta }\|^{2} + \alpha \|\boldsymbol {x}\|^{2} \bigr \},\)   (3)
where α > 0 is a regularization parameter. The solution of (3) can be expressed as
\(\boldsymbol {x}^{(\alpha )} = (\boldsymbol {A}^{T}\boldsymbol {A}+\alpha \boldsymbol {I})^{-1}\boldsymbol {A}^{T}\boldsymbol {b}^{\delta },\)
where the superscript T denotes transposition. The regularization parameter α > 0 determines the sensitivity of the vector x(α) to the error in bδ, as well as the closeness of x(α) to the desired solution \(\boldsymbol {x}^{\dagger }\).
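To make this formula concrete, the following NumPy sketch computes x(α) from the SVD of A; the matrix, data, and noise level are synthetic placeholders (not the image-deblurring problems of Section 4), and the function name tikhonov_svd is ours.

```python
import numpy as np

def tikhonov_svd(A, b_delta, alpha):
    """Tikhonov solution x(alpha) = (A^T A + alpha I)^{-1} A^T b_delta via the SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Filter factors s_i/(s_i^2 + alpha) damp contributions from small singular values.
    return Vt.T @ ((s / (s**2 + alpha)) * (U.T @ b_delta))

# Synthetic ill-conditioned example (placeholder data only).
n = 200
A = np.vander(np.linspace(0, 1, n), n, increasing=True) / n   # rapidly decaying singular values
x_true = np.sin(np.linspace(0, np.pi, n))
b = A @ x_true
rng = np.random.default_rng(0)
b_delta = b + 1e-3 * np.linalg.norm(b) / np.sqrt(n) * rng.standard_normal(n)

for alpha in (1e-2, 1e-6, 1e-10):
    x_alpha = tikhonov_svd(A, b_delta, alpha)
    print(alpha, np.linalg.norm(x_alpha - x_true) / np.linalg.norm(x_true))
```

Varying α in the loop illustrates the trade-off described above: large values of α over-regularize, while very small values let the propagated data error dominate.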
There are many techniques for determining a suitable value of α, including generalized cross validation and the L-curve criterion; see [2, 8, 15, 16, 18] for discussions and illustrations. Another fairly common technique is the discrepancy principle. It requires knowledge of a fairly accurate error bound
\(\|\boldsymbol {e}\| \le \delta ,\)   (5)
as well as that the system (2) be consistent. The discrepancy principle prescribes that α > 0 be chosen so that
\(\|\boldsymbol {b}^{\delta }-\boldsymbol {A}\boldsymbol {x}^{(\alpha )}\| = \tau \delta ,\)
where τ > 1 is a user-specified constant that is independent of δ. It can be shown that \(\boldsymbol {x}^{(\alpha )}\rightarrow \boldsymbol {x}^{\dagger }\) as δ ↘ 0; see [7] for a proof in a Hilbert space setting.
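A minimal sketch of this parameter choice, under the assumption of a small dense matrix with full column rank, exploits the fact that the residual norm grows monotonically with α. The function discrepancy_alpha and the bracketing strategy below are ours and are not the parameter search used later in the paper.

```python
import numpy as np
from scipy.optimize import brentq

def discrepancy_alpha(A, b_delta, delta, tau=1.05):
    """Return alpha > 0 with ||A x(alpha) - b_delta|| = tau * delta."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b_delta
    # For a full-column-rank A, the Tikhonov residual norm is ||diag(alpha/(s_i^2+alpha)) U^T b_delta||.
    def gap(alpha):
        return np.linalg.norm((alpha / (s**2 + alpha)) * beta) - tau * delta
    lo, hi = 1e-16, 1.0
    while gap(hi) < 0:          # enlarge the bracket until the residual exceeds tau*delta
        hi *= 10.0
    return brentq(gap, lo, hi)  # assumes gap(lo) < 0, i.e., the level tau*delta is attainable
```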
We are interested in developing an iterative method for the computation of x(α) when A is a large matrix and matrix-vector products Aw for \(\boldsymbol {w}\in {\mathbb {R}}^{n}\) can be evaluated fairly inexpensively, but matrix-vector products ATw cannot. This situation arises, for instance, when A represents a discretization of an integral operator and Aw is evaluated by a multipole method. It also arises when A represents the Jacobian of a nonlinear problem; see [5]. We therefore focus on an iterative method that only requires matrix-vector product evaluations with A, but not with AT.
Let x(k) denote the approximate solution of the least-squares problem (1) determined at step k of an iterative method. Define the (unknown) error
\(\boldsymbol {e}^{(k)} = \boldsymbol {x}^{\dagger } - \boldsymbol {x}^{(k)}.\)
We may approximate e(k) by the solution h(k) of the error equation
\(\boldsymbol {A}\boldsymbol {h} = \boldsymbol {r}^{(k)} := \boldsymbol {b}^{\delta } - \boldsymbol {A}\boldsymbol {x}^{(k)}.\)   (6)
As the matrix A is ill-conditioned and the residual r(k) is contaminated by error, we determine an approximate solution of (6) by Tikhonov regularization,
\(\boldsymbol {h}^{(k)} = \arg \min _{\boldsymbol {h}\in \mathbb {R}^{n}}\bigl \{\|\boldsymbol {r}^{(k)}-\boldsymbol {A}\boldsymbol {h}\|^{2}+\alpha ^{(k)}\|\boldsymbol {h}\|^{2}\bigr \} = \boldsymbol {A}^{T}(\boldsymbol {A}\boldsymbol {A}^{T}+\alpha ^{(k)}\boldsymbol {I})^{-1}\boldsymbol {r}^{(k)}.\)   (7)
Then, generally, an improved approximation of \(\boldsymbol {x}^{\dagger }\) is given by
\(\boldsymbol {x}^{(k+1)} = \boldsymbol {x}^{(k)} + \boldsymbol {h}^{(k)}.\)
The iterations may be repeated and the resulting solution method is known as the iterated Tikhonov method; see, e.g., [2, 3, 11] for discussions of this method. These iterations may be terminated by the discrepancy principle, i.e., the iterative process is stopped once the kth iterate, x(k), satisfies
\(\|\boldsymbol {b}^{\delta }-\boldsymbol {A}\boldsymbol {x}^{(k)}\| \le \tau \delta .\)   (8)
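As orientation, the following NumPy sketch spells out a simple stationary version of this iteration, with a fixed regularization parameter and discrepancy-principle stopping; it does not include the nonstationary parameter choice discussed next, and all names are ours.

```python
import numpy as np

def iterated_tikhonov(A, b_delta, delta, alpha, tau=1.05, max_iter=50):
    """Iterated Tikhonov: h = A^T (A A^T + alpha I)^{-1} r, x <- x + h; stop when ||r|| <= tau*delta."""
    m, n = A.shape
    x = np.zeros(n)
    G = A @ A.T + alpha * np.eye(m)           # fixed here because alpha is kept constant
    for _ in range(max_iter):
        r = b_delta - A @ x
        if np.linalg.norm(r) <= tau * delta:  # discrepancy-principle stopping, cf. (8)
            break
        x = x + A.T @ np.linalg.solve(G, r)   # Tikhonov-regularized correction for the error equation (6)
    return x
```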
As the computed solution of (7) may be sensitive to the choice of the regularization parameter α(k) at step k, Donatelli and Hanke use a modified Levenberg-Marquardt iteration, first suggested in [10], to solve the error equation (6) up to a certain relative error, i.e., they solve
\(\|\boldsymbol {r}^{(k)}-\boldsymbol {A}\boldsymbol {h}^{(k)}\| = q^{(k)}\|\boldsymbol {r}^{(k)}\|,\)   (10)
where \(0 < q^{(k)} < 1\) and its lower bound q is a user-specified constant. This approach to determining the regularization parameter by a damped version of the discrepancy principle has proved to be useful when A has a structure that allows fast execution of each step of iterated Tikhonov regularization.
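To see why this damped discrepancy equation in α(k) is well behaved, note the following identity for a Tikhonov step of the form \(\boldsymbol {h}^{(k)}(\alpha ^{(k)}) = \boldsymbol {A}^{T}(\boldsymbol {A}\boldsymbol {A}^{T}+\alpha ^{(k)}\boldsymbol {I})^{-1}\boldsymbol {r}^{(k)}\); this is a standard computation, not a display reproduced from [6] or [10]:
\(\boldsymbol {r}^{(k)}-\boldsymbol {A}\boldsymbol {h}^{(k)}(\alpha ^{(k)}) = \bigl (\boldsymbol {I}-\boldsymbol {A}\boldsymbol {A}^{T}(\boldsymbol {A}\boldsymbol {A}^{T}+\alpha ^{(k)}\boldsymbol {I})^{-1}\bigr )\boldsymbol {r}^{(k)} = \alpha ^{(k)}\bigl (\boldsymbol {A}\boldsymbol {A}^{T}+\alpha ^{(k)}\boldsymbol {I}\bigr )^{-1}\boldsymbol {r}^{(k)}.\)
Hence the left-hand side of (10) increases monotonically with α(k), from the norm of the component of r(k) outside the range of A up to ∥r(k)∥, so the level q(k)∥r(k)∥ with q(k) < 1 can be matched by a Newton-type search whenever it is attainable.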
A major drawback of the iterative method (7) is that if A is large, then a large system of linear equations with the matrix \(\boldsymbol {A}\boldsymbol {A}^{T} + \alpha ^{(k)}\boldsymbol {I}\) has to be solved in each iteration. Moreover, the transpose of A may be unavailable or expensive to compute. In [6], Donatelli and Hanke address these issues by replacing A in (7) by a matrix C, whose structure allows for fast computation and transposition. The matrix \(\boldsymbol {C}^{T}(\boldsymbol {C}\boldsymbol {C}^{T} + \alpha ^{(k)}\boldsymbol {I})^{-1}\) so obtained is referred to as a preconditioner; see [11]. The algorithm in [6], termed the approximated iterated Tikhonov (AIT) method, was demonstrated to give accurate image reconstructions at a reasonable computational cost when A has dominant block-Toeplitz with Toeplitz-blocks (BTTB) substructure and C is a suitably chosen block-circulant with circulant-blocks (BCCB) matrix. Analyses of the AIT method are provided in [4, 6]. Under the assumption that A and C are spectrally equivalent (see below), it is shown in [6] that the AIT method is a regularization method in a well-defined sense. However, oftentimes in applications, the spectral equivalence assumption is not satisfied; this issue is addressed in [4]. Irrespective of this, the AIT method provides fast and reliable image reconstructions when the available image, represented by the vector bδ, is contaminated by a sufficient amount of noise.
In this paper, we allow \(\boldsymbol {A}\in {\mathbb {R}}^{n\times n}\) to be a fairly general large matrix whose singular values “cluster” at the origin, and let the matrix C be a low-rank matrix that is constructed by applying a few steps of the Arnoldi process to the matrix A. We will refer to the iterative method so obtained as the iterated Arnoldi-Tikhonov (IAT) method. Our analysis of this method is closely related to the analysis of the AIT method in [6]. The main advantage of our IAT method, when compared to the AIT method, is that we do not require A to have a particular structure.
The rest of this paper is structured as follows: Section 2 describes the AIT method by Donatelli and Hanke [6] and reviews its properties. In Section 3 we introduce the IAT method and present its analysis. Section 4 provides a brief background on image deblurring problems and reports some numerical examples. We conclude in Section 5 with some remarks.
2 The approximated iterated Tikhonov method
This section reviews the AIT method and the theoretical results shown by Donatelli and Hanke [6]. A modification of this method is discussed in [4]; we will not consider this modification in the present paper. As mentioned above, the AIT method is a regularization method in a well-defined sense under the assumption that A and C are spectrally equivalent; this condition is stated as follows:
Assumption 1
(Spectral Equivalence) The matrices \(\boldsymbol {A},\boldsymbol {C}\in \mathbb {R}^{m\times n}\) are said to be spectrally equivalent if for some constant 0 < ρ < 1/2 it holds that
\(\|(\boldsymbol {C}-\boldsymbol {A})\boldsymbol {z}\| \le \rho \|\boldsymbol {A}\boldsymbol {z}\| \quad \text {for all } \boldsymbol {z}\in \mathbb {R}^{n}.\)   (11)
Note that condition (11) implies that \(\mathcal {N}(\boldsymbol {A})\subseteq \mathcal {N}(\boldsymbol {C})\). The following algorithm describes the AIT method.
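Algorithm 1 is displayed in the original article and is not reproduced here. Purely as orientation, the following Python sketch shows the AIT structure as we read it from [6] and the surrounding text; the operator names (apply_A, apply_C, apply_Ct, solve_CCt_plus) and the simple bracketing search for α(k) are ours, and details of the actual algorithm may differ.

```python
import numpy as np
from scipy.optimize import brentq

def ait(apply_A, apply_C, apply_Ct, solve_CCt_plus, b_delta, delta, x0,
        rho=1e-3, q=0.7, max_iter=100):
    """Schematic AIT iteration: x <- x + C^T (C C^T + alpha I)^{-1} r, with alpha chosen from a
    damped discrepancy condition; the four operators are user-supplied (fast for BCCB C)."""
    tau = (1 + rho) / (1 - 2 * rho)
    x = x0.copy()
    r = b_delta - apply_A(x)
    for _ in range(max_iter):
        nr = np.linalg.norm(r)
        if nr <= tau * delta:                           # discrepancy-principle stopping
            break
        qk = max(q, 2 * rho + (1 + rho) * delta / nr)   # q^(k) = max{q, 2*rho + (1+rho)/tau^(k)}
        def gap(alpha):                                 # damped discrepancy as a function of alpha
            h = apply_Ct(solve_CCt_plus(r, alpha))
            return np.linalg.norm(r - apply_C(h)) - qk * nr
        hi = 1.0
        while gap(hi) < 0:
            hi *= 10.0
        alpha = brentq(gap, 1e-16, hi)                  # assumes the level qk*nr is attainable
        x = x + apply_Ct(solve_CCt_plus(r, alpha))
        r = b_delta - apply_A(x)
    return x
```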
We review the essential results by Donatelli and Hanke [6] regarding the AIT method without proofs.
Proposition 1.
Let Assumption 1 hold for some 0 < ρ < 1/2, and let τ∗ = (1 + ρ)/(1 − 2ρ). Then, if τ(k) = ∥r(k)∥/δ > τ∗, it follows that
where x(k) is the kth iterate generated by the AIT method, \(\boldsymbol {e}^{(k)}=\boldsymbol {x}^{\dagger }-\boldsymbol {x}^{(k)}\), and r(k) = bδ −Ax(k).
Proposition 2.
Let Assumption 1 hold. Then the norm of the error, e(k), in the kth iterate, x(k), generated by Algorithm 1 decreases monotonically as k increases,
as long as ∥r(k)∥ > τδ, where r(k) = bδ −Ax(k).
Corollary 1.
Under the assumptions of Proposition 2, let kδ denote the index of the last iterate determined by Algorithm 1. Then
for some constant c > 0 that only depends on ρ and q.
Theorem 1.
Assume that δ = 0 and that x(0) is not a solution of (2). Then the sequence of iterates x(k), \(k = 0,1,2, \dots\), generated by Algorithm 1 converges as \(k \rightarrow \infty\) to the solution of (1) that is closest to x(0).
The following final result states that the AIT algorithm is an iterative regularization method.
Theorem 2.
Let Assumption 1 be valid for some 0 < ρ < 1/2, and let δ↦bδ be a function such that (5) holds for all δ > 0. For fixed parameters τ and q, let xδ denote the approximate solution computed by Algorithm 1. Then, as \(\delta \rightarrow 0\), xδ converges to the solution of (1) that is closest to x(0).
3 The iterated Arnoldi-Tikhonov method
We now turn our attention to the IAT method and its relation to the AIT method. The use of a BCCB matrix C to approximate A in [6] was justified by the ease of computation of matrix-vector products as well as inversion with such matrices; see Section 4. In a similar light, we utilize the Arnoldi process and the orthogonality of the vectors generated by this process to simplify the computations.
The pth step of the Arnoldi process applied to the matrix \(\boldsymbol {A} \in \mathbb {R}^{n \times n}\) with initial vector bδ generically yields the decomposition
\(\boldsymbol {A}\boldsymbol {V}_{p} = \boldsymbol {V}_{p+1}\boldsymbol {H}_{p+1,p},\)   (12)
where the matrix \(\boldsymbol {V}_{p+1}=[\boldsymbol {v}_{1},\boldsymbol {v}_{2},\ldots ,\boldsymbol {v}_{p+1}]\in \mathbb {R}^{n \times (p+1)}\) has orthonormal columns vj for \(j=1,2,\dots ,p+1\) with \(\boldsymbol {v}_{1}=\boldsymbol {b}^{\boldsymbol {\delta }}/\|\boldsymbol {b}^{\boldsymbol {\delta }}\|\), and \(\boldsymbol {V}_{p} \in \mathbb {R}^{n\times p}\) is made up of the first p columns of \(\boldsymbol {V}_{p+1}\). Further, \(\boldsymbol {H}_{p+1,p}\in \mathbb {R}^{(p+1)\times p}\) is of upper Hessenberg form. The range of Vp is the Krylov subspace
\(\mathbb {K}_{p} = \mathbb {K}_{p}(\boldsymbol {A},\boldsymbol {b}^{\delta }) := \operatorname {span}\{\boldsymbol {b}^{\delta }, \boldsymbol {A}\boldsymbol {b}^{\delta }, \ldots , \boldsymbol {A}^{p-1}\boldsymbol {b}^{\delta }\};\)
see, e.g., Saad [19] for details on the Arnoldi process.
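For reference, a compact NumPy implementation of the Arnoldi process producing V_{p+1} and H_{p+1,p} in (12); breakdown handling and the reorthogonalization mentioned later in this section are omitted, and the function name arnoldi is ours.

```python
import numpy as np

def arnoldi(apply_A, b, p):
    """p steps of the Arnoldi process: A V_p = V_{p+1} H_{p+1,p} with v_1 = b/||b||."""
    n = b.shape[0]
    V = np.zeros((n, p + 1))
    H = np.zeros((p + 1, p))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(p):
        w = apply_A(V[:, j])
        for i in range(j + 1):                 # modified Gram-Schmidt orthogonalization
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]          # assumes no breakdown, i.e., H[j+1, j] > 0
    return V, H
```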
At step p we utilize the Arnoldi decomposition (12) to approximate A. Because of this, we seek the projected solution of \(\boldsymbol {x}^{\dagger }\) in the Krylov subspace \(\mathbb {K}_{p}\) using the column space of Vp. We will denote this projected solution in \(\mathbb {K}_{p}\) by \(\boldsymbol {x}^{\dagger }_{p}\). The iterated Tikhonov method may be simplified by substituting (12) into (7) to obtain
as an improved approximation to \(\boldsymbol {x}_{p}^{\dagger }\). Since we may select a different regularization parameter \(\alpha ^{(k)}_{p}\) at each iteration, we refer to the iterative method (13) as a nonstationary preconditioned iterative method, in analogy with the terminology used in [6].
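Written out under our reading of this substitution (the article's display (13) may be stated in slightly different notation), one step of the preconditioned iteration takes the form
\(\boldsymbol {x}^{(k+1)}_{p} = \boldsymbol {x}^{(k)}_{p} + \boldsymbol {V}_{p}\boldsymbol {H}_{p+1,p}^{T}\bigl (\boldsymbol {H}_{p+1,p}\boldsymbol {H}_{p+1,p}^{T}+\alpha ^{(k)}_{p}\boldsymbol {I}\bigr )^{-1}\boldsymbol {V}_{p+1}^{T}\boldsymbol {r}^{(k)}_{p}, \qquad \boldsymbol {r}^{(k)}_{p} = \boldsymbol {b}^{\delta }-\boldsymbol {A}\boldsymbol {x}^{(k)}_{p},\)
so that, apart from the matrix-vector products with A required to build the Arnoldi decomposition, each step only involves the small matrix \(\boldsymbol {H}_{p+1,p}\) and the orthonormal basis \(\boldsymbol {V}_{p+1}\).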
The following theorem connects Assumption 1 of Section 2 with our proposed method.
Theorem 3.
Let the matrices A and
\(\boldsymbol {A}_{p} := \boldsymbol {V}_{p+1}\boldsymbol {H}_{p+1,p}\boldsymbol {V}_{p}^{T}\)   (14)
be nonvanishing. Then A and Ap are spectrally equivalent for any \(\rho \in \mathbb {R}\) and for all \(\boldsymbol {z} \in \mathbb {K}_{p}\).
Proof
Define \(\boldsymbol {z} = \boldsymbol {V}_{p}\boldsymbol {x}\) for any fixed \(\boldsymbol {x} \in \mathbb {R}^{p}\). We note that, for any such \(\boldsymbol {z} \in \mathbb {K}_{p}\),
\((\boldsymbol {A}_{p}-\boldsymbol {A})\boldsymbol {z} = \boldsymbol {V}_{p+1}\boldsymbol {H}_{p+1,p}\boldsymbol {V}_{p}^{T}\boldsymbol {V}_{p}\boldsymbol {x} - \boldsymbol {A}\boldsymbol {V}_{p}\boldsymbol {x} = \boldsymbol {V}_{p+1}\boldsymbol {H}_{p+1,p}\boldsymbol {x} - \boldsymbol {V}_{p+1}\boldsymbol {H}_{p+1,p}\boldsymbol {x} = \boldsymbol {0},\)
where we used \(\boldsymbol {V}_{p}^{T}\boldsymbol {V}_{p}=\boldsymbol {I}\) and the decomposition (12). In this case Assumption 1 is satisfied trivially for any \(\rho \in \mathbb {R}\).
□
We remark that Theorem 3 only holds when the Arnoldi decomposition (12) is determined in exact arithmetic. In particular, the columns of the matrix Vp+ 1 are required to be exactly orthonormal. This requirement is typically not satisfied when finite-precision floating-point arithmetic is used as in the computed examples reported in Section 4. We have not experienced any difficulty due to this issue. If the error e is of very small norm and, therefore, many steps of the Arnoldi process will be carried out, then it may be advantageous to implement this process with reorthogonalization. This has not been necessary for the computed examples reported in this paper.
Theorem 3 allows us to replace the BCCB matrix C from Section 2 by the matrix (14) determined by p steps of the Arnoldi process because it satisfies Assumption 1 for any \(\rho \in \mathbb {R}\). To utilize the general algorithmic structure of the AIT method and its theory, we will require \(\rho \in (0,\frac {1}{2})\) as in [6].
To determine an apt regularization parameter at step k of the iterative process, we now introduce a result adapted from [4, Lemma 1].
Proposition 3.
Let x(k) live in the Krylov subspace \(\mathbb {K}_{p}\), and assume that the residual vector r(k) = bδ −Ax(k) and the matrix A are nonvanishing. Then by Theorem 3, the equation
has a unique solution for \(0<\alpha ^{(k)}<\infty\), when 0 < q(k) < 1 is close enough to unity, and the elements of r(k) satisfy a pair of conditions specified in the proof.
Proof
We argue first that q(k) > 0. Assume that q(k) = 0 and α(k) > 0. Then we have from (15)
This equation only holds if α(k)r(k) = 0. Thus, we have that q(k) > 0.
Before proceeding, we simplify (15). Using the Arnoldi decomposition (12), we may rewrite the right-hand side of (15) as
Multiply the vector inside the norm by the matrix \(\boldsymbol {V}_{p+1}^{T}\) from the left. This gives
Define the singular value decomposition (SVD) \(\boldsymbol {H}_{p+1,p}=\boldsymbol {U{\Sigma } W}^{T}\), where the matrices \(\boldsymbol {U}\in \mathbb {R}^{(p+1)\times (p+1)}\) and \(\boldsymbol {W}\in \mathbb {R}^{p\times p}\) are orthogonal, and \(\boldsymbol {\Sigma }\in \mathbb {R}^{(p+1)\times p}\) is diagonal; the diagonal entries are the singular values of \(\boldsymbol {H}_{p+1,p}\) in non-increasing order. Using the SVD of \(\boldsymbol {H}_{p+1,p}\) and defining \(\boldsymbol {\tilde {r}}^{(k)} := \boldsymbol {V}_{p+1}^{T}\boldsymbol {r}^{(k)}\), the expression (16) simplifies to
Let \(\boldsymbol {\bar {\Sigma }}:=\boldsymbol {\Sigma }\boldsymbol {\Sigma }^{T}\in \mathbb {R}^{(p+1)\times (p+1)}\), whose first p diagonal elements \(\bar {\sigma }_{j}\), 1 ≤ j ≤ p, are the squared singular values of \(\boldsymbol {H}_{p+1,p}\) and the last diagonal entry \(\bar {\sigma }_{p+1}\) vanishes. Finally, defining \(\boldsymbol {\hat {r}}^{(k)} := \boldsymbol {U}^{T}\boldsymbol {\tilde {r}}^{(k)}\) and using orthogonality, we arrive at the simplified form of (15):
Squaring both sides of (17) gives
where \(\hat {r}^{(k)}_{j}\) denotes the jth entry of the vector \(\boldsymbol {\hat {r}}^{(k)}\).
Introduce the function
Then we have
and the derivative satisfies \(\phi ^{\prime }(0)=0\) and \(\phi ^{\prime } \big (\alpha ^{(k)}\big )>0\) for α(k) > 0. Thus, the function ϕ(α(k)) is monotonically increasing for α(k) > 0. It follows that (18) has a unique finite solution if and only if the following pair of conditions are satisfied
The bottom inequality holds when 0 < q(k) < 1 and the top inequality holds when q(k) is large enough and p is sufficiently large (see Remark 2 below). □
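For orientation, and under our reading of the displays (15)–(19), which are not reproduced here: the left-hand side of the squared equation (18) increases monotonically with α(k) between the limits
\(\bigl (\hat {r}^{(k)}_{p+1}\bigr )^{2}\) (as \(\alpha ^{(k)}\rightarrow 0\)) and \({\sum }_{j=1}^{p+1}\bigl (\hat {r}^{(k)}_{j}\bigr )^{2} = \|\boldsymbol {r}^{(k)}\|^{2}\) (as \(\alpha ^{(k)}\rightarrow \infty\)),
so that a unique positive solution exists precisely when
\(\bigl (\hat {r}^{(k)}_{p+1}\bigr )^{2} \le \bigl (q^{(k)}\bigr )^{2}\|\boldsymbol {r}^{(k)}\|^{2} < \|\boldsymbol {r}^{(k)}\|^{2}.\)
The first inequality corresponds to the “top” condition of (19) discussed in Remark 2 below, and the second to the “bottom” condition, which holds since \(0 < q^{(k)} < 1\).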
Remark 1
For the constants q and q(k) that occur in both Algorithms 1 and 2, we note that by construction q ≤ q(k). The conditions in Proposition 3 on q(k) can then be satisfied by choosing q large enough. In numerical experiments reported in [4, 6], the value q = 0.7 worked well. We will use this value in the examples reported in the next section. The computed results are not sensitive to small changes in the q-value; however, we note that using a value very close to one results in an increased number of iterations, while a value very close to zero gives fewer iterations and computed reconstructions of worse quality.
Remark 2
In computations, the top condition of (19) usually fails to be satisfied after a single Tikhonov step has been carried out in \(\mathbb {K}_{p}\), because the first p entries of the residual vector \((\boldsymbol {\hat {r}}^{(k)})^{2}\) are small compared to the (p + 1)st entry. When this situation occurs, a positive regularization parameter with which to carry out the next Tikhonov step in \(\mathbb {K}_{p}\) cannot be found using the Levenberg-Marquardt iteration. To alleviate this issue, one need only expand the Krylov space by performing an additional step of the Arnoldi process. By doing so, the entries of \((\boldsymbol {\hat {r}}^{(k)})^{2}\) become larger in magnitude. This process is repeated as necessary until the top inequality of (19) holds. We demonstrate in our numerical experiments that it is typically unnecessary to carry out multiple expansions in early Tikhonov steps of the IAT method. We visualize and comment further on this issue in Section 4.
We now present the IAT algorithm.
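Algorithm 2 is given as a display in the published article and is not reproduced here. Purely as a sketch of its overall structure (the names, the initial subspace dimension, and the bracketing search for \(\alpha ^{(k)}_{p}\) are our own choices, and details may differ from the authors' algorithm), the interplay of Arnoldi expansions and projected Tikhonov steps can be organized as follows:

```python
import numpy as np
from scipy.optimize import brentq

def iat(apply_A, b_delta, delta, rho=1e-3, q=0.7, max_steps=200, max_arnoldi=300):
    """Schematic iterated Arnoldi-Tikhonov (IAT) iteration; transpose-free."""
    n = b_delta.shape[0]
    tau = (1 + rho) / (1 - 2 * rho)              # stopping threshold, cf. Proposition 4
    V = np.zeros((n, max_arnoldi + 1))
    H = np.zeros((max_arnoldi + 1, max_arnoldi))
    V[:, 0] = b_delta / np.linalg.norm(b_delta)
    p = 0                                        # current dimension of the Krylov subspace K_p

    def expand():                                # one Arnoldi step; assumes no breakdown
        nonlocal p
        w = apply_A(V[:, p])
        for i in range(p + 1):
            H[i, p] = V[:, i] @ w
            w = w - H[i, p] * V[:, i]
        H[p + 1, p] = np.linalg.norm(w)
        V[:, p + 1] = w / H[p + 1, p]
        p += 1

    x = np.zeros(n)
    r = b_delta.copy()                           # residual for the initial guess x^(0) = 0
    expand()
    for _ in range(max_steps):
        nr = np.linalg.norm(r)
        if nr <= tau * delta or p >= max_arnoldi:
            break
        qk = max(q, 2 * rho + (1 + rho) * delta / nr)   # q^(k) = max{q, 2*rho + (1+rho)/tau^(k)}
        Hp = H[: p + 1, : p]
        rt = V[:, : p + 1].T @ r                 # coordinates of r, which lies in range(V_{p+1})
        # Smallest attainable damped residual (alpha -> 0): the part of r unreachable from A*K_p.
        floor = np.linalg.norm(rt - Hp @ np.linalg.lstsq(Hp, rt, rcond=None)[0])
        if floor > qk * nr:                      # Remark 2: no admissible alpha, so enlarge K_p
            expand()
            continue
        def gap(alpha):                          # damped discrepancy in the projected space
            y = Hp.T @ np.linalg.solve(Hp @ Hp.T + alpha * np.eye(p + 1), rt)
            return np.linalg.norm(rt - Hp @ y) - qk * nr
        hi = 1.0
        while gap(hi) < 0:
            hi *= 10.0
        alpha = brentq(gap, 1e-16, hi)           # assumes gap(1e-16) < 0 <= gap(hi)
        y = Hp.T @ np.linalg.solve(Hp @ Hp.T + alpha * np.eye(p + 1), rt)
        x = x + V[:, :p] @ y                     # Tikhonov step taken in K_p
        r = r - V[:, : p + 1] @ (Hp @ y)         # residual update of Remark 3 (no product with A)
    return x
```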
Remark 3
With regard to computational efficiency, we note that the computation of the (k + 1)st residual vector \(\boldsymbol {r}^{(k+1)}_{p}\) in Algorithm 2 does not require an additional matrix-vector product with A. Instead, this vector may be computed as follows:
\(\boldsymbol {r}^{(k+1)}_{p} = \boldsymbol {b}^{\delta } - \boldsymbol {A}\boldsymbol {x}^{(k+1)}_{p} = \boldsymbol {r}^{(k)}_{p} - \boldsymbol {A}\boldsymbol {h}^{(k)}_{p} = \boldsymbol {r}^{(k)}_{p} - \boldsymbol {V}_{p+1}\boldsymbol {H}_{p+1,p}\boldsymbol {V}_{p}^{T}\boldsymbol {h}^{(k)}_{p},\)
where in the last equality we have used the Arnoldi decomposition (12) to avoid the evaluation of further matrix-vector products with the matrix A.
The following results are analogous to those at the end of Section 2; most of them can be found in [6].
Proposition 4.
For some 0 < ρ < 1/2, let τ∗ = (1 + ρ)/(1 − 2ρ). Then, by Theorem 3, if \(\tau ^{(k)}_{p} = \|\boldsymbol {r}^{(k)}_{p}\|/\delta > \tau _{\ast }\), it follows that
where \(\boldsymbol {x}^{(k)}_{p}\) is the kth iterate generated by the outer pth loop of Algorithm 2, \(\boldsymbol {e}^{(k)}_{p} = \boldsymbol {x}_{p}^{\dagger } - \boldsymbol {x}^{(k)}_{p}\), and \(\boldsymbol {r}^{(k)}_{p} = \boldsymbol {b}^{\boldsymbol {\delta }} - \boldsymbol {A}\boldsymbol {x}^{(k)}_{p}\).
Proposition 5.
Under the results and assumptions of Theorem 3, the norm of the iteration error \(\boldsymbol {e}^{(k)}_{p} = \boldsymbol {x}_{p}^{\dagger } - \boldsymbol {x}^{(k)}_{p}\) in the subspace \(\mathbb {K}_{p}\) decreases monotonically for \(k=0,1,2,\dots ,k^{\delta }_{p}-1\):
provided that \(\|\boldsymbol {r}^{(k)}_{p}\|>\tau \delta\).
Corollary 2.
Under the assumptions and notation of Proposition 5, and for sufficiently large p, there exists a first iterate \(\boldsymbol {x}_{p}^{(k)}\) with \(k=k^{\delta }_{p}\) such that
for some constant c > 0 that only depends on the parameter ρ and the value of q used in Algorithm 2.
Theorem 4.
Assume that δ = 0 and that x(0) is not a solution of (2). Then for sufficiently large p, the sequence of iterates \(\boldsymbol {x}^{(k)}_{p}\), \(k = 1,2, \dots\), generated by the outer pth loop of Algorithm 2 converges as \(k \rightarrow \infty\) to a solution of (1) that is closest to \(\boldsymbol {x}^{(0)}_{p}\).
Proof
If δ = 0 then the stopping criterion (8) can only be satisfied with \(k = k_{p}^{\delta }\) for a solution \(\boldsymbol {x}_{p}^{(k)}\) of (1). Here, if k > 0 then \(\boldsymbol {h}_{p}^{(k-1)}\) must coincide with \(\boldsymbol {e}_{p}^{(k-1)}\) up to an element in the null space of A, that is, in the null space of Ap by Theorem 3. Accordingly, it follows from (10) with the pth Arnoldi approximation of A substituted and Proposition 4 that
However, this contradicts the definition of \(q^{(k-1)}_{p}:= \max \limits \{q,2\rho + (1+\rho )/\tau ^{(k-1)}_{p}\}\) that is used in Algorithms 1 and 2. Thus, the iteration does not terminate for exact data unless \(\boldsymbol {x}^{(0)}_{p}\) is already a solution of (1). An analogous argument to that in [6] can be used to show that for p sufficiently large each sequence \(\{\boldsymbol {x}_{p}^{(k)}\}\) is a Cauchy sequence that converges as \(k \rightarrow \infty\) to a solution of (1) that is closest to \(\boldsymbol {x}^{(0)}_{p}\). □
Theorem 5.
By Theorem 3 and for some 0 < ρ < 1/2, let δ↦bδ be a function such that (5) holds for all δ > 0. For fixed parameters τ and q, let \(\boldsymbol {x}_{p}^{\delta }\) denote the approximate solution generated by the outer pth loop of Algorithm 2. Then, as \(\delta \rightarrow 0\) and for sufficiently large p, \(\boldsymbol {x}_{p}^{\delta }\) converges to a solution of (1) that is closest to \(\boldsymbol {x}^{(0)}_{p}\).
A variant of Theorem 5 has been used previously in [7, 10]. Necessary results include the monotonicity property of the iterates and their continuous dependence on bδ.
4 Numerical results
We illustrate the properties of the IAT method and compare its performance to the AIT and range-restricted generalized minimum residual (rrGMRES) methods when applied to image deblurring. The latter method is described in [17]. Both spatially variant and spatially invariant blurs are considered. Both kinds of blur can be modeled by a Fredholm integral equation of the first kind,
\(b(u,v) = {\int }_{\Omega } \kappa (u,s,v,t)\, x(s,t)\, ds\, dt,\)   (22)
where b represents the blurred image, κ the point spread function (PSF), and Ω is the domain of the exact image represented by x. When the blur is spatially invariant, the kernel κ in the integral equation above is given by κ(u,s,v,t) = κ(u − s,v − t). Upon discretization of (22) we have a problem of the form (1), where the structure of \(\boldsymbol {A} \in \mathbb {R}^{n \times n}\) depends on the properties of κ and on the boundary conditions (BC); see [1, 13]. For common BCs, including zero, periodic, and reflexive, the matrix A may be decomposed as
\(\boldsymbol {A} = \boldsymbol {T} + \boldsymbol {R} + \boldsymbol {E},\)
where T is a BTTB matrix, R is a matrix of low-rank, and E is a matrix of small norm. The approximation of A by a BCCB matrix is attractive, because the structure of C allows diagonalization by the discrete Fourier transform. Therefore, the vector \((\boldsymbol {CC}^{T} + \alpha _{p}\boldsymbol {I})^{-1}\boldsymbol {r}_{p}\) can be computed in \(O(n \log n)\) floating-point operations using a fast Fourier transform algorithm.
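To make the cost statement concrete, the following NumPy sketch applies \(\boldsymbol {C}^{T}(\boldsymbol {C}\boldsymbol {C}^{T}+\alpha _{p}\boldsymbol {I})^{-1}\) for a BCCB matrix C defined by a given PSF using two-dimensional FFTs; the PSF, image size, and function name are placeholders of ours.

```python
import numpy as np

def apply_bccb_preconditioner(psf, shape, alpha, r):
    """Apply C^T (C C^T + alpha I)^{-1} to the image r, where C is the BCCB matrix built from psf."""
    # Eigenvalues of a BCCB matrix: 2-D DFT of the PSF after shifting its center to the (0, 0) pixel.
    pad = np.zeros(shape)
    pad[: psf.shape[0], : psf.shape[1]] = psf
    lam = np.fft.fft2(np.roll(pad, (-(psf.shape[0] // 2), -(psf.shape[1] // 2)), axis=(0, 1)))
    R = np.fft.fft2(r)
    # In the Fourier domain, C^T (C C^T + alpha I)^{-1} acts as conj(lam) / (|lam|^2 + alpha).
    return np.fft.ifft2(np.conj(lam) / (np.abs(lam) ** 2 + alpha) * R).real

# Example with placeholder data (a separable smooth PSF and a random "residual" image).
psf = np.outer(np.hanning(9), np.hanning(9))
psf = psf / psf.sum()
r = np.random.default_rng(0).standard_normal((256, 256))
h = apply_bccb_preconditioner(psf, r.shape, alpha=1e-2, r=r)
```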
To evaluate the quality of the reconstructed images, we compute the relative reconstructive error (RRE), which is defined by
\(\text {RRE}\bigl (\boldsymbol {x}^{(k)}_{p}\bigr ) = \frac {\|\boldsymbol {x}^{(k)}_{p}-\boldsymbol {x}^{\dagger }\|}{\|\boldsymbol {x}^{\dagger }\|},\)
where \(\boldsymbol {x}^{(k)}_{p}\) denotes the computed approximate solution obtained when the discrepancy principle (8) is satisfied, and \(\boldsymbol {x}^{\dagger }\) represents the exact solution. We will refer to \(\boldsymbol {x}^{(k)}_{p}\) as the solution at breakout. Because of the “double” iterative structure of our IAT method, it is necessary to differentiate between the Tikhonov steps taken and the number of Arnoldi iterations computed. The Tikhonov steps of the IAT method are measured by the number of successive residual vectors computed, and the Arnoldi iterations are tracked by the iteration counter, p, in the algorithm. We will see that the number of Arnoldi iterations required to achieve breakout according to the discrepancy principle is larger than or equal to the number of Tikhonov steps. The number of Arnoldi iterations of the IAT method is compared to the number of iterations with the AIT and rrGMRES methods. For further comparative studies of the AIT and closely related methods, we refer the interested reader to [2].
Our computational work was carried out in the IR Tools toolbox environment by Gazzola et al. [9] using MATLAB R2020b on a MacBook Pro laptop computer running MacOS Catalina with an i5 Dual-Core Intel processor @2.7 GHz and 8 GB of RAM. The computations were carried out with about 15 significant decimal digits. Standard images from MATLAB’s image processing toolbox as well as constructed PSFs were used. For proper comparison between the IAT and AIT methods, we set \(\rho = 10^{-3}\) and q = 0.7 in our examples, as was done in [6] and [4].
Our first two examples below compare the three methods for two different spatially invariant non-symmetric motion blurs for images that naturally require zero and reflexive BCs, respectively. The last example considers a spatially variant rotational motion blur. The first example, hubble, also emphasizes the reconstructive process of the IAT method for one level of noise by considering the solution basis and computed residuals.
Hubble
We begin by considering a non-symmetric problem using a PSF from IR Tools that models motion blur; see Fig. 1. Here, we chose the most severe option of this PSF from IR Tools, where a high level of severity corresponds to a faster rate of decay of the singular values of the blurring matrix to zero. Because the image is black near the border, we impose zero BCs. We contaminate the blurred image by Gaussian noise with noise levels 3%, 1%, and 0.1%. We found that the best initial vector choice for the IAT method was the zero vector (i.e., x0 = 0). Interestingly, the AIT method also required fewer iterations and displayed lower RRE values for solutions with x0 = 0 as its starting vector for all noise levels. This is in contrast with the examples considered in [6] and [4]. We comment on this further below.
Before comparing the computational aspects and the accuracy of the computed reconstructions of the methods considered, we discuss the manner by which the IAT method arrives at its reconstructions. By viewing the 19 reshaped basis vectors in Fig. 2 that define the solution subspace for the hubble problem with 3% noise, we can see that the initial basis vectors of the solution subspace represent low frequencies, while later basis vectors represent higher frequencies. This can be observed by the lack of entry-to-entry pixel change in the early reshaped basis vectors. The change in the progression of the reshaped iterative residual vectors in Fig. 3 for the same problem also illustrates this.
As mentioned in Remark 2, not every Tikhonov step of the IAT method requires more than one Arnoldi iteration. We illustrate this in Fig. 3 by denoting below each reshaped residual vector the number of sequential basis vectors from Fig. 2 that were needed to determine a unique positive regularization parameter and thus achieve the next Tikhonov step. We note that the first residual vector is determined at the start of the algorithm using A. In our experience, later Tikhonov steps often require more Arnoldi iterations. We see this from the later reshaped residual vectors in Fig. 3 by noting the larger number of required Arnoldi iterations with A. In an image deconvolution setting this point of view makes intuitive sense as more high frequency basis information is needed to resolve finer details of an image.
The iteration requirements and breakout RRE values for the hubble example for all three investigated noise levels are shown in Table 1. Figure 4 displays the final reconstructions at breakout for the corresponding noise levels. We note that the IAT method produces the lowest RRE values amongst the three methods for all levels of noise. Furthermore, the number of Arnoldi iterations necessary for the IAT method to achieve breakout is smaller than the number of iterations required to breakout for rrGMRES in all cases.
The AIT method did not perform well in this example compared with the other two methods. This is evident from the quick breakouts in the 3% and 1% noise cases, and the poor RRE values compared to the other methods. We hasten to add that the initial vector \(\boldsymbol {x}_{0}=\boldsymbol {A}^{T}\boldsymbol {b}^{\delta }\) used in [6] and [4] provided worse results in all noise cases for the AIT method, and therefore the initial vector played no major role. This result was particularly surprising given that in the aforementioned studies it performed well for problems with higher noise levels. We surmise that the chosen severity of this PSF from IR Tools contributed significantly to this outcome.
The iterative evolution of the RREs and residual plots for the methods considered are shown in Fig. 5 for all noise levels. In the left-hand column of the relative residual plots, we note the quick termination of the AIT method after two iterations for 3% and 1% noise levels. For 0.1% noise, the AIT method fares better as it does not immediately terminate after the first iteration. In general, these plots provide visualization of the rapid relative residual decay of the IAT and AIT methods compared to the rrGMRES method. The RRE plots in the right-hand column provide a visual trajectory of the RRE values of each algorithm. We note that the IAT and rrGMRES methods demonstrate semiconvergent behavior in the 3% and 1% cases as both methods have their RRE values increase before breakout occurs. However, the relative quickness of termination of the IAT method stymies this effect compared with the rrGMRES method.
Brezinski
We next consider the Brezinski image blurred with constructed non-symmetric motion blur, where the PSF diagonal elements are computed by the Softmax function, \(\mathcal {F}\), whose entries are given by
\([\mathcal {F}(\boldsymbol {y})]_{i} = \frac {e^{y_{i}}}{{\sum }_{j} e^{y_{j}}}.\)
Here, the yi come from a linearly spaced vector, y, whose first and last entries are 0 and 1, respectively. The off-diagonal entries are defined to be zero. We note that the singular values associated with this problem decay less rapidly compared to the hubble example. We impose reflexive boundary conditions and introduce 0.1% Gaussian noise; see Fig. 6. Similarly to the hubble example, the selection of \(\boldsymbol {x}_{0}=\boldsymbol {A}^{T}\boldsymbol {b}^{\delta }\) was found to cause the magnitude of the normalized starting residual r0 to be large, which resulted in a poor solution. Because of this, we chose our initial vector for the IAT algorithm to again be x0 = 0. The AIT algorithm also performed best when the starting vector was the zero vector.
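As a small illustration (the PSF dimensions below are hypothetical; the paper does not state them), the diagonal Softmax PSF described above can be generated as follows:

```python
import numpy as np

def softmax_motion_psf(k=15):
    """Diagonal PSF whose diagonal is the softmax of a linearly spaced vector y with y[0]=0, y[-1]=1."""
    y = np.linspace(0.0, 1.0, k)
    d = np.exp(y) / np.exp(y).sum()    # softmax: positive entries summing to one
    return np.diag(d)                  # off-diagonal entries are zero

psf = softmax_motion_psf()             # hypothetical size
```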
The iteration requirements and breakout RRE values for the Brezinski example are tabulated in Table 2. Figure 7 displays the final image reconstructions at breakout for each of the methods. We note two items: first, the IAT method had the lowest RRE at breakout; second, the number of iterations required to achieve breakout for the IAT method is smaller than that for the other two methods. The iterative evolution of the RREs and relative residual plots are provided in Fig. 8. The left-hand side plot displays the relative residual progression and the right-hand side plot displays the RRE values for each iteration of the three methods.
Satellite
In our last example we consider a spatially variant rotational motion blurring problem described in [14] and implemented in the IR Tools toolbox. The singular values of the blurring matrix decay to zero more slowly than in the hubble example. Figure 9 displays both the true image as well as the blurred image with 3% imposed noise. Here, again, we impose zero BCs. The blurred image is created by taking an average of a series of images, each of which is rotated slightly with respect to the center of the image (see Figure 1 in [14]). The IAT method uses x0 = 0 as its starting vector. We note that the blurring matrix for this spatially variant motion blur problem does not have BTTB substructure. As such, we only compare the IAT and rrGMRES methods since the AIT algorithm cannot be applied in this situation.
Iteration requirements and RRE breakout values for the IAT and rrGMRES methods are provided in Table 3. The final image reconstructions at breakout are shown in Fig. 10. We found, similarly to the hubble example, that many more Arnoldi iterations were required to determine unique regularization parameters for later Tikhonov steps than for earlier ones. The number of Arnoldi iterations was largest for the final two Tikhonov steps. The number of iterations, while large compared to the hubble and Brezinski examples, did result in successful breakout. The rrGMRES method was manually terminated at the 300th iteration due to its long runtime. From the left-hand side pane of Fig. 11, one may estimate that the rrGMRES method might have required an additional 100 iterations to achieve breakout. The RRE progression of each method is also shown in the right-hand pane of Fig. 11. We especially note the erratic behavior of the rrGMRES progression. Finally, we note in Fig. 10 the pictorial differences between the IAT and the rrGMRES method reconstructions. While there is some minor background noise in the IAT reconstruction, the wings of the satellite are much better defined than in the reconstruction obtained by rrGMRES.
5 Conclusions
We have introduced a transpose-free preconditioned iterated Tikhonov method based on the Arnoldi process, which we refer to as the iterated Arnoldi-Tikhonov (IAT) method. It builds on the AIT method formulated by Donatelli and Hanke in [6]. In Section 3, we presented the IAT algorithm and showed that it is a regularization method provided that the original matrix A is approximated sufficiently accurately by the Arnoldi process. Section 4 showcased the effectiveness of the IAT method at reconstructing images that have been contaminated by blurs of various severities and by Gaussian noise of different levels.
Data availability
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
References
Berisha, S., Nagy, J.G.: Iterative methods for image restoration. Tech. Rep., Department of Mathematics and Computer Science, Emory University. http://www.mathcs.emory.edu/~nagy/RestoreTools/IR.pdf (2012)
Buccini, A.: Regularizing preconditioners by non-stationary iterated Tikhonov with general penalty term. Appl. Numer. Math. 116, 64–81 (2017)
Buccini, A., Donatelli, M., Reichel, L.: Iterated Tikhonov with general penalty term. Numer. Linear Algebra Appl. 24, e2089 (2017)
Buccini, A., Donatelli, M., Reichel, L., Zhang, W.: A new non-stationary preconditioned iterative method for linear discrete ill-posed problems. Numer. Linear Algebra Appl. 28(2), e2353 (2021)
Chan, T.F., Jackson, K.R.: Nonlinearly preconditioned Krylov subspace methods for discrete Newton algorithms. SIAM J. Sci. Stat. Comput. 5, 533–542 (1984)
Donatelli, M., Hanke, M.: Fast nonstationary preconditioned iterative methods for ill-posed problems, with application to image deblurring. Inverse Prob. 29(9), 095008 (2013)
Engl, H.W., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996)
Fenu, C., Reichel, L., Rodriguez, G.: GCV for Tikhonov regularization via global Golub-Kahan decomposition. Numer. Linear Algebra Appl. 23 (3), 467–484 (2016)
Gazzola, S., Hansen, P.C., Nagy, J.G.: IR Tools: a MATLAB package of iterative regularization methods and large-scale test problems. Numer. Algorithms 81, 773–811 (2019)
Hanke, M.: A regularizing Levenberg-Marquardt scheme, with applications to inverse groundwater filtration problems. Inverse Prob. 13(1), 79 (1997)
Hanke, M., Groetsch, C.W.: Nonstationary iterated Tikhonov regularization. J. Optim. Theory Appl. 98(1), 37–53 (1998)
Hansen, P.C.: Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion. SIAM, Philadelphia (1998)
Hansen, P.C., Nagy, J., O’Leary, D.P.: Deblurring Images: Matrices, Spectra, and Filtering. SIAM, Philadelphia (2006)
Hansen, P.C., Nagy, J.G., Tigkos, K.: Rotational image deblurring with sparse matrices. BIT Numer. Math. 54, 649–671 (2014)
Kindermann, S.: Convergence analysis of minimization-based noise level-free parameter choice rules for linear ill-posed problems. Electron. Trans. Numer. Anal. 38, 233–257 (2011)
Kindermann, S., Raik, K.: A simplified L-curve as error estimator. Electron. Trans. Numer. Anal. 53, 213–238 (2020)
Neuman, A., Reichel, L., Sadok, H.: Algorithms for range restricted iterative methods for linear discrete ill-posed problems. Numer. Algorithms 59(2), 325–331 (2012)
Reichel, L., Rodriguez, G.: Old and new parameter choice rules for discrete ill-posed problems. Numer. Algorithms 63, 65–87 (2013)
Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM, Philadelphia (2003)
Acknowledgements
The authors would like to thank a referee for comments. They also would like to thank Marco Donatelli for valuable discussions that helped improve the quality of this paper.
Funding
Open access funding provided by Università degli Studi di Cagliari within the CRUI-CARE Agreement. A.B. is a member of the GNCS group of INdAM, which partially funded his research with the project “Regularization methods and models for the solution of large-scale ill-posed inverse problems”.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Dedicated to Claude Brezinski on the occasion of his 80th birthday.