Abstract
Parameter identification tasks for partial differential equations are non-linear illposed problems where the parameters are typically assumed to be in \(L^\infty \). This Banach space is non-smooth, non-reflexive and non-separable and requires therefore a more sophisticated regularization treatment than the more regular \(L^p\)-spaces with \(1<p<\infty \). We propose a novel inexact Newton-like iterative solver where the Newton update is an approximate minimizer of a smoothed Tikhonov functional over a finite-dimensional space whose dimension increases as the iteration progresses. In this way, all iterates stay bounded in \(L^\infty \) and the regularizer, delivered by a discrepancy principle, converges weakly-\(\star \) to a solution when the noise level decreases to zero. Our theoretical results are demonstrated by numerical experiments based on the acoustic wave equation in one spatial dimension. This model problem satisfies all assumptions from our theoretical analysis.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
We consider the numerical solution of non-linear illposed and inverse problems where the underlying non-linearity F maps from a possibly multi-component version of \(L^\infty \) into a normed space Y. This scenario appears quite naturally in many parameter identification tasks for partial differential equations.
For instance, in electrical impedance tomography (EIT) one wants to identify the conductivity distribution in a body. To this end, electric currents are applied through electrodes and the resulting voltages are measured. In the quasi-static regime, an adapted version of the Laplace equation connects the currents with the voltages, see, e.g., [1]. Another application is full waveform inversion (FWI), the most advanced inversion technique in seismic imaging, see, e.g., [2, 3]. Depending on the used mathematical model for wave propagation (acoustic, elastic, or visco-elastic regime) the searched-for parameters include bulk density, pressure and shear wave velocities, and corresponding relaxation times. Here, a wave field is initiated by a source (explosion or earthquake) and the parts of the wave field that are reflected from the Earth’s internal structure are then picked up by receivers on the Earth’s surface or by hydrophones in the ocean.
From an abstract point of view, in both examples we have to solve an equation
where the non-linear operator
maps \(\ell \) parameter functions u located on some domain of interest D to the respective measurements. Further, \(y^\delta \) are the (noisy) measurements satisfying \(\Vert y^\delta -F(u^+)\Vert _Y\le \delta \) for one \(u^+\in \mathcal {D}(F)\).
Newton-like regularization schemes are well-established iterations for getting a meaningful approximate solution of non-linear inverse problems. In this work we explore the Newton-like solver REGINN \(^\infty \) which extends REGINN of [4, 5] to a non-linear inverse problem with generic operators F as in (2). Here, F is required to fulfill a few specific properties which are satisfied by EIT and, except for one, also by FWI in all regimes. This not yet verified property of FWI is a structural assumption known as tangential cone condition (TCC), see (3) below and consult [6] for a first promising result. Apart from establishing the non-linearity constraint, the main challenge about regularization in \(L^{\infty }\) is its non-reflexive nature. As this space is further non-separable, convergence of discretization schemes in the strong topology, which is desirable for practical implementations, cannot be achieved, see [7].
In this work we will exploit the weak-\(\star \) topology and semi-discrete approximations to F based on a family \(\{X^n\}_n\) of finite-dimensional nested subspaces of \(L^\infty (D)^\ell \). Such an ansatz is very close to an implementation and addresses discretization errors directly in the theory. Apart from minor mathematical restrictions which will be specified below, the subspaces will also be held quite generic and may reflect features of the reconstructions one wants to focus on, e.g. using piecewise constant functions to model sharp interfaces. By the well-known limit
for bounded D, \(X^n\) can be equivalently furnished with the \(L^{q_n}(D)^\ell \)-topology for properly chosen \(q_n<\infty \) such that \(q_n\rightarrow \infty \) as \(n\rightarrow \infty \). This approach then implies that the Newton update for the n-th iterate can be obtained as an approximate minimizer of a smooth (provided Y is smooth) and convex Tikhonov functional over \(X^n\). As a result, the underlying minimizing procedure can be implemented for general \(\{X^n\}_n\) with classical techniques from smooth and unconstrained optimization.
There is a wealth of literature on the analysis of Newton-like regularization schemes, mainly in a Hilbert space but meanwhile also in a Banach space setting; we refer only to the monographs [8, 9] for a first reading. However, most of the Banach space methods are formulated in abstract spaces requiring smoothness and reflexivity at least as they rely heavily on duality mappings to mimic the Riesz isomorphism. To the best of our knowledge, regularization schemes applicable to non-reflexive spaces are only considered in [7, 10,11,12,13,14,15]. The first six of these publications consider iterative schemes on the basis of proximal point methods, Morozov, Ivanov, or Tikhonov regularization, respectively. We will compare our scheme with the somewhat similar IRGNM-Tikhonov method of [12] in Remark 2.6 below.
Our material is organized as follows: In Sect. 2 we introduce and analyze two versions of REGINN \(^\infty \) which differ in what information about the smoothness of the ground truth is available a priori. Under reasonable assumptions both algorithms are well defined and terminate with a regularized solution \(u_{M_\delta }\in \mathcal {D}(F)\) of (1). Further, we prove the regularization property, that is, weak-\(\star \) convergence of \(\{u_{M_\delta }\}_{\delta >0}\) to an exact solution of \(F(\cdot )=F(u^+)\) as the noise level \(\delta \) tends to zero. Our hypotheses are reasonable in fact as they are met by EIT as well as FWI with exception of the TCC (Sect. 3) as mentioned above. For FWI in the acoustic regime we are even able to validate the stronger assumption about the ground truth which enters the second version of REGINN \(^\infty \). Finally, Sect. 4 contains some experiments to identify two parameters (density and pressure wave velocity) in the acoustic wave equation. Here, all our assumptions including TCC in a semi-discrete setting actually hold. Some technical details that would otherwise interrupt the flow of reading have been moved to three appendices.
2 REGINN \(^{\infty }\)
For some bounded domain \(D\subset \mathbb {R}^d\) let \(F:\mathcal {D}(F)\subset L^{\infty }(D)^\ell \rightarrow Y\) be Fréchet-differentiable and satisfy the tangential cone condition (TCC) at \(u^+\), i.e., there are a positive constant \(\omega <1\) and a ball \(B_{r}(u^+)\subset \mathcal {D}(F)\) with radius \(r>0\) such that
Here, \(F':\mathcal {D}(F)\subset L^{\infty }(D)^\ell \rightarrow \mathcal {L}\big (L^\infty (D)^\ell ,Y\big )\) denotes the Fréchet-derivative of F and \(L^\infty (D)^\ell \) is endowed with \(\Vert \cdot \Vert ^2_{L^{\infty }(D)^\ell }\) where
The TCC, introduced in [16], is a well-established and widely used assumption in the convergence analysis of a variety of iterative regularization schemes for non-linear illposed problems; we only refer to the monographs [8, 9]. Nevertheless, only a few academic examples are reported in the infinite-dimensional setting where it holds, see, e.g., [8, 17, 18].
For \(\omega <1/2\) we can equivalently restate the TCC as
with \(L=\omega /(1-\omega )<1\). Since our method will be explicitly based on discretizing \(L^\infty (D)^\ell \), we impose the following assumptions on corresponding spaces \(X^n\):
-
(S1)
\(\{X^n\}_{n\in \mathbb {N}}\) is a sequence of nested subspaces of \(L^\infty (D)^\ell \), i.e., \(X^n\subset X^{n+1}\subset L^\infty (D)^\ell \) for all \(n\in \mathbb {N}\).
-
(S2)
For each \(X^n\) there exists a linear projection operator \(\mathcal {P}^n:L^\infty (D)^\ell \rightarrow X^n\), that is, \(\mathcal {P}^nu=u\) for all \(u\in X^n\), satisfying \(\Vert \mathcal {P}^nu\Vert _{L^\infty (D)^\ell }\le C_{\mathcal {P}}\Vert u\Vert _{L^\infty (D)^\ell }\) where the constant \(C_{\mathcal {P}}\ge 1\) is independent of n.
-
(S3)
For any fixed \(C_{\infty }>1\) we can find a positive increasing sequence \(\{q_n\}_{n\in \mathbb {N}}\) such that
$$\begin{aligned}\Vert u\Vert _{L^\infty (D)^\ell }\le C_{\infty }\Vert u\Vert _{L^{q_n}(D)^\ell }\quad \text {for all }u\in X^n. \end{aligned}$$
Note that the \(L^{\infty }(D)^\ell \)-norm is always stronger than the \(L^{q_n}(D)^\ell \)-norm, hence the magnitude of \(C_{\infty }>1\) in (S3) determines how tight the norm equivalence of \(\Vert \cdot \Vert _{L^\infty (D)^\ell }\) and \(\Vert \cdot \Vert _{L^{q_n}(D)^\ell }\) is, restricted to \(X^n\). Thus, having set \(\{q_n\}_{n\in \mathbb {N}}\) once for a fixed feasible subspace sequence \(\{X^n\}_{n\in \mathbb {N}}\) and some \(C_\infty >1\), we can easily switch between non-smooth and smooth norms. We emphasize that \(C_\infty \) is independent of n. A family \(\{X^n\}_{n\in \mathbb {N}}\) enjoying (S1)–(S3) is constructed in Appendix A.1 on the basis of tensor product B-splines (see Appendix A.2 for a possible other construction, which is, however, not considered further in this work).
To ensure that our discretizations do indeed approximate the actual inverse problem in \(L^\infty (D)^\ell \) sufficiently well, we require a compatibility condition for F of the form
for all \(u\in \mathcal {D}(F)\) and all \(\widehat{u}\in L^\infty (D)^\ell \). This extra condition is necessary since, in general, one cannot expect \(\lim _{n\rightarrow \infty }\mathcal {P}^n\widehat{u}= \widehat{u}\) in \(L^\infty (D)^\ell \) strongly (which would yield (5) by continuity of \(F'\)) because the union of \(X^n\) is countable while \(L^\infty (D)^\ell \) is not separable. However, as a consequence, the closure of the range of \(F'(u)\) must be separable, in contrast to Y itself, for all \(u\in \mathcal {D}(F)\).
Additionally, to prove the regularization property of our scheme (Algorithm 1), we will require either the weak-\(\star \) closedness of F or the range inclusion
where \(F'(u^+)^{*}:Y^*\rightarrow \big (L^\infty (D)^*\big )^\ell \) denotes the adjoint operator. In Sects. 3.1 and 3.2 below we will verify (5) as well as (6) for the forward maps belonging to EIT and FWI, respectively.
As motivated in the introduction, the guideline for designing our regularization algorithm is to generate uniformly bounded iterates \(u_m\) in \(L^\infty (D)^\ell \) which give sufficiently small residuals
where \(y^\delta \in Y\) is the perturbed datum at hand. We build on an inexact-Newton framework and find the updates from the linearization approximately via Tikhonov regularization linked to \(X^n\) as follows: Given an initial guess \(u_0\in \mathcal {D}(F)\) and an initial space \(X^{n_0}\), we iterate
where \(n_m\in \mathbb {N}\) with \(n_m\ge n_{m-1}\) and \(s_m\in X^{n_m}\) are chosen such that
Here, the Tikhonov functional \(J_{n,m}:X^n\rightarrow [0,\infty )\) is
with penalty parameters set a posteriori by
We recall from assumption (S3) that an approximate minimizer of (10) is also one of the corresponding Tikhonov functional with \(\Vert \cdot \Vert _{L^\infty (D)^\ell }\)-penalty term and vice versa. However, given the sequence \(\{q_n\}_{n\in \mathbb {N}}\) for some \(C_\infty >1\) and a subspace sequence \(\{X^n\}_{n\in \mathbb {N}}\), we have more flexibility in choosing a numerical method to find \(s_m\) in (9) via the smoothed version.
So far, the parameters \(\gamma ,\{\mu _m\}_{m\in \mathbb {N}}\) are restricted to fulfill \(0<\mu _m<1\) and \(\gamma \ne 0\). While \(\mu _m\) serves as a stopping criterion in the spirit of an inexact Newton condition to set the m-th update \(s_m\), \(\gamma \) will be responsible for keeping the resulting iterate \(u_{m+1}\) sufficiently close to \(u^+\): the larger \(\gamma \) is chosen, the better the initial guess has to be. The effect of the initial discretization level \(n_0\) on the iterates is demonstrated in Section 4 by numerical experiments. We stop the Newton iteration (8) by a discrepancy principle with constant \(\tau >1\). The resulting inversion scheme REGINN \(^{\infty }\) is summarized in Algorithm 1. It is well defined under reasonable assumptions according to the following theorem. Its regularization property is then specified in Theorem 2.3.
Theorem 2.1
(Well-definedness and termination of REGINN \(^{\infty }\)) Let \(F:\mathcal {D}(F)\subset L^\infty (D)^\ell \rightarrow Y\) be as above satisfying (3) with \(\omega <1/3\) in \(B_{r}(u^+)\subset {\text {int}}(\mathcal {D}(F))\) and (5), where \(\{X^n\}_{n\in \mathbb {N}}\) and \(\{\mathcal {P}^n\}_{n\in \mathbb {N}}\) fulfill assumptions (S1)–(S3). Let \(y^\delta \) be given such that \(\Vert F(u^+)-y^\delta \Vert _{Y}\le \delta \) for one \(\delta >0\). Let \(\Lambda \in (\frac{2\omega }{1-\omega },1)\) and set
For
and
choose
Further, define
Restrict all tolerances \(\{\mu _m\}\) to \((\mu _{\min },\mu _{\max })\) and start with some arbitrary \(n_0\in \mathbb {N}\) and \(u_0\in B_{r_0}(u^+)\). Then, there exists an \(M_{\delta }\in \mathbb {N}\) such that all iterates \(\{u_1,\ldots ,u_{M_{\delta }}\}\) of REGINN \(^{\infty }\) are well-defined and stay in \(B_r(u^+)\). Moreover, \(\Vert b_{m+1}^{\delta }\Vert _{Y}\le \Lambda \Vert b_m^{\delta }\Vert _{Y}\) for \(m=0,\ldots ,M_{\delta }-1\), \(\Vert b^\delta _{M_{\delta }}\Vert _{Y}\le \tau \delta \), and \(M_{\delta }=\mathcal {O} (\vert \log \delta \vert )\) as \(\delta \searrow 0\).
Proof
Before we begin with the proof we discuss our assumptions on the parameters. First, observe that the open interval for choosing \(\Lambda \) is non-empty by \(\omega <1/3\). The lower bound for \(\Lambda \) guarantees that \(\mu _{\max }>\omega \). Together with the upper bound on \(\gamma \) this yields a positive upper bound for \(r_0\). Further, the radicand and the denominator of the lower bound for \(\tau \) are positive. Finally, \(\mu _{\min } <\mu _{\max }\).
We use an inductive argument and assume therefore that \(\Vert b_m^{\delta }\Vert _{Y}\le \Lambda ^m \Vert b_0^{\delta }\Vert _{Y}\) as well as \(\Vert u_i-u^+\Vert _{L^\infty (D)^\ell }< r\) for \(i\le m\), which holds in particular for \(m=0\) because of \(\Vert u_0-u^+\Vert _{L^\infty (D)^\ell }<r_0<r\). If \(\Vert b^{\delta }_m\Vert _{Y}\le \tau \delta \), REGINN \(^{\infty }\) stops with \(u_{M_{\delta }}:=u_m\) and nothing else needs to be shown. Otherwise, \(\Vert b^{\delta }_m\Vert _{Y}> \tau \delta \) and we next show that a Newton update is well defined by (9). Let \(s_{n,m}:=\arg \min _{s\in X^n} J_{n,m}(s)\) which exists as the unique minimizer of a strictly convex functional over a finite dimensional space. Then,
Recursively, we get \(u_m-u_0=s_0+s_1+\cdots + s_{m-1}\) from which we deduce that \(u_m-u_0\) is in \(X^{n_{m-1}}\) as the spaces are nested by (S1). Hence, by (S2),
for \(n\ge n_{m-1}\) and by linearity of \(\mathcal {P}^{n}\) we may simplify
In the last step we additionally used (11), Hölder’s inequality and \(\Vert \mathcal {P}^n(u^+-u_0)\Vert _{L^\infty (D)^\ell }\le C_{\mathcal {P}}\Vert u^+-u_0\Vert _{L^\infty (D)^\ell }<C_{\mathcal {P}}r_0\), see (S2). We continue by splitting the residual term according to
employing again (12) to get the bottom line. Since \(\Vert u_{m}-u^+\Vert _{L^\infty (D)^\ell }< r\) by induction, TCC (3) yields
and with \(\Vert F(u^+)-F(u_m)\Vert _{Y}\le \Vert F(u^+)-y^\delta \Vert _{Y}+\Vert b_m^{\delta }\Vert _{Y}\le \delta + \Vert b_m^{\delta }\Vert _{Y}\) we deduce further
Taking into account that \(\Vert b^{\delta }_m\Vert _{Y}> \tau \delta \), we get
and finally
Since \(\text {vol}_\text {d}(D)^{2/q_n}\rightarrow 1\) as \(n\rightarrow \infty \) and in view of (5), we find that
Consequently, condition (9) with \(\mu _m>\mu _{\min }\) is feasible for \(n_m\) large and appropriate \(s_m\in X^{n_m}\backslash \{0\}\), where \(s_m\ne 0\) follows by \(J_{n,m}(0)\ge \Vert b_m^{\delta }\Vert ^2_{Y}\). Hence, \(u_{m+1}=u_m+s_m\) is well defined and, relying on (S3) as well as (11), we proceed with
Hence,
by the upper bound of \(r_0\), yielding \(u_{m+1}\in B_r(u^+)\subset \text {int}(\mathcal {D}(F))\). Finally, we estimate on the basis of (4) and (9)
Having thus proven the induction part, REGINN \(^{\infty }\) is forced to terminate for any \(\delta >0\) due to \(\Vert b_m^{\delta }\Vert _{Y}\le \Lambda ^m \Vert b_0^{\delta }\Vert _{Y} \le \tau \delta \) for m sufficiently large. From this estimate, we may even deduce \(M_{\delta }=\mathcal {O}(\vert \log \delta \vert )\) as \(\delta \searrow 0\). \(\square \)
Remark 2.2
(a) The name REGINN \(^{\infty }\) for Algorithm 1 is justified by the stopping condition (9) for determining the Newton update which is, in view of (10) and (11), equivalent to
In particular, \(s_m\) satisfies the stopping condition of REGINN [4], i.e., the above condition without penalty term.
(b) Recall that REGINN fulfills in the Hilbert space setting (and likewise for smooth reflexive Banach spaces) the error reducing property for the iterates of many inner linear solvers, keeping thus \(u_m\in B_{r}(u^+)\) if the initial guess was chosen so. However, this does not hold any longer for our \(L^{\infty }\)-tailored REGINN \(^{\infty }\) in general. Therefore our parameters need to be controlled in terms of both \(\omega \) and r, whereas standard REGINN only requires the knowledge of \(\omega \) for defining admissible tolerances \(\mu \) and stopping constants \(\tau \), see Theorem 3.1 in [19].
In case that F is linear, i.e., \(F(u)=Au\) for some \(A\in \mathcal {L}(L^\infty (D)^\ell ,Y)\), the TCC holds with \(\omega =0\) and \(r=\infty \). Some observations are in order:
-
Although \(r_0\) can be arbitrarily large now, to still ensure a finite and uniform \(L^{\infty }\)-bound on the iterates, \(r_0<\gamma <\infty \) needs to be chosen compatibly.
-
Because of
$$\begin{aligned} \Vert F'(u_m)s_m -b^{\delta }_m\Vert ^2_{Y} =\Vert As_m -(y^\delta -Au_m)\Vert ^2_{Y} =\Vert Au_{m+1} -y^\delta \Vert ^2_{Y} \end{aligned}$$the iterate \(u_{m+1}\) satisfies
$$\begin{aligned} \Vert Au_{m+1} -y^\delta \Vert ^2_{Y}+\alpha _m\Vert u_{m+1}-u_0\Vert _{L^{q_{n_m}}(D)}^2\le \mu _m^2 \Vert b_m^\delta \Vert _Y^2. \end{aligned}$$Hence, \(u_{m+1}\) can be considered an approximate minimizer of the Tikhonov functional \( u\mapsto \Vert Au-y^\delta \Vert ^2_{Y}+\alpha _m\Vert u-u_0\Vert _{L^{q_{n_m}}(D)}^2\) in the set \(u_0+X^{n_{m}}\). Put differently: in the linear case, REGINN \(^{\infty }\) can be viewed as a cascading Tikhonov regularization iterating over nested finite-dimensional spaces where the penalty term is determined a posteriori by the previous iterate.
Theorem 2.3
(Regularization property of REGINN \(^{\infty }\)) Adopt all assumptions and notations from Theorem 2.1 with \(\overline{B_r(u^+)}\subset \mathcal {D}(F)\) and set \(F(u^+)=y\). Additionally, assume that \(F'(u^+)\) fulfills (6) or that F is weakly-\(\star \) sequentially closed, that is, \(w_n{\mathop {\rightharpoonup }\limits ^{\star }}w\) in \(L^\infty (D)^\ell \) and \(F(w_n)\rightharpoonup z\) imply that \(F(w)=z\). Let \(\{\delta _i\}_{i\in \mathbb {N}}\) be a positive zero sequence and let \(u_{M_{\delta _i}}\) be the output of Algorithm 1 with respect to perturbed data \(y^{\delta _i}\). Then the set of weak-\(\star \) accumulation points of the sequence \(\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}\) is non-empty and consists of solutions to \(F(\cdot )=y\). If \(u^+\) is the only solution in \(\overline{B_r(u^+)}\), then even the whole sequence \(\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}\) converges weakly-\(\star \) to \(u^+\) in \(L^\infty (D)^\ell \).
Proof
By construction in Theorem 2.1 we know that \(\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}\) yields
and is uniformly bounded in \(L^\infty (D)^\ell \), so there exists weak-\(\star \) accumulation points in \(\overline{B_r(u^+)}\) by sequential weak-\(\star \)-compactness. Take representatively \(u_{M_{\delta _{i_k}}}{\mathop {\rightharpoonup }\limits ^{\star }}\widetilde{u}\). In case that F is weakly-\(\star \) sequentially closed, we can directly deduce \(F(\widetilde{u})=y\), hence any weak-\(\star \) accumulation point solves the equation. In case that (6) holds, we first note that the TCC (3) implies by the reverse triangle inequality for any \(u\in B_r(u^+)\) that
With (6) we then obtain for any \(g\in Y^{*}\)
where the last equality above follows by the second inequality in (17) with \(u=u_{M_{\delta _{i_k}}}\) and \(F(u_{M_{\delta _{i_k}}})\rightarrow y\) in \(Y\). We deduce that \(F'(u^+)(u^+-\widetilde{u})=0\) and combining this relation now with the first inequality in (17) using \(\widetilde{u}=u\), we may again conclude \(F(\widetilde{u})=y\). Finally, if \(u^+\) is the only solution to \(F(\cdot )=y\) in \(\overline{B_r(u^+)}\), the weak-\(\star \) convergence of the whole sequence \(\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}\) follows by a standard subsequence argument, see [20, Prop. 10.13(2)]. \(\square \)
Remark 2.4
If \(F'(u^+)\) is injective, \(\Vert u\Vert _\star :=\Vert F'(u^+)u\Vert \) constitutes a norm on \(L^\infty (D)^\ell \) with respect to which \(\{u_{M_{\delta }}\}_{\delta >0}\) then converges strongly to \(u^+\) according to (17) and (16) at the rate \(\Vert u^+ -u_{M_{\delta }}\Vert _\star =\mathcal {O}(\delta )\) as \(\delta \rightarrow 0\). However, this norm is generally weaker than \(\Vert \cdot \Vert _{L^\infty (D)^\ell }\) with equivalence if and only if \(F'(u^+)\) is boundedly invertible. However, for locally illposed problems \(F(\cdot )=y\) we expect its linearization to be illposed as well.
Remark 2.5
We discuss how the statements of the theorems from above carry over to a semi-discrete situation as it appears under an implementation of Algorithm 1. Typically, one \(X^{n_{\max }}\) represents the finest possible or finest chosen resolution for the sought-for quantity \(u^+\in X^{n_{\max }}\) and models the implementation from a mathematical point of view.Footnote 1 Here, \(X^{n_{\max }}\) is equipped with the \(L^\infty \)-topology. Now, Theorems 2.1 and 2.3 apply to
where (5) can be omitted due to
since both \(u_0\) and \(u^+\) are assumed to be in \(X^{n_{\max }}\). Further, (14) then reads
and as the only consequence the constant \(C_{\mathcal {P}}\) needs to be replaced by \(\text {vol}_d(D)^{1/2}C_{\mathcal {P}}\) in the definition of corresponding REGINN \(^\infty \) parameters.
We emphasize that the underlying semi-discrete inverse problem is: given \(y^\delta \in Y\) find \(u^\delta \in X^{n_{\max }}\) such that \(F_{n_{\max }}(u^\delta )\approx y^\delta \) where \(y^\delta \) now incorporates measurement noise and discretization error.
Remark 2.6
At first glance the IRGNM-Tikhonov method of [12] and REGINN \(^\infty \) seem to be quite similar, but they are separated by significant structural differences: In IRGNM-Tikhonov the penalty parameter is determined a priori and is assigend to a fixed regularization term, whereas for REGINN \(^\infty \) the regularization term is explicitly n-dependent and relies on an the posteriori parameter choice (11). Further, the Newton update for the IRGNM-Tikhonov method has to be an exact minimizer of the Tikhonov functional. In contrast, the Newton update for REGINN \(^\infty \) only requires an approximate minimizer, see (9), which is especially convenient when it comes to practical realizations. Note also that the discrete spaces \(X^n\) are an integral part of REGINN \(^\infty \) and its theory, so numerical effects are included in a natural way. As part of our results, the discrete regularizers \(u_{M_{\delta _i}} \in X^{n_{M_{\delta _i}}}\) of Theorem 2.3 converge to the continuous limit \(u^+\in L^\infty (D)^\ell \). A similar connection of the discrete with the continuous world is (to our knowledge) still pending for the method in [12].
The previous version of REGINN \(^{\infty }\) requires the determination of successive discretization levels \(n_m\ge n_{m-1}\) for possibly strongly increasing \(n_m\) so that many intermediate (almost) minimizers \(s\in X^n\) of (10) for \( n_{m-1}\le n\le n_m\) need to be computed before meeting the given \(\mu _m\)-criterion in (9). As this adaptive discretization strategy can become numerically expensive, we want to present an alternative version which directly links n to m. A closer look at the proof of Theorem 2.1 reveals that \(n_m\) actually depends on the decay of \(\Vert F'(u_m)(\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0))\Vert _{Y}\). Hence, if we have a concrete upper bound for this discretization residual in terms of n, feasible choices of \(n_m\) can be found by simple algebraic manipulation. Such upper bounds can be deduced on the basis of better initial guesses which are governed by some stronger norm. For this purpose we state the following refined version of assumption (5):
If \(X\subset L^\infty (D)^\ell \) is an embedded normed space such that
then for any \(u^+\) such that \(B_{\widetilde{r}}(u^+)\subset \text {int}(\mathcal {D}(F))\), \(u\in B_{\widetilde{r}}(u^+)\) and \(\widehat{u}\in X\) we assume that
where \(C^+>0\) and \(\beta \) fulfills \(\beta (n)\searrow 0\) with \(\beta (0)=1\). We think of \(\beta \) as being rather independent of \(u\in \mathcal {D}(F)\) once \(X\) and \(\{X^n\}_{n\in \mathbb {N}}\) are set while the magnitude of \(C^+\) is strongly \(B_{\widetilde{r}}(u^+)\)-dependent. We will verify in Sect. 3.3 below that (19) is satisfied, for instance, by the modeling assumptions of FWI in the acoustic regime. The next theorem shows that we can indeed determine \(n_m\) conveniently for successive Newton steps of REGINN \(^{\infty }\) when replacing condition (5) by (19).
Theorem 2.7
Adopt all assumptions, notations and parameters from Theorems 2.1 and 2.3, and assume that (19) holds—without loss of generality with \(\widetilde{r}=r\) by shrinking one of the radii otherwise. Start with \(u_0\in L^\infty (D)^\ell \) such that \(\Vert u^+-u_0\Vert _X<\min \{r_0/C_{X},1/C^+\}\) and restrict \(\{\mu _m\}\) to \((\mu ^{\varepsilon }_{\min },\mu _{\max })\), where
for some \(\varepsilon >0\) sufficiently small and \(q_{n_0}\) large with \(n_0\in \mathbb {N}\). Further, let \(n_m\) be successively defined by
Then we can find \(s_m\in X^{n_m}\) satisfying (9). In particular, REGINN \(^{\infty }\) also terminates in this case and the regularization property still holds.
Proof
First, \(n_m\) according to (21) is well defined since \(\lim _{n\rightarrow \infty }\beta (n)=0\). Besides, since \(\Vert b^{\delta }_m\Vert _{Y}\) is monotonously decreasing in m, we get that \(n_m\) is non-decreasing, too. Using \(s_m:=\arg \min _{s\in X^{n_m}} J_{n_m,m}(s)\), we may compute with (13) and by (19)
The fact that REGINN \(^{\infty }\) still terminates and also admits the regularization property follows by Theorems 2.1, 2.3 and \(\Vert u_0-u^+\Vert _{L^\infty (D)^\ell }\le C_{X}\Vert u_0-u^+\Vert _{X}< r_0\). \(\square \)
For convenience, we restate REGINN \(^{\infty }\) in Algorithm 2 subject to \(u^+-u_0\in X\) for which we need to provide \(\varepsilon \) and \(\beta \) as additional input. This version is especially of interest if the regularity \(u^+\in X\) is known a priori so that \(u_0\in X\) ensures \(u^+-u_0\in X\), as desired.
3 Applications: parameter identification tasks in PDEs
In this section we will verify our abstract assumptions (5) and (6) in the concrete settings of electrical impedance tomography and time domain full waveform inversion (FWI) in the visco-elastic regime which is the state-of-the-art imaging modality in exploration geophysics, see, e.g., [2, 3]. Both inverse problems are locally illposed. We will even show that (19) is satisfied in the acoustic regime. Moreover, our results for FWI carry over to parameter identification tasks for other hyperbolic evolution equations, such as an inverse problem in electromagnetic scattering.
For all applications we rely on the discrete B-spline spaces constructed in Appendix A.1.
3.1 Electrical impedance tomography under the complete electrode model
Electrical impedance tomography (EIT) entails the determination of the electric conductivity distribution of an object by applying electric currents at the boundary through electrodes and measuring the resulting voltages at the boundary as well. Potential applications are, for instance, medical imaging and non-destructive testing.
Let \(\sigma :D\rightarrow [\sigma _{\min },\infty )\), \(\sigma _{\min }>0\), be the searched-for conductivity distribution in the simply connected Lipschitz-domain \(D \subset \mathbb {R}^2\). Further, the p electrodes are denoted by \(E_1,\ldots ,E_p\) and are assumed to be open subsets of \(\partial D\), the boundary of D, having positive surface measure: \(\vert E_j\vert >0\), \(j=1,\ldots ,p\). Moreover, let the electrodes be connected and separated: \(\overline{E}_i\cap \overline{E}_j=\emptyset \), \(i\ne j\). To this electrode configuration we associate the electrode space
where \(\chi _{E_i}\) is the indicator function of the ith electrode and \(L^2_{\scriptscriptstyle \diamondsuit }(\partial D)\) is the space of \(L^2\)-functions on the boundary of D having vanishing mean.
The forward problem of impedance tomography under the complete electrode model (CEM) in the weak formulation is based on the bilinear form \(a:Y_p\times Y_p\rightarrow \mathbb {R}\), \(Y_p:= H^1(D)\oplus \mathscr {E}_p\),
Given an electrode (mean) current \(I\in \mathscr {E}_p\) and a contact impedance \(z>0\), find a voltage potential \(u \in H^1(D)\) and an electrode voltage \(U \in \mathscr {E}_p\) such that
The vanishing mean of I and U can be interpreted as conservation of charge and grounding the potential, respectively. CEM (22) is the most accurate mathematical model for EIT currently in use and has a unique solution for \(\sigma \in L^\infty _+(D)\), where
The latter follows from the Lax-Milgram theorem, see [21].
The nonlinear forward operator F describing CEM is
In other words: \(F(\sigma )\) maps the current I to the voltage U of the solution of (22). Its Frechét derivative \(F'(\sigma )\in \mathcal {L}\big (L^\infty (D),\mathcal {L}(\mathscr {E}_p)\big )\) is given by
where \((u',U')\in Y_p\) uniquely solves
with \(u=u(I)\) being the first component of the solution of (22) with respect to the current I, see, e.g., [22]. We will similarly use U(I) for the second component of the solution of (22) and \(u'(I)\) for the first component of (23).
We equip \(\mathcal {L}(\mathscr {E}_p)\) with the Hilbert-Schmidt inner product: \(\langle \Lambda , \Gamma \rangle _\text { HS}=\sum _{j=1}^{p-1} \langle \Lambda I_j,\Gamma I_j\rangle _{L^2(\partial D)}\) where \(\{I_1,\ldots ,I_{p-1}\}\) is an orthonormal basis of \(\mathscr {E}_p\). This inner product is independent of the chosen orthonormal basis.
Lemma 3.1
The adjoint operator \(F'(\sigma )^*:\mathcal {L}(\mathscr {E}_p)^*\rightarrow L^\infty (D)^*\) is given by
where \(u(\Gamma I_j)\) and \(u(I_j)\) denote the solutions of (22) with respect to \(I=\Gamma I_j\) and \(I=I_j\), respectively. As a sum of products of two \(L^2\)-functions, \(F'(\sigma )^*\Gamma \) is even in \(L^1(D)\subset L^\infty (D)^*\), so that (6) holds.
Proof
Set \(\Lambda = F'(\sigma )[h]\), \(\Lambda _j=\Lambda I_j\), and \(\Gamma _j=\Gamma I_j\) for one \(\Gamma \in \mathcal {L}(\mathscr {E}_p)\). Then,
which settles the argument. \(\square \)
Lemma 3.2
Let \(X^n=X_N^n({D})\), \(n\in \mathbb {N}\), be the tensor product spline space of Appendix A.1 with projector \(\mathcal {P}^n\) defined in (A5). Then, for any \(\sigma \in L_+^\infty (D)\),
Proof
We start with
The bilinear form a is bounded and elliptic on \(Y_p\). Hence, it follows from (23) that
Since \(\mathcal {P}^nh\xrightarrow {n\rightarrow \infty } h\) pointwise a.e. (Appendix B), the norm on the right tends to zero as \(n\rightarrow \infty \) by the dominated convergence theorem. \(\square \)
So we have validated (5) and (6) for the forward operator of EIT. Moreover, TCC (3) holds for the semi-discrete version \(F_{n_{\max }}\) of F (Remark 2.5) provided the number of electrodes is sufficiently large [22, Theorem 4.5].
3.2 Full waveform inversion
Wave propagation in realistic media can be modeled by a visco-elastic wave equation which accounts for dispersion and attenuation [2, Chapter 5]. Here, we consider the formulation introduced in [23]: Let \(D\subset \mathbb {R}^3\) be a Lipschitz domain. Using \(L\in \mathbb {N}\) damping tensors \({\varvec{\sigma }}_l:[0,T]\times D\rightarrow \mathbb {R}^{3\times 3}_\text {sym}\), \(l=1,\ldots ,L\), the evolution of the particle velocity field \(\textbf{v}:[0,T]\times D\rightarrow \mathbb {R}^3\) and stress tensor \({\varvec{\sigma }}_0:[0,T]\times D\rightarrow \mathbb {R}^{3\times 3}_\text {sym}\) reads
with zero initial conditionsFootnote 2
Above, \({\varvec{f}}:(0,T)\times D\rightarrow \mathbb {R}^3\) is the volume force, which initiates the wave, and the functions \(\tau _{\text {\tiny P}},\tau _{\text {\tiny S}}:D\rightarrow \mathbb {R}\) are scaling factors for the unrelaxed bulk modulus \(\pi :D\rightarrow \mathbb {R}\) and shear modulus \(\mu :D\rightarrow \mathbb {R}\), respectively. Further, \(\rho :D\rightarrow \mathbb {R}\) is the mass density. The linear map C(m, p), \(m,p\in \mathbb {R}\), is Hooke’s tensor
where \(\textbf{I}\in \mathbb {R}^{3\times 3}\) is the identity matrix and \({\text {tr}}(\textbf{M})\) denotes the trace of \(\textbf{M}\in \mathbb {R}^{3\times 3}\). Finally,
is the linearized strain rate.
Wave propagation is frequency-dependent and the numbers \(\tau _{{\varvec{\sigma }},l}>0\), \(l=1,\ldots ,L\), model this dependency about the center frequency \(\omega _0\), see [24, 25]. Introducing the frequency-dependent phase velocities of P- and S-waves,
full waveform inversion entails the identification of the five spatially dependent parameters
from partial wave field measurements. For a physically meaningful open subsetFootnote 3\(\mathcal {D}(F)\subset L^\infty (D)^5\) the full waveform forward operator
is well defined with
where y \(\in C^{1}([0,T],H)\) is the unique classical solution of (24) with source \({\varvec{f}}\in W^{1,1}((0,T),L^2(D,\mathbb {R}^3))\), see [26] (of course, y satisfies boundary conditions, whose concrete formulations we have omitted for simplicity).
Remark 3.3
Actually, the operator modeling seismic imaging is \(\Phi =M\circ F\) where \(M:L^2((0,T),H) \rightarrow S \) denotes the measurement operator (S is the space of seismograms). We can reasonably assume that M is linear and bounded. If F satisfies (5) and (6), so does \(\Phi \).
For suitable \(w=(\textbf{w},\varvec{\psi }_0,\ldots ,\varvec{\psi }_L)\in H\) we define operators A, B, and Q mapping into H by
so that (24) collapses to
Note that only \(B:\mathcal {D}(F)\subset L^\infty (D)^5\rightarrow \mathcal {L}(H)\) depends on the parameters to be identified: \(B=B(u)\). It is Fréchet-differentiable with
where \(\widehat{u}=(\widehat{\rho },\widehat{v}_{\text {\tiny S}}, \widehat{\tau }_{\text {\tiny S}}, \widehat{v}_{\text {\tiny P}},\widehat{\tau }_{\text {\tiny P}})\) and
Further,
and
Both, \(\widetilde{C}(\mu ,\pi )\) and \(\widetilde{C}(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\) are well defined for \(u\in \mathcal {D}(F)\), see [26, Section 4.1] for the involved definition of \(\mathcal {D}(F)\).
Lemma 3.4
Let \(\{\widehat{u}_n\}_{n\in \mathbb {N}}\subset L^\infty (D)^5\) be bounded and be convergent to \(\widehat{u}\in L^\infty (D)^5\) pointwise a.e. Then, for any \(u\in \mathcal {D}(F)\),
Proof
The \(L^2\)-space H has inner product
where the colon indicates the Frobenius inner product on \(\mathbb {R}^{3\times 3}\). Now, in view of (26) and (27) we see that the integrand of \(\Vert B'(u)[\widehat{u}_n-u]h \Vert _H^2\) converges to zero pointwise a.e. Furthermore, as \(\{\widehat{u}_n\}_{n\in \mathbb {N}}\) is bounded in \(L^\infty (D)^5\), the absolute value of the integrand is bounded by an integrable function (as a matter of fact, the integrand is the sum of products of two \(L^2\)-functions with an \(L^\infty \)-function). The assertions follows from the dominated convergence theorem. \(\square \)
For completion, we quote a result from [26, Theorem 4.4] which we will need below.
Lemma 3.5
Under the assumptions from above, F is Fréchet-differentiable at any \(u\in \mathcal {D}(F)\). For \(\widehat{u}\in L^{\infty }(D)^{5}\) we have \(\overline{y}:=F'(u)\widehat{u}\in C([0,T],H)\) given as the unique weak solution of
with \(y=F(u)\). Further, for any \(t\in [0,T]\),
where C depends continuously on the operator norms of B, \(B^{-1}\), Q, and on T.
We define
as well as
where \(X_N^n\) and \(\mathcal {P}_N^n\) are given by the cardinal B-spline spaces in (A3) and (A5), respectively. We point out that also heterogeneous choices for the factors in \(X^n\) are possible as long as \(X^n\) keeps nested with respect to n. The convergence properties of \(\{\mathcal {P}^n\}_n\), see Appendix B, then guarantee (5) according to the next lemma.
Lemma 3.6
Let \(X^n\) and \(\mathcal {P}^n\) be as above. For the full waveform forward operator F with \({\varvec{f}}\in W^{1,1}((0,T),L^2(D,\mathbb {R}^3))\) we have that
for any \(u\in \mathcal {D}(F)\) and all \(\widehat{u}\in L^{\infty }(D)^{5}\).
Proof
The function \(\overline{y}_n:=F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\) is the unique weak solution of
which according to (29) satisfies
Since \(C([0,T],H)\hookrightarrow L^2((0,T),H)\) is continuous, the assertion of the theorem follows if we can show that the right-hand side of the above stability estimate goes to zero as \(n\rightarrow \infty \). Applying Proposition B.1 componentwise, we deduce that \(\mathcal {P}^{n_k}\widehat{u}\rightarrow \widehat{u}\) pointwise a.e. for a subsequence \(\{n_k\}_{k\in \mathbb {N}}\). Using \(\Vert \mathcal {P}^{n_k}\widehat{u}\Vert _{L^{\infty }(D)^{5}}\le C_\mathcal {P}\Vert \widehat{u}\Vert _{L^{\infty }(D)^{5}}\) and \(\partial _ty+Qy\in C([0,T],H)\), we see that (26) corresponds to linear combinations of uniformly bounded functions in t for all k. In particular, we can find \(g\in L^1(0,T)\) such that
By (28) we can then apply the dominated convergence theorem for integration in time which yields \(\Vert B'(u)\big [\widehat{u}-\mathcal {P}^{n_k}\widehat{u}\big ](\partial _ty+Qy)\Vert _{L^1((0,T),H)}\rightarrow 0\) as \(k\rightarrow \infty \). By uniqueness of the pointwise limit \(\widehat{u}\) the latter convergence even holds for the whole sequence, see [20, Prop. 10.13(2)]. \(\square \)
Condition (6), which guarantees the regularization property (Theorem 2.3), was proven for the visco-elastic case (24) in [26, Theorem 4.5 and Remark 4.6]. Unfortunately, the TCC, which is the remaining condition for the rigorous applicability of REGINN \(^{\infty }\), is subject of current research in the context of FWI and only special cases are known to hold. For example, in [6] the TCC has recently been shown for a semi-discrete setting in the acoustic regime, cf. (31) below.
Remark 3.7
The findings for the full waveform forward operator in this subsection, that is, (5) and (6) hold, carry over to parameter identification tasks for other hyperbolic evolution equations of the form (25). Examples are the parameter-to-state maps with respect to the acoustic and elastic wave equations (which are in fact simplifications of the visco-elastic model). A further example is the Maxwell system where the components of \(y=(\textbf{E}, \textbf{H})\) are the electric and magnetic fields, respectively. The involved operators are
where \(\varepsilon ,\mu ,\sigma :D\rightarrow \mathbb {R}\) are the (electric) permittivity, the (magnetic) permeability, and the conductivity. Further, \(\textbf{J}_{\text { e}}, \textbf{J}_{\text { m}}:[0,\infty )\times D\rightarrow \mathbb {R}^3\) are the current and magnetic densities. See [27, Section 5] for the details to verify (5) and (6) for the map \(F:(\varepsilon , \mu )\mapsto y\).
3.3 A stronger result for FWI in the acoustic regime
Setting \(\mu =0\), \(\tau _{{\varvec{\sigma }},l}=0\), \(\tau _{\text {\tiny P}}=\tau _{\text {\tiny S}}=0\) in (24) and introducing the hydrostatic pressure \(p={\text {tr}}(\varvec{\sigma }_0)/3\), we can formally derive the acoustic wave equation as first order system,
where we allow two source terms and where \(\nu =v_\text {\tiny P}\) is the compression wave speed. In our abstract formulation (25), it is represented by \(y=(\textbf{v},p)\in L^2((0,T),H)\), \(H=L^2(D,\mathbb {R}^d)\times L^2(D)\), and the operators
Further, A is defined on
If \({\varvec{f}}=({\varvec{f}}_1,f_2)\in W^{1,1}((0,T),H)\) then (31) has a unique classical solution in \(C^1([0,T],H) \cap C([0,T],\mathcal {D}(A))\), see [27] (recall that we have zero initial conditions (24d)).
The parameter-to-state map here is
with domain of definition
As explained in Remark 3.7, F satisfies (5) and (6). In this part we even verify the stronger compatibility condition (19), that is, for a special choice of \(X^n\) and X, see (18), we will specify the decay function \(\beta \) in Theorem 3.8 below. For this purpose, we restrict ourselves to the discretization space \(X^n=X^n_1\times X^n_1\) with \(X^n_1=X^n_1(D)\) from (A3) with \(N=1\). Thus, \(X^n\) consists of piecewise constant functions in the sequel. The associated projector onto \(X^n\) then reads
whose components are given by
for \(j\in \{1,2\}\) according to (A5). Here,
is the translated and dilated unit cube. Further, \(\mathcal {I}_n = \{k\in \mathbb {Z}^d: \square ^n_{k}\subset \overline{D}\} \), see (A4).
We are left to determine \(X\subset L^{\infty }(D)^2\) where the subspace X is governed by a stronger topology measuring some kind of smoothness. Intuitively, X should be large enough to still contain a wide class of discontinuous profiles, on the other hand we need some minimal a priori regularity such that its \(X^n\)-projections facilitate a common decay rate in (19). For \(s>0\) fixed we set
whose component spaces are characterized by
with \(D^{h}:=\{x\in D:\ x-h\in D\}\) for any \(h\in \mathbb {R}^d\). We assign the norm
where \([\cdot ]_{B^{s}_{2,\infty }(D)}\) is a semi-norm given by the numerical value of the supremum in (38). Originally, \([\cdot ]_{B^{s}_{2,\infty }(D)}\) emerges from the definition of Hilbertian Besov-Nikolskii spaces
with \(\Vert \cdot \Vert _{B^{s}_{2,\infty }(D)}:=\Vert \cdot \Vert _{L^2(D)}+[\cdot ]_{B^{s}_{2,\infty }(D)}\), see [28]. We set
and obviously have that \(\Vert u\Vert _{L^{\infty }(D)^2}\le \Vert u\Vert _{X}\) for all \(u\in X\).
Theorem 3.8
Let \(D\subset \mathbb {R}^d\), \(d\in \{1,2,3\}\), be a domain with \(C^1\) boundary and assume that \({\varvec{f}}=({\varvec{f}}_1,f_2)\in W^{2,1}((0,T),H)\) with \({\varvec{f}}(0)=\partial _t {\varvec{f}}(0)=\textbf{0}\) and \({\varvec{f}}_1\in C([0,T],L^{q}(D,\mathbb {R}^d))\) for some \(q\in (2,q_{\max })\), where \(q_{\max }\) only depends on D and the ratio \(\rho _{\max }/\rho _{\min }<\infty \) of the parameters from the definition of \(\mathcal {D}(F)\). Let \(\mathcal {P}^n\) be as in (36) and let \(X=L^\infty _s(D)^2\) be as above for some \(0<s\le 1/2\). Then, there is a neighborhood U of \(u^+\in \mathcal {D}(F)\) such that
for all \(u\in U\) and any \(\widehat{u}\in X\).
To prove the theorem, some preparation is required. We start with the observation that \(L^{\infty }_s(D)\subset B^{s}_{2,\infty }(D)\) and \(X_1^n\subset L^{\infty }_s(D)\) if \(s\le 1/2\). The latter inclusion can be seen as follows: Let \(h=(h_1,\dots ,h_d)\) with \(\vert h\vert <2^{-n}\). Without loss of generality we may assume that \(h_i\ge 0\). Now, thanks to the symmetry of the cubes, we estimate
for all \(k\in \mathcal {I}_n\), where we used the mean value theorem in the last step.
Lemma 3.9
Let D be a bounded Lipschitz domain and let \(\mathcal {P}^n\) be as in (36). Then, for any \(u\in X\) and \(0<s\le 1/2\),
where C only depends on D and the dimension d.
Proof
In view of (39) it suffices to prove the assertion for each component \(\mathcal {P}^n_1\) of \(\mathcal {P}^n\) and for any \(w\in L^{\infty }_s(D)\). Let \(\square \subset \mathbb {R}^d\) be a sufficiently large rectangle containing D. According to [29] there exists \(\widetilde{w}\in B^s_{2,\infty }(\square )\) such that \(\widetilde{w}\vert _D=w\) and \(\Vert \widetilde{w}\Vert _{B^s_{2,\infty }(\square )}\le \widetilde{C}\Vert w\Vert _{B^s_{2,\infty }(D)}\), where \(\Vert w\Vert _{B^s_{2,\infty }(D)}\) can be replaced by the stronger norm \(\Vert w\Vert _{L^{\infty }_s(D)}\). Using a dyadic partition of \(\square \) at level n based on the cubes \(\{\square _{k}^n\}_{k}\) (for which we restrict \(\square \) to have integer side length), we have
see [30], where \(\mathcal {\widetilde{P}}^n_1\) is defined as in (36) but with respect to the larger index set \(\mathcal {I}_n(\square )= \{k\in \mathbb {Z}^d:\square ^n_{k}\subset \overline{\square }\}\). Setting
we conclude that \(\mathcal {P}^n_1w\vert _{D_n}=\mathcal {\widetilde{P}}^n_1\widetilde{w}\vert _{D_n}\) and \(\mathcal {P}^n_1w\vert _{D\backslash D_n}=0\). The latter implies
for which we can estimate for some \(C_D<\infty \)
according to the inclusion \(D\backslash D_n\subset \cup _{x\in \partial D} \big (x+2^{-n}[-1,1]^d]\big )\), where \(\mathcal {H}^{d-1}(\partial D)\) denotes the \(d-1\)-dimensional Hausdorff measure of \(\partial D\). Altogether, we obtain for \(s\le 1/2\) that
which proves the lemma. \(\square \)
Lemma 3.10
Under the assumptions of Theorem 3.8, there is a \(q_{\max }>2\) such that for any \(u^+=(\rho ^+,\nu ^+)\in \mathcal {D}(F)\) there exists a neighborhood U of \(u^+\) such that
for any \(q\in (2,q_{\max })\) where
Here, \(q_{\max }\) only depends on D and the ratio \(\rho _{\max }/\rho _{\min }<\infty \) of the parameters from the definition of \(\mathcal {D}(F)\).
Proof
The proof makes use of converting higher time regularity to higher spatial integrability. By Theorem 2.6 of [27], we know that for \(u^+\in \mathcal {D}(F)\) we have \(\partial _ty=(\partial _t p,\partial _t\textbf{v})\in C([0,T],\mathcal {D}(A))\cap C^1([0,T],H)\), in particular \(\partial _t p\in L^1((0,T),H^1_0(D))\). By Sobolev embedding, see [31], we obtain at least \(\partial _t p\in L^1((0,T),L^6(D))\) for \(d\in \{1,2,3\}\). Again by Theorem 2.6 of [27], since the constant of the stability estimate depends continuously on B and thus on \(u\), we can actually conclude \(\Vert \partial _t p\Vert _{L^1((0,T),L^6(D))}<\infty \) uniformly in a neighborhood of \(u^+\). Concerning \(\partial _t\textbf{v}\), we only have integrability information about its divergence. Therefore, we first note that by (31) we have
in the sense of distributions. As \((\rho , \nu )\in \mathcal {D}(F)\) is bounded away from zero uniformly, the parameters do not affect the integrability of the right-hand side terms. Now Meyers’ estimate, see [32, Thm. 1], implies for fixed \(t\in [0,T]\) and any \(q\in (2,q_{\max })\) that
where \(q_{\max }\) depends on D and \(\rho _{\max }/\rho _{\min }<\infty \), while C additionally depends on q. Since \(\Vert \partial _t^2 p\Vert _{L^2(D)}\) can be bounded in terms of \(\Vert \partial _t^2 {\varvec{f}}\Vert _{L^1((0,T),H)}\) and a constant which depends continuously on B, that is, on u, see Theorem 2.6 of [27], we concude that \(\Vert \nabla p\Vert _{L^1((0,T),L^q(D,\mathbb {R}^d))}<\infty \) uniformly in a neighborhood of \(u^+\). Comparing with (31), we finally get that also \(\Vert \partial _t\textbf{v}\Vert _{L^1((0,T),L^q(D,\mathbb {R}^d))}<\infty \) locally uniform in \(u\). This completes the proof. \(\square \)
Finally, we can prove the main result.
Proof of Theorem 3.8
The proof uses a more elaborate analysis of the stability estimate
compared to Lemma 3.6. Recall that the constant C depends continuously on the operator norms of \(B(u)\), \(B^{-1}(u)\) and on T according to (29). Since also \(B\mapsto B^{-1}\) is (locally) continuous, we can assume without loss of generality that the above inequality holds for some fixed \(C=C_{r',u^+}>0\) for all \(\Vert u-u^+\Vert _{L^{\infty }(D)^{2}}<r'\) by shrinking the original \(r'>0\) otherwise. Then, for any \(\delta >0\),
Due to \(\Vert \widehat{u}-\mathcal {P}^n\widehat{u}\Vert _{L^{\infty }(D)^{2}}\le (1+C_{\mathcal {P}})\Vert \widehat{u}\Vert _{L^{\infty }(D)^{2}}\le (1+C_{\mathcal {P}})\Vert \widehat{u}\Vert _{X}\) by (S2) and \(C_X=1\) in (18), we obtain, by dropping the complementary indicator function in the bottom line above, that
The middle line here can also be directly expressed in terms of \(\delta \). To this end, we apply Hölder’s inequality with exponent \(q/2>1\) in view of (41) to get
where we additionally employed Tschebyscheff’s inequality in the bottom line, see e.g., [33]. With Lemma 3.9 we can further estimate
Altogether, we get with a similar Hölder-inequality argument for
that
which holds for all \(\delta >0\). Optimization in \(\delta \) then yields \(\delta \propto \big (2^{-s(q-2)/(4d(q-1))}\big )^n\). By Lemma 3.10 and the continuity of \(B'\), we can indeed find \(C^+<\infty \) such that (40) holds. \(\square \)
4 Numerical Results
We present numerical experimentsFootnote 4 on a two-parameter reconstruction to demonstrate the operation of REGINN \(^\infty \) in a test scenario where all assumptions required for our analysis in the previous sections are satisfied.
Recall from Theorem 2.3 that, in general, the regularization property holds only in the weak-\(\star \) topology permitting a kind of strange convergence behavior. Therefore, we test Algorithm 1 as the noise level approaches zero and also how it behaves under different initial spaces \(X^{n_0}\). We will start with a rather low dimensional \(X^{n_0}\) such that \(n_m\) increases successively in the course of the Newton iteration (while-loop) and in contrast also with some large dimensional \(X^{n_0}\) which corresponds to a more static use of Tikhonov regularization throughout all iterations.
Our experiments rely on the acoustic wave equation in one spatial dimension, \(d=1\), where \(D=(0,1)\) and \(T=1\):
The source components \(f_1,f_2:[0,1]\times [0,1]\rightarrow \mathbb {R}\) are
where
The corresponding exact data, that is, the solution of (42) and (43), are given by
We solve the appearing wave equations during inversion for the parameters by the FEM-based MATLAB (R2021a) command pdepe with 300 spatial and 100 temporal grid points. Both sets of points are distributed equidistantly in [0, 1].
Our discrete parameter spaces \(X^n=X^n_1\times X^n_1\) are generated by the piecewise constant cardinal B-spline as explained in Appendix A.1. So, conditions (S1)–(S3) are fulfilled. Note that the dimension of \(X^n_1\) is \(2^n\). In view of Remark 2.5 we set \(n_{\max }=8\) yielding the semi-discrete parameter-to-state map
where \((p,v)\in L^2([0,1],H)\), \(H= L^2(0,1)^2\), solves (42) with (43) and \((\rho ,\nu )\in \mathcal {D}(F_{n_{\max }})= X^{n_{\max }} \cap \mathcal {D}(F)\), see (35) for \(\mathcal {D}(F)\). Within our computations, \(\Vert \cdot \Vert _{ L^2([0,1],H)}\) is discretized by the corresponding space-time Euclidean norm and denoted by \(\Vert \cdot \Vert \). Since \(F_{n_{\max }}\) satisfies the TCC (3), see Appendix C, Theorem 2.1 guarantees termination of REGINN \(^\infty \) applied to the inverse problem
To simplify notation we use the same symbols for the continuous and the discrete versions of functions such as p, v, \(\rho \), \(\nu \), etc.
We apply REGINN \(^\infty \) (Algorithm 1) to (47) where we choose \(q_n=n/\log _2 C_\infty \) for \(J_{n,m}\) from (10) in accordance with the lower bound in (A6) below. For each m the computation of Newton update candidates \(s_m\in X^n\) is realized—benefiting greatly from the smoothness of the Tikhonov functionals—by a steepest descent routine with Armijo stepsize rule in a loop over n until (9) is met. We adapt \(\mu _m\) during iteration according to the rule proposed in [4]: we start with \(\mu _1=\mu _0\) and set
where \(\mu _0\in (0,1)\) is user-supplied and \(j_m\) denotes the number of gradient decent steps needed to compute the update \(s_m\). Complementary, the underlying discretization level n will be increased if the gradient descent loop stagnates on \(X^n\), which we consider to occur if the ratio of two successive gradient step evaluations does not exceed a fixed threshold close to 1, say 0.99999. We stop the algorithm either by the discrepancy principle or if \(n\ge n_{\max }\) happens, that is, if the discretization of \(X^n\) would become finer than the computational grid used in the pdepe-routine for solving the wave equation. In the latter case, we still perform \(u_{m+1}=u_m+\widetilde{s}_m\) with the last update candidate \(\widetilde{s}_m\in X^{n_{\max }}\) before abortion. We emphasize that \(\widetilde{s}_m\) is not a Newton update in the sense of (9), but the corresponding \(u_{m+1}\) might still fulfill the discrepancy principle unlike \(u_m\).
In our experiments we especially want to detect the jump regions [7/30, 17/30] and [13/30, 23/30], where the parameters differ from the homogeneous background material \((\rho _0,\nu _0)=(1,1)\in X^{n_{\max }}\), respectively, that we take as initial guess. Note that no grid point of \(X^n\) coincides with either of the jump discontinuity points for all n so that the error of any reconstruction of \(\rho \) and \(\nu \) will always be at least \((\max {\rho }-\min {\rho })/2=1/10\) and \((\max {\nu }-\min {\nu })/2=1/20\) with respect to the \(L^{\infty }\)-norm, respectively.
First, we investigate the case of ‘exact’ data, that is, \((p^\delta ,v^\delta )= (p,v)\) with (p, v) from (45). Despite of \(\delta =0\) our data might still be contaminated by some discretization error with respect to \(F_{n_{\max }}\) since the corresponding analytical solution \((\rho ,\nu )\) given by (44) is not contained in \(X^{n_{\max }}\). Choosing \(\mu _0=0.7\), \(\gamma =0.8\), \(C_\infty =1.1\), we run Algorithm 1 for different \(n_0\) to observe how its choice affects the outcome. Note that setting \(\tau \) is redundant here because termination is solely forced by n exceeding \(n_{\max }\). Figure 1 displays the exact parameter functions \(\rho \) and \(\nu \) (red) and the corresponding outputs \(\rho _M\) and \(\nu _M\) (blue) of Algorithm 1 when starting with \(n_0\in \{2,5,8\}\), respectively. We see that the larger \(n_0\) is, the smoother the output becomes, while the points of discontinuity are more sharply located for smaller \(n_0\). Hence, for the reconstruction of jump discontinuities, \(n_0\) shall be chosen large enough to locate discontinuity points sufficiently precise while at the same time it should not be too large to prevent oversmoothing. Figure 2 shows a more detailed convergence history in the case \(n_0=2\) and confirms that the majority of Newton steps is indeed undertaken with \(n_m\le 4\).
Next, we study the case of noisy data. For this purpose we generate noise vectors \(\zeta \) as random samples from a centered Gaussian distribution and scale it such that \(\Vert \zeta \Vert =\delta \Vert (p,v)\Vert \). Since \(\delta \) is a relative perturbation here, the discrepancy principle must be adjusted accordingly. As before, we employ Algorithm 1 with \(\mu _0=0.7\), \(\gamma =0.8\), \(C_\infty =1.1\), \(\tau =1.1\). Using the insights from our exact data case we set \(n_0=5\) as initial value to balance the aforementioned effects of globally smooth and locally oscillating reconstructions. The corresponding results for \(\rho _M\) and \(\nu _M\) are shown in Fig. 3 for \(\delta =5\%\), \(\delta =2\%\), and \(1\%\). We see that the reconstructions’ profiles approach the correct jump height of the exact solution as \(\delta \) becomes smaller. In all three cases, termination occurs by reaching the discretization limit, however, each last update fulfills the discrepancy principle afterwards. Altogether, the plots are in agreement with the weak-\(\star \) regularization property of REGINN \(^\infty \) (Theorem 2.3).
5 Conclusion
We have investigated a novel iterative regularization algorithm tailored for non-linear illposed problems between \(L^\infty (D)^\ell \) and a normed space Y. The main focus was on generating uniformly bounded iterates relying on a Tikhonov-like regularization term. Due to the non-smooth structure of \(L^\infty (D)^\ell \), a straightforward implementation would require non-smooth or box-constrained calculus which we could circumvent by using discretization in combination with equivalent \(L^p(D)^\ell \)-norms for \(p<\infty \). Under reasonable assumptions on the input parameters, our algorithm REGINN \(^\infty \) terminates after finitely many steps. Further, it fulfills the regularization property in the weak-\(\star \) topology as the noise level of the Y-data tends to zero. Depending on the non-linearity, this convergence can be reformulated as convergence with respect to a norm. Numerical experiments with a model problem illustrate the theoretical findings.
Future research may include a convergence rate analysis under higher regularity assumptions as in (19) or under more general variational source conditions with respect to a Bregman distance [13]. We could even incorporate a metric to overcome that \(L^\infty (D)^\ell \) is non-separable; an approach proposed in [7]. Concerning the data space, especially the task of finding proper measures for the misfit in seismograms, the Kantorovich-Rubinstein (KR) norm has recently proven advantageous in exploration geophysics, see, e.g., [34, 35]. This fact suggests an implementation of our method under the KR-norm on Y. In fact, our theory also allows more general distance functions on Y. For instance, any distance concept is admissible which is convex in one of its two arguments (e.g. Bregman distances). Such a concept can in particular be learned within a predefined set of admissible distance functions in practice and is subject of ongoing research, see [36].
Notes
Recall that Theorem 2.1 in its original version requires an initial guess \(u_0\in B_{r_0}(u^+)\). Since \(L^\infty (D)^\ell \) is not separable, however, there might be no element in \(X^n\) for any \(n\in \mathbb {N}\) which satisfies this closeness condition. As remedy we may even enlarge the parameter space for the semi-discrete problem to \(U_0+X^{n_{\max }}\) where \(U_0\subset L^\infty (D)^\ell \) is a proper finite dimensional subspace such that \(u_0\in U_0\). In this case, we assume \(u^+\in U_0+X^{n_{\max }}\).
Zero initial conditions are not a general restriction. We use them here only to ease the presentation.
That is, each of the 5 parameters is restricted from below and from above by physically meaningful bounds. Consult (35) for a concrete example.
For the reader’s own experiments we provide our MATLAB code on http://www.math.kit.edu/ianm3/~rieder/media/reginn_infty_fig2.m. Executed in MATLAB (R2021a) on an Intel(R) Core(TM) i5-1035G4 CPU under Windows 10, the code produces the output shown in Fig. 2. In our program we use a routine by John D’Errico (2021): Piecewise functions (https://www.mathworks.com/matlabcentral/fileexchange/9394-piecewise-functions), MATLAB Central File Exchange. Retrieved November 29, 2021.
References
Mueller, J.L., Siltanen, S.: Linear and Nonlinear Inverse Problems with Practical Applications. Computational Science & Engineering, vol. 10. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (2012). https://doi.org/10.1137/1.9781611972344
Fichtner, A.: Full Seismic Waveform Modelling and Inversion Advances in Geophysical and Environmental Mechanics and Mathematics. Springer, Berlin (2011). https://doi.org/10.1007/978-3-642-15807-0
Virieux, J., Operto, S.: An overview of full-waveform inversion in exploration geophysics. Geophysics 74(6), 1–26 (2009). https://doi.org/10.1190/1.3238367
Rieder, A.: On the regularization of nonlinear ill-posed problems via inexact Newton iterations. Inverse Prob. 15(1), 309–327 (1999). https://doi.org/10.1088/0266-5611/15/1/028
Margotti, F.J.: On inexact Newton methods for inverse problems in Banach spaces. Ph.D. thesis, Karlsruher Institut für Technologie (KIT) (2015). https://doi.org/10.5445/IR/1000048606
Eller, M., Rieder, A.: Tangential cone condition and Lipschitz stability for the full waveform forward operator in the acoustic regime. Inverse Prob. 37, 085011 (2021). https://doi.org/10.1088/1361-6420/ac11c5
Pöschl, C., Resmerita, E., Scherzer, O.: Discretization of variational regularization in Banach spaces. Inverse Probl. 26(10), 105017 (2010). https://doi.org/10.1088/0266-5611/26/10/105017
Kaltenbacher, B., Neubauer, A., Scherzer, O.: Iterative Regularization Methods for Nonlinear Ill-Posed Problems. Radon Series on Computational and Applied Mathematics, vol. 6. Walter de Gruyter GmbH & Co. KG, Berlin (2008). https://doi.org/10.1515/9783110208276
Schuster, T., Kaltenbacher, B., Hofmann, B., Kazimierski, K.S.: Regularization Methods in Banach Spaces. De Gruyter, Berlin (2012). https://doi.org/10.1515/9783110255720
Clason, C., Klassen, A.: Quasi-solution of linear inverse problems in non-reflexive Banach spaces. J. Inverse Ill-Posed Probl. 26(5), 689–702 (2018). https://doi.org/10.1515/jiip-2018-0026
Iusem, A.N., Resmerita, E.: A proximal point method in nonreflexive Banach spaces. Set-Valued Var. Anal. 18(1), 109–120 (2010). https://doi.org/10.1007/s11228-009-0126-z
Kaltenbacher, B., de Souza, M.L.P.: Convergence and adaptive discretization of the IRGNM Tikhonov and the IRGNM Ivanov method under a tangential cone condition in Banach space. Numer. Math. 140(2), 449–478 (2018). https://doi.org/10.1007/s00211-018-0971-5
Kaltenbacher, B., Klassen, A.: On convergence and convergence rates for Ivanov and Morozov regularization and application to some parameter identification problems in elliptic PDEs. Inverse Probl. 34(5), 055008 (2018). https://doi.org/10.1088/1361-6420/aab739
Kaltenbacher, B., Klassen, A., Previatti de Souza, M.L.: The Ivanov regularized Gauss-Newton method in Banach space with an a posteriori choice of the regularization radius. J. Inverse Ill-Posed Probl. 27(4), 539–557 (2019). https://doi.org/10.1515/jiip-2018-0093
Schuster, T., Rieder, A., Schöpfer, F.: The approximate inverse in action: IV. Semi-discrete equations in a Banach space setting. Inverse Probl. 28(10), 104001 (2012). https://doi.org/10.1088/0266-5611/28/10/104001
Scherzer, O.: The use of Morozov’s discrepancy principle for Tikhonov regularization for solving nonlinear ill-posed problems. Computing 51(1), 45–60 (1993). https://doi.org/10.1007/BF02243828
Hanke, M., Neubauer, A., Scherzer, O.: A convergence analysis of the Landweber iteration for nonlinear ill-posed problems. Numer. Math. 72(1), 21–37 (1995). https://doi.org/10.1007/s002110050158
Kaltenbacher, B., Nguyen, T.T.N., Scherzer, O.: The tangential cone condition for some coefficient identification model problems in parabolic PDEs. In: Kaltenbacher, B., Schuster, T., Wald, A. (eds.) Time-dependent Problems in Imaging and Parameter Identification, pp. 121–163. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-57784-1_5
Lechleiter, A., Rieder, A.: Towards a general convergence theory for inexact Newton regularizations. Numer. Math. 114(3), 521–548 (2010). https://doi.org/10.1007/s00211-009-0256-0
Zeidler, E.: Nonlinear Functional Analysis and Its Applications I: Fixed-Point Theorems. Springer, New York (1986)
Somersalo, E., Cheney, M., Isaacson, D.: Existence and uniqueness for electrode models for electric current computed tomography. SIAM J. Appl. Math. 52(4), 1023–1040 (1992). https://doi.org/10.1137/0152060
Lechleiter, A., Rieder, A.: Newton regularizations for impedance tomography: convergence by local injectivity. Inverse Probl. 24(6), 065009 (2008). https://doi.org/10.1088/0266-5611/24/6/065009
Zeltmann, U.: The viscoelastic seismic model: existence, uniqueness and differentiability with respect to parameters. Ph.D. thesis, Karlsruhe Institute of Technology (2018). https://doi.org/10.5445/IR/1000093989
Bohlen, T.: Viskoelastische FD-Modellierung seismischer Wellen zur Interpretation gemessener Seismogramme. Ph.D. thesis, Christian-Albrechts-Universität zu Kiel (1998). https://bit.ly/2LM0SWr
Bohlen, T.: Parallel 3-D viscoelastic finite difference seismic modelling. Comput. Geosci. 28(8), 887–899 (2002)
Kirsch, A., Rieder, A.: Inverse problems for abstract evolution equations II: higher order differentiability for viscoelasticity. SIAM J. Appl. Math. 79(6), 2639–2662 (2019). https://doi.org/10.1137/19M1269403. Corrigendum under https://doi.org/10.48550/arXiv.2203.01309
Kirsch, A., Rieder, A.: Inverse problems for abstract evolution equations with applications in electrodynamics and elasticity. Inverse Probl. 32(8), 085001 (2016). https://doi.org/10.1088/0266-5611/32/8/085001
Triebel, H.: Theory of Function Spaces. II. Monographs in Mathematics, vol. 84. Birkhäuser Verlag, Basel (1992). https://doi.org/10.1007/978-3-0346-0419-2
Rychkov, V.S.: On restrictions and extensions of the Besov and Triebel–Lizorkin spaces with respect to Lipschitz domains. J. Lond. Math. Soc. 60(1), 237–257 (1999). https://doi.org/10.1112/S0024610799007723
Binev, P., Cohen, A., Dahmen, W., DeVore, R., Temlyakov, V.: Universal algorithms for learning theory I. Piecewise constant functions. J. Mach. Learn. Res. 6, 1297–1321 (2005)
Evans, L.C.: Partial Differential Equations. Graduate Studies in Mathematics, vol. 19, 2nd edn. American Mathematical Society, Providence (2010). https://doi.org/10.1090/gsm/019
Meyers, N.G.: An \(L^{p}\)-estimate for the gradient of solutions of second order elliptic divergence equations. Ann. Sc. Norm. Sup. Pisa Cl. Sci. (3) 17, 189–206 (1963)
Ibe, O.C.: Markov Processes for Stochastic Modeling, 2nd edn. Elsevier, Amsterdam (2013). https://doi.org/10.1016/B978-0-12-407795-9.00001-3
Messud, J., Poncet, R., Lambaré, G.: Optimal transport in full-waveform inversion: analysis and practice of the multidimensional Kantorovich–Rubinstein norm. Inverse Prob. 37(6), 065012 (2021). https://doi.org/10.1088/1361-6420/abfb4c
Métivier, L., Brossier, R., Mérigot, Q., Oudet, E., Virieux, J.: An optimal transport approach for seismic tomography: application to 3D full waveform inversion. Inverse Probl. 32(11), 115008 (2016). https://doi.org/10.1088/0266-5611/32/11/115008
Sun, B., Alkhalifah, T.: Ml-misfit: learning a robust misfit function for full-waveform inversion using machine learning. In: Conference Proceedings, EAGE 2020 Annual Conference & Exhibition Online, 1–5 (2020). https://doi.org/10.3997/2214-4609.202010466
Schumaker, L.L.: Spline Functions: Basic Theory. Cambridge Mathematical Library, 3rd edn. Cambridge University Press, Cambridge (2007). https://doi.org/10.1017/CBO9780511618994
Cohen, A., Daubechies, I., Feauveau, J.-C.: Biorthogonal bases of compactly supported wavelets. Commun. Pure Appl. Math. 45(5), 485–560 (1992). https://doi.org/10.1002/cpa.3160450502
Oswald, P.: Multilevel Finite Element Approximation. Teubner Skripten zur Numerik [Teubner Scripts on Numerical Mathematics]. B. G. Teubner, Stuttgart (1994). https://doi.org/10.1007/978-3-322-91215-2
Acknowledgements
This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—Project-ID 258734477 - SFB 1173. The authors thank Christian Rheinbay for insightful discussions.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Author information
Authors and Affiliations
Corresponding authors
Ethics declarations
Conflict of interest
We hereby confirm that we do not have any conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A A family of admissible subspaces
In this appendix we give a concrete construction for a family \(\{X^n\}_{n\in \mathbb {N}}\) of subspaces of \(L^\infty (D)^\ell \) which satisfies our assumptions (S1)–(S3) of Sect. 2 for D an open and bounded subset of \(\mathbb {R}^d\). Using Cartesian products in case of \(\ell >1\), cf. (30), we restrict our attention to \(\ell =1\) here.
1.1 A.1 The discrete subspaces used in this work
We will rely on the cardinal B-spline \(\varphi _N:\mathbb {R}\rightarrow \mathbb {R}\) of order \(N\in \mathbb {N}\) which is recursively defined by
It obeys the scaling relation
Further properties are
-
\(\displaystyle \text {supp} \varphi _N=[0,N], \quad \varphi _N\vert _{]0,N[}>0,\quad \varphi _N\in \mathcal {C}^{N-2}\),
-
for each \(k\in \mathbb {Z}\), \(\varphi _N\vert _{[k,k+1]}\) is a polynomial of degree \(N-1\),
-
for all \(t\in \mathbb {R}\),
$$\begin{aligned} 1=\sum _{m\in \mathbb {Z}} \varphi _N(t-m), \end{aligned}$$(A2)
see, e.g., [37].
Using the tensor product B-Spline \(\Phi ( x):=\prod _{i=1}^d\varphi _N(x_i)\), \( x=(x_1,\ldots ,x_d)^\top \in \mathbb {R}^d\), and the notation \(\Phi _{n, k}( x)=2^{nd/2}\,\Phi (2^n x- k)\), \(n\in \mathbb {N}\), \( k\in \mathbb {Z}^d\), we define
with
These spaces are nested due to (A1), so that (S1) holds. Note that \(\cup _{ k\in \mathcal {I}_n} \text {supp} \Phi _{n, k}\subset \overline{D}\) which is a proper inclusion in general.
Next we demonstrate (S2). To this end we set
where \(\widetilde{\Phi }\) is a compactly supported dual function to \(\Phi \) satisfying the biorthogonality
The existence of such functions has been shown in [38]. The biorthogonality yields \(\mathcal {P}_N^n \Phi _{n, k} =\Phi _{n, k}\), for all \( k\in \mathcal {I}_n\). Hence, the required projection property holds: \(\mathcal {P}_N^nu=u\) for all \(u\in X_N^n(D)\). We proceed with
Since
we have established (S2) with \(C_{\mathcal {P}}\le \Vert \widetilde{\Phi }\Vert _{L^1(\mathbb {R}^d)}\). Observe that \(\Vert \widetilde{\Phi }\Vert _{L^1(\mathbb {R}^d)}\ge 1\) as \(\widetilde{\Phi }\) has mean value 1 just as \(\Phi \).
It remains to validate (S3). Let \(u\in X_N^n(D)\) with \(\Vert u\Vert _{L^\infty (D)}=1\). Then,
for
This minimum is non-zero and exists as M is compact in the finite dimensional space \(X_N^n(D)\). Since \(\delta _q=\Vert u_q\Vert _{L^q(D)}\) for one \(u_q\in M\) and as \(\delta _q \rightarrow 1\) for \(q\rightarrow \infty \) (see below), we find a q with \(\delta _q\ge 1/C_\infty \) for any fixed \(C_\infty >1\). Hence, \(1\le C_\infty \,\Vert u\Vert _{L^q(D)}\) for all \(u\in M\) and (S3) follows by the homogeneity of norms.
We finish with proving \(\lim _{q\rightarrow \infty } \delta _q=1\). Obviously, \(\delta _q=\Vert u_q\Vert _{L^q(D)}\le \text {vol}_d(D)^{1/q}\rightarrow 1\) as \(q\rightarrow \infty \). Therefore \(\{\delta _q\}\) is bounded and admits a convergent subsequence, say, \(\lim _{i\rightarrow \infty }\delta _{q_i}=\delta ^*\le 1\). For each q let \( x_q^*\in D\) with \(\vert u_q( x_q^*)\vert =1\). If \(N=1\), \(u_q\) must be equal to unity in a whole cube of length \(2^{-n}\) as a subset of D containing \(x_q^*\). So we can estimate
as \(q\rightarrow \infty \) which proves the assertion in this case. If \(N>1\), we can still find for any \(\varepsilon >0\) sufficiently small a \(\delta >0\) such that \(\vert u_q\vert _{B_\delta ( x_q^*)\cap D}\vert \ge 1-\varepsilon \) for all q. This follows by uniform equicontinuity ensured by the Arzelà-Ascoli theorem since M is compact in \(\mathcal {C}(\overline{D})\) as a bounded, closed, and finite dimensional set. Further, we have that
for all q. This follows by the more general observation that the union over \(m\in \mathbb {N}\) of
is an open cover for \(\overline{D}\), so we can find a minimal m such that \(\overline{D}\subset V_m\) by compactness of \(\overline{D}\). Altogether, we can again deduce a lower bound of the form
as \(q\rightarrow \infty \). We conclude \(\delta ^*=1\) since \(\varepsilon >0\) can be chosen as small as we wish. Finally, any subsequence of \(\{\delta _q\}\) contains a subsequence which converges to 1. So, the whole sequence must converge to 1, see, e.g., [20, Prop. 10.13(2)].
1.2 A.2 A different approach
The functions of \(X_N^n(D)\) from the above construction vanish on the set \(\Delta =D\backslash \cup _{ k\in \mathcal {I}_n} \text {supp} \Phi _{n, k}\) which is non-empty in general. Thus, \(\mathcal {P}_N^nu\) is a poor approximation of u near to the boundary of D in general: \(\mathcal {P}_N^nu\vert _\Delta =0\). Here we sketch an alternative approach to overcome this drawback. We emphasize that this approach has not been relied on in the previous sections as it requires the verification of an additional assumption on \(F'\), see text below following (A7).
Basically, we extend the preimage-space of the map F of Sect. 2 while keeping all its necessary properties to carry over Theorems 2.1 and 2.3 to the extension. Let \(\widetilde{D}\) be an open superset of \(\overline{D}\): \(\overline{D}\subset \widetilde{D}\). We will need two operators: the extension operator \(E:L^\infty (D)\rightarrow L^\infty (\widetilde{D})\), which extends a function by zero, and the restriction operator \(R:L^\infty (\widetilde{D}) \rightarrow L^\infty (D)\), which multiplies a function by the indicator \(\mathbb {1}_D\).
We define \(\widetilde{F}:\mathcal {D}(\widetilde{F})\subset L^\infty (\widetilde{D})\rightarrow Y\) by \(\mathcal {D}(\widetilde{F})=E \mathcal {D}({F})\) and \(\widetilde{F}(\widetilde{u})=F(R\widetilde{u})\). This \(\widetilde{F}\) is Fréchet-differentiable just like F. Moreover, the TCC holds in \(B_r(Eu^+)\subset \mathcal {D}(\widetilde{F})\). Indeed, let \(\widetilde{u}\in B_r(Eu^+)\) then \(\Vert R\widetilde{u}-u^+\Vert _{L^\infty (D)}=\Vert R\widetilde{u}-REu^+\Vert _{L^\infty (D)}\le \Vert \widetilde{u}-Eu^+\Vert _{L^\infty (\widetilde{D})}\le r\), that is, \(R\widetilde{u}\in B_r(u^+)\subset \mathcal {D}({F})\). Thus, for \(u,\widetilde{u}\in B_r(Eu^+)\), we get
Further, (6) is also satisfied by \(\widetilde{F}\) as \(R^*=E\).
For this \(\widetilde{F}\) we can define spaces \(X^n=X_N^n(\widetilde{D})\) as in Appendix A.1 but with respect to \(\widetilde{D}\) rather than D. Now, the union of the supports of \(\Phi _{n, k}\), for \( k\in \mathcal {I}_n(\widetilde{D})\) covers D when n is large enough. Unfortunately, condition (5) does not transfer immediately to the new construction. One must prove it for the concrete case, for example, as follows: for \(\widetilde{u}\in \mathcal {D}(\widetilde{F})\) and \(\widehat{u}\in L^\infty (\widetilde{D})\) we have that
where \(\mathcal {P}_N^n:L^\infty (D)\rightarrow X_N^n(D)\) and \(\widetilde{\mathcal {P}}_N^n :L^\infty (\widetilde{D})\rightarrow X_N^n(\widetilde{D})\) are the corresponding projection operators in accordance with (S2). The left norm on the right hand side of (A7) tends to 0 for \(n\rightarrow \infty \) by (5). The right norm converges to 0, for instance, if \(F'(u):L^\infty (D) \rightarrow Y\) is weak-\(\star \) continuous for all \(u\in \mathcal {D}(F)\): both sequences \(\{\mathcal {P}_N^nR\widehat{u}\}_n\) and \(\{R\widetilde{\mathcal {P}}_N^n \widehat{u}\}_n\) converge to \(R\widehat{u}\) pointwise a.e. This convergence can be verified by standard arguments, see e.g., [37, Chap. 12.3] and [39, Chap. 2]. Further, both sequences are uniformly bounded due to (S2). Hence, \(\mathcal {P}_N^nR\widehat{u}-R\widetilde{\mathcal {P}}_N^n \widehat{u}\rightarrow 0 \text{ weakly- }\star \).
Appendix B An approximation result
Proposition B.1
For \(u\in L^{\infty }(D)\) and \(\{\mathcal {P}^n\}_n\) as in (A5) we have that \(\mathcal {P}^nu\rightarrow u\) in \(L^q(D)\) for all \(q<\infty \).
Proof
Let \(\Box \) be a rectangular superset of D and \(\widetilde{\Box }\) a superset of \(\Box \). Further, extend \(u\) by zero outside of D. The convergence results of Section 12.3 from [37] yield that \(\widetilde{\mathcal {P}}^nu\rightarrow u\) in \(L^q(\Box )\) for any \(q<\infty \) where \(\widetilde{\mathcal {P}}^n\) is defined as in (A5), however, with respect to \(\mathcal {I}_n(\widetilde{\Box })\) and \(X^n(\widetilde{\Box })\). Hence, for any \(x\in D\) we have that \(\widetilde{\mathcal {P}}^nu(x)=\mathcal {P}^nu(x)\) for n large enough such that \(x\in D_n\), where we set
in particular \(\text {vol}_d\left( D\backslash D_n\right) \rightarrow 0\). Because of \(\Vert \mathcal {P}^nu\Vert _{L^\infty (D)}\le C_{\mathcal {P}}\Vert u\Vert _{L^\infty (D)}\) by (S2), we can estimate
and the assertion follows. \(\square \)
Appendix C On the tangential cone condition for the operator in (46)
Here we argue that the semi-discrete non-linear operator \(F_{n_{\max }}\) defined in (46) satisfies the TCC (3).
The underlying abstract system is (25) with the concrete settings (32), \(u_0=0\), and (33) where \(D=(0,1)\), \(T=1\), that is, \(d=1\) and \(u=(p,v)\in L^2([0,1],H)\), \(H= L^2(0,1)^2\). Thus, we obtain the acoustic system (42) which has a unique classical solution under \((\rho ,\nu )\in \mathcal {D}(F_{n_{\max }})\) and for the sources (43). In view of Lemma 3.5, \(F_{n_{\max }}\) is Fréchet-differentiable and we have \(F_{n_{\max }}'(\rho ,\nu )(\widehat{\rho },\widehat{\nu })=(\overline{p},\overline{v})\) where \((\overline{p},\overline{v})\) weakly solves
In a first step we validate injectivity of \(F_{n_{\max }}'(\rho ,\nu )\) for any \((\rho ,\nu )\in \mathcal {D}(F_{n_{\max }})\). To this end, assume \(F_{n_{\max }}'(\rho ,\nu )(\widehat{\rho },\widehat{\nu })=(0,0)\). From (C8) we get
Assume \(0\ne \widehat{\rho }\in X^{n_{\max }}\). Then, there is a non-empty interval [a, b], \(a=2^{-n_{\max }}k\), \(b=2^{-n_{\max }}(k+1)\), \(k\in \mathbb {N}_0\), where \(\widehat{\rho }\) does not vanish. Hence, \(\partial _t p=0\) in \([0,1]\times [a,b]\). By the first equation in (42), \(-\partial _x v = f_1\) in \([0,1]\times [a,b]\), that is,
Recalling the zero initial value \(v(0,\cdot )=0\) we get the contradiction \(0=\int _a^x f_1(0,y)\textrm{d}y<0\) for \(x\in [a,b]\) according to (43). So, \(\widehat{\rho }= 0\) in (0, 1). One argues analogously to validate \(\widehat{\nu }= 0\) in (0, 1). Hence, \(F_{n_{\max }}'(\rho ,\nu )\) is one-to-one which implies the TCC at any interior point of \(\mathcal {D}(F_{n_{\max }})\) due to Lemma C.1 of [6].
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Pieronek, L., Rieder, A. On the iterative regularization of non-linear illposed problems in \(L^{\infty }\). Numer. Math. 154, 209–247 (2023). https://doi.org/10.1007/s00211-023-01359-7
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00211-023-01359-7