On the iterative regularization of non-linear illposed problems in $$L^{\infty }$$

Pieronek, Lukas; Rieder, Andreas

doi:10.1007/s00211-023-01359-7

On the iterative regularization of non-linear illposed problems in $L^{\infty }$

Open access
Published: 22 June 2023

Volume 154, pages 209–247, (2023)
Cite this article

Download PDF

You have full access to this open access article

Numerische Mathematik Aims and scope Submit manuscript

On the iterative regularization of non-linear illposed problems in $L^{\infty }$

Download PDF

1412 Accesses
1 Citation
Explore all metrics

Abstract

Parameter identification tasks for partial differential equations are non-linear illposed problems where the parameters are typically assumed to be in $L^\infty $. This Banach space is non-smooth, non-reflexive and non-separable and requires therefore a more sophisticated regularization treatment than the more regular $L^p$-spaces with $1<p<\infty $. We propose a novel inexact Newton-like iterative solver where the Newton update is an approximate minimizer of a smoothed Tikhonov functional over a finite-dimensional space whose dimension increases as the iteration progresses. In this way, all iterates stay bounded in $L^\infty $ and the regularizer, delivered by a discrepancy principle, converges weakly-$\star $ to a solution when the noise level decreases to zero. Our theoretical results are demonstrated by numerical experiments based on the acoustic wave equation in one spatial dimension. This model problem satisfies all assumptions from our theoretical analysis.

Regularization Methods for Ill-Posed Problems

An iterative technique for solving singularly perturbed parabolic PDE

Article 18 January 2015

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

We consider the numerical solution of non-linear illposed and inverse problems where the underlying non-linearity F maps from a possibly multi-component version of $L^\infty $ into a normed space Y. This scenario appears quite naturally in many parameter identification tasks for partial differential equations.

For instance, in electrical impedance tomography (EIT) one wants to identify the conductivity distribution in a body. To this end, electric currents are applied through electrodes and the resulting voltages are measured. In the quasi-static regime, an adapted version of the Laplace equation connects the currents with the voltages, see, e.g., [1]. Another application is full waveform inversion (FWI), the most advanced inversion technique in seismic imaging, see, e.g., [2, 3]. Depending on the used mathematical model for wave propagation (acoustic, elastic, or visco-elastic regime) the searched-for parameters include bulk density, pressure and shear wave velocities, and corresponding relaxation times. Here, a wave field is initiated by a source (explosion or earthquake) and the parts of the wave field that are reflected from the Earth’s internal structure are then picked up by receivers on the Earth’s surface or by hydrophones in the ocean.

From an abstract point of view, in both examples we have to solve an equation

$$\begin{aligned} \text {find } u\in \mathcal {D}(F)\text { such that } F(u)\approx y^\delta \end{aligned}$$

(1)

where the non-linear operator

$$\begin{aligned} F:\mathcal {D}(F)\subset L^\infty (D)^\ell \rightarrow Y \end{aligned}$$

(2)

maps $\ell $ parameter functions u located on some domain of interest D to the respective measurements. Further, $y^\delta $ are the (noisy) measurements satisfying $\Vert y^\delta -F(u^+)\Vert _Y\le \delta $ for one $u^+\in \mathcal {D}(F)$.

Newton-like regularization schemes are well-established iterations for getting a meaningful approximate solution of non-linear inverse problems. In this work we explore the Newton-like solver REGINN $^\infty $ which extends REGINN of [4, 5] to a non-linear inverse problem with generic operators F as in (2). Here, F is required to fulfill a few specific properties which are satisfied by EIT and, except for one, also by FWI in all regimes. This not yet verified property of FWI is a structural assumption known as tangential cone condition (TCC), see (3) below and consult [6] for a first promising result. Apart from establishing the non-linearity constraint, the main challenge about regularization in $L^{\infty }$ is its non-reflexive nature. As this space is further non-separable, convergence of discretization schemes in the strong topology, which is desirable for practical implementations, cannot be achieved, see [7].

In this work we will exploit the weak-$\star $ topology and semi-discrete approximations to F based on a family $\{X^n\}_n$ of finite-dimensional nested subspaces of $L^\infty (D)^\ell $. Such an ansatz is very close to an implementation and addresses discretization errors directly in the theory. Apart from minor mathematical restrictions which will be specified below, the subspaces will also be held quite generic and may reflect features of the reconstructions one wants to focus on, e.g. using piecewise constant functions to model sharp interfaces. By the well-known limit

$$\begin{aligned} \Vert u\Vert _{L^\infty (D)^\ell }=\lim _{q\rightarrow \infty }\Vert u\Vert _{L^q(D)^\ell } \quad \text { for all } u\in L^\infty (D)^\ell , \end{aligned}$$

for bounded D, $X^n$ can be equivalently furnished with the $L^{q_n}(D)^\ell $-topology for properly chosen $q_n<\infty $ such that $q_n\rightarrow \infty $ as $n\rightarrow \infty $. This approach then implies that the Newton update for the n-th iterate can be obtained as an approximate minimizer of a smooth (provided Y is smooth) and convex Tikhonov functional over $X^n$. As a result, the underlying minimizing procedure can be implemented for general $\{X^n\}_n$ with classical techniques from smooth and unconstrained optimization.

There is a wealth of literature on the analysis of Newton-like regularization schemes, mainly in a Hilbert space but meanwhile also in a Banach space setting; we refer only to the monographs [8, 9] for a first reading. However, most of the Banach space methods are formulated in abstract spaces requiring smoothness and reflexivity at least as they rely heavily on duality mappings to mimic the Riesz isomorphism. To the best of our knowledge, regularization schemes applicable to non-reflexive spaces are only considered in [7, 10,11,12,13,14,15]. The first six of these publications consider iterative schemes on the basis of proximal point methods, Morozov, Ivanov, or Tikhonov regularization, respectively. We will compare our scheme with the somewhat similar IRGNM-Tikhonov method of [12] in Remark 2.6 below.

Our material is organized as follows: In Sect. 2 we introduce and analyze two versions of REGINN $^\infty $ which differ in what information about the smoothness of the ground truth is available a priori. Under reasonable assumptions both algorithms are well defined and terminate with a regularized solution $u_{M_\delta }\in \mathcal {D}(F)$ of (1). Further, we prove the regularization property, that is, weak-$\star $ convergence of $\{u_{M_\delta }\}_{\delta >0}$ to an exact solution of $F(\cdot )=F(u^+)$ as the noise level $\delta $ tends to zero. Our hypotheses are reasonable in fact as they are met by EIT as well as FWI with exception of the TCC (Sect. 3) as mentioned above. For FWI in the acoustic regime we are even able to validate the stronger assumption about the ground truth which enters the second version of REGINN $^\infty $. Finally, Sect. 4 contains some experiments to identify two parameters (density and pressure wave velocity) in the acoustic wave equation. Here, all our assumptions including TCC in a semi-discrete setting actually hold. Some technical details that would otherwise interrupt the flow of reading have been moved to three appendices.

2 REGINN $^{\infty }$

For some bounded domain $D\subset \mathbb {R}^d$ let $F:\mathcal {D}(F)\subset L^{\infty }(D)^\ell \rightarrow Y$ be Fréchet-differentiable and satisfy the tangential cone condition (TCC) at $u^+$, i.e., there are a positive constant $\omega <1$ and a ball $B_{r}(u^+)\subset \mathcal {D}(F)$ with radius $r>0$ such that

$$\begin{aligned}{} & {} \Vert F(u)-F(\widetilde{u})-F'(\widetilde{u})(u-\widetilde{u})\Vert _{Y}\le \omega \Vert F(u)-F(\widetilde{u})\Vert _{Y}\nonumber \\{} & {} \quad \text { for all } u,\widetilde{u}\in B_{r}(u^+). \end{aligned}$$

(3)

Here, $F':\mathcal {D}(F)\subset L^{\infty }(D)^\ell \rightarrow \mathcal {L}\big (L^\infty (D)^\ell ,Y\big )$ denotes the Fréchet-derivative of F and $L^\infty (D)^\ell $ is endowed with $\Vert \cdot \Vert ^2_{L^{\infty }(D)^\ell }$ where

$$\begin{aligned} \Vert u\Vert ^2_{L^{q}(D)^\ell }=\Vert (u_1,\ldots ,u_{\ell })\Vert ^2_{L^{q}(D)^\ell }:=\sum _{j=1}^{\ell }\Vert u_i\Vert ^2_{L^{q}(D)}, \quad q\in [1,\infty ]. \end{aligned}$$

The TCC, introduced in [16], is a well-established and widely used assumption in the convergence analysis of a variety of iterative regularization schemes for non-linear illposed problems; we only refer to the monographs [8, 9]. Nevertheless, only a few academic examples are reported in the infinite-dimensional setting where it holds, see, e.g., [8, 17, 18].

For $\omega <1/2$ we can equivalently restate the TCC as

$$\begin{aligned} \Vert F(u)-F(\widetilde{u})-F'(\widetilde{u})(u-\widetilde{u})\Vert _{Y}\le L \Vert F'(\widetilde{u})(u-\widetilde{u})\Vert _{Y}, \end{aligned}$$

(4)

with $L=\omega /(1-\omega )<1$. Since our method will be explicitly based on discretizing $L^\infty (D)^\ell $, we impose the following assumptions on corresponding spaces $X^n$:

(S1)
$\{X^n\}_{n\in \mathbb {N}}$ is a sequence of nested subspaces of $L^\infty (D)^\ell $, i.e., $X^n\subset X^{n+1}\subset L^\infty (D)^\ell $ for all $n\in \mathbb {N}$.
(S2)
For each $X^n$ there exists a linear projection operator $\mathcal {P}^n:L^\infty (D)^\ell \rightarrow X^n$, that is, $\mathcal {P}^nu=u$ for all $u\in X^n$, satisfying $\Vert \mathcal {P}^nu\Vert _{L^\infty (D)^\ell }\le C_{\mathcal {P}}\Vert u\Vert _{L^\infty (D)^\ell }$ where the constant $C_{\mathcal {P}}\ge 1$ is independent of n.
(S3)
For any fixed $C_{\infty }>1$ we can find a positive increasing sequence $\{q_n\}_{n\in \mathbb {N}}$ such that
$$\begin{aligned}\Vert u\Vert _{L^\infty (D)^\ell }\le C_{\infty }\Vert u\Vert _{L^{q_n}(D)^\ell }\quad \text {for all }u\in X^n. \end{aligned}$$

Note that the $L^{\infty }(D)^\ell $-norm is always stronger than the $L^{q_n}(D)^\ell $-norm, hence the magnitude of $C_{\infty }>1$ in (S3) determines how tight the norm equivalence of $\Vert \cdot \Vert _{L^\infty (D)^\ell }$ and $\Vert \cdot \Vert _{L^{q_n}(D)^\ell }$ is, restricted to $X^n$. Thus, having set $\{q_n\}_{n\in \mathbb {N}}$ once for a fixed feasible subspace sequence $\{X^n\}_{n\in \mathbb {N}}$ and some $C_\infty >1$, we can easily switch between non-smooth and smooth norms. We emphasize that $C_\infty $ is independent of n. A family $\{X^n\}_{n\in \mathbb {N}}$ enjoying (S1)–(S3) is constructed in Appendix A.1 on the basis of tensor product B-splines (see Appendix A.2 for a possible other construction, which is, however, not considered further in this work).

To ensure that our discretizations do indeed approximate the actual inverse problem in $L^\infty (D)^\ell $ sufficiently well, we require a compatibility condition for F of the form

$$\begin{aligned} \liminf _{n\rightarrow \infty }\Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\Vert _{Y}=0 \end{aligned}$$

(5)

for all $u\in \mathcal {D}(F)$ and all $\widehat{u}\in L^\infty (D)^\ell $. This extra condition is necessary since, in general, one cannot expect $\lim _{n\rightarrow \infty }\mathcal {P}^n\widehat{u}= \widehat{u}$ in $L^\infty (D)^\ell $ strongly (which would yield (5) by continuity of $F'$) because the union of $X^n$ is countable while $L^\infty (D)^\ell $ is not separable. However, as a consequence, the closure of the range of $F'(u)$ must be separable, in contrast to Y itself, for all $u\in \mathcal {D}(F)$.

Additionally, to prove the regularization property of our scheme (Algorithm 1), we will require either the weak-$\star $ closedness of F or the range inclusion

$$\begin{aligned} \mathcal {R}(F'(u^+)^{*})\subset L^1(D)^\ell \end{aligned}$$

(6)

where $F'(u^+)^{*}:Y^*\rightarrow \big (L^\infty (D)^*\big )^\ell $ denotes the adjoint operator. In Sects. 3.1 and 3.2 below we will verify (5) as well as (6) for the forward maps belonging to EIT and FWI, respectively.

As motivated in the introduction, the guideline for designing our regularization algorithm is to generate uniformly bounded iterates $u_m$ in $L^\infty (D)^\ell $ which give sufficiently small residuals

$$\begin{aligned} b^{\delta }_m:=y^\delta -F(u_m), \end{aligned}$$

(7)

where $y^\delta \in Y$ is the perturbed datum at hand. We build on an inexact-Newton framework and find the updates from the linearization approximately via Tikhonov regularization linked to $X^n$ as follows: Given an initial guess $u_0\in \mathcal {D}(F)$ and an initial space $X^{n_0}$, we iterate

$$\begin{aligned} u_{m+1}=u_m+ s_m,\quad m=0,1,2,\ldots , \end{aligned}$$

(8)

where $n_m\in \mathbb {N}$ with $n_m\ge n_{m-1}$ and $s_m\in X^{n_m}$ are chosen such that

$$\begin{aligned} J_{n_m,m}(s_m)\le \mu _m^2 \Vert b^{\delta }_m\Vert ^2_{Y}. \end{aligned}$$

(9)

Here, the Tikhonov functional $J_{n,m}:X^n\rightarrow [0,\infty )$ is

$$\begin{aligned} J_{n,m}(s):=\Vert F'(u_m)s -b^{\delta }_m\Vert ^2_{Y}+\alpha _m\Vert s+(u_m-u_0)\Vert _{L^{q_n}(D)^\ell }^2 \end{aligned}$$

(10)

with penalty parameters set a posteriori by

$$\begin{aligned} \alpha _m=\frac{\Vert b^\delta _m\Vert ^2_Y}{\gamma ^2}. \end{aligned}$$

(11)

We recall from assumption (S3) that an approximate minimizer of (10) is also one of the corresponding Tikhonov functional with $\Vert \cdot \Vert _{L^\infty (D)^\ell }$-penalty term and vice versa. However, given the sequence $\{q_n\}_{n\in \mathbb {N}}$ for some $C_\infty >1$ and a subspace sequence $\{X^n\}_{n\in \mathbb {N}}$, we have more flexibility in choosing a numerical method to find $s_m$ in (9) via the smoothed version.

So far, the parameters $\gamma ,\{\mu _m\}_{m\in \mathbb {N}}$ are restricted to fulfill $0<\mu _m<1$ and $\gamma \ne 0$. While $\mu _m$ serves as a stopping criterion in the spirit of an inexact Newton condition to set the m-th update $s_m$, $\gamma $ will be responsible for keeping the resulting iterate $u_{m+1}$ sufficiently close to $u^+$: the larger $\gamma $ is chosen, the better the initial guess has to be. The effect of the initial discretization level $n_0$ on the iterates is demonstrated in Section 4 by numerical experiments. We stop the Newton iteration (8) by a discrepancy principle with constant $\tau >1$. The resulting inversion scheme REGINN $^{\infty }$ is summarized in Algorithm 1. It is well defined under reasonable assumptions according to the following theorem. Its regularization property is then specified in Theorem 2.3.

Theorem 2.1

(Well-definedness and termination of REGINN $^{\infty }$) Let $F:\mathcal {D}(F)\subset L^\infty (D)^\ell \rightarrow Y$ be as above satisfying (3) with $\omega <1/3$ in $B_{r}(u^+)\subset {\text {int}}(\mathcal {D}(F))$ and (5), where $\{X^n\}_{n\in \mathbb {N}}$ and $\{\mathcal {P}^n\}_{n\in \mathbb {N}}$ fulfill assumptions (S1)–(S3). Let $y^\delta $ be given such that $\Vert F(u^+)-y^\delta \Vert _{Y}\le \delta $ for one $\delta >0$. Let $\Lambda \in (\frac{2\omega }{1-\omega },1)$ and set

$$\begin{aligned} \mu _{\max }:=(1-\omega )\Lambda -\omega . \end{aligned}$$

For

$$\begin{aligned} \gamma \in \left( 0,\frac{r}{\sqrt{C_{\infty }}\mu _{\max }}\right) \end{aligned}$$

and

$$\begin{aligned} r_0\in \left( 0,\min \left\{ r-\sqrt{C_{\infty }}\mu _{\max }\gamma ,\frac{\gamma }{C_{\mathcal {P}}}\sqrt{\mu ^2_{\max }-\omega ^2}\right\} \right) \end{aligned}$$

choose

$$\begin{aligned} \tau >\frac{1+\omega }{\sqrt{\mu ^2_{\max }-C_{\mathcal {P}}^2r_0^2/\gamma ^2}-\omega }. \end{aligned}$$

Further, define

$$\begin{aligned} \mu _{\min }:=\sqrt{\left( \omega +\frac{1+\omega }{\tau }\right) ^2+C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}}. \end{aligned}$$

Restrict all tolerances $\{\mu _m\}$ to $(\mu _{\min },\mu _{\max })$ and start with some arbitrary $n_0\in \mathbb {N}$ and $u_0\in B_{r_0}(u^+)$. Then, there exists an $M_{\delta }\in \mathbb {N}$ such that all iterates $\{u_1,\ldots ,u_{M_{\delta }}\}$ of REGINN $^{\infty }$ are well-defined and stay in $B_r(u^+)$. Moreover, $\Vert b_{m+1}^{\delta }\Vert _{Y}\le \Lambda \Vert b_m^{\delta }\Vert _{Y}$ for $m=0,\ldots ,M_{\delta }-1$, $\Vert b^\delta _{M_{\delta }}\Vert _{Y}\le \tau \delta $, and $M_{\delta }=\mathcal {O} (\vert \log \delta \vert )$ as $\delta \searrow 0$.

Proof

Before we begin with the proof we discuss our assumptions on the parameters. First, observe that the open interval for choosing $\Lambda $ is non-empty by $\omega <1/3$. The lower bound for $\Lambda $ guarantees that $\mu _{\max }>\omega $. Together with the upper bound on $\gamma $ this yields a positive upper bound for $r_0$. Further, the radicand and the denominator of the lower bound for $\tau $ are positive. Finally, $\mu _{\min } <\mu _{\max }$.

We use an inductive argument and assume therefore that $\Vert b_m^{\delta }\Vert _{Y}\le \Lambda ^m \Vert b_0^{\delta }\Vert _{Y}$ as well as $\Vert u_i-u^+\Vert _{L^\infty (D)^\ell }< r$ for $i\le m$, which holds in particular for $m=0$ because of $\Vert u_0-u^+\Vert _{L^\infty (D)^\ell }<r_0<r$. If $\Vert b^{\delta }_m\Vert _{Y}\le \tau \delta $, REGINN $^{\infty }$ stops with $u_{M_{\delta }}:=u_m$ and nothing else needs to be shown. Otherwise, $\Vert b^{\delta }_m\Vert _{Y}> \tau \delta $ and we next show that a Newton update is well defined by (9). Let $s_{n,m}:=\arg \min _{s\in X^n} J_{n,m}(s)$ which exists as the unique minimizer of a strictly convex functional over a finite dimensional space. Then,

$$\begin{aligned} J_{n,m}(s_{n,m})&\le \ J_{n,m}\big (\mathcal {P}^{n}(u^+-u_m)\big ) \\&=\Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-b_m^{\delta }\Vert ^2_{Y}\\&\qquad +\alpha _m\Vert \mathcal {P}^{n}(u^+-u_m)+(u_m-u_0)\Vert ^2_{L^{q_n}(D)^\ell }. \end{aligned}$$

Recursively, we get $u_m-u_0=s_0+s_1+\cdots + s_{m-1}$ from which we deduce that $u_m-u_0$ is in $X^{n_{m-1}}$ as the spaces are nested by (S1). Hence, by (S2),

$$\begin{aligned} \mathcal {P}^{n}(u_m-u_0)=u_m-u_0 \end{aligned}$$

(12)

for $n\ge n_{m-1}$ and by linearity of $\mathcal {P}^{n}$ we may simplify

$$\begin{aligned} J_{n,m}(s_{n,m})&\le \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-b_m^{\delta }\Vert ^2_{Y}+\alpha _m\Vert \mathcal {P}^{n}(u^+-u_0)\Vert ^2_{L^{q_n}(D)^\ell } \\&\le \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-b_m^{\delta }\Vert ^2_{Y}+ \text {vol}_\text {d}(D)^{2/q_n}C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\Vert b^\delta _m\Vert ^2_Y. \end{aligned}$$

In the last step we additionally used (11), Hölder’s inequality and $\Vert \mathcal {P}^n(u^+-u_0)\Vert _{L^\infty (D)^\ell }\le C_{\mathcal {P}}\Vert u^+-u_0\Vert _{L^\infty (D)^\ell }<C_{\mathcal {P}}r_0$, see (S2). We continue by splitting the residual term according to

$$\begin{aligned} \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-b_m^{\delta }\Vert _{Y}&\le \Vert F'(u_m)(u^+-u_m)-F(u^+)-F(u_m)\Vert _{Y} \\&\qquad +\Vert F(u^+)-y^\delta \Vert _{Y} \\&\qquad \qquad +\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_m)-(u^+-u_m)\big )\Vert _{Y} \\&\le \Vert F'(u_m)(u^+-u_m)-F(u^+)-F(u_m)\Vert _{Y} +\delta \\&\qquad \qquad +\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y}, \end{aligned}$$

employing again (12) to get the bottom line. Since $\Vert u_{m}-u^+\Vert _{L^\infty (D)^\ell }< r$ by induction, TCC (3) yields

$$\begin{aligned}{} & {} \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-\big (y^\delta -F(u_m)\big )\Vert _{Y} \\{} & {} \quad \le \omega \Vert F(u^+)-F(u_m)\Vert _{Y} +\delta +\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y} \end{aligned}$$

and with $\Vert F(u^+)-F(u_m)\Vert _{Y}\le \Vert F(u^+)-y^\delta \Vert _{Y}+\Vert b_m^{\delta }\Vert _{Y}\le \delta + \Vert b_m^{\delta }\Vert _{Y}$ we deduce further

$$\begin{aligned}{} & {} \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-\big (y^\delta -F(u_m)\big )\Vert _{Y} \\{} & {} \quad \le \omega \big (\delta +\Vert b_m^{\delta }\Vert _{Y}\big ) +\delta +\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y}. \end{aligned}$$

Taking into account that $\Vert b^{\delta }_m\Vert _{Y}> \tau \delta $, we get

$$\begin{aligned}{} & {} \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-\big (y^\delta -F(u_m)\big )\Vert _{Y} \\{} & {} \quad \le \Vert b_m^{\delta }\Vert _{Y}\Big (\omega +\frac{1+\omega }{\tau }\Big )+\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y} \end{aligned}$$

and finally

$$\begin{aligned}{} & {} J_{n,m}(s_{n,m}) \le \bigg (\Vert b_m^{\delta }\Vert _{Y}\Big (\omega +\frac{1+\omega }{\tau }\Big )+\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y}\bigg )^2\nonumber \\{} & {} \qquad \qquad \qquad \quad +{\text {vol}}_d(D)^{2/q_n}C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\Vert b^\delta _m\Vert ^2_{L^2(D)}. \end{aligned}$$

(13)

Since $\text {vol}_\text {d}(D)^{2/q_n}\rightarrow 1$ as $n\rightarrow \infty $ and in view of (5), we find that

$$\begin{aligned} \liminf _{n\rightarrow \infty }J_{n,m}(s_{n,m})\le \Vert b_m^{\delta }\Vert ^2_{Y}\bigg (\Big (\omega +\frac{1+\omega }{\tau }\Big )^2+C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\bigg )= \mu _{\min }^2 \Vert b_m^{\delta }\Vert ^2_{Y}. \end{aligned}$$

(14)

Consequently, condition (9) with $\mu _m>\mu _{\min }$ is feasible for $n_m$ large and appropriate $s_m\in X^{n_m}\backslash \{0\}$, where $s_m\ne 0$ follows by $J_{n,m}(0)\ge \Vert b_m^{\delta }\Vert ^2_{Y}$. Hence, $u_{m+1}=u_m+s_m$ is well defined and, relying on (S3) as well as (11), we proceed with

$$\begin{aligned} \Vert u_{m+1}-u_0\Vert _{L^\infty (D)^\ell }^2 =\Vert s_{m}+(u_m-u_0)\Vert ^2_{L^\infty (D)^\ell }&\le C_{\infty } \frac{J_{n_m,m}(s_m)}{\alpha _m}\\&\le C_{\infty }\frac{\mu _m^2 \Vert b_m^{\delta }\Vert ^2_{Y}}{\alpha _m} < C_{\infty } \mu _m^2 \gamma ^2\ . \end{aligned}$$

Hence,

$$\begin{aligned} \Vert u_{m+1}-u^+\Vert _{L^\infty (D)^\ell }&\le \Vert u_{m+1}-u_0\Vert _{L^\infty (D)^\ell }+\Vert u^+-u_0\Vert _{L^\infty (D)^\ell } \\&< \sqrt{C_{\infty }}\mu _{\max } \gamma +r_0<r \end{aligned}$$

by the upper bound of $r_0$, yielding $u_{m+1}\in B_r(u^+)\subset \text {int}(\mathcal {D}(F))$. Finally, we estimate on the basis of (4) and (9)

$$\begin{aligned} \Vert b^{\delta }_{m+1}\Vert _{Y}&=\Vert \big (b^{\delta }_m-F'(u_m)s_m\big )-\big (F(u_{m+1})-F(u_m)-F'(u_m)s_m\big )\Vert _{Y} \nonumber \\&\le \Vert b^{\delta }_m-F'(u_m)s_m\Vert _{Y}+\frac{\omega }{1-\omega }\Vert F'(u_m)s_m\Vert _{Y}\nonumber \\&\le \sqrt{J_{n,m}(s_{m})} + \frac{\omega }{1-\omega }\big (\Vert b^{\delta }_{m}\Vert _{Y}+\Vert F'(u_m)s_m-b^{\delta }_{m}\Vert _{Y}\big )\nonumber \\&\le \mu _m \Vert b^{\delta }_m\Vert _{Y} + \frac{\omega }{1-\omega }(1+\mu _m)\Vert b^{\delta }_m\Vert _{Y}\\&=\left( \mu _m+\frac{\omega }{1-\omega }(1+\mu _m)\right) \Vert b^{\delta }_m\Vert _{Y} \nonumber \\&<\left( \mu _{\max }+\frac{\omega }{1-\omega }(1+\mu _{\max })\right) \Vert b^{\delta }_m\Vert _{Y} < \Lambda \Vert b^{\delta }_m\Vert _{Y}.\nonumber \end{aligned}$$

(15)

Having thus proven the induction part, REGINN $^{\infty }$ is forced to terminate for any $\delta >0$ due to $\Vert b_m^{\delta }\Vert _{Y}\le \Lambda ^m \Vert b_0^{\delta }\Vert _{Y} \le \tau \delta $ for m sufficiently large. From this estimate, we may even deduce $M_{\delta }=\mathcal {O}(\vert \log \delta \vert )$ as $\delta \searrow 0$. $\square $

Remark 2.2

(a) The name REGINN $^{\infty }$ for Algorithm 1 is justified by the stopping condition (9) for determining the Newton update which is, in view of (10) and (11), equivalent to

$$\begin{aligned} \frac{\Vert F'(u_m)s_m -b^{\delta }_m\Vert ^2_{Y}}{\Vert b_m^{\delta }\Vert ^2_{Y}}+\frac{\Vert s_m+(u_m-u_0)\Vert _{L^{q_n}(D)^\ell }^2}{\gamma ^2} \le \mu _m^2. \end{aligned}$$

In particular, $s_m$ satisfies the stopping condition of REGINN [4], i.e., the above condition without penalty term.

(b) Recall that REGINN fulfills in the Hilbert space setting (and likewise for smooth reflexive Banach spaces) the error reducing property for the iterates of many inner linear solvers, keeping thus $u_m\in B_{r}(u^+)$ if the initial guess was chosen so. However, this does not hold any longer for our $L^{\infty }$-tailored REGINN $^{\infty }$ in general. Therefore our parameters need to be controlled in terms of both $\omega $ and r, whereas standard REGINN only requires the knowledge of $\omega $ for defining admissible tolerances $\mu $ and stopping constants $\tau $, see Theorem 3.1 in [19].

In case that F is linear, i.e., $F(u)=Au$ for some $A\in \mathcal {L}(L^\infty (D)^\ell ,Y)$, the TCC holds with $\omega =0$ and $r=\infty $. Some observations are in order:

Although $r_0$ can be arbitrarily large now, to still ensure a finite and uniform $L^{\infty }$-bound on the iterates, $r_0<\gamma <\infty $ needs to be chosen compatibly.
Because of
$$\begin{aligned} \Vert F'(u_m)s_m -b^{\delta }_m\Vert ^2_{Y} =\Vert As_m -(y^\delta -Au_m)\Vert ^2_{Y} =\Vert Au_{m+1} -y^\delta \Vert ^2_{Y} \end{aligned}$$
the iterate $u_{m+1}$ satisfies
$$\begin{aligned} \Vert Au_{m+1} -y^\delta \Vert ^2_{Y}+\alpha _m\Vert u_{m+1}-u_0\Vert _{L^{q_{n_m}}(D)}^2\le \mu _m^2 \Vert b_m^\delta \Vert _Y^2. \end{aligned}$$
Hence, $u_{m+1}$ can be considered an approximate minimizer of the Tikhonov functional $ u\mapsto \Vert Au-y^\delta \Vert ^2_{Y}+\alpha _m\Vert u-u_0\Vert _{L^{q_{n_m}}(D)}^2$ in the set $u_0+X^{n_{m}}$. Put differently: in the linear case, REGINN $^{\infty }$ can be viewed as a cascading Tikhonov regularization iterating over nested finite-dimensional spaces where the penalty term is determined a posteriori by the previous iterate.

Theorem 2.3

(Regularization property of REGINN $^{\infty }$) Adopt all assumptions and notations from Theorem 2.1 with $\overline{B_r(u^+)}\subset \mathcal {D}(F)$ and set $F(u^+)=y$. Additionally, assume that $F'(u^+)$ fulfills (6) or that F is weakly-$\star $ sequentially closed, that is, $w_n{\mathop {\rightharpoonup }\limits ^{\star }}w$ in $L^\infty (D)^\ell $ and $F(w_n)\rightharpoonup z$ imply that $F(w)=z$. Let $\{\delta _i\}_{i\in \mathbb {N}}$ be a positive zero sequence and let $u_{M_{\delta _i}}$ be the output of Algorithm 1 with respect to perturbed data $y^{\delta _i}$. Then the set of weak-$\star $ accumulation points of the sequence $\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}$ is non-empty and consists of solutions to $F(\cdot )=y$. If $u^+$ is the only solution in $\overline{B_r(u^+)}$, then even the whole sequence $\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}$ converges weakly-$\star $ to $u^+$ in $L^\infty (D)^\ell $.

Proof

By construction in Theorem 2.1 we know that $\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}$ yields

$$\begin{aligned} \Vert y-F(u_{M_{\delta _i}})\Vert _{Y}\le \Vert y-y^{\delta _i}\Vert _{Y}+\Vert b_{M_{\delta _i}}^{\delta _i}\Vert _{Y}\le (1+\tau )\delta _i\rightarrow 0 \end{aligned}$$

(16)

and is uniformly bounded in $L^\infty (D)^\ell $, so there exists weak-$\star $ accumulation points in $\overline{B_r(u^+)}$ by sequential weak-$\star $-compactness. Take representatively $u_{M_{\delta _{i_k}}}{\mathop {\rightharpoonup }\limits ^{\star }}\widetilde{u}$. In case that F is weakly-$\star $ sequentially closed, we can directly deduce $F(\widetilde{u})=y$, hence any weak-$\star $ accumulation point solves the equation. In case that (6) holds, we first note that the TCC (3) implies by the reverse triangle inequality for any $u\in B_r(u^+)$ that

$$\begin{aligned} (1-\omega )\Vert F(u^+)-F(u)\Vert _{Y}{} & {} \le \Vert F'(u^+)(u^+-u)\Vert _{Y}\nonumber \\{} & {} \le (1+\omega )\Vert F(u^+)-F(u)\Vert _{Y}. \end{aligned}$$

(17)

With (6) we then obtain for any $g\in Y^{*}$

$$\begin{aligned} \langle F'(u^+)\widetilde{u},g\rangle _{Y,Y^{*}}&=\langle \widetilde{u},F'(u^+)^{*}g\rangle _{L^\infty (D)^\ell ,L^1(D)^\ell }\\&=\lim _{k\rightarrow \infty }\langle u_{M_{\delta _{i_k}}},F'(u^+)^{*}g\rangle _{L^\infty (D)^\ell ,L^1(D)^\ell }\\&=\lim _{k\rightarrow \infty }\langle F'(u^+)u_{M_{\delta _{i_k}}},g\rangle _{Y,Y^{*}} \\&=\langle F'(u^+)u^+,g\rangle _{Y,Y^{*}}, \end{aligned}$$

where the last equality above follows by the second inequality in (17) with $u=u_{M_{\delta _{i_k}}}$ and $F(u_{M_{\delta _{i_k}}})\rightarrow y$ in $Y$. We deduce that $F'(u^+)(u^+-\widetilde{u})=0$ and combining this relation now with the first inequality in (17) using $\widetilde{u}=u$, we may again conclude $F(\widetilde{u})=y$. Finally, if $u^+$ is the only solution to $F(\cdot )=y$ in $\overline{B_r(u^+)}$, the weak-$\star $ convergence of the whole sequence $\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}$ follows by a standard subsequence argument, see [20, Prop. 10.13(2)]. $\square $

Remark 2.4

If $F'(u^+)$ is injective, $\Vert u\Vert _\star :=\Vert F'(u^+)u\Vert $ constitutes a norm on $L^\infty (D)^\ell $ with respect to which $\{u_{M_{\delta }}\}_{\delta >0}$ then converges strongly to $u^+$ according to (17) and (16) at the rate $\Vert u^+ -u_{M_{\delta }}\Vert _\star =\mathcal {O}(\delta )$ as $\delta \rightarrow 0$. However, this norm is generally weaker than $\Vert \cdot \Vert _{L^\infty (D)^\ell }$ with equivalence if and only if $F'(u^+)$ is boundedly invertible. However, for locally illposed problems $F(\cdot )=y$ we expect its linearization to be illposed as well.

Remark 2.5

We discuss how the statements of the theorems from above carry over to a semi-discrete situation as it appears under an implementation of Algorithm 1. Typically, one $X^{n_{\max }}$ represents the finest possible or finest chosen resolution for the sought-for quantity $u^+\in X^{n_{\max }}$ and models the implementation from a mathematical point of view.^{Footnote 1} Here, $X^{n_{\max }}$ is equipped with the $L^\infty $-topology. Now, Theorems 2.1 and 2.3 apply to

$$\begin{aligned} F_{n_{\max }}:\mathcal {D}(F)\cap X^{n_{\max }}\subset L^\infty (D)^\ell \rightarrow Y,\quad u\mapsto F(u), \end{aligned}$$

where (5) can be omitted due to

$$\begin{aligned} F_{n_{\max }}'(u)\big ((u^+-u_0)-\mathcal {P}^{n_{\max }}(u^+-u_0)\big )=0 \end{aligned}$$

since both $u_0$ and $u^+$ are assumed to be in $X^{n_{\max }}$. Further, (14) then reads

$$\begin{aligned} J_{n_{\max },m}(s_{n_{\max },m})\le \Vert b_m^{\delta }\Vert ^2_{Y}\bigg (\Big (\omega +\frac{1+\omega }{\tau } \Big )^2+\text {vol}_\text {d}(D)C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\bigg ) \end{aligned}$$

and as the only consequence the constant $C_{\mathcal {P}}$ needs to be replaced by $\text {vol}_d(D)^{1/2}C_{\mathcal {P}}$ in the definition of corresponding REGINN $^\infty $ parameters.

We emphasize that the underlying semi-discrete inverse problem is: given $y^\delta \in Y$ find $u^\delta \in X^{n_{\max }}$ such that $F_{n_{\max }}(u^\delta )\approx y^\delta $ where $y^\delta $ now incorporates measurement noise and discretization error.

Remark 2.6

At first glance the IRGNM-Tikhonov method of [12] and REGINN $^\infty $ seem to be quite similar, but they are separated by significant structural differences: In IRGNM-Tikhonov the penalty parameter is determined a priori and is assigend to a fixed regularization term, whereas for REGINN $^\infty $ the regularization term is explicitly n-dependent and relies on an the posteriori parameter choice (11). Further, the Newton update for the IRGNM-Tikhonov method has to be an exact minimizer of the Tikhonov functional. In contrast, the Newton update for REGINN $^\infty $ only requires an approximate minimizer, see (9), which is especially convenient when it comes to practical realizations. Note also that the discrete spaces $X^n$ are an integral part of REGINN $^\infty $ and its theory, so numerical effects are included in a natural way. As part of our results, the discrete regularizers $u_{M_{\delta _i}} \in X^{n_{M_{\delta _i}}}$ of Theorem 2.3 converge to the continuous limit $u^+\in L^\infty (D)^\ell $. A similar connection of the discrete with the continuous world is (to our knowledge) still pending for the method in [12].

The previous version of REGINN $^{\infty }$ requires the determination of successive discretization levels $n_m\ge n_{m-1}$ for possibly strongly increasing $n_m$ so that many intermediate (almost) minimizers $s\in X^n$ of (10) for $ n_{m-1}\le n\le n_m$ need to be computed before meeting the given $\mu _m$-criterion in (9). As this adaptive discretization strategy can become numerically expensive, we want to present an alternative version which directly links n to m. A closer look at the proof of Theorem 2.1 reveals that $n_m$ actually depends on the decay of $\Vert F'(u_m)(\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0))\Vert _{Y}$. Hence, if we have a concrete upper bound for this discretization residual in terms of n, feasible choices of $n_m$ can be found by simple algebraic manipulation. Such upper bounds can be deduced on the basis of better initial guesses which are governed by some stronger norm. For this purpose we state the following refined version of assumption (5):

If $X\subset L^\infty (D)^\ell $ is an embedded normed space such that

$$\begin{aligned} \Vert \widehat{u}\Vert _{L^\infty (D)^\ell }\le C_{X} \Vert \widehat{u}\Vert _X\ \text { for all } \widehat{u}\in X, \end{aligned}$$

(18)

then for any $u^+$ such that $B_{\widetilde{r}}(u^+)\subset \text {int}(\mathcal {D}(F))$, $u\in B_{\widetilde{r}}(u^+)$ and $\widehat{u}\in X$ we assume that

$$\begin{aligned} \Vert F'(u)\big (\widehat{u}-\mathcal {P}^{n}\widehat{u}\big )\Vert _{Y}\le C^+ \Vert \widehat{u}\Vert _{X}\beta (n), \end{aligned}$$

(19)

where $C^+>0$ and $\beta $ fulfills $\beta (n)\searrow 0$ with $\beta (0)=1$. We think of $\beta $ as being rather independent of $u\in \mathcal {D}(F)$ once $X$ and $\{X^n\}_{n\in \mathbb {N}}$ are set while the magnitude of $C^+$ is strongly $B_{\widetilde{r}}(u^+)$-dependent. We will verify in Sect. 3.3 below that (19) is satisfied, for instance, by the modeling assumptions of FWI in the acoustic regime. The next theorem shows that we can indeed determine $n_m$ conveniently for successive Newton steps of REGINN $^{\infty }$ when replacing condition (5) by (19).

Theorem 2.7

Adopt all assumptions, notations and parameters from Theorems 2.1 and 2.3, and assume that (19) holds—without loss of generality with $\widetilde{r}=r$ by shrinking one of the radii otherwise. Start with $u_0\in L^\infty (D)^\ell $ such that $\Vert u^+-u_0\Vert _X<\min \{r_0/C_{X},1/C^+\}$ and restrict $\{\mu _m\}$ to $(\mu ^{\varepsilon }_{\min },\mu _{\max })$, where

$$\begin{aligned} \mu ^{\varepsilon }_{\min }:=\sqrt{\left( \omega +\frac{1+\omega }{\tau }+\varepsilon \right) ^2+\max \big \{\text {vol}_d(D)^{2/q_{n_0}},1\big \}C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}}<\mu _{\max } \end{aligned}$$

(20)

for some $\varepsilon >0$ sufficiently small and $q_{n_0}$ large with $n_0\in \mathbb {N}$. Further, let $n_m$ be successively defined by

$$\begin{aligned} \begin{aligned} n_m:=\min \bigg \{n\ge n_0:\ \beta (n)\le \varepsilon \Vert b^{\delta }_m\Vert _{Y}\bigg \} . \end{aligned} \end{aligned}$$

(21)

Then we can find $s_m\in X^{n_m}$ satisfying (9). In particular, REGINN $^{\infty }$ also terminates in this case and the regularization property still holds.

Proof

First, $n_m$ according to (21) is well defined since $\lim _{n\rightarrow \infty }\beta (n)=0$. Besides, since $\Vert b^{\delta }_m\Vert _{Y}$ is monotonously decreasing in m, we get that $n_m$ is non-decreasing, too. Using $s_m:=\arg \min _{s\in X^{n_m}} J_{n_m,m}(s)$, we may compute with (13) and by (19)

$$\begin{aligned} J_{n_m,m}(s_m)&\le \bigg (\Vert b_m^{\delta }\Vert _{Y}\Big (\omega +\frac{1+\omega }{\tau }\Big )+\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y}\bigg )^2\\&\qquad +\text {vol}_d(D)^{2/q_{n_m}}\frac{r_0^2}{\gamma ^2}C_{\mathcal {P}}^2\Vert b^\delta _m\Vert ^2_{L^2(D)} \\&\le \bigg (\Vert b_m^{\delta }\Vert _{Y}\Big (\omega +\frac{1+\omega }{\tau }\Big )+\underbrace{C^+ \Vert u^+-u_0\Vert _{X}}_{<1}\underbrace{\beta (n_m)}_{\le \varepsilon \Vert b^{\delta }_m\Vert _{Y}}\bigg )^2\\&\qquad +\max \{\text {vol}_d(D)^{2/q_{n_0}},1\}C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\Vert b^\delta _m\Vert ^2_{L^2(D)} \\&\le (\mu ^{\varepsilon }_{\min })^2\Vert b^\delta _m\Vert ^2_{L^2(D)}. \end{aligned}$$

The fact that REGINN $^{\infty }$ still terminates and also admits the regularization property follows by Theorems 2.1, 2.3 and $\Vert u_0-u^+\Vert _{L^\infty (D)^\ell }\le C_{X}\Vert u_0-u^+\Vert _{X}< r_0$. $\square $

For convenience, we restate REGINN $^{\infty }$ in Algorithm 2 subject to $u^+-u_0\in X$ for which we need to provide $\varepsilon $ and $\beta $ as additional input. This version is especially of interest if the regularity $u^+\in X$ is known a priori so that $u_0\in X$ ensures $u^+-u_0\in X$, as desired.

3 Applications: parameter identification tasks in PDEs

In this section we will verify our abstract assumptions (5) and (6) in the concrete settings of electrical impedance tomography and time domain full waveform inversion (FWI) in the visco-elastic regime which is the state-of-the-art imaging modality in exploration geophysics, see, e.g., [2, 3]. Both inverse problems are locally illposed. We will even show that (19) is satisfied in the acoustic regime. Moreover, our results for FWI carry over to parameter identification tasks for other hyperbolic evolution equations, such as an inverse problem in electromagnetic scattering.

For all applications we rely on the discrete B-spline spaces constructed in Appendix A.1.

3.1 Electrical impedance tomography under the complete electrode model

Electrical impedance tomography (EIT) entails the determination of the electric conductivity distribution of an object by applying electric currents at the boundary through electrodes and measuring the resulting voltages at the boundary as well. Potential applications are, for instance, medical imaging and non-destructive testing.

Let $\sigma :D\rightarrow [\sigma _{\min },\infty )$, $\sigma _{\min }>0$, be the searched-for conductivity distribution in the simply connected Lipschitz-domain $D \subset \mathbb {R}^2$. Further, the p electrodes are denoted by $E_1,\ldots ,E_p$ and are assumed to be open subsets of $\partial D$, the boundary of D, having positive surface measure: $\vert E_j\vert >0$, $j=1,\ldots ,p$. Moreover, let the electrodes be connected and separated: $\overline{E}_i\cap \overline{E}_j=\emptyset $, $i\ne j$. To this electrode configuration we associate the electrode space

$$\begin{aligned} \mathscr {E}_p:=\textrm{span}\{\chi _{E_1},\ldots ,\chi _{E_p}\}\cap L^2_{\scriptscriptstyle \diamondsuit }( \partial D)\subset L^2_{\scriptscriptstyle \diamondsuit }( \partial D) \end{aligned}$$

where $\chi _{E_i}$ is the indicator function of the ith electrode and $L^2_{\scriptscriptstyle \diamondsuit }(\partial D)$ is the space of $L^2$-functions on the boundary of D having vanishing mean.

The forward problem of impedance tomography under the complete electrode model (CEM) in the weak formulation is based on the bilinear form $a:Y_p\times Y_p\rightarrow \mathbb {R}$, $Y_p:= H^1(D)\oplus \mathscr {E}_p$,

$$\begin{aligned} a\big ((v,V),(w,W)\big ):=\int _D\sigma \nabla v\cdot \nabla w\,\textrm{d} x+ \frac{1}{z}\int _{\partial D} (v-V) (w-W)\,\textrm{d}s \end{aligned}$$

Given an electrode (mean) current $I\in \mathscr {E}_p$ and a contact impedance $z>0$, find a voltage potential $u \in H^1(D)$ and an electrode voltage $U \in \mathscr {E}_p$ such that

$$\begin{aligned} a\big ((u,U),(w,W)\big )=\int _{\partial D}I\,W\textrm{d}s \quad \text { for all }(w,W)\in Y_p. \end{aligned}$$

(22)

The vanishing mean of I and U can be interpreted as conservation of charge and grounding the potential, respectively. CEM (22) is the most accurate mathematical model for EIT currently in use and has a unique solution for $\sigma \in L^\infty _+(D)$, where

$$\begin{aligned} L^\infty _+(D):=\{\sigma \in L^\infty (D):\sigma \ge \sigma _{\min } >0 \ \text { a.e. in }D\}. \end{aligned}$$

The latter follows from the Lax-Milgram theorem, see [21].

The nonlinear forward operator F describing CEM is

$$\begin{aligned} F:L^\infty _+(D)\subset L^\infty (D)\rightarrow \mathcal {L}(\mathscr {E}_p), \quad \sigma \mapsto \{I\mapsto U\}. \end{aligned}$$

In other words: $F(\sigma )$ maps the current I to the voltage U of the solution of (22). Its Frechét derivative $F'(\sigma )\in \mathcal {L}\big (L^\infty (D),\mathcal {L}(\mathscr {E}_p)\big )$ is given by

$$\begin{aligned} F'(\sigma )[h]I=U' \end{aligned}$$

where $(u',U')\in Y_p$ uniquely solves

$$\begin{aligned} a\big ((u',U'),(w,W)\big )=-\int _{D} h \nabla u(I)\cdot \nabla w\textrm{d}x \quad \text {for all }(w,W)\in Y_p \end{aligned}$$

(23)

with $u=u(I)$ being the first component of the solution of (22) with respect to the current I, see, e.g., [22]. We will similarly use U(I) for the second component of the solution of (22) and $u'(I)$ for the first component of (23).

We equip $\mathcal {L}(\mathscr {E}_p)$ with the Hilbert-Schmidt inner product: $\langle \Lambda , \Gamma \rangle _\text { HS}=\sum _{j=1}^{p-1} \langle \Lambda I_j,\Gamma I_j\rangle _{L^2(\partial D)}$ where $\{I_1,\ldots ,I_{p-1}\}$ is an orthonormal basis of $\mathscr {E}_p$. This inner product is independent of the chosen orthonormal basis.

Lemma 3.1

The adjoint operator $F'(\sigma )^*:\mathcal {L}(\mathscr {E}_p)^*\rightarrow L^\infty (D)^*$ is given by

$$\begin{aligned} F'(\sigma )^*\Gamma =-\sum _{j=1}^{p-1} \nabla u( \Gamma I_j)\cdot \nabla u( I_j) \end{aligned}$$

where $u(\Gamma I_j)$ and $u(I_j)$ denote the solutions of (22) with respect to $I=\Gamma I_j$ and $I=I_j$, respectively. As a sum of products of two $L^2$-functions, $F'(\sigma )^*\Gamma $ is even in $L^1(D)\subset L^\infty (D)^*$, so that (6) holds.

Proof

Set $\Lambda = F'(\sigma )[h]$, $\Lambda _j=\Lambda I_j$, and $\Gamma _j=\Gamma I_j$ for one $\Gamma \in \mathcal {L}(\mathscr {E}_p)$. Then,

$$\begin{aligned}{} & {} \langle \Lambda , \Gamma \rangle _\text { HS}= \sum _{j=1}^{p-1} \langle \Lambda _j,\Gamma _j\rangle _{L^2(\partial D)} {\mathop {=}\limits ^{(22)}} \sum _{j=1}^{p-1} a\big ((u(\Gamma _j),U(\Gamma _j)),(u'(I_j), \Lambda _j)\big )\\{} & {} \quad {\mathop {=}\limits ^{(23)}} - \int _{\partial D} h \sum _{j=1}^{p-1} \nabla u( \Gamma _j)\cdot \nabla u(I_j)\,\textrm{d}s =\langle h, F'(\sigma )^*\Gamma \rangle _{L^\infty (D)\times L^\infty (D)^*} \end{aligned}$$

which settles the argument. $\square $

Lemma 3.2

Let $X^n=X_N^n({D})$, $n\in \mathbb {N}$, be the tensor product spline space of Appendix A.1 with projector $\mathcal {P}^n$ defined in (A5). Then, for any $\sigma \in L_+^\infty (D)$,

$$\begin{aligned} \lim _{n\rightarrow \infty }\big \Vert F'(\sigma )[h-\mathcal {P}^nh]\big \Vert _{\mathcal {L}(\mathscr {E}_p)}=0\quad \text { for any }h\in L^\infty (D). \end{aligned}$$

Proof

We start with

$$\begin{aligned} \big \Vert F'(\sigma )[h-\mathcal {P}^nh]\big \Vert _{\mathcal {L}(\mathscr {E}_p)}^2 =\sum _{j=1}^{p-1} \big \Vert F'(\sigma )[h-\mathcal {P}^nh]I_j\big \Vert _{L^2(\partial D)}^2. \end{aligned}$$

The bilinear form a is bounded and elliptic on $Y_p$. Hence, it follows from (23) that

$$\begin{aligned} \big \Vert F'(\sigma )[h-\mathcal {P}^nh]I_j\big \Vert _{L^2(\partial D)}\lesssim \big \Vert (h-\mathcal {P}^nh) \vert \nabla u(I_j)\vert \big \Vert _{L^2(D)}. \end{aligned}$$

Since $\mathcal {P}^nh\xrightarrow {n\rightarrow \infty } h$ pointwise a.e. (Appendix B), the norm on the right tends to zero as $n\rightarrow \infty $ by the dominated convergence theorem. $\square $

So we have validated (5) and (6) for the forward operator of EIT. Moreover, TCC (3) holds for the semi-discrete version $F_{n_{\max }}$ of F (Remark 2.5) provided the number of electrodes is sufficiently large [22, Theorem 4.5].

3.2 Full waveform inversion

Wave propagation in realistic media can be modeled by a visco-elastic wave equation which accounts for dispersion and attenuation [2, Chapter 5]. Here, we consider the formulation introduced in [23]: Let $D\subset \mathbb {R}^3$ be a Lipschitz domain. Using $L\in \mathbb {N}$ damping tensors ${\varvec{\sigma }}_l:[0,T]\times D\rightarrow \mathbb {R}^{3\times 3}_\text {sym}$, $l=1,\ldots ,L$, the evolution of the particle velocity field $\textbf{v}:[0,T]\times D\rightarrow \mathbb {R}^3$ and stress tensor ${\varvec{\sigma }}_0:[0,T]\times D\rightarrow \mathbb {R}^{3\times 3}_\text {sym}$ reads

$$\begin{aligned} \rho \, \partial _t \textbf{v}&= \mathrm{\,div\,}\Big (\sum _{l=0}^{L}\varvec{\sigma }_l\Big )+{\varvec{f}}{} & {} \text {in }(0,T)\times D, \end{aligned}$$

(24a)

$$\begin{aligned} \partial _t {\varvec{\sigma }}_0&= C\big (\mu ,\pi \big )\,\varepsilon (\textbf{v}){} & {} \text {in }(0,T)\times D , \end{aligned}$$

(24b)

$$\begin{aligned} \tau _{{\varvec{\sigma }},l}\, \partial _t {\varvec{\sigma }}_l&= C\big (\tau _\text {\tiny S}\mu , \tau _\text {\tiny P}\pi \big )\,\varepsilon (\textbf{v}) -{\varvec{\sigma }}_l,\qquad l=1,\dots ,L,{} & {} \text {in }(0,T)\times D, \end{aligned}$$

(24c)

with zero initial conditions^{Footnote 2}

$$\begin{aligned} \textbf{v}(0)=\textbf{0}\ \text { and }\ {\varvec{\sigma }}_l(0)=\textbf{0},\ l=0,\ldots ,L. \end{aligned}$$

(24d)

Above, ${\varvec{f}}:(0,T)\times D\rightarrow \mathbb {R}^3$ is the volume force, which initiates the wave, and the functions $\tau _{\text {\tiny P}},\tau _{\text {\tiny S}}:D\rightarrow \mathbb {R}$ are scaling factors for the unrelaxed bulk modulus $\pi :D\rightarrow \mathbb {R}$ and shear modulus $\mu :D\rightarrow \mathbb {R}$, respectively. Further, $\rho :D\rightarrow \mathbb {R}$ is the mass density. The linear map C(m, p), $m,p\in \mathbb {R}$, is Hooke’s tensor

$$\begin{aligned} C(m,p):\mathbb {R}^{3\times 3}\rightarrow \mathbb {R}^{3\times 3},\qquad C(m,p){\textbf{M}} =2m\textbf{M} + (p-2m) {\text {tr}}(\textbf{M})\textbf{I}, \end{aligned}$$

where $\textbf{I}\in \mathbb {R}^{3\times 3}$ is the identity matrix and ${\text {tr}}(\textbf{M})$ denotes the trace of $\textbf{M}\in \mathbb {R}^{3\times 3}$. Finally,

$$\begin{aligned} \varvec{\varepsilon }( \textbf{v}) = \frac{1}{2}\big [(\nabla _x \textbf{v})^\top +\nabla _x \textbf{v}\big ] \end{aligned}$$

is the linearized strain rate.

Wave propagation is frequency-dependent and the numbers $\tau _{{\varvec{\sigma }},l}>0$, $l=1,\ldots ,L$, model this dependency about the center frequency $\omega _0$, see [24, 25]. Introducing the frequency-dependent phase velocities of P- and S-waves,

$$\begin{aligned} v^{2}_{\text {\tiny P}} =\frac{\pi }{\rho }(1+ \tau _{\text {\tiny P}} \alpha ) \ \text { and }\ v^{2}_{\text {\tiny S}} =\frac{\mu }{\rho }(1+ \tau _{\text {\tiny S}} \alpha )\ \text { with } \alpha =\sum _{l=1}^{L} \frac{\omega _0^{2}\tau _{{\varvec{\sigma }},l}^{2}}{1+\omega _0^{2}\tau _{{\varvec{\sigma }},l}^{2}}, \end{aligned}$$

full waveform inversion entails the identification of the five spatially dependent parameters

$$\begin{aligned} u=(\rho ,v_{\text {\tiny S}}, \tau _{\text {\tiny S}}, v_{\text {\tiny P}},\tau _{\text {\tiny P}}) \end{aligned}$$

from partial wave field measurements. For a physically meaningful open subset^{Footnote 3}$\mathcal {D}(F)\subset L^\infty (D)^5$ the full waveform forward operator

$$\begin{aligned} F:\mathcal {D}(F)\subset L^\infty (D)^5\rightarrow L^2((0,T),H),\quad u \mapsto y:=(\textbf{v},{\varvec{\sigma }}_0,\dots ,{\varvec{\sigma }}_L), \end{aligned}$$

is well defined with

$$\begin{aligned}H=L^2(D,\mathbb {R}^3)\times L^2(D,\mathbb {R}^{3\times 3}_{\textrm{sym}})^{1+L} \end{aligned}$$

where y $\in C^{1}([0,T],H)$ is the unique classical solution of (24) with source ${\varvec{f}}\in W^{1,1}((0,T),L^2(D,\mathbb {R}^3))$, see [26] (of course, y satisfies boundary conditions, whose concrete formulations we have omitted for simplicity).

Remark 3.3

Actually, the operator modeling seismic imaging is $\Phi =M\circ F$ where $M:L^2((0,T),H) \rightarrow S $ denotes the measurement operator (S is the space of seismograms). We can reasonably assume that M is linear and bounded. If F satisfies (5) and (6), so does $\Phi $.

For suitable $w=(\textbf{w},\varvec{\psi }_0,\ldots ,\varvec{\psi }_L)\in H$ we define operators A, B, and Q mapping into H by

$$\begin{aligned} Aw=-\begin{pmatrix} \mathrm{\,div\,}\big (\sum _{l=0}^{L}\varvec{\psi }_l\big ) \\ \varepsilon (\textbf{w})\\ \vdots \\ \varepsilon (\textbf{w})\end{pmatrix}\!,\ B^{-1}w=\begin{pmatrix} \frac{1}{\rho }\,\textbf{w}\\ C\big (\mu ,\pi \big )\varvec{\psi }_0\\ C\big (\tau _\text {\tiny S}\mu , \tau _\text {\tiny P}\pi \big )\varvec{\psi }_1\\ \vdots \\ C\big (\tau _\text {\tiny S}\mu , \tau _\text {\tiny P}\pi \big )\varvec{\psi }_L \end{pmatrix}\!,\ Qw=\begin{pmatrix} \textbf{0}\\ \textbf{0}\\ \frac{1}{\tau _{{\varvec{\sigma }},1}}\,\varvec{\psi }_1\\ \vdots \\ \frac{1}{\tau _{{\varvec{\sigma }},L}}\,\varvec{\psi }_L \end{pmatrix}\!, \end{aligned}$$

so that (24) collapses to

$$\begin{aligned} \begin{aligned} B\partial _ty(t)+Ay(t)+BQy(t)&=f(t)&\text {in }(0,T),\\ y(0)&=0. \end{aligned}\ \end{aligned}$$

(25)

Note that only $B:\mathcal {D}(F)\subset L^\infty (D)^5\rightarrow \mathcal {L}(H)$ depends on the parameters to be identified: $B=B(u)$. It is Fréchet-differentiable with

$$\begin{aligned} B'(u)[\widehat{u}]\! \begin{pmatrix}\! \textbf{w}\\ \varvec{\psi }_0\\ \vdots \\ \varvec{\psi }_L \!\end{pmatrix} = \begin{pmatrix} \widehat{\rho } \,\textbf{w}\\ -\frac{\widehat{\rho }}{\rho }\widetilde{C}(\mu ,\pi )\varvec{\psi }_0 +{\rho }\, \widetilde{C}'(\mu ,\pi )\! \begin{bmatrix} \widetilde{\mu }\\ \widetilde{\pi } \end{bmatrix}\! \varvec{\psi }_0\\ -\frac{\widehat{\rho }}{\rho }\widetilde{C}(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\varvec{\psi }_1 +{\rho }\, \widetilde{C}'(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\! \begin{bmatrix} \widehat{\mu }\\ \widehat{\pi } \end{bmatrix}\! \varvec{\psi }_1\\ \vdots \\ -\frac{\widehat{\rho }}{\rho }\widetilde{C}(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\varvec{\psi }_L +{\rho }\, \widetilde{C}'(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\! \begin{bmatrix} \widehat{\mu }\\ \widehat{\pi } \end{bmatrix}\! \varvec{\psi }_L \end{pmatrix} \end{aligned}$$

(26)

where $\widehat{u}=(\widehat{\rho },\widehat{v}_{\text {\tiny S}}, \widehat{\tau }_{\text {\tiny S}}, \widehat{v}_{\text {\tiny P}},\widehat{\tau }_{\text {\tiny P}})$ and

$$\begin{aligned} \widetilde{\mu }&= \frac{2v_{\text {\tiny S}}}{1+\tau _{\text {\tiny S}}\alpha }\, \widehat{v}_{\text {\tiny S}} - \frac{\alpha \,v_{\text {\tiny S}}^2}{(1+\tau _{\text {\tiny S}}\alpha )^2}\, \widehat{\tau }_{\text {\tiny S}},&\qquad \widetilde{\pi }&=\frac{2v_{\text {\tiny P}}}{1+\tau _{\text {\tiny P}}\alpha }\, \widehat{v}_{\text {\tiny P}} - \frac{\alpha \,v_{\text {\tiny P}}^2}{(1+\tau _{\text {\tiny P}}\alpha )^2}\, \widehat{\tau }_{\text {\tiny P}},\\ \widehat{\mu }&= \frac{2\tau _\text {\tiny S}\,v_{\text {\tiny S}}}{1+\tau _{\text {\tiny S}}\alpha }\, \widehat{v}_{\text {\tiny S}} + \frac{v_{\text {\tiny S}}^2}{(1+\tau _{\text {\tiny S}}\alpha )^2}\, \widehat{\tau }_{\text {\tiny S}},&\qquad \widehat{\pi }&=\frac{2\tau _\text {\tiny P}\,v_{\text {\tiny P}}}{1+\tau _{\text {\tiny P}}\alpha }\, \widehat{v}_{\text {\tiny P}} +\frac{v_{\text {\tiny P}}^2}{(1+\tau _{\text {\tiny P}}\alpha )^2}\, \widehat{\tau }_{\text {\tiny P}}. \end{aligned}$$

Further,

$$\begin{aligned} \widetilde{C}(m,p):=C(m,p)^{-1}=C\left( \frac{1}{4m},\frac{p-m}{m(3p-4m)}\right) . \end{aligned}$$

and

$$\begin{aligned} \widetilde{C}'(m,p)\begin{bmatrix} \widehat{m}\\ \widehat{p} \end{bmatrix} =-\widetilde{C}(m,p)\circ C(\widehat{m},\widehat{p})\circ \widetilde{C}(m,p). \end{aligned}$$

(27)

Both, $\widetilde{C}(\mu ,\pi )$ and $\widetilde{C}(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )$ are well defined for $u\in \mathcal {D}(F)$, see [26, Section 4.1] for the involved definition of $\mathcal {D}(F)$.

Lemma 3.4

Let $\{\widehat{u}_n\}_{n\in \mathbb {N}}\subset L^\infty (D)^5$ be bounded and be convergent to $\widehat{u}\in L^\infty (D)^5$ pointwise a.e. Then, for any $u\in \mathcal {D}(F)$,

$$\begin{aligned} \lim _{n\rightarrow \infty } \big \Vert B'(u)[\widehat{u}_n-\widehat{u}]h \big \Vert _H=0 \quad \text {for any } h\in H. \end{aligned}$$

(28)

Proof

The $L^2$-space H has inner product

$$\begin{aligned} \big \langle (\textbf{v}, {\varvec{\sigma }}_0,\ldots , {\varvec{\sigma }}_L), (\textbf{w}, \varvec{\psi }_0,\ldots , \varvec{\psi }_l)\big \rangle _H= \int _D\Big (\textbf{v}\cdot \textbf{w}+\sum _{l=0}^L\varvec{\sigma }_l:\varvec{\psi }_l\Big )\,\textrm{d}x \end{aligned}$$

where the colon indicates the Frobenius inner product on $\mathbb {R}^{3\times 3}$. Now, in view of (26) and (27) we see that the integrand of $\Vert B'(u)[\widehat{u}_n-u]h \Vert _H^2$ converges to zero pointwise a.e. Furthermore, as $\{\widehat{u}_n\}_{n\in \mathbb {N}}$ is bounded in $L^\infty (D)^5$, the absolute value of the integrand is bounded by an integrable function (as a matter of fact, the integrand is the sum of products of two $L^2$-functions with an $L^\infty $-function). The assertions follows from the dominated convergence theorem. $\square $

For completion, we quote a result from [26, Theorem 4.4] which we will need below.

Lemma 3.5

Under the assumptions from above, F is Fréchet-differentiable at any $u\in \mathcal {D}(F)$. For $\widehat{u}\in L^{\infty }(D)^{5}$ we have $\overline{y}:=F'(u)\widehat{u}\in C([0,T],H)$ given as the unique weak solution of

$$\begin{aligned} B\partial _t\overline{y}(t)+A\overline{y}(t)+BQ\overline{y}(t)&=-B'(u)[\widehat{u}]\left( \partial _ty(t)+Qy(t)\right) , \\ \overline{y}(0)&=0, \end{aligned}$$

with $y=F(u)$. Further, for any $t\in [0,T]$,

$$\begin{aligned} \Vert \overline{y}\Vert _{C([0,t],H)}\le C\Vert B'(u)[\widehat{u}](\partial _ty+Qy)\Vert _{L^1((0,t),H)} , \end{aligned}$$

(29)

where C depends continuously on the operator norms of B, $B^{-1}$, Q, and on T.

We define

$$\begin{aligned} X^n =X_N^n \times \dots \times X_N^n\quad \text {(5 factors)} \end{aligned}$$

(30)

as well as

$$\begin{aligned} \mathcal {P}^n:L^{\infty }(D)^{5}\rightarrow X^n,\quad \mathcal {P}^nu=(\mathcal {P}_N^nu_1,\ldots ,\mathcal {P}_N^nu_{5}), \end{aligned}$$

where $X_N^n$ and $\mathcal {P}_N^n$ are given by the cardinal B-spline spaces in (A3) and (A5), respectively. We point out that also heterogeneous choices for the factors in $X^n$ are possible as long as $X^n$ keeps nested with respect to n. The convergence properties of $\{\mathcal {P}^n\}_n$, see Appendix B, then guarantee (5) according to the next lemma.

Lemma 3.6

Let $X^n$ and $\mathcal {P}^n$ be as above. For the full waveform forward operator F with ${\varvec{f}}\in W^{1,1}((0,T),L^2(D,\mathbb {R}^3))$ we have that

$$\begin{aligned} \lim _{n\rightarrow \infty }\Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\Vert _{L^2((0,T),H)}=0 \end{aligned}$$

for any $u\in \mathcal {D}(F)$ and all $\widehat{u}\in L^{\infty }(D)^{5}$.

Proof

The function $\overline{y}_n:=F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )$ is the unique weak solution of

$$\begin{aligned} \begin{aligned} B\partial _t\overline{y}_n(t)+A\overline{y}_n(t)+BQ\overline{y}_n(t)&=-B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ]\left( \partial _ty(t)+Qy(t)\right) , \\ \overline{y}_n(0)&=0, \end{aligned} \end{aligned}$$

which according to (29) satisfies

$$\begin{aligned} \Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\Vert _{C([0,T],H)}\le C\Vert [B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ](\partial _ty+Qy)\Vert _{L^1((0,T),H)} . \end{aligned}$$

Since $C([0,T],H)\hookrightarrow L^2((0,T),H)$ is continuous, the assertion of the theorem follows if we can show that the right-hand side of the above stability estimate goes to zero as $n\rightarrow \infty $. Applying Proposition B.1 componentwise, we deduce that $\mathcal {P}^{n_k}\widehat{u}\rightarrow \widehat{u}$ pointwise a.e. for a subsequence $\{n_k\}_{k\in \mathbb {N}}$. Using $\Vert \mathcal {P}^{n_k}\widehat{u}\Vert _{L^{\infty }(D)^{5}}\le C_\mathcal {P}\Vert \widehat{u}\Vert _{L^{\infty }(D)^{5}}$ and $\partial _ty+Qy\in C([0,T],H)$, we see that (26) corresponds to linear combinations of uniformly bounded functions in t for all k. In particular, we can find $g\in L^1(0,T)$ such that

$$\begin{aligned} \sup _{k}\Vert [B'(u)\big [\widehat{u}-\mathcal {P}^{n_k}\widehat{u}\big ](\partial _ty+Qy)(t)\Vert _H\le g(t). \end{aligned}$$

By (28) we can then apply the dominated convergence theorem for integration in time which yields $\Vert B'(u)\big [\widehat{u}-\mathcal {P}^{n_k}\widehat{u}\big ](\partial _ty+Qy)\Vert _{L^1((0,T),H)}\rightarrow 0$ as $k\rightarrow \infty $. By uniqueness of the pointwise limit $\widehat{u}$ the latter convergence even holds for the whole sequence, see [20, Prop. 10.13(2)]. $\square $

Condition (6), which guarantees the regularization property (Theorem 2.3), was proven for the visco-elastic case (24) in [26, Theorem 4.5 and Remark 4.6]. Unfortunately, the TCC, which is the remaining condition for the rigorous applicability of REGINN $^{\infty }$, is subject of current research in the context of FWI and only special cases are known to hold. For example, in [6] the TCC has recently been shown for a semi-discrete setting in the acoustic regime, cf. (31) below.

Remark 3.7

The findings for the full waveform forward operator in this subsection, that is, (5) and (6) hold, carry over to parameter identification tasks for other hyperbolic evolution equations of the form (25). Examples are the parameter-to-state maps with respect to the acoustic and elastic wave equations (which are in fact simplifications of the visco-elastic model). A further example is the Maxwell system where the components of $y=(\textbf{E}, \textbf{H})$ are the electric and magnetic fields, respectively. The involved operators are

$$\begin{aligned} A=\begin{pmatrix} \sigma \text {Id} &{} -\text {curl}\\ \text {curl} &{} 0 \end{pmatrix},\ \ B=\begin{pmatrix} \varepsilon \text {Id}&{}0 \\ 0 &{} \mu \text {Id} \end{pmatrix},\quad Q=0,\ \text { and }\ f= \begin{pmatrix}-\textbf{J}_{\text {e}} \\ \textbf{J}_{\text {m}} \end{pmatrix} \end{aligned}$$

where $\varepsilon ,\mu ,\sigma :D\rightarrow \mathbb {R}$ are the (electric) permittivity, the (magnetic) permeability, and the conductivity. Further, $\textbf{J}_{\text { e}}, \textbf{J}_{\text { m}}:[0,\infty )\times D\rightarrow \mathbb {R}^3$ are the current and magnetic densities. See [27, Section 5] for the details to verify (5) and (6) for the map $F:(\varepsilon , \mu )\mapsto y$.

3.3 A stronger result for FWI in the acoustic regime

Setting $\mu =0$, $\tau _{{\varvec{\sigma }},l}=0$, $\tau _{\text {\tiny P}}=\tau _{\text {\tiny S}}=0$ in (24) and introducing the hydrostatic pressure $p={\text {tr}}(\varvec{\sigma }_0)/3$, we can formally derive the acoustic wave equation as first order system,

$$\begin{aligned} \rho \,\partial _t \textbf{v}- \nabla p&= {\varvec{f}}_1 \qquad \text {in }(0,T)\times D, \end{aligned}$$

(31a)

$$\begin{aligned} \frac{1}{\rho \nu ^2} \,\partial _t p - \mathrm{\,div\,}\textbf{v}&= f_2 \qquad \text {in } (0,T)\times D, \end{aligned}$$

(31b)

where we allow two source terms and where $\nu =v_\text {\tiny P}$ is the compression wave speed. In our abstract formulation (25), it is represented by $y=(\textbf{v},p)\in L^2((0,T),H)$, $H=L^2(D,\mathbb {R}^d)\times L^2(D)$, and the operators

$$\begin{aligned} A=- \begin{pmatrix} 0&{} \nabla \\ \mathrm{\,div\,}&{} 0 \end{pmatrix}, \quad B= \begin{pmatrix} \rho \text {Id} &{} 0\\ 0 &{}\frac{1}{\rho \,\nu ^2} \text {Id} \end{pmatrix}, \quad Q=0.\end{aligned}$$

(32)

Further, A is defined on

$$\begin{aligned} \mathcal {D}(A)=\big \{(\textbf{v},p)\in L^2(D,\mathbb {R}^d) \times H_0^1(D):\ \mathrm{\,div\,}\textbf{v}\in L^2(D)\big \}. \end{aligned}$$

(33)

If ${\varvec{f}}=({\varvec{f}}_1,f_2)\in W^{1,1}((0,T),H)$ then (31) has a unique classical solution in $C^1([0,T],H) \cap C([0,T],\mathcal {D}(A))$, see [27] (recall that we have zero initial conditions (24d)).

The parameter-to-state map here is

$$\begin{aligned} F:\mathcal {D}(F) \subset L^\infty (D)^2\rightarrow L^2((0,T),H),\quad u=(\rho ,\nu )\mapsto y=(\textbf{v},p), \end{aligned}$$

(34)

with domain of definition

$$\begin{aligned} \mathcal {D}(F)=\big \{(\rho ,\nu )\in L^{\infty }(D)^{2}:\&0<\rho _{\min }<\rho<\rho _{\max }<\infty \ \text { and } \nonumber \\&0<\nu _{\min }< \nu< \nu _{\max }<\infty \ \text { a.e. in }D\big \}. \end{aligned}$$

(35)

As explained in Remark 3.7, F satisfies (5) and (6). In this part we even verify the stronger compatibility condition (19), that is, for a special choice of $X^n$ and X, see (18), we will specify the decay function $\beta $ in Theorem 3.8 below. For this purpose, we restrict ourselves to the discretization space $X^n=X^n_1\times X^n_1$ with $X^n_1=X^n_1(D)$ from (A3) with $N=1$. Thus, $X^n$ consists of piecewise constant functions in the sequel. The associated projector onto $X^n$ then reads

$$\begin{aligned} \mathcal {P}^n:L^{\infty }(D)^{2}\rightarrow X^n,\quad \mathcal {P}^nu=(\mathcal {P}_1^nu_1,\mathcal {P}_1^nu_2), \end{aligned}$$

(36)

whose components are given by

$$\begin{aligned} \mathcal {P}_1^nu_j=\sum _{k\in \mathcal {I}_n}2^{nd}\left( \int _{\square ^n_{k}}u_j(x)\;\textrm{d}x\right) \mathbb {1}_{\square ^n_{k}} \end{aligned}$$

for $j\in \{1,2\}$ according to (A5). Here,

$$\begin{aligned} \square ^n_{k}: =2^{-n}([0,1]^d+k) \end{aligned}$$

(37)

is the translated and dilated unit cube. Further, $\mathcal {I}_n = \{k\in \mathbb {Z}^d: \square ^n_{k}\subset \overline{D}\} $, see (A4).

We are left to determine $X\subset L^{\infty }(D)^2$ where the subspace X is governed by a stronger topology measuring some kind of smoothness. Intuitively, X should be large enough to still contain a wide class of discontinuous profiles, on the other hand we need some minimal a priori regularity such that its $X^n$-projections facilitate a common decay rate in (19). For $s>0$ fixed we set

$$\begin{aligned}X:=L^{\infty }_s(D)^2 \end{aligned}$$

whose component spaces are characterized by

$$\begin{aligned} L^{\infty }_s(D):=\left\{ w\in L^{\infty }(D):\ \sup _{h\ne 0}\frac{\Vert w(\cdot - h)-w\Vert _{L^2(D^{h})}}{\vert h\vert ^s}<\infty \right\} \end{aligned}$$

(38)

with $D^{h}:=\{x\in D:\ x-h\in D\}$ for any $h\in \mathbb {R}^d$. We assign the norm

$$\begin{aligned} \Vert \cdot \Vert _{L^{\infty }_s(D)}:=\Vert \cdot \Vert _{L^{\infty }(D)}+[\cdot ]_{B^{s}_{2,\infty }(D)}, \end{aligned}$$

where $[\cdot ]_{B^{s}_{2,\infty }(D)}$ is a semi-norm given by the numerical value of the supremum in (38). Originally, $[\cdot ]_{B^{s}_{2,\infty }(D)}$ emerges from the definition of Hilbertian Besov-Nikolskii spaces

$$\begin{aligned} B^{s}_{2,\infty }(D)=\left\{ w\in L^2(D):\ \sup _{h\ne 0}\frac{\Vert w(\cdot - h)-w\Vert _{L^2(D^{h})}}{\vert h\vert ^s}<\infty \right\} \end{aligned}$$

with $\Vert \cdot \Vert _{B^{s}_{2,\infty }(D)}:=\Vert \cdot \Vert _{L^2(D)}+[\cdot ]_{B^{s}_{2,\infty }(D)}$, see [28]. We set

$$\begin{aligned} \Vert u\Vert _{X}^2:=\Vert \rho \Vert _{L^{\infty }_s(D)}^2+\Vert \nu \Vert _{L^{\infty }_s(D)}^2 \end{aligned}$$

(39)

and obviously have that $\Vert u\Vert _{L^{\infty }(D)^2}\le \Vert u\Vert _{X}$ for all $u\in X$.

Theorem 3.8

Let $D\subset \mathbb {R}^d$, $d\in \{1,2,3\}$, be a domain with $C^1$ boundary and assume that ${\varvec{f}}=({\varvec{f}}_1,f_2)\in W^{2,1}((0,T),H)$ with ${\varvec{f}}(0)=\partial _t {\varvec{f}}(0)=\textbf{0}$ and ${\varvec{f}}_1\in C([0,T],L^{q}(D,\mathbb {R}^d))$ for some $q\in (2,q_{\max })$, where $q_{\max }$ only depends on D and the ratio $\rho _{\max }/\rho _{\min }<\infty $ of the parameters from the definition of $\mathcal {D}(F)$. Let $\mathcal {P}^n$ be as in (36) and let $X=L^\infty _s(D)^2$ be as above for some $0<s\le 1/2$. Then, there is a neighborhood U of $u^+\in \mathcal {D}(F)$ such that

$$\begin{aligned} \Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\Vert _{L^2((0,T),H)}\le C^+ \Vert \widehat{u}\Vert _{X}\Big (2^{-s(q-2)/(4d(q-1))}\Big )^n \end{aligned}$$

(40)

for all $u\in U$ and any $\widehat{u}\in X$.

To prove the theorem, some preparation is required. We start with the observation that $L^{\infty }_s(D)\subset B^{s}_{2,\infty }(D)$ and $X_1^n\subset L^{\infty }_s(D)$ if $s\le 1/2$. The latter inclusion can be seen as follows: Let $h=(h_1,\dots ,h_d)$ with $\vert h\vert <2^{-n}$. Without loss of generality we may assume that $h_i\ge 0$. Now, thanks to the symmetry of the cubes, we estimate

$$\begin{aligned}&\Vert \mathbb {1}_{-h+\square ^n_{k}}-\mathbb {1}_{\square ^n_{k}}\Vert _{L^2(D^{h})}^2\\&\qquad \qquad \le 2\text {vol}_d\Big (\big [0,2^{-n}\big ]^d\backslash \big [0,2^{-n}-h_1\big ]\times \dots \times \big [0,2^{-n}-h_d\big ]\Big )\\&\qquad \qquad = 2\Big (2^{-nd}-\prod _{i=1}^d\big (2^{-n}-h_i\big )\Big )\\&\qquad \qquad \le 2\Big (2^{-nd}-\big (2^{-n}-\vert h\vert \big )^d\Big ) \le d\,2^{-n(d-1)+1}\vert h\vert \end{aligned}$$

for all $k\in \mathcal {I}_n$, where we used the mean value theorem in the last step.

Lemma 3.9

Let D be a bounded Lipschitz domain and let $\mathcal {P}^n$ be as in (36). Then, for any $u\in X$ and $0<s\le 1/2$,

$$\begin{aligned} \Vert u-\mathcal {P}^nu\Vert _{L^2(D)^2}\le C\Vert u\Vert _{X}2^{-ns/d}, \end{aligned}$$

where C only depends on D and the dimension d.

Proof

In view of (39) it suffices to prove the assertion for each component $\mathcal {P}^n_1$ of $\mathcal {P}^n$ and for any $w\in L^{\infty }_s(D)$. Let $\square \subset \mathbb {R}^d$ be a sufficiently large rectangle containing D. According to [29] there exists $\widetilde{w}\in B^s_{2,\infty }(\square )$ such that $\widetilde{w}\vert _D=w$ and $\Vert \widetilde{w}\Vert _{B^s_{2,\infty }(\square )}\le \widetilde{C}\Vert w\Vert _{B^s_{2,\infty }(D)}$, where $\Vert w\Vert _{B^s_{2,\infty }(D)}$ can be replaced by the stronger norm $\Vert w\Vert _{L^{\infty }_s(D)}$. Using a dyadic partition of $\square $ at level n based on the cubes $\{\square _{k}^n\}_{k}$ (for which we restrict $\square $ to have integer side length), we have

$$\begin{aligned} \Vert \widetilde{w}-\mathcal {\widetilde{P}}^n_1\widetilde{w}\Vert _{L^2(\square )}\le [\widetilde{w}]_{B^s_{2,\infty }(\square )}2^{-ns/d}, \end{aligned}$$

see [30], where $\mathcal {\widetilde{P}}^n_1$ is defined as in (36) but with respect to the larger index set $\mathcal {I}_n(\square )= \{k\in \mathbb {Z}^d:\square ^n_{k}\subset \overline{\square }\}$. Setting

$$\begin{aligned} D_n:=\bigcup _{k\in \mathcal {I}_n(D)}\square _{k}^n, \end{aligned}$$

we conclude that $\mathcal {P}^n_1w\vert _{D_n}=\mathcal {\widetilde{P}}^n_1\widetilde{w}\vert _{D_n}$ and $\mathcal {P}^n_1w\vert _{D\backslash D_n}=0$. The latter implies

$$\begin{aligned} \Vert w-\mathcal {P}^n_1w\Vert _{L^2(D\backslash D_n)}=\Vert w\Vert _{L^2(D\backslash D_n)}&\le \sqrt{\text {vol}_d(D\backslash D_n)}\,\Vert w\Vert _{L^\infty (D)}\\&\le \sqrt{\text {vol}_d(D\backslash D_n)}\,\Vert w\Vert _{L^{\infty }_s(D)} \end{aligned}$$

for which we can estimate for some $C_D<\infty $

$$\begin{aligned} \text {vol}(D\backslash D_n) \le C_D\mathcal {H}^{d-1}(\partial D)\, 2^{-n} \end{aligned}$$

according to the inclusion $D\backslash D_n\subset \cup _{x\in \partial D} \big (x+2^{-n}[-1,1]^d]\big )$, where $\mathcal {H}^{d-1}(\partial D)$ denotes the $d-1$-dimensional Hausdorff measure of $\partial D$. Altogether, we obtain for $s\le 1/2$ that

$$\begin{aligned} \Vert w-\mathcal {P}^n_1w\Vert _{L^2(D)}&\le \Vert w-\mathcal {P}^n_1w\Vert _{L^2(D_n)} + \Vert w-\mathcal {P}^n_1w\Vert _{L^2(D\backslash D_n)} \\&\le \big (\widetilde{C}2^{-ns/d} + \sqrt{C_D\mathcal {H}^{d-1}(\partial D)}2^{-n/2}\big ) \Vert w\Vert _{L^{\infty }_s(D)} \\&\le C \Vert w\Vert _{L^{\infty }_s(D)} 2^{-ns/d} \end{aligned}$$

which proves the lemma. $\square $

Lemma 3.10

Under the assumptions of Theorem 3.8, there is a $q_{\max }>2$ such that for any $u^+=(\rho ^+,\nu ^+)\in \mathcal {D}(F)$ there exists a neighborhood U of $u^+$ such that

$$\begin{aligned} \sup \left\{ \Vert \partial _ty\Vert _{L^1((0,T),H^{q})}:y=(p,\textbf{v})=F(u),\ u= (\rho ,\nu )\in U \right\} <\infty \end{aligned}$$

for any $q\in (2,q_{\max })$ where

$$\begin{aligned} H^q:=L^q(D)\times L^q(D,\mathbb {R}^d). \end{aligned}$$

(41)

Here, $q_{\max }$ only depends on D and the ratio $\rho _{\max }/\rho _{\min }<\infty $ of the parameters from the definition of $\mathcal {D}(F)$.

Proof

The proof makes use of converting higher time regularity to higher spatial integrability. By Theorem 2.6 of [27], we know that for $u^+\in \mathcal {D}(F)$ we have $\partial _ty=(\partial _t p,\partial _t\textbf{v})\in C([0,T],\mathcal {D}(A))\cap C^1([0,T],H)$, in particular $\partial _t p\in L^1((0,T),H^1_0(D))$. By Sobolev embedding, see [31], we obtain at least $\partial _t p\in L^1((0,T),L^6(D))$ for $d\in \{1,2,3\}$. Again by Theorem 2.6 of [27], since the constant of the stability estimate depends continuously on B and thus on $u$, we can actually conclude $\Vert \partial _t p\Vert _{L^1((0,T),L^6(D))}<\infty $ uniformly in a neighborhood of $u^+$. Concerning $\partial _t\textbf{v}$, we only have integrability information about its divergence. Therefore, we first note that by (31) we have

$$\begin{aligned} -\mathrm{\,div\,}\left( \frac{1}{\rho }\nabla p\right) =\mathrm{\,div\,}\left( \frac{{\varvec{f}}_1}{\rho }\right) -\mathrm{\,div\,}\partial _t\textbf{v}=\mathrm{\,div\,}\left( \frac{{\varvec{f}}_1}{\rho }\right) + \left( \partial _t f_2-\frac{\partial _t^2 p}{\rho \nu ^2}\right) \end{aligned}$$

in the sense of distributions. As $(\rho , \nu )\in \mathcal {D}(F)$ is bounded away from zero uniformly, the parameters do not affect the integrability of the right-hand side terms. Now Meyers’ estimate, see [32, Thm. 1], implies for fixed $t\in [0,T]$ and any $q\in (2,q_{\max })$ that

$$\begin{aligned} \Vert \nabla p\Vert _{L^q(D,\mathbb {R}^d)}\le C \left( \frac{\Vert {\varvec{f}}_1\Vert _{L^q(D,\mathbb {R}^d)}}{\rho _{\min }}+\Vert \partial _t f_2\Vert _{L^2(D)}+\frac{\Vert \partial _t^2 p\Vert _{L^2(D)}}{\rho _{\min } \nu _{\min }^2}\right) , \end{aligned}$$

where $q_{\max }$ depends on D and $\rho _{\max }/\rho _{\min }<\infty $, while C additionally depends on q. Since $\Vert \partial _t^2 p\Vert _{L^2(D)}$ can be bounded in terms of $\Vert \partial _t^2 {\varvec{f}}\Vert _{L^1((0,T),H)}$ and a constant which depends continuously on B, that is, on u, see Theorem 2.6 of [27], we concude that $\Vert \nabla p\Vert _{L^1((0,T),L^q(D,\mathbb {R}^d))}<\infty $ uniformly in a neighborhood of $u^+$. Comparing with (31), we finally get that also $\Vert \partial _t\textbf{v}\Vert _{L^1((0,T),L^q(D,\mathbb {R}^d))}<\infty $ locally uniform in $u$. This completes the proof. $\square $

Finally, we can prove the main result.

Proof of Theorem 3.8

The proof uses a more elaborate analysis of the stability estimate

$$\begin{aligned} \Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )(t)\Vert _{L^2((0,T),H)}\le C\Vert B'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\partial _ty\Vert _{L^1((0,T),H)} , \end{aligned}$$

compared to Lemma 3.6. Recall that the constant C depends continuously on the operator norms of $B(u)$, $B^{-1}(u)$ and on T according to (29). Since also $B\mapsto B^{-1}$ is (locally) continuous, we can assume without loss of generality that the above inequality holds for some fixed $C=C_{r',u^+}>0$ for all $\Vert u-u^+\Vert _{L^{\infty }(D)^{2}}<r'$ by shrinking the original $r'>0$ otherwise. Then, for any $\delta >0$,

$$\begin{aligned}&\Vert B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ]\partial _ty\Vert _{L^1((0,T),H)} \\&=\left\| B'(u)\left[ \frac{\widehat{u}-\mathcal {P}^n\widehat{u}}{\Vert \widehat{u}\Vert _{X}}\right] \big (\partial _ty(\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}+\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert< \delta \Vert \widehat{u}\Vert _{X}\}})\big )\right\| _{L^1((0,T),H)} \\&\qquad \times \Vert \widehat{u}\Vert _{X} \\&\le \Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\frac{\Vert \widehat{u}-\mathcal {P}^n\widehat{u}\Vert _{L^{\infty }(D)^{2}}}{\Vert \widehat{u}\Vert _{X}}\, \left\| \partial _ty\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}\right\| _{L^1((0,T),H)} \\&\qquad \times \Vert \widehat{u}\Vert _{X} \\&\qquad +\Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\delta \left\| \partial _ty\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert < \delta \Vert \widehat{u}\Vert _{X}\}}\right\| _{L^1((0,T),H)}\Vert \widehat{u}\Vert _{X}. \end{aligned}$$

Due to $\Vert \widehat{u}-\mathcal {P}^n\widehat{u}\Vert _{L^{\infty }(D)^{2}}\le (1+C_{\mathcal {P}})\Vert \widehat{u}\Vert _{L^{\infty }(D)^{2}}\le (1+C_{\mathcal {P}})\Vert \widehat{u}\Vert _{X}$ by (S2) and $C_X=1$ in (18), we obtain, by dropping the complementary indicator function in the bottom line above, that

$$\begin{aligned}{} & {} \Vert B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ]\partial _ty\Vert _{L^1((0,T),H)} \\{} & {} \quad \le (1+C_{\mathcal {P}})\Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\left\| \partial _ty\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}\right\| _{L^1((0,T),H)}\Vert \widehat{u}\Vert _{X} \\{} & {} \qquad +\Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\delta \left\| \partial _ty\right\| _{L^1((0,T),H)}\Vert \widehat{u}\Vert _{X}. \end{aligned}$$

The middle line here can also be directly expressed in terms of $\delta $. To this end, we apply Hölder’s inequality with exponent $q/2>1$ in view of (41) to get

$$\begin{aligned}&\left\| \partial _ty(\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}\right\| _{L^1((0,T),H)}\\&\qquad \qquad = \int _0^T \left( \int _D \big \vert \partial _ty(t)\big \vert ^2(x)\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}(x)\;\textrm{d}x\right) ^{1/2}\textrm{d}t \\&\qquad \qquad \le \int _0^T \left( \big \Vert \partial _ty(t)\big \Vert _{H^q}^2 \text {vol}\big (\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}\big )^{(q-2)/q} \right) ^{1/2}\textrm{d}t \\&\qquad \qquad =\Vert \partial _ty\Vert _{L^1((0,T),H^q)}\text {vol}\big (\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}\big )^{(q-2)/(2q)} \\&\qquad \qquad \le \Vert \partial _ty\Vert _{L^1((0,T),H^q)}\left( \frac{\Vert \widehat{u}-\mathcal {P}^n\widehat{u}\Vert _{L^2(D)^{2}}^2}{\delta ^2\Vert \widehat{u}\Vert _{X}^2}\right) ^{(q-2)/(2q)}, \end{aligned}$$

where we additionally employed Tschebyscheff’s inequality in the bottom line, see e.g., [33]. With Lemma 3.9 we can further estimate

$$\begin{aligned} \frac{\Vert \widehat{u}-\mathcal {P}^n\widehat{u}\Vert _{L^2(D)^{2}}^2}{\delta ^2\Vert \widehat{u}\Vert _{X}^2}\le \frac{C\Vert \widehat{u}\Vert _{X}^2\big (2^{-(s/d)}\big )^n}{\delta ^2\Vert \widehat{u}\Vert _{X}^2}= \frac{C\big (2^{-(s/d)}\big )^n}{\delta ^2}. \end{aligned}$$

Altogether, we get with a similar Hölder-inequality argument for

$$\begin{aligned} \Vert \partial _ty\Vert _{L^1((0,T),H)}\le \Vert \partial _ty\Vert _{L^1((0,T),H^q)}\text {vol}_d(D)^{(q-2)/(2q)} \end{aligned}$$

that

$$\begin{aligned}{} & {} \Vert B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ]\partial _ty\Vert _{L^1((0,T),H)} \le \Vert \widehat{u}\Vert _{X}\Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\Vert \partial _ty\Vert _{L^1((0,T),H^q)}\\{} & {} \quad \times \Bigg ((1+C_{\mathcal {P}})\left( \frac{C\big (2^{-(s/d)}\big )^n}{\delta ^2}\right) ^{(q-2)/(2q)}+\delta \text {vol}_d(D)^{(q-2)/(2q)}\Bigg ) \end{aligned}$$

which holds for all $\delta >0$. Optimization in $\delta $ then yields $\delta \propto \big (2^{-s(q-2)/(4d(q-1))}\big )^n$. By Lemma 3.10 and the continuity of $B'$, we can indeed find $C^+<\infty $ such that (40) holds. $\square $

4 Numerical Results

We present numerical experiments^{Footnote 4} on a two-parameter reconstruction to demonstrate the operation of REGINN $^\infty $ in a test scenario where all assumptions required for our analysis in the previous sections are satisfied.

Recall from Theorem 2.3 that, in general, the regularization property holds only in the weak-$\star $ topology permitting a kind of strange convergence behavior. Therefore, we test Algorithm 1 as the noise level approaches zero and also how it behaves under different initial spaces $X^{n_0}$. We will start with a rather low dimensional $X^{n_0}$ such that $n_m$ increases successively in the course of the Newton iteration (while-loop) and in contrast also with some large dimensional $X^{n_0}$ which corresponds to a more static use of Tikhonov regularization throughout all iterations.

Our experiments rely on the acoustic wave equation in one spatial dimension, $d=1$, where $D=(0,1)$ and $T=1$:

$$\begin{aligned} \begin{aligned} \frac{1}{\rho \nu ^2} \,\partial _t p - \partial _{x} v&= f_1{} & {} \text {in }(0,1)\times (0,1),\\ \rho \, \partial _t v - \partial _{x} p&= f_2{} & {} \text {in }(0,1)\times (0,1),\\ v(0,\cdot )=p(0,\cdot )&= 0{} & {} \text {on } (0,1),\\ p(\cdot ,0)=p(\cdot ,1)&=0{} & {} \text {on } (0,1). \end{aligned} \end{aligned}$$

(42)

The source components $f_1,f_2:[0,1]\times [0,1]\rightarrow \mathbb {R}$ are

$$\begin{aligned} \begin{aligned} f_1(t,x)&=f_1(x)=100\Big (x(x-1)\frac{1}{\rho (x)\nu (x)^2}-\frac{\pi }{2}\cos \left( \frac{\pi }{2}x\right) \Big ),\\ f_2(t,x)&=100\Big (-t(2x-1)+\sin \left( \frac{\pi }{2}x\right) \rho (x)\Big ), \end{aligned} \end{aligned}$$

(43)

where

$$\begin{aligned} \rho (x)=1+\frac{1}{5}\,\mathbb {1}_{[7/30,17/30]}(x) \quad \text {and} \quad \nu (x)=1-\frac{1}{10}\,\mathbb {1}_{[13/30,23/30]}(x).\end{aligned}$$

(44)

The corresponding exact data, that is, the solution of (42) and (43), are given by

$$\begin{aligned} p(t,x)=100tx(x-1)\quad \text {and}\quad v(t,x)=100t\sin \left( \frac{\pi }{2}x\right) . \end{aligned}$$

(45)

We solve the appearing wave equations during inversion for the parameters by the FEM-based MATLAB (R2021a) command pdepe with 300 spatial and 100 temporal grid points. Both sets of points are distributed equidistantly in [0, 1].

Our discrete parameter spaces $X^n=X^n_1\times X^n_1$ are generated by the piecewise constant cardinal B-spline as explained in Appendix A.1. So, conditions (S1)–(S3) are fulfilled. Note that the dimension of $X^n_1$ is $2^n$. In view of Remark 2.5 we set $n_{\max }=8$ yielding the semi-discrete parameter-to-state map

$$\begin{aligned} F_{n_{\max }}:\mathcal {D}(F_{n_{\max }}) \subset L^\infty (D)^2 \rightarrow L^2([0,1],H),\quad (\rho ,\nu )\mapsto (p,v), \end{aligned}$$

(46)

where $(p,v)\in L^2([0,1],H)$, $H= L^2(0,1)^2$, solves (42) with (43) and $(\rho ,\nu )\in \mathcal {D}(F_{n_{\max }})= X^{n_{\max }} \cap \mathcal {D}(F)$, see (35) for $\mathcal {D}(F)$. Within our computations, $\Vert \cdot \Vert _{ L^2([0,1],H)}$ is discretized by the corresponding space-time Euclidean norm and denoted by $\Vert \cdot \Vert $. Since $F_{n_{\max }}$ satisfies the TCC (3), see Appendix C, Theorem 2.1 guarantees termination of REGINN $^\infty $ applied to the inverse problem

$$\begin{aligned} \text {find } (\rho ,\nu )\in X^{n_{\max }}:\quad F_{n_{\max }}(\rho ,\nu )\approx (p^\delta ,v^\delta ). \end{aligned}$$

(47)

To simplify notation we use the same symbols for the continuous and the discrete versions of functions such as p, v, $\rho $, $\nu $, etc.

We apply REGINN $^\infty $ (Algorithm 1) to (47) where we choose $q_n=n/\log _2 C_\infty $ for $J_{n,m}$ from (10) in accordance with the lower bound in (A6) below. For each m the computation of Newton update candidates $s_m\in X^n$ is realized—benefiting greatly from the smoothness of the Tikhonov functionals—by a steepest descent routine with Armijo stepsize rule in a loop over n until (9) is met. We adapt $\mu _m$ during iteration according to the rule proposed in [4]: we start with $\mu _1=\mu _0$ and set

$$\begin{aligned} \mu _m={\left\{ \begin{array}{ll} \min \{1-\frac{j_{m-2}}{j_{m-1}}(1-\mu _{m-1}), 0.999\}, &{} j_{m-1}\ge j_{m-2},\\ 0.9\mu _{m-1}, &{} \text {otherwise}, \end{array}\right. } \qquad m\ge 2, \end{aligned}$$

where $\mu _0\in (0,1)$ is user-supplied and $j_m$ denotes the number of gradient decent steps needed to compute the update $s_m$. Complementary, the underlying discretization level n will be increased if the gradient descent loop stagnates on $X^n$, which we consider to occur if the ratio of two successive gradient step evaluations does not exceed a fixed threshold close to 1, say 0.99999. We stop the algorithm either by the discrepancy principle or if $n\ge n_{\max }$ happens, that is, if the discretization of $X^n$ would become finer than the computational grid used in the pdepe-routine for solving the wave equation. In the latter case, we still perform $u_{m+1}=u_m+\widetilde{s}_m$ with the last update candidate $\widetilde{s}_m\in X^{n_{\max }}$ before abortion. We emphasize that $\widetilde{s}_m$ is not a Newton update in the sense of (9), but the corresponding $u_{m+1}$ might still fulfill the discrepancy principle unlike $u_m$.

In our experiments we especially want to detect the jump regions [7/30, 17/30] and [13/30, 23/30], where the parameters differ from the homogeneous background material $(\rho _0,\nu _0)=(1,1)\in X^{n_{\max }}$, respectively, that we take as initial guess. Note that no grid point of $X^n$ coincides with either of the jump discontinuity points for all n so that the error of any reconstruction of $\rho $ and $\nu $ will always be at least $(\max {\rho }-\min {\rho })/2=1/10$ and $(\max {\nu }-\min {\nu })/2=1/20$ with respect to the $L^{\infty }$-norm, respectively.

First, we investigate the case of ‘exact’ data, that is, $(p^\delta ,v^\delta )= (p,v)$ with (p, v) from (45). Despite of $\delta =0$ our data might still be contaminated by some discretization error with respect to $F_{n_{\max }}$ since the corresponding analytical solution $(\rho ,\nu )$ given by (44) is not contained in $X^{n_{\max }}$. Choosing $\mu _0=0.7$, $\gamma =0.8$, $C_\infty =1.1$, we run Algorithm 1 for different $n_0$ to observe how its choice affects the outcome. Note that setting $\tau $ is redundant here because termination is solely forced by n exceeding $n_{\max }$. Figure 1 displays the exact parameter functions $\rho $ and $\nu $ (red) and the corresponding outputs $\rho _M$ and $\nu _M$ (blue) of Algorithm 1 when starting with $n_0\in \{2,5,8\}$, respectively. We see that the larger $n_0$ is, the smoother the output becomes, while the points of discontinuity are more sharply located for smaller $n_0$. Hence, for the reconstruction of jump discontinuities, $n_0$ shall be chosen large enough to locate discontinuity points sufficiently precise while at the same time it should not be too large to prevent oversmoothing. Figure 2 shows a more detailed convergence history in the case $n_0=2$ and confirms that the majority of Newton steps is indeed undertaken with $n_m\le 4$.

Next, we study the case of noisy data. For this purpose we generate noise vectors $\zeta $ as random samples from a centered Gaussian distribution and scale it such that $\Vert \zeta \Vert =\delta \Vert (p,v)\Vert $. Since $\delta $ is a relative perturbation here, the discrepancy principle must be adjusted accordingly. As before, we employ Algorithm 1 with $\mu _0=0.7$, $\gamma =0.8$, $C_\infty =1.1$, $\tau =1.1$. Using the insights from our exact data case we set $n_0=5$ as initial value to balance the aforementioned effects of globally smooth and locally oscillating reconstructions. The corresponding results for $\rho _M$ and $\nu _M$ are shown in Fig. 3 for $\delta =5\%$, $\delta =2\%$, and $1\%$. We see that the reconstructions’ profiles approach the correct jump height of the exact solution as $\delta $ becomes smaller. In all three cases, termination occurs by reaching the discretization limit, however, each last update fulfills the discrepancy principle afterwards. Altogether, the plots are in agreement with the weak-$\star $ regularization property of REGINN $^\infty $ (Theorem 2.3).

5 Conclusion

We have investigated a novel iterative regularization algorithm tailored for non-linear illposed problems between $L^\infty (D)^\ell $ and a normed space Y. The main focus was on generating uniformly bounded iterates relying on a Tikhonov-like regularization term. Due to the non-smooth structure of $L^\infty (D)^\ell $, a straightforward implementation would require non-smooth or box-constrained calculus which we could circumvent by using discretization in combination with equivalent $L^p(D)^\ell $-norms for $p<\infty $. Under reasonable assumptions on the input parameters, our algorithm REGINN $^\infty $ terminates after finitely many steps. Further, it fulfills the regularization property in the weak-$\star $ topology as the noise level of the Y-data tends to zero. Depending on the non-linearity, this convergence can be reformulated as convergence with respect to a norm. Numerical experiments with a model problem illustrate the theoretical findings.

Future research may include a convergence rate analysis under higher regularity assumptions as in (19) or under more general variational source conditions with respect to a Bregman distance [13]. We could even incorporate a metric to overcome that $L^\infty (D)^\ell $ is non-separable; an approach proposed in [7]. Concerning the data space, especially the task of finding proper measures for the misfit in seismograms, the Kantorovich-Rubinstein (KR) norm has recently proven advantageous in exploration geophysics, see, e.g., [34, 35]. This fact suggests an implementation of our method under the KR-norm on Y. In fact, our theory also allows more general distance functions on Y. For instance, any distance concept is admissible which is convex in one of its two arguments (e.g. Bregman distances). Such a concept can in particular be learned within a predefined set of admissible distance functions in practice and is subject of ongoing research, see [36].

Notes

Recall that Theorem 2.1 in its original version requires an initial guess $u_0\in B_{r_0}(u^+)$. Since $L^\infty (D)^\ell $ is not separable, however, there might be no element in $X^n$ for any $n\in \mathbb {N}$ which satisfies this closeness condition. As remedy we may even enlarge the parameter space for the semi-discrete problem to $U_0+X^{n_{\max }}$ where $U_0\subset L^\infty (D)^\ell $ is a proper finite dimensional subspace such that $u_0\in U_0$. In this case, we assume $u^+\in U_0+X^{n_{\max }}$.
Zero initial conditions are not a general restriction. We use them here only to ease the presentation.
That is, each of the 5 parameters is restricted from below and from above by physically meaningful bounds. Consult (35) for a concrete example.
For the reader’s own experiments we provide our MATLAB code on http://www.math.kit.edu/ianm3/~rieder/media/reginn_infty_fig2.m. Executed in MATLAB (R2021a) on an Intel(R) Core(TM) i5-1035G4 CPU under Windows 10, the code produces the output shown in Fig. 2. In our program we use a routine by John D’Errico (2021): Piecewise functions (https://www.mathworks.com/matlabcentral/fileexchange/9394-piecewise-functions), MATLAB Central File Exchange. Retrieved November 29, 2021.

References

Mueller, J.L., Siltanen, S.: Linear and Nonlinear Inverse Problems with Practical Applications. Computational Science & Engineering, vol. 10. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (2012). https://doi.org/10.1137/1.9781611972344
Book MATH Google Scholar
Fichtner, A.: Full Seismic Waveform Modelling and Inversion Advances in Geophysical and Environmental Mechanics and Mathematics. Springer, Berlin (2011). https://doi.org/10.1007/978-3-642-15807-0
Book Google Scholar
Virieux, J., Operto, S.: An overview of full-waveform inversion in exploration geophysics. Geophysics 74(6), 1–26 (2009). https://doi.org/10.1190/1.3238367
Article Google Scholar
Rieder, A.: On the regularization of nonlinear ill-posed problems via inexact Newton iterations. Inverse Prob. 15(1), 309–327 (1999). https://doi.org/10.1088/0266-5611/15/1/028
Article MathSciNet MATH Google Scholar
Margotti, F.J.: On inexact Newton methods for inverse problems in Banach spaces. Ph.D. thesis, Karlsruher Institut für Technologie (KIT) (2015). https://doi.org/10.5445/IR/1000048606
Eller, M., Rieder, A.: Tangential cone condition and Lipschitz stability for the full waveform forward operator in the acoustic regime. Inverse Prob. 37, 085011 (2021). https://doi.org/10.1088/1361-6420/ac11c5
Article MathSciNet MATH Google Scholar
Pöschl, C., Resmerita, E., Scherzer, O.: Discretization of variational regularization in Banach spaces. Inverse Probl. 26(10), 105017 (2010). https://doi.org/10.1088/0266-5611/26/10/105017
Article MathSciNet MATH Google Scholar
Kaltenbacher, B., Neubauer, A., Scherzer, O.: Iterative Regularization Methods for Nonlinear Ill-Posed Problems. Radon Series on Computational and Applied Mathematics, vol. 6. Walter de Gruyter GmbH & Co. KG, Berlin (2008). https://doi.org/10.1515/9783110208276
Book MATH Google Scholar
Schuster, T., Kaltenbacher, B., Hofmann, B., Kazimierski, K.S.: Regularization Methods in Banach Spaces. De Gruyter, Berlin (2012). https://doi.org/10.1515/9783110255720
Book MATH Google Scholar
Clason, C., Klassen, A.: Quasi-solution of linear inverse problems in non-reflexive Banach spaces. J. Inverse Ill-Posed Probl. 26(5), 689–702 (2018). https://doi.org/10.1515/jiip-2018-0026
Article MathSciNet MATH Google Scholar
Iusem, A.N., Resmerita, E.: A proximal point method in nonreflexive Banach spaces. Set-Valued Var. Anal. 18(1), 109–120 (2010). https://doi.org/10.1007/s11228-009-0126-z
Article MathSciNet MATH Google Scholar
Kaltenbacher, B., de Souza, M.L.P.: Convergence and adaptive discretization of the IRGNM Tikhonov and the IRGNM Ivanov method under a tangential cone condition in Banach space. Numer. Math. 140(2), 449–478 (2018). https://doi.org/10.1007/s00211-018-0971-5
Article MathSciNet MATH Google Scholar
Kaltenbacher, B., Klassen, A.: On convergence and convergence rates for Ivanov and Morozov regularization and application to some parameter identification problems in elliptic PDEs. Inverse Probl. 34(5), 055008 (2018). https://doi.org/10.1088/1361-6420/aab739
Article MathSciNet MATH Google Scholar
Kaltenbacher, B., Klassen, A., Previatti de Souza, M.L.: The Ivanov regularized Gauss-Newton method in Banach space with an a posteriori choice of the regularization radius. J. Inverse Ill-Posed Probl. 27(4), 539–557 (2019). https://doi.org/10.1515/jiip-2018-0093
Article MathSciNet MATH Google Scholar
Schuster, T., Rieder, A., Schöpfer, F.: The approximate inverse in action: IV. Semi-discrete equations in a Banach space setting. Inverse Probl. 28(10), 104001 (2012). https://doi.org/10.1088/0266-5611/28/10/104001
Article MathSciNet MATH Google Scholar
Scherzer, O.: The use of Morozov’s discrepancy principle for Tikhonov regularization for solving nonlinear ill-posed problems. Computing 51(1), 45–60 (1993). https://doi.org/10.1007/BF02243828
Article MathSciNet MATH Google Scholar
Hanke, M., Neubauer, A., Scherzer, O.: A convergence analysis of the Landweber iteration for nonlinear ill-posed problems. Numer. Math. 72(1), 21–37 (1995). https://doi.org/10.1007/s002110050158
Article MathSciNet MATH Google Scholar
Kaltenbacher, B., Nguyen, T.T.N., Scherzer, O.: The tangential cone condition for some coefficient identification model problems in parabolic PDEs. In: Kaltenbacher, B., Schuster, T., Wald, A. (eds.) Time-dependent Problems in Imaging and Parameter Identification, pp. 121–163. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-57784-1_5
Chapter MATH Google Scholar
Lechleiter, A., Rieder, A.: Towards a general convergence theory for inexact Newton regularizations. Numer. Math. 114(3), 521–548 (2010). https://doi.org/10.1007/s00211-009-0256-0
Article MathSciNet MATH Google Scholar
Zeidler, E.: Nonlinear Functional Analysis and Its Applications I: Fixed-Point Theorems. Springer, New York (1986)
Book MATH Google Scholar
Somersalo, E., Cheney, M., Isaacson, D.: Existence and uniqueness for electrode models for electric current computed tomography. SIAM J. Appl. Math. 52(4), 1023–1040 (1992). https://doi.org/10.1137/0152060
Article MathSciNet MATH Google Scholar
Lechleiter, A., Rieder, A.: Newton regularizations for impedance tomography: convergence by local injectivity. Inverse Probl. 24(6), 065009 (2008). https://doi.org/10.1088/0266-5611/24/6/065009
Article MathSciNet MATH Google Scholar
Zeltmann, U.: The viscoelastic seismic model: existence, uniqueness and differentiability with respect to parameters. Ph.D. thesis, Karlsruhe Institute of Technology (2018). https://doi.org/10.5445/IR/1000093989
Bohlen, T.: Viskoelastische FD-Modellierung seismischer Wellen zur Interpretation gemessener Seismogramme. Ph.D. thesis, Christian-Albrechts-Universität zu Kiel (1998). https://bit.ly/2LM0SWr
Bohlen, T.: Parallel 3-D viscoelastic finite difference seismic modelling. Comput. Geosci. 28(8), 887–899 (2002)
Article Google Scholar
Kirsch, A., Rieder, A.: Inverse problems for abstract evolution equations II: higher order differentiability for viscoelasticity. SIAM J. Appl. Math. 79(6), 2639–2662 (2019). https://doi.org/10.1137/19M1269403. Corrigendum under https://doi.org/10.48550/arXiv.2203.01309
Kirsch, A., Rieder, A.: Inverse problems for abstract evolution equations with applications in electrodynamics and elasticity. Inverse Probl. 32(8), 085001 (2016). https://doi.org/10.1088/0266-5611/32/8/085001
Article MathSciNet MATH Google Scholar
Triebel, H.: Theory of Function Spaces. II. Monographs in Mathematics, vol. 84. Birkhäuser Verlag, Basel (1992). https://doi.org/10.1007/978-3-0346-0419-2
Book Google Scholar
Rychkov, V.S.: On restrictions and extensions of the Besov and Triebel–Lizorkin spaces with respect to Lipschitz domains. J. Lond. Math. Soc. 60(1), 237–257 (1999). https://doi.org/10.1112/S0024610799007723
Article MathSciNet MATH Google Scholar
Binev, P., Cohen, A., Dahmen, W., DeVore, R., Temlyakov, V.: Universal algorithms for learning theory I. Piecewise constant functions. J. Mach. Learn. Res. 6, 1297–1321 (2005)
MathSciNet MATH Google Scholar
Evans, L.C.: Partial Differential Equations. Graduate Studies in Mathematics, vol. 19, 2nd edn. American Mathematical Society, Providence (2010). https://doi.org/10.1090/gsm/019
Book Google Scholar
Meyers, N.G.: An $L^{p}$-estimate for the gradient of solutions of second order elliptic divergence equations. Ann. Sc. Norm. Sup. Pisa Cl. Sci. (3) 17, 189–206 (1963)
MATH Google Scholar
Ibe, O.C.: Markov Processes for Stochastic Modeling, 2nd edn. Elsevier, Amsterdam (2013). https://doi.org/10.1016/B978-0-12-407795-9.00001-3
Book MATH Google Scholar
Messud, J., Poncet, R., Lambaré, G.: Optimal transport in full-waveform inversion: analysis and practice of the multidimensional Kantorovich–Rubinstein norm. Inverse Prob. 37(6), 065012 (2021). https://doi.org/10.1088/1361-6420/abfb4c
Article MathSciNet MATH Google Scholar
Métivier, L., Brossier, R., Mérigot, Q., Oudet, E., Virieux, J.: An optimal transport approach for seismic tomography: application to 3D full waveform inversion. Inverse Probl. 32(11), 115008 (2016). https://doi.org/10.1088/0266-5611/32/11/115008
Article MathSciNet MATH Google Scholar
Sun, B., Alkhalifah, T.: Ml-misfit: learning a robust misfit function for full-waveform inversion using machine learning. In: Conference Proceedings, EAGE 2020 Annual Conference & Exhibition Online, 1–5 (2020). https://doi.org/10.3997/2214-4609.202010466
Schumaker, L.L.: Spline Functions: Basic Theory. Cambridge Mathematical Library, 3rd edn. Cambridge University Press, Cambridge (2007). https://doi.org/10.1017/CBO9780511618994
Book Google Scholar
Cohen, A., Daubechies, I., Feauveau, J.-C.: Biorthogonal bases of compactly supported wavelets. Commun. Pure Appl. Math. 45(5), 485–560 (1992). https://doi.org/10.1002/cpa.3160450502
Article MathSciNet MATH Google Scholar
Oswald, P.: Multilevel Finite Element Approximation. Teubner Skripten zur Numerik [Teubner Scripts on Numerical Mathematics]. B. G. Teubner, Stuttgart (1994). https://doi.org/10.1007/978-3-322-91215-2
Book Google Scholar

Download references

Acknowledgements

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—Project-ID 258734477 - SFB 1173. The authors thank Christian Rheinbay for insightful discussions.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Department of Mathematics, Karlsruhe Institute of Technology (KIT), Englerstr. 2, 76131, Karlsruhe, Germany
Lukas Pieronek & Andreas Rieder

Authors

Lukas Pieronek
View author publications
You can also search for this author in PubMed Google Scholar
Andreas Rieder
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Lukas Pieronek or Andreas Rieder.

Ethics declarations

Conflict of interest

We hereby confirm that we do not have any conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A A family of admissible subspaces

In this appendix we give a concrete construction for a family $\{X^n\}_{n\in \mathbb {N}}$ of subspaces of $L^\infty (D)^\ell $ which satisfies our assumptions (S1)–(S3) of Sect. 2 for D an open and bounded subset of $\mathbb {R}^d$. Using Cartesian products in case of $\ell >1$, cf. (30), we restrict our attention to $\ell =1$ here.

1.1 A.1 The discrete subspaces used in this work

We will rely on the cardinal B-spline $\varphi _N:\mathbb {R}\rightarrow \mathbb {R}$ of order $N\in \mathbb {N}$ which is recursively defined by

$$\begin{aligned} \varphi _N(t):=\varphi _{N-1}\star \varphi _1(t)=\int _0^1\! \varphi _{N-1}(t-s)\,\textrm{d}s,\quad \varphi _1=\mathbb {1}_{[0,1]}. \end{aligned}$$

It obeys the scaling relation

$$\begin{aligned} \varphi _N(t)= 2^{1-N}\sum _{k=0}^N {N\atopwithdelims ()k}\,\varphi _N(2t-k). \end{aligned}$$

(A1)

Further properties are

$\displaystyle \text {supp} \varphi _N=[0,N], \quad \varphi _N\vert _{]0,N[}>0,\quad \varphi _N\in \mathcal {C}^{N-2}$,
for each $k\in \mathbb {Z}$, $\varphi _N\vert _{[k,k+1]}$ is a polynomial of degree $N-1$,
for all $t\in \mathbb {R}$,
$$\begin{aligned} 1=\sum _{m\in \mathbb {Z}} \varphi _N(t-m), \end{aligned}$$
(A2)

see, e.g., [37].

Using the tensor product B-Spline $\Phi ( x):=\prod _{i=1}^d\varphi _N(x_i)$, $ x=(x_1,\ldots ,x_d)^\top \in \mathbb {R}^d$, and the notation $\Phi _{n, k}( x)=2^{nd/2}\,\Phi (2^n x- k)$, $n\in \mathbb {N}$, $ k\in \mathbb {Z}^d$, we define

$$\begin{aligned} X^n=X_N^n(D):=\text {span} \big \{\Phi _{n, k}\vert _{D}: k\in \mathcal {I}_n(D)\big \} \end{aligned}$$

(A3)

with

$$\begin{aligned} \mathcal {I}_n=\mathcal {I}_n(D):=\big \{ k\in \mathbb {Z}^d: \text {supp} \Phi _{n, k} \subset \overline{D}\big \}. \end{aligned}$$

(A4)

These spaces are nested due to (A1), so that (S1) holds. Note that $\cup _{ k\in \mathcal {I}_n} \text {supp} \Phi _{n, k}\subset \overline{D}$ which is a proper inclusion in general.

Next we demonstrate (S2). To this end we set

$$\begin{aligned} \mathcal {P}^n u= \mathcal {P}_N^n u:=\sum _{k\in \mathcal {I}_n} \langle u,\widetilde{\Phi }_{n, k}\rangle _{L^2(D)} \Phi _{n, k}\quad \text { for } u\in L^\infty (D) \end{aligned}$$

(A5)

where $\widetilde{\Phi }$ is a compactly supported dual function to $\Phi $ satisfying the biorthogonality

$$\begin{aligned} \langle \widetilde{\Phi }(\cdot ), \Phi (\cdot - k)\rangle _{L^2(\mathbb {R}^d)}=\delta _{ 0, k}. \end{aligned}$$

The existence of such functions has been shown in [38]. The biorthogonality yields $\mathcal {P}_N^n \Phi _{n, k} =\Phi _{n, k}$, for all $ k\in \mathcal {I}_n$. Hence, the required projection property holds: $\mathcal {P}_N^nu=u$ for all $u\in X_N^n(D)$. We proceed with

$$\begin{aligned} \Vert \mathcal {P}_N^nu\Vert _{L^\infty (D)}&=\sup _{ x\in D}\Big \vert \sum _{ k\in \mathcal {I}_n} \langle u,\widetilde{\Phi }_{n, k}\rangle _{L^2(D)} \Phi _{n, k}( x)\Big \vert \\&=\Big \vert \sum _{ k\in \mathcal {I}_n} \langle u,\widetilde{\Phi }_{n, k}\rangle _{L^2(D)} \Phi _{n, k}( x^*)\Big \vert \\&\le \sum _{ k\in \mathcal {I}_n} \vert \langle u,\widetilde{\Phi }_{n, k}\rangle _{L^2(D)} \vert \,\Phi _{n, k}( x^*)\\&\le \Vert u\Vert _{L^\infty (D)} \sum _{ k\in \mathcal {I}_n} \Vert \widetilde{\Phi }_{n, k}\Vert _{L^1(D)}\, \Phi _{n, k}( x^*). \end{aligned}$$

Since

$$\begin{aligned}{} & {} \Vert \widetilde{\Phi }_{n, k}\Vert _{L^1(D)} \le 2^{-nd/2} \int _{\mathbb {R}^d}\vert \widetilde{\Phi }( x)\vert \,\textrm{d}x\\{} & {} \quad \text {and}\quad \sum _{ k\in \mathcal {I}_n} \Phi _{n, k}( x^*) \le \sum _{ k\in \mathbb {Z}^d} \Phi _{n, k}( x^*) {\mathop {=}\limits ^{{\hbox {(A2)}}}}2^{nd/2} \end{aligned}$$

we have established (S2) with $C_{\mathcal {P}}\le \Vert \widetilde{\Phi }\Vert _{L^1(\mathbb {R}^d)}$. Observe that $\Vert \widetilde{\Phi }\Vert _{L^1(\mathbb {R}^d)}\ge 1$ as $\widetilde{\Phi }$ has mean value 1 just as $\Phi $.

It remains to validate (S3). Let $u\in X_N^n(D)$ with $\Vert u\Vert _{L^\infty (D)}=1$. Then,

$$\begin{aligned} \Vert u\Vert _{L^q(D)} \ge \delta _q > 0 \end{aligned}$$

for

$$\begin{aligned} \delta _q = \min _{u\in M} \Vert u\Vert _{L^q(D)} \quad \text {where}\quad M= \big \{ u\in X_N^n(D): \Vert u\Vert _{L^\infty (D)}= 1\big \}. \end{aligned}$$

This minimum is non-zero and exists as M is compact in the finite dimensional space $X_N^n(D)$. Since $\delta _q=\Vert u_q\Vert _{L^q(D)}$ for one $u_q\in M$ and as $\delta _q \rightarrow 1$ for $q\rightarrow \infty $ (see below), we find a q with $\delta _q\ge 1/C_\infty $ for any fixed $C_\infty >1$. Hence, $1\le C_\infty \,\Vert u\Vert _{L^q(D)}$ for all $u\in M$ and (S3) follows by the homogeneity of norms.

We finish with proving $\lim _{q\rightarrow \infty } \delta _q=1$. Obviously, $\delta _q=\Vert u_q\Vert _{L^q(D)}\le \text {vol}_d(D)^{1/q}\rightarrow 1$ as $q\rightarrow \infty $. Therefore $\{\delta _q\}$ is bounded and admits a convergent subsequence, say, $\lim _{i\rightarrow \infty }\delta _{q_i}=\delta ^*\le 1$. For each q let $ x_q^*\in D$ with $\vert u_q( x_q^*)\vert =1$. If $N=1$, $u_q$ must be equal to unity in a whole cube of length $2^{-n}$ as a subset of D containing $x_q^*$. So we can estimate

$$\begin{aligned} \Vert u_q\Vert _{L^q(D)}\ge \big (2^{-nd}\big )^{1/q}\rightarrow 1 \end{aligned}$$

(A6)

as $q\rightarrow \infty $ which proves the assertion in this case. If $N>1$, we can still find for any $\varepsilon >0$ sufficiently small a $\delta >0$ such that $\vert u_q\vert _{B_\delta ( x_q^*)\cap D}\vert \ge 1-\varepsilon $ for all q. This follows by uniform equicontinuity ensured by the Arzelà-Ascoli theorem since M is compact in $\mathcal {C}(\overline{D})$ as a bounded, closed, and finite dimensional set. Further, we have that

$$\begin{aligned} \text {vol}_d(B_\delta ( x_q^*)\cap D)>c>0 \end{aligned}$$

for all q. This follows by the more general observation that the union over $m\in \mathbb {N}$ of

$$\begin{aligned} V_m:=\Big \{x\in \mathbb {R}^d:\ {\text {vol}}_d(B_\delta (x)\cap D)>\frac{1}{m}\Big \} \end{aligned}$$

is an open cover for $\overline{D}$, so we can find a minimal m such that $\overline{D}\subset V_m$ by compactness of $\overline{D}$. Altogether, we can again deduce a lower bound of the form

$$\begin{aligned} \Vert u_q\Vert _{L^q(D)}\ge (1-\varepsilon )\text {vol}_d(B_\delta ( x_q^*)\cap D)^{1/q}\ge (1-\varepsilon )c^{1/q}\rightarrow 1-\varepsilon \end{aligned}$$

as $q\rightarrow \infty $. We conclude $\delta ^*=1$ since $\varepsilon >0$ can be chosen as small as we wish. Finally, any subsequence of $\{\delta _q\}$ contains a subsequence which converges to 1. So, the whole sequence must converge to 1, see, e.g., [20, Prop. 10.13(2)].

1.2 A.2 A different approach

The functions of $X_N^n(D)$ from the above construction vanish on the set $\Delta =D\backslash \cup _{ k\in \mathcal {I}_n} \text {supp} \Phi _{n, k}$ which is non-empty in general. Thus, $\mathcal {P}_N^nu$ is a poor approximation of u near to the boundary of D in general: $\mathcal {P}_N^nu\vert _\Delta =0$. Here we sketch an alternative approach to overcome this drawback. We emphasize that this approach has not been relied on in the previous sections as it requires the verification of an additional assumption on $F'$, see text below following (A7).

Basically, we extend the preimage-space of the map F of Sect. 2 while keeping all its necessary properties to carry over Theorems 2.1 and 2.3 to the extension. Let $\widetilde{D}$ be an open superset of $\overline{D}$: $\overline{D}\subset \widetilde{D}$. We will need two operators: the extension operator $E:L^\infty (D)\rightarrow L^\infty (\widetilde{D})$, which extends a function by zero, and the restriction operator $R:L^\infty (\widetilde{D}) \rightarrow L^\infty (D)$, which multiplies a function by the indicator $\mathbb {1}_D$.

We define $\widetilde{F}:\mathcal {D}(\widetilde{F})\subset L^\infty (\widetilde{D})\rightarrow Y$ by $\mathcal {D}(\widetilde{F})=E \mathcal {D}({F})$ and $\widetilde{F}(\widetilde{u})=F(R\widetilde{u})$. This $\widetilde{F}$ is Fréchet-differentiable just like F. Moreover, the TCC holds in $B_r(Eu^+)\subset \mathcal {D}(\widetilde{F})$. Indeed, let $\widetilde{u}\in B_r(Eu^+)$ then $\Vert R\widetilde{u}-u^+\Vert _{L^\infty (D)}=\Vert R\widetilde{u}-REu^+\Vert _{L^\infty (D)}\le \Vert \widetilde{u}-Eu^+\Vert _{L^\infty (\widetilde{D})}\le r$, that is, $R\widetilde{u}\in B_r(u^+)\subset \mathcal {D}({F})$. Thus, for $u,\widetilde{u}\in B_r(Eu^+)$, we get

$$\begin{aligned}{} & {} \Vert \widetilde{F}(u)- \widetilde{F}(\widetilde{u}) - \widetilde{F}'(\widetilde{u})( u- \widetilde{u})\Vert _Y = \Vert {F}(Ru)- {F}(R\widetilde{u}) - {F}'(R\widetilde{u})(R u- R\widetilde{u})\Vert _Y\\{} & {} \quad \le \omega \, \Vert {F}(Ru)- {F}(R\widetilde{u})\Vert _Y= \omega \, \Vert \widetilde{F}(u)- \widetilde{F}(\widetilde{u})\Vert _Y. \end{aligned}$$

Further, (6) is also satisfied by $\widetilde{F}$ as $R^*=E$.

For this $\widetilde{F}$ we can define spaces $X^n=X_N^n(\widetilde{D})$ as in Appendix A.1 but with respect to $\widetilde{D}$ rather than D. Now, the union of the supports of $\Phi _{n, k}$, for $ k\in \mathcal {I}_n(\widetilde{D})$ covers D when n is large enough. Unfortunately, condition (5) does not transfer immediately to the new construction. One must prove it for the concrete case, for example, as follows: for $\widetilde{u}\in \mathcal {D}(\widetilde{F})$ and $\widehat{u}\in L^\infty (\widetilde{D})$ we have that

$$\begin{aligned}{} & {} \Vert {\widetilde{F}}'(\widetilde{u})\big (\widehat{u}-\widetilde{\mathcal {P}}_N^n\widehat{u})\Vert _{Y} \le \Vert F'(R\widetilde{u})(R\widehat{u}-\mathcal {P}_N^nR\widehat{u})\Vert _{Y} \nonumber \\{} & {} \quad + \Vert F'(R\widetilde{u})(\mathcal {P}_N^nR\widehat{u}-R\widetilde{\mathcal {P}}_N^n\widehat{u})\Vert _{Y} \end{aligned}$$

(A7)

where $\mathcal {P}_N^n:L^\infty (D)\rightarrow X_N^n(D)$ and $\widetilde{\mathcal {P}}_N^n :L^\infty (\widetilde{D})\rightarrow X_N^n(\widetilde{D})$ are the corresponding projection operators in accordance with (S2). The left norm on the right hand side of (A7) tends to 0 for $n\rightarrow \infty $ by (5). The right norm converges to 0, for instance, if $F'(u):L^\infty (D) \rightarrow Y$ is weak-$\star $ continuous for all $u\in \mathcal {D}(F)$: both sequences $\{\mathcal {P}_N^nR\widehat{u}\}_n$ and $\{R\widetilde{\mathcal {P}}_N^n \widehat{u}\}_n$ converge to $R\widehat{u}$ pointwise a.e. This convergence can be verified by standard arguments, see e.g., [37, Chap. 12.3] and [39, Chap. 2]. Further, both sequences are uniformly bounded due to (S2). Hence, $\mathcal {P}_N^nR\widehat{u}-R\widetilde{\mathcal {P}}_N^n \widehat{u}\rightarrow 0 \text{ weakly- }\star $.

Appendix B An approximation result

Proposition B.1

For $u\in L^{\infty }(D)$ and $\{\mathcal {P}^n\}_n$ as in (A5) we have that $\mathcal {P}^nu\rightarrow u$ in $L^q(D)$ for all $q<\infty $.

Proof

Let $\Box $ be a rectangular superset of D and $\widetilde{\Box }$ a superset of $\Box $. Further, extend $u$ by zero outside of D. The convergence results of Section 12.3 from [37] yield that $\widetilde{\mathcal {P}}^nu\rightarrow u$ in $L^q(\Box )$ for any $q<\infty $ where $\widetilde{\mathcal {P}}^n$ is defined as in (A5), however, with respect to $\mathcal {I}_n(\widetilde{\Box })$ and $X^n(\widetilde{\Box })$. Hence, for any $x\in D$ we have that $\widetilde{\mathcal {P}}^nu(x)=\mathcal {P}^nu(x)$ for n large enough such that $x\in D_n$, where we set

$$\begin{aligned} D_n:=\bigg \{x\in D:\ \sum _{\textbf{k}\in \mathcal {I}_n}\Phi _{n, \textbf{k}}(x)=1\bigg \}\, \end{aligned}$$

in particular $\text {vol}_d\left( D\backslash D_n\right) \rightarrow 0$. Because of $\Vert \mathcal {P}^nu\Vert _{L^\infty (D)}\le C_{\mathcal {P}}\Vert u\Vert _{L^\infty (D)}$ by (S2), we can estimate

$$\begin{aligned} \Vert u-\mathcal {P}^nu\Vert _{L^q(D)}=\ {}&\Vert u-\mathcal {P}^nu\Vert _{L^q(D\backslash D_n)} + \Vert u-\mathcal {P}^nu\Vert _{L^q(D_n)}\\ \le \ {}&\text {vol}_d\left( D\backslash D_n\right) ^{1/q}(C_{\mathcal {P}}+1)\Vert u\Vert _{L^\infty (D)}+ \Vert u-\widetilde{\mathcal {P}}^nu\Vert _{L^q(D_n)}\\ \le \ {}&\text {vol}_d\left( D\backslash D_n\right) ^{1/q}(C_{\mathcal {P}}+1)\Vert u\Vert _{L^\infty (D)}+ \Vert u-\widetilde{\mathcal {P}}^nu\Vert _{L^q(\Box )} \end{aligned}$$

and the assertion follows. $\square $

Appendix C On the tangential cone condition for the operator in (46)

Here we argue that the semi-discrete non-linear operator $F_{n_{\max }}$ defined in (46) satisfies the TCC (3).

The underlying abstract system is (25) with the concrete settings (32), $u_0=0$, and (33) where $D=(0,1)$, $T=1$, that is, $d=1$ and $u=(p,v)\in L^2([0,1],H)$, $H= L^2(0,1)^2$. Thus, we obtain the acoustic system (42) which has a unique classical solution under $(\rho ,\nu )\in \mathcal {D}(F_{n_{\max }})$ and for the sources (43). In view of Lemma 3.5, $F_{n_{\max }}$ is Fréchet-differentiable and we have $F_{n_{\max }}'(\rho ,\nu )(\widehat{\rho },\widehat{\nu })=(\overline{p},\overline{v})$ where $(\overline{p},\overline{v})$ weakly solves

$$\begin{aligned} \begin{aligned} \frac{1}{\rho \nu ^2}\, \partial _t\overline{p} - \partial _x \overline{v}&= - \widehat{\rho } \partial _t p{} & {} \text {in }(0,1)\times (0,1),\\ \rho \, \partial _t\overline{v} - \partial _x \overline{p}&=- \widehat{\nu } \partial _t v{} & {} \text {in }(0,1)\times (0,1),\\ \overline{v}(0,\cdot )=\overline{p}(0,\cdot )&= 0{} & {} \text {on } (0,1)\\ \overline{v}(\cdot ,0)=\overline{p}(\cdot ,1)&=0{} & {} \text {on } (0,1). \end{aligned} \end{aligned}$$

(C8)

In a first step we validate injectivity of $F_{n_{\max }}'(\rho ,\nu )$ for any $(\rho ,\nu )\in \mathcal {D}(F_{n_{\max }})$. To this end, assume $F_{n_{\max }}'(\rho ,\nu )(\widehat{\rho },\widehat{\nu })=(0,0)$. From (C8) we get

$$\begin{aligned} 0= \widehat{\rho }\, \partial _t p \quad \text {and}\quad 0= \widehat{\nu }\, \partial _t v\quad \text {in }(0,1)\times (0,1). \end{aligned}$$

Assume $0\ne \widehat{\rho }\in X^{n_{\max }}$. Then, there is a non-empty interval [a, b], $a=2^{-n_{\max }}k$, $b=2^{-n_{\max }}(k+1)$, $k\in \mathbb {N}_0$, where $\widehat{\rho }$ does not vanish. Hence, $\partial _t p=0$ in $[0,1]\times [a,b]$. By the first equation in (42), $-\partial _x v = f_1$ in $[0,1]\times [a,b]$, that is,

$$\begin{aligned} v(t,x)= v(t,a)-\int _a^x f_1(t,y)\textrm{d}y, \quad (t,x)\in (0,1)\times [a,b]. \end{aligned}$$

Recalling the zero initial value $v(0,\cdot )=0$ we get the contradiction $0=\int _a^x f_1(0,y)\textrm{d}y<0$ for $x\in [a,b]$ according to (43). So, $\widehat{\rho }= 0$ in (0, 1). One argues analogously to validate $\widehat{\nu }= 0$ in (0, 1). Hence, $F_{n_{\max }}'(\rho ,\nu )$ is one-to-one which implies the TCC at any interior point of $\mathcal {D}(F_{n_{\max }})$ due to Lemma C.1 of [6].

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Pieronek, L., Rieder, A. On the iterative regularization of non-linear illposed problems in $L^{\infty }$. Numer. Math. 154, 209–247 (2023). https://doi.org/10.1007/s00211-023-01359-7

Download citation

Received: 19 May 2022
Revised: 02 May 2023
Accepted: 02 June 2023
Published: 22 June 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s00211-023-01359-7

Mathematics Subject Classification

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

On the iterative regularization of non-linear illposed problems in \(L^{\infty }\)

Abstract

Similar content being viewed by others

Regularization Methods for Ill-Posed Problems

Regularization Methods for Ill-Posed Problems

An iterative technique for solving singularly perturbed parabolic PDE

1 Introduction

2 REGINN \(^{\infty }\)

Theorem 2.1

Proof

Remark 2.2

Theorem 2.3

Proof

Remark 2.4

Remark 2.5

Remark 2.6

Theorem 2.7

Proof

3 Applications: parameter identification tasks in PDEs

3.1 Electrical impedance tomography under the complete electrode model

Lemma 3.1

Proof

Lemma 3.2

Proof

3.2 Full waveform inversion

Remark 3.3

Lemma 3.4

Proof

Lemma 3.5

Lemma 3.6

Proof

Remark 3.7

3.3 A stronger result for FWI in the acoustic regime

Theorem 3.8

Lemma 3.9

Proof

Lemma 3.10

Proof

Proof of Theorem 3.8

4 Numerical Results

5 Conclusion

Notes

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Corresponding authors

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Appendices

Appendix A A family of admissible subspaces

1.1 A.1 The discrete subspaces used in this work

1.2 A.2 A different approach

Appendix B An approximation result

Proposition B.1

Proof

Appendix C On the tangential cone condition for the operator in (46)

Rights and permissions

About this article

Cite this article

Share this article

Mathematics Subject Classification

Search

Navigation