Abstract
During the last 15 years there have been intensive research efforts in constructing cryptographic algorithms resilient to the side-channel leakage. The most fundamental part of every such construction are the leakage-resilient encoding schemes. Usually the cryptographic secrets encoded by them are assumed to belong to some finite group \((\mathbb {G},+)\). The most common encoding scheme is the n-out-of-n additive secret sharing: a secret X is encoded as \((X_1,\ldots ,X_n)\) such that \(X_1 + \cdots X_n = X\). Intuitively, if an adversary receives only small partial independent information about each \(X_i\) then his information about X should be even smaller, and should decrease (i.e. the noise should amplify) when n grows. However, of course, the concrete parameters (the amount of leakage that can be tolerated, and the number of shares needed to achieve a given level of security) depend on the exact model that is used.
One of the most prominent models used in this area is the so-called noisy-leakage model (Chari et al., CRYPTO’99, and Prouff and Rivain, EUROCRYPT’13), which is believed to correspond well to the real-life engineering experience, where the information that the adversary receives is always “noisy”. In the Prouff and Rivain model the amount of information that the noise provides is measured using a parameter \(\delta \) that is equal to 0 when the noise is “full”, and equal to 1 when there is no noise. It is natural to ask how small \(\delta \) needs to be to achieve the amplification of noise (in the additive encoding scheme described above). Until now it was known that such amplification can be achieved for \(\delta < 1/16\). In this paper we show that:
-
in the prime order groups \(\mathbb {G}\) it suffices that \(\delta < 1 - 1/|\mathbb {G}|\),
-
in general it suffices that \(\delta < 1/2\).
We also prove that these bounds are optimal. We then analyze the number n of shares needed to achieve security \(\epsilon \) of the encoded value X (where \(\epsilon \) is also defined in terms of “noisy information” that the adversary obtains about X). We give close lower and upper bounds on this value (that differ only in factor polylogarithmic in \(|\mathbb {G}|\)). We achieve our results using techniques from the additive combinatorics, the harmonic analysis, and the convex optimization.
The first and the last author were partly supported by the WELCOME/2010-4/2 grant funded within the framework of the EU Innovative Economy Operational Programme.
1 Introduction
Leakage-resilient cryptography [1–3, 5, 6, 9, 11, 12, 14, 15, 19, 20, 24] aims at constructing cryptographic schemes that are secure against side-channel leakages of secret information. Leakage-resilient encoding schemes are important building blocks for constructing such algorithms. They allow one to encode a secret message X with a randomized encoding function \(\mathsf {Enc}(X) =(X_1, \ldots , X_n)\) such that leakage from the codeword does not help the adversary to recover the secret message X. The simplest encoding function \(\mathsf {Enc}(.)\) uses an additive secret sharing scheme, where the shares \(X_i\) are chosen uniformly at random from a group \(\mathbb {G}\) subject to \(X = X_1 + \ldots + X_n\). Of course, the leakage from \(\mathsf {Enc}(X)\) cannot be arbitrary, as otherwise no security is possible. A common assumption, studied both in theoretical and practical works [4, 8, 12, 22, 24], is that the leakage from the encoding is a “noisy” function. It has been shown in the work of Chari et al. [4] that the encoding scheme described above amplifies security when the leakage is “sufficiently” noisy. While several recent works improve the quantitative bounds on the amount of noise needed [10, 22], the current best bounds require the leakage to be very noisy – in particular, far from the level of noise that is typically available in physical measurements [8]. The main contribution of our work is to give an optimal characterization of the noisy leakage model. We develop optimal bounds for the amount of noise that is needed to amplify security, and show that these bounds are tight: above the threshold, amplification is impossible. Our results are particularly important for the security analysis of masking schemes, which we further describe below.
Leakage resilient encodings for masking schemes. Leakage resilient encodings are prominently used to build masking schemes [4, 15]. Masking schemes are widely used in practice to protect cryptographic implementations against side-channel leakage – in particular against leakage from the power consumption [16]. The basic idea of a masking scheme is simple: instead of computing directly on the sensitive information (e.g., the secret key), a cryptographic algorithm protected with the masking countermeasure computes on encoded values, thereby concealing sensitive information, which makes it harder to extract relevant information from the leakage. The simplest and most widely used masking scheme is the Boolean masking scheme. The Boolean masking scheme, introduced in the important work of Ishai et al. [15], uses the simple encoding function from above with \(\mathbb {G}=\mathsf {GF}(2)\), but can easily be extended to work in larger groups. For instance, to protect an implementation of the AES algorithm we typically use \(\mathbb {G}= \mathsf {GF}(2^8)\), as the AES algorithm can be implemented in a particularly efficient way using operations in this Galois field. Another example is a protected implementation of discrete-log based crypto schemes that work in prime-order fields.
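To make the encoding concrete, here is a minimal Python sketch of n-out-of-n additive sharing in the two settings mentioned above: \(\mathsf {GF}(2^8)\), where the group addition is XOR, and \(\mathbb {Z}_q\) for an arbitrary modulus q. The helper names are ours, for illustration only.

```python
import secrets

Q = 256  # GF(2^8): the group addition is XOR (as in masked AES implementations)

def enc_xor(x, n):
    """n-out-of-n Boolean masking: x = x1 XOR ... XOR xn."""
    shares = [secrets.randbelow(Q) for _ in range(n - 1)]
    last = x
    for s in shares:
        last ^= s
    return shares + [last]

def dec_xor(shares):
    out = 0
    for s in shares:
        out ^= s
    return out

def enc_add(x, n, q):
    """Additive sharing over Z_q: x = x1 + ... + xn (mod q)."""
    shares = [secrets.randbelow(q) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % q)
    return shares

x = 0xA7
assert dec_xor(enc_xor(x, 5)) == x
assert sum(enc_add(x, 5, 251)) % 251 == x
```

Note that any n-1 shares are individually uniform, so only the full codeword determines the secret.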
The noisy leakage model. While there are several variants of the “noisy leakage model” [12, 20, 22] most works that consider the security of masking schemes use the model of Chari et al. [4] and its generalization by Prouff and Rivain [22]. Informally, a noisy leakage function \(f: \mathbb {G}\rightarrow {\mathcal Y}\) is called \(\delta \)-noisy if the statistical distance between the uniform distribution X over \(\mathbb {G}\) and the conditional distribution of X given f(X) is bounded by a parameter \(\delta \in [0, 1-1/|\mathbb {G}|]\). Here, \({\mathcal Y}\) is the domain of the leakage, which in general can be an infinite set. To better understand the Prouff-Rivain noise model, let us consider the two extreme cases. First, when \(\delta \) is close to 0 then f is assumed to be very noisy, and hence the noisy leakage reveals little about the underlying sensitive information. On the other extreme, when \(\delta \) is close to \(1-1/|\mathbb {G}|\) then there is almost no noise in the leakage (i.e., the function f is “almost” deterministic). The Prouff-Rivain noise model is believed to model well real-world physical side-channel leakages. Moreover, as shown in [7, 10] it is also a robust noise measure as it is equivalent to various other ways of describing the noise present in a leakage function.
Masking schemes in the noisy leakage model. A masking function is said to be secure in the noisy leakage model if for any \(X,Y \in \mathbb {G}\), the noisy leakage from an encoding \((X_1, \ldots , X_n) \leftarrow \mathsf {Enc}(X)\) is statistically close to the noisy leakage from \((Y_1, \ldots , Y_n) \leftarrow \mathsf {Enc}(Y)\). As discussed above, recent works have significantly improved the Prouff-Rivain \(\delta \)-bias for which security of the encoding function can be shown. While initial works [7, 22] require that \(\delta = O(1/|\mathbb {G}|)\), the recent work of Dziembowski et al. [10] shows that noise amplifies security already when \(\delta < 1/16\) (and hence, in particular, independently of the size of the underlying group \(\mathbb {G}\)). In this work, we show that for prime-order groups masking amplifies the noise for \(\delta \le 1- 1/p\). Notice that in the case where p is super-polynomial in the security parameter (as in discrete-log based cryptosystems), we achieve security for \(\delta \) as large as \(1- \mathrm {negl} (n)\). For general groups (in particular, groups with small factorization) we show that amplification is possible when \(\delta < 1/2\). We also show that both our bounds are optimal, as for values above the threshold amplification is not possible. We provide further details on our contributions and techniques in the next two sections.
1.1 Our Contributions
We analyze the amplification of noisy leakage for the simple additive encoding function \(\mathsf {Enc}\). The quality of the amplification is measured by the \(\epsilon \)-security of the encoding. We say that an encoding is \(\epsilon \)-secure if the statistical distance between the \(\delta \)-noisy leakage of \(\mathsf {Enc}(x)\) and \(\mathsf {Enc}(x')\) for two elements \(x,x' \in \mathbb {G}\) is upper bounded by \(\epsilon \). We characterize how many shares n we need in order to amplify the noise available in the leakage from the shares. To this end we derive a value \(\delta _{\max }\), which is the maximal value of \(\delta \) up to which we can still amplify the noise for sufficiently large n. Of course, as we show, the number of shares n needs to increase the closer we set \(\delta \) to \(\delta _{\max }\). One interesting observation arising from our analysis is that the value of \(\delta _{\max }\) depends on the structure of the underlying group \(\mathbb {G}\). We summarize our results regarding the upper bound until which amplification is possible in the following informal theorem.
Theorem
(Noise amplification, informal version of Theorem 1 ). Let \(\mathbb {G}\) be a group (either prime order, or arbitrary) and let the adversary obtain \(\delta \)-noisy leakage from the shares of the encoding. Define the maximal noise parameter \(\delta _{\max }\) as
$$\begin{aligned} \delta _{\max } = {\left\{ \begin{array}{ll} 1-\frac{1}{|\mathbb {G}|} &{} \text {if } \mathbb {G}\text { has prime order,}\\ \frac{1}{2} &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
Then for any \(\delta < \delta _{\max }\) and any \(\epsilon >0\) we have that the encoding \(\mathsf {Enc}\) is \(\epsilon \)-secure for
$$\begin{aligned} n \geqslant \mathrm {poly}\left( \log (\epsilon ^{-1}),\ (\delta _{\max }-\delta )^{-1}\right) \end{aligned}$$
(where \(\mathrm {poly}\) is some polynomial).
Let us explain the interplay of the parameters used in the above theorem. The number of shares grows polynomially with two parameters: (a) the logarithm of \(\epsilon ^{-1}\), i.e., the target security we aim for, and (b) the inverse of the gap \(\theta = \delta _{\max } - \delta \) between the maximal possible noise value and the actual chosen noise level \(\delta \). The dependency on (a) is as expected: if we aim for better security, meaning a smaller value of \(\epsilon \), we require a larger security parameter n. The reason for the dependency stated in (b) is more technical, and essentially comes from a convergence bound obtained via harmonic analysis (when \(\mathbb {G}\) has prime order) or via the XOR Lemma (when \(\mathbb {G}\) has non-prime order); see Sect. 3 for details.
Dependence on the group order. As already noted, our Theorem 1 distinguishes between two cases. In the first case the group \(\mathbb {G}\) is of prime order. Interestingly, in this case it turns out that arbitrarily small noise (i.e. \(\delta \) close to 1) can be amplified. Informally, this is thanks to the fact that prime-order groups have no non-trivial sub-groups. On the other hand, if a group has a non-prime order, i.e., it contains non-trivial subgroups, then we require a higher noise (more precisely: \(\delta < 1/2\)). On a practical level this means that in some sense the prime order groups “provide a better leakage resilience” than the general groups. This may be useful for discrete-log based cryptosystems that typically work in such groups.
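The effect of a non-trivial subgroup can be observed numerically. In the toy experiment below (our own sketch), each increment is uniform on \(\{0, 2\}\); in \(\mathbb {Z}_4\) this set is the proper subgroup \(\{0,2\}\), so the sum never leaves it and its distance from uniform stays at 1/2, while in \(\mathbb {Z}_5\) the same increments generate the whole group and the sum converges to uniform.

```python
from fractions import Fraction

def convolve(p, q, m):
    """Distribution of Z1 + Z2 (mod m); p, q are lists of probabilities."""
    out = [Fraction(0)] * m
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            out[(a + b) % m] += pa * qb
    return out

def sd_from_uniform(p, m):
    return sum(abs(pi - Fraction(1, m)) for pi in p) / 2

for m in (4, 5):
    # each increment Z_i is uniform on {0, 2}
    step = [Fraction(1, 2) if a in (0, 2) else Fraction(0) for a in range(m)]
    walk = step
    for _ in range(39):          # distribution of Z_1 + ... + Z_40 (mod m)
        walk = convolve(walk, step, m)
    print(m, float(sd_from_uniform(walk, m)))
    # m = 4: the distance stays exactly 1/2 (the walk is trapped in {0, 2})
    # m = 5: the distance is already far below 10^-3
```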
Lower bounds on the necessary noisy level via homomorphic attacks. We show that the maximal noise parameter \(\delta _{\max }\), as defined in Eq. (1), is optimal in the following sense. We show in Proposition 1 that \(\delta \) has to be less than \(1-\frac{|\mathbb {H}|}{|\mathbb {G}|}\), where \(\mathbb {H}\) is the largest proper subgroup of \(\mathbb {G}\). An intuitive explanation for why the group structure is relevant here, is the existence of “homomorphic attacks”, when given \(X_1,\ldots ,X_n\) being shares of X and their evaluations \(\phi (X_1),\ldots ,\phi (X_n)\) under a homomorphism \(\phi : \mathbb {G}\rightarrow \mathbb {H}\), we can compute \(\phi (X) = \phi (X_1)+\ldots +\phi (X_n)\).
The implications of Proposition 1 are two-fold. Firstly, as long as general groups are considered (and hence no assumptions on the group order can be made), we prove that one needs to assume that \(\delta \) is less than 1/2. This is because the largest proper subgroup \(\mathbb {H}\) of \(\mathbb {G}\) can be of size \(|\mathbb {G}|/2\), so \(1 - |\mathbb {H}|/|\mathbb {G}|\) can be as small as 1/2. Secondly, if \(\mathbb {G}\) has prime order then \(|\mathbb {H}| = 1\) and therefore \( 1 - |\mathbb {H}|/|\mathbb {G}| = 1 - 1/p\). Hence in this case the lower bound matches the upper bound from Theorem 1.
It is natural to ask if our results can be more fine-grained, and fully characterize the noise requirements in terms of \(|\mathbb {H}|/|\mathbb {G}|\). We conjecture that in fact the upper bound \(1-|\mathbb {H}|/|\mathbb {G}|\) can always be achieved (not only in the cases when \(|\mathbb {H}|=|\mathbb {G}|/2\) and \(|\mathbb {H}|=1\)). We leave proving it as an open problem. We summarize the upper bounds and the relation to the number of shares in Table 1 below.
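A minimal sketch of the homomorphic attack behind Proposition 1, using the homomorphism \(\phi (x) = x \bmod 2\) on \(\mathbb {Z}_8\) (helper names are ours): each share leaks only its parity, which is a \(\delta \)-noisy observation with \(\delta = 1/2 = 1 - |\mathbb {H}|/|\mathbb {G}|\) for \(\mathbb {H} = \{0,2,4,6\}\), yet the parity of the secret is recovered exactly, for any number of shares.

```python
import secrets

def enc_add(x, n, q):
    """Additive n-out-of-n sharing over Z_q."""
    shares = [secrets.randbelow(q) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % q)
    return shares

q, n = 8, 64
for x in range(q):
    shares = enc_add(x, n, q)
    # the adversary sees only phi(X_i) = X_i mod 2 for each share ...
    leak = [s % 2 for s in shares]
    # ... and computes phi(X) = phi(X_1) + ... + phi(X_n) in Z_2
    assert sum(leak) % 2 == x % 2
print("parity of the secret recovered from", n, "shares")
```

Increasing n does not help: the homomorphic image of the secret is a deterministic function of the leaked images, so no amplification takes place for this leakage.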
Applications of our techniques outside of masking schemes. We show that our techniques also have applications outside of the domain of leakage resilient cryptography. In particular, using our results we can extend the following product theorem, due to Maurer et al. [17].
Lemma 1
(Product Theorem [17]). Let \(\mathbb {G}\) be a group, \(d(\cdot )\) denote the distance from uniform and \(X_1,X_2\) be arbitrary independent random variables on \(\mathbb {G}\). Then we have
$$\begin{aligned} d\left( X_1+X_2\right) \leqslant 2\, d\left( X_1\right) \cdot d\left( X_2\right) . \end{aligned}$$
We give a different proof and calculate optimal constants for any group.
Theorem
(Product Theorem with optimal constants, informal version of Theorem 2 ). Let \(\mathbb {G}\) be a group, \(d(\cdot )\) denote the distance from uniform and \(X_1,X_2\) be arbitrary independent random variables on \(\mathbb {G}\). Then we have
$$\begin{aligned} d\left( X_1+X_2\right) \leqslant c(\mathbb {G})\cdot d\left( X_1\right) \cdot d\left( X_2\right) , \end{aligned}$$
where \(c(\mathbb {G}) \le 2\) is a constant depending on the structure of the underlying group \(\mathbb {G}\).
For the exact value of \(c(\mathbb {G})\) we refer the reader to Appendix A.6.
Comparing our results to previous works. Our work improves the previous state-of-the-art in the following aspects:
-
(a)
For general groups the best previous bound was given in [10], where it was shown that \(\delta < 1/16\) suffices. We improve this bound to \(\delta < 1/2\), which is optimal for groups of even order. Moreover, for groups of prime order p we give a novel bound of \(1-\frac{1}{p}\). Notice that for primes that are super-polynomial in the security parameter we achieve security for \(\delta \) arbitrarily close to 1. We note that, in contrast to earlier works, our proof techniques also have the advantage of being more modular.
-
(b)
We provide matching lower bounds showing that the noise threshold as well as the growth rate of the number of shares are optimal, which was not known before.
-
(c)
Our proof techniques may be of independent interest and have not been used previously for analyzing the security of noisy leakages. In particular, our analysis uses techniques from convex optimization, additive combinatorics and harmonic analysis.
In Table 2 below we compare our results with related works.
Comparison with the binomial noise model and the XOR lemma. We stress that the noise model of Prouff and Rivain [22] that we consider is significantly more general than the binomial noise model considered by Faust et al. in [12] (even in the binary case), as it covers many other types of noisy functions. In particular, our noisy functions f map elements of the group \(\mathbb {G}\) to a possibly infinite set \({\mathcal Y}\).
If we restrict the noise model in Theorem 1 to the special case of binomial noise (i.e., the leakage function f is the binomial noise function), then we obtain parameters comparable to those in [12] (cf. Lemma 4 in [12]).
1.2 A High-Level Proof Outline
We now give an overview of our proof techniques (the details appear in Sect. 3). We prove our result in five steps, illustrated in Fig. 1. We first show that it is enough to consider uniform secrets X. The proof appears in Sect. 3.1.1. The cryptographic interpretation of this claim is that it suffices to consider only random-plaintext attacks, instead of chosen-plaintext attacks.
Then, in Sect. 3.1.2, we consider the distance between a uniform secret X given \(\delta \)-noisy leakages \(f_1(X_1),\ldots ,f_n(X_n)\) from its encoding and a uniformly and independently chosen \(X'\). We show that bounding this distance is equivalent to bounding the distance of a random sum of the form \(Z=\sum _{i=1}^{n} Z_i\) with independent summands \(Z_i\), conditioned on the noisy information \(\{ f_i(Z_i) \}_{i=1}^n\), from the uniform distribution U. Here we use the fact that X is uniform (guaranteed by Step 1). The fact that the leakage functions are \(\delta \)-noisy guarantees that \(Z_i\) is \(\delta \)-close to uniform given \(f_i(Z_i)\), for \(i=1,\ldots ,n\). Intuitively it is clear that if independent random variables on a group are close to the uniform distribution, then their sum is even closer to uniform (cf. XOR-lemmas [13], see also Appendix A.8). The main issue here is that our summands are conditional distributions, which means that \(Z_i\) is close to U only on average over the concrete values of the leakage \(f_i(Z_i)=y_i\).
Next, in Sect. 3.1.3, we get rid of the conditional part \(\{f_1(Z_1),\ldots ,f_n(Z_n)\}\). This step is accomplished by considering concrete leakage values \(f_i(Z_i) = y_i\) for \(i=1,\ldots ,n\) and noticing that for most of them the distance from uniform is not much bigger than \(\delta \), which we conclude from the Chernoff Bound. This step results in an error term which is exponential in \(n\theta ^2\), where \(\theta =\delta _{\max }- \delta \) is the gap parameter defined above. Informally speaking, the gap \(\theta \) is what allows us to further reduce the problem to studying only the distance of sums of the form \(Z=Z_1+\ldots +Z_n\) from the uniform distribution.
Later, in Sect. 3.1.4, we characterize the distributions \(Z_i\) for which the distance between \(Z=Z_1+\ldots +Z_n\) and U is maximal. It turns out that they have a simple shape: they are a combination of a point mass and a distribution uniform over a set of size \((1-\delta )|\mathbb {G}|-1\). A description of the shape of these “worst-case” distributions enables us to come up with concrete estimates for the statistical distance in the next step.
Finally, in Sect. 3.1.5 we prove concrete facts about the convergence speed of cumulative sums of random variables that are sufficiently close to the uniform distribution. Depending on the technique used and the assumption on the group structure imposed, we obtain different bounds in Theorem 1.
Fig. 1. An overview of the proof of Theorem 1 and the applied techniques.
We note that for the case when we make no assumptions on the structure of \(\mathbb {G}\), the steps from Sects. 3.1.4 and 3.1.5 (but not those from Sects. 3.1.1–3.1.3) could be replaced by a product theorem due to Maurer et al. [18]. Our techniques allow us to extend this theorem, taking the group structure into account. In the last step we split the proof depending on the technique and the assumption about \(\mathbb {G}\). The quantitative comparison of the different bounds we get is given in Table 3. Note that the number n of shares is asymptotically larger when additive combinatorics is used (second row of Table 3) than when harmonic analysis is used (third row). Nevertheless we think it is instructive to present both results, since the proof techniques are different and both can be of independent interest.
1.3 Our Techniques
In this section we summarize our main techniques used in the proof of Theorem 1.
Convex analysis. We use convex analysis in Sect. 3.1.4 to deal with the problem of determining how fast the sum of independent components \(Z_1+\ldots + Z_n\) on a group \(\mathbb {G}\) converges towards the uniform distribution. Our assumption on the noise guarantees that each \(Z_i\) is at most \(\delta \)-far from the uniform distribution, for some parameter \(\delta \). Since we can think of distributions over \(\mathbb {G}\) as vectors in \(\mathbb {R}^{|\mathbb {G}|}\), we observe that the mapping
$$\begin{aligned} \left( \mathbf {P}_{Z_1},\ldots ,\mathbf {P}_{Z_n}\right) \mapsto \mathsf {SD}\left( Z_1+\ldots +Z_n;\ U\right) \end{aligned}$$
is a convex mapping, with respect to the distribution of any \(Z_i\) when the remaining components are fixed. Since the restrictions on the distance of \(Z_i\)’s from uniform are also convex, we conclude that the maximum is achieved for one of the extreme points from the set of feasible \(Z_i\). As a consequence we observe that they have a surprisingly simple shape (see Lemma 5). That simple structure will play an important role in the very last step of the proof. Also, it allows us to derive a general product theorem for groups, with an explicit expression with tight constants.
Additive combinatorics. In Sect. 3.1.5, when studying the convergence of random sums over a general group \(\mathbb {G}\), we may find a proper subgroup \(A\mathrel { \triangleleft } \mathbb {G}\), which is by definition an additive set, that is
$$\begin{aligned} A + A = A. \end{aligned}$$
Such a set may constitute a trap for our random walk \(Z_1+\ldots +Z_n\): if \(Z_i \in A\) for all i, then \(Z_1+\ldots +Z_n \in A\). When \(\mathbb {G}=\mathbb {Z}_p\) such a trap does not exist, so intuitively the sum takes all the elements with similar probability when n is large (because even one non-zero point generates the whole group when added multiple times). This is where we use some basic facts from additive combinatorics. The first result of this sort is the Cauchy-Davenport theorem, which states that the sumset \(A+B\), where A, B are arbitrary subsets of \(\mathbb {Z}_p\), must be substantially bigger than A and B alone. More precisely
$$\begin{aligned} |A+B| \geqslant \min \left( |A|+|B|-1,\ p\right) . \end{aligned}$$
This result does not help us much because it gives no estimate on repetitions in the sumset \(A+B\), that is how many of the expressions \(a+b\) where \(a\in A,b\in B\) hit the same place. To get more information about the distribution of repetitions in the sumset we use a more refined result due to Pollard. Combining it with the explicit form of \(Z_i\) (developed in the previous steps) we obtain a non-trivial upper bound on \(\mathsf {SD}(Z_1+Z_2;U)\) in terms of \(\mathsf {SD}(Z_1;U),\mathsf {SD}(Z_2;U)\), which is then extended to the sum of n elements.
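The Cauchy-Davenport bound \(|A+B| \geqslant \min (|A|+|B|-1, p)\) can be verified by exhaustive search for small parameters; the brute-force check below is our own illustration.

```python
from itertools import combinations

def sumset(A, B, p):
    """The sumset A + B inside Z_p."""
    return {(a + b) % p for a in A for b in B}

p = 11
for ka in (1, 2, 3):
    for kb in (1, 2, 3):
        for A in combinations(range(p), ka):
            for B in combinations(range(p), kb):
                assert len(sumset(A, B, p)) >= min(ka + kb - 1, p)
print("Cauchy-Davenport verified for all |A|, |B| <= 3 in Z_11")
```

The bound is tight for arithmetic progressions with the same common difference, e.g. A = B = {0, 1} gives |A+B| = 3 = |A| + |B| - 1.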
Harmonic analysis. Also, in Sect. 3.1.5, having reduced our problem to the convergence of a random walk with independent increments, we can use techniques from Fourier analysis. Recall that a character is a complex-valued function \(\phi \) which is additive on \(\mathbb {G}\), that is \(\phi (x+y) = \phi (x)\cdot \phi (y)\). The expectations of characters on independent sums are especially easy to evaluate, because
$$\begin{aligned} \mathbb {E} \left[ \phi \left( Z_1+\ldots +Z_n\right) \right] = \prod _{i=1}^{n} \mathbb {E} \left[ \phi \left( Z_i\right) \right] . \end{aligned}$$
For \(\mathbb {Z}_p\) these expectations are easy to calculate, because any character \(\phi \) is of the simple form \(\phi (x) = \exp (2\pi \mathrm {i} k x/p)\) for some k. Since we know the shape of the worst \(Z_i\), we can obtain a concrete estimate for any nontrivial character \(\phi \), namely
$$\begin{aligned} \left| \mathbb {E} \left[ \phi \left( Z_i\right) \right] \right| \leqslant c < 1 \end{aligned}$$
for some constant c.
Intuitively, this comes from the fact that \(Z_i\) “contains” a large uniform component which doesn’t allow the mass to concentrate at one point. Using a bound on geometric sums over roots of unity, we conclude that \(\left| \mathbb {E} \left[ \phi \left( Z_1+\ldots +Z_n\right) \right] \right| \leqslant c^n\). Finally we apply the XOR lemma, which states that characters are “representative” distinguishers: if two distributions have close expectations under every character, then they are statistically close. In our case we apply this claim to \(Z=Z_1+\ldots +Z_n\) and U, and the result follows since we have shown that \(\mathbb {E}[\phi (Z)]\) is small and trivially \(\mathbb {E}[\phi (U)] = 0\) for every non-trivial \(\phi \).
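The two facts used above, multiplicativity of characters over independent sums and the strict bound \(|\mathbb {E}[\phi (Z_i)]| < 1\) for a distribution with a large uniform component, can be checked numerically. The toy distribution and helper names below are ours.

```python
import cmath

p = 7

def char(k, x):
    """The character phi_k(x) = exp(2*pi*i*k*x/p) on Z_p."""
    return cmath.exp(2j * cmath.pi * k * x / p)

def expect_char(k, dist):
    """E[phi_k(Z)] for a distribution given as a dict value -> probability."""
    return sum(pr * char(k, x) for x, pr in dist.items())

# a "worst-case style" Z_i: extra mass at 0 plus a uniform component
dist = {0: 0.4, 1: 0.2, 2: 0.2, 3: 0.2}
single = expect_char(1, dist)
assert abs(single) < 1                      # strictly inside the unit disc

# multiplicativity: E[phi(Z1 + Z2)] = E[phi(Z1)] * E[phi(Z2)]
conv = {}
for x, px in dist.items():
    for y, py in dist.items():
        conv[(x + y) % p] = conv.get((x + y) % p, 0.0) + px * py
assert abs(expect_char(1, conv) - single ** 2) < 1e-12

n = 10
print(abs(single) ** n)   # |E[phi(Z_1+...+Z_n)]| <= c**n decays exponentially
```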
2 Preliminaries
If X and Y are random variables over the same set \({\mathcal X}\) then the statistical distance between X and Y is denoted by \(\mathsf {SD}(X; Y)\) and defined as \(\mathsf {SD}(X; Y) = \frac{1}{2} \sum _{x \in {\mathcal X}} | \Pr [X=x] - \Pr [Y=x] |\). If Z is a random variable then by \(\mathsf {SD}(X; Y|Z)\) we mean \(\mathsf {SD}((X,Z); (Y,Z))\), i.e., the statistical distance between the two joint distributions. If X and Y have the same distribution, we write \(X \,{\buildrel d \over =}\,Y\). Below we formally define the encoding and decoding of a secret \(X \in \mathbb {G}\).
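For concreteness, a direct Python implementation of this definition of statistical distance (our own helper, for illustration):

```python
def statistical_distance(p, q):
    """SD(X; Y) = 1/2 * sum_x |Pr[X=x] - Pr[Y=x]|, for distributions
    given as dicts mapping outcomes to probabilities."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# X uniform on {0,1,2,3}, Y uniform on {0,1}: SD(X; Y) = 1/2
assert statistical_distance({x: 0.25 for x in range(4)}, {0: 0.5, 1: 0.5}) == 0.5
assert statistical_distance({0: 1.0}, {1: 1.0}) == 1.0  # disjoint supports
```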
Definition 1
(Encoding and Decoding). Let \((\mathbb {G},+)\) be a fixed group and \(n>1\) be a fixed natural number. For any \(X \in \mathbb {G}\), we define the encoding function \(\mathsf {Enc}\) by
$$\begin{aligned} \mathsf {Enc}(X) = \left( X_1,\ldots ,X_{n-1},\ X - (X_1+\ldots +X_{n-1})\right) , \end{aligned}$$
where \(X_1,\ldots ,X_{n-1}\) are independent and uniform over \(\mathbb {G}\), and the decoding function \(\mathsf {Dec}\) by
$$\begin{aligned} \mathsf {Dec}(X_1,\ldots ,X_n) = X_1+\ldots +X_n. \end{aligned}$$
We will typically omit n and \(\mathbb {G}\) and simply write \((\mathsf {Enc},\mathsf {Dec})\) when clear from the context.
Noisy leakages. The noise in the observed version Y of a real distribution X, denoted by \(\beta (X|Y)\), is measured by comparing how close the product distribution \(\mathbf {P}_X\cdot \mathbf {P}_Y\) is to the joint distribution \(\mathbf {P}_{(X,Y)}\). More formally, we have the following definition, which comes from [22] (see also [7]), where it was argued that it models physical noise in a realistic way.
Definition 2
(Noisy observations and noisy functions). A random variable \(Y\in \mathcal {Y}\) is called a \(\delta \) -noisy observation of X if
$$\begin{aligned} \beta (X|Y) \overset{\text {def}}{=} \mathsf {SD}\left( (X',Y);\ (X,Y)\right) \leqslant \delta , \end{aligned}$$
where \(X'\) is an independent copy of X. A function f is called \(\delta \)-noisy if f(U) is a \(\delta \)-noisy observation of U, where U is uniform over \(\mathcal {X}\).
Notice that [22] defined the noise of a function as \(\sum _y \Pr [Y=y] \cdot \mathsf {SD}(X'; (X|Y=y))\). This is equivalent to the definition above when X and \(X'\) are independent and Y is the leakage from X. Note that this definition may seem a bit counterintuitive, as more noise means a bias value closer to 0, while a value closer to 1 means that less noise is present in the leakage observations.
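The bias \(\beta (X|Y)\) can be computed by enumeration for small groups. The sketch below (our own helper) evaluates it for the deterministic one-bit leakage \(f(x) = x \bmod 2\) of a uniform \(X \in \mathbb {Z}_4\), for which \(\beta = 1/2\), matching \(1 - |\mathbb {H}|/|\mathbb {G}|\) for the subgroup \(\mathbb {H}=\{0,2\}\).

```python
from fractions import Fraction

def beta(joint, m):
    """beta(X|Y) = SD((X',Y); (X,Y)) for X over Z_m and X' an independent
    uniform copy; `joint` maps (x, y) -> Pr[X = x, Y = y]."""
    py = {}                                   # marginal distribution of Y
    for (x, y), pr in joint.items():
        py[y] = py.get(y, Fraction(0)) + pr
    total = Fraction(0)
    for y, pry in py.items():
        for x in range(m):
            total += abs(joint.get((x, y), Fraction(0)) - pry * Fraction(1, m))
    return total / 2

m = 4
# deterministic 1-bit leakage f(x) = x mod 2 of a uniform X over Z_4
joint = {(x, x % 2): Fraction(1, m) for x in range(m)}
print(beta(joint, m))                          # 1/2
# full leakage f(x) = x gives the maximal bias 1 - 1/m
assert beta({(x, x): Fraction(1, m) for x in range(m)}, m) == 1 - Fraction(1, m)
```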
3 Our Main Result
We are now ready to present our main result that was already informally described in Sect. 1.1.
Theorem 1
Let X be a random variable on a group \(\mathbb {G}\) and let \(\text {Enc}(X) = (X_1,\ldots ,X_n)\) be its encoding. Suppose that \(f_i\) for \(i=1,\ldots ,n\) are all \(\delta \)-noisy functions, i.e. \(\beta (X_i| f_i(X_i)) \leqslant \delta \). Then we have the following bounds
-
(i)
For arbitrary \(\mathbb {G}\), if \(\delta \leqslant \frac{1}{2}-\theta \), then \(\beta (X| f_1(X_1),\ldots ,f_n(X_n)) \leqslant \epsilon \) provided that
$$\begin{aligned} n > 8\theta ^{-2}\log (5|\mathbb {G}|\epsilon ^{-1}) \end{aligned}$$ -
(ii)
If \(\mathbb {G}= \mathbb {Z}_p\) and \(\delta \leqslant 1-\frac{1}{p}-\theta \) then \(\beta (X| f_1(X_1),\ldots ,f_n(X_n)) \leqslant \epsilon \), provided that
$$\begin{aligned} n> \log (3|\mathbb {G}|\epsilon ^{-1}) \cdot 2^{12\log (1/\theta )/12 \theta } \end{aligned}$$ -
(iii)
If \(\mathbb {G}= \mathbb {Z}_p\) and \(\delta \leqslant 1-\frac{1}{p}-\theta \) then \(\beta (X| f_1(X_1),\ldots ,f_n(X_n)) \leqslant \epsilon \), provided that
$$\begin{aligned} n > 2\theta ^{-4}\log (|\mathbb {G}|\epsilon ^{-1}) \end{aligned}$$
3.1 Proof of Theorem 1
The proof of Theorem 1 was already outlined in Sect. 1.2. In the next sections we describe it in more detail. The final proof appears in Sect. 3.1.5.
3.1.1 Reducing to Uniform Secrets
Below we show that it is sufficient to consider only uniform secrets X.
Lemma 2
Suppose that X is uniform over \(\mathbb {G}\) with the encoding \(\mathsf {Enc}(X) = (X_1,\ldots ,X_n)\). Let \(X'\) be an arbitrary distribution over \(\mathbb {G}\) with the encoding \(\mathsf {Enc}(X')= (X'_1,\ldots ,X'_n)\). Then for arbitrary functions \(f_1,\ldots ,f_n\) we have
$$\begin{aligned} \beta \left( X' \,|\, f_1(X'_1),\ldots ,f_n(X'_n)\right) \leqslant |\mathbb {G}|\cdot \beta \left( X \,|\, f_1(X_1),\ldots ,f_n(X_n)\right) . \end{aligned}$$
The proof appears in Appendix A.3. Note that we lose a factor of \(|\mathbb {G}|\) in this transformation. However this does not actually affect the bound we want to prove, because we show that the main part \(\beta ( X | f_1(X_1),\ldots , f_n(X_n))\) converges to 0 exponentially fast with n.
3.1.2 Reducing to Random Walks Conditioned on Noisy Information.
We now show that the noise in a uniform secret X given \(\delta \)-noisy leakages \(f_1(X_1),\ldots ,f_n(X_n)\) is equal to the distance of a random sum of the form \(Z=\sum _{i=1}^{n} Z_i\) with independent summands \(Z_i\), conditioned on noisy information \(\{ f_i(Z_i) \}_{i=1}^n\), from the uniform distribution U.
Lemma 3
For X uniform on a set \(\mathcal {X}\) and any functions \(f_i\) the following equality holds
$$\begin{aligned} \beta \left( X \,|\, f_1(X_1),\ldots ,f_n(X_n)\right) = \mathsf {SD}\left( \sum _{i=1}^{n} Z_i;\ U \,\Big |\, f_1(Z_1),\ldots ,f_n(Z_n)\right) , \end{aligned}$$
where U and \(Z_i\) for \(i=1,\ldots ,n\) are uniform and independent over \(\mathcal {X}\).
The proof appears in Appendix A.4. Justifying the title, we note that we can think of the sum \(\sum _{i=1}^{n} Z_i\) as a random walk which starts at 0, with increments \(Z_i\).
3.1.3 Reducing to Unconditional Random Walks
The following lemma is an easy consequence of the Chernoff Bound.
Lemma 4
Let \((Z_i,Y_i)\) for \(i=1,\ldots ,n\) be independent random variables such that \(\varDelta (Z_i;U | Y_i) \leqslant \delta \). Then, for any \(\gamma >0\), \(\delta '=\delta +2\gamma \) and \(n' = \gamma n\),
$$\begin{aligned} \mathsf {SD}\left( \sum _{i=1}^{n} Z_i;\ U \,\Big |\, Y_1,\ldots ,Y_n\right) \leqslant \max _{\mathsf {SD}(Z'_i;U)\leqslant \delta '} \mathsf {SD}\left( \sum _{i=1}^{n'} Z'_i;\ U\right) + \mathrm {e}^{-2\gamma ^2 n}, \end{aligned}$$
where the maximum is taken over all independent random variables \(Z'_i\).
The proof appears in Appendix A.5. The immediate corollary below shows that we can get rid of the conditional part in the right-hand side of Eq. (4) in Lemma 3.
Corollary 1
For \(n' = \frac{\theta }{4}\cdot n\) we have
$$\begin{aligned} \mathsf {SD}\left( \sum _{i=1}^{n} Z_i;\ U \,\Big |\, f_1(Z_1),\ldots ,f_n(Z_n)\right) \leqslant \max _{\mathsf {SD}(Z'_i;U)\leqslant \delta +\frac{\theta }{2}} \mathsf {SD}\left( \sum _{i=1}^{n'} Z'_i;\ U\right) + \mathrm {e}^{-\frac{\theta ^2}{8} n}, \end{aligned}$$
where the maximum is taken over all independent random variables \(Z'_i\).
Note that by combining Step 1, Step 2 and Step 3 with Lemma 1 we can already conclude part (i) of Theorem 1 (see Sect. 3.1.5).
3.1.4 Worst-Case Summands
We prove the following geometrical fact:
Lemma 5
(Shape of extreme points of distributions close to uniform). Let \(\mathcal {X}\) be a finite set and U be uniform over \(\mathcal {X}\). Any distribution X on \(\mathcal {X}\) such that \(\mathsf {SD}(X;U) \leqslant \delta \) can be written as a convex combination of “extreme” distributions \(X'\) of the following form: with probability \(p=\delta +|\mathcal {X}|^{-1}\) the distribution \(X'\) takes a value a, and with probability \(q=1-p\) the distribution \(X'\) is uniform over a set A, where \(a\not \in A\), of size \(|A|=(1-\delta )|\mathcal {X}|-1\). Equivalently, each of these distributions \(X'\) is of the form
$$\begin{aligned} \mathbf {P}_{X'} = \mathbf {P}_{U} + \delta \cdot \mu _b - \delta \cdot \mu _B \end{aligned}$$
(where \(\mu _b\) denotes the point mass at b and \(\mu _B\) the uniform distribution over B)
for some B such that \(|B| = \delta |\mathbb {G}|\) and \(b\not \in B\).
Note that we always have \((1-\delta )|\mathcal {X}|-1\geqslant 0\), as the range of the noise parameter is \(0 \leqslant \delta \leqslant 1-\frac{1}{|\mathcal {X}|}\) when we consider secrets over \(\mathcal {X}\).
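For intuition, the extreme distributions of Lemma 5 are easy to construct and check numerically. The sketch below uses our own helper names and picks \(\delta \) so that \((1-\delta )|\mathcal {X}|-1\) is an integer.

```python
from fractions import Fraction

def extreme(delta, m, a=0):
    """Lemma 5 shape: mass delta + 1/m at `a`, the rest uniform on a set A
    of size (1 - delta)*m - 1 that avoids `a` (delta is chosen so that this
    size is a nonnegative integer)."""
    size_a = (1 - delta) * m - 1
    assert size_a == int(size_a) and size_a >= 0
    A = [x for x in range(m) if x != a][: int(size_a)]
    p = {x: Fraction(0) for x in range(m)}
    p[a] = delta + Fraction(1, m)
    for x in A:
        p[x] += (1 - p[a]) / len(A)
    return p

def sd_uniform(p, m):
    return sum(abs(pr - Fraction(1, m)) for pr in p.values()) / 2

for m, delta in ((8, Fraction(1, 2)), (6, Fraction(1, 3))):
    assert sd_uniform(extreme(delta, m), m) == delta   # exactly delta-far
print("extreme distributions attain SD = delta exactly")
```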
Corollary 2
Let \(Z_i\), for \(i=1,\ldots ,n\), be independent random variables such that \(\mathsf {SD}(Z_i;U) \leqslant \delta \) for every i. Then \(\mathsf {SD}\left( \sum _{i}Z_i;U\right) \) is maximized for \(Z_i\) as in Lemma 5.
Proof
(of Corollary 2 ). Note that the distribution of \(\sum _{i}Z_i\) is a convolution of individual distributions \(\mathbf {P}_{Z_i}\), and therefore it is multilinear in \(\mathbf {P}_{Z_i}\). It follows that \(\mathsf {SD}\left( \sum _{i}Z_i;U\right) = \frac{1}{2}\left\| \mathbf {P}_{\sum _{i}Z_i}-\mathbf {P}_U \right\| _1\) is convex in \(\mathbf {P}_{Z_i}\) and the claim follows by the extreme point principle.
Using Lemma 5 we derive the following generalization of Lemma 1.
Theorem 2
Let \(Z_1,Z_2\) be independent random variables on a group \(\mathbb {G}\). Then we have
$$\begin{aligned} \mathsf {SD}\left( Z_1+Z_2;\ U\right) \leqslant c(\mathbb {G})\cdot \mathsf {SD}\left( Z_1;U\right) \cdot \mathsf {SD}\left( Z_2;U\right) , \end{aligned}$$
where the constant is given by
$$\begin{aligned} c(\mathbb {G}) = \max _{|A| = \delta _1|\mathbb {G}|,\ |B| = \delta _2|\mathbb {G}|} \frac{1}{2}\left\| \left( \mu _0-\mu _A\right) *\left( \mu _0-\mu _B\right) \right\| _1 \end{aligned}$$
(here \(*\) denotes convolution over \(\mathbb {G}\)),
where \(\mu _0\) is the point mass at 0, \(\delta _i = \mathsf {SD}(Z_i;U)\), and \(\mu _A\), \(\mu _B\) are uniform over the sets A and B. Moreover, the sharp constant is achieved for the following random variables: \(Z_i\) is constant with probability \(\delta _i+\frac{1}{|\mathbb {G}|}\) and with probability \(1-\delta _i-\frac{1}{|\mathbb {G}|}\) is uniform on some set of size \((1-\delta _i)|\mathbb {G}|-1\).
Lemma 5 is a corollary of Theorem 2, whose proof appears in Appendix A.6. Note that Lemma 1 also follows from Theorem 2, as trivially \(c_{\max }(\mathbb {G}) \leqslant 2\), since
by the triangle inequality and the fact that the total mass of a probability measure is 1.
3.1.5 Concrete Bounds
In view of Corollary 1 it remains to give an upper bound on the distance between sums of independent random variables which are not too far from uniform and the uniform distribution, i.e., on:
To this end, we split our analysis depending on the structure of \(\mathbb {G}\) and chosen technique.
Case: \(\mathbb {G}\) is arbitrary. From Corollary 1 and Lemma 1 applied \((n-1)\) times it follows that
From the assumption \(\delta < \frac{1}{2}-\theta \) and the elementary inequality \(1-u \leqslant \mathrm {e}^{-u}\) we obtain \(\left( 2\delta +\theta \right) ^{\frac{\theta }{4} n} \leqslant \mathrm {e}^{-\frac{1}{4}\theta ^2 n}\), which gives us
for uniform X. Taking into account Step 1, we finally obtain
for any X, which is equivalent to part (i) of Theorem 1.
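The elementary inequality chain used above can be spot-checked on a grid. Under the assumption \(\delta < \frac{1}{2}-\theta \) we have \(2\delta +\theta < 1-\theta \), and \(1-u\leqslant \mathrm {e}^{-u}\) then gives the stated bound. The parameters in the sketch below are illustrative.

```python
import math

# check (2*delta + theta)**(theta*n/4) <= exp(-theta**2 * n / 4)
# whenever delta < 1/2 - theta (grid scan over illustrative parameters)
for n in (1, 10, 100):
    for theta in (0.05, 0.1, 0.2, 0.4):
        for delta in (0.0, 0.1, 0.3, 0.5 - theta - 1e-9):
            if delta < 0 or delta >= 0.5 - theta:
                continue  # outside the assumption of the theorem
            lhs = (2 * delta + theta) ** (theta * n / 4)
            rhs = math.exp(-theta ** 2 * n / 4)
            assert lhs <= rhs + 1e-12, (n, theta, delta)
print("inequality holds on the grid")
```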
Case: \(\mathbb {G}=\mathbb {Z}_p\) for p prime (by additive combinatorics). When \(\mathbb {G}=\mathbb {Z}_p\), we improve Lemma 1 in the following way, using Corollary 2 and some tools from additive combinatorics (see Appendix A.7 for a proof).
Lemma 6
Let \(Z_1,Z_2\) be independent random variables on \(\mathbb {Z}_p\) such that \(\mathsf {SD}(Z_i;U)\leqslant \delta _i\). Then
where
and \(\phi (\delta _1,\delta _2) \overset{\text {def}}{=} \delta _1+\delta _2+\min (\max (\delta _1,\delta _2),1-|\delta _1-\delta _2|)-1\).
We will use only the following consequence of Lemma 6.
Corollary 3
Let \(Z_1,Z_2\) be independent random variables on \(\mathbb {Z}_p\) such that \(\mathsf {SD}(Z_i;U)\leqslant \delta _i \leqslant \delta \). Suppose that \(\delta >\frac{1}{3}\). Then we have
Applying Corollary 3 recursively and then Corollary 1, we obtain the following bound for uniform X
which, by Step 1, implies part (ii) of Theorem 1. For a detailed derivation, see Lemma 8 in Appendix A.7.
Case: \(\mathbb {G}=\mathbb {Z}_p\) for p prime (by harmonic analysis). We start by obtaining the following auxiliary estimate on trigonometric sums, valid for any \(A\subset \mathbb {Z}_p\) such that \(|A|=\theta p\) and \(k \in \{1,2,\ldots ,p-1\}\):
The proof uses a geometrical argument and some trigonometric identities and is given inside the proof of Lemma 10 in Appendix A.8. Based on Corollary 2 and the XOR lemma (see Lemma 9 in Appendix A.8), we prove that
for uniform X, which by Step 1 implies part (iii) of Theorem 1; the details are given in Lemma 10 in Appendix A.8.
4 Lower Bounds
Proposition 1
(The noise threshold (1) is optimal). For any group \(\mathbb {G}\), there exists a \(\delta \)-noisy function f, where
with \(\mathbb {H}\) being the largest proper subgroup of \(\mathbb {G}\), such that for every n we have
The proof appears in Appendix A.1.
Proposition 2
(The growth rate in (2) is optimal). For \(\mathbb {G}= \mathbb {Z}_2\), X being uniform on \(\mathbb {G}\), and any \(\theta < \frac{1}{2}\) there exists a \(\left( \frac{1}{2}-\theta \right) \)-noisy leakage function f such that for every n satisfying
we have
In turn, for \(\mathbb {G}= \mathbb {Z}_p\) where p is prime, X being uniform on \(\mathbb {G}\), and any \(\theta < 1-\frac{1}{p}\) there exists a \(\left( 1-\frac{1}{p}-\theta \right) \)-noisy leakage function f such that for every n satisfying
we have
The proof appears in Appendix A.2. Note that for \(\theta < \frac{1}{4}\) we have \(\log ( (1-2\theta )^{-1}) < \frac{1}{4}\theta ^{-1}\), and then Eq. (12) can be replaced by
Similarly, for \(\theta < \frac{1}{2}\) we have \(\log ( (1-\theta )^{-1}) < \frac{1}{2}\theta ^{-1}\), and then Eq. (13) can be replaced by
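Both replacement bounds are easy to confirm numerically (natural logarithm assumed, consistent with the exponential bounds elsewhere in the paper); the grids below are illustrative.

```python
import math

# log((1-2*theta)^{-1}) < 1/(4*theta) for theta in (0, 1/4)
for k in range(1, 1000):
    theta = 0.25 * k / 1000
    assert math.log(1 / (1 - 2 * theta)) < 1 / (4 * theta)

# log((1-theta)^{-1}) < 1/(2*theta) for theta in (0, 1/2)
for k in range(1, 1000):
    theta = 0.5 * k / 1000
    assert math.log(1 / (1 - theta)) < 1 / (2 * theta)

print("both bounds hold on the grids")
```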
Notes
- 1.
We also show that the dependency on \(\theta \) is necessary, as otherwise we can provide attacks against the encoding.
- 2.
The main restriction, when comparing our result with a direct application of the XOR Lemma, is that we require the probability p of flipping the shares to be \(> 1/4\) (in contrast to [12], where \(p > 0\) is sufficient). The latter restriction stems from the fact that leakages in the binomial noise model with parameter p translate in our noise model into \(\delta =1-2p\)-noisy functions. We emphasize that for the general type of “noisy leakage” we consider, our bounds are optimal, as shown by our lower bounds.
- 3.
If \(\delta |\mathcal {X}|\) is not an integer, then instead of a uniform distribution we consider the distribution flat over a set A such that \(|A|= \lceil (1-\delta )|\mathcal {X}|-1 \rceil \), which assigns the mass of \(\frac{1}{(1-\delta )|\mathcal {X}|-1}\) to all but one point, and the mass of \(\frac{ \lceil (1-\delta )|\mathcal {X}|-1 \rceil - \left( (1-\delta )|\mathcal {X}|-1\right) }{ (1-\delta )|\mathcal {X}|-1}\) to the remaining point.
- 4.
Otherwise we could decompose either the positive part \(\mu ^{+}\) into a combination of two distributions (when \(\mu ^{+}\) is supported on more than one point) or the negative part \(\mu ^{-}\) (when the constraint Equation (29) is not binding at some point in the support).
References
Akavia, A., Goldwasser, S., Vaikuntanathan, V.: Simultaneous hardcore bits and cryptography against memory attacks. In: Reingold, O. (ed.) TCC 2009. LNCS, vol. 5444, pp. 474–495. Springer, Heidelberg (2009)
Alwen, J., Dodis, Y., Wichs, D.: Leakage-resilient public-key cryptography in the bounded-retrieval model. In: Halevi, S. (ed.) CRYPTO 2009. LNCS, vol. 5677, pp. 36–54. Springer, Heidelberg (2009)
Brakerski, Z., Kalai, Y. T., Katz, J., Vaikuntanathan, V.: Overcoming the hole in the bucket: public-key cryptography resilient to continual memory leakage. In: 51st FOCS, Las Vegas, Nevada, USA, pp. 501–510. IEEE Computer Society Press, 23–26 October 2010
Chari, S., Jutla, C.S., Rao, J.R., Rohatgi, P.: Towards sound approaches to counteract power-analysis attacks. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 398–412. Springer, Heidelberg (1999)
Davì, F., Dziembowski, S., Venturi, D.: Leakage-resilient storage. In: Garay, J.A., De Prisco, R. (eds.) SCN 2010. LNCS, vol. 6280, pp. 121–137. Springer, Heidelberg (2010)
Dodis, Y., Haralambiev, K., López-Alt, A., Wichs, D.: Cryptography against continuous memory attacks. In: 51st FOCS, Las Vegas, Nevada, USA, pp. 511–520. IEEE Computer Society Press, 23–26 October 2010
Duc, A., Dziembowski, S., Faust, S.: Unifying leakage models: from probing attacks to noisy leakage. In: Nguyen, P.Q., Oswald, E. (eds.) EUROCRYPT 2014. LNCS, vol. 8441, pp. 423–440. Springer, Heidelberg (2014)
Duc, A., Faust, S., Standaert, F.-X.: Making masking security proofs concrete. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015. LNCS, vol. 9056, pp. 401–429. Springer, Heidelberg (2015)
Dziembowski, S., Faust, S.: Leakage-resilient circuits without computational assumptions. In: Cramer, R. (ed.) TCC 2012. LNCS, vol. 7194, pp. 230–247. Springer, Heidelberg (2012)
Dziembowski, S., Faust, S., Skorski, M.: Noisy leakage revisited. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015. LNCS, vol. 9057, pp. 159–188. Springer, Heidelberg (2015)
Dziembowski, S., Pietrzak, K.: Leakage-resilient cryptography. In: 49th FOCS, Philadelphia, Pennsylvania, USA, pp. 293–302. IEEE Computer Society Press, 25–28 October 2008
Faust, S., Rabin, T., Reyzin, L., Tromer, E., Vaikuntanathan, V.: Protecting circuits from leakage: the computationally-bounded and noisy cases. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 135–156. Springer, Heidelberg (2010)
Goldreich, O.: Three XOR-lemmas — an exposition. In: Goldreich, O. (ed.) Studies in Complexity and Cryptography. LNCS, vol. 6650, pp. 248–272. Springer, Heidelberg (2011)
Goldwasser, S., Rothblum, G.N.: How to compute in the presence of leakage. In: 53rd FOCS, New Brunswick, NJ, USA, pp. 31–40. IEEE Computer Society Press, 20–23 October 2012
Ishai, Y., Sahai, A., Wagner, D.: Private circuits: securing hardware against probing attacks. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 463–481. Springer, Heidelberg (2003)
Kocher, P.C., Jaffe, J., Jun, B.: Differential power analysis. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999)
Maurer, U.M., Pietrzak, K., Renner, R.S.: Indistinguishability amplification. IACR Cryptology ePrint Archive, 2006:456 (2006)
Maurer, U.M., Pietrzak, K., Renner, R.S.: Indistinguishability amplification. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 130–149. Springer, Heidelberg (2007)
Micali, S., Reyzin, L.: Physically observable cryptography. In: Naor, M. (ed.) TCC 2004. LNCS, vol. 2951, pp. 278–296. Springer, Heidelberg (2004)
Naor, M., Segev, G.: Public-key cryptosystems resilient to key leakage. In: Halevi, S. (ed.) CRYPTO 2009. LNCS, vol. 5677, pp. 18–35. Springer, Heidelberg (2009)
Pollard, J.M.: A generalisation of the theorem of Cauchy and Davenport. J. Lond. Math. Soc. s2–8(3), 460–462 (1974)
Prouff, E., Rivain, M.: Masking against side-channel attacks: a formal security proof. In: Johansson, T., Nguyen, P.Q. (eds.) EUROCRYPT 2013. LNCS, vol. 7881, pp. 142–159. Springer, Heidelberg (2013)
Rao, A.: An exposition of Bourgain’s 2-source extractor. In: Electronic Colloquium on Computational Complexity (ECCC), vol. 14, p. 034 (2007)
Standaert, F.-X., Malkin, T.G., Yung, M.: A unified framework for the analysis of side-channel key recovery attacks. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 443–461. Springer, Heidelberg (2009)
A Proofs
1.1 A.1 Proof of Proposition 1
Proof
Let X be uniform over \(\mathbb {G}\) and let \(\phi \) be the canonical quotient homomorphism, that is, \(\phi (g) = g+\mathbb {H}\). For n shares \(X_1,\ldots ,X_n\) of X we have that \(X_i | \phi (X_i)\) and X are \(\left( 1-\frac{|\mathbb {H}|}{|\mathbb {G}|}\right) \)-far, because \(X_i | \phi (X_i)=y_i\) is uniform over a set of \(|\mathbb {H}|\) elements, for every choice of \(y_i\). Similarly, \(X=\sum _{i} X_i\) is \(\left( 1-\frac{|\mathbb {H}|}{|\mathbb {G}|}\right) \)-far from uniform given \(\phi (X_i)\) for all i, because \(\phi (X) = \sum _{i}\phi (X_i) = \sum _{i} y_i\). To see this, note that for independent uniform U we have
where the first line follows from the fact that applying a function to two random variables only decreases the statistical distance, and the second line uses the homomorphic property of \(\phi \). The last expression is at least \(\left( 1-\frac{|\mathbb {H}|}{|\mathbb {G}|}\right) \) as already observed for uniform X.
1.2 A.2 Proof of Proposition 2
Proof
Fix \(\mathbb {G}= \mathbb {Z}_2\) and consider a uniform secret X, its shares \(X_1,\ldots ,X_n\), and leakage functions \(f_i = f\) for \(i=1,\ldots ,n\) where f(x) flips the bit x with probability \(\theta < \frac{1}{2}\). It is easy to see that these functions are \(\left( \frac{1}{2}-\theta \right) \)-noisy, that is \( \mathsf {SD}\left( \left. X_i; U \right| f(X_i) \right) = \frac{1}{2}-\theta \) where U is an independent uniform random variable. Note that for uniform X (and any functions \(f_i\)) we have the equality of distributions
where \(\{Z_i\}_{i=1}^{n}\) are independent and uniform on \(\mathbb {G}\) (see Sect. 3.1.2). As a consequence we get
One can check that every \(X'_i\) has bias \(\delta =\frac{1}{2}-\theta \) (it outputs a bit with probability \(\frac{1}{2}\pm \delta \)) conditioned on \(f(X'_i)=y\) for every \(y\in \{0,1\}\). Since the xor-sum \(Y_1+Y_2+\ldots +Y_n\) of \(\delta \)-biased independent bits \(Y_1,Y_2,\ldots , Y_n\) has bias exactly \(2^{n-1}\delta ^n\), we conclude that \( \mathsf {SD}\left( \left. X'_1+\ldots +X'_n; U \right| f_1(X'_1),\ldots ,f_n(X'_n)\right) = 2^{n-1}\delta ^n \). To go below \(\epsilon \), we need \((1- 2\theta )^n < 2\epsilon \) or
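The bias computation used here (a piling-up style identity) can be verified exactly by brute-force enumeration; the parameters in the sketch below are illustrative.

```python
from itertools import product

def xor_bias(deltas):
    """Exact bias of Y_1 xor ... xor Y_n where Pr[Y_i = 0] = 1/2 + deltas[i]."""
    p0 = 0.0
    for bits in product((0, 1), repeat=len(deltas)):
        pr = 1.0
        for b, d in zip(bits, deltas):
            pr *= (0.5 + d) if b == 0 else (0.5 - d)
        if sum(bits) % 2 == 0:   # even parity means the xor-sum is 0
            p0 += pr
    return p0 - 0.5

delta, n = 0.3, 4
exact = xor_bias([delta] * n)
formula = 2 ** (n - 1) * delta ** n   # the claimed bias 2^{n-1} delta^n
assert abs(exact - formula) < 1e-12
print(exact, formula)                 # both equal 0.0648 up to float error
```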
Finally, consider the case \(\mathbb {G}= \mathbb {Z}_p\). We proceed as in the previous case, achieving
for arbitrary functions. We take the functions \(f_i\) so that the distribution of \(Z_i\) given \(f(Z_i)=y_i\) for every i has the following form:
where a is a point and A is a set such that \(a\not \in A\), \(|A| = \delta |\mathbb {G}|\). As it follows from the proof of Lemma 10 in Appendix A.8, we can choose A so that \(|\mathbb {E}\phi (V_i)| \geqslant 1-\theta \) for some character \(\phi \), where \(V_i \) is the distribution of \(Z_i\) conditioned on \(f(Z_i)\). This means that the Fourier transform \(\hat{V_i}\) of \(V_i\) is at least \(1-\theta \) in the supremum norm, that is \(\Vert \hat{V_i}\Vert _{\infty }\geqslant 1-\theta \). Since the Fourier transform is multiplicative under convolution (summing independent variables) we see that we can prepare functions \(f_i\) so that \(\Vert \hat{V}\Vert _{\infty } \geqslant (1-\theta )^{n}\), where \(V=V_1+\ldots +V_n\). The Parseval identity gives us \(\Vert \hat{V}\Vert _{2} =\left\| \mu _{V}-\mu _{U} \right\| _2\). Since \(\Vert \hat{V}\Vert _{\infty } \leqslant \Vert \hat{V}\Vert _{2}\) and \(\left\| \mu _{V}-\mu _{U} \right\| _2 \leqslant \left\| \mu _{V}-\mu _{U} \right\| _1\) we finally obtain
The claim follows now by averaging over different values of \(y_1,\ldots ,y_n\), exactly as in the previous case.
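The multiplicativity of the Fourier transform under convolution, on which this argument relies, is straightforward to verify directly on \(\mathbb {Z}_p\); the distribution below (on \(\mathbb {Z}_7\)) is illustrative.

```python
import cmath

p = 7

def fourier(dist, k):
    """k-th Fourier coefficient E[phi_k(Z)], with phi_k(x) = exp(2*pi*i*k*x/p)."""
    return sum(pr * cmath.exp(2j * cmath.pi * k * x / p) for x, pr in enumerate(dist))

def convolve(a, b):
    """Distribution of the sum mod p of two independent variables."""
    out = [0.0] * p
    for i in range(p):
        for j in range(p):
            out[(i + j) % p] += a[i] * b[j]
    return out

V = [0.4, 0.3, 0.3, 0.0, 0.0, 0.0, 0.0]   # an arbitrary distribution on Z_7
W = V[:]
S = convolve(V, W)

for k in range(p):
    # Fourier transform turns convolution into a pointwise product
    assert abs(fourier(S, k) - fourier(V, k) * fourier(W, k)) < 1e-9
print("multiplicativity verified for all characters")
```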
1.3 A.3 Proof of Lemma 2
We prove the following version, from which we conclude Lemma 2.
Suppose that X is uniform and let \((X_i)_i\) be the encoding of X. Let g be a probabilistic function, let \((G_i)_i\) be the encoding of \(G=g(X)\), and let \(f_i\) be noisy leakage functions. Then we have
Proof
Let V be uniform and \((V_i)_i\) be the encoding of V and let \(X',V'\) be independent copies of X, V. Note that \(X,( f_i(G_i) )_i\) is identically distributed as \(X, ( f_i(V_i) )_i | V = g(X)\). Therefore
Let \(\epsilon (x) = \Pr [V=x | (f_i(V_i))_i = (y_i)_i]-\frac{1}{|G|}\). Suppose first that g is deterministic. We have
and
Note that \(\left| \epsilon (g(x))-\frac{1}{|G|}\sum _{x'}\epsilon (g(x')) \right| \leqslant \sum _{x'}|\epsilon (x')|\) and \(\frac{1}{|G|}\sum _{x'}\epsilon (g(x'))\leqslant \sum _{x'}\epsilon (x')\). If \(\sum _{x'}\epsilon (x') \leqslant \frac{2}{3|G|}\) then we obtain
otherwise
This way, we have shown
and by taking the average the result follows. If g is randomized, the proof is the same, but \(\epsilon (g(x))\) is replaced by \(\mathbf {E}_{g}\, \epsilon (g(x))\) (note that we have \(\beta ( X | (f_i(G_i))_i ) \leqslant \beta ( g(X) | (f_i(G_i))_i )\)).
1.4 A.4 Proof of Lemma 3
We start with the following observation: suppose that \(X_i\) for \(i=1,\ldots ,n\) are shares of the uniform secret X. Let \(X'_i\) for \(i=1,\ldots ,n\) be all uniform and independent. Then we have the following equality of distributions
Therefore,
As a consequence we obtain the following equality
Thus, our problem reduces to investigating the random walk on \(\mathbb {G}\) defined as \(\sum _{i=1}^{n}X'_i | f_i(X'_i)\). We need to show that (under some restrictions) it eventually approaches the uniform distribution as n increases, and to estimate the convergence speed.
1.5 A.5 Proof of Lemma 4
Proof
We can assume that \(\delta +2\gamma < 1\). We start with the following observation:
Claim. Suppose that \(\delta _1,\ldots ,\delta _n\) are independent random variables with expected value at most \(\delta < 1\). Then with probability \(1-\exp (-2n\gamma ^2)\), at least \(n'=\gamma n\) of them are smaller than \(\delta +2\gamma \).
Proof (Proof of Claim). With probability \(1-\exp (-2 n\gamma ^2)\) we have \(\frac{1}{n}\sum _{i}\delta _i < \delta + \gamma \). Let \(n'\) be the number of i’s for which \(\delta _i < \delta +2\gamma \). Since we have \( \sum _{i}\delta _i > (n-n')(\delta +2\gamma )\), with probability \(1-\exp (-2n\gamma ^2)\) it holds that \(n(\delta +\gamma ) > (n-n')(\delta +2\gamma )\), or \(n' > \frac{\gamma }{\delta +2\gamma }\cdot n > \gamma n\).
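The deterministic counting step of this claim (everything after the Hoeffding bound) can be spot-checked mechanically: whenever the empirical mean falls below \(\delta +\gamma \), the number of small \(\delta _i\) must exceed \(\gamma n\). The parameters below are illustrative and satisfy \(\delta +2\gamma <1\).

```python
import random

random.seed(1)
delta, gamma, n = 0.4, 0.1, 1000      # delta + 2*gamma < 1, as assumed
for _ in range(200):
    ds = [random.random() for _ in range(n)]
    if sum(ds) / n < delta + gamma:   # the event guaranteed by the Hoeffding bound
        small = sum(1 for d in ds if d < delta + 2 * gamma)
        assert small > gamma * n      # the counting step of the claim
print("counting step verified whenever the mean is below delta + gamma")
```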
By applying the claim we see that with probability \(1-\exp (-2n\gamma ^2)\) over \((y_i)_i\leftarrow (Y_i)_i\), there exists a set \(I \subset \{1,\ldots , n\}\) with \(|I|\geqslant n'\) (possibly depending on \((y_i)_i\)) such that \(\mathsf {SD}\left( \left. Z_{i};U \right| Y_{i}=y_{i}\right) \leqslant \delta +2\gamma \) for \(i \in I\). Since the distributions \(\left( Z_i, Y_i \right) _{i}\) are independent for different i’s, and since \(U + Z \,{\buildrel d \over =}\,U\) for any independent random variable Z, from the elementary properties of the statistical distance we obtain
The lemma now easily follows, as for every I as above we have
1.6 A.6 Proof of Theorem 2
Proof
Let \(\mu _i\) be the distribution of \(Z_i\) for \(i=1,2\) and let \(\mu _U\) denote the uniform measure. Let \(\varDelta (\mu _i,\mu _U)=\delta _i\). Note that we can decompose \(\mu _i = \mu _U + \delta _i\mu ^{+}_i-\delta _i\mu ^{-}_i\). Therefore
where we have made use of the fact that \(\mu _U*\nu = \mu _U\) for any distribution \(\nu \). Now we have
This is clearly at most 2. To identify the worst-case choice of \(\mu _i\) that maximizes this quantity, observe that we have to bound the last expression subject to the constraints
which come from the fact that \(\mu _i\), as decomposed, has to be positive. There is no restriction on \(\mu ^{+}_i\). Note now that the form \(\mu ^{+}_1*\mu ^{+}_2 + \mu ^{-}_1 * \mu ^{-}_2 - \mu ^{+}_1* \mu ^{-}_2 - \mu ^{-}_1* \mu ^{+}_2\) is bilinear with respect to the measures \(\mu ^{+}_i,\mu ^{-}_i\), and the real-valued function \(\mu \rightarrow \left\| \mu \right\| _{\ell _{1}(G)}\) defined on signed measures is convex. It follows that \( \left\| \mu ^{+}_1*\mu ^{+}_2 + \mu ^{-}_1 * \mu ^{-}_2 - \mu ^{+}_1* \mu ^{-}_2 - \mu ^{-}_1* \mu ^{+}_2 \right\| _{\ell _1(\mathbb {G})}\) attains its maximal value for measures that are extreme points of their domain. Looking at the restrictions in (29) we see that this is the case where each \(\mu ^{+}_i\) is a point mass and each \(\mu ^{-}_i\) is uniform over a subset of cardinality \(\delta _i|G|\) (see Note 4). Thus we can assume that \(\mu ^{+}_1=\mu _a\), \(\mu ^{+}_2 = \mu _b\) are point masses at a, b and \(\mu ^{-}_1 = \mu _A\), \(\mu ^{-}_2 = \mu _B\) are uniform over A, B where \(|A| = \delta _1|G|\) and \(|B|=\delta _2|G|\). This way our quantity simplifies to
where we have used the fact that the \(\ell _1(\mathbb {G})\) norm is shift invariant and that point masses act as shifts under convolution.
From this we easily derive the following result.
Lemma 7
(Mixing time for a sum of random variables on a group). Let \(\{Z_i\}_{i=1,\ldots ,n}\) be independent random variables on an abelian group \(\mathbb {G}\), such that \(\varDelta (Z_i;U) =\delta _i\) where \(\delta _i \leqslant \frac{1}{2}-\theta \) and \(\theta > 0\). Then for \(n\geqslant \log (1/\epsilon ) / (2\theta )\) it holds that
1.7 A.7 Proof of Lemma 6
We will show that the constant given by (8) can be estimated much better when \(\mathbb {G}=\mathbb {Z}_p\). The trivial estimate is 2; however, this is possible only if \(A+B\) is disjoint from A and B. We recall the following result due to Cauchy and Davenport.
Theorem
(Cauchy-Davenport Theorem). For any \(A,B\subset \mathbb {Z}_p\), where p is prime, we have \(|A+B|\geqslant \min (|A|+|B|-1,p)\).
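The theorem is easy to confirm by a brute-force search over small parameters; the following sketch (p = 11, a spot check rather than a proof) exercises a sample of subset pairs.

```python
from itertools import combinations

p = 11
for ka in range(1, p + 1):
    for kb in range(1, p + 1):
        # a few subsets of each size suffice for a spot check
        for A in list(combinations(range(p), ka))[:3]:
            for B in list(combinations(range(p), kb))[:3]:
                sumset = {(a + b) % p for a in A for b in B}
                assert len(sumset) >= min(ka + kb - 1, p)
print("Cauchy-Davenport verified on the sample")
```

Note that arithmetic progressions attain the bound with equality, so the estimate cannot be improved in general.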
In view of this result, the trivial estimate cannot be attained whenever \(\delta _1+\delta _2+\max (\delta _1,\delta _2)>1+1/p\). From this we know that the estimate (7) is not sharp for \(\delta _1+\delta _2 \geqslant \frac{2}{3}+\frac{2}{3p}\). Therefore we expect to improve the estimate for sufficiently large values of \(\delta _1+\delta _2\), whereas for smaller values we can still use the general result. To this end, we will need a result stronger than the Cauchy-Davenport Theorem.
Theorem
(Pollard’s Theorem [21]). For any \(A,B\subset \mathbb {Z}_p\), where p is prime, we have
where \(r_{A,B}(x)\) counts the number of ways to represent x as a sum \(a+b\) with \(a\in A,\ b\in B\).
Intuitively, Pollard’s theorem says that the distribution of \(r_{A,B}(x)\) cannot be too “heavy-tailed”.
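In its standard form, Pollard’s theorem states that \(\sum _{x} \min (t, r_{A,B}(x)) \geqslant t\min (p, |A|+|B|-t)\) for every \(1\leqslant t \leqslant \min (|A|,|B|)\); this form can be confirmed by brute force for a small prime (p = 11 below, a spot check over a sample of sets).

```python
from itertools import combinations

p = 11
for A in combinations(range(p), 4):
    for B in [tuple(range(3)), (0, 2, 5, 7, 8)]:
        r = [0] * p                   # r[x] = #{(a,b) in A x B : a + b = x mod p}
        for a in A:
            for b in B:
                r[(a + b) % p] += 1
        for t in range(1, min(len(A), len(B)) + 1):
            lhs = sum(min(t, rx) for rx in r)
            assert lhs >= t * min(p, len(A) + len(B) - t)
print("Pollard bound verified on the sample")
```

For t = 1 the statement specializes to the Cauchy-Davenport Theorem, since \(\sum _x \min (1, r_{A,B}(x)) = |A+B|\).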
Proof
(of Lemma 6). In fact, we will show that \(\mu _{A}*\mu _{B}\) always puts a large mass on every sufficiently big set C, essentially on A or B. Observe first that
where \(r_{A,B}(x)\) counts the number of ways to represent x as a sum \(a+b\) with \(a\in A,\ b\in B\). By the trivial estimate \(r_{A,B}(x) \leqslant \min (|A|,|B|)\) we see that
Using this we can estimate the expression in (8) as follows
From Pollard’s theorem, for every set C we obtain
the maximum is attained at \(t_{\max } = \frac{|A|+|B|+|C|-p}{2}\), provided that \(|A|+|B|+|C|-p \geqslant 0\). We check that the required inequality \(|A| + |B| - p\leqslant t_{\max } \leqslant \min (|A|,|B|)\) holds provided that the set C satisfies
Note that if \(t_{\max } \not \in \mathbb {Z}\) then the conditions above are still sufficient provided that we replace \(t_{\max }\) with \(\lceil t_{\max } \rceil \) or \(\lfloor t_{\max } \rfloor \). Considering the function \(f(t) = t(|A|+|B|+|C|-p-t)\), by the mean-value theorem we see that
Therefore, we obtain
Setting \(C\subset A\cup B\) such that \(|C| = \min \left( \max (|A|,|B|), p- ||A| - |B||\right) \) we see that the condition \(|C|\geqslant |A|+|B|-p\) is satisfied. Provided that \(|A|+|B|+|C|-p \geqslant 0\) we obtain
and the result follows by (8).
From this result we obtain the following lemma, from which we conclude part (ii) of Theorem 1 by replacing \(\theta \) with \(\frac{\theta }{4}\) and combining with Corollary 1 in the same way as in the derivation of part (i).
Lemma 8
(Mixing time for a sum of random variables on \(\mathbb {Z}_p\) ). Let \(\{Z_i\}_{i=1,\ldots ,n}\) be independent random variables on \(\mathbb {G}=\mathbb {Z}_p\), such that \(\mathsf {SD}(Z_i;U) \leqslant \delta _i\) where \(\delta _i \leqslant 1-p^{-1}-\theta \) and \(\theta > 0\). Then for \(n\geqslant 3\cdot 2^{4/\theta } \log (1/\epsilon ) / \theta \) it holds that
Proof
First, using Corollary 3, we show that every sufficiently long sum is at distance at most \(\frac{1}{3}\) from uniform. Once we have that, it is enough to split the entire sum into sufficiently many blocks and then apply Theorem 2. Consider \(n_0=2^{m}\). By applying Lemma 6 several times we see that
where \(B_i\) are numbers defined by the following recursion
We will prove that \(1-p^{-1}\) is a repelling point: if we start from any \(B_0\) satisfying \(\frac{1}{3}\leqslant B_0 < 1-p^{-1}\) then \(B_i\) decreases below \(\frac{1}{3}\). Let \(C_i = 1-B_i\). If \(B_{i-1} \geqslant \frac{1}{3}\), then by Corollary 3 we get
From this we conclude that if \(\frac{1}{3}\leqslant B_{i-1} < 1-p^{-1}\) then \(C_{i-1} > p^{-1}\) and hence \(C_i > C_{i-1}\) or equivalently \(B_{i} < B_{i-1}\). Moreover, if \(C_{i-1}\geqslant p^{-1}+\theta \), we get
Since \(C_i \leqslant 1\) and \(C_0 \geqslant p^{-1}+\theta > \theta \), for some \(j \leqslant \frac{4}{\theta }\log \left( \frac{1}{\theta }\right) \) we must have \(B_j < \frac{1}{3}\). Thus for \(m=\lceil \frac{4}{\theta }\log \left( \frac{1}{\theta }\right) \rceil \) we have
Consider \(\ell =\log (1/\epsilon )\) blocks of random variables \(\left\{ Z_{2^{m}j+1},\ldots ,Z_{2^{m}j+2^{m}}\right\} _j\) for \(j=0,\ldots ,\ell -1\). For every such \(2^{m}\)-element block it follows from the last observation that
Applying Lemma 2 \(\ell \) times yields the estimate
which finishes the proof.
1.8 A.8 Harmonic Analysis
We need the following lemma, which is a generalization of Vazirani’s XOR lemma.
Lemma 9
(XOR lemma for abelian groups, [23]). Let Z be a distribution over a finite abelian group \(\mathbb {G}\), such that \(| \mathbb {E}\phi (Z)| \leqslant \epsilon \) for every non-trivial character \(\phi \) on \(\mathbb {G}\). Then Z is \(\epsilon \sqrt{|\mathbb {G}|}\)-close to uniform.
Lemma 10
(Mixing times of random sums over \(\mathbb {Z}_p\) ). Let \(\{Z_i\}_{i=1,\ldots ,n}\) be independent random variables on \(\mathbb {G}=\mathbb {Z}_p\), such that \(\mathsf {SD}(Z_i;U) \leqslant 1-p^{-1}-\theta \) and \(\theta > 0\). Then for \(n\geqslant 8\cdot \log (|\mathbb {G}|/\epsilon ) / \theta ^3\) it holds that
Proof
We apply some facts from harmonic analysis. Let \(Z_i\) be the worst-case distributions that maximize \(\mathsf {SD}\left( \sum \limits _{i=1}^{n}Z_i; U\right) \) under the constraints \(\mathsf {SD}(Z_i;U)\leqslant 1-p^{-1}-\theta \). By Eq. (6) we have
Consider a non-trivial character \(\phi (x) = \exp (2k\pi i x/p)\) on \(\mathbb {Z}_p\). Since \(A\not =\emptyset \) we have \(\theta \geqslant \frac{1}{p}\). We will show an upper bound on \(\mathbf {E}\phi (Z_i)\). First, observe that
is maximized exactly when \(kA = \left\{ -\frac{|A|-1}{2},\ldots ,0,\ldots ,\frac{|A|-1}{2}\right\} \). Indeed, we have
Claim. For any subset A of \(\mathbb {Z}_p\) and any non-trivial character \(\phi \) over \(\mathbb {Z}_p\) we have the following estimate
Proof
Note that every non-trivial character is of the form \(\phi (x) =\exp (2k\pi i x/p)\) where \(k\in \{1,2,\ldots ,p-1\}\). Next, we can assume that \(k=1\), by replacing A with \(A'=k\cdot A\), which doesn’t change the set size. Now, by the triangle inequality we have
It remains to estimate \(|m_A|\) where
is the mass center of the set \(\phi (A) = \{\phi (x):\ x\in A\}\). Note that \(\phi (A)\) may be an arbitrary |A|-element subset of the set of all p-th roots of unity (because \(\phi \) is a bijection); see Fig. 2 for an illustration. Our task is therefore to maximize the length of \(m_{A}\), which happens when A is a set of consecutive roots of unity. In particular
where the last equality follows by known trigonometric identities. Since \(\theta < 1\) we can omit the absolute value here, and this finishes the proof.
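Both facts used here — the closed form \(\left| \sum _{k=0}^{m-1} \mathrm {e}^{2\pi i k/p}\right| = \frac{\sin (\pi m/p)}{\sin (\pi /p)}\) for an arc of m consecutive roots of unity, and the maximality of the arc among all m-element subsets — can be checked exhaustively for a small prime; p = 17 and m = 6 below are illustrative.

```python
import cmath
import math
from itertools import combinations

p, m = 17, 6
root = lambda k: cmath.exp(2j * math.pi * k / p)   # k-th p-th root of unity

# closed form for the sum over m consecutive roots
arc = abs(sum(root(k) for k in range(m)))
closed_form = math.sin(math.pi * m / p) / math.sin(math.pi / p)
assert abs(arc - closed_form) < 1e-9

# no other m-element subset of the roots gives a longer sum
best = max(abs(sum(root(k) for k in A)) for A in combinations(range(p), m))
assert best <= arc + 1e-9
print(arc)
```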
We will prove the following inequality
Claim
For any \(\theta < 1\) and any \(c\leqslant \frac{4}{3}-\frac{\pi ^2}{18}\), we have \(1-\theta + \frac{\sin \pi \theta }{p\sin \frac{\pi }{p}} \leqslant 1-c\theta ^3\).
Proof
We want to prove that \(f(\theta ) = c\theta ^3-\theta +\frac{\sin \pi \theta }{p\sin \frac{\pi }{p}} \leqslant 0\). We have \(f(0)=0\) and \(\frac{\partial f(\theta )}{\partial \theta } = -1+3c\theta ^2 +\pi \frac{\cos \pi \theta }{p\sin \frac{\pi }{p}}\). Since for \(t \in \left[ 0,\frac{\pi }{2}\right] \) it holds that \(\cos t \leqslant 1-\frac{4 t^2}{\pi ^2}\) and \(\sin t \geqslant t-\frac{t^3}{6}\), we obtain
and since \(\theta \geqslant \frac{1}{p}\), the result follows.
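The inequality with \(c=\frac{1}{2}\), as it is used below, can be spot-checked numerically on a grid; the sketch takes p = 101 (illustrative) and \(\theta = k/p\) for \(k = 2,\ldots ,p-1\), staying strictly above the degenerate boundary \(\theta = 1/p\), where A is a single point.

```python
import math

p, c = 101, 0.5
for k in range(2, p):                 # theta = k/p, away from the boundary theta = 1/p
    theta = k / p
    lhs = 1 - theta + math.sin(math.pi * theta) / (p * math.sin(math.pi / p))
    assert lhs <= 1 - c * theta ** 3 + 1e-12, k
print("inequality verified for theta = 2/p, ..., (p-1)/p")
```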
From the last claim it follows that we can put \(c=\frac{1}{2}\) and thus
Now the result follows by Lemma 9.
Copyright information
© 2016 International Association for Cryptologic Research
Cite this paper
Dziembowski, S., Faust, S., Skórski, M. (2016). Optimal Amplification of Noisy Leakages. In: Kushilevitz, E., Malkin, T. (eds) Theory of Cryptography. TCC 2016. Lecture Notes in Computer Science(), vol 9563. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49099-0_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-49098-3
Online ISBN: 978-3-662-49099-0