Abstract
The slightly subexponential algorithm of Blum, Kalai and Wasserman (BKW) provides a basis for assessing LPN/LWE security. However, its huge memory consumption strongly limits its practical applicability, thereby preventing precise security estimates for cryptographic LPN/LWE instantiations.
We provide the first time-memory trade-offs for the BKW algorithm. For instance, we show how to solve LPN in dimension k in time \(2^{\frac{4}{3} \frac{k}{\log k} }\) and memory \(2^{\frac{2}{3} \frac{k}{\log k} }\). Using the Dissection technique due to Dinur et al. (Crypto ’12) and a novel, slight generalization thereof, we obtain fine-grained trade-offs for any available (subexponential) memory while the running time remains subexponential.
Reducing the memory consumption of BKW below its running time also allows us to propose the first quantum version, QBKW, of the BKW algorithm.
1 Introduction
The Learning Parity with Noise (LPN) problem [4] and its generalization to arbitrary moduli, the Learning with Errors (LWE) problem [29], lie at the heart of our most promising coding-based and lattice-based post-quantum cryptographic constructions [2, 26, 28]. With the NIST standardization [1], there is urgent pressure to identify LPN/LWE instantiations that allow for efficient constructions, yet give us the desired level of classic and quantum security.
Hence, we have to run cryptanalytic algorithms on medium-scale parameter sets in order to properly extrapolate towards cryptographic instantiations.
In LPN of dimension k and error rate p, one has to recover a secret \(\mathbf {s}\in \mathbb {F}_2^k\) from samples of the form \((\mathbf {a}_i, \langle \mathbf {a}_i, \mathbf {s}\rangle \oplus e_i)\) for uniformly random \(\mathbf {a}_i\in \mathbb {F}_2^k\) and Bernoulli distributed error \(e_i\), i.e., \(\Pr [e_i=1] = p\).
It is not hard to see that for constant error rate p, any candidate solution can be checked for correctness in time polynomial in k. This gives a simple brute-force LPN algorithm with time complexity \(2^k\) and constant memory.
The algorithm of Blum, Kalai and Wasserman [5] solves the faulty system of linear equations from LPN/LWE by a block-wise Gaussian elimination. In a nutshell, BKW takes sums of two vectors whenever they cancel a block of \({\varTheta }(\frac{k}{\log k})\) bits to keep the accumulating error under control. In cryptographic constructions we usually have constant p, for which the BKW algorithm runs in time \(2^{\frac{k}{\log k}(1+o(1))}\), albeit using the same amount of memory.
These two algorithms, brute-force and BKW, settle the initial endpoints for our time-memory trade-offs. Hence, for gaining a \({\log (k)}\)-factor in the run time exponent one has to invest memory up to the point where memory equals run time. Interestingly, this behavior occurs in the second method of choice for measuring LWE security, lattice reduction, too.
Whereas on lattices of dimension n, lattice enumeration such as Kannan’s algorithm [21] takes time \(2^{\mathcal {O}(n \log n)}\) with polynomial memory only, lattice sieving methods [12, 22,23,24] require \(2^{\mathcal {O}(n)}\) time and memory. Due to their large memory requirements, in practice lattice sieving is currently outperformed by enumeration, and there is an increasing interest in constructing time-memory trade-offs, e.g., by lattice tuple sieving [3, 18, 19].
For BKW, the research so far mainly concentrated on run time, where many optimizations have been proposed in the cryptographic literature, e.g. [7, 16, 17, 25, 32]. While these improvements may significantly reduce the running time for breaking concrete LPN instances such as \((k,p) = (512, \nicefrac 1 4)\) or \((512, \nicefrac 1 8)\), to the best of our knowledge for the running time exponent \(\frac{k}{\log k}(1+o(1))\) all improvements only affect the o(1)-term. Moreover, all proposals share the same huge memory requirements as original BKW, making it impossible to run them even in moderately large dimensions.
As a consequence state-of-the-art BKW implementations are currently only possible in dimension k up to around 100. For instance [14] reported a break of \((k, p) = (135, \nicefrac 1 4)\). However, of the total 6 days running time, the authors spent 2.5 days for an exponential preprocessing, followed by less than 2 h BKW in dimension 99, and another 3.5 days of exponential decoding. The reason for this run-time imbalance is that BKW in dimension 99 had already consumed all available memory, namely \(2^{40}\) bits.
Hence, if we really want to break larger LPN instances in practice, we must study time-memory trade-offs that sacrifice a bit of running time, but stay in the sub-exponential time regime at the benefit of a manageable memory.
Our contribution. We provide the first time-memory trade-offs for the BKW algorithm. These trade-offs give us a smooth interpolation for the complexities of solving LPN between the known endpoints \(2^{k}\) time for brute-force and \(2^{\frac{k}{\log k}}\) for BKW.
Since our algorithms’ running times remain subexponential even for given memory below the requirement \(2^{\frac{k}{\log k}}\) of classic BKW, we (asymptotically) outperform all previous algorithms (e.g. [14]) that solved LPN in exponential time when classic BKW was not applicable due to memory restrictions.
As a starting point, we consider—instead of 2-sums as in the original BKW algorithm—c-sums for (constant) \(c>2\) that cancel some blocks of bits. The use of sums of more than 2 vectors has already been discussed in the literature for improving the running time of BKW, e.g. by Zhang et al. [32] as an extension LF(k) of the BKW variants LF1 and LF2 by Levieil and Fouque [25], using Wagner’s k-list algorithm [31].
Since the number of c-sums grows exponentially in c, so does the number of c-sums whose sum cancels some block of bits. In turn, we systematically use c-sums to significantly lower the number of samples N that need to be stored, at the slightly increased cost of finding such c-sums.
We show that the complexities of any c-sum BKW algorithm are dominated by the cost of computing c-sums. As a consequence we abstract the c-sum-problem from the BKW algorithm and study various memory-friendly algorithms to solve it. We ensure that our c-sum algorithms do not require more memory than already consumed by c-sum BKW for storing its samples. In fact, our BKW algorithms have sample and (thus) memory complexity as little as \(N_c:={M_{\mathrm {BKW}}}^{\frac{\log c}{c-1}}\) for any constant \(c\in \mathbb {N}\), where \(M_\mathrm {BKW}:=2^{\frac{k}{\log k}}\) denotes the memory (and sample) requirement of classic BKW.
In Table 1, we give a brief overview of our c-sum algorithms complexities, and therefore also of our c-sum BKW complexities. We stress that all c-sum algorithms from Table 1, including those that use the Dissection technique [10, 30], may be studied for arbitrary list sizes outside of the BKW context.
Naive. We first consider a naive approach that computes all \((c-1)\)-sums of list entries and looks for some matching \(c^\mathrm {th}\) vector. This naive approach already gives us a smooth first time-memory trade-off, informally captured in the following theorem.
Theorem 1.1
(Naive BKW Trade-Off, informal). Let \(c\in \mathbb {N}\). The LPN problem in dimension k can be solved in time T and space M where
$$T = 2^{\log c \cdot \frac{k}{\log k}} \quad \text {and}\quad M = 2^{\frac{\log c}{c-1} \cdot \frac{k}{\log k}}$$
(up to \((1+o(1))\)-factors in the exponents).
Observe that the trade-off behaves quite nicely as a function of c. While we can reduce the memory exponent to a \(\frac{\log c}{c-1}\)-fraction, this comes at the cost of only a \((\log c)\)-factor in the run time exponent.
Note that for \(c=2\) Theorem 1.1 yields the well-known BKW complexities. While we consider constant c throughout this work, we point out that our results hold up to a choice of \(c(k)=k^{1-\frac{\log \log k}{\log k}}\) for which the formulas in Theorem 1.1 (as well as for the upcoming trade-offs) result in exponential running time in k with polynomial memory, matching the endpoint of the LPN brute-force algorithm. See Fig. 1 (stars) for an illustration of this time-memory trade-off.
QBKW. Using a standard Grover-search in our naive c-sum algorithm to identify \((c-1)\)-sums for which there exists a matching \(c^\mathrm {th}\) vector, we halve the running time complexity exponent from \(\log c \cdot \frac{k}{\log k}\) down to \(\frac{\log c}{2} \cdot \frac{k}{\log k}\). See Fig. 1 (triangles) for the resulting trade-off curve.
Dissection. We replace our naive c-sum algorithm by more advanced time-memory techniques like Schroeppel-Shamir [30] and its generalization, Dissection [10], to reduce the classic running time. We call the resulting algorithm Dissection-BKW. To give some illustrative results, with the Schroeppel-Shamir technique Dissection-BKW achieves exponents \(\log T = \frac{4}{3}\cdot \frac{k}{\log k}\) and \(\log M = \frac{2}{3}\cdot \frac{k}{\log k}\) (see the diamond at \(\log M = \frac{2}{3} \frac{k}{\log k}\) in Fig. 1). Using 7-Dissection, Dissection-BKW achieves exponents \(\log T \approx 1.87\cdot \frac{k}{\log k}\) and \(\log M \approx 0.47\cdot \frac{k}{\log k}\) (see the diamond at \(\log M \approx 0.47 \frac{k}{\log k}\) in Fig. 1).
Theorem 1.2
(Dissection BKW Trade-Off, informal). Let \(c\in \mathbb {N}\) be sufficiently large. The LPN problem in dimension k can be solved in time T and space M where
$$T = 2^{\left( 1-\sqrt{2/c}\right) \log c \cdot \frac{k}{\log k}} \quad \text {and}\quad M = 2^{\frac{\log c}{c-1} \cdot \frac{k}{\log k}}$$
(up to lower-order terms in the exponents).
Hence, in comparison to Theorem 1.1 Dissection mitigates the price we pay for saving a factor of \(\frac{\log c}{c-1}\) in memory from 1 down to \(\left( 1-\sqrt{2/c}\right) \).
The trade-off is depicted by the diamonds in Fig. 1. Interestingly, when classically employing Schroeppel-Shamir we are on par (see point \((\frac{2}{3},\frac{4}{3})\) in Fig. 1) with the complexities from the quantum trade-off as Schroeppel-Shamir allows for a square-root gain in the running time; the same as using a Grover-search in a quantum algorithm.
Tailored Dissection. Eventually, we introduce a new slight generalization of the Dissection technique that we call tailored Dissection. It allows us to achieve a piece-wise continuous trade-off (line segments depicted in Fig. 1) covering the sub-exponential memory regime entirely.
The full version of this work [13] also contains a discussion of how our results translate from the LPN setting to LWE.
2 Preliminaries
2.1 Notation
For \(a\le b\in \mathbb {N}\) let \([a,b]:=\{a,a+1,\ldots ,b\}\) and \([a]:=[1,a]\). For a set S and \(s\le |S|\) let \(\left( {\begin{array}{c}S\\ s\end{array}}\right) \) denote the set of all size-s subsets of S. A list \(L=(l_1,\ldots ,l_i)\) is an element \(L\in {S}^i\) and is of length \(|L|=i\). We let \(\emptyset \) denote the empty list. For two lists \(L_1\), \(L_2\) we write \(L_1\subseteq L_2\) if all elements from \(L_1\) are contained in \(L_2\) at least as often as in \(L_1\). We write shortly \(l\in L_2\) for \((l)\subseteq L_2\). For lists \(L_1=(l_1,\ldots ,l_i)\) and \(L_2=(l_{i+1},\ldots ,l_j)\) we let \(L_1\cup L_2:=(l_1,\ldots ,l_i,l_{i+1},\ldots ,l_j)\). Logarithms are always base 2.
For \(v\in \mathbb {F}_2^a\) and \(b\le a\) we denote the last b coordinates of v by \(\mathrm {low}_b(v)\). We let \(\mathbf {u}_i\) denote the \(i^\mathrm {th}\) unit vector. \(0^b\) denotes the zero vector of dimension b.
By \(\mathcal {U}(M)\) we denote the uniform distribution on a finite set M, by \(\mathrm {Ber}_p\) we denote the Bernoulli distribution, i.e., \(X\sim \mathrm {Ber}_p\) means that \(\Pr \left[ X=1\right] = 1-\Pr \left[ X=0\right] =p\). For n independent random variables \(X_1,\ldots ,X_n{\mathop {\sim }\limits ^{iid}}\mathrm {Ber}_p\) their sum X is binomially distributed with parameters n and p, denoted \(X\sim \mathrm {Bin}_{n,p}\). A probability p(k) is called overwhelming in k if \(1-p(k)\) is negligible in k. We denote deterministic assignments in algorithms by \(\leftarrow \).
Theorem 2.1
(Chernoff Bound, [27]). Let \(X\sim \mathrm {Bin}_{n,p}\). Then for any \(r\in [0,1]\) we have
$$\Pr \left[ X\le (1-r)\, np\right] \le \exp \left( -\frac{r^2 np}{2}\right) .$$
2.2 The LPN Problem
Definition 2.1
((Search) LPN Problem). Let \(k\in \mathbb {N}\), \(p\in [0,\frac{1}{2})\) be a constant, and let \(\mathbf {s}\sim \mathcal {U}(\mathbb {F}_2^k)\) be a secret. Let \(\mathtt {Sample}\) denote an oracle that, when queried, samples \(\mathbf {a}\sim \mathcal {U}(\mathbb {F}_2^k)\), \(e \sim \mathrm {Ber}_p\) and outputs a sample of the form \((\mathbf {a}, b) := (\mathbf {a}, \langle \mathbf {a},\mathbf {s}\rangle \oplus e)\). The \(\mathtt {LPN}_k\) problem consists of recovering \(\mathbf {s}\) given access to \(\mathtt {Sample}\). In the following we call k the dimension, \(\mathbf {s}\) the secret, p the error rate, b the label of \(\mathbf {a}\) and e the noise.
Brute-Force Sampling Algorithm. A straightforward way to recover the first bit \(s_1\) of the secret is to query \(\mathtt {Sample}\) until we obtain a sample of the form \((\mathbf {u}_1, b)\). Then \(b = s_1\oplus e\). Hence, \(\Pr \left[ s_1=b\right] =1-p>\frac{1}{2}\). However, as \(\mathtt {Sample}\) draws \(\mathbf {a}\) uniformly from \(\mathbb {F}_2^k\), we expect to have to query the oracle \(2^k\) times to have \(\mathbf {a}=\mathbf {u}_1\).
Further, to boost the confidence in recovering \(s_1\) correctly from merely \(1-p\) to being overwhelming (in k) one may collect many samples and decide on \(s_1\) by majority vote, whose error is bounded in the following lemma.
Lemma 2.1
(Majority Vote). Let \(q>\frac{1}{2}\) and \(X_1,\ldots ,X_n\sim \mathrm {Ber}_q\) independently. Then
$$\Pr \left[ \sum _{i=1}^n X_i\le \frac{n}{2}\right] \le \exp \left( -\frac{n}{2q}\left( q-\frac{1}{2}\right) ^2\right) .$$
Proof
Since \(X:=\sum _{i=1}^nX_i\sim \mathrm {Bin}_{n,q}\), using Theorem 2.1 with \(r=1-\frac{1}{2q}\) gives
$$\Pr \left[ X\le \frac{n}{2}\right] = \Pr \left[ X\le (1-r)\, nq\right] \le \exp \left( -\frac{r^2 nq}{2}\right) = \exp \left( -\frac{n}{2q}\left( q-\frac{1}{2}\right) ^2\right) .$$
\(\square \)
Corollary 2.1
For \( n:=\frac{2(1-p)k}{(\frac{1}{2}-p)^2} \) many samples, a majority vote recovers \(s_1\) correctly with probability at least \(1-\exp (-k)\).
Proof
Let us define \(X_i=1\) iff \(b_i=s_1\), which implies \(q=1-p\). Lemma 2.1 yields the desired result. \(\square \)
Therefore, for any constant error rate p, the majority vote requires only a linear number \(n = \mathcal {O}(k)\) of labels of samples of the form \((\mathbf {u}_1, b_i)\). Clearly, we can recover the remaining bits \(s_j\), \(j=2,\ldots ,k\) of the secret \(\mathbf {s}\) by querying \(\mathtt {Sample}\) until we have obtained sufficiently many samples of the form \((\mathbf {u}_j, b_i)\). Overall, the probability that we recover all bits of \(\mathbf {s}\) correctly is at least \((1-\exp (-k))^k\ge 1-k\cdot \exp (-k)=1-\mathrm {negl}(k)\).
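For concreteness, here is a minimal Python sketch of this brute-force sampling approach; the oracle interface, function names and toy parameters are ours and only illustrate the procedure described above.

```python
import random

def lpn_oracle(s, p):
    """One LPN sample (a, <a,s> + e mod 2) with a uniform in F_2^k and e ~ Ber_p."""
    k = len(s)
    a = [random.randint(0, 1) for _ in range(k)]
    e = 1 if random.random() < p else 0
    b = (sum(ai & si for ai, si in zip(a, s)) % 2) ^ e
    return a, b

def recover_first_bit(oracle, k, p, n=None):
    """Majority vote over labels of samples with a = u_1 (first unit vector)."""
    if n is None:
        n = int(2 * (1 - p) * k / (0.5 - p) ** 2)   # sample count from Corollary 2.1
    u1 = [1] + [0] * (k - 1)
    votes, collected = 0, 0
    while collected < n:
        a, b = oracle()
        if a == u1:                                  # expected 2^k oracle calls per hit
            votes += b
            collected += 1
    return 1 if 2 * votes > n else 0

# toy usage: small k, since waiting for a = u_1 already costs about 2^k queries
k, p = 12, 0.125
s = [random.randint(0, 1) for _ in range(k)]
print(recover_first_bit(lambda: lpn_oracle(s, p), k, p), s[0])
```

Already for moderate k the waiting time for samples with \(\mathbf {a}=\mathbf {u}_1\) dominates; this is exactly the \(2^k\) barrier that BKW-type algorithms circumvent by synthesizing such samples, as described next.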
2.3 Combining Samples
In [5], Blum, Kalai and Wasserman introduced the idea to construct samples of the form \((\mathbf {u}_j, b)\) from a collection of N arbitrary samples rather than merely waiting for a sample where \(\mathbf {a}=\mathbf {u}_j\). Their core idea is based on synthesizing a new sample from two existing ones \((\mathbf {a}_1, b_1)\), \((\mathbf {a}_2,b_2)\) via addition as \((\mathbf {a}_1\oplus \mathbf {a}_2,\, b_1\oplus b_2)\).
For such a synthesized sample, which we call a 2-sum of samples, we have \(b_1\oplus b_2 = \langle \mathbf {a}_1\oplus \mathbf {a}_2, \mathbf {s}\rangle \oplus (e_1\oplus e_2)\) and \(e_1\oplus e_2\sim \mathrm {Ber}_{p'}\) where \(p'=\frac{1}{2}-\frac{1}{2}(1-2p)^2>p\) according to the following Piling-Up lemma.
Lemma 2.2
(Piling-Up Lemma [14]). Let \(p\in [0,1]\) and \(e_i\sim \mathrm {Ber}_p\), \(i\in [n]\), be identically, independently distributed. Then
$$\bigoplus _{i=1}^{n} e_i \sim \mathrm {Ber}_{\frac{1}{2}-\frac{1}{2}(1-2p)^{n}}.$$
Summing up two or more samples enables us to synthesize samples at the expense of an error rate approaching \(\frac{1}{2}\) exponentially fast in the number of summands.
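The following small sketch (our own notation) makes the 2-sum synthesis and the Piling-Up Lemma concrete: summing samples is cheap, but the error rate of the synthesized label degrades as stated in Lemma 2.2.

```python
def two_sum(sample1, sample2):
    """Synthesize the 2-sum (a1 + a2, b1 + b2) over F_2 from two LPN samples."""
    (a1, b1), (a2, b2) = sample1, sample2
    return [x ^ y for x, y in zip(a1, a2)], b1 ^ b2

def piled_up_error(p, n):
    """Error rate of the sum of n independent Ber_p errors (Piling-Up Lemma)."""
    return 0.5 - 0.5 * (1 - 2 * p) ** n

# e.g. after a = 3 levels of 4-sums a label is a sum of c^a = 64 noisy labels:
# piled_up_error(0.125, 64) = 0.5 - 0.5 * 0.75**64, i.e. the bias (1-2p)^64 is tiny
```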
3 The c-Sum-Problem and its Application to BKW
3.1 A Generalized BKW Algorithm
While BKW repeatedly adds pairs of samples to zero out chunks of coordinates, we add \(c>2\) samples to accomplish the same. Beneficially, as the number of size-c subsets grows exponentially in c, this allows for a drastic reduction in the number of initially required samples (thus, memory as well) while still finding sufficiently many sums of c vectors adding up to zero on a block of coordinates.
We give our \(c\)-sum-BKW algorithm in Algorithm 1. For a block-size b and \(j\in [a]\) we refer to the coordinates \([k-jb+1,k-(j-1)b]\) as the \(j^\mathrm {th}\) stripe. Essentially, \(c\)-sum-BKW consists of a for-loop (line 4) that generates zeros (resp. the first unit vector) on the \(j^\mathrm {th}\) stripe for \(j\in [a-1]\) (resp. the \(a^\mathrm {th}\) stripe). This step is repeated multiple times (see line 2) to obtain sufficiently many labels of samples of the form \((\mathbf {u}_1, b_i)\) in order to let a majority vote determine the first bit \(s_1\) of the secret \(\mathbf {s}\) with overwhelming probability. This process is repeated k times to recover all bits of \(\mathbf {s}\).
For a list L as constructed in line 3, \(j\in [a]\) and \(t\in \mathbb {F}_2^b\) we let \(c\text {-}\mathtt {sum}(L,j,t)\) denote an algorithm that outputs a new list L where the coordinates of each entry match t on the \(j^\mathrm {th}\) stripe (see lines 5, 6). If L should shrink to size 0 throughout an execution, we abort and return a failure symbol (see lines 7, 8).

We point out that (essentially) the original BKW algorithm may be obtained by letting \(c\text {-}\mathtt {sum}\) add pairs of vectors whose sum matches t on the \(j^\mathrm {th}\) stripe to \(L'\).
Let us introduce the c-sum-problem lying at the heart of any algorithm that shall be used to instantiate \(c\text {-}\mathtt {sum}\). In short, given a list of vectors \(L \in (\mathbb {F}_2^b)^*\), i.e., a stripe from the \(c\)-sum-BKW point of view, the \(c\)-sum-problem asks to collect sums of c vectors that add up to some target \(t \in \mathbb {F}_2^b\).
Definition 3.1
(The c-Sum-Problem (\(c\text {-}\mathrm {SP}_b\))). Let \(b,c,N \in \mathbb {N}\) with \(c\ge 2\). Let \(L:=(l_1,\ldots ,l_{N})\) be a list where \(l_i\sim \mathcal {U}(\mathbb {F}_2^b)\) for all i and let \(t\in \mathbb {F}_2^b\) be a target. A single-solution of the \(c\text {-}\mathrm {SP}_b\) is a size-c set \(\mathcal {L}\in \left( {\begin{array}{c}[N]\\ c\end{array}}\right) \) such that
$$\bigoplus _{j\in \mathcal {L}} l_j = t.$$
A solution is a set of at least N distinct single-solutions. The c-sum-problem consists of finding a solution when given L, t while c is usually fixed in advance. We refer to (L, t, c) as an instance of the \(c\text {-}\mathrm {SP}_b\), concisely (L, t) if c is clear from the context.
Note that by definition a solution to the \(c\)-sum-problem consists of at least N single-solutions. These may again be combined into a new list of size (at least) N. Thus, we may apply a \(c\text {-}\mathrm {SP}_b\)-solving algorithm on different b-bit stripes of a list (as sketched below). Further, the list does not shrink if a solution exists in each iteration.
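To make this interplay concrete, the following sketch (our own simplification; only the prose description of Algorithm 1 is given above) applies an arbitrary c-sum-problem solver stripe by stripe, replacing the sample list with c-sums that are zero on the current stripe. The solver interface `c_sum_solver(stripe, target, c)` and the representation of samples as (vector, label) pairs are assumptions for illustration; any solver from Sect. 4 or 5 could be plugged in.

```python
def c_sum_bkw_reduce(samples, a, b, c, c_sum_solver):
    """Apply a c-sum-problem solver stripe by stripe (stripes j = 1, ..., a-1):
    replace the sample list by c-sums of samples that are zero on the j-th stripe."""
    L = samples                                        # list of (vector, label) pairs
    k = len(L[0][0])
    for j in range(1, a):
        lo, hi = k - j * b, k - (j - 1) * b            # coordinates of the j-th stripe
        stripe = [tuple(vec[lo:hi]) for vec, _ in L]
        solutions = c_sum_solver(stripe, target=(0,) * b, c=c)
        new_L = []
        for index_set in solutions:                    # each solution is a size-c index set
            vec, lab = [0] * k, 0
            for i in index_set:
                vec = [x ^ y for x, y in zip(vec, L[i][0])]
                lab ^= L[i][1]
            new_L.append((vec, lab))
        if not new_L:                                  # list shrank to zero: abort
            return None
        L = new_L
    return L
```

In the full algorithm the \(a^\mathrm {th}\) stripe is matched to the first unit vector instead of the all-zero target, and the whole reduction is wrapped into the majority-vote loop described above.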
Obviously, a solution should exist whenever the list size N is large enough, since then sufficiently many c-sums add up to some given target t. In the following, we show that the lower bound for the list-size N from Algorithm 1 guarantees the existence of such a solution with overwhelming probability under the following heuristic assumption that is regularly used in the analysis of BKW-type algorithms [6, 7, 32].
Independence Heuristic. Obviously, c-sums \(\bigoplus _{j\in \mathcal {L}_1} l_j\) and \(\bigoplus _{j\in \mathcal {L}_2} l_j\) are stochastically dependent for \(\mathcal {L}_1,\mathcal {L}_2\) with \(\mathcal {L}_1\cap \mathcal {L}_2\ne \emptyset \). However, their dependency should only mildly affect the runtime of an iterative collision search as in BKW-type algorithms. For instance, it has been shown in [9] that the dependence between 2-sums merely affects the overall runtime exponent by an o(1)-term. We heuristically assume that this also holds for \(c>2\), and therefore treat (iterative) c-sums as independent in our run time analyses.
We provide various experiments in Sect. 6.1 that support the Independence Heuristic.
For Algorithm 1, we need the following lemma only for the special case \(\alpha =1\). However, our Tailored Dissection approach in Sect. 5.3 also requires \(\alpha >1\).
Lemma 3.1
Let (L, t) be a \(c\text {-}\mathrm {SP}_b\) instance with \(|L|=N^{\alpha }\), where \(\alpha \ge 1\) and \(N := 2^{\frac{b+c\log c+1}{c-1}}\).
Then, under the Independence Heuristic, (L, t) has at least \(N^{\alpha }\) single-solutions with probability \(1-\exp (-N/4)\).
Proof
For every \(\mathcal {L}\in \left( {\begin{array}{c}[N^{\alpha }]\\ c\end{array}}\right) \) define an indicator variable \(X_{\mathcal {L}}\) that takes value 1 iff \(\bigoplus _{j\in \mathcal {L}} l_j = t\).
Let \(X:=\sum _{\mathcal {L}} X_{\mathcal {L}}\) be the number of single-solutions to the \(c\text {-}\mathrm {SP}_b\). Under the Independence Heuristic, the \(X_{\mathcal {L}}\) can be analyzed as if independent, thus \(X\sim \mathrm {Bin}_{\left( {\begin{array}{c}N^{\alpha }\\ c\end{array}}\right) ,{2^{-b}}}\) is binomially distributed. Hence,
$$\mathbb {E}[X] = \left( {\begin{array}{c}N^{\alpha }\\ c\end{array}}\right) \cdot 2^{-b} \ge \left( \frac{N^{\alpha }}{c}\right) ^{c}\cdot 2^{-b}.$$
Since \(\log N = \frac{b+c \log c + 1}{c-1}\), we obtain
$$\left( \frac{N^{\alpha }}{c}\right) ^{c}\cdot 2^{-b} \ge N^{\alpha }\cdot N^{c-1}\cdot 2^{-c\log c-b} = N^{\alpha }\cdot 2^{(c-1)\log N - c\log c - b} = 2N^{\alpha }.$$
Thus, our choice of N guarantees \(\mathbb {E}[X] \ge 2N^{\alpha }\). We can upper-bound the probability that the \(c\)-sum-problem has less than \(N^{\alpha }\) single-solutions using our Chernoff bound from Theorem 2.1 (with \(r=\frac{1}{2}\)) as
$$\Pr \left[ X< N^{\alpha }\right] \le \Pr \left[ X\le \tfrac{1}{2}\,\mathbb {E}[X]\right] \le \exp \left( -\tfrac{\mathbb {E}[X]}{8}\right) \le \exp \left( -\tfrac{N^{\alpha }}{4}\right) \le \exp \left( -\tfrac{N}{4}\right) .$$
\(\square \)
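As a quick numerical sanity check of the choice of N (the helper functions below are our own), one can verify that for \(\alpha =1\) the lower bound on the expected number of single-solutions equals exactly 2N.

```python
from math import log2

def list_size(b, c):
    """N = 2^((b + c*log2(c) + 1)/(c-1)) as in Lemma 3.1."""
    return 2 ** ((b + c * log2(c) + 1) / (c - 1))

def single_solution_bound(b, c, alpha=1.0):
    """Lower bound (N^alpha / c)^c * 2^(-b) on the expected number of single-solutions."""
    n = list_size(b, c) ** alpha
    return (n / c) ** c * 2 ** (-b)

# for alpha = 1 the bound equals exactly 2N:
b, c = 30, 4
assert abs(single_solution_bound(b, c) - 2 * list_size(b, c)) < 1e-6 * list_size(b, c)
```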
Observation 3.1
Clearly, any algorithm solving the \(c\)-sum-problem may be used to replace \(c\text {-}\mathtt {sum}\) in lines 5 and 6 of \(c\)-sum-BKW (Algorithm 1). As a consequence we do not distinguish between \(c\text {-}\mathtt {sum}\) and an algorithm solving the \(c\)-sum-problem.
We now give Theorem 3.2 stating that \(c\)-sum-BKW inherits its complexities from \(c\text {-}\mathtt {sum}\). As a result, we may focus on solving the \(c\)-sum-problem in the remainder of this work.
Theorem 3.2
(Correctness and Complexities of \(c\)-sum-BKW). Let \(c\text {-}\mathtt {sum}\) denote an algorithm solving the \(c\text {-}\mathrm {SP}_b\) in expected time \({T_{c,N}}\) and expected memory \(M_{c,N}\) with overwhelming success probability, where \(N \ge 2^{\frac{b+c\log c +1}{c-1}}\). Under the Independence Heuristic, \(c\)-sum-BKW solves the \(\mathtt {LPN}_k\) problem with overwhelming probability in time T, memory M, where
$$T=\left( T_{c,N}\right) ^{1+o(1)}, \qquad M=\left( M_{c,N}\right) ^{1+o(1)},$$
using \(N^{1+o(1)}\) samples.
Proof
Let \(a=\frac{\log k}{(1+\varepsilon _a)\log c}\) and \(b=\frac{k}{a}\) be as in Algorithm 1. Consider one iteration of the for-loop in line 2. As stated in Lemma 3.1 there exists a solution to the \(c\text {-}\mathrm {SP}_b\) instance (implicitly defined via the \(j^\mathrm {th}\) stripe of L and target \(0^b\) (resp. the first unit vector on the \(a^\mathrm {th}\) stripe)) with probability at least \(1-\exp (-\frac{N}{4})\). Hence, the probability that there is a solution to the instance in each of the a iterations is at least \(1-a\cdot \exp (-\frac{N}{4})\).
We now analyze the probability that a single bit of the secret gets recovered correctly. Since we build c-sums iteratively a times in lines 4–6, we eventually obtain vectors that are sums of at most \(c^{a}\) samples, whose labels have error rate \(\frac{1}{2}-\frac{1}{2}(1-2p)^{c^{a}}\) according to Lemma 2.2.
Note that the labels \(b_1,\ldots ,b_n\) collected in line 10 are stochastically independent as each \(b_i\) is obtained from freshly drawn samples. Now Corollary 2.1 yields that \(n:=\frac{8(1-p) k}{(1-2p)^{2c^a}}\) samples are sufficient for the majority vote to determine a bit of \(\mathbf {s}\) correctly with error probability at most \(\exp (-k)\). Using the Union Bound, one bit of the secret \(\mathbf {s}\) gets successfully recovered with probability at least
$$1-\exp (-k)-2an\cdot \exp \left( -\tfrac{N}{4}\right) .$$
As we have to correctly recover all, i.e., k bits of \(\mathbf {s}\), the overall success probability of \(c\)-sum-BKW is at least
$$\left( 1-\exp (-k)-2an\cdot \exp \left( -\tfrac{N}{4}\right) \right) ^{k} \ge 1-k\cdot \exp (-k)-2ank\cdot \exp \left( -\tfrac{N}{4}\right) .$$
Let us look at the term \(2ank \cdot \exp \left( -\frac{N}{4}\right) \). Since \(n = \tilde{\mathcal {O}}(2^{\kappa _p c^a})\) for some constant \(\kappa _p\), we obtain for constant c
$$2ank \cdot \exp \left( -\tfrac{N}{4}\right) = \tilde{\mathcal {O}}\left( 2^{\kappa _p c^a}\cdot 2^{-\Theta (N)}\right) = 2^{-\Theta (N)},\quad \text {since } c^a = k^{\frac{1}{1+\varepsilon _a}} = o(N).$$
Thus, the factor \(\exp \left( -\frac{N}{4}\right) \) clearly dominates and makes the overall success probability overwhelming in k.
Let us now analyze the time and memory complexity of \(c\)-sum-BKW. Since \(c\text {-}\mathtt {sum}\) has only expected time/memory complexity, this expectation is inherited by \(c\)-sum-BKW. We later show how to remove the expectation from \(c\)-sum-BKW's complexities by a standard technique.
Let us start with the running time of \(c\)-sum-BKW, where we ignore the negligible overhead of iterations of line 2 caused by failures in the \(c\text {-}\mathtt {sum}\) algorithm. Clearly, \(T_{c,N}\ge N\). Hence, one iteration of the for-loop can be carried out in time \(\tilde{\mathcal {O}}(\max \{N,a\cdot T_{c,N}\}) = \tilde{\mathcal {O}}(T_{c,N})\). Thus, for recovering the whole secret we get a running time of \(\tilde{\mathcal {O}}(n\cdot T_{c,N})\). From Equation (4) we already know that N dominates n, which implies that \(T_{c,N}\) dominates n. More precisely, we have \(n\cdot T_{c,N}= (T_{c,N})^{1+o(1)}\). Hence, we can also express the overall expected running time as \((T_{c,N})^{1+o(1)}\).
The memory consumption is dominated by \(c\text {-}\mathtt {sum}\), which gives us in total expected \(\tilde{\mathcal {O}}(M_{c,N})=(M_{c,N})^{1+o(1)}\). The sample complexity of \(c\)-sum-BKW is \( \tilde{\mathcal {O}}(knN)=\tilde{\mathcal {O}}(nN)=N^{1+o(1)}.\)
It remains to remove the expectations from the complexity statements for time and memory, while keeping the success probability overwhelming. We run \(c\)-sum-BKW k times, aborting each run if it exceeds its expected running time or memory by a factor of 4. A standard Markov bound then shows that this modified algorithm fails to provide a solution with probability at most \(2^{-k}\). For details see the full version [13]. \(\square \)
In the following section, we discuss an algorithm to naively solve the \(c\)-sum-problem leading to a first trade-off in the subexponential time/memory regime that also allows for a quantum speedup.
4 First Time-Memory Trade-Offs for BKW
4.1 A Classic Time-Memory Trade-Off
A straight-forward algorithm to solve an instance of the \(c\)-sum-problem is to compute all sums of c elements from L. For \(N=|L|\) this approach takes time \(\mathcal {O}(N^c)\) while it requires memory \(\mathcal {O}(N)\).
With little effort, we can do slightly better: Let L be sorted. Now let us brute-force all \((c-1)\)-sums from L, add t, and check whether the result appears in L. This gives us by construction a c-sum that matches t.
The details of this approach are given in Algorithm 2. Notice that whenever we call Algorithm 2, we should first sort the input list L, which can be done in additional time \({\tilde{\mathcal {O}}}(N)\).
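A hedged Python sketch of this naive approach (our own simplification of Algorithm 2, with b-bit vectors represented as integers and single-solutions returned as index sets):

```python
from bisect import bisect_left
from itertools import combinations

def naive_c_sum(L, t, c):
    """Collect single-solutions of the c-sum-problem: size-c index sets whose
    elements of L (b-bit integers) XOR to the target t."""
    order = sorted(range(len(L)), key=lambda i: L[i])   # sort indices of L by value
    keys = [L[i] for i in order]
    solutions = set()
    for idx in combinations(range(len(L)), c - 1):      # enumerate all (c-1)-sums
        partial = t
        for i in idx:
            partial ^= L[i]
        pos = bisect_left(keys, partial)                # binary search for a c-th match
        while pos < len(keys) and keys[pos] == partial:
            j = order[pos]
            if j not in idx:                            # indices must be distinct
                solutions.add(tuple(sorted(idx + (j,))))
            pos += 1
    return solutions

# e.g.: naive_c_sum([0b1011, 0b0110, 0b1101, 0b0000], t=0, c=3)
```

The enumeration of all \((c-1)\)-sums gives the \(\tilde{\mathcal {O}}(N^{c-1})\) running time of Lemma 4.1, while only the sorted list of N elements is kept in memory.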

The following lemma shows correctness and the complexities of Algorithm 2.
Lemma 4.1
Algorithm 2 finds a solution of the \(c\text {-}\mathrm {SP}_b\) in time \(\tilde{\mathcal {O}}\left( N^{c-1}\right) \) and memory \(\tilde{\mathcal {O}}(N)\).
Proof
See full version [13].
Let us now replace the \(c\text {-}\mathtt {sum}\) subroutine in \(c\)-sum-BKW with Algorithm 2, and call the resulting algorithm \(c\)-sum-naive-BKW.
Theorem 4.1
(Naive Trade-Off). Let \(c\in \mathbb {N}\). For all \(\varepsilon >0\) and sufficiently large k, under the Independence Heuristic \(c\)-sum-naive-BKW solves the \(\mathtt {LPN}_{k}\) problem with overwhelming success probability in time \(T=2^{\vartheta (1+\varepsilon )}\), using \(M=2^{\mu (1+\varepsilon )}\) memory and samples, where
$$\vartheta = \log c\cdot \frac{k}{\log k}, \qquad \mu = \frac{\log c}{c-1}\cdot \frac{k}{\log k}.$$
Proof
Let \(N:=2^{\frac{b+c\cdot \log c + 1}{c-1}}\). According to Lemma 4.1, Algorithm 2 is correct and we can apply Theorem 3.2 to conclude that \(c\)-sum-naive-BKW solves LPN with overwhelming success probability.
Further, Lemma 4.1 shows that Algorithm 2 runs in time
$$T_{c,N}=\tilde{\mathcal {O}}\left( N^{c-1}\right) =\tilde{\mathcal {O}}\left( 2^{b+c\log c+1}\right) =\tilde{\mathcal {O}}\left( 2^{b}\right) \quad \text {for constant } c.$$
Thus, by Theorem 3.2, \(c\)-sum-naive-BKW runs in time \(T=2^{b(1+o(1))}\). Since \(c\)-sum-BKW operates on stripes of width \(b=\log c \cdot \frac{k(1+\varepsilon _{a})}{\log k}\) (see the definition in Algorithm 1), we obtain the claimed complexity \(T=2^{\vartheta (1+\varepsilon _a+o(1))}=2^{\vartheta (1+\varepsilon )}\) for every \(\varepsilon >\varepsilon _a\) and sufficiently large k.
Since Algorithm 2 requires memory \(M_{c,N}=\tilde{\mathcal {O}}(N)\), by Theorem 3.2 the memory complexity of \(c\)-sum-naive-BKW is (for constant c)
$$M=N^{1+o(1)}=2^{\frac{b+c\log c+1}{c-1}(1+o(1))}=2^{\frac{b}{c-1}(1+o(1))}=2^{\mu (1+\varepsilon _a+o(1))}=2^{\mu (1+\varepsilon )}.$$
The sample complexity of \(c\)-sum-naive-BKW is \(N^{1+o(1)}\) (see Theorem 3.2) and therefore identical to M. \(\square \)
Figure 1 shows the time-memory trade-off from Theorem 4.1 depicted by stars.
4.2 A Quantum Time-Memory Trade-Off
Grover’s algorithm [15] identifies a marked element in an unsorted database D in time \(\mathcal {O}(\sqrt{|D|})\) with overwhelming probability. A matching lower bound \({\varOmega }(\sqrt{|D|})\) by Dohotaru and Høyer [11] shows that Grover’s algorithm is optimal. We use a modification of Grover’s algorithm due to [8], denoted Grover, that applies even in case the number of marked elements is unknown.
Theorem 4.2
(Grover Algorithm [8]). Let \(f:D\rightarrow \{0,1\}\) be a function with non-empty support. Then Grover outputs with overwhelming probability a uniformly random preimage of 1, making q queries to f, where
$$q=\mathcal {O}\left( \sqrt{\frac{|D|}{|f^{-1}(1)|}}\right) .$$
We use Grover’s algorithm to speed up our naive approach to solving the \(c\)-sum-problem. While we previously brute-forced all \((c-1)\)-sums in list L and checked if there is a fitting \(c^\mathrm {th}\) entry in L, we now employ a Grover-search to immediately obtain \((c-1)\)-sums for which there exists a suitable \(c^\mathrm {th}\) element in L. Let us define the Grover function \(f_t\) as
$$f_t:\left( {\begin{array}{c}[N]\\ c-1\end{array}}\right) \rightarrow \{0,1\},\qquad f_t(i_1,\ldots ,i_{c-1})=1 \;:\!\iff \; l_{i_1}\oplus \ldots \oplus l_{i_{c-1}}\oplus t\in L.$$
Given some \((i_1,\ldots ,i_{c-1})\in f_t^{-1}(1)\) we can recover all \(i_c\) such that \(\{i_1,\ldots ,i_{c}\}\) is a single-solution of instance (L, t) in time \(\tilde{\mathcal {O}}(\log |L|)\) if L is sorted.
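Classically, the predicate \(f_t\) can be evaluated with a binary search in the sorted list; the following sketch (our own helper, not the quantum circuit queried by Grover) only illustrates what the oracle computes.

```python
from bisect import bisect_left

def make_f_t(L_sorted, t):
    """Classical evaluation of the Grover predicate f_t: given indices
    (i_1, ..., i_{c-1}) into the sorted list, return 1 iff some list element
    completes them to a c-sum equal to t (distinctness is ignored in this sketch)."""
    def f_t(indices):
        v = t
        for i in indices:
            v ^= L_sorted[i]
        pos = bisect_left(L_sorted, v)
        return 1 if pos < len(L_sorted) and L_sorted[pos] == v else 0
    return f_t
```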

Lines 3–7 of Algorithm 3 and Algorithm 2 are identical. We merely replaced the brute-force search (line 1 in Algorithm 2) by a Grover-search (lines 1, 2 in Algorithm 3).
Lemma 4.2
Algorithm 3 solves the \(c\text {-}\mathrm {SP}_b\) with overwhelming probability in time \(\tilde{\mathcal {O}}\left( N^{c/2}\right) \) and memory \(\tilde{\mathcal {O}}(N)\).
Proof
See full version [13].
Let \(\mathtt {QBKW}\) denote the \(c\)-sum-BKW algorithm where \(c\text {-}\mathtt {sum}\) is instantiated using Algorithm 3.
Theorem 4.3
Let \(c\in \mathbb {N}\). For all \(\varepsilon >0\) and sufficiently large k, under the Independence Heuristic \(\mathtt {QBKW}\) solves the \(\mathtt {LPN}_{k}\) problem with overwhelming success probability in time \(T=2^{\vartheta (1+\varepsilon )}\), using \(M=2^{\mu (1+\varepsilon )}\) memory and samples, where
$$\vartheta = \frac{c}{2(c-1)}\cdot \log c\cdot \frac{k}{\log k}, \qquad \mu = \frac{\log c}{c-1}\cdot \frac{k}{\log k}.$$
Proof
The proof proceeds along the lines of the proof of Theorem 4.1.
The trade-off from Theorem 4.3 is depicted in Fig. 1 by triangles.
5 Time-Memory Trade-Offs for BKW via Dissection
While a naive approach already led to a first time-memory trade-off, a meet-in-the-middle approach for solving the \(c\)-sum-problem prohibits a trade-off, as we explain in Sect. 5.1. As a consequence, we resort to the more advanced Dissection technique [10, 30] in Sect. 5.2. However, as Dissection merely leads to instantiations for a rather sparse choice of available memory, we give a slight generalization of Dissection tailored to any given amount of memory. This allows for a trade-off covering the whole range of possible memory (see Sect. 5.3).
Let us define the specific structure of single-solutions that are recovered by all \(c\)-sum-problem algorithms in this section.
Definition 5.1
(Equally-split (Single-)Solution). Let (L, t) be a \(c\text {-}\mathrm {SP}_b\) instance. Partition L into lists \(L_1,\ldots ,L_c\) of size \(\frac{|L|}{c}\) each. A single-solution is equally-split (wrt. \(L_1,\ldots ,L_c\)) if it corresponds to list elements \(l_1,\ldots ,l_c\) whereby \(l_i \in L_i\) for all \(i\in [c]\). An equally-split solution is a collection of N equally-split single-solutions.
Lemma 5.1
Let (L, t) be a \(c\text {-}\mathrm {SP}_b\) instance with \(|L|=N^{\alpha }\), where \(\alpha \ge 1\) and \(N := 2^{\frac{b+c\log c+1}{c-1}}\).
Then, under the Independence Heuristic, (L, t) has at least \(N^{\alpha }\) equally-split single-solutions with probability \(1-\exp (-N/4)\).
Proof
Let X be a random variable for the number of equally-split single-solutions. Then \(\mathbb {E}[X] = \left( \frac{N^{\alpha }}{c}\right) ^c \cdot 2^{-b}\). Hence, Eq. (2) is satisfied, and the rest follows as in the proof of Lemma 3.1. \(\square \)
5.1 Meet-in-the-Middle and Beyond
Let (L, t) be an instance of the \(c\text {-}\mathrm {SP}_b\). In a meet-in-the-middle approach one splits a c-sum \(t=v_{i_1}\oplus \ldots \oplus v_{i_c}\) into two parts \(t=\big (v_{i_1}\oplus \ldots \oplus v_{i_\frac{c}{2}}\big )\oplus \big (v_{i_{\frac{c}{2}+1}}\oplus \ldots \oplus v_{i_c}\big )\). Let us define \(L_1:=(v_1,\ldots ,v_{\frac{\left| L\right| }{2}})\), \(L_2:=(v_{\frac{\left| L\right| }{2}+1},\ldots ,v_{\left| L\right| })\) and consider a single-solution corresponding to \(\frac{c}{2}\) elements from \(L_1\) and \(L_2\) each.
If we compute all \(\frac{c}{2}\)-sums of elements from \(L_1\) and save them in a new list \(L_1^{\frac{c}{2}}\), then for each \(\frac{c}{2}\)-sum v of \(L_2\) we can check if \(w:=v\oplus t \in L_1^{\frac{c}{2}}\). If so, v and w form a single-solution since \(v\oplus w=t\). Obviously, this approach has expected time and memory complexity \(\tilde{\mathcal {O}}(\max (|L|,|L|^\frac{c}{2}))=\tilde{\mathcal {O}}(|L|^\frac{c}{2})\) for \(c\ge 2\). Yet, choosing \(c>2\) only leads to worse complexities, time and memory-wise, while for \(c=2\) the complexities of the meet-in-the-middle approach are as bad as for the naive approach from Sect. 4.1.
Schroeppel and Shamir’s Meet-in-the-Middle. We present a heuristic simplification of the Schroeppel-Shamir algorithm [30] due to Howgrave-Graham and Joux [20].
In a nutshell, the idea of Schroeppel and Shamir is to run a meet-in-the-middle attack but impose an artificial constraint \(\tau \) on the \(\frac{c}{2}\)-sums. This results in lists \(L_1^{\frac{c}{2},\tau }\) that are significantly smaller than \(L_1^{\frac{c}{2}}\) in the original meet-in-the-middle approach. In order to find all single-solutions, one iterates \(\tau \) over its whole domain. \(L_1^{\frac{c}{2},\tau }\) is in turn built from smaller lists as follows. Let \(t=v_{i_1}\oplus \ldots \oplus v_{i_c}\) and write
$$t=\big (v_{i_1}\oplus \ldots \oplus v_{i_{c/4}}\big )\oplus \big (v_{i_{c/4+1}}\oplus \ldots \oplus v_{i_{c/2}}\big )\oplus \big (v_{i_{c/2+1}}\oplus \ldots \oplus v_{i_{3c/4}}\big )\oplus \big (v_{i_{3c/4+1}}\oplus \ldots \oplus v_{i_c}\big ).$$
Create four lists \(L_{11}^{\frac{c}{4}}\), \(L_{12}^{\frac{c}{4}}\), \(L_{21}^{\frac{c}{4}}\), \(L_{22}^{\frac{c}{4}}\) containing all \(\frac{c}{4}\)-sums of elements from the first, second, third, and fourth quarter of L. Let \(\ell _{i,j}\) denote elements from \(L_{ij}^{\frac{c}{4}}\) for \(i,j=1,2\). As a constraint we choose \(\mathrm {low}_{\frac{c}{4}\log |L|}(\ell _{11}\oplus \ell _{12})=\tau \) for some fixed \(\tau \). Now we construct \(L_1^{\frac{c}{2},\tau }\) from \(L_{11}^{\frac{c}{4}}\) and \(L_{12}^{\frac{c}{4}}\) using a meet-in-the-middle approach requiring expected time \(\tilde{\mathcal {O}}(|L|^{\frac{c}{4}})\). Similarly, we can compute all elements of list \(L_{2}^{\frac{c}{2},t\oplus \tau }\) to obtain sums of \(\frac{c}{2}\) vectors (\(\frac{c}{4}\) from \(L_{2,1}^{\frac{c}{4}}\) and \(L_{2,2}^{\frac{c}{4}}\) each) matching \(t\oplus \tau \) on the last \(|\tau |=\frac{c}{4}\log |L|\) bits. Eventually, given the elements from \(L_{2}^{\frac{c}{2},t\oplus \tau }\) and list \(L_1^{\frac{c}{2},\tau }\) one can recover single-solutions as before. Thereby list \(L_1^{\frac{c}{2},\tau }\) is of expected size
$$\frac{|L_{11}^{\frac{c}{4}}|\cdot |L_{12}^{\frac{c}{4}}|}{2^{|\tau |}}=\tilde{\mathcal {O}}\left( \frac{|L|^{\frac{c}{4}}\cdot |L|^{\frac{c}{4}}}{|L|^{\frac{c}{4}}}\right) =\tilde{\mathcal {O}}\left( |L|^{\frac{c}{4}}\right) .$$
We conclude that the expected memory consumption of the algorithm is given by \(\tilde{\mathcal {O}}(\max \left\{ |L|,|L|^{\frac{c}{4}}\right\} )\).
As we iterate over \(2^{|\tau |}=|L|^{\frac{c}{4}}\) choices of \(\tau \) requiring expected time \(\tilde{\mathcal {O}}(|L|^{\frac{c}{4}})\) per iteration, the overall expected running time is given by \(\tilde{\mathcal {O}}(|L|^{\frac{c}{2}})\) for all \(c\ge 4\). Hence, for \(c=4\) we obtain an algorithm as fast as the meet-in-the-middle approach, while consuming only expected memory \(\tilde{\mathcal {O}}(|L|)\).
Note that it is sufficient to store list \(L_1^{\frac{c}{2},\tau }\) while all elements from \(L_2^{\frac{c}{2},t\oplus \tau }\) may be computed and checked on-the-fly, i.e., without storing its elements.
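A compact sketch of the described 4-list case (c = 4), with b-bit vectors as Python integers; the dictionary-based matching and all names are our choices, and the constraint length r corresponds to \(|\tau |=\frac{c}{4}\log |L|\).

```python
from collections import defaultdict

def low_bits(x, r):
    return x & ((1 << r) - 1)

def schroeppel_shamir_4sum(L1, L2, L3, L4, t, r):
    """Find (x1, x2, x3, x4) with x1^x2^x3^x4 = t by iterating a constraint tau
    on the r low bits of x1^x2; only the left-hand partial sums are stored."""
    by_low_2 = defaultdict(list)
    for x2 in L2:
        by_low_2[low_bits(x2, r)].append(x2)
    by_low_4 = defaultdict(list)
    for x4 in L4:
        by_low_4[low_bits(x4, r)].append(x4)
    solutions = []
    for tau in range(1 << r):                          # loop over the constraint
        left = defaultdict(list)                       # stored: sums x1^x2 with low_r = tau
        for x1 in L1:
            for x2 in by_low_2[low_bits(x1 ^ tau, r)]:
                left[x1 ^ x2].append((x1, x2))
        for x3 in L3:                                  # streamed: sums x3^x4 with low_r = low_r(t)^tau
            for x4 in by_low_4[low_bits(x3 ^ t ^ tau, r)]:
                for (x1, x2) in left.get(t ^ x3 ^ x4, []):
                    solutions.append((x1, x2, x3, x4))
    return solutions
```

Only the left-hand dictionary is stored per iteration of \(\tau \); the right-hand \(\frac{c}{2}\)-sums are generated and checked on the fly, which is what keeps the memory at \(\tilde{\mathcal {O}}(|L|)\) while the time stays \(\tilde{\mathcal {O}}(|L|^{2})\).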
The Schroeppel-Shamir algorithm is a special case of the Dissection technique of the subsequent section, run on 4 lists; Dissection further exploits the asymmetry between storing lists and computing (memory-free) lists on the fly.
5.2 Dissection
Dissection can be considered as a memory-friendly member of the class of k-list algorithms [31]. A Dissection algorithm is given c lists \(L_c,\ldots ,L_1\) and a target t. For simplicity the algorithm merely sums list elements to obtain target t rather than outputting a single-solution explicitly. One could keep track of indices of those elements that sum to t, but for ease of notation we omit this. Instead, we employ some abstract “index recovering” procedure later.
High-level Structure of Dissection. A Dissection identifying sums of vectors adding up to some target t consists of the following steps
- (1) A loop iterates over an artificially introduced constraint \(\tau \). For each \(\tau \):
- (1.1) A meet-in-the-middle approach is run to obtain a list L of sums from the first few lists that add up to \(\tau \) on some bits. List L is kept in memory.
- (1.2) Dissection is called recursively to find sums from the remaining lists that sum to \(t \oplus \tau \) on some bits. Found sums are passed to the parent call on-the-fly.
- (1.3) For all sums passed from the recursive call in step (1.2), list L is searched to construct sums adding up to t on some more bits.
Before giving the fully-fledged Dissection algorithm, let us give an 11-Dissection example.
Fig. 2. High-level structure of an 11-Dissection on input lists \(L_{11},\ldots ,L_1\). Recursively called 7- and 4-Dissection enclosed in dashed boxes. Arrows entering a list from the right side indicate a check on-the-fly. Arrows leaving a list on the left side indicate that a found match is returned on-the-fly. Arrows entering a list from the top indicate that the list below is populated with entries from above and stored entirely.
Example 5.1
(11-Dissection). An 11-Dissection is run on lists \(L_{11},\ldots ,L_1\) and some target t. It loops through an artificial constraint \(\tau _3\). Within each iteration: A list \(L_{(11,8)}\) containing sums \(l_{11} \oplus \ldots \oplus l_8\) consistent with constraint \(\tau _3\) where \(l_{11}\in L_{11}\), ..., \(l_8\in L_8\) is computed and stored. Then, still within the first iteration of the loop, a 7-Dissection is run (see Fig. 2) on lists \(L_7,\ldots ,L_1\) and a modified target \(t\oplus \tau _3\). The 7-Dissection itself introduces a constraint \(\tau _2\) and stores a list \(L_{(7,5)}\) containing sums of elements from \(L_7, L_6, L_5\) fulfilling \(\tau _2\). It recursively calls a 4-Dissection, i.e. Schroeppel-Shamir (Sect. 5.1), on target \(t \oplus \tau _3 \oplus \tau _2\). Internally, the 4-Dissection introduces another constraint \(\tau _1\).
Whenever a partial sum is identified by the 4-Dissection, it is passed to the 7-Dissection on-the-fly while the 4-Dissection carries on. See the “chain” at the bottom in Fig. 2 traversing lists from right to left. Once, the 7-Dissection receives a sum, it immediately checks for a match in list \(L_{(7,5)}\) and discards or returns it—in case of the latter—enriched by a matching element from \(L_{(7,5)}\) to the 11-Dissection and continues.
Whenever the 11-Dissection receives a sum from the 7-Dissection, it instantly checks for a match in list \(L_{(11,8)}\) to detect if a single-solution has been found.
Definition 5.2
(Join Operator). Let \(d\in \mathbb {N}\) and \(L_1\), \(L_2\in {(\mathbb {F}_2^d)}^*\) be lists. The join of \(L_1\) and \(L_2\) is defined as
$$L_1\bowtie L_2:=\left( l_1\oplus l_2 \;:\; l_1\in L_1,\, l_2\in L_2\right) .$$
For \(t\in \mathbb {F}_2^{\le d}\) the join of \(L_1\) and \(L_2\) on t is defined as
$$L_1\bowtie _t L_2:=\left( l_1\oplus l_2 \;:\; l_1\in L_1,\, l_2\in L_2,\, \mathrm {low}_{|t|}(l_1\oplus l_2)=t\right) .$$
If \(L_2=(l_2)\) we write \(L_1\bowtie _t l_2\) instead of \(L_1\bowtie _t(l_2)\).
Clearly, computing the elements contained in a sequence of joins can be implemented in a memory-friendly manner without having to explicitly compute intermediate lists.
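A direct Python rendering of the join operator (our notation, with vectors as integers); the generator variant indicates how a chain of joins can be evaluated without storing intermediate lists.

```python
def join(L1, L2):
    """L1 join L2: all pairwise XOR sums."""
    return [x ^ y for x in L1 for y in L2]

def join_on(L1, L2, t, r):
    """L1 join_t L2: pairwise XOR sums whose r low bits equal t."""
    mask = (1 << r) - 1
    return [x ^ y for x in L1 for y in L2 if (x ^ y) & mask == t]

def join_on_stream(L1, L2, t, r):
    """Generator variant: elements are produced on the fly and never stored."""
    mask = (1 << r) - 1
    for x in L1:
        for y in L2:
            if (x ^ y) & mask == t:
                yield x ^ y
```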
Definition 5.3
(The Magic Sequence [10]). Let \(c_{-1}:=1\) and
$$c_i := c_{i-1} + i + 1 \quad \text {for all } i \in \mathbb {N}\cup \{0\}.$$
The magic sequence is defined as \(\mathsf {magic}:=(c_i)_{i\in \mathbb {N}^{\ge 1}}\).
One easily verifies that, alternatively,
$$c_i = \frac{(i+1)(i+2)}{2}+1 \quad \text {for all } i \in \mathbb {N}^{\ge 1},$$
thus \(\mathsf {magic}=(4,7,11,16,22,\ldots )\).
As Dissection formulated in the language of k-list algorithms might be of independent interest, we refrain from adding \(c\text {-}\mathrm {SP}_b\)-solving related details to the Dissection algorithm presented next, but rather introduce a simple wrapper algorithm (see Algorithm 5) to solve the \(c\)-sum-problem afterwards.
We define the class of Dissection algorithms recursively. Let \(L_2,L_1\in {(\mathbb {F}_2^d)}^*\), \(d\in \mathbb {N}\) and \(t\in \mathbb {F}_2^{\le d}\). Then the base case is given by the join
$$c_0\text {-}\mathtt {Dissect}(L_2,L_1,t,\texttt {inner}):=L_2\bowtie _t L_1.$$
Here, “\(\texttt {inner}\)” indicates that \(c_0\text {-}\mathtt {Dissect}\) was called recursively by another Dissection algorithm, in contrast to an initial, explicit call to run Dissection, which is made with parameter “\(\texttt {outer}\)”.
We proceed to define \(c_i\text {-}\mathtt {Dissect}\) for \(i \ge 1\). Lists are denoted by L and input-lists are numbered \(L_{c_i}\) down to \(L_1\) for \(c_i\in \mathsf {magic}\). As a reading aid, list and element indices keep track of input-lists they originated from: A list \(L_{(j,i)}\) is the output of a join of lists \(L_j \bowtie \dots \bowtie L_i\) for \(j>i\). List elements are denoted l. We write \(l_{(j,i)}\) to indicate that \(l_{(j,i)}\) is a sum of elements \(l_\kappa \in L_\kappa \) for \(\kappa =j,\ldots ,i\).


The optimality of this recursive structure is shown in [10]. We now establish the required properties of Algorithm 4 in a series of lemmata. We give detailed proofs in the full version [13].
Lemma 5.2
(Correctness of \(c_i\text {-}\mathtt {Dissect}\)). For some fixed \(j_a\) let \(l_a:=L_a(j_a)\) denote the \(j_a^\mathrm {th}\) element of list \(L_a\). When \(c_i\text {-}\mathtt {Dissect}\) halts, set S contains \((j_{c_i},\ldots ,j_1)\in {[2^\lambda ]}^{c_i}\) if and only if \(\bigoplus _{a=1}^{c_i} l_a=t\).
Proof
See full version [13].
Lemma 5.3
(Memory Consumption of \(c_i\text {-}\mathtt {Dissect}\)). For all \(i\ge 1\) algorithm \(c_i\text {-}\mathtt {Dissect}\) requires expected memory \(\tilde{\mathcal {O}}(\max \{2^\lambda ,\mathbb {E}[|S|]\})\).
Proof
See full version [13].
Lemma 5.4
Let \(i\ge 1\) and consider one iteration of the for-loop in line 1 within a call of \(c_i\text {-}\mathtt {Dissect}\). Then, in total, expected \(\tilde{\mathcal {O}}(2^{c_{i-2}\cdot \lambda })\) elements are returned in line 5.
Proof
See full version [13].
Lemma 5.5
Let \(i\ge 1\). Algorithm \(c_i\text {-}\mathtt {Dissect}\), when called recursively (parameter \(\texttt {inner}\)), runs in expected time \(\tilde{\mathcal {O}}(2^{c_{i-1}\cdot \lambda })\).
Proof
See full version [13].
Lemma 5.6
(Running Time of \(c_i\text {-}\mathtt {Dissect}\)). Let \(i\ge 1\). A call of algorithm \(c_i\text {-}\mathtt {Dissect}\) with parameter \(\texttt {outer}\) runs in expected time \(\tilde{\mathcal {O}}(\max \{2^{c_{i-1}\cdot \lambda },\mathbb {E}[|S|]\})\).
Proof
See full version [13].
Lemma 5.7
Let \(b\in \mathbb {N}\) and \(c_i \in \mathsf {magic}\). For \(t\in \mathbb {F}_2^b\) let (L, t) be an instance of the \(c_i\text {-}\mathrm {SP}_b\) where \(|L|=N:= 2^{\frac{b+c_i\cdot \log c_i +1}{c_i-1}}\). Under the Independence Heuristic, Algorithm 5 solves the \(c_i\text {-}\mathrm {SP}_b\) with overwhelming probability in expected time \(T=\tilde{\mathcal {O}}(N^{c_{i-1}})\) and expected memory \(M=\tilde{\mathcal {O}}(N)\).
Proof
From Lemma 5.1 we know that at least N equally-split single-solutions exist with overwhelming probability. Lemma 5.2 ensures that \(c_i\text {-}\mathtt {Dissect}\) recovers all of them. Note that the lists defined in line 1 are of length \(2^{\lambda }:=\frac{N}{c_i}\).
One easily verifies that \(\lambda = \frac{b+\log c_i +1}{c_i-1} \le \frac{b}{i}\) as syntactically required by \(c_i\text {-}\mathtt {Dissect}\). Hence, Algorithm 5 solves the \(c_i\text {-}\mathrm {SP}_b\) with overwhelming probability.
From Lemma 5.3 we have
$$M=\tilde{\mathcal {O}}\left( \max \{2^{\lambda },\mathbb {E}[|S|]\}\right) .$$
Under the Independence Heuristic we obtain \( \mathbb {E}\big [|S|\big ]=\left( \frac{N}{c_i}\right) ^{c_i}\cdot 2^{-b}=2N \), where the last equality follows from Eq. (3). Therefore, \(M= \tilde{\mathcal {O}}(N)\).
From Lemma 5.6 we conclude
$$T=\tilde{\mathcal {O}}\left( \max \{2^{c_{i-1}\cdot \lambda },\mathbb {E}[|S|]\}\right) =\tilde{\mathcal {O}}\left( \max \{N^{c_{i-1}},2N\}\right) =\tilde{\mathcal {O}}\left( N^{c_{i-1}}\right) .$$
\(\square \)
Since \(c_{i-1}=c_i-i-1\) by Eq. (5) and \(i\approx \sqrt{2c_i}\) by Eq. (6), we see that \(c_i\text {-}\mathtt {Dissect}\) reduces the time complexity of solving the \(c\)-sum-problem from \(N^{c-1}\) (naive approach) down to roughly \(N^{c-\sqrt{2c}}\) (see also Table 1). Let Dissection-BKW denote the variant of \(c\)-sum-BKW where \(c\text {-}\mathtt {sum}\) is instantiated using Algorithm 5.
Theorem 5.1
(Dissection Trade-Off). Let \(c_i\in \mathsf {magic}\), \(\varepsilon >0\) and \(k\in \mathbb {N}\) sufficiently large. Under the Independence Heuristic, Dissection-BKW solves the \(\mathtt {LPN}_{k}\) problem with overwhelming success probability in time \(T=2^{\vartheta (1+\varepsilon )}\), using \(M=2^{\mu (1+\varepsilon )}\) memory and samples, where
$$\vartheta = \frac{c_{i-1}}{c_i-1}\cdot \log c_i\cdot \frac{k}{\log k}, \qquad \mu = \frac{\log c_i}{c_i-1}\cdot \frac{k}{\log k}.$$
Proof
Let \(N:= 2^{\frac{b+c_i\cdot \log c_i + 1}{c_i-1}}\). It follows from Lemma 5.7 and Theorem 3.2 that Dissection-BKW solves LPN with overwhelming probability. We now combine Lemma 5.7 and Theorem 3.2 to compute the running time T and memory complexity M of Dissection-BKW. We have
$$T=\left( T_{c_i,N}\right) ^{1+o(1)}=N^{c_{i-1}(1+o(1))}=2^{\frac{c_{i-1}}{c_i-1}\left( b+c_i\log c_i+1\right) (1+o(1))},$$
whereby \(c\)-sum-BKW operates on stripes of size \(b=\log c_i \cdot \frac{k \cdot (1+\varepsilon _a)}{\log k}\). Hence
$$T=2^{\frac{c_{i-1}}{c_i-1}\cdot \log c_i\cdot \frac{k}{\log k}\cdot (1+\varepsilon _a+o(1))}=2^{\vartheta (1+\varepsilon )}$$
for every \(\varepsilon >\varepsilon _a\) and sufficiently large k. Analogously we have for M:
$$M=N^{1+o(1)}=2^{\frac{b+c_i\log c_i+1}{c_i-1}(1+o(1))}=2^{\frac{\log c_i}{c_i-1}\cdot \frac{k}{\log k}\cdot (1+\varepsilon _a+o(1))}=2^{\mu (1+\varepsilon )}$$
for every \(\varepsilon >\varepsilon _a\) and sufficiently large k. The sample complexity of Dissection-BKW is \(N^{1+o(1)}=M\). \(\square \)
The trade-off of Theorem 5.1 clearly improves over the naive trade-off initially obtained in Theorem 4.1. While the memory consumption of the Dissection approach remains the same as for the naive trade-off, we can reduce the price we have to pay in time from \(\log c_i\) down to \((1-\frac{i}{c_i-1})\cdot \log c_i\).
The trade-off from Theorem 5.1 is given in Fig. 1 as diamonds. Although it improves over the naive trade-off from Sect. 4.1, this improvement comes at the price of covering only \(c_i\)-sums with \(c_i\in \mathsf {magic}\). Given some available memory M, one would choose the minimal \(c_i\in \mathsf {magic}\) such that Dissection-BKW consumes at most memory M. However, such a choice of \(c_i\) is unlikely to fully use M. In the following section, we show how to further speed up the algorithm by using M entirely.
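Using the exponents stated in Theorem 5.1 (an assumption of this sketch, which reproduces the introductory examples), the diamond points of Fig. 1 can be tabulated as follows; magic_pair computes \((c_{i-1},c_i)\) from Definition 5.3.

```python
from math import log2

def magic_pair(i):
    """Return (c_{i-1}, c_i) of the magic sequence c_j = c_{j-1} + j + 1, c_{-1} = 1."""
    c_prev, c = 1, 2                    # c_{-1}, c_0
    for j in range(1, i + 1):
        c_prev, c = c, c + j + 1
    return c_prev, c

def dissection_exponents(i):
    """(theta, mu) with T = 2^(theta * k/log k) and M = 2^(mu * k/log k)
    for c_i-Dissection-BKW, using the leading-order exponents of Theorem 5.1."""
    c_prev, c = magic_pair(i)
    return c_prev / (c - 1) * log2(c), log2(c) / (c - 1)

# dissection_exponents(1) == (4/3, 2/3); dissection_exponents(2) ~ (1.87, 0.47)
```

For i = 1 this reproduces the Schroeppel-Shamir point \((\frac{2}{3},\frac{4}{3})\) mentioned in the introduction.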
5.3 Tailored Dissection
Assume we run Dissection-BKW and our available memory is not fully used. Recall that \(c_i\text {-}\mathtt {Dissect}\) collects single-solutions while iterating an outermost loop over some constraint. In a nutshell, access to additional memory allows us to increase the size of lists \(L_{c_i},\ldots ,L_1\), where \(N=|L_{c_i}|+ \ldots + |L_1|\). Thereby, the number of equally-split single-solutions of a \(c\)-sum-problem instance increases significantly beyond N. As it suffices to identify (roughly) N single-solutions, we may prune the outermost loop of \(c_i\text {-}\mathtt {Dissect}\) to recover N (rather than all) equally-split single-solutions.
Yet, examining only a fraction of the solution space leads to recovering a corresponding fraction of the single-solutions only if the latter are distributed sufficiently uniformly (see Note 1).
Let us briefly recall that the total number of existing single-solutions, taken over the initial choice of input lists, is close to binomially distributed under the Independence Heuristic. This allowed us to show that Algorithm 5 succeeds with high probability, as sufficiently many single-solutions exist with high probability.
Now, let us denote the random variable of the number of single-solutions gained in the \(j^\mathrm {th}\) iteration of the outermost loop of \(c_i\text {-}\mathtt {Dissect}\) by \(Z_j\). In order to prove that a certain number of iterations is already sufficient to collect enough single-solutions with high probability, we require information on the distribution of sums of the \(Z_j\). However, as we show shortly, already \(Z_j\) alone is distributed rather awkwardly, and it seems to be a challenging task to obtain Chernoff-style results ensuring that the sum of the \(Z_j\) does not fall too far short of its expectation with high probability. In turn, we resort to the following heuristic.
Tailoring Heuristic. Let \(c_i\in \mathsf {magic}\). Let random variable \(Z_j\) denote the number of single-solutions gained in the \(j^\mathrm {th}\) iteration of the outermost for-loop of \(c_i\text {-}\mathtt {Dissect}\), taken over the initial choice of input lists. We heuristically assume that there exists a polynomial function \(\mathsf {poly}(\lambda )\), such that for all
we have

In particular, it follows from Equation (8) that for all \(\iota \le 2^{i\lambda }\) we have
That is, we can compensate the deviation of the sum of the \(Z_j\) below its expectation by iterating \(\mathsf {poly}(\lambda )\) times more often (see Note 2).
As for the Independence Heuristic we ran experiments to verify the Tailoring Heuristic (Sect. 6.2).

We stress that the only syntactical difference of \(\mathtt {tailored}\)-\(\mathtt {c}_{\mathtt {i}}\)-\(\mathtt {Dissect}\) compared to Algorithm 5 is the increase of the size of list L from N to \(N^\alpha \) for \(\alpha \ge 1\) and the halt condition added as a comment in line 2.
Lemma 5.8
(Tailored Dissection). Let \(b\in \mathbb {N}, c_i \in \mathsf {magic}\) and \(\alpha \in [1,\frac{c_{i}-1}{c_{i-1}}]\). For \(t\in \mathbb {F}_2^b\) let (L, t) be an instance of the \(c_i\text {-}\mathrm {SP}_b\) with \(|L|=N^\alpha \) whereby \(N:=2^{\frac{b+c_i\cdot \log c_i+1}{c_i-1}}\). Under the Independence and Tailoring Heuristics, \(\mathtt {tailored}\)-\(\mathtt {c}_{\mathtt {i}}\)-\(\mathtt {Dissect}\) solves the \(c_i\text {-}\mathrm {SP}_b\) with overwhelming probability in expected time \(T=\tilde{\mathcal {O}}\left( N^{c_{i-1}-i\cdot (\alpha -1)}\right) \) and expected memory \(M=\tilde{\mathcal {O}}\left( N^{\alpha }\right) \).
Proof
See full version [13].
The complexities of \(\mathtt {tailored}\)-\(\mathtt {c}_{\mathtt {i}}\)-\(\mathtt {Dissect}\) are given in Table 1, where we simplified \(\alpha \)'s upper bound using \(c_{i-1}\approx c_i-\sqrt{2c_i}\). Let tailored-Dissection-BKW denote \(c\)-sum-BKW, where the \(c\)-sum-problem is solved via \(\mathtt {tailored}\)-\(\mathtt {c}_{\mathtt {i}}\)-\(\mathtt {Dissect}\).
Theorem 5.2
(Tailored-Dissection Trade-Off). Let \(c_i\in \mathsf {magic}\) and further \(\alpha \in [1,\frac{c_{i}-1}{c_{i-1}}]\), \(\varepsilon >0\) and \(k\in \mathbb {N}\) sufficiently large. Under the Independence and Tailoring Heuristics, tailored-Dissection-BKW solves the \(\mathtt {LPN}_{k}\) problem with overwhelming success probability in time \(T=2^{\vartheta (1+\varepsilon )}\), using \(M=2^{\mu (1+\varepsilon )}\) memory and samples, where
$$\vartheta = \big (c_{i-1}-i\cdot (\alpha -1)\big )\cdot \frac{\log c_i}{c_i-1}\cdot \frac{k}{\log k}, \qquad \mu = \alpha \cdot \frac{\log c_i}{c_i-1}\cdot \frac{k}{\log k}.$$
Proof
See full version [13].
Intuitively, it is clear how to choose optimal parameters for tailored-Dissection-BKW for any given amount of memory M: find the minimal \(c_i\in \mathsf {magic}\) such that Dissection-BKW uses at most memory M. Then, resort to tailored-Dissection-BKW using memory M entirely by choosing the right \(\alpha \). The trade-off achieved via this approach is given in Fig. 1 (line segments). A formal justification of the approach is contained in the full version [13].
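A sketch of this parameter selection (our own helper, assuming the exponents of Theorems 5.1 and 5.2 and the bound \(\alpha \le \frac{c_i-1}{c_{i-1}}\) from Lemma 5.8); the memory budget is given as the exponent \(\mu \) with \(M = 2^{\mu \cdot k/\log k}\).

```python
from math import log2

def magic_pair(i):
    c_prev, c = 1, 2                    # c_{-1}, c_0 of the magic sequence
    for j in range(1, i + 1):
        c_prev, c = c, c + j + 1
    return c_prev, c

def tailored_parameters(mu_budget, max_i=20):
    """Given a memory budget M = 2^(mu_budget * k/log k), pick the smallest c_i whose
    plain Dissection memory fits, then alpha in [1, (c_i-1)/c_{i-1}] that uses M fully.
    Returns (c_i, alpha, theta) with running time T = 2^(theta * k/log k)."""
    for i in range(1, max_i + 1):
        c_prev, c = magic_pair(i)
        mu_plain = log2(c) / (c - 1)                  # memory exponent of plain Dissection-BKW
        if mu_plain <= mu_budget:
            alpha = min(mu_budget / mu_plain, (c - 1) / c_prev)
            theta = (c_prev - i * (alpha - 1)) * log2(c) / (c - 1)
            return c, alpha, theta
    raise ValueError("memory budget below the range covered by max_i")
```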
Fig. 3. Distribution of the number of single-solutions over successive runs of \(c_i\text {-}\mathtt {Dissect}\). Under the Independence Heuristic this distribution is close to \(\mathrm {Bin}_{{(N/c_i)}^{c_i},2^{-b}}\). All parameters are given in Table 2.
6 Experimental Verification of Heuristics
We present experimental results to verify our Independence Heuristic as well as our Tailoring Heuristic.
6.1 Experiments for the Independence Heuristic
We tested the Independence Heuristic for \(c_i\in \{4,7\}\). We iteratively monitored the number of single-solutions found after a run of \(c_i\text {-}\mathtt {Dissect}\) on successive stripes, starting with a list of size N. For each \(c_i\) we repeatedly called \(c_i\text {-}\mathtt {Dissect}\) on three stripes. After each call we stored the number of single-solutions found. To analyze the impact of dependencies amongst the list elements (rather than influences due to variations in the number of single-solutions found), we pruned lists of more than N single-solutions down to N before starting the next run, and discarded lists where fewer than N single-solutions were found.
Note that during the first run all list elements are independent and uniformly distributed, even if their c-sums are not. While the list elements remain uniform on a "fresh" stripe in subsequent runs of \(c_i\text {-}\mathtt {Dissect}\), they are not independent anymore, which could affect (besides c-sums being dependent) the distribution of existing single-solutions even further. Under the Independence Heuristic the number of single-solutions found is close to being \(\text {Bin}_{{(N/c_i)}^{c_i},2^{-b}}\) distributed after any run of \(c_i\text {-}\mathtt {Dissect}\).
Our experiments with parameters as given in Table 2 lead to the plots given in Fig. 3. The measured relative frequencies are given by points, while the continuous plots are the benchmark distributions \(\text {Bin}_{{(N/c_i)}^{c_i},2^{-b}}\) for our Independence Heuristic.
We see from Fig. 3 that the distribution of the output list-size is close to its benchmark, even after three iterations of the subroutine, where already \(4^3=64\)-sums (resp. \(7^3=343\)-sums) have been built.
We also used the Independence Heuristic in Lemma 5.8 to show that the random variables \(Z_j\) of the number of single-solutions gained in the \(j^\mathrm {th}\) iteration over a constraint are close to being binomially \(\text {Bin}_{x\cdot y,2^{-(b-i\lambda )}}\) distributed, where \(x \sim \text {Bin}_{2^{(i+1)\cdot \lambda },2^{-i\lambda }}\) and \(y \sim \text {Bin}_{2^{\lambda c_{i-1}},2^{-i\lambda }}\). In order to verify this experimentally, we computed several instances with parameter set
Each time we performed three consecutive runs of \(c_i\text {-}\mathtt {Dissect}\) on successive stripes and stored the number of single-solutions obtained per constraint after each iteration. The results are given in Fig. 4. Again, the obtained relative frequencies accurately match the benchmark curve \(\text {Bin}_{x\cdot y,2^{-(b-i\lambda )}}\).
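For small parameters the benchmark distribution can also be reproduced directly by brute force; the following sketch (our own test harness, not the code used for the experiments reported here) samples fresh lists and counts equally-split single-solutions, whose histogram over many repetitions can then be compared with \(\text {Bin}_{(N/c_i)^{c_i},2^{-b}}\).

```python
import random

def equally_split_solution_count(N, b, c, t=0):
    """Sample c lists of N/c uniform b-bit vectors and count equally-split
    single-solutions XOR-ing to t (brute force; only feasible for tiny N, c)."""
    parts = [[random.randrange(1 << b) for _ in range(N // c)] for _ in range(c)]
    count = 0
    def rec(depth, acc):
        nonlocal count
        if depth == c:
            count += (acc == t)
            return
        for v in parts[depth]:
            rec(depth + 1, acc ^ v)
    rec(0, 0)
    return count

# repeat e.g. equally_split_solution_count(64, 12, 4) many times and compare the
# histogram with Bin_{(N/c)^c, 2^-b}
```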
6.2 Experiments on the Tailoring Heuristic
As in the previous subsection, let \(Z_j\) be the number of single-solutions obtained per constraint iteration in \(c_i\text {-}\mathtt {Dissect}\). In Tailored Dissection it is required that the sum of these random variables is close to its expectation. The Tailoring Heuristic states that this is true with overwhelming probability, if we slightly increase the expected number of required iterations by a polynomial factor.

To test the Tailoring Heuristic we ran \(c_i\text {-}\mathtt {Dissect}\) (without tailoring) on three stripes (\(a=3\)) with \(N=8192\) and varying choice of b. We summed the numbers \(Z_j\) of single-solutions found per randomized constraint \(\tau _j\) during the last run, until their sum exceeded N. The results can be found in Table 3 and Fig. 5.
We see in Table 3 that the experimentally required numbers of iterations very accurately match their theoretical predictions (that were computed under the Independence Heuristic). Moreover, we experimentally need only a small factor to achieve a 99% confidence interval, even for low expectations. This means that the distribution has small variance and sharply concentrates around its mean, as can also be seen in Fig. 5. This all supports the validity of our Tailoring Heuristic for the analysis of Tailored Dissection BKW.
Notes
- 1.
Imagine many single-solutions concentrated on few constraints \(\tau _i\) as a counterexample.
- 2.
Clearly, it does not make sense to iterate \(\iota \cdot \mathsf {poly}(\lambda )>2^{i\cdot \lambda }\) times. However, once \(\iota \cdot \mathsf {poly}(\lambda )>2^{i\cdot \lambda }\), iterating \(2^{i\lambda }\) times is sufficient to collect enough single-solutions, as shown in Lemma 3.1.
References
Alekhnovich, M.: More on average case vs approximation complexity. In: 44th FOCS, pp. 298–307. IEEE Computer Society Press, October 2003
Bai, S., Laarhoven, T., Stehlé, D.: Tuple lattice sieving. LMS J. Comput. Math. 19(A), 146–162 (2016)
Blum, A., Furst, M.L., Kearns, M.J., Lipton, R.J.: Cryptographic primitives based on hard learning problems. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 278–291. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-48329-2_24
Blum, A., Kalai, A., Wasserman, H.: Noise-tolerant learning, the parity problem, and the statistical query model. In: 32nd ACM STOC, pp. 435–440. ACM Press, May 2000
Bogos, S., Tramèr, F., Vaudenay, S.: On solving LPN using BKW and variants - implementation and analysis. Crypt. Commun. 8(3), 331–369 (2016). https://doi.org/10.1007/s12095-015-0149-2
Bogos, S., Vaudenay, S.: Optimization of LPN solving algorithms. In: Cheon, J.H., Takagi, T. (eds.) ASIACRYPT 2016, Part I. LNCS, vol. 10031, pp. 703–728. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-53887-6_26
Boyer, M., Brassard, G., Høyer, P., Tapp, A.: Tight bounds on quantum searching. arXiv preprint quant-ph/9605034 (1996)
Devadas, S., Ren, L., Xiao, H.: On iterative collision search for LPN and subset sum. In: Kalai, Y., Reyzin, L. (eds.) TCC 2017, Part II. LNCS, vol. 10678, pp. 729–746. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70503-3_24
Dinur, I., Dunkelman, O., Keller, N., Shamir, A.: Efficient dissection of composite problems, with applications to cryptanalysis, knapsacks, and combinatorial search problems. In: Safavi-Naini, R., Canetti, R. (eds.) CRYPTO 2012. LNCS, vol. 7417, pp. 719–740. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32009-5_42
Dohotaru, C., Høyer, P.: Exact quantum lower bound for Grover's problem. arXiv preprint arXiv:0810.3647 (2008)
Ducas, L.: Shortest vector from lattice sieving: a few dimensions for free. In: Nielsen, J.B., Rijmen, V. (eds.) EUROCRYPT 2018, Part I. LNCS, vol. 10820, pp. 125–145. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-78381-9_5
Esser, A., Heuer, F., Kübler, R., May, A., Sohler, C.: Dissection-BKW. Cryptology ePrint Archive, Report 2018/569 (2018). https://eprint.iacr.org/2018/569
Esser, A., Kübler, R., May, A.: LPN decoded. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017, Part II. LNCS, vol. 10402, pp. 486–514. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63715-0_17
Grover, L.K.: A fast quantum mechanical algorithm for database search. In: 28th ACM STOC, pp. 212–219. ACM Press, May 1996
Guo, Q., Johansson, T., Löndahl, C.: Solving LPN using covering codes. In: Sarkar, P., Iwata, T. (eds.) ASIACRYPT 2014, Part I. LNCS, vol. 8873, pp. 1–20. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-45611-8_1
Guo, Q., Johansson, T., Stankovski, P.: Coded-BKW: solving LWE using lattice codes. In: Gennaro, R., Robshaw, M.J.B. (eds.) CRYPTO 2015, Part I. LNCS, vol. 9215, pp. 23–42. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-47989-6_2
Herold, G., Kirshanova, E.: Improved algorithms for the approximate k-list problem in euclidean norm. In: Fehr, S. (ed.) PKC 2017, Part I. LNCS, vol. 10174, pp. 16–40. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54365-8_2
Herold, G., Kirshanova, E., Laarhoven, T.: Speed-Ups and time–memory trade-offs for tuple lattice sieving. In: Abdalla, M., Dahab, R. (eds.) PKC 2018, Part I. LNCS, vol. 10769, pp. 407–436. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-76578-5_14
Howgrave-Graham, N., Joux, A.: New generic algorithms for hard knapsacks. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 235–256. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13190-5_12
Kannan, R.: Improved algorithms for integer programming and related lattice problems. In: 15th ACM STOC, pp. 193–206. ACM Press, April 1983
Laarhoven, T.: Sieving for shortest vectors in lattices using angular locality-sensitive hashing. In: Gennaro, R., Robshaw, M.J.B. (eds.) CRYPTO 2015, Part I. LNCS, vol. 9215, pp. 3–22. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-47989-6_1
Laarhoven, T., Mariano, A.: Progressive lattice sieving. In: Lange, T., Steinwandt, R. (eds.) PQCrypto 2018. LNCS, vol. 10786, pp. 292–311. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-79063-3_14
Laarhoven, T., de Weger, B.: Faster sieving for shortest lattice vectors using spherical locality-sensitive hashing. In: Lauter, K.E., Rodríguez-Henríquez, F. (eds.) LATINCRYPT 2015. LNCS, vol. 9230, pp. 101–118. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-22174-8_6
Levieil, É., Fouque, P.-A.: An improved LPN algorithm. In: De Prisco, R., Yung, M. (eds.) SCN 2006. LNCS, vol. 4116, pp. 348–359. Springer, Heidelberg (2006). https://doi.org/10.1007/11832072_24
Lyubashevsky, V., Peikert, C., Regev, O.: A toolkit for ring-LWE cryptography. In: Johansson, T., Nguyen, P.Q. (eds.) EUROCRYPT 2013. LNCS, vol. 7881, pp. 35–54. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38348-9_3
Mitzenmacher, M., Upfal, E.: Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, New York (2005)
Regev, O.: New lattice based cryptographic constructions. In: 35th ACM STOC, pp. 407–416. ACM Press, June 2003
Regev, O.: On lattices, learning with errors, random linear codes, and cryptography. J. ACM 56(6), 34:1–34:40 (2009). https://doi.org/10.1145/1568318.1568324
Schroeppel, R., Shamir, A.: A T=O(2\({}^{\text{ n/2 }}\)), S=O(2\({}^{\text{ n/4 }}\)) algorithm for certain NP-complete problems. SIAM J. Comput. 10(3), 456–464 (1981). https://doi.org/10.1137/0210033
Wagner, D.: A Generalized birthday problem. In: Yung, M. (ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 288–304. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45708-9_19
Zhang, B., Jiao, L., Wang, M.: Faster algorithms for solving LPN. In: Fischlin, M., Coron, J.-S. (eds.) EUROCRYPT 2016, Part I. LNCS, vol. 9665, pp. 168–195. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49890-3_7
Acknowledgements
We would like to thank Eamonn Postlethwaite for his detailed feedback and helpful suggestions on an earlier version of this paper. We are grateful to the anonymous CRYPTO reviewers for their valuable comments.
Andre Esser was supported by DFG Research Training Group GRK 1817. Felix Heuer, Alexander May and Christian Sohler were supported by Mercator Research Center Ruhr, project “LPN-Krypt”.
© 2018 International Association for Cryptologic Research