Abstract
We present a theory for Euclidean dimensionality reduction with subgaussian matrices which unifies several restricted isometry property and Johnson–Lindenstrauss-type results obtained earlier for specific datasets. In particular, we recover and, in several cases, improve results for sets of sparse and structured sparse vectors, low-rank matrices and tensors, and smooth manifolds. In addition, we establish a new Johnson–Lindenstrauss embedding for datasets taking the form of an infinite union of subspaces of a Hilbert space.
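As a concrete illustration of the kind of map the abstract describes (a sketch for orientation only, not code from the paper), the snippet below draws a scaled Rademacher matrix, one standard example of a subgaussian embedding, and empirically checks how well it preserves pairwise Euclidean distances on a random point set. All dimensions and parameter choices here are illustrative.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

n, m, N = 1000, 100, 50          # ambient dimension, target dimension, number of points
X = rng.standard_normal((N, n))  # dataset: N points in R^n

# Rademacher (+/-1) entries are subgaussian; the 1/sqrt(m) scaling makes
# E ||A x||_2^2 = ||x||_2^2 for every fixed vector x.
A = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)

Y = X @ A.T                      # embedded points in R^m

# Worst-case multiplicative distortion over all pairwise distances:
# Johnson-Lindenstrauss-type results say these ratios concentrate near 1.
ratios = [np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
          for i, j in combinations(range(N), 2)]
print(f"distance ratios lie in [{min(ratios):.3f}, {max(ratios):.3f}]")
```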
References
D. Achlioptas. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J. Comput. System Sci., 66(4):671–687, 2003.
R. Adamczak, R. Latała, A. Litvak, A. Pajor, and N. Tomczak-Jaegermann. Geometry of log-concave ensembles of random matrices and approximate reconstruction. C. R. Math. Acad. Sci. Paris, 349(13-14):783–786, 2011.
P. Agarwal, S. Har-Peled, and H. Yu. Embeddings of surfaces, curves, and moving points in Euclidean space. SIAM J. Comput., 42(2):442–458, 2013.
N. Alon. Problems and results in extremal combinatorics. I. Discrete Math., 273(1-3):31–53, 2003.
R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin. A simple proof of the restricted isometry property for random matrices. Constr. Approx., 28(3):253–263, 2008.
R. Baraniuk and M. Wakin. Random projections of smooth manifolds. Found. Comput. Math., 9(1):51–77, 2009.
G. Biau, L. Devroye, and G. Lugosi. On the performance of clustering in Hilbert spaces. IEEE Trans. Inform. Theory, 54(2):781–790, 2008.
E. Bingham and H. Mannila. Random projection in dimensionality reduction: applications to image and text data. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 245–250. ACM, 2001.
T. Blumensath. Sampling and reconstructing signals from a union of linear subspaces. IEEE Trans. Inform. Theory, 57(7):4660–4671, 2011.
T. Blumensath and M. Davies. Sampling theorems for signals from the union of finite-dimensional linear subspaces. IEEE Trans. Inform. Theory, 55(4):1872–1882, 2009.
E. Candès and Y. Plan. Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements. IEEE Trans. Inform. Theory, 57(4):2342–2359, 2011.
E. Candès and T. Tao. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inform. Theory, 52(12):5406–5425, 2006.
K. Clarkson. Tighter bounds for random projections of manifolds. In Proceedings of the Twenty-Fourth Annual Symposium on Computational Geometry, pages 39–48. ACM, 2008.
S. Dasgupta. Learning mixtures of Gaussians. In 40th Annual Symposium on Foundations of Computer Science (New York, 1999), pages 634–644. IEEE Computer Soc., Los Alamitos, CA, 1999.
S. Dasgupta and A. Gupta. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures Algorithms, 22(1):60–65, 2003.
S. Dirksen. Tail bounds via generic chaining. Electron. J. Probab., 20:no. 53, 1–29, 2015.
D. Donoho. Compressed sensing. IEEE Trans. Inform. Theory, 52(4):1289–1306, 2006.
P. Drineas, M. Mahoney, S. Muthukrishnan, and T. Sarlós. Faster least squares approximation. Numer. Math., 117(2):219–249, 2011.
A. Eftekhari and M. Wakin. New analysis of manifold embeddings and signal recovery from compressive measurements. Appl. Comput. Harmon. Anal., 39(1):67–109, 2015.
Y. Eldar and M. Mishali. Robust recovery of signals from a structured union of subspaces. IEEE Trans. Inform. Theory, 55(11):5302–5316, 2009.
S. Foucart and H. Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhäuser, Boston, 2013.
P. Frankl and H. Maehara. The Johnson-Lindenstrauss lemma and the sphericity of some graphs. J. Combin. Theory Ser. B, 44(3):355–362, 1988.
R. Giryes, S. Nam, M. Elad, R. Gribonval, and M. E. Davies. Greedy-like algorithms for the cosparse analysis model. Linear Algebra Appl., 441:22–60, 2014.
Y. Gordon. On Milman’s inequality and random subspaces which escape through a mesh in \({\bf R}^n\). In Geometric aspects of functional analysis (1986/87), volume 1317 of Lecture Notes in Math., pages 84–106. Springer, Berlin, 1988.
C. Hegde, M. Wakin, and R. Baraniuk. Random projections for manifold learning. In Advances in Neural Information Processing Systems, pages 641–648, 2007.
P. Indyk and R. Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC ’98, pages 604–613, New York, NY, USA, 1998. ACM.
P. Indyk and A. Naor. Nearest-neighbor-preserving embeddings. ACM Trans. Algorithms, 3(3):Art. 31, 12, 2007.
T. Jayram and D. Woodruff. Optimal bounds for Johnson-Lindenstrauss transforms and streaming problems with subconstant error. ACM Trans. Algorithms, 9(3):Art. 26, 17, 2013.
W. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. In Conference in modern analysis and probability (New Haven, Conn., 1982), volume 26 of Contemp. Math., pages 189–206. Amer. Math. Soc., Providence, RI, 1984.
A. Kalai, A. Moitra, and G. Valiant. Disentangling Gaussians. Commun. ACM, 55(2):113–120, February 2012.
B. Klartag and S. Mendelson. Empirical processes and random projections. J. Funct. Anal., 225(1):229–245, 2005.
V. Koltchinskii. Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems. Springer, Berlin, 2011.
K. Larsen and J. Nelson. The Johnson-Lindenstrauss lemma is optimal for linear dimensionality reduction. arXiv:1411.2404, 2014.
G. Lecué and S. Mendelson. Sparse recovery under weak moment assumptions. arXiv:1401.2188, 2014.
J. Lee. Riemannian Manifolds: An Introduction to Curvature. Springer-Verlag, New York, 1997.
Y. Lu and M. Do. A theory for sampling signals from a union of subspaces. IEEE Trans. Signal Process., 56(6):2334–2345, 2008.
O. Maillard and R. Munos. Linear regression with random projections. J. Mach. Learn. Res., 13:2735–2772, 2012.
W. Mantzel. Parametric estimation of randomly compressed functions. PhD thesis, Georgia Institute of Technology, 2013.
W. Mantzel and J. Romberg. Compressed subspace matching on the continuum. Information and Inference, 4(2):79–107, 2015.
W. Mantzel, J. Romberg, and K. Sabra. Compressive matched-field processing. The Journal of the Acoustical Society of America, 132(1), 2012.
J. Matoušek. On variants of the Johnson-Lindenstrauss lemma. Random Structures Algorithms, 33(2):142–156, 2008.
S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Reconstruction and subgaussian operators in asymptotic geometric analysis. Geom. Funct. Anal., 17(4):1248–1282, 2007.
S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Uniform uncertainty principle for Bernoulli and subgaussian ensembles. Constr. Approx., 28(3):277–289, 2008.
S. Nam, M. Davies, M. Elad, and R. Gribonval. The cosparse analysis model and algorithms. Appl. Comput. Harmon. Anal., 34(1):30–56, 2013.
P. Niyogi, S. Smale, and S. Weinberger. Finding the homology of submanifolds with high confidence from random samples. Discrete Comput. Geom., 39(1-3):419–441, 2008.
M. Raginsky, R. Willett, Z. Harmany, and R. Marcia. Compressed sensing performance bounds under Poisson noise. IEEE Trans. Signal Process., 58(8):3990–4002, 2010.
H. Rauhut, R. Schneider, and Z. Stojanac. Low rank tensor recovery via iterative hard thresholding. In Proc. 10th International Conference on Sampling Theory and Applications, 2013.
B. Recht, M. Fazel, and P. Parrilo. Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev., 52(3):471–501, 2010.
T. Sarlós. Improved approximation algorithms for large matrices via random projections. In 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2006), pages 143–152, 2006.
L. Schulman. Clustering for edge-cost minimization. In Proceedings of the Thirty-second Annual ACM Symposium on Theory of Computing, STOC ’00, pages 547–555, New York, NY, USA, 2000. ACM.
G. Stewart. Error and perturbation bounds for subspaces associated with certain eigenvalue problems. SIAM Rev., 15:727–764, 1973.
M. Talagrand. Regularity of Gaussian processes. Acta Math., 159(1-2):99–149, 1987.
M. Talagrand. Majorizing measures without measures. Ann. Probab., 29(1):411–417, 2001.
M. Talagrand. The generic chaining. Springer-Verlag, Berlin, 2005.
S. Vempala. The random projection method. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 65. American Mathematical Society, Providence, RI, 2004.
N. Verma. Distance preserving embeddings for general \(n\)-dimensional manifolds. J. Mach. Learn. Res., 14:2415–2448, 2013.
Acknowledgments
The research for this paper was initiated after a discussion with Justin Romberg about compressive parameter estimation. The author would like to thank him for providing William Mantzel's PhD thesis [38], and the two reviewers for their useful comments.
Additional information
Communicated by Thomas Strohmer.
This research was supported by SFB grant 1060 of the Deutsche Forschungsgemeinschaft (DFG).
About this article
Cite this article
Dirksen, S. Dimensionality Reduction with Subgaussian Matrices: A Unified Theory. Found Comput Math 16, 1367–1396 (2016). https://doi.org/10.1007/s10208-015-9280-x
Keywords
- Random dimensionality reduction
- Johnson–Lindenstrauss embeddings
- Restricted isometry properties
- Compressed sensing
- Union of subspaces