Distribution Testing Lower Bounds via Reductions from Communication Complexity

Distribution Testing Lower Bounds via Reductions from Communication Complexity

Authors Eric Blais, Clément L. Canonne, Tom Gur



PDF
Thumbnail PDF

File

LIPIcs.CCC.2017.28.pdf
  • Filesize: 1.06 MB
  • 40 pages

Document Identifiers

Author Details

Eric Blais
Clément L. Canonne
Tom Gur

Cite As Get BibTex

Eric Blais, Clément L. Canonne, and Tom Gur. Distribution Testing Lower Bounds via Reductions from Communication Complexity. In 32nd Computational Complexity Conference (CCC 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 79, pp. 28:1-28:40, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.CCC.2017.28

Abstract

We present a new methodology for proving distribution testing lower bounds, establishing a connection between distribution testing and the simultaneous message passing (SMP) communication model. Extending the framework of Blais, Brody, and Matulef (Computational Complexity, 2012), we show a simple way to reduce (private-coin) SMP problems to distribution testing problems. This method allows us to prove new distribution testing lower bounds, as well as to provide simple proofs of known lower bounds.

Our main result is concerned with testing identity to a specific distribution p, given as a parameter. In a recent and influential work, Valiant and Valiant (FOCS, 2014) showed that the sample complexity of the aforementioned problem is closely related to the 2/3-quasinorm of p. We obtain alternative bounds on the complexity of this problem in terms of an arguably more intuitive measure and using simpler proofs. More specifically, we prove that the sample complexity is essentially determined by a fundamental operator in the theory of interpolation of Banach spaces, known as Peetre's K-functional. We show that this quantity is closely related to the size of the effective support of p (loosely speaking, the number of supported elements that constitute the vast majority of the mass of p). This result, in turn, stems from an unexpected connection to functional analysis and refined concentration of measure inequalities, which arise naturally in our reduction.

Subject Classification

Keywords
  • Distribution testing
  • communication complexity
  • lower bounds
  • simultaneous message passing
  • functional analysis

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Jayadev Acharya, Clément L. Canonne, and Gautam Kamath. A Chasm Between Identity and Equivalence Testing with Conditional Queries. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015), volume 40 of Leibniz International Proceedings in Informatics (LIPIcs), pages 449-466. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015. URL: http://dx.doi.org/10.4230/LIPIcs.APPROX-RANDOM.2015.449.
  2. Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, and Shengjun Pan. Competitive closeness testing. In Proceedings of COLT, pages 47-68, 2011. Google Scholar
  3. Jayadev Acharya and Constantinos Daskalakis. Testing Poisson Binomial Distributions. In Proceedings of SODA, pages 1829-1840, 2015. Google Scholar
  4. Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath. Optimal testing for properties of distributions. In Proceedings of NIPS, pages 3577-3598, 2015. Google Scholar
  5. Maryam Aliakbarpour, Eric Blais, and Ronitt Rubinfeld. Learning and testing junta distributions. In Proceedings of COLT, volume 49 of JMLR Workshop and Conference Proceedings, pages 19-46. JMLR.org, 2016. Google Scholar
  6. Sergey V. Astashkin. Rademacher functions in symmetric spaces. Journal of Mathematical Sciences, 169(6):725-886, sep 2010. URL: http://dx.doi.org/10.1007/s10958-010-0074-z.
  7. Tuğkan Batu, Sanjoy Dasgupta, Ravi Kumar, and Ronitt Rubinfeld. The complexity of approximating the entropy. SIAM Journal on Computing, 35(1):132-150, 2005. Google Scholar
  8. Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White. Testing random variables for independence and identity. In Proceedings of FOCS, pages 442-451, 2001. Google Scholar
  9. Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing that distributions are close. In Proceedings of FOCS, pages 189-197, 2000. Google Scholar
  10. Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing closeness of discrete distributions. ArXiV, abs/1009.5397, 2010. This is a long version of [9]. Google Scholar
  11. Tuğkan Batu, Ravi Kumar, and Ronitt Rubinfeld. Sublinear algorithms for testing monotone and unimodal distributions. In Proceedings of STOC, pages 381-390, New York, NY, USA, 2004. ACM. URL: http://dx.doi.org/10.1145/1007352.1007414.
  12. Colin Bennett and Robert C. Sharpley. Interpolation of Operators. Pure and Applied Mathematics. Elsevier Science, 1988. URL: https://books.google.com/books?id=HpqF9zjZWMMC.
  13. Bhaswar B. Bhattacharya and Gregory Valiant. Testing closeness with unequal sized samples. In Proceedings of NIPS, pages 2611-2619, 2015. Google Scholar
  14. Arnab Bhattacharyya, Eldar Fischer, Ronitt Rubinfeld, and Paul Valiant. Testing monotonicity of distributions over general partial orders. In Proceedings of ITCS, pages 239-252, 2011. URL: http://conference.itcs.tsinghua.edu.cn/ICS2011/content/papers/38.html.
  15. Eric Blais, Joshua Brody, and Kevin Matulef. Property testing lower bounds via communication complexity. Computational Complexity, 21(2):311-358, 2012. URL: http://dx.doi.org/10.1007/s00037-012-0040-x.
  16. Joshua Brody, Kevin Matulef, and Chenggang Wu. Lower bounds for testing computability by small width OBDDs. In TAMC, volume 6648 of Lecture Notes in Computer Science, pages 320-331. Springer, 2011. Google Scholar
  17. Clément L. Canonne. Big Data on the Rise? Testing Monotonicity of Distributions. In Proceedings of ICALP, pages 294-305. Springer, 2015. URL: http://dx.doi.org/10.1007/978-3-662-47672-7_24.
  18. Clément L. Canonne. A Survey on Distribution Testing: your data is Big. But is it Blue? Electronic Colloquium on Computational Complexity (ECCC), 22:63, April 2015. URL: https://eccc.weizmann.ac.il/report/2015/063/.
  19. Clément L. Canonne. Are Few Bins Enough: Testing Histogram Distributions. In Proceedings of PODS. Association for Computing Machinery (ACM), 2016. URL: http://dx.doi.org/10.1145/2902251.2902274.
  20. Clément L. Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld. Testing Shape Restrictions of Discrete Distributions. In 33rd Symposium on Theoretical Aspects of Computer Science (STACS 2016), volume 47 of Leibniz International Proceedings in Informatics (LIPIcs), pages 25:1-25:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.STACS.2016.25.
  21. Clément L. Canonne, Dana Ron, and Rocco A. Servedio. Testing probability distributions using conditional samples. SIAM Journal on Computing, 44(3):540-616, 2015. Also available on arXiv at http://arxiv.org/abs/1211.2664. URL: http://dx.doi.org/10.1137/130945508.
  22. Sourav Chakraborty, Eldar Fischer, Yonatan Goldhirsh, and Arie Matsliah. On the power of conditional samples in distribution testing. In Proceedings of ITCS, pages 561-580, New York, NY, USA, 2013. ACM. URL: http://dx.doi.org/10.1145/2422436.2422497.
  23. Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant. Optimal algorithms for testing closeness of discrete distributions. In Proceedings of SODA, pages 1193-1203. Society for Industrial and Applied Mathematics (SIAM), 2014. Google Scholar
  24. Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio, Gregory Valiant, and Paul Valiant. Testing k-modal distributions: Optimal algorithms via reductions. In Proceedings of SODA, pages 1833-1852. Society for Industrial and Applied Mathematics (SIAM), 2013. URL: http://dl.acm.org/citation.cfm?id=2627817.2627948.
  25. Ilias Diakonikolas and Daniel M. Kane. A new approach for testing properties of discrete distributions. In Proceedings of FOCS. IEEE Computer Society, 2016. Google Scholar
  26. Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. Optimal Algorithms and Lower Bounds for Testing Closeness of Structured Distributions. In Proceedings of FOCS. Institute of Electrical and Electronics Engineers (IEEE), oct 2015. URL: http://dx.doi.org/10.1109/focs.2015.76.
  27. Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. Testing Identity of Structured Distributions. In Proceedings of SODA, pages 1841-1854. Society for Industrial and Applied Mathematics (SIAM), January 2015. Google Scholar
  28. Moein Falahatgar, Ashkan Jafarpour, Alon Orlitsky, Venkatadheeraj Pichapathi, and Ananda Theertha Suresh. Faster algorithms for testing under conditional sampling. ArXiV, abs/1504.04103, April 2015. Google Scholar
  29. Eldar Fischer, Oded Lachish, and Yadu Vasudev. Improving and extending the testing of distributions for shape-restricted properties. ArXiV, abs/1609.06736, September 2016. Google Scholar
  30. Oded Goldreich. The uniform distribution is complete with respect to testing identity to a fixed distribution. Electronic Colloquium on Computational Complexity (ECCC), 23:15, 2016. Google Scholar
  31. Oded Goldreich, Shafi Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. Journal of the ACM, 45(4):653-750, July 1998. Google Scholar
  32. Oded Goldreich and Dana Ron. On testing expansion in bounded-degree graphs. Electronic Colloquium on Computational Complexity (ECCC), 7:20, 2000. URL: https://eccc.weizmann.ac.il/report/2000/020/.
  33. Paweł Hitczenko and Stanisław Kwapień. On the Rademacher series. In Probability in Banach spaces, 9 (Sandjberg, 1993), volume 35 of Progr. Probab., pages 31-36. Birkhäuser Boston, Boston, MA, 1994. Google Scholar
  34. Tord Holmstedt. Interpolation of Quasi-Normed Spaces. Mathematica Scandinavica, 26(0):177-199, 1970. URL: http://www.mscand.dk/article/view/10976.
  35. Piotr Indyk, Reut Levi, and Ronitt Rubinfeld. Approximating and Testing k-Histogram Distributions in Sub-linear Time. In Proceedings of PODS, pages 15-22, 2012. Google Scholar
  36. Jiantao Jiao, Kartik Venkat, and Tsachy Weissman. Order-optimal estimation of functionals of discrete distributions. ArXiV, abs/1406.6956, 2014. URL: http://arxiv.org/abs/1406.6956.
  37. Reut Levi, Dana Ron, and Ronitt Rubinfeld. Testing properties of collections of distributions. Theory of Computing, 9:295-347, 2013. URL: http://dx.doi.org/10.4086/toc.2013.v009a008.
  38. Stephen J. Montgomery-Smith. The distribution of Rademacher sums. Proceedings of the American Mathematical Society, 109(2):517-522, 1990. Google Scholar
  39. Ilan Newman. Property testing of massively parametrized problems - A survey. In Property Testing, volume 6390 of Lecture Notes in Computer Science, pages 142-157. Springer, 2010. Google Scholar
  40. Ilan Newman and Mario Szegedy. Public vs. private coin flips in one round communication games. In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, pages 561-570. ACM, 1996. Google Scholar
  41. Liam Paninski. Estimating entropy on m bins given fewer than m samples. IEEE Transactions on Information Theory, 50(9):2200-2203, 2004. Google Scholar
  42. Liam Paninski. A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Transactions on Information Theory, 54(10):4750-4755, 2008. URL: http://dx.doi.org/10.1109/TIT.2008.928987.
  43. Jaak Peetre. A theory of interpolation of normed spaces. Notas de Matemática, No. 39. Instituto de Matemática Pura e Aplicada, Conselho Nacional de Pesquisas, Rio de Janeiro, 1968. Google Scholar
  44. David Pollard. Asymptopia. http://www.stat.yale.edu/~pollard/Books/Asymptopia, 2003. Manuscript.
  45. Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith. Strong lower bounds for approximating distributions support size and the distinct elements problem. SIAM Journal on Computing, 39(3):813-842, 2009. Google Scholar
  46. Ronitt Rubinfeld. Taming big probability distributions. XRDS: Crossroads, The ACM Magazine for Students, 19(1):24, sep 2012. URL: http://dx.doi.org/10.1145/2331042.2331052.
  47. Ronitt Rubinfeld and Rocco A. Servedio. Testing monotone high-dimensional distributions. Random Structures and Algorithms, 34(1):24-44, January 2009. URL: http://dx.doi.org/10.1002/rsa.v34:1.
  48. Ronitt Rubinfeld and Madhu Sudan. Robust characterization of polynomials with applications to program testing. SIAM Journal on Computing, 25(2):252-271, 1996. Google Scholar
  49. Gregory Valiant and Paul Valiant. A CLT and tight lower bounds for estimating entropy. Electronic Colloquium on Computational Complexity (ECCC), 17:179, 2010. URL: https://eccc.weizmann.ac.il/report/2010/179/.
  50. Gregory Valiant and Paul Valiant. Estimating the unseen: A sublinear-sample canonical estimator of distributions. Electronic Colloquium on Computational Complexity (ECCC), 17:180, 2010. URL: https://eccc.weizmann.ac.il/report/2010/180/.
  51. Gregory Valiant and Paul Valiant. Estimating the unseen: an n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs. In Proceedings of STOC, pages 685-694. ACM, 2011. Google Scholar
  52. Gregory Valiant and Paul Valiant. The power of linear estimators. In Proceedings of FOCS, pages 403-412, October 2011. See also \citeValiantValiant:10lb and [50]. URL: http://dx.doi.org/10.1109/FOCS.2011.81.
  53. Gregory Valiant and Paul Valiant. An automatic inequality prover and instance optimal identity testing. In Proceedings of FOCS, pages 51-60, 2014. Google Scholar
  54. Paul Valiant. Testing symmetric properties of distributions. SIAM Journal on Computing, 40(6):1927-1968, 2011. Google Scholar
  55. Wikipedia. Hypergeometric distribution - wikipedia, the free encyclopedia, 2016. [Online; accessed 28-May-2016]. URL: https://en.wikipedia.org/w/index.php?title=Hypergeometric_distribution&oldid=702419153#Multivariate_hypergeometric_distribution.
  56. Yihong Wu and Pengkun Yang. Minimax rates of entropy estimation on large alphabets via best polynomial approximation. ArXiV, abs/1407.0381, 2014. URL: http://arxiv.org/abs/1407.0381.
  57. Bin Yu. Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam, pages 423-435. Springer, 1997. URL: http://dx.doi.org/10.1007/978-1-4612-1880-7_29.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail