{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,6,17]],"date-time":"2024-06-17T02:13:57Z","timestamp":1718590437349},"reference-count":31,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2019,3,29]],"date-time":"2019-03-29T00:00:00Z","timestamp":1553817600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"Big data and streaming data are encountered in a variety of contemporary applications in business and industry. In such cases, it is common to use random projections to reduce the dimension of the data yielding compressed data. These data however possess various anomalies such as heterogeneity, outliers, and round-off errors which are hard to detect due to volume and processing challenges. This paper describes a new robust and efficient methodology, using Hellinger distance, to analyze the compressed data. Using large sample methods and numerical experiments, it is demonstrated that a routine use of robust estimation procedure is feasible. 
The role of double limits in understanding the efficiency and robustness is brought out, which is of independent interest.","DOI":"10.3390\/e21040348","type":"journal-article","created":{"date-parts":[[2019,3,29]],"date-time":"2019-03-29T17:09:58Z","timestamp":1553879398000},"page":"348","source":"Crossref","is-referenced-by-count":2,"title":["Robust Inference after Random Projections via Hellinger Distance for Location-Scale Family"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"http:\/\/orcid.org\/0000-0003-2698-6974","authenticated-orcid":false,"given":"Lei","family":"Li","sequence":"first","affiliation":[{"name":"Department of Statistics, George Mason University, Fairfax, VA 22030, USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-8645-6372","authenticated-orcid":false,"given":"Anand N.","family":"Vidyashankar","sequence":"additional","affiliation":[{"name":"Department of Statistics, George Mason University, Fairfax, VA 22030, USA"}]},{"given":"Guoqing","family":"Diao","sequence":"additional","affiliation":[{"name":"Department of Statistics, George Mason University, Fairfax, VA 22030, USA"}]},{"given":"Ejaz","family":"Ahmed","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Statistics, Brock University, St. Catharines, ON L2S 3A1, Canada"}]}],"member":"1968","published-online":{"date-parts":[[2019,3,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1214\/aos\/1176343842","article-title":"Minimum Hellinger distance estimates for parametric models","volume":"5","author":"Beran","year":"1977","journal-title":"Ann. Stat."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1081","DOI":"10.1214\/aos\/1176325512","article-title":"Efficiency versus robustness: The case for minimum Hellinger distance and related methods","volume":"22","author":"Lindsay","year":"1994","journal-title":"Ann. 
Stat."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1098\/rspa.1934.0050","article-title":"Two new properties of mathematical likelihood","volume":"144","author":"Fisher","year":"1934","journal-title":"Proc. R. Soc. Lond. Ser. A"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"391","DOI":"10.1093\/biomet\/30.3-4.391","article-title":"The estimation of the location and scale parameters of a continuous population of any given form","volume":"30","author":"Pitman","year":"1939","journal-title":"Biometrika"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1090\/S0002-9939-1994-1207537-3","article-title":"On location and scale maximum likelihood estimators","volume":"120","author":"Gupta","year":"1994","journal-title":"Proc. Am. Math. Soc."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"775","DOI":"10.3150\/13-BEJ506","article-title":"Maximum likelihood characterization of distributions","volume":"20","author":"Duerinckx","year":"2014","journal-title":"Bernoulli"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1214","DOI":"10.1214\/aoms\/1177704861","article-title":"Maximum likelihood characterization of distributions","volume":"32","author":"Teicher","year":"1961","journal-title":"Ann. Math. Stat."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Thanei, G.A., Heinze, C., and Meinshausen, N. (2017). Random projections for large-scale regression. Big and Complex Data Analysis, Springer.","DOI":"10.1007\/978-3-319-41573-4_3"},{"key":"ref_9","unstructured":"Slawski, M. (2017). Compressed least squares regression revisited. Artificial Intelligence and Statistics, Addison-Wesley."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"3673","DOI":"10.1214\/18-EJS1486","article-title":"On principal components regression, random projections, and column subsampling","volume":"12","author":"Slawski","year":"2018","journal-title":"Electron. J. 
Stat."},{"key":"ref_11","first-page":"7508","article-title":"A statistical perspective on randomized sketching for ordinary least-squares","volume":"17","author":"Raskutti","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_12","unstructured":"Ahfock, D., Astle, W.J., and Richardson, S. (arXiv, 2017). Statistical properties of sketching algorithms, arXiv."},{"key":"ref_13","unstructured":"Vidyashankar, A., Hanlon, B., Lei, L., and Doyle, L. (2018). Anonymized Data: Trade off between Efficiency and Privacy, preprint."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1016\/0378-3758(95)00006-U","article-title":"Minimum Hellinger distance estimation of mixture proportions","volume":"48","author":"Woodward","year":"1995","journal-title":"J. Stat. Plan. Inference"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/S0169-7161(97)15004-0","article-title":"Minimum distance estimation: The approach using density-based distances","volume":"Volume 15","author":"Basu","year":"1997","journal-title":"Robust Inference, Handbook of Statistics"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"556","DOI":"10.1007\/s11749-014-0360-z","article-title":"Bayesian model robustness via disparities","volume":"23","author":"Hooker","year":"2014","journal-title":"Test"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1016\/S0167-7152(00)00112-7","article-title":"Minimum Hellinger distance estimation for supercritical Galton\u2013Watson processes","volume":"50","author":"Sriram","year":"2000","journal-title":"Stat. Probab. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"802","DOI":"10.1080\/01621459.1987.10478501","article-title":"Minimum Hellinger distance estimation for the analysis of count data","volume":"82","author":"Simpson","year":"1987","journal-title":"J. Am. Stat. 
Assoc."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1080\/01621459.1989.10478744","article-title":"Hellinger deviance tests: Efficiency, breakdown points, and examples","volume":"84","author":"Simpson","year":"1989","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1875","DOI":"10.1016\/j.jspi.2005.08.010","article-title":"Minimum Hellinger distance estimation for randomized play the winner design","volume":"136","author":"Cheng","year":"2006","journal-title":"J. Stat. Plan. Inference"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Basu, A., Shioya, H., and Park, C. (2011). Statistical Inference: The Minimum Distance Approach, Chapman and Hall\/CRC.","DOI":"10.1201\/b10956"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1111\/j.1467-842X.2006.00428.x","article-title":"Robust inference in parametric models using the family of generalized negative exponential disparities","volume":"48","author":"Bhandari","year":"2006","journal-title":"Aust. N. Z. J. Stat."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"2746","DOI":"10.3150\/16-BEJ826","article-title":"A generalized divergence for statistical inference","volume":"23","author":"Ghosh","year":"2017","journal-title":"Bernoulli"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1080\/01621459.1986.10478264","article-title":"Minimum Hellinger distance estimation for multivariate location and covariance","volume":"81","author":"Tamura","year":"1986","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_25","unstructured":"Li, P. (2008, January 20\u201322). Estimators and tail bounds for dimension reduction in l \u03b1 (0 < \u03b1 \u2264 2) using stable random projections. Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, CA, USA."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Boyd, S., and Vandenberghe, L. 
(2004). Convex Optimization, Cambridge University Press.","DOI":"10.1017\/CBO9780511804441"},{"key":"ref_27","unstructured":"Lichman, M. (2019, March 29). UCI Machine Learning Repository. Available online: https:\/\/archive.ics.uci.edu\/ml\/index.php."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2473","DOI":"10.1016\/j.eswa.2007.12.020","article-title":"The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients","volume":"36","author":"Yeh","year":"2009","journal-title":"Expert Syst. Appl."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1016\/0304-4149(89)90095-1","article-title":"Estimation in sparsely sampled random walks","volume":"31","author":"Guttorp","year":"1989","journal-title":"Stoch. Process. Appl."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"958","DOI":"10.1214\/aos\/1176349649","article-title":"Consistent estimation in partially observed random walks","volume":"13","author":"Guttorp","year":"1985","journal-title":"Ann. Stat."},{"key":"ref_31","unstructured":"Apostol, T.M. (1974). 
Mathematical Analysis, Addison Wesley Publishing Company."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/4\/348\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,16]],"date-time":"2024-06-16T19:59:39Z","timestamp":1718567979000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/4\/348"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,3,29]]},"references-count":31,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2019,4]]}},"alternative-id":["e21040348"],"URL":"https:\/\/doi.org\/10.3390\/e21040348","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,3,29]]}}}