{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,5,1]],"date-time":"2024-05-01T07:34:01Z","timestamp":1714548841912},"reference-count":32,"publisher":"Wiley","issue":"4","license":[{"start":{"date-parts":[[2022,5,2]],"date-time":"2022-05-02T00:00:00Z","timestamp":1651449600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"funder":[{"DOI":"10.13039\/501100008982","name":"National Science Foundation of Sri Lanka","doi-asserted-by":"publisher","award":["1455172","1934985","1940124","1940276"],"id":[{"id":"10.13039\/501100008982","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100004863","name":"New York State Foundation for Science, Technology and Innovation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100004863","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000200","name":"United States Agency for International Development","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000200","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100002414","name":"Xerox Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100002414","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Statistical Analysis"],"published-print":{"date-parts":[[2022,8]]},"abstract":"Anomaly detection aims to identify observations that deviate from the typical pattern of data. Anomalous observations may correspond to financial fraud, health risks, or incorrectly measured data in practice. We focus on unsupervised detection and the continuous and categorical (mixed) variable case. We show that detecting anomalies in mixed data is enhanced through first embedding the data then assessing an anomaly scoring scheme. 
We propose a kurtosis\u2010weighted Factor Analysis of Mixed Data for anomaly detection to obtain a continuous embedding for anomaly scoring. We illustrate that anomalies are highly separable in the first and last few ordered dimensions of this space, and test various anomaly scoring experiments within this subspace. Results are illustrated for both simulated and real datasets, and the proposed approach is highly accurate for mixed data throughout these diverse scenarios.","DOI":"10.1002\/sam.11585","type":"journal-article","created":{"date-parts":[[2022,5,2]],"date-time":"2022-05-02T11:52:34Z","timestamp":1651492354000},"page":"480-493","update-policy":"http:\/\/dx.doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Factor analysis of mixed data for anomaly detection"],"prefix":"10.1002","volume":"15","author":[{"ORCID":"http:\/\/orcid.org\/0000-0002-4261-6992","authenticated-orcid":false,"given":"Matthew","family":"Davidow","sequence":"first","affiliation":[{"name":"Center for Applied Mathematics Cornell University Ithaca New York USA"}]},{"given":"David S.","family":"Matteson","sequence":"additional","affiliation":[{"name":"Center for Applied Mathematics Cornell University Ithaca New York USA"}]}],"member":"311","published-online":{"date-parts":[[2022,5,2]]},"reference":[{"key":"e_1_2_10_2_1","doi-asserted-by":"publisher","DOI":"10.1201\/b17700"},{"key":"e_1_2_10_3_1","first-page":"2712","article-title":"Robust random cut forest based anomaly detection on streams","volume":"48","author":"Guha S.","year":"2016","journal-title":"Int. Conf. Mach. Learn."},{"key":"e_1_2_10_4_1","doi-asserted-by":"crossref","unstructured":"Liu F. T. K. M.Ting andZ.\u2010H.Zhou Isolation forest. 2008 eighth IEEE Int. Conf. 
Data Mining IEEE 2008 pp.413\u2013422.","DOI":"10.1109\/ICDM.2008.17"},{"key":"e_1_2_10_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-31863-9_6"},{"key":"e_1_2_10_6_1","unstructured":"G.Pang L.Cao andL.Chen Outlier detection in complex categorical data by modelling the feature value couplings IJCAI Int. Joint Conf. Artif. Intell. 2016"},{"key":"e_1_2_10_7_1","doi-asserted-by":"crossref","unstructured":"G.Pang H.Xu L.Cao andW.Zhao Selective value coupling learning for detecting outliers in high\u2010dimensional categorical data Proc. 2017 ACM on Conf. Inf. Knowl. Manag. 2017 pp.807\u2013816.","DOI":"10.1145\/3132847.3132994"},{"key":"e_1_2_10_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-48247-5_28"},{"key":"e_1_2_10_9_1","first-page":"226","article-title":"A density\u2010based algorithm for discovering clusters in large spatial databases with noise","volume":"96","author":"Ester M.","year":"1996","journal-title":"Inkdd"},{"key":"e_1_2_10_10_1","doi-asserted-by":"crossref","unstructured":"T. R.Bandaragoda K. M.Ting D.Albrecht F. T.Liu andJ. R.Wells Efficient anomaly detection by isolation using nearest neighbour ensemble. 2014 IEEE Int. Conf. Data Mining Workshop IEEE 2014 pp.698\u2013705.","DOI":"10.1109\/ICDMW.2014.70"},{"key":"e_1_2_10_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-018-1168-z"},{"key":"e_1_2_10_12_1","doi-asserted-by":"crossref","unstructured":"J.Chen S.Sathe C.Aggarwal andD.Turaga Outlier detection with autoencoder ensembles Proc. 2017 SIAM Int. Conf. Data Mining SIAM 2017 pp.90\u201398.","DOI":"10.1137\/1.9781611974973.11"},{"key":"e_1_2_10_13_1","first-page":"1","article-title":"Deep learning for anomaly detection: A review","volume":"54","author":"Pang G.","year":"2020","journal-title":"Mach. Learn."},{"key":"e_1_2_10_14_1","doi-asserted-by":"crossref","unstructured":"Z.Li Y.Zhao N.Botta C.Ionescu andX.Hu Copod: Copula\u2010based outlier detection. 2020 IEEE Int. Conf. 
Data Mining (ICDM) IEEE 2020 pp.1118\u20131123.","DOI":"10.1109\/ICDM50108.2020.00135"},{"key":"e_1_2_10_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-14142-8_8"},{"issue":"96","key":"e_1_2_10_16_1","first-page":"1","article-title":"Pyod: A python toolbox for scalable outlier detection","volume":"20","author":"Zhao Y.","year":"2019","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_10_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2015.01.001"},{"key":"e_1_2_10_18_1","doi-asserted-by":"publisher","DOI":"10.5626\/JCSE.2013.7.4.272"},{"key":"e_1_2_10_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1030194.1015492"},{"key":"e_1_2_10_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/0142-0615(89)90010-0"},{"key":"e_1_2_10_21_1","doi-asserted-by":"crossref","unstructured":"M. E.Houle Dimensionality discriminability density and distance distributions 2013 IEEE 13th Int. Conf. Data Mining Workshops IEEE 2013 pp.468\u2013473.","DOI":"10.1109\/ICDMW.2013.139"},{"key":"e_1_2_10_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1970392.1970395"},{"key":"e_1_2_10_23_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csda.2007.05.018"},{"key":"e_1_2_10_24_1","doi-asserted-by":"publisher","DOI":"10.1214\/11-EJS636"},{"key":"e_1_2_10_25_1","first-page":"2496","article-title":"Robust PCA via outlier pursuit","volume":"58","author":"Xu H.","year":"2010","journal-title":"Adv. Neural Inf. Proces. 
Syst."},{"key":"e_1_2_10_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1541880.1541882"},{"key":"e_1_2_10_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.acha.2006.04.006"},{"key":"e_1_2_10_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11222-007-9033-z"},{"key":"e_1_2_10_29_1","doi-asserted-by":"publisher","DOI":"10.1080\/10618600.2019.1704296"},{"key":"e_1_2_10_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2017.2749215"},{"key":"e_1_2_10_31_1","unstructured":"D.DuaandC.Graff(2019).UCI Machine Learning Repository available athttp:\/\/archive.ics.uci.edu\/ml."},{"key":"e_1_2_10_32_1","doi-asserted-by":"crossref","unstructured":"L.R\u00fcschendorf Mathematical risk analysis. Springer Ser. Oper. Res. Financ. Eng. Springer Heidelberg 2013.","DOI":"10.1007\/978-3-642-33590-7"},{"key":"e_1_2_10_33_1","unstructured":"Y.Zhu F.Dai andR.Maitra Visualization of labeled mixed\u2010featured datasets Mach. Learn. (2019) arXiv preprint arXiv:1904.06366."}],"container-title":["Statistical Analysis and Data Mining: The ASA Data Science 
Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/sam.11585","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/sam.11585","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/sam.11585","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,23]],"date-time":"2023-08-23T17:26:36Z","timestamp":1692811596000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/sam.11585"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,2]]},"references-count":32,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,8]]}},"alternative-id":["10.1002\/sam.11585"],"URL":"https:\/\/doi.org\/10.1002\/sam.11585","archive":["Portico"],"relation":{},"ISSN":["1932-1864","1932-1872"],"issn-type":[{"value":"1932-1864","type":"print"},{"value":"1932-1872","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,2]]},"assertion":[{"value":"2020-07-16","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-03-22","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-05-02","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}