{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,19]],"date-time":"2024-09-19T16:34:14Z","timestamp":1726763654478},"reference-count":68,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2023,11,5]],"date-time":"2023-11-05T00:00:00Z","timestamp":1699142400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"While data breaches are a frequent and universal phenomenon, the characteristics and dimensions of data breaches are unexplored. In this novel exploratory research, we apply machine learning (ML) and text analytics to a comprehensive collection of data breach litigation cases to extract insights from the narratives contained within these cases. Our analysis shows stakeholders (e.g., litigants) are concerned about major topics related to identity theft, hacker, negligence, FCRA (Fair Credit Reporting Act), cybersecurity, insurance, phone device, TCPA (Telephone Consumer Protection Act), credit card, merchant, privacy, and others. The topics fall into four major clusters: \u201cphone scams\u201d, \u201ccybersecurity\u201d, \u201cidentity theft\u201d, and \u201cbusiness data breach\u201d. By utilizing ML, text analytics, and descriptive data visualizations, our study serves as a foundational piece for comprehensively analyzing large textual datasets. 
The findings hold significant implications for both researchers and practitioners in cybersecurity, especially those grappling with the challenges of data breaches.","DOI":"10.3390\/info14110600","type":"journal-article","created":{"date-parts":[[2023,11,5]],"date-time":"2023-11-05T12:35:06Z","timestamp":1699187706000},"page":"600","source":"Crossref","is-referenced-by-count":3,"title":["Exploring Key Issues in Cybersecurity Data Breaches: Analyzing Data Breach Litigation with ML-Based Text Analytics"],"prefix":"10.3390","volume":"14","author":[{"given":"Dominik","family":"Molitor","sequence":"first","affiliation":[{"name":"Gabelli School of Business, Fordham University, New York, NY 10023, USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-9927-5343","authenticated-orcid":false,"given":"Wullianallur","family":"Raghupathi","sequence":"additional","affiliation":[{"name":"Gabelli School of Business, Fordham University, New York, NY 10023, USA"}]},{"given":"Aditya","family":"Saharia","sequence":"additional","affiliation":[{"name":"Gabelli School of Business, Fordham University, New York, NY 10023, USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-5082-3166","authenticated-orcid":false,"given":"Viju","family":"Raghupathi","sequence":"additional","affiliation":[{"name":"Koppelman School of Business, Brooklyn College, City University of New York, Brooklyn, NY 11210, USA"}]}],"member":"1968","published-online":{"date-parts":[[2023,11,5]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"e1211","DOI":"10.1002\/widm.1211","article-title":"Enterprise data breach: Causes, challenges, prevention, and future directions","volume":"7","author":"Cheng","year":"2017","journal-title":"Wiley Interdiscip. Rev. Data Min. Knowl. 
Discov."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"12103","DOI":"10.1109\/ACCESS.2018.2805680","article-title":"A survey on security threats and defensive techniques of machine learning: A data driven view","volume":"6","author":"Liu","year":"2018","journal-title":"IEEE Access"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"103638","DOI":"10.1016\/j.im.2022.103638","article-title":"Antecedents and consequences of data breaches: A systematic review","volume":"59","author":"Schlackl","year":"2022","journal-title":"Inf. Manag."},{"key":"ref_4","unstructured":"IBM (2023, September 28). Cost of a Data Breach Report. Available online: https:\/\/www.ibm.com\/downloads\/cas\/E3G5JMBP."},{"key":"ref_5","unstructured":"PwC (2023, September 28). PwC\u2019s 23rd Annual Global CEO Survey. Available online: https:\/\/www.pwc.com\/gx\/en\/issues\/c-suite-insights\/ceo-survey-2023.html."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"101693","DOI":"10.1016\/j.jsis.2021.101693","article-title":"Information systems security research agenda: Exploring the gap between research and practice","volume":"30","author":"Dhillon","year":"2021","journal-title":"J. Strateg. Inf. Syst."},{"key":"ref_7","first-page":"321","article-title":"A methodology for estimating the tangible cost of data breaches","volume":"19","author":"Layton","year":"2014","journal-title":"J. Inf. Secur. Appl."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1080\/19393550802529734","article-title":"Anatomy of a data breach","volume":"17","author":"Sherstobitoff","year":"2008","journal-title":"Inf. Secur. J. A Glob. Perspect."},{"key":"ref_9","unstructured":"Watters, P.A. (2012). Cyber Security: Concepts and Cases, CreateSpace Independent Publishing Platform."},{"key":"ref_10","unstructured":"Irwin, L. (2023, September 28). The 6 Most Common Ways Data Breaches Occur. 
Available online: https:\/\/www.itgovernance.eu\/blog\/en\/the-6-most-common-ways-data-breaches-occur."},{"key":"ref_11","first-page":"3","article-title":"Cybersecurity incident handling: A case study of the Equifax data breach","volume":"19","author":"Wang","year":"2018","journal-title":"Issues Inf. Syst."},{"key":"ref_12","first-page":"2","article-title":"Communication in Cybersecurity: A Public Communication Model for Business Data Breach Incident Handling","volume":"18","author":"Wang","year":"2017","journal-title":"Issues Inf. Syst."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1111\/jels.12035","article-title":"Empirical analysis of data breach litigation","volume":"11","author":"Romanosky","year":"2014","journal-title":"J. Empir. Leg. Stud."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Sanzgiri, A., and Dasgupta, D. (2016, January 5\u20137). Classification of insider threat detection techniques. Proceedings of the 11th Annual Cyber and Information Security Research Conference, Oak Ridge, TN, USA.","DOI":"10.1145\/2897795.2897799"},{"key":"ref_15","unstructured":"(2023, November 01). Congressional Research Service, Available online: https:\/\/crsreports.congress.gov\/."},{"key":"ref_16","unstructured":"CNN (2023, November 01). Yahoo Says 500 Million Accounts Stolen. Available online: https:\/\/money.cnn.com\/2016\/09\/22\/technology\/yahoo-data-breach\/."},{"key":"ref_17","unstructured":"McAfee (2023, November 01). Grand Theft Data. Available online: https:\/\/www.mcafee.com."},{"key":"ref_18","unstructured":"Greenberg, A. (2023, November 01). More than Half of Corporate Breaches Go Unreported, according to Study. Available online: https:\/\/www.scmagazine.com\/news\/more-than-half-of-corporate-breaches-go-unreported-according-to-study."},{"key":"ref_19","unstructured":"Huq, N. (2015). Follow the data: Dissecting data breaches and debunking myths. TrendMicro Res. 
Pap."},{"key":"ref_20","unstructured":"McGee Kolbasuk, M. (2023, November 01). Why Data Breaches go Unreported. Available online: https:\/\/www.bankinfosecurity.com\/health-data-breaches-go-unreported-a-6804."},{"key":"ref_21","unstructured":"Privacy Rights Clearinghouse (2023, September 28). Data Breaches Chronology. Available online: https:\/\/privacyrights.org\/data-breaches."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Anderson, R., Barton, C., B\u00f6hme, R., Clayton, R., Van Eeten, M.J., Levi, M., Moore, T., and Savage, S. (2013, January 11\u201312). Measuring the cost of cybercrime. Proceedings of the 11th Workshop on the Economics of Information Security (WEIS), Washington, DC, USA.","DOI":"10.1007\/978-3-642-39498-0_12"},{"key":"ref_23","unstructured":"U.S. News (2023, November 01). Equifax Breach Could Have \u2018Decades of Impact\u2019. Available online: https:\/\/www.usnews.com\/news\/articles\/2017-09-08\/equifax-breach-could-have-decades-of-impact-on-consumers."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"153","DOI":"10.2478\/popets-2020-0067","article-title":"SoK: Anatomy of data breaches","volume":"2020","author":"Saleem","year":"2020","journal-title":"Proc. Priv. Enhancing Technol."},{"key":"ref_25","unstructured":"Bielinski, C. (2023, November 01). 2018 Trustwave Global Security Report. Available online: https:\/\/www.trustwave.com\/en-us\/resources\/library\/documents\/2018-trustwave-global-security-report\/."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1016\/j.bushor.2016.01.002","article-title":"Why you should care about the Target data breach","volume":"59","author":"Manworren","year":"2016","journal-title":"Bus. Horiz."},{"key":"ref_27","unstructured":"Rashid, A., Ramdhany, R., Edwards, M., Kibirige Mukisa, S., Ali Babar, M., Hutchison, D., and Chitchyan, R. (2014). 
Detecting and Preventing Data Exfiltration, Lancaster University."},{"key":"ref_28","first-page":"794","article-title":"Organizational data breaches 2005-2010: Applying SCP to the healthcare and education sectors","volume":"5","author":"Collins","year":"2011","journal-title":"Int. J. Cyber Criminol."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1108\/09685221111173049","article-title":"A longitudinal analysis of data breaches","volume":"19","author":"Ncube","year":"2011","journal-title":"Inf. Manag. Comput. Secur."},{"key":"ref_30","first-page":"33","article-title":"An exploratory analysis of data breaches from 2005-2011: Trends and insights","volume":"8","author":"Ayyagari","year":"2012","journal-title":"J. Inf. Priv. Secur."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1057\/sj.2013.24","article-title":"Examining the correlates and spatial distribution of organizational data breaches in the United States","volume":"26","author":"Khey","year":"2013","journal-title":"Secur. J."},{"key":"ref_32","unstructured":"Zadeh, A. (2023, November 01). Characterizing Data Breach Severity: A Data Analytics Approach. Available online: https:\/\/aisel.aisnet.org\/treos_amcis2022\/19."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1004","DOI":"10.1016\/j.procs.2019.04.141","article-title":"Digging deeper into data breaches: An exploratory data analysis of hacking breaches over time","volume":"151","author":"Hammouchi","year":"2019","journal-title":"Procedia Comput. Sci."},{"key":"ref_34","unstructured":"Shu, X., Tian, K., Ciambrone, A., and Yao, D. (2017). Breaking the target: An analysis of target data breach and lessons learned. arXiv."},{"key":"ref_35","unstructured":"Smith, T.T. (2016). Examining Data Privacy Breaches in Healthcare. [Ph.D. 
Thesis, Walden University]."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1108\/JFC-09-2013-0055","article-title":"Data breach trends in the United States","volume":"22","author":"Holtfreter","year":"2015","journal-title":"J. Financ. Crime"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3439873","article-title":"Developing a global data breach database and the challenges encountered","volume":"13","author":"Neto","year":"2021","journal-title":"J. Data Inf. Qual. (JDIQ)"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.dss.2018.02.007","article-title":"Cyber-analytics: Modeling factors associated with healthcare data breaches","volume":"108","author":"McLeod","year":"2018","journal-title":"Decis. Support Syst."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Algarni, A.M., and Malaiya, Y.K. (2016, January 7\u20138). A consolidated approach for estimation of data security breach costs. Proceedings of the 2016 2nd International Conference on Information Management (ICIM), London, UK.","DOI":"10.1109\/INFOMAN.2016.7477530"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Kafali, \u00d6., Jones, J., Petruso, M., Williams, L., and Singh, M.P. (2017, January 20\u201328). How good is a security policy against real breaches? A HIPAA case study. Proceedings of the 2017 IEEE\/ACM 39th International Conference on Software Engineering (ICSE), Buenos Aires, Argentina.","DOI":"10.1109\/ICSE.2017.55"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1080\/07421222.2015.1063315","article-title":"Estimating the contextual risk of data breach: An empirical approach","volume":"32","author":"Sen","year":"2015","journal-title":"J. Manag. Inf. Syst."},{"key":"ref_42","first-page":"50","article-title":"Data security: A review of major security breaches between 2014 and 2018","volume":"6","author":"Hall","year":"2018","journal-title":"Fed. Bus. 
Discip. J."},{"key":"ref_43","first-page":"121","article-title":"Examining the costs and causes of cyber incidents","volume":"2","author":"Romanosky","year":"2016","journal-title":"J. Cybersecur."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"703","DOI":"10.25300\/MISQ\/2017\/41.3.03","article-title":"User compensation as a data breach recovery action","volume":"41","author":"Goode","year":"2017","journal-title":"MIS Q."},{"key":"ref_45","unstructured":"(2023, November 01). As Data Breach Class Actions Arise. Available online: https:\/\/www.law.com\/newyorklawjournal\/?slreturn=20231005012616."},{"key":"ref_46","unstructured":"(2023, November 01). 2021 Year in Review: Data Breach and Cybersecurity Litigations. Available online: https:\/\/www.privacyworld.blog\/2021\/12\/2021-year-in-review-data-breach-and-cybersecurity-litigations\/."},{"key":"ref_47","unstructured":"Black, M. (2023, November 01). HCA Data Breach Class Action Lawsuit May Include 11 Million; Mission Patients Notified. Available online: https:\/\/www.citizen-times.com\/story\/news\/local\/2023\/08\/29\/hca-data-breach-class-action-lawsuit-may-represent-11-million-patients\/70699685007\/."},{"key":"ref_48","unstructured":"Yenouskas, J., and Swank, L. (2023, November 01). Emerging Legal Issues in Data Breach Class Actions. Available online: https:\/\/www.americanbar.org\/groups\/business_law\/resources\/business-law-today\/2018-july\/emerging-legal-issues-in-data-breach-class-actions\/."},{"key":"ref_49","unstructured":"Hill, M., and Swinhoe, D. (2023, November 01). The 15 Biggest Data Breaches of the 21st Century. Available online: https:\/\/www.csoonline.com\/article\/534628\/the-biggest-data-breaches-of-the-21st-century.html."},{"key":"ref_50","unstructured":"Bellamy, F.D. (2023, November 01). Data Breach Class Action Litigation and Changing Legal Landscape. 
Available online: https:\/\/www.reuters.com\/legal\/legalindustry\/data-breach-class-action-litigation-changing-legal-landscape-2022-06-27\/."},{"key":"ref_51","first-page":"4","article-title":"A review of machine learning algorithms for text-documents classification","volume":"1","author":"Khan","year":"2010","journal-title":"J. Adv. Inf. Technol."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1177\/0894439307313703","article-title":"Identifying events using computer-assisted text analysis","volume":"26","author":"Landmann","year":"2008","journal-title":"Soc. Sci. Comput. Rev."},{"key":"ref_53","first-page":"1110","article-title":"Content analysis: An introduction to its methodology","volume":"57","author":"Ford","year":"2004","journal-title":"Pers. Psychol."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Raghupathi, V., Ren, J., and Raghupathi, W. (2020). Studying public perception about vaccination: A sentiment analysis of tweets. Int. J. Environ. Res. Public Health, 17.","DOI":"10.3390\/ijerph17103464"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"41518","DOI":"10.1109\/ACCESS.2018.2859052","article-title":"Legal decision support: Exploring big data analytics approach to modeling pharma patent validity cases","volume":"6","author":"Raghupathi","year":"2018","journal-title":"IEEE Access"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4018\/IJHISI.2019100101","article-title":"Exploring big data analytic approaches to cancer blog text analysis","volume":"14","author":"Raghupathi","year":"2019","journal-title":"Int. J. Healthc. Inf. Syst. Inform. (IJHISI)"},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"e18813","DOI":"10.2196\/18813","article-title":"Understanding the dimensions of medical crowdfunding: A visual analytics approach","volume":"22","author":"Ren","year":"2020","journal-title":"J. Med. 
Internet Res."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Sz\u00e9kely, N., and Vom Brocke, J. (2017). What can we learn from corporate sustainability reporting? Deriving propositions for research and practice from over 9,500 corporate sustainability reports published between 1999 and 2015 using topic modelling technique. PLoS ONE, 12.","DOI":"10.1371\/journal.pone.0174807"},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1016\/j.tranpol.2021.06.020","article-title":"Sustainability disclosure for container shipping: A text-mining approach","volume":"110","author":"Zhou","year":"2021","journal-title":"Transp. Policy"},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1145\/2133806.2133826","article-title":"Probabilistic topic models","volume":"55","author":"Blei","year":"2012","journal-title":"Commun. ACM"},{"key":"ref_61","unstructured":"Graham, S., Weingart, S., and Milligan, I. (2023, November 01). Getting Started with Topic Modeling and MALLET. Available online: https:\/\/uwspace.uwaterloo.ca\/handle\/10012\/11751."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"15169","DOI":"10.1007\/s11042-018-6894-4","article-title":"Latent Dirichlet allocation (LDA) and topic modeling: Models, applications, a survey","volume":"78","author":"Jelodar","year":"2019","journal-title":"Multimed. Tools Appl."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40064-016-3252-8","article-title":"An overview of topic modeling and its current applications in bioinformatics","volume":"5","author":"Liu","year":"2016","journal-title":"SpringerPlus"},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Crain, S.P., Zhou, K., Yang, S.-H., and Zha, H. (2012). Dimensionality reduction and topic modeling: From latent semantic indexing to latent dirichlet allocation and beyond. Min. 
Text Data, 129\u2013161.","DOI":"10.1007\/978-1-4614-3223-4_5"},{"key":"ref_65","first-page":"131","article-title":"Tag recommendation using probabilistic topic models","volume":"2009","author":"Krestel","year":"2009","journal-title":"ECML PKDD Discov. Chall."},{"key":"ref_66","first-page":"7","article-title":"Text mining for information systems researchers: An annotated topic modeling tutorial","volume":"39","author":"Debortoli","year":"2016","journal-title":"Commun. Assoc. Inf. Syst. (CAIS)"},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Syed, S., and Spruit, M. (2018, January 18). Full-text or abstract? Examining topic coherence scores using latent dirichlet allocation. Proceedings of the 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Tokyo, Japan.","DOI":"10.1109\/DSAA.2017.61"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Yi, Y., Liu, L., Li, C.H., Song, W., and Liu, S. (2012, January 3\u20135). Machine learning algorithms with co-occurrence based term association for text mining. 
Proceedings of the 2012 Fourth International Conference on Computational Intelligence and Communication Networks, Mathura, India.","DOI":"10.1109\/CICN.2012.141"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/14\/11\/600\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,5]],"date-time":"2023-11-05T13:29:55Z","timestamp":1699190995000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/14\/11\/600"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,5]]},"references-count":68,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2023,11]]}},"alternative-id":["info14110600"],"URL":"https:\/\/doi.org\/10.3390\/info14110600","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,5]]}}}