{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,3]],"date-time":"2024-09-03T16:48:46Z","timestamp":1725382126201},"reference-count":32,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2021,6,4]],"date-time":"2021-06-04T00:00:00Z","timestamp":1622764800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Informatics"],"abstract":"Organizations have been challenged by the need to process an increasing amount of data, both structured and unstructured, retrieved from heterogeneous sources. Criminal investigation police are among these organizations, as they have to manually process a vast number of criminal reports, news articles related to crimes, occurrence and evidence reports, and other unstructured documents. Automatic extraction and representation of data and knowledge in such documents is an essential task to reduce the manual analysis burden and to automate the discovery of relationships between names and entities that may exist in a case. This paper presents SEMCrime, a framework used to extract and classify named entities and relations in Portuguese criminal reports and documents, and to represent the retrieved data in a graph database. A 5W1H (Who, What, Why, Where, When, and How) information extraction method was applied, and a graph database representation was used to store and visualize the relations extracted from the documents. 
Promising results were obtained with a prototype developed to evaluate the framework, namely named-entity recognition with an F-Measure of 0.73 and 5W1H information extraction with an F-Measure of 0.65.","DOI":"10.3390\/informatics8020037","type":"journal-article","created":{"date-parts":[[2021,6,8]],"date-time":"2021-06-08T02:23:00Z","timestamp":1623118980000},"page":"37","source":"Crossref","is-referenced-by-count":6,"title":["A Graph Database Representation of Portuguese Criminal-Related Documents"],"prefix":"10.3390","volume":"8","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-8285-7005","authenticated-orcid":false,"given":"Gon\u00e7alo","family":"Carnaz","sequence":"first","affiliation":[{"name":"Informatics Departament, University of \u00c9vora, 7002-554 \u00c9vora, Portugal"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-0793-0003","authenticated-orcid":false,"given":"Vitor Beires","family":"Nogueira","sequence":"additional","affiliation":[{"name":"Informatics Departament, University of \u00c9vora, 7002-554 \u00c9vora, Portugal"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-3448-6726","authenticated-orcid":false,"given":"M\u00e1rio","family":"Antunes","sequence":"additional","affiliation":[{"name":"Computer Science and Communication Research Centre (CIIC), School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal"},{"name":"INESC TEC, CRACS, 4200-465 Porto, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2021,6,4]]},"reference":[{"key":"ref_1","unstructured":"Gleick, J., and Calil, A. (2013). A Informa\u00e7\u00e3o: Uma Hist\u00f3ria, Uma Teoria, Uma Enxurrada, Companhia das Letras."},{"key":"ref_2","first-page":"431","article-title":"Big Data technologies: A survey","volume":"30","author":"Oussous","year":"2018","journal-title":"J. King Saud Univ. Comput. Inf. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Cavanillas, J.M., Curry, E., and Wahlster, W. (2016). 
New Horizons for a Data-Driven Economy: A Roadmap for Usage and Exploitation of Big Data in Europe, Springer.","DOI":"10.1007\/978-3-319-21569-3"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1145\/602421.602441","article-title":"COPLINK: Managing law enforcement data and knowledge","volume":"46","author":"Chen","year":"2003","journal-title":"Commun. ACM"},{"key":"ref_5","unstructured":"Stasko, J., G\u00f6rg, C., Liu, Z., and Singhal, K. (November, January 28). Jigsaw: Supporting investigative analysis through interactive visualization. Proceedings of the VAST IEEE Symposium on Visual Analytics Science and Technology, Sacramento, CA, USA."},{"key":"ref_6","first-page":"13","article-title":"Implementation of a police intelligence analysis framework","volume":"5","author":"Stampouli","year":"2011","journal-title":"Int. J. Secur. Its Appl."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1016\/j.datak.2009.10.003","article-title":"Frameworks for entity matching: A comparison","volume":"69","author":"Rahm","year":"2010","journal-title":"Data Knowl. Eng."},{"key":"ref_8","unstructured":"Albertetti, F., and Stoffel, K. (2012, January 11). From police reports to data marts: A step towards a crime analysis framework. Proceedings of the 5th International Workshop on Computational Forensics, Tsukuba, Japan."},{"key":"ref_9","first-page":"258","article-title":"Human-centered text mining: A new software system","volume":"7377","author":"Poelmans","year":"2012","journal-title":"Lect. Notes Comput. Sci. (Incl. Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinform.)"},{"key":"ref_10","unstructured":"Hosseinkhani, J., Chaprut, S., and Taherdoost, H. (2012, January 24\u201326). Criminal network mining by web structure and content mining. Advances in Remote Sensing, Finite Differences and Information Security. 
Proceedings of the 11th WSEAS International Conference on Information Security and Privacy (ISP \u201912), Prague, Czech Republic."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Hossain, M.S., Butler, P., Boedihardjo, A.P., Ramakrishnan, N., and Tech, V. (2012, January 12\u201316). Storytelling in Entity Networks to Support Intelligence Analysts. Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China.","DOI":"10.1145\/2339530.2339742"},{"key":"ref_12","first-page":"36","article-title":"Semantic Mining and Analysis of Heterogeneous Data for Novel Intelligence Insights","volume":"1","author":"Adderley","year":"2014","journal-title":"Fourth Int. Conf. Adv. Inf. Min. Manag."},{"key":"ref_13","first-page":"189","article-title":"Fighting Organized Crime Through Open Source Intelligence: Regulatory Strategies of the CAPER Project","volume":"271","author":"Casanovas","year":"2014","journal-title":"Front. Artif. Intell. Appl."},{"key":"ref_14","first-page":"275","article-title":"Environmental scanning and knowledge representation for the detection of organised crime threats","volume":"8577","author":"Brewster","year":"2014","journal-title":"Lect. Notes Comput. Sci. (Incl. Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinform.)"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1016\/S1574-6526(07)03005-2","article-title":"Chapter 5 Conceptual Graphs","volume":"Volume 3","author":"Lifschitz","year":"2008","journal-title":"Handbook of Knowledge Representation"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Wijeratne, S., Doran, D., Sheth, A., and Dustin, J.L. (2015, January 27\u201329). Analyzing the social media footprint of street gangs. 
Proceedings of the 2015 IEEE International Conference on Intelligence and Security Informatics (ISI), Baltimore, MD, USA.","DOI":"10.1109\/ISI.2015.7165945"},{"key":"ref_17","first-page":"19","article-title":"Investigating Crimes using Text Mining and Network Analysis","volume":"126","author":"Elyezjy","year":"2015","journal-title":"Int. J. Comput. Appl."},{"key":"ref_18","first-page":"11","article-title":"A Mobile Information System Based on Crowd-Sensed and Official Crime Data for Finding Safe Routes: A Case Study of Mexico City","volume":"2016","author":"Mata","year":"2016","journal-title":"Mob. Inf. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Wiedemann, G., Yimam, S.M., and Biemann, C. (2018). A Multilingual Information Extraction Pipeline for Investigative Journalism. arXiv.","DOI":"10.18653\/v1\/D18-2014"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Al-Zaidy, R., Fung, B.C.M., and Youssef, A.M. (2011). Towards Discovering Criminal Communities from Textual Data. Proceedings of the 2011 ACM Symposium on Applied Computing, TaiChung, Taiwan, 1 January 2011, ACM.","DOI":"10.1145\/1982185.1982225"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Pinheiro, V., Furtado, V., Pequeno, T., Nogueira, D., and Aplicada, I. (2010, January 23\u201326). Natural Language Processing Based on Semantic Inferentialism for Extracting Crime Information from Text. Proceedings of the 2010 IEEE International Conference on Intelligence and Security Informatics, Vancouver, BC, Canada.","DOI":"10.1109\/ISI.2010.5484783"},{"key":"ref_22","unstructured":"Pinheiro, V., Pequeno, T., Furtado, V., Assun\u00e7\u00e3o, T., and Freitas, E. (2018, January 26\u201329). SIM: Um modelo sem\u00e2ntico-inferencialista para sistemas de linguagem natural. 
Proceedings of the Companion Proceedings of the XIV Brazilian Symposium on Multimedia and the Web, Vila Velha, Brazil."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1016\/j.ins.2009.08.004","article-title":"Collective intelligence in law enforcement\u2014The WikiCrimes system","volume":"180","author":"Furtado","year":"2010","journal-title":"Inf. Sci."},{"key":"ref_24","first-page":"569","article-title":"Processamento de linguagem natural para indexa\u00e7\u00e3o autom\u00e1tica sem\u00e2ntico-ontol\u00f3gica","volume":"9","year":"2013","journal-title":"Rev. Ibero Am. Ci\u00eancia Informa\u00e7\u00e3o"},{"key":"ref_25","first-page":"31","article-title":"Extracting Crime Information from Online Newspaper Articles","volume":"Volume 155","author":"Arulanandam","year":"2014","journal-title":"Proceedings of the Second Australasian Web Conference, Auckland, New Zealand, 20\u201323 January 2014"},{"key":"ref_26","first-page":"1215","article-title":"Named Entity Recognition in Crime News Documents Using Classifiers Combination","volume":"23","author":"Shabat","year":"2015","journal-title":"Middle-East J. Sci. Res."},{"key":"ref_27","unstructured":"Ejem, R. (2017). Relation Extraction in Police Records, Univerzita Karlova, Matematicko-Fyzik\u00e1ln\u00ed Fakulta."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Martin-Rodilla, P., Hattori, M.L., and Gonzalez-Perez, C. (2019). Assisting Forensic Identification through Unsupervised Information Extraction of Free Text Autopsy Reports: The Disappearances Cases during the Brazilian Military Dictatorship. Information, 10.","DOI":"10.3390\/info10070231"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Sarmento, L. (2006). SIEM\u00caS\u2014A named-entity recognizer for portuguese relying on similarity rules. 
Proceedings of the International Workshop on Computational Processing of the Portuguese Language, Itatiaia, Brazil, 13\u201317 May 2006, Springer.","DOI":"10.1007\/11751984_10"},{"key":"ref_30","unstructured":"Gianola, L. (2020). Aspects Textuels de la Proc\u00e9dure Judiciaire Exploit\u00e9e en Analyse Criminelle et Perspectives Pour son Traitement Automatique. [Ph.D. Thesis, Universit\u00e9 de Cergy-Pontoise]."},{"key":"ref_31","unstructured":"Braz, J. (2019). Investigacao Criminal, Almedina."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1016\/j.websem.2011.03.003","article-title":"Design and use of the Simple Event Model (SEM)","volume":"9","author":"Segers","year":"2011","journal-title":"J. Web Semant."}],"container-title":["Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2227-9709\/8\/2\/37\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,14]],"date-time":"2024-07-14T09:31:06Z","timestamp":1720949466000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2227-9709\/8\/2\/37"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,4]]},"references-count":32,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2021,6]]}},"alternative-id":["informatics8020037"],"URL":"https:\/\/doi.org\/10.3390\/informatics8020037","relation":{},"ISSN":["2227-9709"],"issn-type":[{"value":"2227-9709","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,4]]}}}