Abstract
The increasing ability of Artificial Intelligence (AI) technologies to generate and parse text will inevitably lead to more proposals for the use of AI in the semantic sentiment analysis (SSA) of textual sources. We argue that, instead of focusing solely on the merits of automated versus manual processing and analysis of texts, it is critical to also rethink the underlying storage and representation formats. Further, we argue that accommodating multivariate metadata exemplifies how the underlying data storage infrastructure can reshape the ethical debate surrounding the use of such algorithms. In other words, a system that employs automated analysis typically requires manual intervention to assess the quality of its output, and thus demands that we choose among multiple competing NLP algorithms. Settling on an algorithm or ensemble is not a decision that has to be made a priori, but when it is made, it involves implicit ethical considerations. An underlying storage and representation system that allows multiple variants of the same source data to exist and be evaluated, while maintaining attribution to the individual source of each variant, would be a much-needed enhancement to existing storage technologies and would facilitate the interpretation of proliferating AI semantic analysis technologies. To this end, we take the view that AI functions as (or acts as an implicate meta-ordering of) the SSA sociotechnical system in a manner that allows for novel solutions for safer cyber curation. This can be done by holding the attribution of source data in a symmetrical relationship to its multiple differing annotations, as coexisting data points within a single publishing ecosystem. In this way, the AI program allows individual and aggregate data to be annotated by competing algorithmic models or with varying degrees of human intervention.
We discuss the feasibility of such a scheme, using our own infrastructure model, MultiVerse, as an illustration of such a system, and analyse its ethical implications.
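The storage idea described in the abstract can be illustrated with a minimal sketch. It is not the MultiVerse implementation: the class names, field names, producer identifiers, and sample text below are all hypothetical, chosen only to show a source record that keeps its own attribution alongside several competing, individually attributed annotations.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Annotation:
    """One variant of analysis output, attributed to whoever produced it."""
    producer: str  # hypothetical identifier, e.g. "model-a" or "human:reviewer-1"
    kind: str      # "algorithm" or "human"
    label: str     # e.g. a sentiment label


@dataclass
class SourceRecord:
    """A source text, its attribution, and every competing annotation of it."""
    source_id: str
    text: str
    attribution: str
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, annotation: Annotation) -> None:
        # Append rather than overwrite, so earlier variants remain available.
        self.annotations.append(annotation)

    def labels_by_producer(self) -> Dict[str, str]:
        # Most recent label recorded for each producer.
        return {a.producer: a.label for a in self.annotations}

    def disagreement(self) -> bool:
        # True when the stored variants do not all carry the same label.
        return len({a.label for a in self.annotations}) > 1


# Illustrative data only: the text, corpus name, and labels are made up.
record = SourceRecord(
    source_id="doc-1",
    text="The service was slow but the staff were kind.",
    attribution="hypothetical-corpus",
)
record.annotate(Annotation("model-a", "algorithm", "negative"))
record.annotate(Annotation("model-b", "algorithm", "positive"))
record.annotate(Annotation("human:reviewer-1", "human", "mixed"))
```

Because no variant is discarded, the choice of which algorithm or reviewer to trust can be deferred and revisited, rather than fixed at the moment the data is stored.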
Notes
1. The term “Multiverse” is widely used in different domains to describe different concepts. In science, it refers to everything that exists in totality [13], as a hypothetical group of multiple universes. In quantum computation, it refers to a reality in which many classical computations can occur simultaneously [19]. In a bibliographic-archival system, where it is referred to as the “Archival Multiverse”, it denotes “the plurality of evidentiary texts (records in multiple forms and cultural contexts), memory-keeping practices and institutions, bureaucratic and personal motivations, community perspectives and needs, and cultural and legal constructs” [24] (Pluralizing the Archival Curriculum Group). In Information Systems, it deals with the complexity, plurality, and increasingly post-physical nature of information flows [31]. Our use of the term “MultiVerse”, with a capitalized ‘V’, denotes a version of our proposed digital infrastructure for a richer metadata representation, one that captures multiple versions of a source data object. The name was chosen partly because the system’s earliest tests focused on translated poetry verses.
References
Ackerman, M.S.: The intellectual challenge of CSCW: the gap between social requirements and technical feasibility. Human-Comput. Interact. 15(2–3), 179–203 (2000)
Al Asaad, B., Erascu, M.: A tool for fake news detection. In: 2018 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), pp. 379–386. IEEE (2018)
Alowaidi, S., Saleh, M., Abulnaja, O.: Semantic sentiment analysis of Arabic texts. Int. J. Adv. Comput. Sci. Appl. 8(2), 256–262 (2017)
Altintas, I., Barney, O., Jaeger-Frank, E.: Provenance collection support in the kepler scientific workflow system. In: Moreau, L., Foster, I. (eds.) IPAW 2006. LNCS, vol. 4145, pp. 118–132. Springer, Heidelberg (2006). https://doi.org/10.1007/11890850_14
Amershi, S., Cakmak, M., Knox, W.B., Kulesza, T.: Power to the people: the role of humans in interactive machine learning. AI Mag. 35(4), 105–120 (2014)
Ananny, M., Crawford, K.: Seeing without knowing: limitations of the transparency ideal and its application to algorithmic accountability. New Media Soc. 20(3), 973–989 (2018)
Angwin, J., Parris Jr, T., Mattu, S.: Breaking the black box: when algorithms decide what you pay. ProPublica (2016)
Angwin, J., Larson, J., Mattu, S., Kirchner, L.: Machine bias: there’s software used across the country to predict future criminals and it’s biased against blacks (2016). https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing. Accessed 2019
Athar, A., Teufel, S.: Context-enhanced citation sentiment detection. In: Proceedings of the 2012 conference of the North American chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 597–601 (2012)
Bavoil, L., et al.: Vistrails: enabling interactive multiple-view visualizations. In: VIS 05. IEEE Visualization, pp. 135–142. IEEE (2005)
Bostrom, N.: Superintelligence: Paths, Dangers, Strategies. Oxford University Press, Oxford (2014)
Cambria, E., Olsher, D., Rajagopal, D.: SenticNet 3: a common and common-sense knowledge base for cognition-driven sentiment analysis. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, pp. 1515–1521 (2014)
Carr, B., Ellis, G.: Universe or multiverse? Astron. Geophys. 49(2), 2–29 (2008)
Cellan-Jones, R.: Stephen Hawking warns artificial intelligence could end mankind. BBC News 2(2014), 10 (2014)
Crawford, K.: Can an algorithm be agonistic? Ten scenes from life in calculated publics. Sci. Technol. Human Values 41(1), 77–92 (2016)
Davidson, S.B., Freire, J.: Provenance and scientific workflows: challenges and opportunities. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp. 1345–1350 (2008)
The Dartmouth Dante Project (DDP): Multiple translations of Comedia di Dante degli Allaghieri col commento di Jacopo della Lana bolognese, a cura di Luciano Scarabelli (Bologna: Tipografia Regia, 1866–67), as found on Dante Lab (2013). http://dantelab.dartmouth.edu
Desai, D.R., Kroll, J.A.: Trust but verify: a guide to algorithms and the law. Harv. JL Tech. 31, 1 (2017)
Deutsch, D.: The structure of the multiverse. Proc. R. Soc. London. Ser. A: Math. Phys. Eng. Sci. 458(2028), 2911–2923 (2002)
Dos Santos, C., Gatti, M.: Deep convolutional neural networks for sentiment analysis of short texts. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pp. 69–78 (2014)
Dridi, A., Atzeni, M., Recupero, D.R.: FineNews: fine-grained semantic sentiment analysis on financial microblogs and news. Int. J. Mach. Learn. Cybern. 10(8), 2199–2207 (2019). https://doi.org/10.1007/s13042-018-0805-x
Drozdal, J., et al.: Trust in automl: exploring information needs for establishing trust in automated machine learning systems. In: Proceedings of the 25th International Conference on Intelligent User Interfaces, pp. 297–307 (2020)
Dwork, C., Mulligan, D.K.: It’s not privacy, and it’s not fair. Stan. Law Rev. Online 66, 35 (2013)
The Archival Education and Research Institute (AERI), Pluralizing the Archival Curriculum Group (PACG): Educating for the archival multiverse. The American Archivist, pp. 69–101 (2011)
El Alaoui, I., Gahi, Y., Messoussi, R., Chaabi, Y., Todoskoff, A., Kobi, A.: A novel adaptable approach for sentiment analysis on big social data. J. Big Data 5(1), 12 (2018)
Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: From data mining to knowledge discovery in databases. AI Mag. 17(3), 37 (1996)
Freire, J., Koop, D., Santos, E., Silva, C.T.: Provenance for computational tasks: a survey. Comput. Sci. Eng. 10(3), 11–21 (2008)
Gao, H., Barbier, G., Goolsby, R.: Harnessing the crowdsourcing power of social media for disaster relief. IEEE Intell. Syst. 26(3), 10–14 (2011)
Garfinkel, P.: A linguist who cracks the code in names to predict ethnicity. New York Times (2016)
Gil, Y., et al.: Towards human-guided machine learning. In: Proceedings of the 24th International Conference on Intelligent User Interfaces, pp. 614–624 (2019)
Gilliland, A.J., Willer, M.: Metadata for the information multiverse. In: iConference 2014 Proceedings (2014)
Goebel, R.: Explainable AI: the new 42? In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2018. LNCS, vol. 11015, pp. 295–303. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99740-7_21
Grove, W.M., Meehl, P.E.: Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures: the clinical-statistical controversy. Psychol. Public Policy Law 2(2), 293 (1996)
Holzinger, A., Kieseberg, P., Weippl, E., Tjoa, A.M.: Current advances, trends and challenges of machine learning and knowledge extraction: from machine learning to explainable AI. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) CD-MAKE 2018. LNCS, vol. 11015, pp. 1–8. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99740-7_1
Jhaver, S., Birman, I., Gilbert, E., Bruckman, A.: Human-machine collaboration for content regulation: the case of reddit automoderator. ACM Trans. Comput.-Human Interact. (TOCHI) 26(5), 1–35 (2019)
Johnson, C., Taylor, J.: Rejecting technology: a normative defense of fallible officiating. Sport, Ethics Philos. 10(2), 148–160 (2016)
Joy, B.: Why the future doesn’t need us. Wired Mag. 8(4), 238–262 (2000)
Katwala, A.: An algorithm determined UK students’ grades (2020)
Kharif, O.: No credit history? No problem. Lenders are looking at your phone data. Bloomberg.com (2016)
Kurzweil, R.: The Singularity is Near: When Humans Transcend Biology. Penguin, New York (2005)
Lehner, P.E., Mullin, T.M., Cohen, M.S.: A probability analysis of the usefulness of decision aids. In: Machine Intelligence and Pattern Recognition, vol. 10, pp. 427–436. Elsevier (1990)
Licklider, J.C.: Man-computer symbiosis. IRE Trans. Human Factors Electron. 1, 4–11 (1960)
Lintott, C.J., et al.: Galaxy zoo: morphologies derived from visual inspection of galaxies from the Sloan digital sky survey. Mon. Not. R. Astron. Soc. 389(3), 1179–1189 (2008)
Madrigal, A.: Inside facebook’s fast-growing content-moderation effort. The Atlantic (2018)
Makridakis, S.: The forthcoming artificial intelligence (AI) revolution: its impact on society and firms. Futures 90, 46–60 (2017)
Martin, K.: Ethical implications and accountability of algorithms. J. Bus. Ethics 160(4), 835–850 (2019). https://doi.org/10.1007/s10551-018-3921-3
Mateos-Garcia, J.: To err is algorithm: algorithmic fallibility and economic organisation (2017)
Molina-González, M.D., Martínez-Cámara, E., Martín-Valdivia, M.T., Perea-Ortega, J.M.: Semantic orientation for polarity classification in Spanish reviews. Expert Syst. Appl. 40(18), 7250–7257 (2013)
Monti, F., Frasca, F., Eynard, D., Mannion, D., Bronstein, M.M.: Fake news detection on social media using geometric deep learning. arXiv preprint arXiv:1902.06673 (2019)
Mukku, S.S., Choudhary, N., Mamidi, R.: Enhanced sentiment classification of Telugu text using ML techniques. In: SAAIP at IJCAI, vol. 2016, pp. 29–34 (2016)
Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system, p. 4 (2008). https://bitcoin.org/bitcoin.pdf
Nakov, P.: Semantic sentiment analysis of twitter data. arXiv preprint arXiv:1710.01492 (2017)
Oinn, T., et al.: Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics 20(17), 3045–3054 (2004)
O’Neil, C.: Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Broadway Books, Portland (2016)
Peckham, M.: What 7 of the world’s smartest people think about artificial intelligence. Time Magazine (2016)
Peng, J., Liu, Q., Ihler, A., Berger, B.: Crowdsourcing for structured labeling with applications to protein folding (2013)
Piateski, G., Frawley, W.: Knowledge Discovery in Databases. MIT Press, Cambridge (1991)
Rafiq, R.I., Hosseinmardi, H., Han, R., Lv, Q., Mishra, S.: Scalable and timely detection of cyberbullying in online social networks. In: Proceedings of the 33rd Annual ACM Symposium on Applied Computing, pp. 1738–1747 (2018)
Rajput, A.: Natural language processing, sentiment analysis, and clinical analytics. In: Innovation in Health Informatics, pp. 79–97. Elsevier (2020)
Redhu, S., Srivastava, S., Bansal, B., Gupta, G.: Sentiment analysis using text mining: a review. Int. J. Data Sci. Technol. 4(2), 49–53 (2018)
Russakovsky, O., Li, L.J., Fei-Fei, L.: Best of both worlds: human-machine collaboration for object annotation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2121–2131 (2015)
Saif, H., He, Y., Alani, H.: Semantic sentiment analysis of Twitter. In: Cudré-Mauroux, P., et al. (eds.) ISWC 2012. LNCS, vol. 7649, pp. 508–524. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35176-1_32
Saif, H., He, Y., Fernandez, M., Alani, H.: Contextual semantics for sentiment analysis of Twitter. Inf. Process. Manag. 52(1), 5–19 (2016)
Samek, W., Montavon, G., Vedaldi, A., Hansen, L.K., Müller, K.R.: Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, vol. 11700. Springer, Heidelberg (2019). https://doi.org/10.1007/978-3-030-28954-6
Seering, J., Wang, T., Yoon, J., Kaufman, G.: Moderator engagement and community development in the age of algorithms. New Media Soc. 21(7), 1417–1443 (2019)
Stecklow, S.: Why Facebook is losing the war on hate speech in Myanmar (2018). https://www.reuters.com/investigates/special-report/myanmar-facebook-hate
Taylor, T.B.: Judgment day: big data as the big decider. Ph.D. thesis, Wake Forest University (2018)
Vijayanarasimhan, S., Grauman, K.: What’s it going to cost you?: Predicting effort vs. informativeness for multi-label image annotations. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2262–2269. IEEE (2009)
Vondrick, C., Patterson, D., Ramanan, D.: Efficiently scaling up crowd sourced video annotation. Int. J. Comput. Vis. 101(1), 184–204 (2013). https://doi.org/10.1007/s11263-012-0564-1
Wah, C., Van Horn, G., Branson, S., Maji, S., Perona, P., Belongie, S.: Similarity comparisons for interactive fine-grained categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 859–866 (2014)
Wexler, R.: How companies hide software flaws that impact who goes to prison and who gets out. Washington Monthly (2017)
Wisser, L.: Pandora’s algorithmic black box: the challenges of using algorithmic risk assessments in sentencing. Am. Crim. L. Rev. 56, 1811 (2019)
Yousif, A., Niu, Z., Tarus, J.K., Ahmad, A.: A survey on sentiment analysis of scientific citations. Artif. Intell. Rev. 52(3), 1805–1838 (2019). https://doi.org/10.1007/s10462-017-9597-8
Ziewitz, M.: Governing algorithms: myth, mess, and methods. Sci. Technol. Human Values 41(1), 3–16 (2016)
Zinovyeva, E., Härdle, W.K., Lessmann, S.: Antisocial online behavior detection using deep learning. Decis. Supp. Syst. 138, 113362 (2020)
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Israel, M.J., Graves, M., Amer, A. (2021). On Trusting a Cyber Librarian: How Rethinking Underlying Data Storage Infrastructure Can Mitigate Risks of Automation. In: Shaghaghi, N., Lamberti, F., Beams, B., Shariatmadari, R., Amer, A. (eds) Intelligent Technologies for Interactive Entertainment. INTETAIN 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 377. Springer, Cham. https://doi.org/10.1007/978-3-030-76426-5_3
DOI: https://doi.org/10.1007/978-3-030-76426-5_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76425-8
Online ISBN: 978-3-030-76426-5
eBook Packages: Computer Science, Computer Science (R0)