{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,19]],"date-time":"2024-09-19T16:08:12Z","timestamp":1726762092986},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"47-48","license":[{"start":{"date-parts":[[2020,8,17]],"date-time":"2020-08-17T00:00:00Z","timestamp":1597622400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,8,17]],"date-time":"2020-08-17T00:00:00Z","timestamp":1597622400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100010607","name":"Universit\u00e0 degli Studi di Perugia","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100010607","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"published-print":{"date-parts":[[2020,12]]},"abstract":"Abstract<\/jats:title>Crowds express emotions as a collective individual, which is evident from the sounds that a crowd produces in particular events, e.g., collective booing, laughing or cheering in sports matches, movies, theaters, concerts, political demonstrations, and riots. A critical question concerning the innovative concept of crowd emotions<\/jats:italic> is whether the emotional content of crowd sounds can be characterized by frequency-amplitude features, using analysis techniques similar to those applied on individual voices, where deep learning classification is applied to spectrogram images derived by sound transformations. In this work, we present a technique based on the generation of sound spectrograms from fragments of fixed length, extracted from original audio clips recorded in high-attendance events, where the crowd acts as a collective individual. Transfer learning techniques are used on a convolutional neural network, pre-trained on low-level features using the well-known ImageNet extensive dataset of visual knowledge. The original sound clips are filtered and normalized in amplitude for a correct spectrogram generation, on which we fine-tune the domain-specific features. Experiments held on the finally trained Convolutional Neural Network show promising performances of the proposed model to classify the emotions of the crowd.<\/jats:p>","DOI":"10.1007\/s11042-020-09428-x","type":"journal-article","created":{"date-parts":[[2020,8,17]],"date-time":"2020-08-17T03:28:11Z","timestamp":1597634891000},"page":"36063-36075","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":26,"title":["Emotional sounds of crowds: spectrogram-based analysis using deep learning"],"prefix":"10.1007","volume":"79","author":[{"given":"Valentina","family":"Franzoni","sequence":"first","affiliation":[]},{"given":"Giulio","family":"Biondi","sequence":"additional","affiliation":[]},{"given":"Alfredo","family":"Milani","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,8,17]]},"reference":[{"key":"9428_CR1","doi-asserted-by":"publisher","unstructured":"Bhor HN, Koul T, Malviya R, Mundra K (2018) Digital media marketing using trend analysis on social media. Proceedings of the 2nd International Conference on Inventive Systems and Control, ICISC 2018, pp 1398\u20131400. doi: https:\/\/doi.org\/10.1109\/ICISC.2018.8399038","DOI":"10.1109\/ICISC.2018.8399038"},{"key":"9428_CR2","doi-asserted-by":"crossref","unstructured":"Biondi G, Franzoni V, Gervasi O, Perri D (2019) An approach for improving automatic mouth emotion recognition BT - computational science and its applications \u2013 ICCSA 2019. pp 649\u2013664","DOI":"10.1007\/978-3-030-24289-3_48"},{"key":"9428_CR3","doi-asserted-by":"crossref","unstructured":"Biondi G, Franzoni V, Poggioni V (2017) A deep learning semantic approach to emotion recognition using the IBM watson bluemix alchemy language, vol. 10406 LNCS","DOI":"10.1007\/978-3-319-62398-6_51"},{"issue":"5","key":"9428_CR4","doi-asserted-by":"publisher","first-page":"335","DOI":"10.1177\/1059712316664187","volume":"24","author":"A Bonarini","year":"2016","unstructured":"Bonarini A (2016) Can my robotic home cleaner be happy? Issues about emotional expression in non-bio-inspired robots. Adapt Behav 24(5):335\u2013349","journal-title":"Adapt Behav"},{"key":"9428_CR5","doi-asserted-by":"crossref","unstructured":"Canales L, Martinez-Barco P (2014) Emotion detection from text: a survey. Processing in the 5th Information Systems Research Working Days (JISIC 2014), pp 37\u201343O","DOI":"10.3115\/v1\/W14-6905"},{"key":"9428_CR6","doi-asserted-by":"crossref","unstructured":"Chen L, Zhang A, Lou X (2019) Cross-subject driver status detection from physioLogical signals based on hybrid feature selection and transfer learning. Expert Syst Appl","DOI":"10.1016\/j.eswa.2019.02.005"},{"key":"9428_CR7","doi-asserted-by":"crossref","unstructured":"Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"9428_CR8","doi-asserted-by":"crossref","unstructured":"Deng JJ, Leung CHC, Milani A, Chen L (2015) Emotional states associated with music. ACM Trans Interact Intell Syst","DOI":"10.1145\/2723575"},{"key":"9428_CR9","doi-asserted-by":"publisher","unstructured":"Du J, Xu J, Song H-Y, Tao C (2017) Leveraging machine learning-based approaches to assess human papillomavirus vaccination sentiment trends with Twitter data. BMC Med Inform Decis Making 17 art no 69. doi: https:\/\/doi.org\/10.1186\/s12911-017-0469-6","DOI":"10.1186\/s12911-017-0469-6"},{"issue":"7","key":"9428_CR10","doi-asserted-by":"publisher","first-page":"1072","DOI":"10.1177\/14614448156259","volume":"19","author":"S Dvir-Gvirsman","year":"2017","unstructured":"Dvir-Gvirsman S (2017) Media audience homophily: Partisan websites, audience identity and polarization processes. New Media and Society 19(7):1072\u20131091. https:\/\/doi.org\/10.1177\/14614448156259","journal-title":"New Media and Society"},{"key":"9428_CR11","unstructured":"EBU R 128\u20132014 (2014) Loudness normalisation and permitted maximum level of audio signals"},{"key":"9428_CR12","doi-asserted-by":"crossref","unstructured":"Ekman P (1992) An argument for basic emotions. Cogn Emot","DOI":"10.1037\/0033-295X.99.3.550"},{"issue":"8","key":"9428_CR13","doi-asserted-by":"publisher","first-page":"1915","DOI":"10.1109\/TPAMI.2012.231","volume":"35","author":"C Farabet","year":"2013","unstructured":"Farabet C, Couprie C, Najman L, LeCun Y (2013) Learning hierarchical features for scene labeling. IEEE Trans Pattern Anal Mach Intell 35(8):1915\u20131929","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"9428_CR14","doi-asserted-by":"crossref","unstructured":"Fayek HM, Lech M, Cavedon L (2015) Towards real-time speech emotion recognition using deep neural networks. In 2015, 9th International Conference on Signal Processing and Communication Systems, ICSPCS 2015 - Proceedings","DOI":"10.1109\/ICSPCS.2015.7391796"},{"key":"9428_CR15","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1016\/j.neunet.2017.02.013","volume":"92","author":"HM Fayek","year":"2017","unstructured":"Fayek HM, Lech M, Cavedon L (2017) Evaluating deep learning architectures for speech emotion recognition. Neural Netw 92:60\u201368","journal-title":"Neural Netw"},{"key":"9428_CR16","unstructured":"Forsell M (2007) Acoustic correlates of perceived emotions in speech. Infancy"},{"key":"9428_CR17","unstructured":"Franzoni V, Biondi G, Milani A (2019) Crowd emotional sounds: spectrogram-based analysis using convolutional neural networks. In SAT 2019 Proceedings of the Workshop Socio-Affective TechnoLogies: an interdisciplinary approach co-located with IEEE SMC 2019 (Systems, Man and Cybernetics), pp 32\u201336"},{"key":"9428_CR18","doi-asserted-by":"crossref","unstructured":"Franzoni V, Milani A, Biondi G, Micheli F (2019) A Preliminary work on dog emotion recognition. In IEEE\/WIC\/ACM International Conference on Web Intelligence - Companion Volume, pp 91\u201396","DOI":"10.1145\/3358695.3361750"},{"key":"9428_CR19","doi-asserted-by":"crossref","unstructured":"Gervasi O, Franzoni V, Riganelli M, Tasso S (2019) Automating facial emotion recognition. Web Intell","DOI":"10.3233\/WEB-190397"},{"key":"9428_CR20","unstructured":"Hawks H (1932) The crowd roar, Warner bros. https:\/\/en.wikipedia.org\/wiki\/The_Crowd_Roars_(1932_film)"},{"key":"9428_CR21","doi-asserted-by":"crossref","unstructured":"Huang Z, Dong M, Mao Q, Zhan Y (2014) Speech emotion recognition using CNN","DOI":"10.1145\/2647868.2654984"},{"key":"9428_CR22","doi-asserted-by":"crossref","unstructured":"Kim Y, Kim Y (2019) Incivility on facebook and political polarization: the mediating role of seeking further comments and negative emotion. Comput Human Behavior 99 pp 219\u2013227, Elsevier 2019","DOI":"10.1016\/j.chb.2019.05.022"},{"key":"9428_CR23","unstructured":"Krizhevsky A, Sutskever I, Hinton GE (2012) AlexNet. Adv Neural Inf Process Syst"},{"key":"9428_CR24","unstructured":"Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In ImageNet classification with deep convolutional neural networks"},{"issue":"4","key":"9428_CR25","doi-asserted-by":"publisher","first-page":"363","DOI":"10.25046\/aj030437","volume":"3","author":"M Lech","year":"2018","unstructured":"Lech M, Stolar M, Bolia R, Skinner M (2018) Amplitude-frequency analysis of emotional speech using transfer learning and classification of spectrogram images. Adv Sci Technol Eng Syst J 3(4):363\u2013371","journal-title":"Adv Sci Technol Eng Syst J"},{"key":"9428_CR26","doi-asserted-by":"crossref","unstructured":"Liu X, Cheung G, Ji X, Zhao D, Gao W (2019) Graph-based joint dequantization and contrast enhancement of poorly lit JPEG images. IEEE Trans Image Process","DOI":"10.1109\/TIP.2018.2872871"},{"key":"9428_CR27","doi-asserted-by":"crossref","unstructured":"Mirsamadi S, Barsoum E, Zhang C (2017) Automatic speech emotion recognition using recurrent neural networks with local attention. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings","DOI":"10.1109\/ICASSP.2017.7952552"},{"key":"9428_CR28","doi-asserted-by":"crossref","unstructured":"Moore BCJ, Glasberg BR (1983) Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J Acoust Soc Am","DOI":"10.1121\/1.389861"},{"key":"9428_CR29","doi-asserted-by":"crossref","unstructured":"Prasomphan S (2015) Detecting human emotion via speech recognition by using speech spectrogram. In 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pp 1\u201310","DOI":"10.1109\/DSAA.2015.7344793"},{"key":"9428_CR30","unstructured":"Quatieri TF (1993) Energy separation in signal modulations with application to speech analysis. IEEE Trans Signal Process"},{"key":"9428_CR31","doi-asserted-by":"crossref","unstructured":"Riganelli M, Franzoni V, Gervasi O, Tasso S (2017) EmEx, a tool for automated emotive face recognition using convolutional neural networks, vol. 10406 LNCS","DOI":"10.1007\/978-3-319-62398-6_49"},{"key":"9428_CR32","doi-asserted-by":"publisher","first-page":"101003","DOI":"10.1016\/j.jocs.2019.05.009","volume":"36","author":"K Sailunaz","year":"2019","unstructured":"Sailunaz K, Alhajj R (2019) Emotion and sentiment analysis from Twitter text. Journal of Computational Science 36:101003","journal-title":"Journal of Computational Science"},{"key":"9428_CR33","doi-asserted-by":"crossref","unstructured":"Srinivasan SM, Sangwan RS, Neill CJ, Zu T (2019) Twitter data for predicting election results: Insights from emotion classification. IEEE Technol Soc Mag 38(1)8664560 pp 58\u201363. IEEE Press 2019","DOI":"10.1109\/MTS.2019.2894472"},{"issue":"3","key":"9428_CR34","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1121\/1.1915893","volume":"8","author":"SS Stevens","year":"1937","unstructured":"Stevens SS, Volkmann J, Newman EB (1937) A scale for the measurement of the PsychoLogical magnitude pitch. J. Acoust. Soc. Am. 8(3):185\u2013190","journal-title":"J. Acoust. Soc. Am."},{"key":"9428_CR35","doi-asserted-by":"crossref","unstructured":"Stolar MN, Lech M, Bolia RS, Skinner M (2018) Real time speech emotion recognition using RGB image classification and transfer learning. In 2017, 11th International Conference on Signal Processing and Communication Systems, ICSPCS 2017 - Proceedings","DOI":"10.1109\/ICSPCS.2017.8270472"},{"key":"9428_CR36","doi-asserted-by":"publisher","unstructured":"Yadollahi A, Shahraki AG, Zaiane OR (2017) Current state of text sentiment analysis from opinion to emotion mining. ACM Comput Surv 50(2) n. a25, doi: https:\/\/doi.org\/10.1145\/3057270","DOI":"10.1145\/3057270"},{"issue":"2","key":"9428_CR37","doi-asserted-by":"publisher","first-page":"248","DOI":"10.1121\/1.1908630","volume":"33","author":"E Zwicker","year":"1961","unstructured":"Zwicker E (1961) Subdivision of the audible frequency range into critical bands (Frequenzgruppen). J Acoust Soc Am 33(2):248\u2013248","journal-title":"J Acoust Soc Am"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-020-09428-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-020-09428-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-020-09428-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,8,17]],"date-time":"2021-08-17T00:15:35Z","timestamp":1629159335000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-020-09428-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,17]]},"references-count":37,"journal-issue":{"issue":"47-48","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["9428"],"URL":"https:\/\/doi.org\/10.1007\/s11042-020-09428-x","relation":{},"ISSN":["1380-7501","1573-7721"],"issn-type":[{"value":"1380-7501","type":"print"},{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,8,17]]},"assertion":[{"value":"19 January 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 June 2020","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 July 2020","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 August 2020","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}