{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,13]],"date-time":"2024-08-13T19:31:49Z","timestamp":1723577509465},"reference-count":103,"publisher":"Springer Science and Business Media LLC","issue":"24","license":[{"start":{"date-parts":[[2022,6,25]],"date-time":"2022-06-25T00:00:00Z","timestamp":1656115200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,6,25]],"date-time":"2022-06-25T00:00:00Z","timestamp":1656115200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100005376","name":"Mid Sweden University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100005376","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"published-print":{"date-parts":[[2022,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Since 2008, a variety of systems have been designed to detect events in security cameras. There are also more than a hundred journal articles and conference papers published in this field. However, no survey has focused on recognizing events in the surveillance system. Thus, motivated us to provide a comprehensive review of the different developed event detection systems. We start our discussion with the pioneering methods that used the TRECVid-SED dataset and then developed methods using VIRAT dataset in TRECVid evaluation. To better understand the designed systems, we describe the components of each method and the modifications of the existing method separately. We have outlined the significant challenges related to untrimmed security video action detection. Suitable metrics are also presented for assessing the performance of the proposed models. 
Our study indicated that the majority of researchers classified events into two groups on the basis of the number of participants and the duration of the event for the TRECVid-SED Dataset. Depending on the group of events, one or more models to identify all the events were used. For the VIRAT dataset, object detection models to localize the first stage activities were used throughout the work. Except one study, a 3D convolutional neural network (3D-CNN) to extract Spatio-temporal features or classifying different activities were used. From the review that has been carried, it is possible to conclude that developing an automatic surveillance event detection system requires three factors: accurate and fast object detection in the first stage to localize the activities, and classification model to draw some conclusion from the input values.<\/jats:p>","DOI":"10.1007\/s11042-021-11864-2","type":"journal-article","created":{"date-parts":[[2022,6,25]],"date-time":"2022-06-25T01:02:41Z","timestamp":1656118961000},"page":"35463-35501","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Event detection in surveillance videos: a review"],"prefix":"10.1007","volume":"81","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-7320-2306","authenticated-orcid":false,"given":"Abdolamir","family":"Karbalaie","sequence":"first","affiliation":[]},{"given":"Farhad","family":"Abtahi","sequence":"additional","affiliation":[]},{"given":"M\u00e5rten","family":"Sj\u00f6str\u00f6m","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,6,25]]},"reference":[{"key":"11864_CR1","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1016\/j.jvcir.2018.11.035","volume":"58","author":"AA Afiq","year":"2019","unstructured":"Afiq AA, et al. (2019) A review on classifying abnormal behavior in crowd scene. 
J Vis Commun Image Represent 58:285\u2013303","journal-title":"J Vis Commun Image Represent"},{"key":"11864_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1922649.1922653","volume":"43","author":"JK Aggarwal","year":"2007","unstructured":"Aggarwal JK, Ryoo MS (2007) Human activity analysis: a review. ACM Comput Surv 43:1\u201343","journal-title":"ACM Comput Surv"},{"key":"11864_CR3","doi-asserted-by":"crossref","unstructured":"Aggarwal JK, Ryoo MS (2011) Human activity analysis: A review. ACM Comput Surv, vol 43, no 3","DOI":"10.1145\/1922649.1922653"},{"key":"11864_CR4","unstructured":"Al-fedaghi S (2020) Modeling Events and Events of Events in Software Engineering. no 1"},{"key":"11864_CR5","doi-asserted-by":"crossref","unstructured":"Ameya M, Kurokawa S, Hirose M (2012) Millimeter-wave antenna pattern measurement using high extinction ratio Mach-Zehnder modulator. In: Proc 6th Eur Conf Antennas Propagation, EuCAP 2012, pp 2574\u20132577","DOI":"10.1109\/EuCAP.2012.6206542"},{"key":"11864_CR6","unstructured":"Awad G, et al. 
(2016) TRECVID 2016: Evaluating Video search, video event detection, localization, and hyperlinking Gaithersburg"},{"key":"11864_CR7","unstructured":"Awad G et al (2018) TRECVID 2018: Benchmarking Video Activity Detection, Video Captioning and Matching, Video Storytelling Linking and Video Search. Proc TRECVID 2018, pp 1\u201338"},{"key":"11864_CR8","unstructured":"Awad G et al (2019) TRECVID 2019: An Evaluation campaign to benchmark Video Activity Detection, Video Captioning and Matching, and Video Search retrieval, TRECVID 2019, 23sd Int Work Video Retr Eval"},{"issue":"1","key":"11864_CR9","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1007\/s11042-010-0643-7","volume":"51","author":"L Ballan","year":"2011","unstructured":"Ballan L, Bertini M, Del Bimbo A, Seidenari L, Serra G (2011) Event detection and recognition for semantic annotation of video. Multimed Tools Appl 51(1):279\u2013302","journal-title":"Multimed Tools Appl"},{"key":"11864_CR10","unstructured":"Beigi M et al (2018) Object-centric Spatio-Temporal Activity Detection and Recognition"},{"key":"11864_CR11","doi-asserted-by":"publisher","first-page":"480","DOI":"10.1016\/j.eswa.2017.09.029","volume":"91","author":"A Ben Mabrouk","year":"2018","unstructured":"Ben Mabrouk A, Zagrouba E (2018) Abnormal behavior recognition for intelligent video surveillance systems: a review. Expert Syst Appl 91:480\u2013491","journal-title":"Expert Syst Appl"},{"key":"11864_CR12","first-page":"3464","volume":"2016-Augus","author":"A Bewley","year":"2016","unstructured":"Bewley A, Ge Z, Ott L, Ramos F, Upcroft B (2016) Simple online and realtime tracking. 
Proc - Int Conf Image Process ICIP 2016-Augus:3464\u20133468","journal-title":"Proc - Int Conf Image Process ICIP"},{"key":"11864_CR13","doi-asserted-by":"publisher","first-page":"1082","DOI":"10.1063\/1.4791421","volume":"1512","author":"P Bhatt","year":"2013","unstructured":"Bhatt P, Bhatt R, Mukadam MD, Yusuf SM (2013) Prussian blue based molecular magnet K0.3Mn 2.85[cr(CN)6]2snh2o with ferrimagnetic ordering temperature of 60 K. AIP Conf Proc 1512:1082\u20131083","journal-title":"AIP Conf Proc"},{"key":"11864_CR14","doi-asserted-by":"publisher","first-page":"1395","DOI":"10.1109\/ICCV.2005.28","volume":"II","author":"M Blank","year":"2005","unstructured":"Blank M, Gorelick L, Shechtman E, Irani M, Basri R (2005) Actions as space-time shapes. Proc IEEE Int Conf Comput Vis II:1395\u20131402","journal-title":"Proc IEEE Int Conf Comput Vis"},{"key":"11864_CR15","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1007\/978-3-319-46562-3_23","volume":"513","author":"A Bux","year":"2017","unstructured":"Bux A, Angelov P, Habib Z (2017) Vision based human activity recognition: a review. Adv Intell Syst Comput 513:341\u2013371","journal-title":"Adv Intell Syst Comput"},{"key":"11864_CR16","doi-asserted-by":"crossref","unstructured":"Carreira J, Zisserman A (2017) Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, A new Model Kinet. dataset, CoRR, arXiv:abs\/1705.07750, vol 2, pp 3","DOI":"10.1109\/CVPR.2017.502"},{"issue":"3","key":"11864_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1541880.1541882","volume":"41","author":"V Chandola","year":"2009","unstructured":"Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv 41(3): 1\u201358","journal-title":"ACM Comput Surv"},{"key":"11864_CR18","unstructured":"Chang X et al (2019) MMVG-INF-Etrol @ TRECVID 2019: Activities in Extended Video. 
In: 33rd conference on neural information processing systems, no 2017"},{"key":"11864_CR19","unstructured":"Chen J (2017) Informedia @ Trecvid 2017 informedia@TRECVID 2017 MED and AVS"},{"key":"11864_CR20","unstructured":"de Campos TE (2014) A survey on computer vision tools for action recognition, crowd surveillance and suspect retrieval, XXXIV Congr da Soc Bras Comput \u2013 CSBC 2014, no May, pp 1123\u20131132"},{"issue":"August 2018","key":"11864_CR21","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1016\/j.engappai.2018.08.014","volume":"77","author":"C Dhiman","year":"2019","unstructured":"Dhiman C, Vishwakarma DK (2019) A review of state-of-the-art techniques for abnormal human activity recognition. Eng Appl Artif Intell 77(August 2018):21\u201345","journal-title":"Eng Appl Artif Intell"},{"key":"11864_CR22","first-page":"6201","volume":"2019-Octob","author":"C Feichtenhofer","year":"2019","unstructured":"Feichtenhofer C, Fan H, Malik J, He K (2019) Slowfast networks for video recognition. Proc IEEE Int Conf Comput Vis 2019-Octob:6201\u20136210","journal-title":"Proc IEEE Int Conf Comput Vis"},{"issue":"i","key":"11864_CR23","first-page":"1933","volume":"2016-Decem","author":"C Feichtenhofer","year":"2016","unstructured":"Feichtenhofer C, Pinz A, Zisserman A (2016) Convolutional Two-Stream network fusion for video action recognition. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2016-Decem(i):1933\u20131941","journal-title":"Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit"},{"issue":"4","key":"11864_CR24","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1109\/MMUL.2005.87","volume":"12","author":"ARJ Francois","year":"2005","unstructured":"Francois ARJ, Nevatia R, Hobbs J, Bolles RC, Smith JR (2005) VERL: An ontology framework for representing and annotating video events. 
IEEE Multimed 12(4):76\u201386","journal-title":"IEEE Multimed"},{"key":"11864_CR25","doi-asserted-by":"crossref","unstructured":"Gleason J, Ranjan R, Schwarcz S, Castillo CD, Chen JC, Chellappa R (2019) A proposal-based solution to spatio-temporal action detection in untrimmed videos. In: Proc - 2019 IEEE Winter Conf Appl Comput Vision, WACV 2019, pp 141\u2013150","DOI":"10.1109\/WACV.2019.00021"},{"key":"11864_CR26","doi-asserted-by":"crossref","unstructured":"Gleason J, Ranjan R, Schwarcz S, Castillo C, Chen J-C, Chellappa R (2019) A proposal-based solution to spatio-temporal action detection in untrimmed videos. In: 2019 IEEE winter conference on applications of computer vision (WACV), pp 141\u2013150","DOI":"10.1109\/WACV.2019.00021"},{"key":"11864_CR27","unstructured":"Gu C, Sun C, Ross DA, Toderici G, Pantofaru C, Ricco S (2018) AVA A video dataset of atomic visual actions. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 6047\u20136056"},{"key":"11864_CR28","unstructured":"Hakeem A, Sheikh Y, Shah M (2004) CASE E: a hierarchical event representation for the analysis of videos. In: Proc Natl Conf Artif Intell, pp 263\u2013268"},{"key":"11864_CR29","first-page":"3154","volume":"2018-Janua","author":"K Hara","year":"2017","unstructured":"Hara K, Kataoka H, Satoh Y (2017) Learning spatio-Temporal features with 3D residual networks for action recognition. Proc - 2017 IEEE Int Conf Comput Vis Work ICCVW 2017 2018-Janua:3154\u20133160","journal-title":"Proc - 2017 IEEE Int Conf Comput Vis Work ICCVW 2017"},{"key":"11864_CR30","doi-asserted-by":"crossref","unstructured":"Hara K, Kataoka H, Satoh Y (2018) Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?. 
Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit, pp 6546\u20136555","DOI":"10.1109\/CVPR.2018.00685"},{"issue":"1","key":"11864_CR31","doi-asserted-by":"publisher","first-page":"28","DOI":"10.12720\/joig.2.1.28-32","volume":"2","author":"M Hassan","year":"2014","unstructured":"Hassan M, Ahmad T, Farooq A, Ali SA, hassan SR, Liaqat N (2014) A review on human actions recognition using vision based techniques. J Image Graph 2(1):28\u201332","journal-title":"J Image Graph"},{"issue":"3","key":"11864_CR32","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1109\/TPAMI.2014.2345390","volume":"37","author":"JF Henriques","year":"2015","unstructured":"Henriques JF, Caseiro R, Martins P, Batista J (2015) High-speed tracking with kernelized correlation filters. IEEE Trans Pattern Anal Mach Intell 37(3):583\u2013596","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"11864_CR33","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1016\/j.imavis.2017.01.010","volume":"60","author":"S Herath","year":"2017","unstructured":"Herath S, Harandi M, Porikli F (2017) Going deeper into action recognition: a survey. Image Vis Comput 60:4\u201321","journal-title":"Image Vis Comput"},{"key":"11864_CR34","doi-asserted-by":"crossref","unstructured":"Hou R, Chen C, Shah M (2017) An end-to-end 3d convolu- tional neural network for action detection and segmentation in videos. arXiv:1712.01111","DOI":"10.1109\/ICCV.2017.620"},{"issue":"3","key":"11864_CR35","doi-asserted-by":"publisher","first-page":"334","DOI":"10.1109\/TSMCC.2004.829274","volume":"34","author":"W Hu","year":"2004","unstructured":"Hu W, Tan T, Wang L, Maybank S (2004) A survey on visual surveillance of object motion and behaviors. 
IEEE Trans Syst Man Cybern Part C Appl Rev 34(3):334\u2013352","journal-title":"IEEE Trans Syst Man Cybern Part C Appl Rev"},{"issue":"PART 2","key":"11864_CR36","first-page":"788","volume":"5303 LNCS","author":"C Huang","year":"2008","unstructured":"Huang C, Wu B, Nevatia R (2008) Robust object tracking by hierarchical association of detection responses. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 5303 LNCS(PART 2):788\u2013801","journal-title":"Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics)"},{"key":"11864_CR37","doi-asserted-by":"crossref","unstructured":"J\u00e9gou H, Douze M, Schmid C, P\u00e9rez P (2010) Aggregating local descriptors into a compact image representation. In: Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit, pp 3304\u20133311","DOI":"10.1109\/CVPR.2010.5540039"},{"key":"11864_CR38","doi-asserted-by":"crossref","unstructured":"Jiang L, Hauptmann AG, Xiang G (2012) Leveraging high-level and low-level features for multimedia event detection, MM 2012 - Proc. 20th ACM Int Conf Multimed, pp 449\u2013458","DOI":"10.1145\/2393347.2393412"},{"key":"11864_CR39","unstructured":"Jiang RSY-G, Liu J, Roshan Zamir A, Toderici G, Laptev I, Shah M (2013) THUMOS challenge: Action recognition with a large number of classes. http:\/\/crcv.ucf.edu\/ICCV13-Action-Workshop\/"},{"key":"11864_CR40","doi-asserted-by":"crossref","unstructured":"Karpathy A, Toderici G, Shetty S, Leung T, Sukthankar R, Fei-Fei L (2014) Large-scale video classification with convolutional neural networks. 
In: 2014 IEEE conference on computer vision and pattern recognition, pp 1725\u20131732","DOI":"10.1109\/CVPR.2014.223"},{"issue":"2","key":"11864_CR41","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1109\/TPAMI.2008.57","volume":"31","author":"R Kasturi","year":"2009","unstructured":"Kasturi R et al (2009) Framework for performance evaluation of face, text, and vehicle detection and tracking in video: Data, metrics, and protocol. IEEE Trans Pattern Anal Mach Intell 31(2):319\u2013336","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"11864_CR42","unstructured":"Kay W, Carreira J, Simonyan K, Zhang B, Hillier C, Vijayanarasimhan S, Viola F, Green T, Back T, Natsev P et al (2017) The kinetics human action video dataset. arXiv:1705.06950"},{"issue":"4","key":"11864_CR43","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1007\/s13735-017-0133-z","volume":"6","author":"MY Kazi Tani","year":"2017","unstructured":"Kazi Tani MY, Ghomari A, Lablack A, Bilasco IM (2017) OVIS: Ontology video surveillance indexing and retrieval system. Int J Multimed Inf Retr 6(4):295\u2013316","journal-title":"Int J Multimed Inf Retr"},{"key":"11864_CR44","doi-asserted-by":"crossref","unstructured":"Ke SR, Thuc HLU, Lee YJ, Hwang JN, Yoo JH, Choi KH (2013) A review on video-based human activity recognition. vol 2, no 2","DOI":"10.3390\/computers2020088"},{"key":"11864_CR45","doi-asserted-by":"crossref","unstructured":"Ko T (2008) A survey on behavior analysis in video surveillance for homeland security applications, Proc - Appl Imag Pattern Recognit Work","DOI":"10.1109\/AIPR.2008.4906450"},{"key":"11864_CR46","unstructured":"Kong Y, Fu Y (2018) Human Action Recognition and Prediction: A Survey. vol 13, no 9,"},{"key":"11864_CR47","doi-asserted-by":"crossref","unstructured":"Kuehne H, Jhuang H, Garrote E, Poggio T, Serre T (2011) HMDB: A large video database for human motion recognition. 
In: Proceedings of the IEEE international conference on computer vision, pp 2556\u20132563","DOI":"10.1109\/ICCV.2011.6126543"},{"issue":"3","key":"11864_CR48","doi-asserted-by":"publisher","first-page":"367","DOI":"10.1109\/TCSVT.2014.2358029","volume":"25","author":"T Li","year":"2015","unstructured":"Li T, Chang H, Wang M, Ni B, Hong R, Yan S (2015) Crowded scene analysis: a survey. IEEE Trans Circuits Syst Video Technol 25(3):367\u2013386","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"11864_CR49","doi-asserted-by":"crossref","unstructured":"Li W, Wong Y, Liu AA, Li Y, Su YT, Kankanhalli M (2017) Multi-camera action dataset for cross-camera action recognition benchmarking. In: Proc - 2017 IEEE Winter Conf. Appl. Comput. Vision, WACV 2017, pp 187\u2013196","DOI":"10.1109\/WACV.2017.28"},{"key":"11864_CR50","first-page":"1895","volume":"4","author":"A Martin","year":"1997","unstructured":"Martin A, Martin A, Doddington G, Kamm T, Ordowski M, Przybocki M (1997) . The DET Curve in Assessment of Detection Task Performance 4:1895\u20131898","journal-title":"The DET Curve in Assessment of Detection Task Performance"},{"issue":"6\u20137","key":"11864_CR51","doi-asserted-by":"publisher","first-page":"421","DOI":"10.1016\/j.imavis.2013.03.005","volume":"31","author":"D Metaxas","year":"2013","unstructured":"Metaxas D, Zhang S (2013) A review of motion analysis methods for human nonverbal communication computing. Image Vis Comput 31(6\u20137):421\u2013433","journal-title":"Image Vis Comput"},{"key":"11864_CR52","doi-asserted-by":"crossref","unstructured":"Oh S et al (2011) AVSS 2011 demo session: A large-scale benchmark dataset for event recognition in surveillance video. 
In: 2011 8th IEEE international conference on advanced video and signal based surveillance (AVSS), no 3, pp 527\u2013528","DOI":"10.1109\/AVSS.2011.6027400"},{"key":"11864_CR53","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1016\/j.eswa.2016.06.011","volume":"63","author":"L Onofri","year":"2016","unstructured":"Onofri L, Soda P, Pechenizkiy M, Iannello G (2016) A survey on using domain and contextual knowledge for human activity recognition in video streams. Expert Syst Appl 63:97\u2013111","journal-title":"Expert Syst Appl"},{"key":"11864_CR54","unstructured":"Over P et al (2013) TRECVID 2013 \u2013 An overview of the goals, tasks, data, evaluation mechanisms, and metrics. In: 2013 TREC video retrieval evaluation, TRECVID 2013, no. November"},{"issue":"12","key":"11864_CR55","doi-asserted-by":"publisher","first-page":"3448","DOI":"10.1016\/j.comnet.2007.02.001","volume":"51","author":"A Patcha","year":"2007","unstructured":"Patcha A, Park JM (2007) An overview of anomaly detection techniques: Existing solutions and latest technological trends. Comput Networks 51 (12):3448\u20133470","journal-title":"Comput Networks"},{"key":"11864_CR56","unstructured":"Phan S et al (2017) NII Hitachi UIT at TRECVID 2017"},{"key":"11864_CR57","doi-asserted-by":"crossref","unstructured":"Pirsiavash H, Ramanan D (2012) Detecting activities of daily living in first-person camera views. In: 2012 IEEE conference on computer vision and pattern recognition, pp 2847\u20132854","DOI":"10.1109\/CVPR.2012.6248010"},{"issue":"6","key":"11864_CR58","doi-asserted-by":"publisher","first-page":"865","DOI":"10.1109\/TSMCC.2011.2178594","volume":"42","author":"OP Popoola","year":"2012","unstructured":"Popoola OP, Wang K (2012) Video-based abnormal human behavior recognitiona review. 
IEEE Trans Syst Man Cybern Part C Appl Rev 42(6):865\u2013878","journal-title":"IEEE Trans Syst Man Cybern Part C Appl Rev"},{"issue":"6","key":"11864_CR59","doi-asserted-by":"publisher","first-page":"976","DOI":"10.1016\/j.imavis.2009.11.014","volume":"28","author":"R Poppe","year":"2010","unstructured":"Poppe R (2010) A survey on vision-based human action recognition. Image Vis Comput 28(6):976\u2013990","journal-title":"Image Vis Comput"},{"key":"11864_CR60","doi-asserted-by":"crossref","unstructured":"Qu\u00e9not G, Joly P, Benois-Pineau J (2012) Evaluation of visual information indexing and retrieval, pp 83\u201396","DOI":"10.1007\/978-1-4614-3588-4_6"},{"key":"11864_CR61","doi-asserted-by":"publisher","first-page":"107560","DOI":"10.1109\/ACCESS.2019.2932114","volume":"7","author":"M Ramzan","year":"2019","unstructured":"Ramzan M et al (2019) A review on state-of-the-art violence detection techniques. IEEE Access 7:107560\u2013107575","journal-title":"IEEE Access"},{"key":"11864_CR62","unstructured":"Rana AJ et al (2019) An Online System for Real-Time Activity Detection in Untrimmed Surveillance Videos"},{"key":"11864_CR63","doi-asserted-by":"crossref","unstructured":"Ranjan R, Gleason J, Schwarcz S, Castillo CD, Chen JC, Chellappa R (2020) Spatio-temporal action detection in untrimmed videos. In: 2018 TREC Video Retrieval Evaluation, TRECVID 2018","DOI":"10.1109\/WACV.2019.00021"},{"issue":"5","key":"11864_CR64","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1109\/TSMCC.2010.2042446","volume":"40","author":"TD R\u00e4ty","year":"2010","unstructured":"R\u00e4ty TD (2010) Survey on contemporary remote surveillance systems for public safety. 
IEEE Trans Syst Man Cybern Part C Appl Rev 40(5):493\u2013515","journal-title":"IEEE Trans Syst Man Cybern Part C Appl Rev"},{"key":"11864_CR65","first-page":"1689","volume":"2018-Janua","author":"M Ravanbakhsh","year":"2018","unstructured":"Ravanbakhsh M, Nabi M, Mousavi H, Sangineto E, Sebe N (2018) Plug-and-play CNN for crowd motion analysis: an application in abnormal event detection. Proc - 2018 IEEE Winter Conf Appl Comput Vision, WACV 2018 2018-Janua:1689\u20131698","journal-title":"Proc - 2018 IEEE Winter Conf Appl Comput Vision, WACV 2018"},{"issue":"6","key":"11864_CR66","doi-asserted-by":"publisher","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","volume":"39","author":"S Ren","year":"2017","unstructured":"Ren S, He K, Girshick R, Sun J (2017) Faster r-CNN: towards Real-Time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137\u20131149","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"11864_CR67","doi-asserted-by":"crossref","unstructured":"Rose T, Fiscus J, Over P, Garofolo J, Michel M (2009) The TRECVid 2008 event detection evaluation. In: 2009 workshop on applications of computer vision (WACV), pp 1\u20138","DOI":"10.1109\/WACV.2009.5403089"},{"key":"11864_CR68","unstructured":"Saha S, Cuzzolin F (2015)"},{"issue":"8","key":"11864_CR69","first-page":"1951","volume":"45","author":"V Sangeetha","year":"2006","unstructured":"Sangeetha V, Prasad KJR (2006) Syntheses of novel derivatives of 2-acetylfuro[2,3-a]carbazoles, benzo[1,2-b]-1,4-thiazepino[2,3-a]carbazoles and 1-acetyloxycarbazole-2- carbaldehydes. 
Indian J Chem - Sect B Org Med Chem 45(8):1951\u20131954","journal-title":"Indian J Chem - Sect B Org Med Chem"},{"key":"11864_CR70","unstructured":"Sch C, Barbara L Recognizing Human Actions: A Local SVM Approach, pp 3\u20137"},{"key":"11864_CR71","doi-asserted-by":"crossref","unstructured":"Scherp A, Franz T, Saathoff C, Staab S (2009) F - A model of events based on the foundational ontology DOLCE+dns ultralite. In: K-CAP\u201909 - Proc 5th Int Conf Knowl Capture, pp 137\u2013144","DOI":"10.1145\/1597735.1597760"},{"key":"11864_CR72","unstructured":"Sharif HU, Saha AK, Arefin KS, Sharif H (2011) Event Detection from Video Streams. vol 01, no 02"},{"issue":"6","key":"11864_CR73","doi-asserted-by":"publisher","first-page":"1257","DOI":"10.1109\/TSMCC.2012.2215319","volume":"42","author":"AA Sodemann","year":"2012","unstructured":"Sodemann AA, Ross MP, Borghetti BJ (2012) A review of anomaly detection in automated surveillance. IEEE Trans Syst Man, Cybern Part C (Applications Rev 42(6):1257\u20131272","journal-title":"IEEE Trans Syst Man, Cybern Part C (Applications Rev"},{"key":"11864_CR74","unstructured":"Soomro K, Zamir AR, Shah M, Recognition A (2012) UCF101: A Dataset Of 101 Human Actions Classes From Videos in The Wild, no November"},{"key":"11864_CR75","first-page":"2325","volume":"2016-Decem","author":"R Stewart","year":"2016","unstructured":"Stewart R, Andriluka M, Ng AY (2016) End-to-end people detection in crowded scenes. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit 2016-Decem:2325\u20132333","journal-title":"Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit"},{"key":"11864_CR76","doi-asserted-by":"crossref","unstructured":"Subetha T, Chitrakala S (2016) A survey on human activity recognition from videos. 
In: 2016 Int Conf Inf Commun Embed Syst ICICES 2016, no Icices, pp 1\u20137","DOI":"10.1109\/ICICES.2016.7518920"},{"issue":"1","key":"11864_CR77","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1007\/s00138-013-0529-6","volume":"25","author":"W Tong","year":"2014","unstructured":"Tong W et al (2014) E-LAMP: Integration of innovative ideas for multimedia event detection. Mach Vis Appl 25(1):5\u201315","journal-title":"Mach Vis Appl"},{"key":"11864_CR78","first-page":"4489","volume":"2015 Inter","author":"D Tran","year":"2015","unstructured":"Tran D, Bourdev L, Fergus R, Torresani L, Paluri M (2015) Learning spatiotemporal features with 3D convolutional networks. Proc IEEE Int Conf Comput Vis 2015 Inter:4489\u20134497","journal-title":"Proc IEEE Int Conf Comput Vis"},{"issue":"2","key":"11864_CR79","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1007\/s10462-017-9545-7","volume":"50","author":"RK Tripathi","year":"2018","unstructured":"Tripathi RK, Jalal AS, Agrawal SC (2018) Suspicious human activity recognition: a review. Artif Intell Rev 50(2):283\u2013339","journal-title":"Artif Intell Rev"},{"issue":"6","key":"11864_CR80","doi-asserted-by":"publisher","first-page":"7585","DOI":"10.1007\/s11042-018-6472-9","volume":"78","author":"RK Tripathi","year":"2019","unstructured":"Tripathi RK, Jalal AS, Agrawal SC (2019) Abandoned or removed object detection from visual surveillance: a review. Multimed Tools Appl 78 (6):7585\u20137620","journal-title":"Multimed Tools Appl"},{"issue":"11","key":"11864_CR81","doi-asserted-by":"publisher","first-page":"1473","DOI":"10.1109\/TCSVT.2008.2005594","volume":"18","author":"P Turaga","year":"2008","unstructured":"Turaga P, Chellappa R, Subrahmanian VS, Udrea O (2008) Machine recognition of human activities: a survey. IEEE Trans Circuits Syst Video Technol 18(11):1473\u20131488","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"11864_CR82","unstructured":"(2012) Tum kitchen data set. 
Technische Universitat Munchen"},{"key":"11864_CR83","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.imavis.2016.05.005","volume":"53","author":"C Tzelepis","year":"2016","unstructured":"Tzelepis C et al (2016) Event-based media processing and analysis: a survey of the literature. Image Vis Comput 53:3\u201319","journal-title":"Image Vis Comput"},{"issue":"10","key":"11864_CR84","doi-asserted-by":"publisher","first-page":"983","DOI":"10.1007\/s00371-012-0752-6","volume":"29","author":"S Vishwakarma","year":"2013","unstructured":"Vishwakarma S, Agrawal A (2013) A survey on activity recognition and behavior understanding in video surveillance. Vis Comput 29(10):983\u20131009","journal-title":"Vis Comput"},{"key":"11864_CR85","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.patrec.2018.02.010","volume":"119","author":"J Wang","year":"2019","unstructured":"Wang J, Chen Y, Hao S, Peng X, Hu L (2019) Deep learning for sensor-based activity recognition: a survey. Pattern Recognit Lett 119:3\u201311","journal-title":"Pattern Recognit Lett"},{"key":"11864_CR86","first-page":"5823","volume":"2017-Octob","author":"X Wang","year":"2017","unstructured":"Wang X, Girshick R, Gupta A, He K (2017) [2018-CVPR] Non-local Neural Networks Cvpr2018, pp. 7794\u20137803, 2018. [11]R. Hou, C. Chen, and M. Shah, Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos. Proc IEEE Int Conf Comput Vis 2017-Octob:5823\u20135832","journal-title":"Proc IEEE Int Conf Comput Vis"},{"issue":"2","key":"11864_CR87","doi-asserted-by":"publisher","first-page":"224","DOI":"10.1016\/j.cviu.2010.10.002","volume":"115","author":"D Weinland","year":"2011","unstructured":"Weinland D, Ronfard R, Boyer E (2011) A survey of vision-based methods for action representation, segmentation and recognition. 
Comput Vis Image Underst 115(2):224\u2013241","journal-title":"Comput Vis Image Underst"},{"key":"11864_CR88","first-page":"3645","volume":"2017-Septe","author":"N Wojke","year":"2018","unstructured":"Wojke N, Bewley A, Paulus D (2018) Simple online and realtime tracking with a deep association metric. Proc - Int Conf Image Process ICIP 2017-Septe:3645\u20133649","journal-title":"Proc - Int Conf Image Process ICIP"},{"issue":"6","key":"11864_CR89","doi-asserted-by":"publisher","first-page":"1063","DOI":"10.1109\/TCSVT.2014.2367352","volume":"25","author":"J Xu","year":"2015","unstructured":"Xu J, Denman S, Sridharan S, Fookes C (2015) An efficient and robust system for multiperson event detection in real-world indoor surveillance scenes. IEEE Trans Circuits Syst Video Technol 25(6):1063\u20131076","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"11864_CR90","unstructured":"Xu J, Fookes C, Sridharan S (2016) Automatic Event Detection for Signal-based Surveillance. pp 1\u201356"},{"key":"11864_CR91","unstructured":"Yang P, Xiong J, Xie D, Pu S (2016) HRI Team@ TRECVID 2016 Surveillance Event detection, pp 2\u20135"},{"key":"11864_CR92","first-page":"622","volume":"11164 LNCS","author":"L Yao","year":"2018","unstructured":"Yao L, Qian Y (2018) DT-3DREsnet-LSTM: An architecture for temporal activity recognition in videos. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 11164 LNCS:622\u2013632","journal-title":"Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics)"},{"issue":"February","key":"11864_CR93","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1016\/j.ijdrr.2017.02.021","volume":"22","author":"B Yogameena","year":"2017","unstructured":"Yogameena B, Nagananthini C (2017) Computer vision based crowd disaster avoidance system: a survey. 
Int J Disaster Risk Reduct 22 (February):95\u2013129","journal-title":"Int J Disaster Risk Reduct"},{"key":"11864_CR94","doi-asserted-by":"crossref","unstructured":"Yoon JH, Yang MH, Lim J, Yoon KJ (2015) Bayesian multi-object tracking using motion context from multiple objects. In: Proc - 2015 IEEE Winter Conf Appl Comput Vision, WACV 2015, pp 33\u201340","DOI":"10.1109\/WACV.2015.12"},{"issue":"4","key":"11864_CR95","first-page":"13","volume":"8","author":"M Zab\u0142ocki","year":"2014","unstructured":"Zab\u0142ocki M, Frejlichowski D, Hofman R, Go\u015bciewska K (2014) Intelligent video surveillance systems for public spaces \u2013 a survey. J Theor Appl Comput Sci 8(4):13\u201327","journal-title":"J Theor Appl Comput Sci"},{"key":"11864_CR96","doi-asserted-by":"crossref","unstructured":"Zach C, Pock T, Bischof H (2007) A duality based approach for realtime TV-l 1 optical flow. In: Pattern recognition, vol. 0, no. x. Springer, Berlin, pp 214\u2013223","DOI":"10.1007\/978-3-540-74936-3_22"},{"issue":"5","key":"11864_CR97","first-page":"1","volume":"19","author":"HB Zhang","year":"2019","unstructured":"Zhang HB et al (2019) A comprehensive survey of vision-based human action recognition methods. Sensors (Switzerland) 19(5):1\u201320","journal-title":"Sensors (Switzerland)"},{"key":"11864_CR98","first-page":"428","volume":"2019","author":"Y Zhao","year":"2019","unstructured":"Zhao Y, Han R, Rao Y (2019) A new feature pyramid network for object detection. Proc - 2019 Int Conf Virtual Real Intell Syst ICVRIS 2019:428\u2013431","journal-title":"Proc - 2019 Int Conf Virtual Real Intell Syst ICVRIS"},{"key":"11864_CR99","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1016\/j.neucom.2017.04.079","volume":"278","author":"Z Zhao","year":"2018","unstructured":"Zhao Z, et al. (2018) A unified framework with a benchmark dataset for surveillance event detection. 
Neurocomputing 278:62\u201374","journal-title":"Neurocomputing"},{"key":"11864_CR100","first-page":"831","volume":"11205 LNCS","author":"B Zhou","year":"2018","unstructured":"Zhou B, Andonian A, Oliva A, Torralba A (2018) Temporal relational reasoning in videos. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 11205 LNCS:831\u2013846","journal-title":"Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics)"},{"key":"11864_CR101","doi-asserted-by":"crossref","unstructured":"Zhou K, Zhu Y, Zhao Y (2017) A spatio-temporal deep architecture for surveillance event detection based on convLSTM. In: 2017 IEEE visual communications and image processing (VCIP), pp 1\u20134","DOI":"10.1109\/VCIP.2017.8305063"},{"issue":"1","key":"11864_CR102","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1007\/s11042-018-6163-6","volume":"78","author":"Y Zhu","year":"2019","unstructured":"Zhu Y, Zhou K, Wang M, Zhao Y, Zhao Z (2019) A comprehensive solution for detecting events in complex surveillance videos. Multimed Tools Appl 78(1):817\u2013838","journal-title":"Multimed Tools Appl"},{"issue":"8","key":"11864_CR103","doi-asserted-by":"publisher","first-page":"2329","DOI":"10.1016\/j.patcog.2015.03.006","volume":"48","author":"M Ziaeefard","year":"2015","unstructured":"Ziaeefard M, Bergevin R (2015) Semantic human activity recognition: a literature review. 
Pattern Recognit 48(8):2329\u20132345","journal-title":"Pattern Recognit"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-021-11864-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-021-11864-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-021-11864-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,23]],"date-time":"2023-11-23T18:46:39Z","timestamp":1700765199000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-021-11864-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,25]]},"references-count":103,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2022,10]]}},"alternative-id":["11864"],"URL":"https:\/\/doi.org\/10.1007\/s11042-021-11864-2","relation":{},"ISSN":["1380-7501","1573-7721"],"issn-type":[{"value":"1380-7501","type":"print"},{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,25]]},"assertion":[{"value":"24 January 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 August 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 December 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 June 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}