{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,19]],"date-time":"2024-08-19T19:09:51Z","timestamp":1724094591561},"reference-count":12,"publisher":"Hindawi Limited","license":[{"start":{"date-parts":[[2018,1,1]],"date-time":"2018-01-01T00:00:00Z","timestamp":1514764800000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004386","name":"Universiti Malaya","doi-asserted-by":"publisher","award":["RP030A-14AET","FP061-2014A"],"id":[{"id":"10.13039\/501100004386","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computational Intelligence and Neuroscience"],"published-print":{"date-parts":[[2018]]},"abstract":"Human detection in videos plays an important role in various real life applications. Most of traditional approaches depend on utilizing handcrafted features which are problem-dependent and optimal for specific tasks. Moreover, they are highly susceptible to dynamical events such as illumination changes, camera jitter, and variations in object sizes. On the other hand, the proposed feature learning approaches are cheaper and easier because highly abstract and discriminative features can be produced automatically without the need of expert knowledge. In this paper, we utilize automatic feature learning methods which combine optical flow and three different deep models (i.e., supervised convolutional neural network (S-CNN), pretrained CNN feature extractor, and hierarchical extreme learning machine) for human detection in videos captured using a nonstatic camera on an aerial platform with varying altitudes. The models are trained and tested on the publicly available and highly challenging UCF-ARG aerial dataset. The comparison between these models in terms of training, testing accuracy, and learning speed is analyzed. 
The performance evaluation considers five human actions (digging, waving, throwing, walking, and running). Experimental results demonstrated that the proposed methods are successful for human detection task. Pretrained CNN produces an average accuracy of 98.09%. S-CNN produces an average accuracy of 95.6% with soft-max and 91.7% with Support Vector Machines (SVM). H-ELM has an average accuracy of 95.9%. Using a normal Central Processing Unit (CPU), H-ELM\u2019s training time takes 445 seconds. Learning in S-CNN takes 770 seconds with a high performance Graphical Processing Unit (GPU).","DOI":"10.1155\/2018\/1639561","type":"journal-article","created":{"date-parts":[[2018,2,12]],"date-time":"2018-02-12T23:30:46Z","timestamp":1518478246000},"page":"1-14","source":"Crossref","is-referenced-by-count":35,"title":["Real-Time Human Detection for Aerial Captured Video Sequences via Deep Models"],"prefix":"10.1155","volume":"2018","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-5522-0033","authenticated-orcid":true,"given":"Nouar","family":"AlDahoul","sequence":"first","affiliation":[{"name":"Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-4758-5400","authenticated-orcid":true,"given":"Aznul Qalid","family":"Md Sabri","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia"}]},{"given":"Ali Mohammed","family":"Mansoor","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia"}]}],"member":"98","reference":[{"key":"2","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000029664.99615.94"},{"key":"6","doi-asserted-by":"publisher","DOI":"10.1049\/iet-cvi.2015.0037"},{"key":"11","first-page":"29","volume-title":"Sequential Deep Learning for Human Action 
Recognition","year":"2011"},{"key":"13","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2015.2424995"},{"key":"17","series-title":"270","first-page":"270","volume-title":"A Scheme for the Detection and Tracking of People Tuned for Aerial Image Sequences","volume":"6952","year":"2011"},{"key":"18","volume-title":"Detection and tracking of humans from an airborne platform","volume":"9249","year":"2014"},{"issue":"12","key":"19","doi-asserted-by":"crossref","first-page":"1273","DOI":"10.1016\/j.robot.2010.06.002","volume":"58","year":"2010","journal-title":"Robotics and Autonomous Systems"},{"issue":"1","key":"22","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/BF01420984","volume":"12","journal-title":"International Journal of Computer Vision"},{"key":"23","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(81)90024-2"},{"key":"24","year":"2006"},{"key":"25","year":"2012"},{"key":"28","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2005.12.126"}],"container-title":["Computational Intelligence and 
Neuroscience"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/cin\/2018\/1639561.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/cin\/2018\/1639561.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/cin\/2018\/1639561.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2018,2,12]],"date-time":"2018-02-12T23:31:16Z","timestamp":1518478276000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.hindawi.com\/journals\/cin\/2018\/1639561\/"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018]]},"references-count":12,"alternative-id":["1639561","1639561"],"URL":"https:\/\/doi.org\/10.1155\/2018\/1639561","relation":{},"ISSN":["1687-5265","1687-5273"],"issn-type":[{"value":"1687-5265","type":"print"},{"value":"1687-5273","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018]]}}}