Abstract
In this paper, we propose a framework that combines pan-tilt-zoom (PTZ) control with DNN-based visual sensing to build an automatic camera shooting system capable of autonomously performing photography and videography tasks in real-world scenarios, such as human-figure photography at commercial and entertainment sites. Building on real-time visual detection and tracking with Kalman filtering and re-identification, we propose a camera control system that maintains continuous lens composition: photographic and videographic strategies, including atomic shot rules and camera trajectory planning, generate the setpoints for a proportional-integral-derivative (PID) controller that drives the pan-tilt unit. We design and build an artificial intelligence (AI) automatic camera for live photography and video clip capture in open environments. We demonstrate the effectiveness of the proposed methods through simulations of photograph cropping and experiments with the AI automatic camera prototypes we developed.
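The sensing-to-control pipeline described above, in which a Kalman filter smooths the tracked target position and a PID controller converts the resulting image-plane error into pan-tilt commands, can be illustrated with a minimal sketch. All gains, noise covariances, and the frame rate below are illustrative assumptions, not values from the paper; in the actual system the PID setpoint would be produced by the composition rules rather than a fixed image center.

```python
import numpy as np

# Constant-velocity Kalman filter for the target's image-plane center,
# a common choice for the tracking stage described in the abstract.
class CVKalman:
    def __init__(self, x, y, dt=1 / 30):
        self.s = np.array([x, y, 0.0, 0.0])   # state: [x, y, vx, vy]
        self.P = np.eye(4) * 10.0             # state covariance
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = dt      # position advances by v*dt
        self.H = np.eye(2, 4)                 # we observe position only
        self.Q = np.eye(4) * 1e-2             # process noise (assumed)
        self.R = np.eye(2) * 1.0              # measurement noise (assumed)

    def predict(self):
        self.s = self.F @ self.s
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.s[:2]                     # predicted (x, y)

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.s     # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.s = self.s + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

# PID controller mapping a pixel error (setpoint minus predicted target
# position, per axis) to a pan or tilt velocity command.
class PID:
    def __init__(self, kp, ki, kd, dt=1 / 30):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, err):
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv
```

In use, each video frame would run the detector/tracker, feed the measured box center into `CVKalman.update`, and pass the error between the composition target and `CVKalman.predict()` through one `PID` instance per axis to command the pan-tilt motors.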
Funding
The presented work was supported by the Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01), the ZHANGJIANG LAB, and the Science and Technology Project (No. 52094022004W) of State Grid Shanghai Electric Power Company.
Author information
Contributions
YR: conceptualization, methodology, programming, computation, model analysis, writing. NY, XY, FT, QT and YW: techniques and resources, coding, investigation. WL: conceptualization, methodology, writing and editing, supervision, funding acquisition.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This paper does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file1 (MP4 24563 KB)
Supplementary file2 (MP4 78291 KB)
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ren, Y., Yan, N., Yu, X. et al. On automatic camera shooting systems via PTZ control and DNN-based visual sensing. Intel Serv Robotics 16, 265–285 (2023). https://doi.org/10.1007/s11370-023-00462-w