Abstract
In Industry 4.0, manufacturing must respond to rapidly changing customer demands, which leads to mass customization. Variation in customer requirements results in small batch sizes and numerous process variations. Assembly is one of the most important steps in any manufacturing process, and factory floor workers often need a guidance system to assist them with assembly tasks amid product and process variations. Existing Augmented Reality (AR) based systems use markers on each assembly component for detection, which is time consuming and laborious. This paper proposed a state-of-the-art deep learning based object detection technique combined with a regression based mapping technique to obtain the 3D locations of assembly components. Automatic detection of machine parts was followed by a multimodal interface involving both eye gaze and hand tracking to guide the manual assembly process. We proposed an eye cursor to guide the user through the task and used fingertip distances along with object sizes to detect errors committed during the task. Analysis of the proposed mapping method showed a mean mapping error of 1.842 cm. We also investigated the effectiveness of the proposed multimodal user interface through two user studies. The first study indicated that the interface with the eye cursor enabled participants to perform the task significantly faster than the interface without it. In the second study, shop floor workers reported that the proposed guidance system was comprehensible and easy to use for completing the assembly task. Results showed that the system enabled 11 end users to finish the assembly of one pneumatic cylinder within 55 s, with an average TLX score below 25 on a scale of 100 and a Cronbach's alpha of 0.8, indicating convergence of the learning experience.
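The regression based 2D-to-3D mapping mentioned in the abstract can be sketched as follows. This is a minimal illustration only, assuming an affine least-squares model fitted on calibration pairs of 2D detection centers (pixel coordinates) and known 3D positions; the function names and the affine formulation are assumptions for illustration, not the paper's exact method:

```python
import numpy as np

def fit_mapping(px_uv, world_xyz):
    """Fit an affine least-squares mapping from 2D pixel detections
    to 3D coordinates using calibration pairs (illustrative model)."""
    # Augment pixel coordinates with a bias column: [u, v, 1]
    A = np.hstack([px_uv, np.ones((len(px_uv), 1))])
    # Solve A @ W ~= world_xyz for the 3x3 weight matrix W
    W, *_ = np.linalg.lstsq(A, world_xyz, rcond=None)
    return W

def map_to_3d(W, uv):
    """Map a single 2D detection center to a 3D location."""
    return np.append(uv, 1.0) @ W

def mean_mapping_error(W, px_uv, world_xyz):
    """Mean Euclidean error over calibration pairs, analogous to
    the mean mapping error reported in the paper."""
    A = np.hstack([px_uv, np.ones((len(px_uv), 1))])
    pred = A @ W
    return float(np.mean(np.linalg.norm(pred - world_xyz, axis=1)))
```

Once fitted, `map_to_3d` converts each detected bounding-box center into a 3D anchor at which AR guidance (e.g. the eye cursor target) can be placed.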
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file 1 (MP4, 90,558 kB)
About this article
Cite this article
Raj, S., Murthy, L.R.D., Shanmugam, T.A. et al. Augmented reality and deep learning based system for assisting assembly process. J Multimodal User Interfaces 18, 119–133 (2024). https://doi.org/10.1007/s12193-023-00428-3