Published by De Gruyter (O) March 10, 2023

Machine learning for the perception of autonomous construction machinery

Maschinelles Lernen für die Perzeption autonomer Baumaschinen
  • Nina Felicitas Heide

    Nina Felicitas Heide received her degree in electrical engineering and information technology from the Karlsruhe Institute of Technology (KIT) in 2017, where she also received her PhD in electrical engineering and information technology in 2022. Her research focuses on perception, machine learning, and explainable artificial intelligence specializing in the research field of autonomous off-road vehicles.

    and Janko Petereit

    Janko Petereit received his degree in electrical engineering and information technology from the Karlsruhe Institute of Technology (KIT) in 2009, where he also received his PhD in informatics in 2016. He now manages the Multi-Sensor Systems research group at the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB in Karlsruhe, Germany. His research focuses on motion planning and multi-sensor fusion for autonomous mobile robots.


Abstract

Robotic systems require holistic capabilities to sense, perceive, and act autonomously within their application environment. Safe and trustworthy autonomous operation is essential, especially in hazardous environments and critical applications such as autonomous construction machinery for the decontamination of landfill sites. This article presents an enhanced combination of machine learning (ML) methods, classic artificial intelligence (AI) methods, and customized validation methods to ensure highly reliable and accurate sensing and perception of the environment for autonomous construction machinery. The presented methods were developed, evaluated, and applied within the Competence Center “Robotic Systems for Decontamination in Hazardous Environments” (ROBDEKON), which investigates and develops robotic systems for autonomous decontamination tasks. The objective of this article is to give a holistic, in-depth overview of the ML-based part of the perception pipeline for an autonomous construction machine working in unstructured environments.


Corresponding author: Janko Petereit, Fraunhofer IOSB, Karlsruhe, Germany; and Fraunhofer Research Center Machine Learning, Karlsruhe, Germany

Funding source: Federal Ministry of Education and Research (BMBF)

Award Identifier / Grant number: 13N14674

  1. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: The described research has been conducted within the competence center “ROBDEKON – Robotic Systems for Decontamination in Hazardous Environments”, which is funded by the Federal Ministry of Education and Research (BMBF) within the scope of the German Federal Government’s “Research for Civil Security” program under grant no. 13N14674.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/auto-2022-0054).


Received: 2022-04-19
Accepted: 2023-01-19
Published Online: 2023-03-10
Published in Print: 2023-03-28

© 2023 Walter de Gruyter GmbH, Berlin/Boston
