Abstract
Determining the relative position and orientation of objects in an environment is a fundamental building block for a wide range of robotics applications. To accomplish this task efficiently in practical settings, a method must be fast, use common sensors, and generalize easily to new objects and environments. We present MSL-RAPTOR, a two-stage algorithm for tracking a rigid body with a monocular camera. The image is first processed by an efficient neural network-based front-end to detect new objects and track 2D bounding boxes between frames. The class label and bounding box are passed to the back-end, which updates the object's pose using an unscented Kalman filter (UKF). The measurement posterior is fed back to the 2D tracker to improve robustness. Because the object's class is identified, a class-specific UKF can be used when custom dynamics and constraints are known. Adapting the method to track the pose of new classes only requires a trained 2D object detector or labeled 2D bounding box data, along with the approximate size of the objects. The performance of MSL-RAPTOR is first verified on the NOCS-REAL275 dataset, where it achieves results comparable to RGB-D approaches despite not using depth measurements. When tracking a flying drone from onboard another drone, it runs three times faster than the fastest comparable method while reducing median translation and rotation errors by 66% and 23%, respectively.
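The back-end's core operation is a UKF predict-update cycle: sigma points are pushed through the dynamics and measurement models, and their weighted statistics give the posterior. The sketch below is a minimal, hypothetical scalar illustration of that cycle (not the paper's 6-DoF implementation); the state, models `f` and `h`, and noise terms `q` and `r` are all placeholders for this example.

```python
import math

def sigma_points(mean, var, alpha=0.1, kappa=0.0):
    """Generate 2n+1 sigma points and weights for a 1-D Gaussian (n = 1)."""
    n = 1
    lam = alpha ** 2 * (n + kappa) - n
    s = math.sqrt((n + lam) * var)
    pts = [mean, mean + s, mean - s]
    wm = [lam / (n + lam)] + [1.0 / (2 * (n + lam))] * 2  # mean weights
    wc = list(wm)
    wc[0] += 1 - alpha ** 2 + 2.0  # covariance weight; beta = 2 (Gaussian prior)
    return pts, wm, wc

def ukf_step(mean, var, z, f, h, q, r):
    """One UKF predict-update cycle for a scalar state.

    f: dynamics model, h: measurement model (both possibly nonlinear),
    q: process noise variance, r: measurement noise variance, z: measurement.
    """
    # Predict: propagate sigma points through the dynamics.
    pts, wm, wc = sigma_points(mean, var)
    xp = [f(p) for p in pts]
    m_pred = sum(w * x for w, x in zip(wm, xp))
    v_pred = sum(w * (x - m_pred) ** 2 for w, x in zip(wc, xp)) + q
    # Update: propagate predicted sigma points through the measurement model.
    pts, wm, wc = sigma_points(m_pred, v_pred)
    zp = [h(p) for p in pts]
    z_mean = sum(w * y for w, y in zip(wm, zp))
    s_innov = sum(w * (y - z_mean) ** 2 for w, y in zip(wc, zp)) + r
    c_xz = sum(w * (x - m_pred) * (y - z_mean) for w, x, y in zip(wc, pts, zp))
    k = c_xz / s_innov  # Kalman gain
    return m_pred + k * (z - z_mean), v_pred - k * s_innov * k
```

With linear `f` and `h` this reduces to the standard Kalman filter; the value of the unscented transform is that the same weighted-sigma-point machinery handles the nonlinear projection from 3D pose to 2D bounding box without linearization.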
B. Ramtoula and A. Caccavale—Equal contribution.
Acknowledgements
This research was supported in part by ONR grant number N00014-18-1-2830, NSF NRI grant 1830402, the Stanford Ford Alliance program, the Mitacs Globalink research award IT15240, and the NSERC Discovery Grant 2019-05165. We are grateful for this support.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Ramtoula, B., Caccavale, A., Beltrame, G., Schwager, M. (2021). MSL-RAPTOR: A 6DoF Relative Pose Tracker for Onboard Robotic Perception. In: Siciliano, B., Laschi, C., Khatib, O. (eds) Experimental Robotics. ISER 2020. Springer Proceedings in Advanced Robotics, vol 19. Springer, Cham. https://doi.org/10.1007/978-3-030-71151-1_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71150-4
Online ISBN: 978-3-030-71151-1
eBook Packages: Intelligent Technologies and Robotics (R0)