On the Evaluation of RGB-D-Based Categorical Pose and Shape Estimation

Bruns, Leonard; Jensfelt, Patric

doi:10.1007/978-3-031-22216-0_25

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 577)

Included in the following conference series:

International Conference on Intelligent Autonomous Systems

Abstract

Recently, various methods for 6D pose and shape estimation of objects have been proposed. Typically, these methods evaluate their pose estimation in terms of average precision and reconstruction quality in terms of chamfer distance. In this work, we take a critical look at this predominant evaluation protocol, including metrics and datasets. We propose a new set of metrics, contribute new annotations for the Redwood dataset, and evaluate state-of-the-art methods in a fair comparison. We find that existing methods do not generalize well to unconstrained orientations and are actually heavily biased towards objects being upright. We provide an easy-to-use evaluation toolbox with well-defined metrics, method, and dataset interfaces, which allows evaluation and comparison with various state-of-the-art approaches (https://github.com/roym899/pose_and_shape_evaluation).
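
As a point of reference for the chamfer distance mentioned above, the following is a minimal sketch in Python; it is not the evaluation toolbox's implementation. It assumes one common convention (the symmetric mean of nearest-neighbor Euclidean distances between two point sets). Other works use squared distances or sums instead of means, so values are not comparable across conventions. The function name `chamfer_distance` and the sample points are illustrative only.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric chamfer distance between two non-empty 3D point sets.

    Assumed convention: average of the two directed mean
    nearest-neighbor Euclidean distances (A -> B and B -> A).
    Brute force, O(len(A) * len(B)); fine for small point sets.
    """

    def mean_nearest(src, dst):
        # For every point in src, take the distance to its closest point in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (mean_nearest(points_a, points_b) + mean_nearest(points_b, points_a))


# Hypothetical point sets: a "ground truth" and a "reconstruction".
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reconstruction = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(ground_truth, reconstruction))
```

A lower value indicates a closer match between the two point sets; identical sets give zero. For dense point clouds, a spatial index such as a k-d tree would replace the brute-force inner loop.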

Notes

1. Intel Core i7-6850K CPU, NVIDIA TITAN X (Pascal) GPU.

Acknowledgements

This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.

Author information

Authors and Affiliations

KTH Royal Institute of Technology, Division of Robotics, Perception and Learning, Stockholm, Sweden
Leonard Bruns & Patric Jensfelt

Corresponding author

Correspondence to Leonard Bruns.

Editor information

Editors and Affiliations

Faculty of Electrical Engineering, University of Zagreb, Zagreb, Croatia
Ivan Petrovic
Department of Information Engineering, University of Padua, Padua, Italy
Emanuele Menegatti
Faculty of Electrical Engineering, University of Zagreb, Zagreb, Croatia
Ivan Marković

About this paper

Cite this paper

Bruns, L., Jensfelt, P. (2023). On the Evaluation of RGB-D-Based Categorical Pose and Shape Estimation. In: Petrovic, I., Menegatti, E., Marković, I. (eds) Intelligent Autonomous Systems 17. IAS 2022. Lecture Notes in Networks and Systems, vol 577. Springer, Cham. https://doi.org/10.1007/978-3-031-22216-0_25

DOI: https://doi.org/10.1007/978-3-031-22216-0_25
Published: 18 January 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22215-3
Online ISBN: 978-3-031-22216-0
eBook Packages: Intelligent Technologies and Robotics (R0)
