Abstract
Computer vision is driven by the many datasets available for training and evaluating novel methods. However, each dataset has its own set of class labels, its own visual definitions of those classes, its own image distribution, its own annotation protocol, and so on. In this paper we explore the automatic discovery of visual-semantic relations between labels across datasets. We aim to understand how the instances of a class in one dataset relate to the instances of a class in another dataset: are they in an identity, parent/child, or overlap relation, or is there no link between them at all? To find relations between labels across datasets, we propose methods based on language, on vision, and on their combination. We show that we can effectively discover label relations across datasets, as well as their type. We apply our method to four applications: understanding label relations, identifying missing aspects, increasing label specificity, and predicting transfer learning gains. We conclude that label relations cannot be established by looking at the names of classes alone, as they depend strongly on how each dataset was constructed.
J. Uijlings and T. Mensink—Equal contribution.
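The paper discovers label relations using language-based, vision-based, and combined signals. As a purely illustrative sketch of the general idea (not the authors' method), the snippet below combines a word-embedding similarity between class names with directional overlap statistics of matched instances to guess one of the relation types named in the abstract; all function names, thresholds, and inputs are hypothetical.

```python
# Illustrative sketch only (not the authors' method): combine a language
# signal (embedding similarity of class names) with a vision signal
# (directional overlap of instances) to guess a coarse relation type.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def relation_from_overlap(p_a_given_b, p_b_given_a, thresh=0.5):
    """Map two directional overlap scores in [0, 1] to a relation type."""
    if p_a_given_b >= thresh and p_b_given_a >= thresh:
        return "identity"
    if p_a_given_b >= thresh:
        return "parent"     # A is broader: most instances of B also count as A
    if p_b_given_a >= thresh:
        return "child"      # A is narrower: most instances of A also count as B
    if max(p_a_given_b, p_b_given_a) > 0.1:
        return "overlap"
    return "unrelated"

def label_relation(emb_a, emb_b, n_a, n_a_covered_by_b, n_b, n_b_covered_by_a):
    """Hypothetical combination of a language prior with vision statistics."""
    if cosine(emb_a, emb_b) < 0.2:                 # hypothetical name-similarity cut-off
        return "unrelated"
    p_b_given_a = n_a_covered_by_b / max(n_a, 1)   # fraction of A instances matched to B
    p_a_given_b = n_b_covered_by_a / max(n_b, 1)   # fraction of B instances matched to A
    return relation_from_overlap(p_a_given_b, p_b_given_a)
```

For example, with A = car and B = vehicle, most car instances would be covered by vehicle but not the reverse, so the sketch would return "child" (car is the narrower label).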
Notes
1. An instance is either a single object (for thing classes, e.g. cat, car), or the union of all regions of a stuff class (e.g. grass, water), following the panoptic definition [13].
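As a minimal, hypothetical illustration of this convention (the data layout below is assumed, not taken from the paper), the following groups the segments of one image into instances: each thing segment stays a separate instance, while all segments of a stuff class are merged into a single instance.

```python
# Hypothetical illustration of the panoptic instance convention in note 1:
# thing segments (e.g. cat, car) are individual instances, while all segments
# of a stuff class (e.g. grass, water) are unioned into one instance.
import numpy as np

def group_into_instances(segments, stuff_classes):
    """segments: list of (class_name, boolean_mask) pairs for one image."""
    instances = []
    stuff_union = {}                       # class name -> union mask of its regions
    for class_name, mask in segments:
        if class_name in stuff_classes:
            if class_name not in stuff_union:
                stuff_union[class_name] = np.zeros_like(mask, dtype=bool)
            stuff_union[class_name] |= mask
        else:
            instances.append((class_name, mask))   # thing: one instance per segment
    instances.extend(stuff_union.items())          # stuff: one instance per class
    return instances
```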
References
Robust vision challenge. http://www.robustvision.net/
Bevandić, P., Oršić, M., Grubišić, I., Šarić, J., Šegvić, S.: Multi-domain semantic segmentation with overlapping labels. In: Proceedings of the WACV (2022)
Bucher, M., Vu, T., Cord, M., Pérez, P.: Zero-shot semantic segmentation. In: NeurIPS (2019)
Caesar, H., Uijlings, J., Ferrari, V.: COCO-stuff dataset (2018). http://calvin.inf.ed.ac.uk/datasets/coco-stuff
Caesar, H., Uijlings, J., Ferrari, V.: COCO-stuff: thing and stuff classes in context. In: CVPR (2018)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
Everingham, M., Eslami, S., van Gool, L., Williams, C., Winn, J., Zisserman, A.: The Pascal visual object classes challenge: a retrospective. IJCV 111, 98–136 (2015). https://doi.org/10.1007/s11263-014-0733-5
Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the KITTI dataset. Int. J. Robot. Res. 32(11), 1231–1237 (2013)
Ghiasi, G., Gu, X., Cui, Y., Lin, T.: Open-vocabulary image segmentation. Technical report, arXiv (2021)
Google: Wiki words 500 with normalization: a 500-dimensional word2vec skip-gram model trained on English Wikipedia. https://tfhub.dev/google/Wiki-words-500-with-normalization/2
Jia, C., et al.: Scaling up visual and vision-language representation learning with noisy text supervision. In: ICML (2021)
Kirillov, A.: Panoptic challenge intro. COCO+Mapillary Joint Recognition Challenge Workshop. http://presentations.cocodataset.org/ECCV18/COCO18-Panoptic-Overview.pdf
Kirillov, A., He, K., Girshick, R., Rother, C., Dollár, P.: Panoptic segmentation. In: CVPR (2019)
Kokkinos, I.: UberNet: training a ‘universal’ CNN for low-, mid-, and high-level vision using diverse datasets and limited memory. In: CVPR (2017)
Kuznetsova, A., et al.: The open images dataset V4: unified image classification, object detection, and visual relationship detection at scale. IJCV 128, 1956–1981 (2020). https://doi.org/10.1007/s11263-020-01316-z
Lambert, J., Liu, Z., Sener, O., Hays, J., Koltun, V.: MSeg: a composite dataset for multi-domain semantic segmentation. In: CVPR (2020)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
McInnes, L., Healy, J., Saul, N., Grossberger, L.: UMAP: uniform manifold approximation and projection. J. Open Source Softw. 3(29), 861 (2018)
Mensink, T., Uijlings, J., Kuznetsova, A., Gygli, M., Ferrari, V.: Factors of influence for transfer learning across diverse appearance domains and task types. IEEE Trans. PAMI (2021)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In: ICLR Workshop (2013)
Miller, G.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Ponce, J., et al.: Dataset issues in object recognition. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds.) Toward Category-Level Object Recognition. LNCS, vol. 4170, pp. 29–48. Springer, Heidelberg (2006). https://doi.org/10.1007/11957959_2
Radford, A., et al.: Learning transferable visual models from natural language supervision. In: ICML (2021)
Rebuffi, S.A., Bilen, H., Vedaldi, A.: Learning multiple visual domains with residual adapters. In: NeurIPS (2017)
Silberman, N., Hoiem, D., Kohli, P., Fergus, R.: Indoor segmentation and support inference from RGBD images. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7576, pp. 746–760. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33715-4_54
Torralba, A., Efros, A.: Unbiased look at dataset bias. In: CVPR (2011)
Triantafillou, E., et al.: Meta-dataset: a dataset of datasets for learning to learn from few examples. In: ICLR (2020)
Wang, J., et al.: Deep high-resolution representation learning for visual recognition. IEEE Trans. PAMI 43(10), 3349–3364 (2020)
Xiao, J., Hays, J., Ehinger, K., Oliva, A., Torralba, A.: SUN database: large-scale scene recognition from Abbey to Zoo. In: CVPR (2010)
Xiao, J., Owens, A., Torralba, A.: SUN3D: a database of big spaces reconstructed using SfM and object labels. In: ICCV (2013)
Yu, F., et al.: BDD100K: a diverse driving dataset for heterogeneous multitask learning. In: CVPR (2020)
Zendel, O., Honauer, K., Murschitz, M., Humenberger, M., Fernandez Dominguez, G.: Analyzing computer vision data - the good, the bad and the ugly. In: CVPR (2017)
Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., Torralba, A.: Scene parsing through ADE20K dataset. In: CVPR (2017)
Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: NeurIPS (2014)
Zhou, X., Koltun, V., Krähenbühl, P.: Simple multi-dataset detection. In: CVPR (2022)
Cite this paper
Uijlings, J., Mensink, T., Ferrari, V. (2022). The Missing Link: Finding Label Relations Across Datasets. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13668. Springer, Cham. https://doi.org/10.1007/978-3-031-20074-8_31