Multi-scale kronecker-product relation networks for few-shot learning

Abdelaziz, Mounir; Zhang, Zuping

doi:10.1007/s11042-021-11735-w

Published: 17 January 2022

Volume 81, pages 6703–6722 (2022)

Multimedia Tools and Applications

Abstract

Few-shot learning aims to train classifiers that can recognize new visual object categories from only a few training examples. Metric-learning-based methods have recently made promising progress on this task. Relation Network is one such method: it uses simple convolutional neural networks to learn deep relationships between image features in order to recognize new objects. However, during the feature comparison phase, Relation Network is considered sensitive to the spatial positions of the compared objects. Moreover, it learns from single-scale features only, which can lead to poor generalization when the compared objects vary in scale. To address these problems, we extend Relation Network to be position-aware and to integrate multi-scale features for more robust metric learning and better generalization. In this paper, we propose a novel few-shot learning method called Multi-scale Kronecker-Product Relation Networks For Few-Shot Learning (MsKPRN). Our method combines feature maps with spatial correlation maps generated by a Kronecker-product module to capture position-wise correlations between the compared features, and then feeds them to a relation network module, which captures similarities between the combined features in a multi-scale manner. Extensive experiments demonstrate that the proposed method outperforms related state-of-the-art methods on popular few-shot learning datasets. In particular, MsKPRN improves the accuracy of Relation Network from 50.44% to 57.02% in the 5-way 1-shot scenario and from 65.63% to 72.06% in the 5-way 5-shot scenario. Our code will be available at https://github.com/mouniraziz/MsKPRN.
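To make the idea of position-wise correlation maps concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes two feature maps of shape (C, H, W), computes the cosine similarity between every spatial position of one map and every position of the other, and concatenates the resulting H*W correlation channels with the query feature map. The function names (`correlation_maps`, `combine`), the use of cosine similarity, and the tensor sizes are all illustrative assumptions; the actual Kronecker-product module and the multi-scale relation module may differ.

```python
import numpy as np


def l2_normalize(x, axis=0, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    norm = np.sqrt((x * x).sum(axis=axis, keepdims=True))
    return x / (norm + eps)


def correlation_maps(support, query):
    """Position-wise cosine correlations between two (C, H, W) feature maps.

    Returns an array of shape (H*W, H, W). Channel k = i*W + j holds the
    similarity between support position (i, j) and every query position.
    (Hypothetical helper, assumed for illustration only.)
    """
    c, h, w = support.shape
    s = l2_normalize(support.reshape(c, h * w))  # (C, HW), unit-norm columns
    q = l2_normalize(query.reshape(c, h * w))    # (C, HW), unit-norm columns
    corr = s.T @ q                               # (HW support, HW query)
    return corr.reshape(h * w, h, w)


def combine(support, query):
    """Concatenate the query features with the correlation maps (channel axis)."""
    return np.concatenate([query, correlation_maps(support, query)], axis=0)


rng = np.random.default_rng(0)
support = rng.standard_normal((64, 5, 5))  # assumed sizes: C=64, H=W=5
query = rng.standard_normal((64, 5, 5))
combined = combine(support, query)
print(combined.shape)  # (89, 5, 5): 64 feature channels + 25 correlation channels
```

In this sketch the combined tensor would then be passed to a learned comparison network; repeating the same computation on feature maps taken at several resolutions is one plausible reading of "in a multi-scale manner", but that detail is an assumption here.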

Acknowledgements

We would like to thank the anonymous referees for their helpful comments and suggestions.

Funding

This study was funded by the National Natural Science Foundation of China (Grant Nos. 61379109, M1321007) and the Science and Technology Plan of Hunan Province (Grant Nos. 2014GK2018, 2016JC2011).

Author information

Authors and Affiliations

School of Computer Science & Engineering, Central South University, 932 South Lushan Rd, 410083, Changsha, Hunan, People’s Republic of China
Mounir Abdelaziz & Zuping Zhang

Authors

Mounir Abdelaziz
Zuping Zhang

Corresponding author

Correspondence to Zuping Zhang.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Abdelaziz, M., Zhang, Z. Multi-scale kronecker-product relation networks for few-shot learning. Multimed Tools Appl 81, 6703–6722 (2022). https://doi.org/10.1007/s11042-021-11735-w

Received: 06 July 2021
Revised: 30 September 2021
Accepted: 05 November 2021
Published: 17 January 2022
Issue Date: February 2022
DOI: https://doi.org/10.1007/s11042-021-11735-w
