FeatEMD: Better Patch Sampling and Distance Metric for Few-Shot Image Classification

  • Conference paper
  • First Online:
Artificial Neural Networks and Machine Learning – ICANN 2023 (ICANN 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14254)


Abstract

Few-shot image classification (FSIC) studies the problem of classifying images given only a few training samples, which makes it hard for deep learning models to generalize to unseen image categories. Recent metric-based methods address FSIC by measuring the similarity of deep feature representations through minimum matching costs. In particular, DeepEMD measures the distance between feature sets with the earth mover’s distance (EMD) and is currently the state-of-the-art (SOTA) approach for FSIC. In this paper, however, we identify two fundamental limitations of DeepEMD. First, it incurs high computational cost because it samples image patches at random when extracting features, and this random sampling wastes computation on uninformative regions. Second, its accuracy is limited by optimal-transport costs based on cosine similarity, which captures only directional discrepancies. To mitigate these shortcomings, we propose an improved method, which we call FeatEMD. First, it introduces feature saliency-based cropping (FeatCrop), which constructs image patches that concentrate computation on object-salient regions. Second, it proposes the Direction-Distance Similarity (DDS), a distance criterion that is more effective at capturing subtle differences between latent-space features. We conduct comprehensive experiments and ablations to validate our method. Experimental results show that FeatEMD establishes a new SOTA on two mainstream benchmark datasets. Remarkably, compared with DeepEMD, FeatEMD reduces computational cost by up to 36%. Our code is available at https://github.com/SethDeng/FeatEMD.
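
The abstract describes FeatCrop and DDS only at a high level. The sketch below gives one plausible reading of the two ideas in Python; it is an illustration, not the authors' implementation (the official code is at the GitHub link above). The function names, the activation-energy saliency score, the alpha weighting between the directional and distance terms, and the use of the POT library (ot.emd2) as a generic earth mover's distance solver are all assumptions made for this example.

```python
# Illustrative sketch only: names, saliency score, and hyperparameters are
# assumptions based on the abstract, not the paper's released code.
import numpy as np
import ot  # POT library, used here as a generic optimal-transport (EMD) solver


def feat_crop(feature_map, num_patches=9, patch_size=5):
    """Saliency-guided patch sampling in the spirit of FeatCrop.

    feature_map: (C, H, W) backbone features of one image. Instead of
    sampling patch locations at random, score every window by its
    channel-wise activation energy and keep the top-k windows.
    """
    c, h, w = feature_map.shape
    saliency = np.linalg.norm(feature_map, axis=0)  # (H, W) activation energy
    scores, coords = [], []
    for i in range(h - patch_size + 1):
        for j in range(w - patch_size + 1):
            scores.append(saliency[i:i + patch_size, j:j + patch_size].sum())
            coords.append((i, j))
    top = np.argsort(scores)[::-1][:num_patches]  # most salient windows first
    patches = []
    for idx in top:
        i, j = coords[idx]
        window = feature_map[:, i:i + patch_size, j:j + patch_size]
        patches.append(window.mean(axis=(1, 2)))  # pooled patch descriptor (C,)
    return np.stack(patches)  # (num_patches, C)


def dds_cost(query_feats, support_feats, alpha=0.5):
    """Direction-Distance Similarity style ground cost (illustrative form).

    Combines a directional term (cosine distance) with a magnitude-aware
    term (normalized Euclidean distance); alpha balances the two.
    """
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    s = support_feats / np.linalg.norm(support_feats, axis=1, keepdims=True)
    cos_dist = 1.0 - q @ s.T  # direction mismatch only
    euc = np.linalg.norm(query_feats[:, None] - support_feats[None], axis=-1)
    euc_dist = euc / (euc.max() + 1e-8)  # magnitude mismatch, scaled to [0, 1]
    return alpha * cos_dist + (1.0 - alpha) * euc_dist


def emd_similarity(query_feats, support_feats):
    """Match two patch sets with EMD over the DDS-style ground cost."""
    n, m = len(query_feats), len(support_feats)
    a = np.full(n, 1.0 / n)  # uniform patch weights
    b = np.full(m, 1.0 / m)
    cost = dds_cost(query_feats, support_feats)
    return -ot.emd2(a, b, cost)  # negate so that higher means more similar


# Example usage (illustrative shapes):
# fm_q, fm_s = backbone(query_img), backbone(support_img)   # (C, H, W) each
# score = emd_similarity(feat_crop(fm_q), feat_crop(fm_s))
```

In this reading, FeatCrop replaces random patch sampling with a deterministic top-k selection over a cheap saliency map, which is presumably where the reported reduction in computation comes from, while DDS augments the purely directional cosine cost with a magnitude-aware Euclidean term so that patches pointing in the same direction but differing in activation strength are no longer treated as identical.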

S. Deng and D. Liao contributed equally to this work.

Acknowledgement

This work is supported in part by the Science and Technology Development Fund of Macao S.A.R. (FDCT) under Nos. 0015/2019/AKP, 0123/2022/AFJ, and 0081/2022/A2, the Guangdong Basic and Applied Basic Research Foundation (No. 2020B1515130004), and the Shenzhen Science and Technology Innovation Commission (Nos. JCYJ20190812160003719, JCYJ20220818101610023). It was carried out in part at SICC, which is supported by SKL-IOTSC, University of Macau.

Author information

Corresponding author

Correspondence to Kejiang Ye.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Deng, S., Liao, D., Gao, X., Zhao, J., Ye, K. (2023). FeatEMD: Better Patch Sampling and Distance Metric for Few-Shot Image Classification. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14254. Springer, Cham. https://doi.org/10.1007/978-3-031-44207-0_16

  • DOI: https://doi.org/10.1007/978-3-031-44207-0_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44206-3

  • Online ISBN: 978-3-031-44207-0

  • eBook Packages: Computer Science, Computer Science (R0)
