Abstract
Few-shot image classification (FSIC) studies the problem of classifying images given only a few training samples, a setting in which deep learning models struggle to generalize to unseen image categories. To learn FSIC tasks effectively, recent metric-based methods compare deep feature representations through similarity measures with minimum matching costs, introducing a new paradigm for addressing the FSIC challenge. In particular, DeepEMD measures the distance between features with the earth mover’s distance (EMD) and is currently the state-of-the-art (SOTA) approach for FSIC. In this paper, however, we identify two fundamental limitations of DeepEMD. First, it incurs high computational cost because it randomly samples image patches for feature extraction, a process that is often wasteful due to its suboptimal sampling strategy. Second, its accuracy is limited by the use of optimal-transport costs based on cosine similarity, which measures only directional discrepancies. To mitigate these shortcomings, we propose an improved method, which we call FeatEMD. First, it introduces feature-saliency-based cropping (FeatCrop) to construct image patches that concentrate computation on object-salient regions. Second, it proposes the Direction-Distance Similarity (\(\textrm{DDS}\)), a distance criterion that more effectively captures subtle differences between latent-space features. We conduct comprehensive experiments and ablations to validate our method. Experimental results show that FeatEMD establishes a new SOTA on two mainstream benchmark datasets. Remarkably, compared with DeepEMD, FeatEMD reduces computational cost by up to \( 36\% \). Our code is available at https://github.com/SethDeng/FeatEMD.
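The abstract does not give the exact \(\textrm{DDS}\) formula, but the core idea — an optimal-transport match between patch embeddings under a cost that combines a directional (cosine) term with a magnitude-sensitive (Euclidean) term — can be sketched as follows. The cost function `dds_cost`, the weight `lam`, and the reduction of EMD to a minimum-cost assignment (valid only under uniform patch weights and equal patch counts) are illustrative assumptions, not the paper’s actual method.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def dds_cost(u, v, lam=0.5):
    """Hypothetical direction-distance cost: a cosine term for
    directional discrepancy plus a Euclidean term for magnitude."""
    cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
    return (1.0 - cos) + lam * np.linalg.norm(u - v)


def emd_match(A, B, lam=0.5):
    """Match two sets of patch embeddings A, B of shape (n_patches, dim).

    With uniform patch weights and equal patch counts, the EMD
    reduces to a minimum-cost bipartite assignment over the pairwise
    cost matrix, which we solve with the Hungarian algorithm.
    """
    C = np.array([[dds_cost(a, b, lam) for b in B] for a in A])
    rows, cols = linear_sum_assignment(C)
    return C[rows, cols].mean()


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
same = emd_match(x, x)                        # ~0 for identical patch sets
diff = emd_match(x, rng.normal(size=(5, 8)))  # strictly larger
```

Because the Euclidean term is zero and the cosine term is one for identical embeddings, matching an image against itself yields (near-)zero cost, while a pure cosine cost would behave the same for any pair of scaled copies — which is the directional-only limitation the abstract attributes to DeepEMD.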
S. Deng and D. Liao—Contributed equally to this work.
References
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
Oreshkin, B., Rodríguez López, P., Lacoste, A.: TADAM: task dependent adaptive metric for improved few-shot learning. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31. Curran Associates Inc. (2018). https://proceedings.neurips.cc/paper_files/paper/2018/file/66808e327dc79d135ba18e051673d906-Paper.pdf
Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K., Wierstra, D.: Matching networks for one shot learning. In: Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 29. Curran Associates Inc. (2016). https://proceedings.neurips.cc/paper/2016/file/90e1357833654983612fb05e3ec9148c-Paper.pdf
Yu, X., Aloimonos, Y.: Attribute-based transfer learning for object categorization with zero/one training example. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6315, pp. 127–140. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15555-0_10
Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: International Conference On Machine Learning. PMLR, pp. 1126–1135 (2017)
Antoniou, A., Edwards, H., Storkey, A.: How to train your MAML. In: International Conference on Learning Representations (2018)
Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 30. Curran Associates Inc. (2017). https://proceedings.neurips.cc/paper/2017/file/cb8da6767461f2812ae4290eac7cbc42-Paper.pdf
Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P.H., Hospedales, T.M.: Learning to compare: relation network for few-shot learning. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ye, H.-J., Hu, H., Zhan, D.-C., Sha, F.: Few-shot learning via embedding adaptation with set-to-set functions. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Li, W., Wang, L., Huo, J., Shi, Y., Gao, Y., Luo, J.: Asymmetric distribution measure for few-shot learning. In: International Joint Conference on Artificial Intelligence (2020)
Wertheimer, D., Hariharan, B.: Few-shot learning with localization in realistic settings. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6558–6567 (2019)
Xie, J., Long, F., Lv, J., Wang, Q., Li, P.: Joint distribution matters: deep brownian distance covariance for few-shot classification. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7972–7981 (2022)
Zhang, C., Cai, Y., Lin, G., Shen, C.: DeepEMD: differentiable earth mover’s distance for few-shot learning. IEEE Trans. Pattern Anal. Mach. Intell. (2022)
Liu, Y., Zhang, W., Xiang, C., Zheng, T., Cai, D., He, X.: Learning to affiliate: mutual centralized learning for few-shot classification. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14411–14420 (2022)
Chen, J., Bai, G., Liang, S., Li, Z.: Automatic image cropping: a computational complexity study. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 507–515 (2016)
Takahashi, R., Matsubara, T., Uehara, K.: RICAP: random image cropping and patching data augmentation for deep CNNs. In: Asian Conference on Machine Learning, PMLR, pp. 786–798 (2018)
Oh, G., Choi, D.-W., Moon, B.: Similar patch selection in embedding space for multi-view image denoising. IEEE Access 9, 98581–98589 (2021)
Peng, X., Wang, K., Zhu, Z., Wang, M., You, Y.: Crafting better contrastive views for siamese representation learning. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16031–16040 (2022)
Cordonnier, J.-B., Mahendran, A., Dosovitskiy, A., Weissenborn, D., Uszkoreit, J., Unterthiner, T.: Differentiable patch selection for image recognition. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2351–2360 (2021)
Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B.: Human-level concept learning through probabilistic program induction. Science 350(6266), 1332–1338 (2015)
Ren, M., et al.: Meta-learning for semi-supervised few-shot classification. In: International Conference on Learning Representations (2018)
Mishra, N., Rohaninejad, M., Chen, X., Abbeel, P.: A simple neural attentive meta-learner. In: International Conference on Learning Representations (2018)
Acknowledgement
This work is supported in part by the Science and Technology Development Fund of Macao S.A.R (FDCT) under Nos. 0015/2019/AKP, 0123/2022/AFJ, and 0081/2022/A2, the Guangdong Basic and Applied Basic Research Foundation (No. 2020B1515130004), and the Shenzhen Science and Technology Innovation Commission (Nos. JCYJ20190812160003719, JCYJ20220818101610023). It was carried out in part at SICC, which is supported by SKL-IOTSC, University of Macau.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Deng, S., Liao, D., Gao, X., Zhao, J., Ye, K. (2023). FeatEMD: Better Patch Sampling and Distance Metric for Few-Shot Image Classification. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14254. Springer, Cham. https://doi.org/10.1007/978-3-031-44207-0_16
Print ISBN: 978-3-031-44206-3
Online ISBN: 978-3-031-44207-0