Abstract
Although deep neural networks have achieved impressive performance in various image detection and classification tasks, their deployment across different scenarios and devices is often constrained by intensive computation and large storage requirements. This paper presents an innovative technique that trains a lightweight yet competent student network by transferring multifarious knowledge and features from a large yet powerful teacher network. Based on the observation that different vision tasks are often correlated and complementary, we first train a resourceful teacher network that captures both discriminative and generative features, with image classification as the main task and image reconstruction as an auxiliary task. A lightweight yet competent student network is then trained to mimic the pixel-level and spatial-level feature distributions of the resourceful teacher network under the guidance of a feature loss and an adversarial loss, respectively. The proposed technique has been evaluated extensively over a number of public datasets, and experiments show that our student network achieves superior image classification performance compared with the state of the art.
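As a rough illustration of the training objectives outlined in the abstract, the sketch below shows how the teacher's joint classification-plus-reconstruction loss and the student's feature and adversarial distillation losses could be composed. This is a minimal sketch assuming a standard PyTorch setup; the module interfaces (a teacher returning logits, reconstruction and features; a student returning logits and features; a feature-level discriminator) and the loss weights alpha, beta, gamma are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of the two-stage training described in the abstract (PyTorch).
# teacher, student and discriminator are hypothetical modules; the weights
# alpha/beta/gamma are illustrative, not the paper's reported values.
import torch
import torch.nn.functional as F

def teacher_loss(teacher, images, labels, alpha=1.0):
    """Joint objective: image classification (main task) + reconstruction (auxiliary task)."""
    logits, recon, _ = teacher(images)           # assumed to return logits, reconstruction, features
    cls_loss = F.cross_entropy(logits, labels)   # discriminative supervision
    rec_loss = F.mse_loss(recon, images)         # generative supervision
    return cls_loss + alpha * rec_loss

def student_loss(student, teacher, discriminator, images, labels, beta=1.0, gamma=0.1):
    """Student mimics the teacher's features at the pixel level (feature loss)
    and at the spatial level (adversarial loss), alongside the usual classification loss."""
    with torch.no_grad():
        _, _, t_feat = teacher(images)           # teacher features serve as targets
    s_logits, s_feat = student(images)           # assumed to return logits and intermediate features
    cls_loss = F.cross_entropy(s_logits, labels)
    feat_loss = F.mse_loss(s_feat, t_feat)       # pixel-level feature mimicking (shapes assumed to match)
    real_target = torch.ones(s_feat.size(0), 1, device=s_feat.device)
    adv_loss = F.binary_cross_entropy_with_logits(
        discriminator(s_feat), real_target       # fool the discriminator to match the spatial feature distribution
    )
    return cls_loss + beta * feat_loss + gamma * adv_loss
```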
Acknowledgements
This work is supported in part by the National Science Foundation of China under Grant No. 61572113, and the Fundamental Research Funds for the Central Universities under Grant No. XGBDFZ09.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, X., Lu, S., Gong, H., Liu, M., Liu, M. (2020). Training Lightweight yet Competent Network via Transferring Complementary Features. In: Yang, H., Pasupa, K., Leung, A.C.S., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Communications in Computer and Information Science, vol 1332. Springer, Cham. https://doi.org/10.1007/978-3-030-63820-7_65
DOI: https://doi.org/10.1007/978-3-030-63820-7_65
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63819-1
Online ISBN: 978-3-030-63820-7
eBook Packages: Computer Science (R0)