
Measuring Latency-Accuracy Trade-Offs in Convolutional Neural Networks

  • Conference paper
  • First Online:
Progress in Artificial Intelligence (EPIA 2023)

Abstract

Several systems that employ machine learning models are subject to strict latency requirements. Fraud detection systems, transportation control systems, network traffic analysis, and footwear manufacturing processes are a few examples. These requirements are imposed at inference time, when the model is queried. However, it is not trivial to adjust model architecture and hyperparameters in order to obtain a good trade-off between predictive ability and inference time. This paper provides a contribution in this direction by presenting a study of how different architectural and hyperparameter choices affect the inference time of a Convolutional Neural Network for network traffic analysis. Our case study focuses on a model for traffic correlation attacks on the Tor network, which requires correlating a large volume of network flows in a short amount of time. Our findings suggest that hyperparameters related to convolution operations, such as stride and the number of filters, as well as the removal of convolution and max-pooling layers, can substantially reduce inference time, often at a relatively small cost in predictive performance.
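The abstract's central claim, that stride and filter count are effective levers on inference cost, can be illustrated with a back-of-the-envelope multiply-accumulate (MAC) count for a single 1-D convolution layer. The sizes below (flow length, kernel width, filter counts) are hypothetical illustration values, not the architecture studied in the paper:

```python
def conv1d_macs(length, in_ch, out_ch, kernel, stride):
    """Approximate MAC count of a 1-D convolution without padding:
    each output element costs kernel * in_ch MACs per output filter."""
    out_len = (length - kernel) // stride + 1
    return out_ch * out_len * kernel * in_ch

# Hypothetical first layer: a flow of 300 packets, 1 input channel,
# 1000 filters of width 20.
base    = conv1d_macs(300, 1, 1000, 20, stride=1)  # 5,620,000 MACs
stride2 = conv1d_macs(300, 1, 1000, 20, stride=2)  # 2,820,000 MACs
half_f  = conv1d_macs(300, 1, 500, 20, stride=1)   # 2,810,000 MACs
```

Under these assumptions, doubling the stride or halving the number of filters each roughly halves the layer's arithmetic, consistent with the abstract's observation; actual latency on real hardware also depends on memory traffic and parallelism, which a MAC count does not capture.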




Acknowledgements

This work is co-financed by Component 5—Capitalization and Business Innovation, integrated in the Resilience Dimension of the Recovery and Resilience Plan within the scope of the Recovery and Resilience Mechanism (MRR) of the European Union (EU), framed in the Next Generation EU, for the period 2021–2026, within project FAIST, with reference 66. This work was also partially supported by Fundação para a Ciência e Tecnologia (FCT), under project DAnon with grant CMU/TIC/0044/2021.

Author information


Corresponding author

Correspondence to André Tse.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Tse, A., Oliveira, L., Vinagre, J. (2023). Measuring Latency-Accuracy Trade-Offs in Convolutional Neural Networks. In: Moniz, N., Vale, Z., Cascalho, J., Silva, C., Sebastião, R. (eds) Progress in Artificial Intelligence. EPIA 2023. Lecture Notes in Computer Science, vol. 14115. Springer, Cham. https://doi.org/10.1007/978-3-031-49008-8_26

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-49008-8_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-49007-1

  • Online ISBN: 978-3-031-49008-8

  • eBook Packages: Computer Science, Computer Science (R0)
