Abstract
Several systems that employ machine learning models are subject to strict latency requirements. Fraud detection systems, transportation control systems, network traffic analysis, and footwear manufacturing processes are a few examples. These requirements are imposed at inference time, when the model is queried. However, it is not trivial to adjust the model architecture and hyperparameters to obtain a good trade-off between predictive ability and inference time. This paper contributes in this direction by presenting a study of how different architectural and hyperparameter choices affect the inference time of a Convolutional Neural Network for network traffic analysis. Our case study focuses on a model for traffic correlation attacks on the Tor network, which requires correlating a large volume of network flows in a short amount of time. Our findings suggest that hyperparameters related to convolution operations, such as stride and the number of filters, as well as reducing the number of convolution and max-pooling layers, can substantially reduce inference time, often at a relatively small cost in predictive performance.
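The intuition behind the stride and filter-count findings can be sketched with a back-of-the-envelope cost model. The snippet below, a minimal illustration not taken from the paper (the layer dimensions are hypothetical), counts the multiply-accumulate operations of a single 1D convolution layer, the dominant cost in CNNs over network-flow sequences, and shows how doubling the stride roughly halves that cost:

```python
def conv1d_macs(in_len, in_ch, out_ch, kernel, stride):
    """Approximate multiply-accumulates (MACs) for one 1D convolution
    layer with 'valid' padding: one MAC per kernel tap, per input
    channel, per output channel, per output position."""
    out_len = (in_len - kernel) // stride + 1
    return out_len * out_ch * in_ch * kernel, out_len

# Hypothetical baseline: stride 1, 64 filters, kernel 5, flow of 1000 samples
macs_base, len_base = conv1d_macs(1000, 1, 64, 5, 1)

# Variant: stride 2 halves the output length, and with it the layer cost
macs_fast, len_fast = conv1d_macs(1000, 1, 64, 5, 2)

print(macs_base, macs_fast)  # stride 2 yields roughly half the MACs
```

The same accounting shows why the number of filters matters: cost scales linearly in `out_ch`, so halving the filters halves the MACs of that layer (and shrinks the input channels of the next one).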
Acknowledgements
This work is co-financed by Component 5—Capitalization and Business Innovation, integrated in the Resilience Dimension of the Recovery and Resilience Plan within the scope of the Recovery and Resilience Mechanism (MRR) of the European Union (EU), framed in the Next Generation EU, for the period 2021–2026, within project FAIST, with reference 66. This work was also partially supported by Fundação para a Ciência e Tecnologia (FCT), under project DAnon with grant CMU/TIC/0044/2021.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Tse, A., Oliveira, L., Vinagre, J. (2023). Measuring Latency-Accuracy Trade-Offs in Convolutional Neural Networks. In: Moniz, N., Vale, Z., Cascalho, J., Silva, C., Sebastião, R. (eds) Progress in Artificial Intelligence. EPIA 2023. Lecture Notes in Computer Science(), vol 14115. Springer, Cham. https://doi.org/10.1007/978-3-031-49008-8_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49007-1
Online ISBN: 978-3-031-49008-8