Abstract
In this study, we analyse two mobile phone activity datasets to predict the future traffic of mobile base stations in urban areas. The predicted time series can be used to reflect trends in human activity flow. Although common methods such as recurrent neural networks and long short-term memory (LSTM) networks often achieve high precision, they have the drawback of being time-consuming to train. Because the original time series are noisy, and the performance of a gradient-boosted decision tree (GBDT) usually degrades when it overfits the noise in the signal, we present an improved GBDT algorithm based on the Kalman filter (GBDT-KF). In our experiments, the RMSE between the values predicted by GBDT-KF and the ground truth is only 12–14% worse than that of the LSTM model, while training time is reduced by a factor of more than 100 compared with the LSTM model. The proposed algorithm thus trades a small loss in precision for a much lower time cost. By applying the results of our work, service providers could predict where and when network congestion will occur and take action ahead of time. Such applications are especially useful in the era of 5G.
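The denoising idea behind GBDT-KF can be illustrated with a minimal sketch. The abstract does not specify the state model or noise parameters, so the code below assumes a scalar random-walk (local-level) Kalman filter; the function names, the variance values, and the synthetic "traffic" series are all hypothetical. The GBDT stage is not shown: in a pipeline of this kind, the filtered series would typically be turned into lagged windows and passed to a gradient-boosted regressor.

```python
import math
import random


def kalman_filter_1d(observations, process_var=1e-2, measurement_var=1.0):
    """Scalar Kalman filter assuming a random-walk state (an illustrative choice)."""
    estimate = observations[0]  # initial state estimate
    error_var = 1.0             # initial variance of the estimate
    filtered = []
    for z in observations:
        # Predict: the state is assumed to persist, so only uncertainty grows.
        error_var += process_var
        # Update: blend the prediction with the new measurement via the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


def rmse(predicted, truth):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(truth))


# Synthetic stand-in for base-station traffic: a periodic cycle plus Gaussian noise.
rng = random.Random(0)
truth = [10.0 + 5.0 * math.sin(2 * math.pi * t / 144) for t in range(576)]
noisy = [x + rng.gauss(0.0, 1.0) for x in truth]
smoothed = kalman_filter_1d(noisy, process_var=0.2, measurement_var=1.0)

print("RMSE of noisy series:   ", rmse(noisy, truth))
print("RMSE of filtered series:", rmse(smoothed, truth))
```

On this synthetic series the filtered signal lies closer to the underlying cycle than the raw observations do, which is the property a downstream tree ensemble would benefit from. The ratio of `process_var` to `measurement_var` controls how aggressively the filter smooths and would need tuning on real data.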





Acknowledgements
This work was supported in part by a grant from foundation project for the Science and Technology Department of Jilin Province (Grant No. 20170101140JC).
Cite this article
Li, L., Dai, S., Cao, Z. et al. Using improved gradient-boosted decision tree algorithm based on Kalman filter (GBDT-KF) in time series prediction. J Supercomput 76, 6887–6900 (2020). https://doi.org/10.1007/s11227-019-03130-y
DOI: https://doi.org/10.1007/s11227-019-03130-y