Abstract
To address the imbalance between taxi supply and passenger demand, this paper proposes a distributed model on Spark for accurate passenger hotspot prediction: ensemble empirical mode decomposition with normalization, combined with a spatial attention mechanism based bi-directional gated recurrent unit (EEMDN-SABiGRU). The aim is to reduce blind cruising costs, improve carrying efficiency, and maximize income. Specifically, the EEMDN method is put forward to process the passenger hotspot data in the grid. It handles non-stationary sequences and the loss of prediction accuracy caused by excessive numerical differences, while avoiding the mode mixing that affects the intrinsic mode functions (IMFs) of EMD. Next, a spatial attention mechanism is constructed to capture the characteristics of passenger hotspots in each grid, taking passenger boarding and alighting hotspots as weights and emphasizing the spatial regularity of passengers in the grid. Furthermore, a bi-directional GRU is incorporated, because a standard GRU obtains only forward information and ignores backward information; this improves the accuracy of feature extraction. Finally, accurate prediction of passenger hotspots is achieved with the EEMDN-SABiGRU model using real-world taxi GPS trajectory data in the Spark parallel computing framework. The experimental results demonstrate that, on the four datasets in the 00-grid, compared with LSTM, EMD-LSTM, EEMD-LSTM, GRU, EMD-GRU, EEMD-GRU, EMDN-GRU, CNN, and BP, EEMDN-SABiGRU reduces the mean absolute percentage error, mean absolute error, root mean square error, and maximum error by at least 43.18%, 44.91%, 55.04%, and 39.33%, respectively.
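The four error measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `error_metrics` and `relative_decrease` are hypothetical, "maximum error" is assumed to mean the largest absolute error, and skipping zero-valued actuals when computing the percentage error is an assumption made here to avoid division by zero.

```python
import math


def error_metrics(actual, predicted):
    """Return MAPE (%), MAE, RMSE, and maximum absolute error (ME).

    Hypothetical helper for illustration; the paper's exact definitions
    may differ, for example in how zero actual values are treated.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    n = len(abs_errors)

    mae = sum(abs_errors) / n
    rmse = math.sqrt(sum(e * e for e in abs_errors) / n)
    max_error = max(abs_errors)

    # Assumption: points whose actual value is zero are left out of MAPE,
    # because the percentage error is undefined there.
    ratios = [e / abs(a) for e, a in zip(abs_errors, actual) if a != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    return {"MAPE": mape, "MAE": mae, "RMSE": rmse, "ME": max_error}


def relative_decrease(baseline, proposed):
    """Percentage by which `proposed` lowers an error value versus `baseline`."""
    return 100.0 * (baseline - proposed) / baseline
```

Under these assumptions, a reported decrease such as those in the abstract corresponds to calling `relative_decrease` with a baseline model's error and the proposed model's error for the same metric and dataset.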
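Abstract (translated from Chinese)
To address the imbalance between taxi supply and passenger demand, this paper proposes a distributed model on Spark that combines normalized ensemble empirical mode decomposition with a spatial-attention-oriented bi-directional gated recurrent unit (EEMDN-SABiGRU) to predict passenger hotspots accurately, with the aim of reducing blind cruising costs, improving carrying efficiency, and maximizing income. First, a normalized ensemble empirical mode decomposition method (EEMDN) is proposed to process the passenger hotspot data in the grid; it addresses non-stationary sequences and the loss of prediction accuracy caused by excessive numerical differences, and avoids the mode mixing that occurs in the intrinsic mode functions (IMFs) of EMD. Second, a spatial attention mechanism based on the weights of passenger boarding and alighting hotspots and on the spatial regularity of passengers is constructed to capture the passenger hotspot features in each grid. Third, a bi-directional gated recurrent unit (GRU) algorithm is incorporated to solve the problem that GRU obtains only forward information and ignores backward information, which improves the accuracy of feature extraction. Finally, under the Spark parallel computing framework, accurate prediction of passenger hotspots is achieved with the EEMDN-SABiGRU model using real taxi GPS trajectory data. The experimental results show that, on the four datasets in the 00 grid, compared with LSTM, EMD-LSTM, EEMD-LSTM, GRU, EMD-GRU, EEMD-GRU, EMDN-GRU, CNN, and BP, the mean absolute percentage error, mean absolute error, root mean square error, and maximum error of EEMDN-SABiGRU decrease by 43.18%, 44.91%, 55.04%, and 39.33%, respectively.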
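The motivation for the bi-directional GRU (a single GRU sees only forward information) can be made concrete with a small sketch. It is an illustration under stated assumptions, not the model from the paper: the input and hidden state are scalars (hidden size 1) to keep the code dependency-free, the parameter tuples are hypothetical and untrained, and the update rule follows the common convention h_t = z * h_{t-1} + (1 - z) * n.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(params, x, h_prev):
    """One GRU update for a scalar input x and scalar hidden state h_prev.

    `params` is a hypothetical 9-tuple:
    (wz, uz, bz, wr, ur, br, wn, un, bn).
    """
    wz, uz, bz, wr, ur, br, wn, un, bn = params
    z = sigmoid(wz * x + uz * h_prev + bz)          # update gate
    r = sigmoid(wr * x + ur * h_prev + br)          # reset gate
    n = math.tanh(wn * x + un * (r * h_prev) + bn)  # candidate state
    return z * h_prev + (1.0 - z) * n


def run_gru(params, xs):
    """Run a GRU over the sequence xs, returning the state after each step."""
    h, states = 0.0, []
    for x in xs:
        h = gru_step(params, x, h)
        states.append(h)
    return states


def bigru(params_fwd, params_bwd, xs):
    """Pair each time step's forward state with its backward state.

    The backward pass reads the sequence in reverse, and its outputs are
    reversed again so that index t refers to the same time step in both.
    """
    fwd = run_gru(params_fwd, xs)
    bwd = run_gru(params_bwd, xs[::-1])[::-1]
    return list(zip(fwd, bwd))
```

In this sketch, the forward state at a time step depends only on inputs up to that step, while the backward state depends on that step and everything after it; pairing them gives each step access to context from both directions.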
Data availability
The data that support the findings of this study are available from the corresponding authors upon reasonable request.
Author information
Contributions
Dawen XIA and Jian GENG designed the research. Dawen XIA, Jian GENG, and Huaqing LI proposed the approaches and performed the experiments. Ruixi HUANG, Bingqi SHEN, and Yang HU processed the data. Dawen XIA, Jian GENG, and Huaqing LI drafted the paper. Dawen XIA, Jian GENG, Yang HU, Yantao LI, and Huaqing LI revised and finalized the paper.
Ethics declarations
Dawen XIA, Jian GENG, Ruixi HUANG, Bingqi SHEN, Yang HU, Yantao LI, and Huaqing LI declare that they have no conflict of interest.
Additional information
Project supported by the National Natural Science Foundation of China (Nos. 62162012, 62173278, and 62072061), the Science and Technology Support Program of Guizhou Province, China (No. QKHZC2021YB531), the Natural Science Research Project of Department of Education of Guizhou Province, China (Nos. QJJ2022015 and QJJ2022047), the Science and Technology Foundation of Guizhou Province, China (Nos. QKHJCZK2022YB195, QKHJCZK2022YB197, and QKHJCZK2023YB143), the Scientific Research Platform Project of Guizhou Minzu University, China (No. GZMUSYS202104), and the 7th Batch High-Level Innovative Talent Project of Guizhou Province, China
List of electronic supplementary materials
Fig. S1 Comparisons of models using the 1-day dataset
Fig. S2 Comparisons of models using the 10-day dataset
Fig. S3 Comparisons of models using the 20-day dataset
Fig. S4 Comparisons of models using the 30-day dataset
About this article
Cite this article
Xia, D., Geng, J., Huang, R. et al. A distributed EEMDN-SABiGRU model on Spark for passenger hotspot prediction. Front Inform Technol Electron Eng 24, 1316–1331 (2023). https://doi.org/10.1631/FITEE.2200621
DOI: https://doi.org/10.1631/FITEE.2200621
Key words
- Passenger hotspot prediction
- Ensemble empirical mode decomposition (EEMD)
- Spatial attention mechanism
- Bi-directional gated recurrent unit (BiGRU)
- GPS trajectory
- Spark