Long Short-Term Memory Networks with Multiple Variables for Stock Market Prediction

Gao, Fei; Zhang, Jiangshe; Zhang, Chunxia; Xu, Shuang; Ma, Cong

doi:10.1007/s11063-022-11037-8

Long Short-Term Memory Networks with Multiple Variables for Stock Market Prediction

Published: 10 October 2022

Volume 55, pages 4211–4229, (2023)
Cite this article

Neural Processing Letters Aims and scope Submit manuscript

Fei Gao¹,
Jiangshe Zhang¹,
Chunxia Zhang¹,
Shuang Xu¹ &
…
Cong Ma¹

691 Accesses
1 Altmetric
Explore all metrics

Abstract

Long short-term memory (LSTM) networks have been successfully applied to many fields including finance. However, when the input contains multiple variables, a conventional LSTM does not distinguish the contribution of different variables and cannot make full use of the information they transmit. To meet the need for multi-variable modeling of financial sequences, we present an application of multi-variable LSTM (MV-LSTM) network for stock market prediction in this paper. The network consists of two serial modules: the first module is a recurrent layer with MV-LSTM as its recurrent unit, which is able to encode information from each variable exclusively; the second module employs a variable attention mechanism by introducing a latent variable and enables the model to measure the importance of each variable to the target. With these two modules, the model can deal with multi-variable financial sequences more effectively. Moreover, a statistical arbitrage investment strategy is constructed based on the prediction model. Extensive experiments on the large-scale Chinese stock data show that the MV-LSTM network has a higher prediction accuracy and provides a better statistical arbitrage investment strategy than other methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

¥17,985 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Institutional subscriptions

Fig. 11

Stock Market Prediction Using Deep Learning Techniques for Short and Long Horizon

Stock Price Prediction Using GRU, SimpleRNN and LSTM

Stock Market Prediction Based on Advanced LSTM Models

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Notes

Vectors are assumed to be in column form in this paper.
http://www.csindex.com.cn/en.
https://www.joinquant.com/.

References

Avellaneda M, Lee JH (2010) Statistical arbitrage in the US equities market. Quant Financ 10(7):761–782
Article MathSciNet MATH Google Scholar
Gatev E, Goetzmann WN, Rouwenhorst KG (2006) Pairs trading: performance of a relative-value arbitrage rule. Rev Financ Stud 19(3):797–827
Article Google Scholar
Vidyamurthy G (2004) Pairs trading: quantitative methods and analysis, vol 217. Wiley, New York
Google Scholar
Huang CF, Hsu CJ, Chen CC, Chang BR, Li CA (2015) An intelligent model for pairs trading using genetic algorithms. Comput Intell Neurosci 2015:939606
Article Google Scholar
Nóbrega JP, Oliveira AL (2014) A combination forecasting model using machine learning and Kalman filter for statistical arbitrage. In: 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1294–1299. IEEE
Petropoulos A, Chatzis SP, Siakoulis V, Vlachogiannakis N (2017) A stacked generalization system for automated FOREX portfolio trading. Expert Syst Appl 90:290–302
Article Google Scholar
Fischer T, Krauss C (2018) Deep learning with long short-term memory networks for financial market predictions. Eur J Oper Res 270(2):654–669
Article MathSciNet MATH Google Scholar
Guo T, Lin T, Antulov-Fantulin N (2019) Exploring interpretable LSTM neural networks over multi-variable data. In: International conference on machine learning, pp. 2494–2504. PMLR
Cont R (2001) Empirical properties of asset returns: stylized facts and statistical issues. Quant Financ 1:223–236
Article MATH Google Scholar
Chakraborti A, Toke IM, Patriarca M, Abergel F (2011) Econophysics review: I. Empirical facts. Quant Financ 11(7):991–1012
Article MathSciNet Google Scholar
Granger CW (1992) Forecasting stock market prices: lessons for forecasters. Int J Forecast 8(1):3–13
Article Google Scholar
Agrawal J, Chourasia V, Mittra A (2013) State-of-the-art in stock prediction techniques. Int J Adv Res Electr Electr Instrum Eng 2(4):1360–1366
Google Scholar
Zhang L, Aggarwal C, Qi GJ (2017) Stock price prediction via discovering multi-frequency trading patterns. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 2141–2149
Ariyo AA, Adewumi AO, Ayo CK (2014) Stock price prediction using the ARIMA model. In: 2014 UKSim-AMSS 16th international conference on computer modelling and simulation, pp. 106–112. IEEE
Alberg D, Shalit H, Yosef R (2008) Estimating stock market volatility using asymmetric GARCH models. Appl Financ Econ 18(15):1201–1208
Article Google Scholar
Pratap A, Raja R, Cao J, Alzabut J, Huang C (2020) Finite-time synchronization criterion of graph theory perspective fractional-order coupled discontinuous neural networks. Adv Diff Equ 2020(1):1–24
Article MathSciNet MATH Google Scholar
Huang C, Liu B, Qian C, Cao J (2021) Stability on positive pseudo almost periodic solutions of HPDCNNs incorporating D operator. Math Comput Simul 190:1150–1163
Article MathSciNet MATH Google Scholar
Wang W (2022) Further results on mean-square exponential input-to-state stability of stochastic delayed Cohen-Grossberg neural networks. Neural Process Lett pp. 1–13
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2818–2826
Collobert R, Weston J (2008) A unified architecture for natural language processing: deep neural networks with multitask learning. In: Proceedings of the 25th international conference on machine learning, pp. 160–167
Althelaya KA, El-Alfy ESM, Mohammed S (2018) Evaluation of bidirectional LSTM for short-and long-term stock market prediction. In: 2018 9th international conference on information and communication systems (ICICS), pp. 151–156
Cao J, Li Z, Li J (2019) Financial time series forecasting model based on CEEMDAN and LSTM. Phys A 519:127–139
Article Google Scholar
Chang V, Man X, Xu Q, Hsu CH (2021) Pairs trading on different portfolios based on machine learning. Expert Syst 38(3):e12649
Article Google Scholar
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Article Google Scholar
Luong T, Pham H, Manning CD (2015) Effective approaches to attention-based neural machine translation. In: Proceedings of the 2015 conference on empirical methods in natural language processing (EMNLP), pp. 1412–1421
Sak H, Senior A, Beaufays F (2014) Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In: Fifteenth annual conference of the international speech communication association (ISCA), pp. 338–342
Nguyen H, Tran KP, Thomassey S, Hamad M (2021) Forecasting and anomaly detection approaches using LSTM and LSTM autoencoder techniques with the applications in supply chain management. Int J Inf Manage 57:102282
Article Google Scholar
Wang F, Liu X, Deng G, Yu X, Li H, Han Q (2019) Remaining life prediction method for rolling bearing based on the long short-term memory network. Neural Process Lett 50(3):2437–2454
Article Google Scholar
Kumar S, Sharma R, Tsunoda T, Kumarevel T, Sharma A (2021) Forecasting the spread of COVID-19 using LSTM network. BMC Bioinf 22(6):1–9
Google Scholar
Choi E, Bahadori MT, Sun J, Kulas J, Schuetz A, Stewart, W (2016) Retain: an interpretable predictive model for healthcare using reverse time attention mechanism. Adv Neural Inf Process Syst (NeurIPS), pp. 3504–3512
Qin Y, Song D, Cheng H, Cheng W, Jiang G, Cottrell GW (2017) A dual-stage attention-based recurrent neural network for time series prediction. In: Proceedings of the 26th international joint conference on artificial intelligence (IJCAI), pp. 2627–2633
Huck N (2009) Pairs selection and outranking: an application to the S &P 100 index. Eur J Oper Res 196(2):819–825
Article Google Scholar
Huck N (2010) Pairs trading and outranking: the multi-step-ahead forecasting case. Eur J Oper Res 207(3):1702–1716
Article MathSciNet Google Scholar
Krauss C, Do XA, Huck N (2017) Deep neural networks, gradient-boosted trees, random forests: statistical arbitrage on the S &P 500. Eur J Oper Res 259(2):689–702
Article MATH Google Scholar
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Article MATH Google Scholar
Shen G, Tan Q, Zhang H, Zeng P (2018) Deep learning with gated recurrent unit networks for financial sequence predictions. Procedia Comput Sci 131:895–903
Article Google Scholar
Lee SI, Yoo SJ (2018) A new method for portfolio construction using a deep predictive model. In: Proceedings of the 7th international conference on emerging databases, pp. 260–266
Gao Y, Wang R, Zhou E (2021) Stock prediction based on optimized LSTM and GRU models. Scientific Programming 2021
Hu Y (2021) Stock forecast based on optimized LSSVM model. Comput Sci 48(S1):151–157
Google Scholar
Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078
Gers FA, Schmidhuber J (2000) Recurrent nets that time and count. In: Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. Neural Computing: New Challenges and Perspectives for the New Millennium (IJCNN), pp. 189–194
Gers FA, Schmidhuber J, Cummins F (2000) Learning to forget: continual prediction with LSTM. Neural Comput 12(10):2451–2471
Article Google Scholar
Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: 3rd international conference on learning representations (ICLR)
Tieleman T, Hinton G (2012) Lecture 6.5-rmsprop: divide the gradient by a running average of its recent magnitude. COURSERA Neural Netw Mach Earn 4(2):26–31
Google Scholar
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958
MathSciNet MATH Google Scholar
Chollet F, et al. (2015) Keras. https://github.com/fchollet/keras
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
Article MATH Google Scholar
Goodfellow IJ, Warde-Farley D, Mirza M, Courville A, Bengio Y (2013) Maxout networks. In: Proceedings of the 30th international conference on international conference on machine learning (ICML), pp. III–1319
Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics, pp. 249–256. JMLR Workshop and Conference Proceedings
Kim TK (2015) T test as a parametric statistic. Korean J Anesthesiol 68(6):540
Article MathSciNet Google Scholar

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61976174), the Nature Science Basis Research Program of Shaanxi (No. 2021JQ-055), the Ministry of Education of Humanities and Social Science Project of China (No. 22XJCZH004) and the Scientific Research Project of Shaanxi Provincial Department of Education (No. 22JK0186).

Author information

Authors and Affiliations

School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, 710049, People’s Republic of China
Fei Gao, Jiangshe Zhang, Chunxia Zhang, Shuang Xu & Cong Ma

Authors

Fei Gao
View author publications
You can also search for this author inPubMed Google Scholar
Jiangshe Zhang
View author publications
You can also search for this author inPubMed Google Scholar
Chunxia Zhang
View author publications
You can also search for this author inPubMed Google Scholar
Shuang Xu
View author publications
You can also search for this author inPubMed Google Scholar
Cong Ma
View author publications
You can also search for this author inPubMed Google Scholar

Corresponding author

Correspondence to Jiangshe Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A Evaluation Metrics

Several evaluation metrics used in this paper are defined as follows.

(1)
Max drawdown: Max drawdown (MDD) measures the maximum fall in the value of an investment. It is calculated by the difference between the value of the lowest trough and that of the highest peak before the trough, i.e.,
$$\begin{aligned} {\mathrm{MDD}} = \mathop {max}\limits _{0 \le t_1 \le t_2 \le T} \frac{V_{t_1}-V_{t_2}}{V_{t_1}}, \end{aligned}$$
where T is the length of trading period, $V_{t_1}$ and $V_{t_2}$ are the values of the investment at the time $t_1$ and $t_2$, respectively. A low MDD value indicates slight fluctuations in the investment value and, therefore, a low degree of risk, and vice versa.
(2)
Sharpe ratio: Sharpe ratio measures the performance of an investment compared to a risk-free asset, after adjusting for its risk. It is defined as the excess return for per unit of risk, i.e.,
$$\begin{aligned} {\mathrm{Sharpe \ ratio}} = \frac{R_p-R_f}{\sigma _p}, \end{aligned}$$
where $R_p$ is the expected return of the investment, $R_f$ is the risk-free rate, and $\sigma _p$ is the standard deviation of the investment return. The higher the ratio, the greater the investment return relative to the amount of risk, and thus, the better the investment.
(3)
Sortino ratio: Sortino ratio is an improvement of Sharpe ratio. It only considers the downside risk rather than the total risk like Sharpe ratio, since upside risk is beneficial. It is calculated as
$$\begin{aligned} {\mathrm{Sortino\ ratio}} = \frac{R_p-R_f}{\sigma _d}, \end{aligned}$$
where $\sigma _d$ is the standard deviation of the downside risk deviation. Like Sharpe ratio, a higher Sortino ratio indicates a better investment.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Gao, F., Zhang, J., Zhang, C. et al. Long Short-Term Memory Networks with Multiple Variables for Stock Market Prediction. Neural Process Lett 55, 4211–4229 (2023). https://doi.org/10.1007/s11063-022-11037-8

Download citation

Accepted: 13 September 2022
Published: 10 October 2022
Issue Date: August 2023
DOI: https://doi.org/10.1007/s11063-022-11037-8

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

¥17,985 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Institutional subscriptions

Long Short-Term Memory Networks with Multiple Variables for Stock Market Prediction

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Stock Market Prediction Using Deep Learning Techniques for Short and Long Horizon

Stock Price Prediction Using GRU, SimpleRNN and LSTM

Stock Market Prediction Based on Advanced LSTM Models

Explore related subjects

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Appendix A Evaluation Metrics

Appendix A Evaluation Metrics

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now