Abstract
An open research question in deep reinforcement learning is how to focus policy learning on the key decisions within a sparse domain. This paper focuses on combining the advantages of input-output hidden Markov models and reinforcement learning. We propose a novel hierarchical modeling methodology that, at the high level, detects and interprets the root cause of a failure as well as the health degradation of the turbofan engine, while at the low level, it provides the optimal replacement policy. This approach outperforms baseline deep reinforcement learning (DRL) models and achieves performance comparable to that of a state-of-the-art reinforcement learning system while being more interpretable.
Supported by the Collaborative Intelligence for Safety-Critical Systems (CISC) project, funded by the European Union's Horizon 2020 Research and Innovation Programme under the Marie Skłodowska-Curie grant agreement no. 955901. The work of Kelleher is also partly funded by the ADAPT Centre, which is funded under the Science Foundation Ireland (SFI) Research Centres Programme (Grant No. 13/RC/2106_P2).
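The hierarchical decision loop summarized in the abstract can be illustrated with a minimal sketch: a high-level input-output hidden Markov model infers an interpretable health state from the sensor stream, and a low-level DRL agent maps that state to a maintenance action. The class names, method signatures, and three-action space below are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch only: names and interfaces are assumed for clarity.
def maintenance_episode(engine_env, iohmm, agent, max_cycles=500):
    """Run one run-to-failure episode of the hierarchical policy."""
    observation = engine_env.reset()
    for cycle in range(max_cycles):
        # High level: interpretable health/degradation estimate from raw sensors.
        health_state = iohmm.infer_state(observation)   # e.g. healthy / degrading / near-failure

        # Low level: replacement policy conditioned on the inferred health state.
        action = agent.act(health_state)                # e.g. "hold", "repair", "replace"

        observation, reward, done = engine_env.step(action)
        agent.remember(health_state, action, reward)

        if done or action == "replace":
            break
```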
Notes
- 1.
Performance indicates the ability to suggest replacement before failure, making use of the maximum usable life while incurring the fewest equipment failures.
- 2.
The hold action means that the agent suggests neither replacement nor repair, and the system is healthy enough for the next operating cycle (see the action-space sketch below).
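As a hedged illustration of these two notes, the sketch below encodes an assumed three-action space (hold, repair, replace) and a toy scoring of the performance criterion in note 1; the episode representation and function names are hypothetical.

```python
from enum import Enum

class MaintenanceAction(Enum):
    HOLD = 0     # neither replace nor repair; engine runs another operating cycle
    REPAIR = 1
    REPLACE = 2

def evaluate_policy(episodes):
    """Toy scoring of the criteria in note 1: few failures, long usable life.

    `episodes` is assumed to be a list of (cycles_used, failed) tuples, where
    `failed` is True when the engine broke down before a replacement was issued.
    """
    failures = sum(1 for _, failed in episodes if failed)
    mean_usable_life = sum(cycles for cycles, _ in episodes) / len(episodes)
    return {"failures": failures, "mean_usable_life": mean_usable_life}
```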
Algorithms and Training Parameters

1.1 Training Parameters
The deep learning components used within the RL architectures are summarized as follows: (a) a Deep Neural Network (DNN) with approximately 37,000 trainable parameters, consisting of two fully connected (dense) hidden layers of 128 and 256 neurons, respectively, with ReLU activation; (b) a Recurrent Neural Network (RNN) with approximately 468,000 trainable parameters, consisting of two LSTM hidden layers of 128 and 256 units, respectively. In both cases, the output layer has one unit per action available to the agent and uses a linear activation. The hyperparameters of the DRL agent are: discount rate = 0.95, learning rate = 1e−4, and epsilon decay rate = 0.99 with an initial epsilon of 0.5.
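A minimal sketch of these architectures and hyperparameters follows. The use of PyTorch, the input dimensionality, and the three-action output are assumptions made only for illustration; the appendix does not specify the framework or these values.

```python
import torch
import torch.nn as nn

N_ACTIONS = 3        # assumption: hold / repair / replace
STATE_DIM = 14       # assumption: number of input features per time step

# (a) Dense Q-network: two hidden layers (128, 256) with ReLU, linear output.
dnn_q_network = nn.Sequential(
    nn.Linear(STATE_DIM, 128), nn.ReLU(),
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, N_ACTIONS),           # linear activation over the action values
)

# (b) Recurrent Q-network: two LSTM layers (128, 256) followed by a linear head.
class RecurrentQNetwork(nn.Module):
    def __init__(self, state_dim=STATE_DIM, n_actions=N_ACTIONS):
        super().__init__()
        self.lstm1 = nn.LSTM(state_dim, 128, batch_first=True)
        self.lstm2 = nn.LSTM(128, 256, batch_first=True)
        self.head = nn.Linear(256, n_actions)

    def forward(self, x):                # x: (batch, time, state_dim)
        x, _ = self.lstm1(x)
        x, _ = self.lstm2(x)
        return self.head(x[:, -1, :])    # Q-values from the last time step

# DRL agent hyperparameters reported in the appendix.
GAMMA = 0.95            # discount rate
LEARNING_RATE = 1e-4
EPSILON_INITIAL = 0.5
EPSILON_DECAY = 0.99

optimizer = torch.optim.Adam(dnn_q_network.parameters(), lr=LEARNING_RATE)
```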
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Abbas, A.N., Chasparis, G.C., Kelleher, J.D. (2022). Interpretable Input-Output Hidden Markov Model-Based Deep Reinforcement Learning for the Predictive Maintenance of Turbofan Engines. In: Wrembel, R., Gamper, J., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2022. Lecture Notes in Computer Science, vol 13428. Springer, Cham. https://doi.org/10.1007/978-3-031-12670-3_12
DOI: https://doi.org/10.1007/978-3-031-12670-3_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-12669-7
Online ISBN: 978-3-031-12670-3