Abstract
We consider a multivariate data matrix of size \(n \times d = 2183 \times 15\), where \(n=2183\) is the number of time segments recorded from the vibration signals of two gearboxes and \(d=15\) is the number of variables (traits) characterizing these segments. To learn about the role played by each of the 15 variables in gearbox diagnostics, we use the Random Forest (RF) methodology with its ‘Variables Importance Plot’ (VIP) algorithm, which yields a ranking of the variables with regard to their importance in the performed diagnostics. This ranking differs from one run of the RF to another. We propose to add at this stage a module that performs a specific form of ensemble learning and yields credit scores for each variable. These scores clearly identify the most important variables.
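The idea of turning several unstable rankings into per-variable credits can be illustrated with a small sketch. The abstract does not specify the credit scheme, so the rule below (a variable at position p among the top k of a run earns k - p credits, summed over runs) is an assumption chosen for illustration, not the authors' algorithm. The function name `credit_scores`, the parameter `top_k`, the variable labels, and the three rankings are all hypothetical.

```python
from collections import Counter


def credit_scores(rankings, top_k=5):
    """Aggregate per-run importance rankings into credit scores.

    rankings: list of runs; each run lists variable names, most important first.
    A variable at 0-based position p < top_k in a run earns top_k - p credits.
    This scoring rule is an illustrative assumption, not the paper's scheme.
    """
    credits = Counter()
    for run in rankings:
        for pos, var in enumerate(run[:top_k]):
            credits[var] += top_k - pos
    # Sort by credits (descending), breaking ties by variable name.
    return sorted(credits.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical rankings from three RF runs (made up, not data from the paper).
runs = [
    ["v3", "v7", "v1", "v12", "v5"],
    ["v7", "v3", "v12", "v1", "v9"],
    ["v3", "v1", "v7", "v5", "v12"],
]

scores = credit_scores(runs, top_k=3)
for var, credit in scores:
    print(f"{var}: {credit}")
```

In this toy input the individual rankings disagree on the order of the leading variables, while the summed credits give a single ordering. In practice each ranking would come from the importance measure of one RF run.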
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Bartkowiak, A.M., Zimroz, R. (2021). Importance of Variables in Gearbox Diagnostics Using Random Forests and Ensemble Credits. In: Saeed, K., Dvorský, J. (eds) Computer Information Systems and Industrial Management. CISIM 2021. Lecture Notes in Computer Science, vol 12883. Springer, Cham. https://doi.org/10.1007/978-3-030-84340-3_1
DOI: https://doi.org/10.1007/978-3-030-84340-3_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84339-7
Online ISBN: 978-3-030-84340-3
eBook Packages: Computer Science, Computer Science (R0)