Abstract
Variable selection is an important concept in data mining: it can improve both machine learning performance and process knowledge by removing irrelevant and redundant features. This paper presents a hybrid variable selection approach that merges a combination of filters with a wrapper in order to obtain an informative subset of variables in a reasonable time, improving the stability of the single approaches by more than 36% on average without decreasing system performance. The proposed method is tested on datasets coming from the UCI repository and from industrial contexts.
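As an illustration of the general filter-plus-wrapper idea summarised above, the sketch below combines two cheap filter rankings to prune the candidate features and then runs a forward wrapper search with a small neural network classifier. This is a minimal sketch only, not the authors' exact procedure: it assumes scikit-learn, and the dataset, the choice of filters, the value of k, and the network size are all illustrative assumptions.

# Minimal sketch of a generic filter + wrapper hybrid (illustrative, not the authors' exact method).
# Assumes scikit-learn; the filters, k, and the NN classifier are arbitrary example choices.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Filter stage: combine two cheap rankings and keep the union of their top-k features.
k = 10
top_f = set(np.argsort(f_classif(X, y)[0])[-k:])
top_mi = set(np.argsort(mutual_info_classif(X, y, random_state=0))[-k:])
candidates = sorted(top_f | top_mi)

# Wrapper stage: sequential forward selection with a small neural network,
# restricted to the filtered candidates to keep the search tractable.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
sfs = SequentialFeatureSelector(clf, n_features_to_select=5, direction="forward", cv=5)
sfs.fit(X[:, candidates], y)

selected = [candidates[i] for i in np.flatnonzero(sfs.get_support())]
score = cross_val_score(clf, X[:, selected], y, cv=5).mean()
print("selected feature indices:", selected, "cv accuracy: %.3f" % score)

The filter stage reduces the search space at negligible cost, so the expensive wrapper stage only evaluates subsets of the pre-screened candidates; this is the usual rationale for hybrid schemes of this kind.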
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Cateni, S., Colla, V. (2018). A Hybrid Variable Selection Approach for NN-Based Classification in Industrial Context. In: Esposito, A., Faundez-Zanuy, M., Morabito, F.C., Pasero, E. (eds) Multidisciplinary Approaches to Neural Computing. Smart Innovation, Systems and Technologies, vol 69. Springer, Cham. https://doi.org/10.1007/978-3-319-56904-8_17
DOI: https://doi.org/10.1007/978-3-319-56904-8_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56903-1
Online ISBN: 978-3-319-56904-8
eBook Packages: Engineering (R0)