Abstract
For high-dimensional data analytics, feature selection is an indispensable preprocessing step for reducing dimensionality while preserving the simplicity and interpretability of models. This is particularly important for fuzzy modeling, since fuzzy models are widely valued for their transparency and interpretability. Despite the substantial body of work on feature selection, little research addresses how to determine the optimal number of features for a task. In this paper, we propose a method, based on mutual information, for finding the optimal number of features effectively.
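The abstract does not detail the authors' procedure, but the general idea of mutual-information-based feature ranking can be sketched as follows. This is an illustrative example only, not the paper's method: it estimates the empirical mutual information between each discrete feature and the class label, then ranks features by that score; a knee in the ranked scores is one heuristic for choosing how many features to keep. The dataset and feature names (`f0`, `f1`, `f2`) are invented for the example.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# Toy dataset: three discrete features and a binary label.
# f0 mirrors the label, f1 is a noisy copy, f2 is irrelevant.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f0": [0, 0, 0, 0, 1, 1, 1, 1],
    "f1": [0, 0, 1, 0, 1, 1, 0, 1],
    "f2": [0, 1, 0, 1, 0, 1, 0, 1],
}

# Rank features by relevance to the label; a sharp drop ("knee") in the
# ranked scores suggests how many features are worth keeping.
ranking = sorted(features,
                 key=lambda f: mutual_information(features[f], labels),
                 reverse=True)
print(ranking)  # → ['f0', 'f1', 'f2']
```

In practice, plain relevance ranking like this ignores redundancy between features; criteria such as mRMR (Peng et al.) penalize features that duplicate information already selected.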
Acknowledgement
This work is partially supported by Philips Research within the scope of the BrainBridge Program.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Chen, P., Wilbik, A., van Loon, S., Boer, AK., Kaymak, U. (2018). Finding the Optimal Number of Features Based on Mutual Information. In: Kacprzyk, J., Szmidt, E., Zadrożny, S., Atanassov, K., Krawczak, M. (eds) Advances in Fuzzy Logic and Technology 2017. EUSFLAT IWIFSGN 2017 2017. Advances in Intelligent Systems and Computing, vol 641. Springer, Cham. https://doi.org/10.1007/978-3-319-66830-7_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66829-1
Online ISBN: 978-3-319-66830-7
eBook Packages: Engineering (R0)