Abstract
As feature dimensions grow and hierarchical class structures emerge, hierarchical feature selection has become an important data preprocessing step in machine learning. A variety of effective feature selection methods based on granular computing and hierarchical information have been proposed. The fuzzy rough set method is an effective granular computing method for dealing with uncertainty. However, it is time-consuming because its distance calculations are based only on single samples. In this paper, we propose a fuzzy rough set approach to hierarchical feature selection that uses the Hausdorff distance between sample sets, integrating the benefits of sample granularity and class hierarchical granularity. First, the general feature selection task is decomposed into coarse-grained and fine-grained tasks according to the hierarchical structure of the data's semantic labels. This divides a large and difficult classification task into several small and controllable subtasks. Then, the Hausdorff distance-based fuzzy rough set method is used to select the best feature subset in each coarse- and fine-grained subtask. Unlike single-sample-based distance calculation, the Hausdorff distance is calculated between the sample sets of different classes. The new model greatly reduces the computational complexity of classification. Finally, we use a top-down support vector machine classifier to experimentally verify the effectiveness of the proposed method on five hierarchical datasets. Compared with five existing feature selection algorithms on three evaluation metrics, the proposed method provides the highest average accuracy and a much lower running time. In particular, on the F194 dataset, our method improves the FH indicator by 2% over the second-best algorithm while taking the least time.
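To make the set-level distance concrete, the following is a minimal sketch of the classical (max-min) Hausdorff distance between the sample sets of two classes. The abstract does not state which Hausdorff variant or base metric the method uses, so the Euclidean metric, the function names `directed_hausdorff` and `hausdorff`, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def directed_hausdorff(A, B):
    """Directed distance: the largest nearest-neighbour distance from A to B."""
    # Pairwise Euclidean distances, shape (len(A), len(B)).
    diff = A[:, None, :] - B[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=2))
    # For each sample in A take its closest sample in B, then the worst case.
    return dists.min(axis=1).max()


def hausdorff(A, B):
    """Symmetric Hausdorff distance between two sample sets."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


# Hypothetical toy data: two classes, each with two 2-D samples.
class_a = np.array([[0.0, 0.0], [1.0, 0.0]])
class_b = np.array([[0.0, 3.0], [5.0, 0.0]])

d_ab = directed_hausdorff(class_a, class_b)  # sqrt(10), about 3.162
d_ba = directed_hausdorff(class_b, class_a)  # 4.0
d = hausdorff(class_a, class_b)              # 4.0
```

The two directed distances differ, which is why the symmetric form takes their maximum. A single call yields one distance per pair of classes, in contrast to computing a distance for every pair of individual samples.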
Notes
Datasets and code used in this research have been explained and uploaded to GitHub. They are accessible at: https://github.com/fhqxa/APIN-HFRS.
Acknowledgements
This work was supported by the Natural Science Foundation of Fujian Province under Grant No. 2021J011003.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Qiu, Z., Zhao, H. A fuzzy rough set approach to hierarchical feature selection based on Hausdorff distance. Appl Intell 52, 11089–11102 (2022). https://doi.org/10.1007/s10489-021-03028-4
DOI: https://doi.org/10.1007/s10489-021-03028-4