Abstract
This paper proposes a novel unsupervised feature selection method that jointly performs self-representation and subspace learning. Following the idea of self-representation, each feature is represented by all of the features. A Frobenius-norm regularizer is used for feature selection because it mitigates over-fitting. Locality Preserving Projection (LPP) is used as a regularization term because it preserves the local neighborhood relations among data points during feature space transformation. Further, a low-rank constraint is introduced to find the effective low-dimensional structure of the data, which reduces redundancy. Experimental results on real-world datasets verify that the proposed method selects the most discriminative features and outperforms state-of-the-art unsupervised feature selection methods in terms of classification accuracy, standard deviation, and coefficient of variation.
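The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the objective, so the code assumes the form min_W ||X - XW||_F^2 + lam * ||W||_F^2 + beta * tr(W^T X^T L X W), where L is the Laplacian of a k-nearest-neighbor graph with heat-kernel weights (a common LPP choice). The low-rank constraint is omitted entirely, and ranking features by the row norms of W is an assumption borrowed from common practice in self-representation methods. The names `knn_heat_kernel_graph`, `score_features`, `lam`, `beta`, `k`, and `sigma` are hypothetical.

```python
import numpy as np


def knn_heat_kernel_graph(X, k=5, sigma=1.0):
    """Symmetric kNN affinity matrix with heat-kernel weights (assumed LPP graph)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped to remove tiny negatives.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]
        S[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma ** 2))
    return np.maximum(S, S.T)  # symmetrize


def score_features(X, lam=1.0, beta=0.1, k=5, sigma=1.0):
    """Score features under the assumed objective (low-rank constraint omitted).

    X has shape (n_samples, n_features). Setting the gradient of the assumed
    objective to zero gives the linear system
        (X^T X + lam * I + beta * X^T L X) W = X^T X.
    """
    X = X - X.mean(axis=0)                      # center each feature
    S = knn_heat_kernel_graph(X, k=k, sigma=sigma)
    L = np.diag(S.sum(axis=1)) - S              # graph Laplacian
    G = X.T @ X
    A = G + lam * np.eye(X.shape[1]) + beta * (X.T @ L @ X)
    W = np.linalg.solve(A, G)                   # A is positive definite for lam > 0
    scores = np.linalg.norm(W, axis=1)          # row norms as feature scores (assumption)
    return scores, W
```

Under these assumptions, the m highest-scoring features would be `np.argsort(scores)[::-1][:m]`. The closed-form solve is available only because every term in the assumed objective is quadratic in W; adding a rank constraint would call for a different solver.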
Notes
UCI Repository of Machine Learning Datasets, http://archive.ics.uci.edu
Acknowledgments
This work was in part supported by the Marsden Fund of New Zealand and the China Scholarship Council.
Additional information
This article belongs to the Topical Collection: Special Issue on Deep Mining Big Social Data
Guest Editors: Xiaofeng Zhu, Gerard Sanroma, Jilian Zhang, and Brent C. Munsell
Cite this article
Wang, R., Zong, M. Joint self-representation and subspace learning for unsupervised feature selection. World Wide Web 21, 1745–1758 (2018). https://doi.org/10.1007/s11280-017-0508-3
DOI: https://doi.org/10.1007/s11280-017-0508-3