Abstract
Feature selection aims to select a subset of features to decrease time complexity, reduce the storage burden, and improve the generalization ability of classification or clustering. For the vast amounts of unlabeled high-dimensional data, unsupervised feature selection is effective in alleviating the curse of dimensionality and finds applications in various fields. In this paper, we propose a non-convex regularized self-representation (RSR) model, in which each feature is represented as a linear combination of the other features, and impose L2,p-norm (0 < p < 1) regularization on the self-representation coefficients for unsupervised feature selection. Compared with the conventional L2,1-norm regularization, setting p < 1 yields a much sparser solution for the self-representation coefficients and is more effective in selecting salient features. To solve the non-convex RSR model, we further propose an efficient iterative reweighted least squares (IRLS) algorithm with guaranteed convergence to a fixed point. Extensive experimental results on nine datasets show that our feature selection method with a small p is more effective: the selected features mostly outperform those obtained at p = 1 and those chosen by other state-of-the-art unsupervised feature selection methods in terms of classification accuracy and clustering performance.
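To make the abstract's model concrete: RSR seeks a coefficient matrix W that reconstructs the data matrix X from its own features by minimizing ||X − XW||_{2,1} + λ||W||_{2,p}, and the row norms of W then score feature salience. Below is a minimal NumPy sketch of an IRLS loop for this objective; it is an illustration under these assumptions, not the authors' implementation, and the function name, parameter defaults, and identity initialization are our own choices.

```python
import numpy as np

def rsr_l2p_irls(X, lam=1.0, p=0.5, n_iter=50, eps=1e-8):
    """IRLS sketch for min_W ||X - XW||_{2,1} + lam * ||W||_{2,p}, 0 < p < 1.

    X: (n_samples, n_features) data matrix. Returns the coefficient
    matrix W and per-feature salience scores (the row norms of W).
    Hypothetical signature and defaults; not the paper's released code.
    """
    n, d = X.shape
    W = np.eye(d)  # assumed initialization: each feature represents itself
    for _ in range(n_iter):
        # Weights for the L2,1 residual term: one weight per sample,
        # g_i = 1 / (2 * ||e_i||_2), where e_i is the i-th residual row.
        E = X - X @ W
        g = 0.5 / np.maximum(np.linalg.norm(E, axis=1), eps)
        # Weights for the L2,p regularizer: one weight per feature,
        # h_j = (p / 2) * ||w_j||_2^(p - 2), where w_j is the j-th row of W.
        h = (p / 2.0) * np.maximum(np.linalg.norm(W, axis=1), eps) ** (p - 2.0)
        # Stationarity of the reweighted objective gives the linear system
        # (X^T G X + lam * H) W = X^T G X, with G = diag(g), H = diag(h).
        XtGX = X.T @ (g[:, None] * X)
        W = np.linalg.solve(XtGX + lam * np.diag(h), XtGX)
    scores = np.linalg.norm(W, axis=1)  # larger row norm => more salient feature
    return W, scores

# Example: score and rank the features of a random 100 x 20 data matrix.
X = np.random.default_rng(0).standard_normal((100, 20))
W, scores = rsr_l2p_irls(X, lam=0.1, p=0.5)
top_features = np.argsort(scores)[::-1][:5]
```

Each iteration solves a weighted least-squares problem in closed form; as rows of W shrink toward zero, their weights h grow, which is what drives the row sparsity that the L2,p penalty with p < 1 promotes more aggressively than L2,1.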
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, W., Zhang, H., Zhu, P., Zhang, D., Zuo, W. (2015). Non-convex Regularized Self-representation for Unsupervised Feature Selection. In: He, X., et al. (eds.) Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques. IScIDE 2015. Lecture Notes in Computer Science, vol 9243. Springer, Cham. https://doi.org/10.1007/978-3-319-23862-3_6
DOI: https://doi.org/10.1007/978-3-319-23862-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23861-6
Online ISBN: 978-3-319-23862-3
eBook Packages: Computer Science (R0)