A new kernel fuzzy based feature extraction method using attraction points


Abstract

This paper introduces a novel supervised feature extraction method for small sample size situations. The proposed approach considers the class membership of samples and exploits a nonlinear mapping in order to extract the relevant features and to mitigate the Hughes phenomenon. The proposed objective function is composed of three terms, namely an attraction function, a repulsion function, and the between-feature scatter matrix, where the last term increases the difference between extracted features. Subsequently, the attraction and repulsion functions are redefined by incorporating the membership degrees of samples. Finally, the proposed method is extended using the kernel trick to capture the inherent nonlinearity of the original data. To evaluate the accuracy of the proposed feature extraction method, four remote sensing images are used in the experiments. The experiments indicate that the proposed feature extraction method is an appropriate choice for classification of hyperspectral images using limited training samples.
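To make the abstract's attraction/repulsion idea concrete, the following is a minimal, hypothetical NumPy sketch of such an objective. It is not the paper's formulation: taking class means as attraction points, inverse-distance fuzzy memberships, and the `gamma` trade-off are assumptions made purely for illustration.

```python
# Hypothetical sketch of a fuzzy attraction/repulsion objective -- NOT the paper's
# exact method.  Attraction points (class means), the inverse-distance membership
# rule, and the gamma weighting are illustrative assumptions.
import numpy as np

def fuzzy_memberships(X, y, n_classes):
    """Membership degree of each sample to every class, from inverse distances
    to the class means (assumed membership rule); rows sum to 1."""
    means = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
    d = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2) + 1e-12
    u = 1.0 / d
    return u / u.sum(axis=1, keepdims=True)

def attraction_repulsion_value(X, y, T, n_classes, gamma=1.0):
    """Toy objective: membership-weighted attraction of projected samples to their
    own class's attraction point, minus repulsion from the other classes."""
    Z = X @ T.T                                  # extracted features
    means = np.stack([Z[y == c].mean(axis=0) for c in range(n_classes)])
    U = fuzzy_memberships(X, y, n_classes)
    d2 = ((Z[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    own = np.eye(n_classes)[y]                   # one-hot class indicator
    attraction = (U * own * d2).sum()            # pull towards own attraction point
    repulsion = (U * (1 - own) * d2).sum()       # push away from other classes
    return gamma * repulsion - attraction        # larger is better

# Tiny usage example with random data
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10)); y = rng.integers(0, 3, size=60)
T = rng.normal(size=(4, 10))                     # 4 extracted features
print(attraction_repulsion_value(X, y, T, n_classes=3))
```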


References

  • Baudat, G., & Anouar, F. (2000). Generalized discriminant analysis using a kernel approach. Neural Computation, 12(10), 2385–2404.
  • Camps-Valls, G., Shervashidze, N., & Borgwardt, K. M. (2010). Spatio-spectral remote sensing image classification with graph kernels. IEEE Geoscience and Remote Sensing Letters, 7(4), 741–745.
  • Chang, C. C., & Lin, C. J. (2008). LIBSVM—A library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm.
  • Chen, L. F., Mark Liao, H. Y., Ko, M. T., Lin, J. C., & Yu, G. J. (2000). A new LDA-based face recognition system which can solve the small sample size problem. Pattern Recognition, 33, 1713–1726.
  • Cui, Y., & Fan, L. (2012). Feature extraction using fuzzy maximum margin criterion. Neurocomputing, 86, 52–58.
  • Dehghani, H., & Ghassemian, H. (2006). Measurement of uncertainty by the entropy: Application to the classification of MSS data. International Journal of Remote Sensing, 27(18), 4005–4014.
  • Ding, S., Meng, L., Han, Y., & Xue, Y. (2017a). A review on feature binding theory and its functions observed in perceptual process. Cognitive Computation, 9(2), 194–206.
  • Ding, S., Zhang, X., An, Y., & Xue, Y. (2017b). Weighted linear loss multiple birth support vector machine based on information granulation for multi-class classification. Pattern Recognition, 67, 32–46.
  • Foody, G. M. (2004). Thematic map comparison: Evaluating the statistical significance of differences in classification accuracy. Photogrammetric Engineering and Remote Sensing, 70, 627–633.
  • Gao, F., Lv, W., Zhang, Y., Sun, J., Wang, J., & Yang, E. (2016). A novel semisupervised support vector machine classifier based on active learning and context information. Multidimensional Systems and Signal Processing, 27(4), 969–988.
  • Hastie, T., Buja, A., & Tibshirani, R. (1995). Penalized discriminant analysis. Annals of Statistics, 23(1), 73–102.
  • Howland, P., & Park, H. (2004). Generalizing discriminant analysis using the generalized singular value decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(8), 995–1006.
  • Imani, M., & Ghassemian, H. (2014a). Feature extraction using attraction points for classification of hyperspectral images in a small sample size situation. IEEE Geoscience and Remote Sensing Letters, 11(11), 1986–1990.
  • Imani, M., & Ghassemian, H. (2014b). Band clustering-based feature extraction for classification of hyperspectral images using limited training samples. IEEE Geoscience and Remote Sensing Letters, 11(8), 1325–1329.
  • Imani, M., & Ghassemian, H. (2015). Feature space discriminant analysis for hyperspectral data feature reduction. ISPRS Journal of Photogrammetry and Remote Sensing, 102, 1–13.
  • Ji, S. W., & Ye, J. P. (2008). Generalized linear discriminant analysis: A unified framework and efficient model selection. IEEE Transactions on Neural Networks, 19(10), 1768–1782.
  • Kamandar, M., & Ghassemian, H. (2013). Linear feature extraction for hyperspectral images based on information theoretic learning. IEEE Geoscience and Remote Sensing Letters, 10(4), 702–706.
  • Kathrin, S. (2004). On the Kronecker product. Master's thesis, University of Waterloo.
  • Kwak, K., & Pedrycz, W. (2005). Face recognition using a fuzzy fisherface classifier. Pattern Recognition, 38, 1717–1732.
  • Landgrebe, D. A. (2002). Hyperspectral image data analysis. IEEE Signal Processing Magazine, 19(1), 17–28.
  • Li, H. F., Jiang, T., & Zhang, K. S. (2006). Efficient and robust feature extraction by maximum margin criterion. IEEE Transactions on Neural Networks, 17(1), 157–165.
  • Li, J., et al. (2015). Multiple feature learning for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 53(3), 1592–1606.
  • Liang, Y. X., Li, C. R., Gong, W. G., & Pan, Y. J. (2007). Uncorrelated linear discriminant analysis based on weighted pairwise Fisher criterion. Pattern Recognition, 40, 3606–3615.
  • Liu, S., Feng, L., Liu, Y., Wu, J., Sun, M., & Wang, W. (2016). Robust discriminative extreme learning machine for relevance feedback in image retrieval. Multidimensional Systems and Signal Processing, 1, 1–19.
  • Lotlikar, R., & Kothari, R. (2000). Fractional-step dimensionality reduction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(6), 623–627.
  • Lu, J., Plataniotis, K. N., & Venetsanopoulos, A. N. (2005). Regularization studies of linear discriminant analysis in small sample size scenarios with application to face recognition. Pattern Recognition Letters, 26(2), 181–191.
  • Marconcini, M., Camps-Valls, G., & Bruzzone, L. (2009). A composite semisupervised SVM for classification of hyperspectral images. IEEE Geoscience and Remote Sensing Letters, 6(2), 234–238.
  • Melgani, F., & Bruzzone, L. (2004). Classification of hyperspectral remote sensing images with support vector machines. IEEE Transactions on Geoscience and Remote Sensing, 42(8), 1778–1790.
  • Pekalska, E., & Haasdonk, B. (2009). Kernel discriminant analysis for positive definite and indefinite kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6), 1017–1032.
  • Prasad, B. K., & Sanyal, G. (2016). Novel features and a cascaded classifier based Arabic numerals recognition system. Multidimensional Systems and Signal Processing, 1, 1–18.
  • Price, R., & Gee, F. (2005). Face recognition using direct, weighted linear discriminant analysis and modular subspaces. Pattern Recognition, 38, 209–219.
  • Schölkopf, B., Smola, A. J., & Müller, K. R. (1997). Kernel principal component analysis. In Lecture notes in computer science.
  • Shahdoosti, H. R., & Javaheri, N. (2017). Pansharpening of clustered MS and Pan images considering mixed pixels. IEEE Geoscience and Remote Sensing Letters, 14(6), 826–830.
  • Shahdoosti, H. R., & Javaheri, N. (2018a). A fast algorithm for feature extraction of hyperspectral images using the first order statistics. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-018-5695-0.
  • Shahdoosti, H. R., & Javaheri, N. (2018b). A new hybrid feature extraction method in a dyadic scheme for classification of hyperspectral data. International Journal of Remote Sensing, 39(1), 101–130.
  • Shahdoosti, H. R., & Mirzapour, F. (2017). Spectral–spatial feature extraction using orthogonal linear discriminant analysis for classification of hyperspectral data. European Journal of Remote Sensing, 50(1), 111–124.
  • Shahshahani, B. M., & Landgrebe, D. A. (1994). The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon. IEEE Transactions on Geoscience and Remote Sensing, 32(5), 1087–1095.
  • Wang, J. G., Lin, Y. S., Yang, W. K., & Yang, J. Y. (2008). Kernel maximum scatter difference based feature extraction and its application to face recognition. Pattern Recognition Letters, 29, 1832–1835.
  • Xia, J., Chanussot, J., Du, P., & He, X. (2014). (Semi-)supervised probabilistic principal component analysis for hyperspectral remote sensing image classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7(6), 2224–2236.
  • Xue, B., Zhang, M., & Browne, W. N. (2013). Particle swarm optimization for feature selection in classification: A multi-objective approach. IEEE Transactions on Cybernetics, 43(6), 1656–1671.
  • Yang, W. K., Wang, J. G., Ren, M. W., Zhang, L., & Yang, J. Y. (2009). Feature extraction using fuzzy inverse FDA. Neurocomputing, 72, 3384–3390.
  • Ye, J. P. (2006). Computational and theoretical analysis of null space and orthogonal linear discriminant analysis. The Journal of Machine Learning Research, 7, 1183–1204.
  • Ye, J. P., & Li, Q. (2005). A two-stage linear discriminant analysis via QR-decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6), 929–941.
  • Yu, H., & Yang, J. (2001). A direct LDA algorithm for high-dimensional data—With application to face recognition. Pattern Recognition, 34, 2067–2070.
  • Zhang, J., Ding, S., Zhang, N., & Shi, Z. (2016). Incremental extreme learning machine based on deep feature embedded. International Journal of Machine Learning and Cybernetics, 7(1), 111–120.
  • Zhu, M., & Martinez, A. M. (2006). Selecting principal components in a two-stage LDA algorithm. In 2006 IEEE computer society conference on computer vision and pattern recognition (CVPR'06) (Vol. 1, pp. 132–137).


Author information

Correspondence to Hamid Reza Shahdoosti.

Appendices

Appendix A

Considering \( {\mathbf{A}} = 4{\mathbf{UPU}}^{T} \), \( {\mathbf{B}} = 2\gamma {\mathbf{Q}}^{T} {\mathbf{Q}} \), and \( {\mathbf{C}} = {\mathbf{UU}}^{T} \), and applying the vec operator to Eq. (7) yield:

$$ \text{vec} ({\mathbf{TA}}) + \text{vec} ({\mathbf{BTC}}) - \lambda \text{vec} ({\mathbf{T}}) = 0 $$
(17)

Substituting \( \text{vec} ({\mathbf{TA}}) \) with \( \text{vec} ({\mathbf{I}}_{m \times m} {\mathbf{TA}}) \), where \( {\mathbf{I}}_{m \times m} \) is an m × m identity matrix, and using the equality \( \text{vec} ({\mathbf{abc}}) = ({\mathbf{c}}^{T} \otimes {\mathbf{a}})\text{vec} ({\mathbf{b}}) \) (Kathrin 2004), where \( \otimes \) is the Kronecker product, one can rewrite Eq. (17) as:

$$ ({\mathbf{A}}^{T} \otimes {\mathbf{I}}_{m \times m} )\text{vec} ({\mathbf{T}}) + ({\mathbf{C}}^{T} \otimes {\mathbf{B}})\text{vec} ({\mathbf{T}}) - \lambda \text{vec} ({\mathbf{T}}) = 0 $$
(18)

which is equal to Eq. (8).
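As a quick numerical sanity check on this derivation, the following sketch verifies that \( \text{vec} ({\mathbf{TA}}) + \text{vec} ({\mathbf{BTC}}) \) equals \( ({\mathbf{A}}^{T} \otimes {\mathbf{I}} + {\mathbf{C}}^{T} \otimes {\mathbf{B}})\text{vec} ({\mathbf{T}}) \) for random matrices, using the column-stacking vec operator of the cited Kronecker-product notes. The matrix sizes are arbitrary assumptions.

```python
# Numerical check of the identity behind Eqs. (17)-(18), with column-major vec,
# so that vec(ABC) = (C^T kron A) vec(B).  Sizes are arbitrary.
import numpy as np

rng = np.random.default_rng(1)
m, d = 3, 5                                   # T is m x d (assumed sizes)
T = rng.normal(size=(m, d))
A = rng.normal(size=(d, d))                   # stands in for 4*U P U^T
B = rng.normal(size=(m, m))                   # stands in for 2*gamma*Q^T Q
C = rng.normal(size=(d, d))                   # stands in for U U^T

vec = lambda M: M.flatten(order="F")          # column-stacking vectorisation

lhs = vec(T @ A) + vec(B @ T @ C)
rhs = (np.kron(A.T, np.eye(m)) + np.kron(C.T, B)) @ vec(T)
print(np.allclose(lhs, rhs))                  # True
```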

Appendix B

Considering Eq. (6), one should maximize the following equation under the normalization constraint:

$$ 2\text{tr} ({\mathbf{TUPU}}^{T} {\mathbf{T}}^{T} ) + \gamma \text{tr} ({\mathbf{U}}^{T} {\mathbf{T}}^{T} {\mathbf{Q}}^{T} {\mathbf{QTU}}) $$
(19)

Using the circular property of trace, one may write:

$$ 2\text{tr} ({\mathbf{T}}^{T} {\mathbf{TUPU}}^{T} ) + \gamma \text{tr} ({\mathbf{T}}^{T} {\mathbf{Q}}^{T} {\mathbf{QTUU}}^{T} ) $$
(20)

Using the equality \( \text{tr} ({\mathbf{a}}^{T} {\mathbf{b}}) = \text{vec} ({\mathbf{a}})^{T} \text{vec} ({\mathbf{b}}) \) (Kathrin 2004), one can rewrite Eq. (20) as:

$$ 2\text{vec} ({\mathbf{T}})^{T} \text{vec} ({\mathbf{TUPU}}^{T} ) + \gamma \text{vec} ({\mathbf{T}})^{T} \text{vec} ({\mathbf{Q}}^{T} {\mathbf{QTUU}}^{T} ) $$
(21)

Using the equality \( \text{vec} ({\mathbf{abc}}) = ({\mathbf{c}}^{T} \otimes {\mathbf{a}})\text{vec} ({\mathbf{b}}) \) (Kathrin 2004) and defining \( {\mathbf{A}} = 4{\mathbf{UPU}}^{T} \), \( {\mathbf{B}} = 2\gamma {\mathbf{Q}}^{T} {\mathbf{Q}} \), and \( {\mathbf{C}} = {\mathbf{UU}}^{T} \), Eq. (21) can be rewritten as:

$$ \begin{aligned} & \text{vec} ({\mathbf{T}})^{T} ({\mathbf{A}}^{T} \otimes {\mathbf{I}}_{m \times m} )\text{vec} ({\mathbf{T}}) + \text{vec} ({\mathbf{T}})^{T} ({\mathbf{C}}^{T} \otimes {\mathbf{B}})\text{vec} ({\mathbf{T}}) \\ & \quad = \text{vec} ({\mathbf{T}})^{T} ({\mathbf{A}}^{T} \otimes {\mathbf{I}}_{m \times m} + {\mathbf{C}}^{T} \otimes {\mathbf{B}})\text{vec} ({\mathbf{T}}) \\ \end{aligned} $$
(22)

Due to the fact that \( \text{vec} ({\mathbf{T}}) \) is the eigenvector of \( {\mathbf{A}}^{T} \otimes {\mathbf{I}}_{m \times m} + {\mathbf{C}}^{T} \otimes {\mathbf{B}} \) (see Eq. (8)), one can conclude:

$$ \text{vec} ({\mathbf{T}})^{T} ({\mathbf{A}}^{T} \otimes {\mathbf{I}}_{m \times m} + {\mathbf{C}}^{T} \otimes {\mathbf{B}})\text{vec} ({\mathbf{T}}) = \lambda \text{vec} ({\mathbf{T}})^{T} \text{vec} ({\mathbf{T}}) $$
(23)

Using the equality \( \text{tr} ({\mathbf{a}}^{T} {\mathbf{b}}) = \text{vec} ({\mathbf{a}})^{T} \text{vec} ({\mathbf{b}}) \) and considering the normalization constraint, i.e., \( \text{tr} ({\mathbf{TT}}^{T} ) = 1 \), yields:

$$ \lambda \text{vec} ({\mathbf{T}})^{T} \text{vec} ({\mathbf{T}}) = \lambda \text{tr} ({\mathbf{TT}}^{T} ) = \lambda $$
(24)

So, the maximum of Eq. (19) is obtained if \( \text{vec} ({\mathbf{T}}) \) is the eigenvector corresponding to the largest eigenvalue of \( {\mathbf{A}}^{T} \otimes {\mathbf{I}}_{m \times m} + {\mathbf{C}}^{T} \otimes {\mathbf{B}} \).
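For completeness, the sketch below shows how this conclusion could be used in practice: build \( {\mathbf{A}}^{T} \otimes {\mathbf{I}}_{m \times m} + {\mathbf{C}}^{T} \otimes {\mathbf{B}} \), take the eigenvector of its largest eigenvalue, and reshape it back into \( {\mathbf{T}} \) with the same column-major vec convention. The matrices U, P, Q, the dimensions, and gamma are random placeholders, not the paper's actual constructions.

```python
# Sketch (placeholder U, P, Q) of recovering T from the leading eigenvector of
# A^T kron I + C^T kron B, as concluded in Appendix B.
import numpy as np

rng = np.random.default_rng(2)
m, d, n = 3, 5, 20                             # extracted features, bands, samples (assumed)
U = rng.normal(size=(d, n))                    # placeholder for U
P = rng.normal(size=(n, n)); P = P + P.T       # placeholder symmetric P
Q = rng.normal(size=(n, m))                    # placeholder Q (so Q^T Q is m x m)
gamma = 0.5                                    # placeholder trade-off

A = 4 * U @ P @ U.T                            # definitions from Appendix A
B = 2 * gamma * Q.T @ Q
C = U @ U.T

M = np.kron(A.T, np.eye(m)) + np.kron(C.T, B)  # matrix of Eq. (8)/(18)
eigvals, eigvecs = np.linalg.eigh(M)           # M is symmetric here (A, B, C symmetric)
t = eigvecs[:, -1]                             # eigenvector of the largest eigenvalue
T = t.reshape((m, d), order="F")               # undo the column-major vec
# t is unit-norm, so tr(T T^T) = 1 holds automatically
print(T.shape)
```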


Cite this article

Shahdoosti, H.R., Javaheri, N. A new kernel fuzzy based feature extraction method using attraction points. Multidim Syst Sign Process 30, 1009–1027 (2019). https://doi.org/10.1007/s11045-018-0592-2
