Dimensionality Reduction and Optimum Feature Selection in Designing Efficient Classifiers

Das, A. K.; Sil, J.

doi:10.1007/978-3-642-17563-3_67

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6466)

Included in the following conference series:

International Conference on Swarm, Evolutionary, and Memetic Computing

Abstract

Data sets accumulated in day-to-day work grow continuously and acquire a large number of features, yet they are often incomplete and have relatively low information density. Dimensionality reduction and feature selection are core issues in handling such data sets and, more specifically, in discovering relationships in the data. Dimensionality reduction by reduct generation is an important aspect of classification: a reduct is a reduced attribute set that has the same classification power as the entire attribute set of an information system. In this paper, multiple reducts are generated by integrating rough set theory (RST) with relational algebra operations. Next, the attributes of these reducts that are relatively better associated and have stronger classification power are selected, using the classical Apriori algorithm, to form a single reduct. Different classifiers are built using the single reduct, and their accuracies are compared to measure the effectiveness of the proposed method.
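
To make the notion of a reduct concrete, the following is a minimal Python sketch, not the authors' method. The abstract does not describe the relational algebra operations or how Apriori combines the reducts, so this sketch substitutes a brute-force search over attribute subsets and a plain per-attribute support count (comparable only to the first counting pass of an Apriori-style search). The decision table, the attribute names `a` to `d`, and the helper names `is_consistent`, `reducts`, and `attribute_support` are all hypothetical, and the table is assumed to be consistent (no two identical rows with different decisions).

```python
from itertools import combinations

# Hypothetical toy decision table: (condition attribute values, decision).
ATTRS = ("a", "b", "c", "d")
ROWS = [
    ((0, 0, 1, 0), "no"),
    ((0, 1, 1, 0), "yes"),
    ((1, 0, 0, 1), "yes"),
    ((1, 1, 0, 1), "yes"),
    ((0, 0, 0, 0), "no"),
]


def is_consistent(subset):
    """True if rows that agree on `subset` never disagree on the decision."""
    seen = {}
    for values, decision in ROWS:
        key = tuple(values[ATTRS.index(a)] for a in subset)
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def reducts():
    """All minimal consistent attribute subsets, found by brute force.

    Subsets are visited in order of increasing size, so any subset that
    contains an already-found reduct cannot be minimal and is skipped.
    """
    found = []
    for size in range(1, len(ATTRS) + 1):
        for subset in combinations(ATTRS, size):
            if any(set(r) <= set(subset) for r in found):
                continue
            if is_consistent(subset):
                found.append(subset)
    return found


def attribute_support(reduct_list):
    """Fraction of reducts in which each attribute occurs."""
    total = len(reduct_list)
    return {a: sum(a in r for r in reduct_list) / total for a in ATTRS}


if __name__ == "__main__":
    all_reducts = reducts()
    print(all_reducts)
    print(attribute_support(all_reducts))
```

On this toy table the search finds two reducts, `("a", "b")` and `("b", "d")`; attribute `b` occurs in both, which is the kind of signal a support-based selection step could use when choosing attributes for a single reduct. Brute-force search is exponential in the number of attributes and is only meant to illustrate the definition.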

Author information

Authors and Affiliations

Department of Computer Science and Technology, Bengal Engineering and Science University, Shibpur, Howrah, India
A. K. Das & J. Sil

Editor information

Editors and Affiliations

Department of Electrical Engineering, Indian Institute of Technology, New Delhi, India
Bijaya Ketan Panigrahi
Department of Electronics and Communication Engineering, Jadavpur University, 700032, Kolkata, West Bengal, India
Swagatam Das
School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore
Ponnuthurai Nagaratnam Suganthan
Department of Electrical and Electronics Engineering, SRM University, Chennai, Tamil Nadu, India
Subhransu Sekhar Dash

About this paper

Cite this paper

Das, A.K., Sil, J. (2010). Dimensionality Reduction and Optimum Feature Selection in Designing Efficient Classifiers. In: Panigrahi, B.K., Das, S., Suganthan, P.N., Dash, S.S. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2010. Lecture Notes in Computer Science, vol 6466. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17563-3_67

DOI: https://doi.org/10.1007/978-3-642-17563-3_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17562-6
Online ISBN: 978-3-642-17563-3
eBook Packages: Computer Science, Computer Science (R0)
