Dimensionality Reduction and Optimum Feature Selection in Designing Efficient Classifiers | SpringerLink

Dimensionality Reduction and Optimum Feature Selection in Designing Efficient Classifiers

  • Conference paper
Swarm, Evolutionary, and Memetic Computing (SEMCCO 2010)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 6466))


Abstract

In day-to-day operation, data sets grow continuously, accumulating large numbers of features while remaining incomplete and of relatively low information density. Dimensionality reduction and feature selection are therefore core issues in handling such data sets and, more specifically, in discovering relationships within them. Dimensionality reduction by reduct generation is an important aspect of classification: a reduct is a reduced attribute set with the same classification power as the entire attribute set of an information system. In this paper, multiple reducts are generated by integrating concepts from rough set theory (RST) with relational algebra operations. Next, those attributes of the reducts that are better associated and have stronger classification power are selected, using the classical Apriori algorithm, to form a single reduct. Different classifiers are built on this single reduct, and their accuracies are compared to measure the effectiveness of the proposed method.
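To make the notion of a reduct concrete, the sketch below computes, by brute force, the minimal attribute subsets that preserve the RST positive region of a decision table. This is a minimal illustration of the definition only, not the paper's method (which generates reducts via relational algebra operations and merges them with Apriori); the function names and the toy table are hypothetical.

```python
from itertools import combinations

def partition(rows, attrs):
    """Blocks of the indiscernibility relation: group row indices
    by their values on the given attributes."""
    groups = {}
    for i, row in enumerate(rows):
        groups.setdefault(tuple(row[a] for a in attrs), set()).add(i)
    return groups.values()

def positive_region(rows, attrs, decision):
    """Indices of rows classified unambiguously by `attrs`:
    blocks whose members all share one decision value."""
    pos = set()
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            pos |= block
    return pos

def reducts(rows, cond_attrs, decision):
    """All minimal subsets of `cond_attrs` whose positive region equals
    that of the full attribute set (exponential; illustration only)."""
    full = positive_region(rows, cond_attrs, decision)
    found = []
    for r in range(1, len(cond_attrs) + 1):
        for subset in combinations(cond_attrs, r):
            # skip supersets of an already-found reduct (not minimal)
            if any(set(f) <= set(subset) for f in found):
                continue
            if positive_region(rows, subset, decision) == full:
                found.append(subset)
    return found
```

The exhaustive search over subsets is exponential in the number of attributes, which is precisely why practical reduct-generation schemes such as the one proposed in the paper avoid it.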




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Das, A.K., Sil, J. (2010). Dimensionality Reduction and Optimum Feature Selection in Designing Efficient Classifiers. In: Panigrahi, B.K., Das, S., Suganthan, P.N., Dash, S.S. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2010. Lecture Notes in Computer Science, vol 6466. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17563-3_67

  • DOI: https://doi.org/10.1007/978-3-642-17563-3_67

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17562-6

  • Online ISBN: 978-3-642-17563-3

  • eBook Packages: Computer Science, Computer Science (R0)
