Abstract
Real-world problems usually present a huge volume of imprecise data. These types of problems may challenge case-based reasoning systems because the knowledge extracted from data is used to identify analogies and solve new problems. Many authors have focused on organizing case memory in patterns to minimize the computational burden and deal with uncertainty. The organization is usually determined by a single criterion, but in some problems, a single criterion can be insufficient to find accurate clusters. This work describes an approach to organize the case memory in patterns based on multiple criteria. This new approach uses the searching capabilities of multiobjective evolutionary algorithms to build a Pareto set of solutions, where each one is a possible organization based on the relevance of objectives. The system shows promising capabilities when it is compared with a successful system based on self-organizing maps. Because the geometry of a data set influences the cluster-building process, the results are analyzed with this influence taken into account. To this end, some complexity measures are used to categorize data sets according to their topology.
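To make the notion of a Pareto set concrete, the following is a minimal sketch of non-dominated filtering over candidate case-memory organizations. It is not the system described above: the two objectives (a deviation score and a connectivity score, both minimized), the score values, and the names `dominates` and `pareto_set` are assumptions introduced purely for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b` (all objectives minimized).

    `a` must be no worse than `b` on every objective and strictly
    better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_set(scores: List[Tuple[float, ...]]) -> List[int]:
    """Return the indices of the score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (deviation, connectivity) scores for five candidate
# organizations of the case memory; the values are made up.
candidates = [(0.20, 9.0), (0.35, 4.0), (0.50, 2.0), (0.40, 6.0), (0.20, 9.5)]
front = pareto_set(candidates)
```

Under these assumed scores, each organization kept in `front` represents a different trade-off between the two criteria, and choosing one of them corresponds to deciding how relevant each objective is. A multiobjective evolutionary algorithm would apply this kind of dominance test repeatedly while searching, rather than once over a fixed list.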
Cite this article
Garcia-Piquer, A., Fornells, A., Orriols-Puig, A. et al. Data classification through an evolutionary approach based on multiple criteria. Knowl Inf Syst 33, 35–56 (2012). https://doi.org/10.1007/s10115-011-0462-9
DOI: https://doi.org/10.1007/s10115-011-0462-9