
Using differential evolution for an attribute-weighted inverted specific-class distance measure for nominal attributes

Published in Data Mining and Knowledge Discovery

Abstract

Distance metrics are central to many machine learning algorithms, and improving their measurement quality can substantially improve the classification results of these algorithms. The inverted specific-class distance measure (ISCDM) handles nominal rather than numeric attributes effectively, especially when a training set contains missing values and non-class attribute noise. However, like many other distance metrics, it still rests on the attribute independence assumption, which is clearly unrealistic for many real-world datasets. In this study, we improve the ISCDM with an attribute weighting scheme that addresses its attribute independence assumption. We use a differential evolution (DE) algorithm to learn better attribute weights for the improved measure, which we denote DE-AWISCDM. We experimentally evaluate DE-AWISCDM on 29 UCI datasets and find that it significantly outperforms the original ISCDM and other state-of-the-art methods with respect to negative conditional log likelihood and root relative squared error.
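
To make the weighting idea concrete, the following is a minimal, illustrative Python sketch of learning one weight per nominal attribute with differential evolution. It uses scipy's differential_evolution, a simple VDM-style class-conditional distance as a stand-in for the per-attribute term, and a leave-one-out k-NN negative log likelihood as the fitness; the paper's actual ISCDM formulation, fitness function, and DE settings are not reproduced here, and all function names below are hypothetical.

```python
# Illustrative sketch only: DE-tuned attribute weights for a nominal distance
# measure. The per-attribute distance is a VDM-style stand-in, not the paper's
# ISCDM formula, and the leave-one-out k-NN fitness is a simplifying assumption.
import numpy as np
from collections import defaultdict
from scipy.optimize import differential_evolution


def fit_cond_probs(X, y):
    """Estimate P(c | a_j = v) for each attribute j and value v (Laplace-smoothed)."""
    classes = np.unique(y)
    probs = []
    for j in range(X.shape[1]):
        counts = defaultdict(lambda: np.ones(len(classes)))  # Laplace prior
        for v, c in zip(X[:, j], y):
            counts[v][np.searchsorted(classes, c)] += 1
        probs.append({v: cnt / cnt.sum() for v, cnt in counts.items()})
    return classes, probs


def weighted_distance(x1, x2, w, probs, n_classes):
    """sum_j w_j * d_j(x1_j, x2_j), where d_j compares class-conditional probabilities."""
    uniform = np.full(n_classes, 1.0 / n_classes)  # fallback for unseen values
    return sum(w[j] * np.abs(probs[j].get(a, uniform) - probs[j].get(b, uniform)).sum()
               for j, (a, b) in enumerate(zip(x1, x2)))


def fitness(w, X, y, classes, probs, k=5):
    """DE objective: leave-one-out k-NN negative log likelihood under weights w."""
    nll = 0.0
    for i in range(len(X)):
        idx = np.array([j for j in range(len(X)) if j != i])
        d = np.array([weighted_distance(X[i], X[j], w, probs, len(classes)) for j in idx])
        nn_labels = y[idx][np.argsort(d)[:k]]
        p = (np.sum(nn_labels == y[i]) + 1.0) / (k + len(classes))  # smoothed P(true class)
        nll -= np.log(p)
    return nll


def learn_attribute_weights(X, y, k=5, seed=0):
    """Run DE over one weight per attribute, each constrained to [0, 1]."""
    classes, probs = fit_cond_probs(X, y)
    bounds = [(0.0, 1.0)] * X.shape[1]
    result = differential_evolution(fitness, bounds, args=(X, y, classes, probs, k),
                                    maxiter=50, popsize=15, seed=seed)
    return result.x
```

Given a 2D object array X of nominal attribute values and a 1D array y of class labels, learn_attribute_weights(X, y) returns one weight per attribute, which can then be plugged into weighted_distance for nearest-neighbor classification. For the actual ISCDM-based formulation, DE parameter settings, and the evaluation on 29 UCI datasets, see the full text.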

Notes

  1. https://www.gepsoft.com/gxpt4kb/Chapter10/Section1/SS07.html

  2. http://sci2s.ugr.es/keel/

  3. http://www.r-tutor.com/elementary-statistics/

  4. http://cn.mathworks.com/help/stats/signrank.html

Author information

Corresponding author

Correspondence to Xingfeng Guo.

Additional information

Communicated by Dr. Srinivasan Parthasarathy.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gong, F., Guo, X. & Wang, D. Using differential evolution for an attribute-weighted inverted specific-class distance measure for nominal attributes. Data Min Knowl Disc 37, 409–433 (2023). https://doi.org/10.1007/s10618-022-00881-w

