Feature weighting to tackle label dependencies in multi-label stacking nearest neighbor | Applied Intelligence

Abstract

In multi-label learning, each instance is associated with a subset of predefined labels. One common stacking-based approach to multi-label classification, proposed by Godbole and Sarawagi (2004), is Meta Binary Relevance (MBR). It uses two layers of binary models and feeds the outputs of the first layer to every binary model of the second layer. The class labels predicted in the first layer are thus appended to the original features to produce a new prediction of the classes in the second layer. As a result, when the second layer predicts a specific label, the labels that are irrelevant to it also enter as noisy features. Because Nearest Neighbor (NN) is sensitive to noisy features, it has not so far been a suitable base classifier for stacking, and its merits, including simplicity, interpretability, global stability to noisy labels, and good performance, are lost. As the first contribution, a popular feature weighting method for NN classification is used here to address the problem of uncorrelated labels. It tunes a parametric distance function by gradient descent to minimize the classification error on the training data. However, other objectives, including the F-measure, are known to be more suitable than classification error when learning from imbalanced data. The second contribution of this paper is an extension of this method that improves the F-measure. In our experimental study, the proposed method is compared with state-of-the-art multi-label classifiers from the literature and outperforms them.
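The effect described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a 1-NN second-layer classifier with a per-feature weighted squared Euclidean distance, and the weights are set by hand here rather than learned by gradient descent. The names `stack_features` and `weighted_nn_predict` and the toy data are hypothetical. The sketch shows how a first-layer prediction for an unrelated label can pull the nearest neighbor the wrong way, and how down-weighting that stacked feature restores the neighbor found in the original feature space.

```python
import numpy as np

def stack_features(X, Y_hat):
    """Append first-layer label predictions to the original features (stacking)."""
    return np.hstack([X, Y_hat])

def weighted_nn_predict(Z_train, y_train, Z_query, w):
    """1-NN under the weighted squared distance sum_j (w_j * (a_j - b_j))**2."""
    # diff has shape (n_query, n_train, n_features)
    diff = Z_query[:, None, :] - Z_train[None, :, :]
    dist = np.sum((w * diff) ** 2, axis=2)
    return y_train[np.argmin(dist, axis=1)]

# Toy data (hypothetical): one original feature and one stacked label feature.
X_train = np.array([[0.0], [1.0]])
Y_hat_train = np.array([[1], [0]])  # first-layer output for a label unrelated to the target
y_train = np.array([0, 1])          # target label predicted by the second layer

X_query = np.array([[0.2]])
Y_hat_query = np.array([[0]])

Z_train = stack_features(X_train, Y_hat_train)
Z_query = stack_features(X_query, Y_hat_query)

# Hand-set weights (an assumption; the paper learns them from training data).
uniform = np.array([1.0, 1.0])        # the noisy label feature counts fully
down_weighted = np.array([1.0, 0.0])  # the noisy label feature is ignored

pred_uniform = weighted_nn_predict(Z_train, y_train, Z_query, uniform)
pred_weighted = weighted_nn_predict(Z_train, y_train, Z_query, down_weighted)
```

With uniform weights the query sits closer to the second training point (distance 0.64 versus 1.04) because of the stacked label feature alone. With that feature's weight set to zero, the first training point becomes the nearest neighbor (0.04 versus 0.64), matching the ordering in the original feature.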

Notes

  1. http://mulan.sourceforge.net/datasets-mlc.html

  2. http://computer.njnu.edu.cn/Lab/LABIC/LABIC_software.html

  3. http://www.uco.es/kdis/mllresources/

  4. The implementation is available in the Mulan library [44]

  5. The implementation is available in MLC-toolbox [48]

  6. The source code can be downloaded from the author home page: http://palm.seu.edu.cn/zhangml/

  7. The source code can be downloaded from https://paperswithcode.com/paper/joint-ranking-svm-and-binary-relevance-with#code

  8. The source code can be downloaded from https://github.com/keauneuh/Incorporating-Multiple-Cluster-Centers-for-Multi-Label-Learning

References

  1. Godbole S, Sarawagi S (2004) Discriminative methods for multi-labeled classification. In: Pacific-Asia conference on knowledge discovery and data mining. Springer, pp 22–30

  2. Zhang M-L, Zhou Z-H (2014) A review on multi-label learning algorithms. IEEE Trans Knowl Data Eng 26(8):1819–1837

  3. Alazaidah R, Ahmad FK (2016) Trending challenges in multi label classification. Int J Adv Comput Sci Appl 7(10):127–131

  4. Rathore V S, Worring M, Mishra D K, Joshi A, Maheshwari S (2018) Emerging trends in expert applications and security: Proceedings of iceteas 2018, vol 841. Springer, Berlin

  5. Tsoumakas G, Katakis I, Vlahavas I (2009) Mining multi-label data. In: Data mining and knowledge discovery handbook. Springer, pp 667–685

  6. Zhang M-L, Zhang K (2010) Multi-label learning by exploiting label dependency. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 999–1008

  7. De Comité F, Gilleron R, Tommasi M (2003) Learning multi-label alternating decision trees from texts and data. In: International workshop on machine learning and data mining in pattern recognition. Springer, pp 35–49

  8. Zhang M-L, Zhou Z-H (2007) ML-KNN: A lazy learning approach to multi-label learning. Pattern Recogn 40(7):2038–2048

  9. Schapire R E, Singer Y (2000) BoosTexter: A boosting-based system for text categorization. Mach Learn 39(2-3):135–168

  10. Fürnkranz J, Hüllermeier E, Mencía E L, Brinker K (2008) Multilabel classification via calibrated label ranking. Mach Learn 73(2):133–153

  11. Read J, Pfahringer B, Holmes G, Frank E (2011) Classifier chains for multi-label classification. Mach Learn 85(3):333

  12. Cheng Z, Zeng Z (2020) Joint label-specific features and label correlation for multi-label learning with missing label. Appl Intell 1–21

  13. Wu G, Tian Y, Liu D (2018) Cost-sensitive multi-label learning with positive and negative label pairwise correlations. Neural Netw 108:411–423

  14. Dembczyński K, Waegeman W, Cheng W, Hüllermeier E (2012) On label dependence and loss minimization in multi-label classification. Mach Learn 88(1-2):5–45

  15. Tsoumakas G, Dimou A, Spyromitros E, Mezaris V, Kompatsiaris I, Vlahavas I (2009) Correlation-based pruning of stacked binary relevance models for multi-label learning. In: Proceedings of the 1st International Workshop on Learning from Multi-label Data, pp 101–116

  16. Chekina L, Gutfreund D, Kontorovich A, Rokach L, Shapira B (2013) Exploiting label dependencies for improved sample complexity. Mach Learn 91(1):1–42

  17. Alali A, Kubat M (2015) Prudent: A pruned and confident stacking approach for multi-label classification. IEEE Trans Knowl Data Eng 27(9):2480–2493

  18. Huang S-J, Zhou Z-H (2012) Multi-label learning by exploiting label correlations locally. In: Twenty-sixth AAAI conference on artificial intelligence

  19. Zhang J, Li C, Cao D, Lin Y, Su S, Dai L, Li S (2018) Multi-label learning with label-specific features by resolving label correlations. Knowl-Based Syst 159:148–157

  20. Charte F, Rivera AJ, del Jesus MJ, Herrera F (2015) Addressing imbalance in multilabel classification: Measures and random resampling algorithms. Neurocomputing 163:3–16

  21. Charte F, Rivera AJ, del Jesus MJ, Herrera F (2015) Mlsmote: Approaching imbalanced multilabel learning through synthetic instance generation. Knowl-Based Syst 89:385–397

  22. Ding M, Yang Y, Lan Z (2018) Multi-label imbalanced classification based on assessments of cost and value. Appl Intell 48(10):3577–3590

  23. Spyromitros-Xioufis E, Spiliopoulou M, Tsoumakas G, Vlahavas I (2011) Dealing with concept drift and class imbalance in multi-label stream classification. Department of Computer Science, Aristotle University of Thessaloniki

  24. Quevedo J R, Luaces O, Bahamonde A (2012) Multilabel classifiers with a probabilistic thresholding strategy. Pattern Recogn 45(2):876–883

  25. Pillai I, Fumera G, Roli F (2013) Threshold optimisation for multi-label classifiers. Pattern Recogn 46(7):2055–2065

  26. Petterson J, Caetano T S (2010) Reverse multi-label learning. In: Advances in neural information processing systems, pp 1912–1920

  27. Dembczynski K, Jachnik A, Kotlowski W, Waegeman W, Hüllermeier E (2013) Optimizing the f-measure in multi-label classification: Plug-in rule approach versus structured loss minimization. In: International conference on machine learning, pp 1130–1138

  28. Wu B, Lyu S, Ghanem B (2016) Constrained submodular minimization for missing labels and class imbalance in multi-label learning. In: AAAI, pp 2229–2236

  29. Paredes R, Vidal E (2006) Learning weighted metrics to minimize nearest-neighbor classification error. IEEE Trans Pattern Anal Mach Intell 28(7):1100–1110. https://doi.org/10.1109/TPAMI.2006.145

  30. Paredes R, Vidal E (2006) Learning prototypes and distances: A prototype reduction technique based on nearest neighbor error minimization. Pattern Recogn 39(2):180–188

  31. Jahromi MZ, Parvinnia E, John R (2009) A method of learning weighted similarity function to improve the performance of nearest neighbor. Inf Sci 179(17):2964–2973

  32. Rastin N, Jahromi MZ, Taheri M (2020) A generalized weighted distance k-nearest neighbor for multi-label problems. Pattern Recogn 107526

  33. Zhang Q-W, Zhong Y, Zhang M-L (2018) Feature-induced labeling information enrichment for multi-label learning. In: AAAI, pp 4446–4453

  34. Dembczyński K, Cheng W, Hüllermeier E (2010) Bayes optimal multilabel classification via probabilistic classifier chains. In: Proceedings of the 27th international conference on machine learning, pp 279–286

  35. Qi G-J, Hua X-S, Rui Y, Tang J, Mei T, Zhang H-J (2007) Correlative multi-label video annotation. In: Proceedings of the 15th ACM international conference on multimedia

  36. Pachet F, Roy P (2009) Improving multilabel analysis of music titles: A large-scale validation of the correction approach. IEEE Trans Audio Speech Lang Process 17(2):335–343. https://doi.org/10.1109/TASL.2008.2008734

  37. Montañes E, Senge R, Barranquero J, Ramón Quevedo J, José del Coz J, Hüllermeier E (2013) Dependent binary relevance models for multi-label classification. Pattern Recogn 47(3):1494–1508. https://doi.org/10.1016/j.patcog.2013.09.029

  38. Zhang M-L, Li Y-K, Liu X-Y, Geng X (2018) Binary relevance for multi-label learning: an overview. Front Comput Sci 12(2):191–202

  39. Chen Y-N, Weng W, Wu S-X, Chen B-H, Fan Y-L, Liu J-H (2020) An efficient stacking model with label selection for multi-label classification. Appl Intell 1–18

  40. Cheng W, Hüllermeier E (2009) Combining instance-based learning and logistic regression for multilabel classification. Mach Learn 76(2-3):211–225

  41. Rastin N, Jahromi MZ, Taheri M (2017) Multi-label classification systems by the use of supervised clustering. In: Artificial intelligence and signal processing conference (AISP), 2017. IEEE, pp 246–249

  42. Zhang M-L, Zhou Z-H (2013) A review on multi-label learning algorithms. IEEE Trans Knowl Data Eng 26(8):1819–1837

  43. Neave HR, Worthington PL (1988) Distribution-free tests. Unwin Hyman, London

  44. Tsoumakas G, Spyromitros-Xioufis E, Vilcek J, Vlahavas I (2011) Mulan: A java library for multi-label learning. J Mach Learn Res 12(Jul):2411–2414

  45. Younes Z, Abdallah F, Denœux T (2008) Multi-label classification algorithm derived from k-nearest neighbor rule with label dependencies. In: Signal Processing Conference, 2008 16th European. IEEE, pp 1–5

  46. Xu J (2011) Multi-label weighted k-nearest neighbor classifier with adaptive weight estimation. In: International conference on neural information processing. Springer, pp 79–88

  47. Spyromitros E, Tsoumakas G, Vlahavas I (2008) An empirical study of lazy multilabel classification algorithms. In: Hellenic conference on artificial intelligence. Springer, pp 401–406

  48. Kimura K, Sun L, Kudo M (2017) MLC toolbox: A MATLAB/Octave library for multi-label classification. arXiv:1704.02592

  49. Sun L, Kudo M, Kimura K (2016) Multi-label classification with meta-label-specific features. In: Pattern Recognition (ICPR), 2016 23rd International conference on. IEEE, pp 1612–1617

  50. Wu G, Zheng R, Tian Y, Liu D (2020) Joint ranking svm and binary relevance with robust low-rank learning for multi-label classification. Neural Netw 122:24–39

  51. Shu S, Lv F, Feng L, Huang J, He S, He J, Li L (2020) Incorporating multiple cluster centers for multi-label learning. arXiv:2004.08113

  52. Asuncion A, Newman D (2007) UCI machine learning repository

  53. Alcalá-Fdez J, Fernández A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Mult-Valued Logic Soft Comput 17

  54. Bi J, Zhang C (2018) An empirical comparison on state-of-the-art multi-class imbalance learning algorithms and a new diversified ensemble learning scheme. Knowl-Based Syst 158:81–93

  55. Liu X-Y, Li Q-Q, Zhou Z-H (2013) Learning imbalanced multi-class data with optimal dichotomy weights. In: 2013 IEEE 13th international conference on data mining. IEEE, pp 478–487

  56. Ghanem AS, Venkatesh S, West G (2010) Multi-class pattern classification in imbalanced data. In: 2010 20th international conference on pattern recognition. IEEE, pp 2881–2884

  57. Wang S, Chen H, Yao X (2010) Negative correlation learning for classification ensembles. In: The 2010 International joint conference on neural networks (IJCNN). IEEE, pp 1–8

  58. Hoens TR, Qian Q, Chawla NV, Zhou Z-H (2012) Building decision trees for the multi-class imbalance problem. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, pp 122–134

  59. Ramentol E, Vluymans S, Verbiest N, Caballero Y, Bello R, Cornelis C, Herrera F (2014) Ifrowann: imbalanced fuzzy-rough ordered weighted average nearest neighbor classification. IEEE Trans Fuzzy Syst 23(5):1622–1637

  60. Dietterich TG, Bakiri G (1991) Error-correcting output codes: A general method for improving multiclass inductive learning programs. In: AAAI. Citeseer, pp 572–577

Author information

Corresponding author

Correspondence to Niloofar Rastin.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Cite this article

Rastin, N., Jahromi, M.Z. & Taheri, M. Feature weighting to tackle label dependencies in multi-label stacking nearest neighbor. Appl Intell 51, 5200–5218 (2021). https://doi.org/10.1007/s10489-020-02073-9

  • DOI: https://doi.org/10.1007/s10489-020-02073-9
