Multi-kernel learning for multivariate performance measures optimization

  • Original Article
Neural Computing and Applications

Abstract

In this paper, we investigate the problem of learning classifiers that optimize complex multivariate performance measures in pattern classification problems. For the first time, multi-kernel learning is used to construct a classifier that optimizes a given nonlinear and non-smooth multivariate performance measure. Instead of optimizing the measure directly, we estimate and minimize an upper bound on it. Moreover, to address kernel function selection and kernel parameter tuning, we propose constructing an optimal kernel as a weighted linear combination of candidate kernels. The classifier parameter and the kernel weights are learned jointly through a single objective function that minimizes the upper bound of the given multivariate performance measure. The objective function is optimized with respect to the classifier parameter and the kernel weights alternately in an iterative algorithm. The developed algorithm is evaluated on two different pattern classification methods with regard to various multivariate performance measure optimization problems. The experimental results show that the proposed algorithm outperforms the competing methods.
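
The two ingredients named in the abstract, a kernel built as a weighted linear combination of candidate kernels and a multivariate (non-decomposable) performance measure, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the candidate kernels (linear and Gaussian), the choice of F1 as the example measure, the rescaling of weights to sum to one, and all function names are assumptions made for illustration. The upper bound and the alternating optimization are not reproduced here, since the abstract does not specify them.

```python
import numpy as np


def linear_kernel(X, Y):
    """Linear kernel matrix between the rows of X and the rows of Y."""
    return X @ Y.T


def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * (X @ Y.T)
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def combine_kernels(kernel_matrices, weights):
    """Weighted linear combination of candidate kernel matrices.

    Assumption (not stated in the abstract): weights are nonnegative and
    are rescaled to sum to one, which keeps the combination a valid kernel.
    """
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (len(kernel_matrices),):
        raise ValueError("need exactly one weight per candidate kernel")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be nonnegative and not all zero")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernel_matrices))


def f1_score(y_true, y_pred):
    """F1 over a whole label vector with labels in {-1, +1}.

    F1 depends on the counts of true positives, false positives and false
    negatives over the full sample, so it cannot be written as a sum of
    per-example losses. That is what makes it a multivariate measure.
    """
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == -1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == -1))
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2.0 * tp / denom
```

In an alternating scheme of the kind the abstract describes, a routine like `combine_kernels` would be re-evaluated each time the kernel weights are updated, with the classifier parameter then re-fitted on the resulting combined kernel.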

Author information

Corresponding author

Correspondence to Jianbing Xiahou.

About this article

Cite this article

Lin, F., Wang, J., Zhang, N. et al. Multi-kernel learning for multivariate performance measures optimization. Neural Comput & Applic 28, 2075–2087 (2017). https://doi.org/10.1007/s00521-015-2164-9

  • DOI: https://doi.org/10.1007/s00521-015-2164-9
