Abstract
The comparison of generative and discriminative classifiers is a long-standing topic. As an important contribution to it, Ng and Jordan (NIPS 841–848, 2001) claimed, on the basis of theoretical and empirical comparisons between the naïve Bayes classifier and linear logistic regression, that there exist two distinct regimes of performance between generative and discriminative classifiers with regard to the training-set size. In this paper, our empirical and simulation studies, which complement their work, suggest that the existence of the two distinct regimes may not be so reliable. In addition, for real-world datasets there is so far no theoretically correct, general criterion for choosing between the discriminative and the generative approaches to classifying an observation x into a class y; the choice depends on the relative confidence we have in the correctness of the specification of either p(y|x) or p(x, y) for the data. This may to some extent explain why Efron (J Am Stat Assoc 70(352):892–898, 1975) and O’Neill (J Am Stat Assoc 75(369):154–160, 1980) prefer normal-based linear discriminant analysis (LDA) when no model mis-specification occurs, whereas other empirical studies may prefer linear logistic regression instead. Furthermore, we suggest that the pairing of either LDA assuming a common diagonal covariance matrix (LDA-Λ) or the naïve Bayes classifier with linear logistic regression may not be perfect, and hence any claim derived from comparing LDA-Λ or the naïve Bayes classifier with linear logistic regression may not reliably generalise to all generative and discriminative classifiers.
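As a brief illustration of why LDA-Λ and linear logistic regression are treated as a generative-discriminative pair, the following sketch writes out the two-class case. The notation (class priors \pi_k, class means \mu_{kj}, common per-feature variances \sigma_j^2, coefficients \beta_0 and \beta) is assumed here for illustration and is not taken from the abstract.

```latex
% Generative approach: specify the joint distribution and classify via Bayes' rule.
p(x, y) = p(x \mid y)\, p(y), \qquad
\hat{y} = \arg\max_{y}\; p(x \mid y)\, p(y).

% LDA-\Lambda (assumed two-class form): x \mid y = k \sim N(\mu_k, \Lambda),
% with \Lambda = \mathrm{diag}(\sigma_1^2, \dots, \sigma_p^2) and p(y = k) = \pi_k.
\log \frac{p(y = 1 \mid x)}{p(y = 0 \mid x)}
  = \log \frac{\pi_1}{\pi_0}
  - \sum_{j=1}^{p} \frac{\mu_{1j}^2 - \mu_{0j}^2}{2 \sigma_j^2}
  + \sum_{j=1}^{p} \frac{\mu_{1j} - \mu_{0j}}{\sigma_j^2}\, x_j
  = \beta_0 + \beta^{\top} x.

% Discriminative approach: linear logistic regression specifies only p(y \mid x),
% with the same linear form but with (\beta_0, \beta) estimated directly.
\log \frac{p(y = 1 \mid x)}{p(y = 0 \mid x)} = \beta_0 + \beta^{\top} x.
```

Under these assumptions both models give log posterior odds that are linear in x. They differ in how the coefficients are estimated: from the fitted joint model p(x, y) in the first case, and from the conditional likelihood of p(y|x) in the second.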
Abbreviations
- LDA/QDA: Normal-based linear/quadratic discriminant analysis
- AIC: Akaike information criterion
- GAM: Generalised additive model
References
Dawid AP (1976) Properties of diagnostic data distributions. Biometrics 32(3): 647–658
Efron B (1975) The efficiency of logistic regression compared to normal discriminant analysis. J Am Stat Assoc 70(352): 892–898
Hand DJ (2006) Classifier technology and illusion of progress (with discussion). Stat Sci 21: 1–34
Lim T-S, Loh W-Y (1996) A comparison of tests of equality of variances. Comput Stat Data Anal 22(3): 287–301
Newman DJ, Hettich S, Blake CL, Merz CJ (1998) UCI Repository of machine learning databases. University of California, Irvine, Department of Information and Computer Sciences, http://www.ics.uci.edu/~mlearn/MLRepository.html
Ng AY, Jordan MI (2001) On discriminative vs. generative classifiers: a comparison of logistic regression and naive Bayes. In: Dietterich TG, Becker S, Ghahramani Z (eds) NIPS. MIT Press, MA, pp 841–848
O’Neill TJ (1980) The general distribution of the error rate of a classification procedure with application to logistic regression discrimination. J Am Stat Assoc 75(369): 154–160
Perlich C, Provost F, Simonoff JS (2003) Tree induction vs. logistic regression: a learning-curve analysis. J Mach Learn Res 4: 211–255
Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, New York
Rubinstein YD, Hastie T (1997) Discriminative vs. informative learning. In: Heckerman D, Mannila H, Pregibon D, Uthurusamy R (eds) KDD. AAAI Press, CA, pp 49–53
Shapiro SS, Wilk MB (1965) An analysis of variance test for normality (complete samples). Biometrika 52(3-4): 591–611
Titterington DM, Murray GD, Murray LS, Spiegelhalter DJ, Skene AM, Habbema JDF, Gelpke GJ (1981) Comparison of discrimination techniques applied to a complex data set of head injured patients (with discussion). J R Stat Soc [Ser A] 144(2): 145–175
Verboven S, Hubert M (2005) LIBRA: a MATLAB library for robust analysis. Chemometrics Intell Lab Syst 75(2): 127–136
Cite this article
Xue, JH., Titterington, D.M. Comment on “On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes”. Neural Process Lett 28, 169–187 (2008). https://doi.org/10.1007/s11063-008-9088-7
DOI: https://doi.org/10.1007/s11063-008-9088-7