{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,6,25]],"date-time":"2024-06-25T17:40:09Z","timestamp":1719337209107},"reference-count":62,"publisher":"Walter de Gruyter GmbH","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,4,24]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The naive Bayes classifier is a popular classifier, as it is easy to train, requires no cross-validation for parameter tuning, and can be easily extended due to its generative model. Moreover, it was recently shown that the word probabilities (background distribution) estimated from large unlabeled corpora can be used to improve the parameter estimation of naive Bayes. However, previous methods do not allow explicit control over how much the background distribution influences the estimation of naive Bayes parameters. In contrast, we investigate an extension of the graphical model of naive Bayes such that a word is generated either from a background distribution or from a class-specific word distribution. We theoretically analyze this model and show the connection to Jelinek-Mercer smoothing. 
Experiments using four standard text classification data sets show that the proposed method can statistically significantly outperform previous methods that use the same background distribution.<\/jats:p>","DOI":"10.1515\/jisys-2017-0016","type":"journal-article","created":{"date-parts":[[2017,7,20]],"date-time":"2017-07-20T10:01:12Z","timestamp":1500544872000},"page":"259-273","source":"Crossref","is-referenced-by-count":2,"title":["Analysis of the Use of Background Distribution for Naive Bayes Classifiers"],"prefix":"10.1515","volume":"28","author":[{"given":"Daniel","family":"Andrade","sequence":"first","affiliation":[]},{"given":"Akihiro","family":"Tamura","sequence":"additional","affiliation":[]},{"given":"Masaaki","family":"Tsuchida","sequence":"additional","affiliation":[]}],"member":"374","reference":[{"key":"ref81","article-title":"A weakly supervised Bayesian model for violence detection in social media","year":"2013","journal-title":"International Joint Conference on Natural Language Processing (IJCNLP)"},{"key":"ref491","first-page":"640","article-title":"Generating templates of entity summaries with an entity-aspect model and pattern mining","year":"2010","journal-title":"Association for Computational Linguistics"},{"key":"ref191","first-page":"343","article-title":"Scaling semi-supervised naive Bayes with feature marginals","year":"2013","journal-title":"Association for Computational Linguistics"},{"key":"ref331","first-page":"101","year":"1999","journal-title":"Comparing Bayesian network classifiers"},{"key":"ref151","first-page":"160","year":"2016","journal-title":"Deep feature weighting in naive Bayes for Chinese text classification"},{"key":"ref511","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1023\/A:1007692713085","article-title":"Text classification from labeled and unlabeled documents using EM","volume":"39","year":"2000","journal-title":"Mach. 
Learn."},{"key":"ref601","doi-asserted-by":"crossref","first-page":"1650003","DOI":"10.1142\/S0218001416500038","article-title":"A new feature selection approach to naive Bayes text classifiers","volume":"30","year":"2016","journal-title":"Int. J. Pattern Recogn. Artif. Intell."},{"key":"ref01","first-page":"993","article-title":"Latent Dirichlet allocation","volume":"3","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref21","first-page":"101","year":"1999","journal-title":"Comparing Bayesian network classifiers"},{"key":"ref301","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/j.knosys.2016.02.017","article-title":"Two feature weighting approaches for naive Bayes text classifiers","volume":"100","year":"2016","journal-title":"Knowl. Based Syst."},{"key":"ref211","article-title":"A two-dimensional topic-aspect model for discovering multi-faceted topics","volume":"51","year":"2010","journal-title":"AAAI"},{"key":"ref31","first-page":"2493","article-title":"Natural language processing (almost) from scratch","volume":"12","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref561","first-page":"555","volume-title":"A CFS-Based Feature Weighting Approach to Naive Bayes Text Classifiers","year":"2014"},{"key":"ref231","first-page":"97","article-title":"Large scale text classification using semi-supervised multinomial naive Bayes","year":"2011","journal-title":"Proceedings of the 28th International Conference on Machine Learning (ICML-11)"},{"key":"ref411","doi-asserted-by":"crossref","first-page":"571","DOI":"10.1080\/03610928008827904","article-title":"Approximations of the critical region of the Friedman statistic","volume":"9","year":"1980","journal-title":"Commun. Stat. 
Theory Methods"},{"key":"ref271","first-page":"42","article-title":"A re-examination of text categorization methods","volume-title":"ACM SIGIR Conference on Research and Development in Information Retrieval","year":"1999"},{"key":"ref141","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.ins.2015.09.037","article-title":"Structure extended multinomial naive Bayes","volume":"329","year":"2016","journal-title":"Inf. Sci."},{"key":"ref521","article-title":"A two-dimensional topic-aspect model for discovering multi-faceted topics","volume":"51","year":"2010","journal-title":"AAAI"},{"key":"ref121","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1080\/0952813X.2012.721010","article-title":"naive Bayes text classifiers: a locally weighted learning approach","volume":"25","year":"2013","journal-title":"J. Exp. Theor. Artif. Intell."},{"key":"ref261","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1007\/s10115-014-0746-y","article-title":"Adapting naive Bayes tree for text classification","volume":"44","year":"2015","journal-title":"Knowl. Inf. Syst."},{"key":"ref451","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.ins.2015.09.037","article-title":"Structure extended multinomial naive Bayes","volume":"329","year":"2016","journal-title":"Inf. Sci."},{"key":"ref221","first-page":"616","article-title":"Tackling the poor assumptions of naive Bayes text classifiers","volume":"3","year":"2003","journal-title":"Proceedings of the International Conference on Machine Learning"},{"key":"ref481","first-page":"361","article-title":"Rcv1: a new benchmark collection for text categorization research","volume":"5","year":"2004","journal-title":"J. Mach. Learn. 
Res."},{"key":"ref541","first-page":"97","article-title":"Large scale text classification using semi-supervised multinomial naive Bayes","year":"2011","journal-title":"Proceedings of the 28th International Conference on Machine Learning (ICML-11)"},{"key":"ref341","first-page":"2493","article-title":"Natural language processing (almost) from scratch","volume":"12","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref181","first-page":"640","article-title":"Generating templates of entity summaries with an entity-aspect model and pattern mining","year":"2010","journal-title":"Association for Computational Linguistics"},{"key":"ref61","first-page":"1","article-title":"Statistical comparisons of classifiers over multiple data sets","volume":"7","year":"2006","journal-title":"J. Mach. Learn. Res."},{"key":"ref161","year":"1998","journal-title":"Text categorization with support vector machines: learning with many relevant features"},{"key":"ref431","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1080\/0952813X.2012.721010","article-title":"naive Bayes text classifiers: a locally weighted learning approach","volume":"25","year":"2013","journal-title":"J. Exp. Theor. Artif. Intell."},{"key":"ref381","first-page":"1041","article-title":"Sparse additive generative models of text","year":"2011","journal-title":"Proceedings of the 28th International Conference on Machine Learning (ICML-11)"},{"key":"ref241","first-page":"1973","article-title":"Rethinking LDA: Why priors matter","volume":"22","year":"2009","journal-title":"NIPS"},{"key":"ref371","first-page":"1","article-title":"Statistical comparisons of classifiers over multiple data sets","volume":"7","year":"2006","journal-title":"J. Mach. Learn. Res."},{"key":"ref441","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.engappai.2016.02.002","article-title":"Deep feature weighting for naive Bayes and its application to text classification","volume":"52","year":"2016","journal-title":"Eng. 
Appl. Artif. Intell."},{"key":"ref591","first-page":"334","article-title":"A study of smoothing methods for language models applied to ad hoc information retrieval","year":"2001","journal-title":"Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval"},{"key":"ref41","first-page":"305","article-title":"Bayesian query-focused summarization","year":"2006","journal-title":"Association for Computational Linguistics"},{"key":"ref91","first-page":"192","volume-title":"SIGIR","year":"1994"},{"key":"ref131","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.engappai.2016.02.002","article-title":"Deep feature weighting for naive Bayes and its application to text classification","volume":"52","year":"2016","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref501","first-page":"343","article-title":"Scaling semi-supervised naive Bayes with feature marginals","year":"2013","journal-title":"Association for Computational Linguistics"},{"key":"ref551","first-page":"1973","article-title":"Rethinking LDA: Why priors matter","volume":"22","year":"2009","journal-title":"NIPS"},{"key":"ref311","first-page":"993","article-title":"Latent Dirichlet allocation","volume":"3","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref571","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1007\/s10115-014-0746-y","article-title":"Adapting naive Bayes tree for text classification","volume":"44","year":"2015","journal-title":"Knowl. Inf. Syst."},{"key":"ref251","first-page":"555","volume-title":"A CFS-Based Feature Weighting Approach to Naive Bayes Text Classifiers","year":"2014"},{"key":"ref611","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/j.knosys.2016.02.017","article-title":"Two feature weighting approaches for naive Bayes text classifiers","volume":"100","year":"2016","journal-title":"Knowl. 
Based Syst."},{"key":"ref321","first-page":"241","article-title":"Modeling general and specific aspects of documents with a probabilistic topic model","volume":"19","year":"2006","journal-title":"NIPS"},{"key":"ref471","year":"1998","journal-title":"Text categorization with support vector machines: learning with many relevant features"},{"key":"ref351","first-page":"305","article-title":"Bayesian query-focused summarization","year":"2006","journal-title":"Association for Computational Linguistics"},{"key":"ref461","first-page":"160","year":"2016","journal-title":"Deep feature weighting in naive Bayes for Chinese text classification"},{"key":"ref291","doi-asserted-by":"crossref","first-page":"1650003","DOI":"10.1142\/S0218001416500038","article-title":"A new feature selection approach to naive Bayes text classifiers","volume":"30","year":"2016","journal-title":"Int. J. Pattern Recogn. Artif. Intell."},{"key":"ref581","first-page":"42","article-title":"A re-examination of text categorization methods","volume-title":"ACM SIGIR Conference on Research and Development in Information Retrieval","year":"1999"},{"key":"ref171","first-page":"361","article-title":"Rcv1: a new benchmark collection for text categorization research","volume":"5","year":"2004","journal-title":"J. Mach. Learn. Res."},{"key":"ref361","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm","volume":"39","year":"1977","journal-title":"J. R. Stat. Soc."},{"key":"ref201","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1023\/A:1007692713085","article-title":"Text classification from labeled and unlabeled documents using EM","volume":"39","year":"2000","journal-title":"Mach. 
Learn."},{"key":"ref71","first-page":"1041","article-title":"Sparse additive generative models of text","year":"2011","journal-title":"Proceedings of the 28th International Conference on Machine Learning (ICML-11)"},{"key":"ref51","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm","volume":"39","year":"1977","journal-title":"J. R. Stat. Soc."},{"key":"ref11","first-page":"241","article-title":"Modeling general and specific aspects of documents with a probabilistic topic model","volume":"19","year":"2006","journal-title":"NIPS"},{"key":"ref111","doi-asserted-by":"crossref","first-page":"1250007","DOI":"10.1142\/S0218213011004770","article-title":"Discriminatively weighted naive Bayes and its application in text classification","volume":"21","year":"2012","journal-title":"Int. J. Artif. Intell. Tools"},{"key":"ref281","first-page":"334","article-title":"A study of smoothing methods for language models applied to ad hoc information retrieval","year":"2001","journal-title":"Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval"},{"key":"ref101","doi-asserted-by":"crossref","first-page":"571","DOI":"10.1080\/03610928008827904","article-title":"Approximations of the critical region of the Friedman statistic","volume":"9","year":"1980","journal-title":"Commun. Stat. Theory Methods"},{"key":"ref421","doi-asserted-by":"crossref","first-page":"1250007","DOI":"10.1142\/S0218213011004770","article-title":"Discriminatively weighted naive Bayes and its application in text classification","volume":"21","year":"2012","journal-title":"Int. J. Artif. Intell. 
Tools"},{"key":"ref531","first-page":"616","article-title":"Tackling the poor assumptions of naive Bayes text classifiers","volume":"3","year":"2003","journal-title":"Proceedings of the International Conference on Machine Learning"},{"key":"ref391","article-title":"A weakly supervised Bayesian model for violence detection in social media","year":"2013","journal-title":"International Joint Conference on Natural Language Processing (IJCNLP)"},{"key":"ref401","first-page":"192","volume-title":"SIGIR","year":"1994"}],"container-title":["Journal of Intelligent Systems"],"original-title":[],"link":[{"URL":"http:\/\/www.degruyter.com\/view\/j\/jisys.2019.28.issue-2\/jisys-2017-0016\/jisys-2017-0016.xml","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/jisys-2017-0016\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,25]],"date-time":"2024-06-25T17:22:27Z","timestamp":1719336147000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/jisys-2017-0016\/html"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,4,24]]},"references-count":62,"journal-issue":{"issue":"2"},"URL":"https:\/\/doi.org\/10.1515\/jisys-2017-0016","relation":{},"ISSN":["2191-026X","0334-1860"],"issn-type":[{"value":"2191-026X","type":"electronic"},{"value":"0334-1860","type":"print"}],"subject":[],"published":{"date-parts":[[2019,4,24]]}}}