Abstract
This paper provides new insight into maximizing F1 measures in the context of binary classification and also in the context of multilabel classification. The F1 measure, the harmonic mean of precision and recall, is widely used to evaluate the success of a binary classifier when one class is rare. Micro-averaged, macro-averaged, and per-instance-averaged F1 measures are used in multilabel classification. For any classifier that produces a real-valued output, we derive the relationship between the best achievable F1 value and the decision-making threshold that achieves this optimum. As a special case, if the classifier outputs are well-calibrated conditional probabilities, then the optimal threshold is half the optimal F1 value. As another special case, if the classifier is completely uninformative, then the optimal behavior is to classify all examples as positive. When the actual prevalence of positive examples is low, this behavior can be undesirable. As a case study, we discuss the results, which can be surprising, of maximizing F1 when predicting 26,853 labels for Medline documents.
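The calibrated-probability special case is easy to check numerically. The following sketch is illustrative and not taken from the paper: it simulates scores that are calibrated by construction (labels drawn with probability equal to the score), sweeps decision thresholds, and prints the best F1 alongside half that value, which should approximately equal the maximizing threshold. All variable names and the simulated data are assumptions made for the example.

```python
import numpy as np

def f1_at_threshold(scores, labels, t):
    """F1 of the rule 'predict positive when score >= t'."""
    pred = scores >= t
    tp = np.sum(pred & (labels == 1))
    fp = np.sum(pred & (labels == 0))
    fn = np.sum(~pred & (labels == 1))
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)

rng = np.random.default_rng(0)
scores = rng.uniform(size=100_000)
# Labels drawn with probability equal to the score, so the scores are calibrated.
labels = (rng.uniform(size=scores.size) < scores).astype(int)

thresholds = np.linspace(0.01, 0.99, 99)
f1s = np.array([f1_at_threshold(scores, labels, t) for t in thresholds])
best = f1s.argmax()
print(f"best F1 = {f1s[best]:.3f} at threshold {thresholds[best]:.2f}; "
      f"half of best F1 = {f1s[best] / 2:.2f}")
```

Under this simulation the maximizing threshold comes out near 0.38, matching half of the best achievable F1 (about 0.76), consistent with the paper's stated special case.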
Cite this paper
Lipton, Z.C., Elkan, C., Naryanaswamy, B. (2014). Optimal Thresholding of Classifiers to Maximize F1 Measure. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2014. Lecture Notes in Computer Science, vol. 8725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44851-9_15