{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,24]],"date-time":"2024-08-24T07:07:13Z","timestamp":1724483233823},"reference-count":53,"publisher":"Springer Science and Business Media LLC","issue":"12","license":[{"start":{"date-parts":[[2020,11,3]],"date-time":"2020-11-03T00:00:00Z","timestamp":1604361600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,11,3]],"date-time":"2020-11-03T00:00:00Z","timestamp":1604361600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100002157","name":"Istituto Nazionale di Alta Matematica \u201dFrancesco Severi\u201d","doi-asserted-by":"publisher","award":["-"],"id":[{"id":"10.13039\/100002157","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Med Biol Eng Comput"],"published-print":{"date-parts":[[2020,12]]},"abstract":"Agreement measures are useful tools to both compare different evaluations of the same diagnostic outcomes and validate new rating systems or devices. Cohen\u2019s kappa (\u03ba) certainly is the most popular agreement method between two raters, and proved its effectiveness in the last sixty years. In spite of that, this method suffers from some alleged issues, which have been highlighted since the 1970s; moreover, its value is strongly dependent on the prevalence of the disease in the considered sample. This work introduces a new agreement index, the informational agreement (IA), which seems to avoid some of Cohen\u2019s kappa\u2019s flaws, and separates the contribution of the prevalence from the nucleus of agreement. 
These goals are achieved by modelling the agreement\u2014in both dichotomous and multivalue ordered-categorical cases\u2014as the information shared between two raters through the virtual diagnostic channel connecting them: the more information exchanged between the raters, the higher their agreement. In order to test its fair behaviour and the effectiveness of the method, IA has been tested on some cases known to be problematic for \u03ba, in the machine learning context and in a clinical scenario to compare ultrasound (US) and automated breast volume scanner (ABVS) in the setting of breast cancer imaging.","DOI":"10.1007\/s11517-020-02261-2","type":"journal-article","created":{"date-parts":[[2020,11,3]],"date-time":"2020-11-03T22:02:41Z","timestamp":1604440961000},"page":"3089-3099","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Beyond kappa: an informational index for diagnostic agreement in dichotomous and multivalue ordered-categorical ratings"],"prefix":"10.1007","volume":"58","author":[{"given":"Alberto","family":"Casagrande","sequence":"first","affiliation":[]},{"ORCID":"http:\/\/orcid.org\/0000-0002-1950-8838","authenticated-orcid":false,"given":"Francesco","family":"Fabris","sequence":"additional","affiliation":[]},{"given":"Rossano","family":"Girometti","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,11,3]]},"reference":[{"key":"2261_CR1","volume-title":"On measures of information and their characterizations, mathematics in science and engineering, vol 115","author":"J Acz\u00e9l","year":"1975","unstructured":"Acz\u00e9l J, Dar\u00f3czy Z (1975) On measures of information and their characterizations, mathematics in science and engineering, vol 115. 
Academic Press, New York"},{"key":"2261_CR2","doi-asserted-by":"crossref","DOI":"10.1002\/0470114754","volume-title":"An introduction to categorical data analysis","author":"A Agresti","year":"2007","unstructured":"Agresti A (2007) An introduction to categorical data analysis. Wiley-Blackwell, Hoboken"},{"key":"2261_CR3","unstructured":"Aha DW (1991) Tic-Tac-Toe endgame data set. https:\/\/archive.ics.uci.edu\/ml\/datasets\/Tic-Tac-Toe+Endgame"},{"issue":"1","key":"2261_CR4","first-page":"117","volume":"41","author":"U Arslan","year":"2014","unstructured":"Arslan U, Bozkurt B, Karaa\u011fao\u011flu AE, \u0130rke\u00e7 MT (2014) Evaluation of GDx parameters by using information theory. Turk J Med Sci 41(1):117\u2013124","journal-title":"Turk J Med Sci"},{"key":"2261_CR5","doi-asserted-by":"crossref","first-page":"214","DOI":"10.5152\/balkanmedj.2014.13218","volume":"31","author":"U Arslan","year":"2014","unstructured":"Arslan U, Karaa\u011fao\u011flu AE, \u00d6zkan G, Kanl\u0131 A (2014) Evaluation of diagnostic tests using information theory for multi-class diagnostic problems and its application for the detection of occlusal caries lesions. Balk Med J 31:214\u2013218","journal-title":"Balk Med J"},{"issue":"1","key":"2261_CR6","doi-asserted-by":"crossref","first-page":"3","DOI":"10.2307\/3315487","volume":"27","author":"M Banerjee","year":"1999","unstructured":"Banerjee M, Capozzoli M, McSweeney L, Sinha D (1999) Beyond kappa: a review of interrater agreement measures. Can J Stat 27(1):3\u201323","journal-title":"Can J Stat"},{"key":"2261_CR7","doi-asserted-by":"crossref","unstructured":"Barlow W (2005) Agreement, modeling of categorical. American Cancer Society","DOI":"10.1002\/0470011815.b2a04004"},{"issue":"2","key":"2261_CR8","doi-asserted-by":"crossref","first-page":"202","DOI":"10.1177\/0272989X9901900211","volume":"19","author":"WA Benish","year":"1999","unstructured":"Benish WA (1999) Relative entropy as a measure of diagnostic information. 
Med Dec Making 19(2):202\u2013206","journal-title":"Med Dec Making"},{"issue":"6","key":"2261_CR9","doi-asserted-by":"crossref","first-page":"552","DOI":"10.3414\/ME0627","volume":"48","author":"WA Benish","year":"2009","unstructured":"Benish WA (2009) Intuitive and axiomatic arguments for quantifying diagnostic test performance in units of information. Methods of Inf Med 48(6):552\u2013557","journal-title":"Methods of Inf Med"},{"issue":"6","key":"2261_CR10","doi-asserted-by":"crossref","first-page":"1044","DOI":"10.1177\/0962280212439742","volume":"24","author":"WA Benish","year":"2015","unstructured":"Benish WA (2015) The channel capacity of a diagnostic test as a function of test sensitivity and test specificity. Stat Methods Med Res 24(6):1044\u20131052. PMID: 22368178","journal-title":"Stat Methods Med Res"},{"issue":"14","key":"2261_CR11","doi-asserted-by":"publisher","first-page":"2109","DOI":"10.1002\/sim.1180","volume":"21","author":"H Chmura Kraemer","year":"2002","unstructured":"Chmura Kraemer H, Periyakoil VS, Noda A (2002) Kappa coefficients in medical research. Stat Med 21(14):2109\u20132129. https:\/\/doi.org\/10.1002\/sim.1180","journal-title":"Stat Med"},{"issue":"1","key":"2261_CR12","doi-asserted-by":"publisher","first-page":"58","DOI":"10.1097\/nmd.0000000000000598","volume":"205","author":"DV Cicchetti","year":"2017","unstructured":"Cicchetti DV, Klin A, Volkmar FR (2017) Assessing binary diagnoses of bio-behavioral disorders. J Nerv Ment Dis 205(1):58\u201365. https:\/\/doi.org\/10.1097\/nmd.0000000000000598","journal-title":"J Nerv Ment Dis"},{"issue":"1","key":"2261_CR13","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1177\/001316446002000104","volume":"20","author":"J Cohen","year":"1960","unstructured":"Cohen J (1960) A coefficient of agreement for nominal scales. 
Educ Psychol Meas 20(1):37\u201346","journal-title":"Educ Psychol Meas"},{"issue":"4","key":"2261_CR14","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1037\/h0026256","volume":"70","author":"J Cohen","year":"1968","unstructured":"Cohen J (1968) Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 70(4):213\u2013220","journal-title":"Psychol Bull"},{"key":"2261_CR15","doi-asserted-by":"crossref","unstructured":"Cook RJ (2005) Kappa. American Cancer Society","DOI":"10.1002\/0470011815.b2a04023"},{"issue":"9","key":"2261_CR16","doi-asserted-by":"publisher","first-page":"e0222,916","DOI":"10.1371\/journal.pone.0222916","volume":"14","author":"R Delgado","year":"2019","unstructured":"Delgado R, Tibau XA (2019) Why cohen\u2019s kappa should be avoided as performance measure in classification. PLOS ONE 14(9):e0222,916. https:\/\/doi.org\/10.1371\/journal.pone.0222916","journal-title":"PLOS ONE"},{"key":"2261_CR17","first-page":"2349","volume":"14","author":"J Dem\u0161ar","year":"2013","unstructured":"Dem\u0161ar J, Curk T, Erjavec A, \u010crt Gorup, Ho\u010devar T, Milutinovi\u010d M, Mo\u017eina M, Polajnar M, Toplak M, Stari\u010d A, \u0160tajdohar M, Umek L, \u017eagar L, \u017ebontar J, \u017eitnik M, Zupan B (2013) Orange: Data Mining Toolbox in Python. J Mach Learn Res 14:2349\u20132353. http:\/\/jmlr.org\/papers\/v14\/demsar13a.html","journal-title":"J Mach Learn Res"},{"key":"2261_CR18","unstructured":"D\u2019Orsi C, et alt (2014) 2013 ACR BI-RADS atlas: Breast imaging reporting and data system. American College of Radiology"},{"key":"2261_CR19","unstructured":"Dua D, Graff C (2017) UCI Machine learning repository. 
http:\/\/archive.ics.uci.edu\/ml"},{"issue":"6","key":"2261_CR20","doi-asserted-by":"crossref","first-page":"543","DOI":"10.1016\/0895-4356(90)90158-L","volume":"43","author":"AR Feinstein","year":"1990","unstructured":"Feinstein AR, Cicchetti DV (1990) High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol 43(6):543\u2013549","journal-title":"J Clin Epidemiol"},{"key":"2261_CR21","unstructured":"Fisher R (1988) IRIS data set. https:\/\/archive.ics.uci.edu\/ml\/datasets\/iris"},{"key":"2261_CR22","unstructured":"Fleiss JL (1981) Statistical Methods for Rates and Proportions. A Whiley publ.in applied statistics. Wiley"},{"issue":"3","key":"2261_CR23","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1177\/001316447303300309","volume":"33","author":"JL Fleiss","year":"1973","unstructured":"Fleiss JL, Cohen J (1973) The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educ Psychol Meas 33(3):613\u2013619","journal-title":"Educ Psychol Meas"},{"issue":"9","key":"2261_CR24","doi-asserted-by":"crossref","first-page":"899","DOI":"10.1007\/s11517-015-1294-7","volume":"53","author":"R Girometti","year":"2015","unstructured":"Girometti R, Fabris F (2015) Informational analysis: a Shannon theoretic approach to measure the performance of a diagnostic test. Med Biol Eng Comput 53(9):899\u2013910","journal-title":"Med Biol Eng Comput"},{"issue":"9","key":"2261_CR25","doi-asserted-by":"crossref","first-page":"3767","DOI":"10.1007\/s00330-017-4749-4","volume":"27","author":"R Girometti","year":"2017","unstructured":"Girometti R, Zanotel M, Londero V, Bazzocchi M, Zuiani C (2017) Comparison between automated breast volume scanner (ABVS) versus hand-held ultrasound as a second look procedure after magnetic resonance imaging. 
Eur Radiol 27(9):3767\u20133775","journal-title":"Eur Radiol"},{"issue":"5","key":"2261_CR26","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1016\/S0895-4356(99)00174-2","volume":"53","author":"F Hoehler","year":"2000","unstructured":"Hoehler F (2000) Bias and prevalence effects on kappa viewed in terms of sensitivity and specificity. J Clin Epidemiol 53(5):499\u2013503","journal-title":"J Clin Epidemiol"},{"key":"2261_CR27","unstructured":"Hopkins M, Reeber E, Forman G, Suermondt J (1999) Spambase data set. https:\/\/archive.ics.uci.edu\/ml\/datasets\/spambase"},{"key":"2261_CR28","unstructured":"Janosi A, Steinbrunn W, Pfisterer M, Detrano R (1988) Heart disease data set. http:\/\/archive.ics.uci.edu\/ml\/datasets\/Heart+Disease"},{"issue":"1","key":"2261_CR29","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1186\/s12911-016-0335-y","volume":"16","author":"Y Kang","year":"2016","unstructured":"Kang Y, Steis MR, Kolanowski AM, Fick D, Prabhu VV (2016) Measuring agreement between healthcare survey instruments using mutual information. BMC Med Inform Decis Mak 16(1):99","journal-title":"BMC Med Inform Decis Mak"},{"key":"2261_CR30","volume-title":"Mathematical foundations of information theory","author":"AI Khinchin","year":"1957","unstructured":"Khinchin AI (1957) Mathematical foundations of information theory. Dover Publications, New York"},{"issue":"3","key":"2261_CR31","first-page":"395","volume":"28","author":"B Klemens","year":"2012","unstructured":"Klemens B (2012) Mutual information as a measure of intercoder agreement. J Off Stat 28 (3):395\u2013412","journal-title":"J Off Stat"},{"issue":"1","key":"2261_CR32","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1214\/aoms\/1177729694","volume":"22","author":"S Kullback","year":"1951","unstructured":"Kullback S, Leibler RA (1951) On information and sufficiency. 
Ann Math Stat 22(1):79\u201386","journal-title":"Ann Math Stat"},{"issue":"3","key":"2261_CR33","doi-asserted-by":"crossref","first-page":"276","DOI":"10.11613\/BM.2012.031","volume":"22","author":"ML McHugh","year":"2012","unstructured":"McHugh ML (2012) Interrater reliability: the kappa statistic. Biochem Med (Zagreb) 22(3):276\u2013282","journal-title":"Biochem Med (Zagreb)"},{"issue":"2","key":"2261_CR34","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1007\/BF02295996","volume":"12","author":"Q McNemar","year":"1947","unstructured":"McNemar Q (1947) Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12(2):153\u2013157","journal-title":"Psychometrika"},{"key":"2261_CR35","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1148\/109.2.297","volume":"109","author":"C Metz","year":"1973","unstructured":"Metz C, Goodenough D, Rossmann K (1973) Evaluation of receiver operating characteristic curve data in terms of information theory, with applications in radiography. Radiology 109:297\u2013303","journal-title":"Radiology"},{"issue":"1","key":"2261_CR36","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1176\/jnp.4.1.95","volume":"4","author":"D Mossman","year":"1992","unstructured":"Mossman D, Somoza E (1992) Diagnostic tests and information theory. J Neuropsych Clin Neurosci 4(1):95\u201398","journal-title":"J Neuropsych Clin Neurosci"},{"issue":"1","key":"2261_CR37","first-page":"10","volume":"1","author":"EO \u00d6zlem","year":"2011","unstructured":"\u00d6zlem EO, Arma\u011fan K (2011) Evaluation and comparison of diagnostic test performance based on information theory. Int J Stat Appl 1(1):10\u201313","journal-title":"Int J Stat Appl"},{"key":"2261_CR38","doi-asserted-by":"crossref","first-page":"240","DOI":"10.1098\/rspl.1895.0041","volume":"58","author":"K Pearson","year":"1895","unstructured":"Pearson K (1895) Notes on regression and inheritance in the case of two parents. 
Proc R Soc Lond 58:240\u2013242","journal-title":"Proc R Soc Lond"},{"key":"2261_CR39","unstructured":"Schlimmer J (1987) Congressional voting records data set. https:\/\/archive.ics.uci.edu\/ml\/datasets\/Congressional+Voting+Records"},{"issue":"12","key":"2261_CR40","doi-asserted-by":"crossref","first-page":"2326","DOI":"10.1109\/TKDE.2018.2822307","volume":"30","author":"F Serafino","year":"2018","unstructured":"Serafino F, Pio G, Ceci M (2018) Ensemble learning for multi-type classification in heterogeneous networks. IEEE Trans Knowl Data Eng 30(12):2326\u20132339","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"3","key":"2261_CR41","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","volume":"27","author":"CE Shannon","year":"1948","unstructured":"Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27(3):379\u2013423","journal-title":"Bell Syst Tech J"},{"issue":"3","key":"2261_CR42","doi-asserted-by":"crossref","first-page":"165","DOI":"10.14366\/usg.15002","volume":"34","author":"HJ Shin","year":"2015","unstructured":"Shin HJ, Kim HH, Cha JH (2015) Current status of automated breast ultrasonography. Ultrasonography 34(3):165\u2013172","journal-title":"Ultrasonography"},{"key":"2261_CR43","unstructured":"Shoukri MM (2003) Measures of interobserver agreement. CRC Biostatistics Series Chapman & Hall"},{"issue":"2","key":"2261_CR44","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1037\/0033-2909.86.2.420","volume":"86","author":"P Shrout","year":"1979","unstructured":"Shrout P, Fleiss J (1979) Intraclass correlations: Uses in assessing rater reliability. Psychol Bull 86(2):420\u2013428","journal-title":"Psychol Bull"},{"issue":"C","key":"2261_CR45","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1016\/j.neucom.2014.10.086","volume":"160","author":"B Sluban","year":"2015","unstructured":"Sluban B, Lavra\u010d N (2015) Relating ensemble diversity and performance. 
Neurocomput 160 (C):120\u2013131. https:\/\/doi.org\/10.1016\/j.neucom.2014.10.086","journal-title":"Neurocomput"},{"issue":"3","key":"2261_CR46","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1177\/0272989X9201200303","volume":"12","author":"E Somoza","year":"1992","unstructured":"Somoza E, Mossman D (1992) Comparing and Optimizing Diagnostic Tests: An Information-theoretical Approach. Med Decis Making 12(3):179\u2013188. PMID: 1513208","journal-title":"Med Decis Making"},{"issue":"2","key":"2261_CR47","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1176\/jnp.4.2.214","volume":"4","author":"E Somoza","year":"1992","unstructured":"Somoza E, Mossman D (1992) Comparing diagnostic tests using information theory: the INFO-ROC technique. J Neuropsych Clin Neurosci 4(2):214\u2013219","journal-title":"J Neuropsych Clin Neurosci"},{"issue":"1","key":"2261_CR48","doi-asserted-by":"crossref","first-page":"72","DOI":"10.2307\/1412159","volume":"15","author":"C Spearman","year":"1904","unstructured":"Spearman C (1904) The proof and measurement of association between two things. Am J Psychol 15(1):72\u2013101. http:\/\/www.jstor.org\/stable\/1412159","journal-title":"Am J Psychol"},{"issue":"10","key":"2261_CR49","doi-asserted-by":"crossref","first-page":"949","DOI":"10.1016\/0895-4356(88)90031-5","volume":"41","author":"WD Thompson","year":"1988","unstructured":"Thompson WD, Walter SD (1988) A reappraisal of the kappa coefficient. J Clin Epidemiol 41(10):949\u2013958","journal-title":"J Clin Epidemiol"},{"issue":"7","key":"2261_CR50","doi-asserted-by":"crossref","first-page":"655","DOI":"10.1016\/j.jclinepi.2004.02.021","volume":"58","author":"W Vach","year":"2005","unstructured":"Vach W (2005) The dependence of Cohen\u2019s kappa on the prevalence does not matter. 
J Clin Epidemiol 58(7):655\u2013661","journal-title":"J Clin Epidemiol"},{"issue":"apr12 1","key":"2261_CR51","doi-asserted-by":"publisher","first-page":"f2125","DOI":"10.1136\/bmj.f2125","volume":"346","author":"HCW de Vet","year":"2013","unstructured":"de Vet HCW, Mokkink LB, Terwee CB, Hoekstra OS, Knol DL (2013) Clinicians are right not to like cohen\u2019s kappa. BMJ 346(apr12 1):f2125\u2013f2125. https:\/\/doi.org\/10.1136\/bmj.f2125","journal-title":"BMJ"},{"key":"2261_CR52","unstructured":"Wolberg William H, Street WN, Mangasarian OL (1995) Breast cancer wisconsin (diagnostic) data set. https:\/\/archive.ics.uci.edu\/ml\/datasets\/Breast+Cancer+Wisconsin+(Diagnostic)"},{"issue":"1","key":"2261_CR53","doi-asserted-by":"publisher","first-page":"211","DOI":"10.2174\/1874434601711010211","volume":"11","author":"S Zec","year":"2017","unstructured":"Zec S, Soriani N, Comoretto R, Baldi I (2017) High agreement and high prevalence: the paradox of cohen\u2019s kappa. Open Nurs J 11(1):211\u2013218. 
https:\/\/doi.org\/10.2174\/1874434601711010211","journal-title":"Open Nurs J"}],"container-title":["Medical & Biological Engineering & Computing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11517-020-02261-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11517-020-02261-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11517-020-02261-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,11,20]],"date-time":"2020-11-20T22:13:14Z","timestamp":1605910394000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11517-020-02261-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,3]]},"references-count":53,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["2261"],"URL":"https:\/\/doi.org\/10.1007\/s11517-020-02261-2","relation":{},"ISSN":["0140-0118","1741-0444"],"issn-type":[{"value":"0140-0118","type":"print"},{"value":"1741-0444","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,11,3]]},"assertion":[{"value":"2 March 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 August 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 November 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}