ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus
- PMID: 25432799
- PMCID: PMC4264258
- DOI: 10.1186/s12859-014-0373-3
Abstract
Background: To extract meaningful information from electronic medical records, such as signs and symptoms, diagnoses, and treatments, it is important to take into account the contextual properties of the identified information: negation, temporality, and experiencer. Most work on automatic identification of these contextual properties has been done on English clinical text. This study presents ContextD, an adaptation of the English ConText algorithm to the Dutch language, and a Dutch clinical corpus. We created a Dutch clinical corpus containing four types of anonymized clinical documents: entries from general practitioners, specialists' letters, radiology reports, and discharge letters. Using a Dutch list of medical terms extracted from the Unified Medical Language System, we identified medical terms in the corpus with exact matching. The identified terms were annotated for negation, temporality, and experiencer properties. To adapt the ConText algorithm, we translated English trigger terms to Dutch and added several general and document-specific enhancements, such as negation rules for general practitioners' entries and a regular-expression-based temporality module.
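The trigger-based approach described above can be illustrated with a minimal Python sketch. It is not the ContextD implementation: the trigger and termination lists are illustrative assumptions, the function names are hypothetical, and the forward-only scope rule is a simplification of how ConText-style algorithms work. Medical terms are matched exactly, as single lowercase words.

```python
import re

# Illustrative Dutch triggers; NOT the trigger list used by ContextD.
NEGATION_TRIGGERS = {"geen", "niet", "zonder"}  # "no", "not", "without"
TERMINATION_TERMS = {"maar"}                     # "but" closes an open scope


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def negated_terms(sentence, medical_terms):
    """Return the single-word medical terms that fall inside a negation scope.

    A trigger opens a scope that runs forward until a termination term
    or the end of the sentence (a simplifying assumption).
    """
    in_scope = False
    found = []
    for token in tokenize(sentence):
        if token in NEGATION_TRIGGERS:
            in_scope = True
        elif token in TERMINATION_TERMS:
            in_scope = False
        elif in_scope and token in medical_terms:
            found.append(token)
    return found


if __name__ == "__main__":
    terms = {"koorts", "hoest"}  # "fever", "cough"
    print(negated_terms("Patiënt heeft geen koorts maar wel hoest", terms))
```

In this sketch "koorts" is reported as negated while "hoest" is not, because the termination term "maar" closes the scope opened by "geen".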
Results: The ContextD algorithm utilized 41 unique triggers to identify the contextual properties in the clinical corpus. For the negation property, the algorithm obtained an F-score from 87% to 93% for the different document types. For the experiencer property, the F-score was 99% to 100%. For the historical and hypothetical values of the temporality property, F-scores ranged from 26% to 54% and from 13% to 44%, respectively.
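For readers unfamiliar with the metric, the sketch below shows how an F-score is commonly computed from counts of true positives, false positives, and false negatives. It assumes the balanced F1 variant (harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name and the counts in the usage line are hypothetical, not taken from the study.

```python
def f_score(true_pos, false_pos, false_neg):
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Returns 0.0 when precision or recall is undefined or both are zero.
    """
    if true_pos + false_pos == 0 or true_pos + false_neg == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(round(f_score(true_pos=90, false_pos=10, false_neg=10), 2))
```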
Conclusions: ContextD showed good performance in identifying negation and experiencer property values across all Dutch clinical document types. Accurate identification of the temporality property proved to be difficult and requires further work. The anonymized and annotated Dutch clinical corpus can serve as a useful resource for further algorithm development.