ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus
2014 Nov 29;15(1):373.
doi: 10.1186/s12859-014-0373-3.

Zubair Afzal et al. BMC Bioinformatics.

Abstract

Background: To extract meaningful information from electronic medical records, such as signs and symptoms, diagnoses, and treatments, it is important to take into account the contextual properties of the identified information: negation, temporality, and experiencer. Most work on automatic identification of these contextual properties has been done on English clinical text. This study presents ContextD, an adaptation of the English ConText algorithm to the Dutch language, and a Dutch clinical corpus. We created a Dutch clinical corpus containing four types of anonymized clinical documents: entries from general practitioners, specialists' letters, radiology reports, and discharge letters. Using a Dutch list of medical terms extracted from the Unified Medical Language System, we identified medical terms in the corpus with exact matching. The identified terms were annotated for negation, temporality, and experiencer properties. To adapt the ConText algorithm, we translated English trigger terms to Dutch and added several general and document-specific enhancements, such as negation rules for general practitioners' entries and a regular-expression-based temporality module.
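To make the trigger-based approach concrete, the following is a minimal Python sketch of ConText-style negation detection. It is an illustration under stated assumptions, not the ContextD implementation: the trigger words, termination words, function names, and the example sentence are all hypothetical, and the sketch handles single-token medical terms only. A negation trigger opens a scope that runs until a termination term or the end of the sentence, and any medical term inside that scope is marked as negated.

```python
import re

# Hypothetical trigger and termination lists, for illustration only.
# These are NOT the triggers used by ContextD.
NEGATION_TRIGGERS = {"geen", "niet", "zonder"}
TERMINATION_TERMS = {"maar", "echter"}


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def negated_terms(sentence, medical_terms):
    """Return medical terms that follow a negation trigger within its scope.

    The scope opens at a trigger and closes at a termination term
    or at the end of the sentence.
    """
    negated = []
    in_scope = False
    for token in tokenize(sentence):
        if token in NEGATION_TRIGGERS:
            in_scope = True
        elif token in TERMINATION_TERMS:
            in_scope = False
        elif in_scope and token in medical_terms:
            negated.append(token)
    return negated


# Hypothetical example: "Patient has no fever but does have a cough."
terms = {"koorts", "hoesten"}
print(negated_terms("Patiënt heeft geen koorts maar wel hoesten", terms))
# prints ['koorts']
```

In this sketch the termination term "maar" closes the negation scope, so "hoesten" is not marked as negated. A fuller system would also need multi-word terms, triggers that follow the term, and the document-specific rules mentioned above.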

Results: The ContextD algorithm utilized 41 unique triggers to identify the contextual properties in the clinical corpus. For the negation property, the algorithm obtained F-scores from 87% to 93% across the different document types. For the experiencer property, the F-scores were 99% to 100%. For the historical and hypothetical values of the temporality property, F-scores ranged from 26% to 54% and from 13% to 44%, respectively.
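For readers unfamiliar with the metric, the short Python sketch below shows how an F-score can be computed from counts of true positives, false positives, and false negatives. It assumes the balanced F1 variant (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name and the counts in the usage line are hypothetical and are not results from the study.

```python
def f_score(tp, fp, fn):
    """Balanced F-score (F1) from true positive, false positive,
    and false negative counts. Returns 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
print(round(f_score(tp=90, fp=10, fn=10), 2))
# prints 0.9
```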

Conclusions: ContextD showed good performance in identifying negation and experiencer property values across all Dutch clinical document types. Accurate identification of the temporality property proved difficult and requires further work. The anonymized and annotated Dutch clinical corpus can serve as a useful resource for further algorithm development.
