ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus

doi:10.1186/s12859-014-0373-3

BMC Bioinformatics. 2014 Nov 29;15(1):373.

Zubair Afzal¹, Ewoud Pons², Ning Kang³, Miriam C J M Sturkenboom⁴, Martijn J Schuemie⁵, Jan A Kors⁶

Affiliations

¹ Department of Medical Informatics, Erasmus Medical Center, P.O. Box 2040, 3000 CA Rotterdam, Netherlands. m.afzal@erasmusmc.nl.
² Department of Medical Informatics, Erasmus Medical Center, P.O. Box 2040, 3000 CA Rotterdam, Netherlands. e.pons@erasmusmc.nl.
³ Department of Medical Informatics, Erasmus Medical Center, P.O. Box 2040, 3000 CA Rotterdam, Netherlands. n.kang@erasmusmc.nl.
⁴ Department of Medical Informatics, Erasmus Medical Center, P.O. Box 2040, 3000 CA Rotterdam, Netherlands. m.sturkenboom@erasmusmc.nl.
⁵ Janssen Research and Development LLC, Titusville, NJ, USA. mschuemi@its.jnj.com.
⁶ Department of Medical Informatics, Erasmus Medical Center, P.O. Box 2040, 3000 CA Rotterdam, Netherlands. j.kors@erasmusmc.nl.

PMID: 25432799
PMCID: PMC4264258
DOI: 10.1186/s12859-014-0373-3

Zubair Afzal et al. BMC Bioinformatics. 2014.

Abstract

Background: To extract meaningful information from electronic medical records, such as signs and symptoms, diagnoses, and treatments, it is important to take into account the contextual properties of the identified information: negation, temporality, and experiencer. Most work on automatic identification of these contextual properties has been done on English clinical text. This study presents ContextD, an adaptation of the English ConText algorithm to the Dutch language, and a Dutch clinical corpus. We created a Dutch clinical corpus containing four types of anonymized clinical documents: entries from general practitioners, specialists' letters, radiology reports, and discharge letters. Using a Dutch list of medical terms extracted from the Unified Medical Language System, we identified medical terms in the corpus with exact matching. The identified terms were annotated for negation, temporality, and experiencer properties. To adapt the ConText algorithm, we translated English trigger terms to Dutch and added several general and document-specific enhancements, such as negation rules for general practitioners' entries and a regular-expression-based temporality module.
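The trigger-based approach described above can be illustrated with a minimal sketch. It is not the ContextD implementation: the Dutch trigger words, the scope terminators, the five-token window, and the year pattern are all hypothetical choices made for illustration, and only triggers that precede the term are considered.

```python
import re

# Hypothetical Dutch trigger terms for illustration only;
# these are NOT the trigger list used by ContextD.
NEGATION_TRIGGERS = {"geen", "niet", "zonder"}
# Hypothetical terms that end a trigger's scope ("but", "however").
TERMINATION_TERMS = {"maar", "echter"}
# Hypothetical temporality pattern: "in 2009", "sinds 1998".
HISTORICAL_PATTERN = re.compile(r"\b(?:in|sinds)\s+(?:19|20)\d{2}\b")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def is_negated(sentence, term, window=5):
    """Return True if a negation trigger precedes `term` within `window`
    tokens and no termination term lies between trigger and term."""
    tokens = tokenize(sentence)
    term_tokens = tokenize(term)
    n = len(term_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != term_tokens:
            continue  # exact token match of the medical term required
        start = max(0, i - window)
        # Walk backwards from the term towards the start of the window.
        for j in range(i - 1, start - 1, -1):
            if tokens[j] in TERMINATION_TERMS:
                break  # scope of any earlier trigger ends here
            if tokens[j] in NEGATION_TRIGGERS:
                return True
    return False


def is_historical(sentence):
    """Return True if the sentence contains an explicit past-year expression."""
    return HISTORICAL_PATTERN.search(sentence.lower()) is not None
```

For example, `is_negated("Patiënt heeft geen koorts", "koorts")` marks the term as negated, while in "Geen koorts, maar wel hoesten" the term "hoesten" is not, because the assumed terminator "maar" closes the scope of "geen".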

Results: The ContextD algorithm utilized 41 unique triggers to identify the contextual properties in the clinical corpus. For the negation property, the algorithm obtained an F-score from 87% to 93% for the different document types. For the experiencer property, the F-score was 99% to 100%. For the historical and hypothetical values of the temporality property, F-scores ranged from 26% to 54% and from 13% to 44%, respectively.
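As a reminder of how such scores are typically derived, the sketch below computes precision, recall, and their harmonic mean from annotation counts. It assumes the reported F-score is the balanced F1 measure, which the abstract does not state explicitly, and the counts in the usage note are invented for illustration.

```python
def f_score(tp, fp, fn):
    """Balanced F-score (F1) from true-positive, false-positive,
    and false-negative counts; returns 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With hypothetical counts of 90 true positives, 10 false positives, and 10 false negatives, precision and recall are both 0.9, so the F-score is also 0.9.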

Conclusions: ContextD showed good performance in identifying negation and experiencer property values across all Dutch clinical document types. Accurate identification of the temporality property proved to be difficult and requires further work. The anonymized and annotated Dutch clinical corpus can serve as a useful resource for further algorithm development.
