Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications
- PMID: 20819853
- PMCID: PMC2995668
- DOI: 10.1136/jamia.2009.001560
Abstract
We aim to build and evaluate an open-source natural language processing system for information extraction from the clinical free text of electronic medical records. We describe and evaluate our system, the clinical Text Analysis and Knowledge Extraction System (cTAKES), released open-source at http://www.ohnlp.org. cTAKES builds on existing open-source technologies: the Unstructured Information Management Architecture framework and the OpenNLP natural language processing toolkit. Its components, trained specifically for the clinical domain, create rich linguistic and semantic annotations. Performance of individual components: sentence boundary detector accuracy=0.949; tokenizer accuracy=0.949; part-of-speech tagger accuracy=0.936; shallow parser F-score=0.924; named entity recognizer and system-level evaluation F-score=0.715 for exact spans and 0.824 for overlapping spans. Accuracy for the concept mapping, negation, and status attributes was 0.957, 0.943, and 0.859, respectively, for exact spans, and 0.580, 0.939, and 0.839, respectively, for overlapping spans. Overall performance is discussed against five applications. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free text.
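The distinction between exact and overlapping spans in the named entity figures can be illustrated with a small sketch. This is not the evaluation code used for cTAKES; the function names, the character-offset span representation, and the toy spans are all assumptions made for illustration. Under exact matching a predicted span counts only if both boundaries equal a gold span; under overlapping matching any shared character is enough.

```python
from typing import List, Tuple

# A span is (start, end) in character offsets, end exclusive (an assumption).
Span = Tuple[int, int]


def spans_overlap(a: Span, b: Span) -> bool:
    """True when the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def span_f_score(gold: List[Span], predicted: List[Span], exact: bool = True) -> float:
    """F-score of predicted spans against gold spans.

    exact=True  -> a match requires identical boundaries.
    exact=False -> a match requires only that the spans overlap.
    """
    match = (lambda g, p: g == p) if exact else spans_overlap
    # Predicted spans that match some gold span (for precision).
    matched_pred = sum(1 for p in predicted if any(match(g, p) for g in gold))
    # Gold spans that are matched by some predicted span (for recall).
    matched_gold = sum(1 for g in gold if any(match(g, p) for p in predicted))
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data: one exact hit, one boundary error, one spurious span.
gold = [(0, 5), (10, 20), (30, 40)]
predicted = [(0, 5), (12, 20), (50, 55)]

print(span_f_score(gold, predicted, exact=True))
print(span_f_score(gold, predicted, exact=False))
```

In the toy data the boundary error at (12, 20) is rejected under exact matching but accepted under overlapping matching, which is why overlapping-span scores are typically at least as high as exact-span scores for the same output.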