Temporal relation discovery for clinical text

NIH RePORTER · NIH · R01 · $520,942 · view on reporter.nih.gov ↗

Abstract

Project Summary / Abstract The current proposal continues the investigation on the topic of temporal relation extraction from the Electronic Medical Records (EMR) clinical narrative funded by the NLM since 2010 (Temporal Histories of Your Medical Events, or THYME; thyme.healthnlp.org). Through our efforts so far, we have defined the topic as an active area of research attracting attention across the world. Since its inception, the project has pushed the boundaries of this highly challenging task by investigating new computational methods within the context of the latest developments in the fields of natural language processing (NLP), machine learning (ML), artificial intelligence (AI) and biomedical informatics (BMI) resulting in 60+ publications/presentations. We have made our best performing methods available to the community open source as part of the Apache Clinical Text Analysis and Knowledge Extraction System (cTAKES; ctakes.apache.org). In 2015, 2016, 2017 and 2018, we organized an international shared task (Clinical TempEval) on the topic under the umbrella of the highly prestigious SemEval, thus inviting the international community to work with our THYME data and improve on our results. Clinical TempEval has been highly successful with many participants each year, resulting in new discoveries and many publications. We have made all our data along with our gold standard annotations available to the community through the hNLP Center (center.healthnlp.org). The underlying theme of this renewal is novel methods for combining explicit domain knowledge (linguistic, semantic, biomedical ontological, clinical), readily available unlabeled data (health-related social media, EMRs), and modern machine learning techniques (e.g. neural networks) for temporal relation extraction from the EMR clinical narrative. Therefore, our renewal proposes a novel and much needed exploration of this line of research: Specific Aim 1: Develop computational models for novel rich semantic representations such as the Abstract Meaning Representations to encapsulate a single, coherent, full-document graphical representation of meaning for temporal relation extraction Specific Aim 2: Develop computational methods to infuse domain knowledge (linguistic, semantic, biomedical ontological, clinical) into modern machine learning techniques such as NNs for temporal relation extraction – through input representations, pre-trained vectors, or architectures Specific Aim 3: Develop novel methods for combining labeled and unlabeled data from various sources (EMR, health-related social media, newswire) for temporal relation extraction from the clinical narrative Specific Aim 4: Apply the best performing methods for temporal relation extraction developed in SA1-3 to temporally sensitive phenotypes for direct translational sciences studies. Dissemination efforts through publications and open source releases into Apache cTAKES.

Key facts

NIH application ID
10405477
Project number
5R01LM010090-11
Recipient
BOSTON CHILDREN'S HOSPITAL
Principal Investigator
James H. Martin
Activity code
R01
Funding institute
NIH
Fiscal year
2022
Award amount
$520,942
Award type
5
Project period
2010-09-30 → 2023-05-31