GHUCCTS N3C COVID data mapping

NIH RePORTER · NIH · UL1 · $99,864

Abstract

A major challenge to full utilization of the available data and resources has been the complex nature of health data and the heterogeneity of data sources (including unstructured clinical notes), combined with a lack of standards. The lack of standards precludes semantic interoperability across platforms and between institutions. Instead, current approaches rely on resource-intensive natural language processing to extract, transform, and correlate data from different sources for analysis. To advance translational science and accelerate research that improves patient outcomes, many new and innovative studies are leveraging large volumes of available data through standardized and shared data initiatives. With current advances in computing and in health data analysis tools, methods, and access, and with the aim of making data more meaningful, open, and accessible, research studies have moved beyond traditional retrospective reporting to pragmatic interventions and predictive capabilities. Ongoing efforts focus on exploiting common data standards and models such as the Observational Medical Outcomes Partnership (OMOP) standard, which is defined by the Observational Health Data Sciences and Informatics (OHDSI) consortium and accepted as canon by both the NIH and PCORI. These standards will lead the way to discovering insights in textual narrative, enforcing data standardization, and promoting scalability and sharing. The OHDSI Common Data Model (CDM) makes data more meaningful, open, and accessible, which drives translational science and allows for consistent development of predictive models across different data sources. The National COVID Cohort Collaborative (N3C), ACT, BD2K-NIH Data Commons, the National Center for Data to Health (CD2H), and others are among the efforts that will lead to new discoveries and informed decision making, driven by data science and undergirded by mature Big Data technologies.
We propose to design and establish novel, scalable, and standardized big data processes to abstract raw electronic medical record datasets at scale for observational studies. This project will develop a secure cloud-based environment to host these data, as well as the application programming interfaces and graphical user interfaces needed to support observational research studies that leverage these resources. By these means, we will reduce the barriers to data standardization, annotation, and sharing for reproducible analytics and begin to enforce complete semantic and syntactic interoperability between the resources in the data ecosystem. This effort will enable our investigators to study the effects of medical interventions, predict patients' health outcomes, and generate the empirical evidence base necessary to establish best practices in observational analysis.

Key facts

NIH application ID: 10299876
Project number: 3UL1TR001409-06S3
Recipient: GEORGETOWN UNIVERSITY
Principal Investigator: Nawar Shara
Activity code: UL1
Funding institute: NIH
Fiscal year: 2021
Award amount: $99,864
Award type: 3
Project period: 2015-08-28 → 2022-01-25