# Bionformatics Resource Core

> **NIH P30** · BRIGHAM AND WOMEN'S HOSPITAL · 2024 · $255,752

## Abstract

PROJECT SUMMARY

In this renewal, we build on our experience serving as the VERITY Bioinformatics Core, expanding Core services to include advancements in the field since the last application. Research using electronic medical record (EMR) data continues to grow with the increasing availability of EMR data. Simultaneously, methods for using these data in research have advanced, including natural language processing (NLP) and machine learning (ML) to extract crucial clinical data embedded in narrative notes and to include these data in models of disease risk and outcomes. However, there remains a large gap between access to raw EMR data optimized for billing and patient care, and the ability to fully and appropriately utilize these data in clinical research. Through consultations, courses offered as part of the Bioinformatics Core, and current Core projects, we have identified four areas of high demand and/or unmet need for clinical investigators: (1) phenotyping using EMR data; (2) extraction of clinical data from narrative notes using NLP, including early applications of this technology to study social determinants of health; (3) use of EMR data for studies of treatment effects and applications of causal inference methods; and (4) approaches for multi-institutional EMR studies without requiring direct sharing of data (termed federated learning).

The mission of the Bioinformatics Core remains to support investigators from the pediatric and adult rheumatic and musculoskeletal (MSK) research community in applying and integrating bioinformatics approaches in clinical research studies using EMR data. While our target audience remains trainees and junior faculty, in this renewal our expanded services are also designed for established investigators interested in incorporating bioinformatics into their research programs.

Aim 1. To provide methods for investigators to obtain robust and accurate phenotypes using information from EMRs and to integrate these data into clinical studies. This requires applying supervised and unsupervised machine learning approaches for phenotyping with EMR data. We will also apply causal inference methods to EMR data for studies of treatment effects.

Aim 2. To provide NLP support enabling clinical research studies with EMR data. We will support and develop the use of NLP to incorporate social determinants of health (SDoH) in studies of health equity using EMR data. We will also support and educate investigators on the use of NLP-based data and tools.

Aim 3. To strengthen existing ties and build new partnerships between the rheumatic and MSK clinical research and bioinformatics communities through Core platforms and consulting services.

The Bioinformatics Core team and a network of expert advisors will perform consultations, provide educational services, and deliver bioinformatics research services to the Research Community. Additionally, we will build on our foundation to enable cross-institutional studies with federated lea...

## Key facts

- **NIH application ID:** 10906802
- **Project number:** 5P30AR072577-08
- **Recipient organization:** BRIGHAM AND WOMEN'S HOSPITAL
- **Principal Investigator:** Katherine Phoenix Liao
- **Activity code:** P30
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $255,752
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2017-09-15 → 2027-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10906802

## Citation

> US National Institutes of Health, RePORTER application 10906802, Bionformatics Resource Core (5P30AR072577-08). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10906802. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
