Data Coordination Core

NIH RePORTER · NIH · U2C · $2,367,144

Abstract

PROJECT SUMMARY: The Kids First Data Coordinating Core (DCC) supports data intake, management, and release for the largest pediatric genomic resource available to the public. This currently includes 40 projects, 48,000 genomes, and 23 released studies as well as other relevant interoperable datasets. The DCC is responsible for ingesting, harmonizing, curating, and provisioning data to the Kids First Data Resource Core (DRC) in lossless, high-utility formats. This includes retaining source data and requisite metadata for scientific discovery across a wide range of cancer and birth defect research domains. Through this experience, we have developed the Kids First “Data Tracker” application as a key tool for working collaboratively with data generators, analysts, investigators, and NIH program staff on effective data coordination. The DCC will expand the application based on use cases around clinical data ingestion, AWS S3 storage management, status reporting, workflow automation, and dbGaP submissions. The data exchange endpoint between the DCC and DRC will be a FHIR service, which will also allow other interoperable data resources to integrate with Kids First data. We will continue to improve on our genomic data best practices and expand the portfolio of pipelines available for long-read sequencing, epigenomics, and other data modalities as required for the Kids First community. All bioinformatics pipeline development will leverage our experience building community-based best-practice workflows using Common Workflow Language (CWL) and Docker so that the community can use them reproducibly. Similarly, the DCC will continue to expand and improve clinical/phenotypic data collection by building a standards-based curation toolkit that integrates with the Data Tracker application.
As the collection of Kids First variants scales to tens of billions of variants, the DCC will support the community by building easy-to-use workflows for variant filtering and incorporating more expansive annotation, allowing researchers to more easily identify variants and clinical data that merit further investigation. The DCC will work hand in hand with the Kids First Administrative and Outreach Core (AOC) to provide technical user support, training, and documentation for the data, pipelines, and other tools developed for Kids First, helping to empower and accelerate research and discovery in the scientific community.

Key facts

NIH application ID
10917375
Project number
5U2CHD109731-08
Recipient
CHILDREN'S HOSP OF PHILADELPHIA
Principal Investigator
Allison Heath
Activity code
U2C
Funding institute
NIH
Fiscal year
2024
Award amount
$2,367,144
Award type
5
Project period
2017-09-26 → 2027-08-31