# Identification of Transposable Element Insertions in the Kids First Data

> **NIH R03** · HARVARD MEDICAL SCHOOL · 2021 · $169,000

## Abstract

Insertion of transposable elements (TEs, sometimes referred to as “jumping genes”) into the
human genome can be pathogenic. Our aim in this project is to use sophisticated computational
approaches to characterize TE insertions in the whole-genome sequencing data generated in
the Gabriella Miller Kids First Pediatric Research Program and identify any insertional mutations
that may disrupt gene function. The large scale of the Kids First program provides an
unprecedented opportunity to investigate the role of TE insertions in childhood cancers and
structural birth defects, as well as to create a resource of reference TE maps that will be
important for all other TE studies. We will first modify our existing algorithm called xTEA for the
trio design of the Kids First studies and increase the accuracy and efficiency of the algorithm.
Then, we will apply it to the thousands of trios that have been profiled in the Kids First program,
using a pipeline optimized for the cloud environment. The resulting set of TE insertions
(especially L1, Alu, SVA, and HERV insertions) will be curated with all relevant features and be
made into a database for the community. We will also apply machine learning methods to
improve the calls once sufficient training data have been obtained. To investigate
the potential pathogenicity of these mutations, we will first focus on insertions within genes, but we
will also explore those in regulatory elements inferred from epigenetic profiling data.

## Key facts

- **NIH application ID:** 10172875
- **Project number:** 5R03CA249364-02
- **Recipient organization:** HARVARD MEDICAL SCHOOL
- **Principal Investigator:** Peter J Park
- **Activity code:** R03
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $169,000
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2020-06-01 → 2023-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10172875

## Citation

> US National Institutes of Health, RePORTER application 10172875, Identification of Transposable Element Insertions in the Kids First Data (5R03CA249364-02). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10172875. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
