# Scalable methods for identity by descent

> **NIH R01** · UNIVERSITY OF TEXAS HLTH SCI CTR HOUSTON · 2020 · $570,000

## Abstract

In the next few years, large genotyped cohorts will become available (e.g., TOPMed, UK Biobank, All of Us,
Million Veteran Program). As sample sizes approach 0.1%-1% of the total population size, such samples will
contain many distant relatives and extensive identity-by-descent (IBD) information. This information
will enable more sophisticated and powerful genetic analyses beyond single-variant analyses. However,
current informatics methods are not efficient enough to handle genotype data at that scale. We will
develop new genome informatics methods for biobank-scale cohorts with genotypes. We have developed an
efficient tool, RaPID, the first computationally feasible method for inferring IBD segments among individuals in
a biobank-scale cohort. We demonstrated that RaPID achieves running time linear in the sample size and is
over 100 times faster than existing methods. At the same time, RaPID detects more IBD segments, with
higher accuracy and sharper segment boundaries, than existing methods. In this application, we propose to
develop (1) the RaPID+ method for pairwise IBD detection, which tolerates and corrects phasing errors, offers a
principled way to tune parameters, and works with genotype data across sequencing and array platforms;
(2) the RaPID-diploid method for detection of IBD2 segments; (3) the RaPID-multiway method that identifies
IBD clusters; and (4) the RaPID-ancestry method for local ancestry inference across subcontinental populations.
Methods will be rigorously tested in simulations using realistic population demographic models as well as real
data from large cohorts. All methods will be implemented as free software for academic use. This project will
advance genetic research by developing efficient informatics tools that reveal detailed genetic relationships in
very large genotyped cohorts.

## Key facts

- **NIH application ID:** 9899283
- **Project number:** 5R01HG010086-03
- **Recipient organization:** UNIVERSITY OF TEXAS HLTH SCI CTR HOUSTON
- **Principal Investigator:** Shaojie Zhang
- **Activity code:** R01 (R01, R21, SBIR, etc.)
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $570,000
- **Award type:** 5
- **Project period:** 2018-06-01 → 2022-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9899283

## Citation

> US National Institutes of Health, RePORTER application 9899283, Scalable methods for identity by descent (5R01HG010086-03). Retrieved via AI Analytics 2026-05-28 from https://api.ai-analytics.org/grant/nih/9899283. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
