# Analyzing the behavior and interpreting the results of gene based tests of rare variant association

> **NIH R15** · DORDT COLLEGE · 2020 · $125,000

## Abstract

The technological and computational breakthroughs in the years since the sequencing of the human genome
have provided an unprecedented opportunity to understand the etiology of complex human diseases. Notably,
the diminishing cost of next-generation sequencing means that it is now possible for researchers to obtain
complete genome sequence information on many thousands of individuals, with widespread access to that
data via large repositories of electronic health records (EHRs; e.g., biobanks). However, major statistical
questions remain about biobank-era analysis strategies for studying the contribution of genetic variation to
common diseases. In particular, foundational statistical questions exist in the areas of: (a) the
need to minimize computational complexity and respect data privacy concerns, (b) random and non-random
missing data, (c) data uncertainty and errors (both phenotypic and genotypic), and (d) the role of multi-marker
(variant-set) tests, which aggregate evidence from many individual variants into a single test statistic. Research
by our group under our existing award (R15-HG006915, 2011-present) began by developing a framework
for evaluating the performance of existing variant-set tests. We then utilized this framework to provide a clear
understanding of test performance in a variety of circumstances, developed novel robust and powerful tests,
evaluated method performance in light of genotype uncertainty, developed methods to characterize underlying
genetic architecture and demonstrated the utility of these methods to understand the genetics of fatty acids
and high blood pressure. Recently, we proposed a novel method for utilizing summary statistics from
single-variant, single-phenotype association in tests of complex phenotypes (in this case, a linear combination of
phenotypes) as a first step toward computationally efficient, biobank-era-ready statistical methods for
assessing genotype-phenotype association. Moving forward, our research will generalize this initial method to
be applicable to any complex phenotype. Additionally, we will continue to build on a strong history of
exploring uncertainty by considering the impact of random and non-random errors and uncertainty on
genotype-phenotype association tests in the biobank era, and by extending these methods to multi-marker test
settings. Methods we develop will be tested on both simulated and real data via the CHARGE consortium.
Additionally, the work we will perform addresses the three main goals of NIH’s R15 program: (a) to conduct
meritorious research that will (b) strengthen the research environment of the liberal arts college where the
research will be conducted, while (c) exposing undergraduate students to statistical genetics research. With
this last goal in mind, the fourth aim of our proposal is to provide research experiences to undergraduate
students while conducting Aims 1, 2, and 3.
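
The abstract describes testing a linear combination of phenotypes using only single-variant, single-phenotype summary statistics. The sketch below illustrates the general idea and is not the investigators' published method. It relies on a standard large-sample approximation and on several assumptions: the phenotypes are standardized to unit variance and measured on the same individuals, the variant's effects are small, and the phenotype correlation matrix is known. The names `combined_z` and `two_sided_p` are hypothetical, and the input values are made up for illustration.

```python
import math


def combined_z(z_scores, weights, phenotype_corr):
    """Approximate z-score for one variant tested against sum_k w_k * Y_k.

    z_scores[k]     : marginal z-score of the variant for phenotype k
    weights[k]      : weight w_k of phenotype k in the linear combination
    phenotype_corr  : K x K correlation matrix of the standardized phenotypes

    Assumption: under the null, the marginal z-scores are jointly normal with
    correlation equal to the phenotype correlation, so Var(w'z) = w' R w.
    """
    k = len(z_scores)
    if len(weights) != k or len(phenotype_corr) != k:
        raise ValueError("z_scores, weights and phenotype_corr must agree in size")
    if any(len(row) != k for row in phenotype_corr):
        raise ValueError("phenotype_corr must be a square matrix")

    # Numerator: weighted sum of the marginal z-scores.
    numerator = sum(w * z for w, z in zip(weights, z_scores))

    # Denominator: standard deviation of the weighted sum, sqrt(w' R w).
    variance = sum(
        weights[i] * phenotype_corr[i][j] * weights[j]
        for i in range(k)
        for j in range(k)
    )
    if variance <= 0.0:
        raise ValueError("w' R w must be positive")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Illustrative (made-up) inputs: two phenotypes with correlation 0.3.
    z = combined_z([2.1, 1.4], [1.0, 1.0], [[1.0, 0.3], [0.3, 1.0]])
    print(f"combined z = {z:.3f}, p = {two_sided_p(z):.4f}")
```

Because the calculation needs only per-variant z-scores and a phenotype correlation matrix, no individual-level genotype or phenotype data are touched, which is the property that makes summary-statistic approaches attractive for the computational and privacy concerns raised in the abstract.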

## Key facts

- **NIH application ID:** 10155863
- **Project number:** 3R15HG006915-03S1
- **Recipient organization:** DORDT COLLEGE
- **Principal Investigator:** Nathan L Tintle
- **Activity code:** R15
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $125,000
- **Award type:** 3
- **Project period:** 2020-08-11 → 2021-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10155863

## Citation

> US National Institutes of Health, RePORTER application 10155863, Analyzing the behavior and interpreting the results of gene based tests of rare variant association (3R15HG006915-03S1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10155863. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*