# Outcome Dependent Sampling of Longitudinal Data: Design and Analysis

> **NIH R01** · VANDERBILT UNIVERSITY MEDICAL CENTER · 2024 · $676,786

## Abstract

Contemporary cohort studies and randomized clinical trials are regularly linked to
biorepositories and electronic health records systems. These secondary resources often
contain exposure and/or outcome data that are crucial for addressing novel study
questions. However, exposure/outcome ascertainment costs are often prohibitive. For
example, assaying biospecimens from biobanks to measure blood markers and manually
reviewing health records to accurately ascertain medical history information are both
costly and restrict sample size. A solution to high ascertainment costs is the two-phase
study that uses available participant information to identify those who are most
informative for addressing study questions. Restricted study resources are then
concentrated on the sub-cohort of informative participants. Outcome dependent and
outcome related sampling designs are examples of two-phase studies. They are highly
efficient compared to standard random sampling because they use outcome and/or
auxiliary variable data to identify the informative participants and then enrich the
observed sample with them. However, analyses must correct for the non-representative
sample. In this competing renewal, we propose highly efficient outcome dependent and
outcome related sampling designs as well as ascertainment correcting analysis
procedures for ordinal and longitudinal data. This is a natural extension of the research
conducted during the prior funding cycles, which focused on longitudinal binary and
normally distributed response data. In the current proposal, our focus is on generalized
ordinal (from a few ordered categories to non-normal, continuous) and longitudinal
ordinal responses, on novel semiparametric models, and on robust variations of
likelihood-based estimation strategies. Aim 1 addresses outcome dependent sampling and
outcome related sampling designs and analysis procedures for scalar generalized
ordinal response data; Aim 2 addresses outcome dependent sampling and outcome related
sampling designs and analysis procedures for ordinal, longitudinal data; and Aim 3
extends a new class of semiparametric generalized linear models (SPGLM) to
correlated multi-outcome dependent sampling designs, to longitudinal data settings, and
then proposes outcome dependent sampling designs for longitudinal data.
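
The two-phase idea described above can be illustrated with a minimal sketch. It is not the project's method: the abstract proposes likelihood-based and semiparametric procedures for ordinal and longitudinal data, whereas this example uses a scalar binary outcome and simple inverse-probability weighting as a stand-in for an ascertainment correction. The function names, sampling probabilities, and the simulated data model are all assumptions made for illustration.

```python
import numpy as np

def simulate_cohort(n, rng):
    """Phase 1 (hypothetical data): outcome y is known for everyone,
    exposure x is treated as expensive to measure."""
    x = rng.normal(0.0, 1.0, size=n)
    p = 1.0 / (1.0 + np.exp(-(-3.0 + 1.0 * x)))  # P(y = 1 | x), a rare outcome
    y = rng.binomial(1, p)
    return x, y

def outcome_dependent_sample(y, p_case, p_control, rng):
    """Phase 2: select participants with a probability that depends on y."""
    pi = np.where(y == 1, p_case, p_control)  # known selection probabilities
    selected = rng.random(y.size) < pi
    return selected, pi

def weighted_mean(x, pi):
    """Inverse-probability-weighted (Hajek) mean of x in the selected sample."""
    w = 1.0 / pi
    return np.sum(w * x) / np.sum(w)

rng = np.random.default_rng(0)
x, y = simulate_cohort(20000, rng)

# Assumed design: measure x on every case and on 5% of non-cases.
selected, pi = outcome_dependent_sample(y, p_case=1.0, p_control=0.05, rng=rng)

full = x.mean()                                       # unavailable in practice
naive = x[selected].mean()                            # ignores the design
corrected = weighted_mean(x[selected], pi[selected])  # accounts for the design

print(f"phase-2 sample size: {selected.sum()} of {y.size}")
print(f"full-cohort mean of x: {full:.3f}")
print(f"naive sample mean:     {naive:.3f}")
print(f"weighted sample mean:  {corrected:.3f}")
```

In this simulation the unweighted mean is pulled toward the exposure distribution of the cases, because cases are over-represented in the phase-2 sample, while the weighted mean sits much closer to the full-cohort value. Weighting is only one simple correction; it is shown here because it needs nothing beyond the known selection probabilities.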

## Key facts

- **NIH application ID:** 10830475
- **Project number:** 5R01HL094786-10
- **Recipient organization:** VANDERBILT UNIVERSITY MEDICAL CENTER
- **Principal Investigator:** Jonathan Scott Schildcrout
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $676,786
- **Award type:** 5
- **Project period:** 2009-09-01 → 2027-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10830475

## Citation

> US National Institutes of Health, RePORTER application 10830475, Outcome Dependent Sampling of Longitudinal Data: Design and Analysis (5R01HL094786-10). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10830475. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
