# Using Administrative Data to Improve Health Survey Estimates

> **NIH R21** · RESEARCH TRIANGLE INSTITUTE · 2020 · $220,903

## Abstract

Health research relies on probability-based survey data for a wide range of outcomes, such as
evaluating policy impact, quantifying disease prevalence rates and health conditions in the general
population, identifying risk factors, assessing health disparities, and measuring changes over time. The need
for accurate data on the U.S. population has never been greater. However, survey participation has been
rapidly declining and methodological studies have shown substantial bias in some survey estimates because
of nonresponse. Results based on multiple nonresponse bias studies indicate that adjustments for
nonresponse using demographic characteristics, as in common practice, may not be sufficient.

This study proposes the use of government administrative data that are related to health measures to
improve weighting adjustments. In an era of “big data” that has seen the use of combined data from multiple
sources to expand the types of analyses, there is a missed opportunity to use administrative data for
improving survey estimates rather than only augmenting data with analytic variables. Administrative data can
be variable-poor, such as enrollment in a government health plan without any other data, but complete for
the entire population; survey data are generally variable-rich, with a diverse set of demographic, factual, and
behavioral survey measures, but can suffer from nonresponse and other survey errors. This study aims to
leverage the population estimates from administrative data to improve inference from the survey data.
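
The abstract does not say which weighting technique the study uses. As one common illustration of the idea, the sketch below applies post-stratification: base weights are rescaled so that weighted survey totals match known administrative totals. All names and numbers are hypothetical, and the sketch assumes the survey indicator agrees exactly with the administrative record and that both sources cover the same population.

```python
from collections import defaultdict


def poststratify(weights, cells, population_totals):
    """Rescale base weights so each cell's weighted total equals a known total."""
    weighted_totals = defaultdict(float)
    for w, c in zip(weights, cells):
        weighted_totals[c] += w
    # One adjustment factor per cell: known population total / weighted survey total.
    factors = {c: population_totals[c] / weighted_totals[c] for c in weighted_totals}
    return [w * factors[c] for w, c in zip(weights, cells)]


def weighted_mean(values, weights):
    """Weighted mean of a survey measure."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy data: five respondents, each with a base weight of 100.
base_weights = [100.0, 100.0, 100.0, 100.0, 100.0]
enrolled = [True, False, False, False, False]   # assumed to match the admin record
has_condition = [1, 0, 0, 1, 0]                 # survey-only health measure

# Hypothetical administrative totals: enrollees are under-represented
# among respondents (weighted 100 of 500 versus 200 of 500 in the records).
admin_totals = {True: 200.0, False: 300.0}

adjusted_weights = poststratify(base_weights, enrolled, admin_totals)
unadjusted_estimate = weighted_mean(has_condition, base_weights)
adjusted_estimate = weighted_mean(has_condition, adjusted_weights)
```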

There are three main impediments to using administrative data to correct for nonresponse. A key
challenge results from measurement differences between the responses to the survey questions and the
administrative data elements. A second hindrance arises from problems with data linkage across sources. A
third obstacle is posed by the usual difference in target populations between surveys and administrative
databases. The main objectives of this study are to implement a set of methods that overcome these
challenges and to evaluate the approach's effectiveness in reducing nonresponse bias, along with its effects
on variance estimates and mean squared error.
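
For reference, bias, variance, and mean squared error are linked by a standard decomposition. The notation is introduced here for illustration and is not from the abstract: $\hat{\theta}$ is a weighted survey estimate and $\theta$ is the population value. An adjustment that reduces nonresponse bias can still raise the mean squared error if more variable weights inflate the variance enough, which is why the quantities are evaluated together.

$$
\operatorname{MSE}(\hat{\theta}) = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big] = \operatorname{Bias}(\hat{\theta})^2 + \operatorname{Var}(\hat{\theta}), \qquad \operatorname{Bias}(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta
$$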

If successful, the proposed approach could lead to better utilization of available resources to improve health
survey data. This study offers a test based on one survey and three administrative databases (from two
sources), but if deemed effective, it could be applied to many other surveys using a variety of administrative
data sources. The integrity of official statistics relies on accurate estimates, which in turn depend on the
methods used to produce those estimates. Declining survey participation is a serious threat to results based
on surveys, and this study aims to contribute to the development of better methods to correct for nonresponse.

## Key facts

- **NIH application ID:** 9845782
- **Project number:** 5R21AG061312-02
- **Recipient organization:** RESEARCH TRIANGLE INSTITUTE
- **Principal Investigator:** Andrey Alexandrov Peytchev
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $220,903
- **Award type:** 5
- **Project period:** 2019-01-15 → 2021-11-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9845782

## Citation

> US National Institutes of Health, RePORTER application 9845782, Using Administrative Data to Improve Health Survey Estimates (5R21AG061312-02). Retrieved via AI Analytics 2026-06-11 from https://api.ai-analytics.org/grant/nih/9845782. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
