# Software development for Stan to improve survey statistics for non-probability samples

> **NIH R01** · COLUMBIA UNIV NEW YORK MORNINGSIDE · 2021 · $233,137

## Abstract

Project Summary

This proposal is a supplement to our NIH grant R01 AG067149-01: Improving Representativeness
in Non-probability Surveys and Causal Inference with Regularized Regression and Poststratification.
That project involves developing certain Bayesian methods for sampling adjustment in a general,
flexible, and reliable way that can be used for a wide range of problems in public health research.
The project requires extensive use of the Stan probabilistic programming platform, both as part of
the research effort and as part of resulting methods.

This NOSI is synergistic with that grant. It will support new software engineering initiatives to
improve the core Stan platform in three ways: (1) providing the option for JSON format outputs
will improve interoperability and facilitate incorporating Bayesian methods into machine learning
pipelines; (2) extending and refactoring the core Stan inference algorithms for greater memory
efficiency and increased parallel processing will improve the overall speed and scalability of
inference, allowing Bayesian methods to be used with increasingly complex models and letting
researchers compare a greater number and wider range of models in order to find those with
optimal behaviors; and (3) the addition of a standard logging framework will benefit both the Stan
user community and the developer community.

The parent grant's research agenda is threefold. First, it addresses the unique
challenges posed by public health datasets and questions by investigating adaptations to
state-of-the-art modelling techniques. Second, it strives to improve causal inferences for demographic
subgroups. Third, and more broadly, it seeks to improve current methodology by developing
workflows to test and validate models with non-representative data in order to obtain better and
more trustworthy population-based estimates.

The work in the NOSI is relevant in two ways. First, it will directly support the research in
the main project. Computational challenges arise during our research, and progress in the research
reveals areas where the computational infrastructure needs to be improved; thus, the NOSI will
enable us to do our NIH-funded research more effectively. Second, it is important for the results
of our research to be used by others. The computing work in the NOSI will make it easier for
applied practitioners to make use of the research we have been developing. Furthermore, the
addition of a common data format for inputs and outputs, greater processing speed and efficiency,
and standardized logging will make it easier to use Stan in complex processing pipelines, thereby
improving overall cloud-readiness.

## Key facts

- **NIH application ID:** 10405924
- **Project number:** 3R01AG067149-02S1
- **Recipient organization:** COLUMBIA UNIV NEW YORK MORNINGSIDE
- **Principal Investigator:** ANDREW GELMAN
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $233,137
- **Award type:** 3
- **Project period:** 2020-08-01 → 2023-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10405924

## Citation

> US National Institutes of Health, RePORTER application 10405924, Software development for Stan to improve survey statistics for non-probability samples (3R01AG067149-02S1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10405924. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
