# Protein Signatures of APOE2 and Cognitive Aging

> **NIH R01** · TUFTS MEDICAL CENTER · 2021 · $322,757

## Abstract

Improving AI/ML readiness of data generated under the R01: Protein signatures of APOE2 and
cognitive aging. Funded by the NIA: R01 AG061844 “Protein signatures of APOE2 and cognitive aging”, we
are generating proteomic and metabolomics data in a cohort of centenarians, their offspring, and unrelated
controls from the New England Centenarian Study (NECS). Study participants have been characterized with
detailed medical history, genetic profiles, and longitudinal assessment of physical and cognitive functions. The
goal of the parent R01 is to validate a proteomic signature of APOE genotypes, and to evaluate its value
together with metabolic profiles to predict patterns of cognitive function change in aging individuals. We plan to
share data through the Alzheimer’s disease (AD) portal, and the new extreme longevity (EL) portal that is
currently under development. Sharing the data in an unrestricted manner is not possible because they include
HIPAA identifiers, particularly age >89. Unrestricted sharing of data would be an attractive option for AI/ML
investigators, and the goal of this request for administrative supplement is to use advanced machine learning
techniques to generate high-fidelity, privacy-preserving, synthetic versions of the data obtained in the parent
R01 so they can be shared without restriction. Machine learning methods have emerged that can be used to
generate synthetic data using a model that is trained in the real data. This model can be used to generate a
synthetic data set in which no single data point corresponds to a real person in the original data set, but the
synthetic data can be analyzed to produce results that are like those derived from the original data. This
approach has received substantial attention in the past few years, and it has been adopted to compromise
between data sharing and privacy, including generation of synthetic data for the National COVID Cohort
Collaborative (N3C). We have put together a team of data scientists and partners from the company Syntegra,
to generate and validate a synthetic data set that matches the data generated with the parent R01. Our
proposal is structured in three aims. In Aim 1, we will share with Syntegra real data from the NECS that include
proteomics and metabolomics, genetic variables and patients’ characteristics including assessment of
cognitive function. This real data will be used to train the data generation model and create synthetic data sets.
In Aim 2 we will develop a protocol for validation of the synthetic data sets that includes fidelity to a variety of
results of machine learning analyses and metrics to assess the deidentification of data. In Aim 3 we will
conduct the analysis in the real and synthetic data sets and compare the results. Impact. This is a high risk,
but potentially high return proposal. If the approach works, we will be able to generate data that can be widely
shared with the community. The approach will also be applicable to several other stu...
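
The validation idea described in Aims 2 and 3 (compare results from real and synthetic data, and check that no synthetic record reproduces a real one) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy data, the function names `column_mean_gap`, `min_distance_to_real`, and `fit_and_sample`, and the independent-Gaussian generator, which is only a placeholder and not the method used by Syntegra or the NECS team. The abstract does not specify which fidelity or deidentification metrics the protocol uses.

```python
import math
import random
import statistics


def column_mean_gap(real, synth):
    """Largest absolute difference in per-column means (a crude fidelity check)."""
    n_cols = len(real[0])
    return max(
        abs(statistics.fmean(r[j] for r in real)
            - statistics.fmean(s[j] for s in synth))
        for j in range(n_cols)
    )


def min_distance_to_real(real, synth):
    """Smallest Euclidean distance from any synthetic row to any real row.

    A value of 0 means some synthetic row copies a real record exactly.
    """
    return min(math.dist(s, r) for s in synth for r in real)


def fit_and_sample(rows, n, rng):
    """Placeholder generator (assumption): fit one Gaussian per column
    and sample each column independently. Real generators model joint structure."""
    cols = list(zip(*rows))
    params = [(statistics.fmean(c), statistics.stdev(c)) for c in cols]
    return [[rng.gauss(m, s) for m, s in params] for _ in range(n)]


# Toy stand-in for two numeric measurements per participant (not NECS data).
rng = random.Random(0)
real = [[rng.gauss(0, 1), rng.gauss(5, 2)] for _ in range(200)]
synth = fit_and_sample(real, 200, rng)

fidelity_gap = column_mean_gap(real, synth)      # smaller is better
privacy_gap = min_distance_to_real(real, synth)  # should be strictly positive
print(f"max column-mean gap: {fidelity_gap:.3f}")
print(f"min distance to a real record: {privacy_gap:.3f}")
```

A real protocol would go further than column means, for example by rerunning the same predictive analysis on both data sets and comparing the results, as Aim 3 describes.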

## Key facts

- **NIH application ID:** 10408304
- **Project number:** 3R01AG061844-04S1
- **Recipient organization:** TUFTS MEDICAL CENTER
- **Principal Investigator:** THOMAS T PERLS
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $322,757
- **Award type:** 3
- **Project period:** 2018-09-30 → 2023-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10408304

## Citation

> US National Institutes of Health, RePORTER application 10408304, Protein Signatures of APOE2 and Cognitive Aging (3R01AG061844-04S1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10408304. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
