# Sample-specific Models for Molecular Portraits of Diseases in Precision Medicine

> **NIH R01** · CARNEGIE-MELLON UNIVERSITY · 2022 · $303,551

## Abstract

A fundamental challenge in precision medicine is to understand the patterns of differentiation between individuals. To
address this challenge, we propose to go beyond the traditional “one disease, one model” view of bioinformatics and
pursue a new view built upon personalized patient models, one that facilitates precision medicine by leveraging both
commonalities within a patient cohort and signatures unique to every individual patient. With the emergence of
large-scale databases such as The Cancer Genome Atlas (TCGA), the International Cancer Genome Consortium
(ICGC), and the Gene Expression Omnibus (GEO), which collect multi-omic data on many different diseases, a new
“pan-omics” and “pan-disease” paradigm has emerged to jointly analyze all patients in a disease cohort while
accounting for patient-specific effects. An example of this is the recently released Pan-Cancer Atlas. At the same time,
the next-generation statistical tools needed to draw these inferences accurately and rigorously are lacking.
In this project we propose a series of mathematically rigorous, statistically sound, and computationally feasible
approaches to infer sample-specific models, providing a more complete view of heterogeneous datasets. By bringing
together ideas from the machine learning, statistics, and mathematical optimization communities, we provide a
rigorous framework for precision medicine via sample-specific statistical models. Crucially, we propose to analyze this
framework and prove strong theoretical guarantees under weak assumptions, which dramatically distinguishes our
framework from much of the existing literature. Towards these goals, we propose the following aims:

Aim 1: Discovery of new molecular profiles with sample-specific statistical models. We propose a general framework
for inferring sample-specific models with low-rank structure based on the novel concept of distance-matching. This
allows us to infer statistical models at the level of a single patient without overfitting, and is general enough to be
applied to prediction, classification, and network inference, as well as to a variety of diseases and phenotypes.

Aim 2: Multimodal approaches to personalized diagnosis: contextually interpretable models for actionable clinical
decision support. In order to translate these models into practice, we propose a novel interpretable predictive model
that supports complex, multimodal data types such as images and text combined with high-level interpretable features
such as SNP data, gender, age, etc. This framework simultaneously boosts the accuracy of clinical predictions by
exploiting sample heterogeneity while providing human-digestible explanations for the predictions being made.

Aim 3: Next-generation precision medicine: algorithms and software for personalized estimation. To put our models
into practical use, we will develop new algorithms for interpretable prediction of personalized clinical outcomes and
visualization of personalized statistical models. ...

## Key facts

- **NIH application ID:** 10492446
- **Project number:** 5R01GM140467-03
- **Recipient organization:** CARNEGIE-MELLON UNIVERSITY
- **Principal Investigator:** Eric P Xing
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $303,551
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2020-09-01 → 2024-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10492446

## Citation

> US National Institutes of Health, RePORTER application 10492446, Sample-specific Models for Molecular Portraits of Diseases in Precision Medicine (5R01GM140467-03). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10492446. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
