# Bridging the gap between genetic variants and radiomic phenotypes via genomic large language models

> **NIH K99** · STANFORD UNIVERSITY · 2024 · $131,085

## Abstract

One of the fundamental challenges in modern biology is to decode the functions of the human genome
sequence. Over the past decade, genome-wide association studies (GWAS) have generated a wealth of new
information, including the genotype–phenotype associations in various diseases and traits. Despite clear
successes in identifying novel disease susceptibility genes and in translating these findings into clinical care,
GWAS has been criticized for the fact that most association signals reflect variants and genes with no direct
biological relevance to phenotype. The development of large language models (LLMs) has been the main driving
force behind many recent breakthroughs in artificial intelligence. Research into “genomic LLMs” therefore has
the potential to significantly advance our understanding of how genetic variants lead to changes in
phenotypes by disrupting the underlying regulatory syntax of DNA. The Research Training Plan will first develop
and improve the core technologies of genomic LLMs to deepen our understanding of the complex
mechanisms of gene regulation (Aim 1). The developed genomic LLMs will then be applied in imaging
genetics studies where imaging traits are used as phenotypes (Aim 2) and in the development of new machine
learning (ML) approaches for Alzheimer’s disease (AD) diagnosis (Aim 3). In Aim 1, the applicant, Dr. Qiao Liu, will
develop new genomic LLM techniques and provide biological model interpretation with a special focus on how
transcription factors (TFs) bind DNA recognition sites in genomic regulatory regions to control genomic
transcription and affect epigenomic signals in a context-specific manner. The proposed genomic LLMs will serve
as a solid foundation for both Aim 2 and Aim 3. In Aim 2, Dr. Liu will focus on imaging genetics studies, which
can be considered GWAS of imaging phenotypes, to link genetic variants/genes to structural or functional
imaging features through the mediation of genomic LLMs. Genomic LLMs will thus bridge the gap between
personal genetics and radiomics. In Aim 3, during the R00 phase, Dr. Liu will develop new ML approaches for
AD diagnosis by considering the causal genetic-imaging-clinical pathways and leveraging the power of the
genomic LLM. To succeed in these aims, a Career Development Plan is tailored to enable Dr. Liu to gain new
knowledge and skills in radiomics, neuroimaging, and Alzheimer’s disease, as well as career skills through
practice and coursework with the support of the outstanding mentoring team and scientific advisory committee.
Stanford University is an ideal environment, providing all of the facilities needed for the proposed research and
a rich interdisciplinary environment for collaborative studies. In summary, the strong mentoring team and
scientific advisory committee, as well as the training plan, are anticipated to fully prepare Dr. Liu to launch his
independent career. The proposed studies promise to o...

## Key facts

- **NIH application ID:** 10948002
- **Project number:** 1K99HG013661-01
- **Recipient organization:** STANFORD UNIVERSITY
- **Principal Investigator:** Qiao Liu
- **Activity code:** K99
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $131,085
- **Award type:** 1
- **Project period:** 2024-09-01 → 2025-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10948002

## Citation

> US National Institutes of Health, RePORTER application 10948002, Bridging the gap between genetic variants and radiomic phenotypes via genomic large language models (1K99HG013661-01). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10948002. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
