Bridging the gap between genetic variants and radiomic phenotypes via genomic large language models

NIH RePORTER · NIH · K99 · $131,085

Abstract

PROJECT SUMMARY/ABSTRACT One of the fundamental challenges in modern biology is to decode the function of the human genome sequence. Over the past decade, genome-wide association studies (GWAS) have generated a wealth of new information, including genotype–phenotype associations for various diseases and traits. Despite clear successes in identifying novel disease susceptibility genes and in translating these findings into clinical care, GWAS has been criticized because most association signals reflect variants and genes with no direct biological relevance to the phenotype. The development of large language models (LLMs) has been the main driving force behind many recent breakthroughs in artificial intelligence. Research into “genomic LLMs” therefore has the potential to significantly advance our understanding of how genetic variants lead to changes in phenotypes by disrupting the underlying regulatory syntax of DNA. The Research Training Plan will first develop and improve the core technologies of genomic LLMs to deepen our understanding of the complex mechanisms of gene regulation (Aim 1). The developed genomic LLMs will then be applied to imaging genetics studies, in which imaging traits are used as phenotypes (Aim 2), and to the development of new machine learning (ML) approaches for Alzheimer’s disease (AD) diagnosis (Aim 3). In Aim 1, the applicant, Dr. Qiao Liu, will develop new genomic LLM techniques and provide biological model interpretation, with a special focus on how transcription factors (TFs) bind DNA recognition sites in genomic regulatory regions to control transcription and affect epigenomic signals in a context-specific manner. The proposed genomic LLMs will serve as a solid foundation for both Aim 2 and Aim 3. In Aim 2, Dr. Liu will focus on imaging genetics studies, which can be considered GWAS of imaging phenotypes, linking genetic variants and genes to structural or functional imaging features through the mediation of genomic LLMs. Genomic LLMs will thus bridge the gap between personal genetics and radiomics. In Aim 3, during the R00 phase, Dr. Liu will develop new ML approaches for AD diagnosis by considering the causal genetic-imaging-clinical pathways and leveraging the power of the genomic LLM. To succeed in these aims, a Career Development Plan is tailored to enable Dr. Liu to gain new knowledge and skills in radiomics, neuroimaging, and Alzheimer’s disease, as well as career skills, through practice and coursework with the support of an outstanding mentoring team and scientific advisory committee. Stanford University is an ideal environment, providing all of the facilities needed for the proposed research and a rich interdisciplinary setting for collaborative studies. In summary, the strong mentoring team and scientific advisory committee, together with the training plan, are anticipated to fully prepare Dr. Liu to launch his independent career. The proposed studies promise to o...

Key facts

NIH application ID
10948002
Project number
1K99HG013661-01
Recipient
STANFORD UNIVERSITY
Principal Investigator
Qiao Liu
Activity code
K99
Funding institute
NIH
Fiscal year
2024
Award amount
$131,085
Award type
1
Project period
2024-09-01 → 2025-08-31