# Big data analytics for the evaluation of whole genome sequence and transcriptome data in alcohol research

> **NIH K25** · SCRIPPS RESEARCH INSTITUTE, THE · 2020 · $161,460

## Abstract

 Large-scale U.S. epidemiological studies demonstrate that alcohol use disorders are highly prevalent,
highly comorbid with other psychiatric disorders, disabling, and often go untreated. Compared with other U.S.
ethnic groups, Native Americans have the highest rates of alcohol and other drug dependence, which is
associated with particularly significant disability and mortality. Thus, studies that identify specific genetic risk
factors for alcohol use disorders in the general U.S. population, and especially in Native Americans, are of high
public health importance. Alcohol use disorders are complex genetic diseases sensitive to environmental
conditions that require complex data strategies to uncover the underlying risk factors. Although recent years
have seen significant advancement in our understanding of the biology and genetics of the disorders, exactly
how these factors interact in an individual to confer risk or protection from alcohol use disorders is still unclear.
Further, the genetic factors identified in the human genome thus far by conventional methods appear to only
explain a very small fraction of the overall heritability for the disorders.
 The overall objective of this research program is to identify the complex genetic and genomic factors that
affect susceptibility to alcohol use disorders and related comorbidities through novel and innovative
quantitative methods and big data analytics. The proposed study will utilize whole-genome sequence (WGS)
data from a unique high-risk Native American population and a European American population along with gene
expression data of alcoholic human brains. The project will develop methodology to analyze WGS data with
unique relevance to alcohol research. Selected multivariate, graphical, and dimension-reduction modeling tools
will be used in combination with mixed models suitable for genomic data with both population and family
structures to dissect the polygenic basis for alcohol use disorders and the genetic risk factors shared by alcohol use
disorders and comorbid disorders. Mixture models and clustering methods will be employed to uncover
heterogeneous genetic influences. Differential genetic effects at various levels of heterogeneity will be tested
with rigorous statistical methods. The project will identify population and ancestry-specific genetic risk factors
and shared risk factors across populations and determine their differential influences on susceptibility to
alcohol use disorders. Endophenotypes will also be investigated to help identify unique risk factors for alcohol
use disorder traits. The project will further identify alcohol- and addiction-relevant pathways and networks that
are differentially expressed in alcoholic brains, and establish directions of causation by combining gene
expression data with WGS data and applying instrumental variable approaches. Finally, an integrated system
approach will be taken by further leveraging epigenomic maps and annot...

## Key facts

- **NIH application ID:** 9981554
- **Project number:** 5K25AA025095-05
- **Recipient organization:** SCRIPPS RESEARCH INSTITUTE, THE
- **Principal Investigator:** Qian Peng
- **Activity code:** K25
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $161,460
- **Award type:** 5
- **Project period:** 2016-08-01 → 2022-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9981554

## Citation

> US National Institutes of Health, RePORTER application 9981554, Big data analytics for the evaluation of whole genome sequence and transcriptome data in alcohol research (5K25AA025095-05). Retrieved via AI Analytics 2026-05-28 from https://api.ai-analytics.org/grant/nih/9981554. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*