# Core C- Bioinformatics Core

> **NIH P01** · UNIVERSITY OF CALIFORNIA, SAN DIEGO · 2024 · $173,917

## Abstract

PROJECT SUMMARY – Core C: Bioinformatics
Bioinformatics is the application of statistics and computer science to the field of molecular biology. It has
emerged as a field unto itself, as the datasets generated by modern biomedical researchers easily
exceed what can be directly visualized. The vast amount of data increases the chance of false-negative and
false-positive results, and argues for robust statistical models and reproducible workflows. Core C will work with
the data generated from massively parallel sequencing of human, frog and mouse in Projects I, II and III and
Core B to extract variants that have the potential to cause meningomyelocele or influence neural tube
phenotypes. The PIs of the Projects and Cores have worked together extensively in the past, and have an
established track record of productivity in the area of next-generation sequencing (NGS) data analysis. Dr. Bafna
has worked broadly in bioinformatics and genomics, developing computational methodologies that employ
novel algorithms and statistical techniques for NGS datasets. We envision that the DNA sequencing data derived
from Project I, in the form of whole genome or whole exome sequencing from patients and their parents, will be
delivered to Core C for prioritization of potentially pathogenic, risk-associated variants. RNA
sequencing, single cell sequencing and epigenetic sequencing data generated by Core B, as well as data imported
from Projects I, II and III, will be delivered to Core C for extraction of expression changes, which will be returned
to each of the Projects for segregation analysis and further validation. The Bioinformatics Core will provide
analysis pipelines to identify and annotate variants; develop innovative network, RNA-seq,
Methyl-seq and single cell analyses to discover novel genetic mechanisms of meningomyelocele (MM) based on
protein-protein interaction (PPI) and gene co-expression networks; interpret large datasets from current genetic and
genomic technologies; and apply these in the different components of this Program Project. Although our
primary goal is to provide service using existing computational methods, we expect that Core B will also
develop novel computational methods as required by the Projects and Cores, as we did in developing
our current WGS analysis pipeline. Methods development will be geared towards fundamental unsolved
problems underlying the above four key functions, such as algorithms for correlating variants to phenotypes,
further improvements in methods for computing epistatic interactions, detection of short tandem repeats and
mobile elements from WGS, advanced methods for integration of genotypes with pathways, use of next-
generation sequencing (NGS) in analysis of gene association, and discovery of genetic variants that influence
protein expression or function.

## Key facts

- **NIH application ID:** 10747897
- **Project number:** 5P01HD104436-04
- **Recipient organization:** UNIVERSITY OF CALIFORNIA, SAN DIEGO
- **Principal Investigator:** Vineet Bafna
- **Activity code:** P01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $173,917
- **Award type:** 5
- **Project period:** 2020-12-01 → 2025-11-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10747897

## Citation

> US National Institutes of Health, RePORTER application 10747897, Core C- Bioinformatics Core (5P01HD104436-04). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10747897. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
