# Finishing multiple genomes in EupathDB using Oxford Nanopore Single Molecule sequencing

> **NIH R21** · UNIVERSITY OF PITTSBURGH AT PITTSBURGH · 2020 · $259,005

## Abstract

Toxoplasma gondii is an important opportunistic pathogen of humans, in whom it can cause severe disease in the
developing fetus and in those with HIV/AIDS. Despite extensive efforts by the research community to sequence,
assemble and annotate multiple genomes for this organism, these genome sequences remain incomplete because of
repetitive and unclonable sequences. A major reason for this knowledge gap is that the sequencing
technologies used (1st and 2nd generation) cannot fully resolve these loci. Because thousands of base pairs are
missing and/or unassembled, the research community cannot make fully effective use of the data (which are hosted
on the EuPathDB Bioinformatics Resource Center; BRC). Here we propose to resequence
and generate de novo assemblies for multiple T. gondii isolates (as well as two other species that serve as
comparators) using 3rd generation sequencing and chromosome conformation-based sequencing approaches,
and then annotate them and integrate them into the EuPathDB BRC. Our preliminary data show the feasibility of
this approach: we have used it to revise the karyotype for T. gondii (discovering that it harbors 13, rather
than 14, chromosomes), increase the total genome assembly by ~2 Mb, and perform genome-wide analyses of
structural and/or copy number variation at loci with a known role in T. gondii pathogenesis. The proposed
studies are responsive to RFA PA-19-068, “Secondary Analysis of Existing Datasets for Advancing Infectious
Disease Research” by specifically using data outside of the EuPathDB BRC (our de novo assemblies and
annotations) to improve the utility of data within the EuPathDB BRC (gene expression, annotation and
proteomics data, for example). Moreover, the analysis pipeline will rely on using the existing genome sequence
data within the EuPathDB BRC to identify sequence differences between our new assemblies and those
hosted by the BRC. In addition to the expertise of the PI in genome sequencing and in the function of multicopy loci
encoding pathogenesis determinants, the success of the proposed studies is facilitated by the assembled
team, which includes experts in chromosome conformation capture-based sequencing approaches (Le Roch)
and in sequence assembly and annotation (Lorenzi).

## Key facts

- **NIH application ID:** 10048453
- **Project number:** 1R21AI154386-01
- **Recipient organization:** UNIVERSITY OF PITTSBURGH AT PITTSBURGH
- **Principal Investigator:** Jon P Boyle
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $259,005
- **Award type:** 1
- **Project period:** 2020-06-10 → 2022-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10048453

## Citation

> US National Institutes of Health, RePORTER application 10048453, Finishing multiple genomes in EupathDB using Oxford Nanopore Single Molecule sequencing (1R21AI154386-01). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10048453. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
