# Machine learning approaches for improved accuracy and speed in sequence annotation: supplement for software enhancement

> **NIH R01** · UNIVERSITY OF MONTANA · 2021 · $221,904

## Abstract

The goal of the parent grant for this supplement request is to develop machine learning approaches that
improve both the accuracy and the speed of highly sensitive sequence database search and alignment. We have
developed three software tools as part of this effort to annotate genomes correctly: (i) ULTRA, which
labels repetitive sequence; (ii) PolyA, which integrates such labels with other sequence annotations in a
probabilistic framework, computing uncertainty and improving accuracy; and (iii) SODA, which aids in
visualization of annotations and supporting evidence. Here, we describe a plan to refactor these software
tools and their documentation to improve robustness and reliability, and to improve their availability through
package management systems and incorporation into cloud-based analysis frameworks.

## Key facts

- **NIH application ID:** 10406630
- **Project number:** 3R01GM132600-03S1
- **Recipient organization:** UNIVERSITY OF MONTANA
- **Principal Investigator:** Travis John Wheeler
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $221,904
- **Award type:** 3
- **Project period:** 2019-09-20 → 2023-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10406630

## Citation

> US National Institutes of Health, RePORTER application 10406630, Machine learning approaches for improved accuracy and speed in sequence annotation: supplement for software enhancement (3R01GM132600-03S1). Retrieved via AI Analytics 2026-05-28 from https://api.ai-analytics.org/grant/nih/10406630. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
