# Uncovering the Human Secretome

> **NIH DP1** · HARVARD MEDICAL SCHOOL · 2020 · $1,186,500

## Abstract

Peptide hormones regulate embryonic development and most physiological processes by acting as endocrine
or paracrine signals. They are also a rich source of relatively safe medicines to treat both common and rare
diseases. Yet finding peptide-coding genes below ~300 base pairs is inherently difficult because they lie within
the noise of the genome. Recent multidisciplinary, proteophylogenomic studies in lower species, such as yeast
and flies, have uncovered hundreds of new small protein-coding genes called “smORFs”. In humans, recent
work on the mitochondrial genome has also uncovered dozens of small peptide hormone genes called MDPs.
Based on these and other studies, it is estimated that about 5% of proteins in the human nuclear genome have
not yet been discovered, particularly those that encode small peptides below 100 amino acids. It is a
well-documented but rarely challenged practice to discard large quantities of sequencing and proteomic data
because they do not match the annotated human genome. My overarching goal is to discover the human
“secretome” and make practical use of it to improve the human condition. Over the past few years, we have
developed a unique pipeline of technologies that combines breakthroughs in math, computer hardware and
software, proteomics, mass spectrometry, and high-throughput screening (HTS), each of which has been optimized and
integrated. Our GeneFinder software modules, based on machine learning, can process data 100 times faster
than traditional methods and rapidly validate small human genes using public and in-house generated
databases of genetic and proteomic data. Using the prototype version of the platform that finds conservation
between humans, chimpanzees, and macaques, we have discovered thousands of putative peptide-coding genes and
validated hundreds of them. We aim to (1) further improve the algorithm to increase its speed and accuracy,
(2) improve the genome annotation for thousands of small novel genes, (3) determine their expression profiles
in normal and diseased tissues, (4) explore their genetic association with disease loci, and (5) screen the first
secretomic library to find hormones with novel biological and therapeutically relevant activities. The data, the
software package, and libraries will be made available to the research community. In doing so, we will shed
light on the dark matter of the human genome, the parts with the greatest therapeutic potential, thereby helping
to steer and accelerate the pace of research and drug development for generations to come.

## Key facts

- **NIH application ID:** 9928344
- **Project number:** 5DP1AG058605-04
- **Recipient organization:** HARVARD MEDICAL SCHOOL
- **Principal Investigator:** DAVID A. SINCLAIR
- **Activity code:** DP1
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $1,186,500
- **Award type:** 5
- **Project period:** 2017-09-15 → 2022-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9928344

## Citation

> US National Institutes of Health, RePORTER application 9928344, Uncovering the Human Secretome (5DP1AG058605-04). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/9928344. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
