Abstract

Renewed interest in the microbiome has yielded numerous intriguing findings about the role of bacterial symbionts in human biology and disease. However, most of these findings are correlative and associative; little is known about microbiota-host interactions at the level of molecular mechanism. Typical approaches to address this problem are one-off efforts that attempt to find an individual molecule responsible for a phenotype of interest. Here, we propose to upend this paradigm by systematically studying one of the most concrete contributions of the microbiota to human biology: the ‘top 100’ molecules, by abundance, from the gut community. These molecules vary widely in concentration among individuals, can accumulate in host circulation, and are present at levels that match or exceed the concentration of a typical small molecule drug. The work we propose here – to determine the bacterial species that produce each of the top 100, and identify the genes responsible – will be the first step toward creating a capability to completely specify the molecular output of the gut community (which molecules are produced, and which others are not), a process we will pilot in the current project. This ensemble of dozens of high-concentration molecules – to which we are exposed daily – is likely to be a major driver of human biology and disease. It is not unreasonable to imagine a future in which every human will harbor a ‘reprogrammed’ (synthetic) gut community whose molecular output has been optimized for disease treatment and prevention. The fields of natural product discovery and microbiome research are dominated by genomics-driven approaches. The solution we propose here runs entirely counter to this trend: we will start with 1) an empirical approach that is, in essence, old-fashioned Bergey’s-style microbiology outfitted with state-of-the-art analytical chemistry.
Using the rich information we derive from empirical metabolic profiling, we will then 2) use genetics and biochemistry to identify the genes responsible for synthesizing the top 100, 3) use this information to devise a computational algorithm that can predict metabolic output directly from metagenomic sequence data, and 4) test our predictions using simple synthetic communities. Our data will create a rich metabolic map of the top 100, and will enable the construction of transplant-ready synthetic communities that produce custom cocktails of desired molecules (and do not produce undesired molecules).