The life sciences are in the midst of a data revolution. Cheap and accurate genome sequencing is a reality, high-resolution imaging is becoming routine, and clinical data is increasingly stored in machine-readable formats. These breakthroughs have brought us to the threshold of a new era in biomedicine, one where the data sciences hold the potential to propel our understanding and treatment of human disease. Achieving this potential, however, will require creating software platforms that can support storing, sharing, and analyzing data at unlimited scale. In this application, we propose to address this unmet need by bringing together three groups (the University of Chicago, the Broad Institute, and the University of California at Santa Cruz), each with a strong track record of developing production-grade software platforms to support flagship scientific efforts, including the All of Us Cohort Program, the Genomic Data Commons (GDC) and its affiliated NCI Cloud Pilots program, and the Human Cell Atlas Data Coordination Platform (HCA DCP). Our goal is to align and integrate our individual efforts at building data platforms into a cohesive environment that can serve the needs of the NIH Data Commons and beyond. Because these platforms were each developed to fulfill differing use cases, there is currently far more complementarity than overlap between them. For example, Dr. Grossman has extensive expertise in running a hybrid cloud at scale to support the needs of the GDC; this provides cost benefits around data transport and egress that would be invaluable to the NIH Data Commons. Similarly, Dr. Philippakis has developed a cloud-based model of collaborative workspaces (FireCloud) and software for management of secondary data use restrictions (DUOS), and Dr. Paten has long been a leader in developing and implementing standardized APIs as part of GA4GH. It is this complementarity that motivates us to integrate our efforts.
In the sections below, we present our plans for creating the Commons Alliance Platform. In addition to having a unified technical vision for what is needed, we are also aligned around a core set of guiding principles: (1) Open-source: All the software we develop, from user interfaces down to cloud metal, is open-source. This includes not only the software that would be funded via this awarding mechanism, but also all software developed and deployed by our team. (2) Modular and interoperable: A design principle of all complex software undertakings is “separation of concerns,” i.e., the notion that there should be a clean division between architectural components, each encapsulated by well-defined interfaces. We are committed to building modular and interoperable software and, in doing so, encouraging the creation of an ecosystem around it. (3) Standards-driven: Our team is committed to creating and utilizing standardized APIs and data formats. We have been leaders in GA4GH since its founding, chairing various working ...