PROJECT SUMMARY
Biomedical investigators at Baylor College of Medicine (BCM) increasingly depend on high-performance computing (HPC) clusters for basic and integrative analysis of sequence and other high-dimensional data. The Biostatistics and Informatics Shared Resource (BISR), a Shared Resource in the College’s Advanced Technology Cores, currently manages a Beowulf-style cluster as a service for computational investigators. This cluster is heavily used but is aging and lacks the high-memory nodes needed for efficient, timely processing of single-cell and single-nucleus sequencing experiments, which typically require 100-200 GB of memory per processor. In some cases, analyses simply cannot be run. Although there are other HPC capabilities at BCM, for example in the Human Genome Sequencing Center or within individual labs, as well as other HPC resources in the region, none of these offers a satisfactory solution for our users. Internal BCM-based systems are not designed for high-memory jobs. None are open to general users, and none are operated as a shared resource that ensures consistent uptime, high-speed network connections, mountable storage, and data protections that comply with regulatory requirements. External resources are either unavailable to general users outside the owner institution or expressly designed for certain types of jobs, with usage limits that preclude the types of runs our users need. The new BISR HPC will fill a unique niche by providing high-memory HPC capabilities, as a formally managed shared resource, to BCM biomedical investigators. In addition, we do not simply provide raw CPU hours to computationally expert users who need no help. We assist investigators who straddle wet-lab and dry-lab research by offering central software management and troubleshooting.
The full potential of a recently acquired, S10-supported ultra-high-throughput NovaSeq 6000 sequencer and a recently CPRIT-funded single-cell sequencing Core may not be realized without this computational support. We propose to build a new high-memory, GPU-enabled system specifically designed to support the growing needs of investigators conducting large single-cell and/or single-nucleus sequencing experiments. Typical experiments involve sequences from hundreds to tens of thousands of cells per biologic unit and tens to thousands of biologic units. These experiments represent hundreds of thousands of genomic, transcriptomic, and/or epigenomic sequences that must be processed, aligned, and integrated. The proposed system will include a front-end node, 22 compute nodes, each with 36 processors and 1 TB of memory, 1 GPU server with 8 GPUs, and 1 PB of direct-attached storage. Major Users and their projects will account for about 82% of usage. Demand for single-cell sequencing is growing, and we anticipate numerous additional users. Availability of this HPC will...