Boss: A cloud-based data archive for electron microscopy and x-ray microtomography

NIH RePORTER · NIH · R24 · $700,763

Abstract

Due to recent technological advances, it is now possible to image the high-resolution structure of brain volumes at spatial extents much larger than was previously possible. Emerging X-ray microtomography (XRM) methods allow whole mouse brains to be imaged in a high-throughput paradigm, generating sub-micron three-dimensional image volumes in less than a day without the alignment challenges or tissue-clearing approaches of other methods. Similarly, electron microscopy (EM) efforts now routinely exceed 100 terabytes in scale, and projects are underway to map cubic millimeters of brain regions, resulting in petabytes of image and annotation data. Both methods are widely used throughout the BRAIN Initiative and the broader neuroscience community, and the instruments required to collect these datasets are becoming more common and higher-throughput, increasing the need for data storage and archival solutions that can accommodate larger datasets generated at an increasing pace. Finally, sample preparation for XRM and EM is compatible and amenable to co-registration, and work is underway to pursue multimodal experiments; new instruments can now perform both XRM and EM data collection on a single sample. Existing paradigms for data storage and access are often insufficient to accommodate the storage, processing, and dissemination needed to fully exploit the generated data. At this scale, traditional analysis approaches are often ineffective; for example, it is difficult for a human to view all of the data collected or to manually annotate more than a small fraction of the volume. Contemporary analysis approaches leveraging automated methods require robust and efficient access to data, which can be challenging when managing massive datasets spread across many files.
Without a standard data storage mechanism, data access is cumbersome, storage is expensive and can lack sufficient durability, metadata is unreliable or unavailable and may not be attributable in useful ways, and file formats and organization often differ across laboratories, resulting in a high barrier to collaboration and sharing. Thus, we propose the Block and Object Storage Service Database (bossDB) to deliver a high-performance, cost-efficient data archive by utilizing a cloud-based tiered storage architecture, in which data is seamlessly migrated between low-cost, durable object storage (i.e., S3) and a fast in-memory spatial data store. This system will be developed through an agile process that actively involves community stakeholders in regular reviews and continuous opportunities for design input, and will provide and support integration of a robust suite of user-facing tools that are vital to foster community adoption, such as a web-based management console and visualization tool, a Python SDK for programmatic access, and a client to facilitate...

Key facts

NIH application ID: 9928113
Project number: 5R24MH114785-03
Recipient: JOHNS HOPKINS UNIVERSITY
Principal Investigator: BROCK A. WESTER
Activity code: R24
Funding institute: NIH
Fiscal year: 2020
Award amount: $700,763
Award type: 5
Project period: 2018-08-24 → 2023-05-31