# SciDAP: Scientific Data Analysis Platform

> **NIH R42** · DATIRIUM, LLC · 2022 · $802,500

## Abstract

The recent proliferation of next-generation sequencing (NGS)-based methods for the analysis of expression,
chromatin, and protein-DNA interactions has created tremendous opportunities for gaining insights into biology,
health, and disease. However, analysis of the data requires computational expertise that many biologists do not
possess. Hence, when dealing with genomics data, the majority of biologists require the help of bioinformaticians
even for simple tasks. This places these exciting methods beyond the reach of the majority of life scientists.

This Phase II proposal from DATIRIUM, LLC, a start-up from Cincinnati, OH, follows a Phase I project that
resulted in the development of a prototype (MVP) of SciDAP (Scientific Data Analysis Platform), a novel,
user-friendly multi-omics data analysis platform that allows biologists to analyze the data and enables collaboration
with bioinformaticians. The current Phase II proposal describes a plan to continue SciDAP development.

The key problem in creating user-friendly data analysis packages is the difficulty of adding new pipelines or
modifying existing ones: due to the tight coupling between pipeline and user interface, this requires changes at all
levels of the software. Unfortunately, the same limitation exists for all user-friendly bioinformatics tools. Given
that there are more than 150 NGS-based methods and many ways to process the data, this explains why a universal
and user-friendly data analysis platform does not yet exist.

We hypothesized that we can create a data analysis platform that is both universal and user-friendly by
including interface instructions in computational pipelines. The platform will use these instructions to create a
graphical interface. Specifically, we are using containerized pipelines developed using the Common Workflow
Language (CWL), making our pipelines both portable and reproducible. On top of CWL, Datirium developed a
system of CWL extensions that allow the inputs and the output visualizations to be described within the CWL
workflows. Importantly, our platform will increase the rigor of computational analysis by (i) making the analysis
reproducible and auditable by bioinformaticians due to CWL pipeline portability and recording each step of the
analysis as Research Objects; (ii) enabling collaboration between experimentalists and computational biologists
by providing bioinformaticians with a way to direct analysis flow and biologists with the convenience of a GUI; (iii)
including out-of-the-box pipelines with optimized parameters and actionable QC metrics that flag possible issues.
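
The idea of deriving a graphical interface from instructions stored alongside the pipeline can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `PIPELINE` dictionary, the `ui` key, the widget names, and the `build_form_fields` helper are hypothetical and do not reproduce the actual SciDAP CWL extension syntax, which the abstract does not specify.

```python
from typing import Any

# Hypothetical, CWL-like pipeline description. The "ui" key on each input is
# an illustrative stand-in for interface instructions; it is NOT the real
# SciDAP extension format.
PIPELINE: dict[str, Any] = {
    "id": "example-pipeline",
    "inputs": {
        "fastq_file": {
            "type": "File",
            "ui": {"label": "FASTQ file", "widget": "file_picker"},
        },
        "threads": {
            "type": "int",
            "default": 4,
            "ui": {"label": "Threads", "widget": "number"},
        },
        # No interface hint: the form builder falls back to defaults.
        "genome_index": {"type": "Directory"},
    },
}

# Fallback widget for each input type when no hint is given (assumed names).
DEFAULT_WIDGETS = {
    "File": "file_picker",
    "Directory": "directory_picker",
    "int": "number",
    "string": "text",
}


def build_form_fields(pipeline: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn pipeline inputs into form-field descriptors for a GUI layer."""
    fields = []
    for name, spec in pipeline["inputs"].items():
        hint = spec.get("ui", {})
        fields.append(
            {
                "name": name,
                "label": hint.get("label", name),
                "widget": hint.get(
                    "widget", DEFAULT_WIDGETS.get(spec["type"], "text")
                ),
                # An input without a default must be supplied by the user.
                "required": "default" not in spec,
            }
        )
    return fields


if __name__ == "__main__":
    for field in build_form_fields(PIPELINE):
        print(field)
```

In this sketch, adding a new pipeline means writing a new description only; the form-building code stays unchanged, which is the decoupling of pipeline and interface that the paragraph above describes.
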

In the first aim of this proposal, we will develop a version of SciDAP for use on academic clusters and
commercial clouds. In the second aim, in collaboration with Dr. Salomonis at CCHMC, we will adopt pipelines
for miRNA, WGS/WXS, and scMultiome data analysis. In the third aim, we will develop improvements to the SciDAP
interface that will increase SciDAP flexibility and usability for bioinformaticians and experimentalists....

## Key facts

- **NIH application ID:** 10484046
- **Project number:** 2R42HG011219-02
- **Recipient organization:** DATIRIUM, LLC
- **Principal Investigator:** Artem Barski
- **Activity code:** R42
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $802,500
- **Award type:** 2
- **Project period:** 2020-09-14 → 2024-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10484046

## Citation

> US National Institutes of Health, RePORTER application 10484046, SciDAP: Scientific Data Analysis Platform (2R42HG011219-02). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10484046. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
