Project Abstract
The first postnatal years are an exceptionally dynamic and critical period of structural and functional development of the human brain. Many neurodevelopmental disorders are the consequence of abnormal brain development during this stage. Several NIH-funded studies have recently acquired and released large-scale infant brain MRI datasets in the National Institute of Mental Health Data Archive (NDA), yielding over 3,000 publicly available infant MRI scans from multiple imaging sites. Joint analysis of these large infant brain datasets will undoubtedly improve our limited understanding of normative early brain development and neurodevelopmental disorders, with greater statistical power and reproducibility. However, processed and harmonized versions of these multi-site infant MR images are still not publicly available, owing to the challenges in processing and analyzing infant MR images, which typically exhibit extremely low tissue contrast, large within-tissue intensity variations, and regionally heterogeneous dynamic changes. To address this critical issue, the goal of this project is to comprehensively process, harmonize, discover and archive large-scale, multi-site public infant MRI datasets to significantly advance early brain development studies, by taking advantage of our infant-tailored computational tools and further developing advanced machine learning techniques. In Aim 1, we will extensively process large-scale infant MRI datasets by adopting our established and recently improved infant-dedicated cortical surface-based computational tools, and we will further develop a deep spherical neural network for quality control of the resulting cortical property maps. This will lead to quality-ensured vertex-wise maps of multiple biologically distinct cortical properties, e.g., cortical thickness, surface area, myelin content, sulcal depth, local gyrification, curvature and diffusivity.
In Aim 2, to remove site effects associated with different scanners and imaging protocols while preserving biological associations, we will harmonize the cortical property maps computed from the multi-site data in Aim 1 by leveraging our surface-to-surface cycle-consistent generative adversarial networks (S2SGAN) based on the spherical U-Net, without requiring traveling subjects (paired data) across sites. To further increase efficiency and learn more robust feature representations across the entire multi-site data, we propose to extend S2SGAN to jointly harmonize all multi-site cortical property maps using a single generator. In Aim 3, leveraging the informative growth patterns and gradient information of the harmonized maps of multiple cortical properties from Aim 2, we will discover distinct cortical regions in a data-driven manner by capitalizing on multi-view nonnegative matrix factorization, without making any assumption about the parametric forms of growth patterns. All our processed data, results, computational tools, and source code will be deposited into N...