Project Summary/Abstract

This project will create a massive microdata resource comprising the entire population of the United States in 1950. The 1950 Census is ideal for research on aging: people who were young in 1950 can be linked to myriad sources describing their health and well-being from mid to later adulthood, allowing a prospective view of aging. By linking the 1950 Census to recent health surveys, administrative records, and the National Death Index, investigators can pursue prospective analyses of the impact of early life conditions (including socioeconomic status, parental education, local environment, and family structure) on later health and mortality. The database will cover the entire population with full geographic detail, providing contextual information on childhood neighborhood characteristics, labor-market conditions, and environmental conditions. The 1950 data will enable transformative research to uncover the effects of early life conditions on health and well-being in later life, including cognitive impairment. The database will make a permanent and substantial addition to the nation’s statistical infrastructure and will have far-reaching implications for research across the social and behavioral sciences. The project involves (1) transcribing 8.3 billion keystrokes of data describing the demographic and economic characteristics of all individuals, families, households, and group quarters present in the U.S.
in 1950; (2) evaluating data quality through random blind verification and comparison with published census tabulations; (3) converting approximately ten million different open-ended census responses into numeric classifications compatible with previous and subsequent census data; (4) data cleaning, including editing and imputation of inconsistent and missing data values; (5) developing metadata and documentation, including full descriptions of data processing methods, detailed analysis of comparability issues, and comprehensive machine-processable metadata; and (6) incorporating the database into the Integrated Public Use Microdata Series (IPUMS) data access systems for free dissemination to the scientific community. The proposed work will be carried out by a team of highly skilled researchers with unparalleled expertise and experience in large-scale data creation, integration, and dissemination. The project is a collaboration with the nation’s largest producer of genealogical data. This public-private partnership makes highly cost-effective use of scarce resources to build shared infrastructure for population and health research.