SUMMARY
This project will complete a longitudinal data infrastructure covering most of the U.S. population from 1940 through 2020. The linked infrastructure currently includes data from the censuses of 1940 and 2000-2020 and will soon include data from the censuses of 1960-1990. The current project will incorporate the 1950 Census and enhance the 1940 Census. These new data will provide the baseline observations for most of today’s oldest Americans. When complete, the entire 1940-2020 infrastructure will serve as a massive multi-purpose resource enabling a wide range of new discoveries and applications. We will undertake three key tasks to accomplish this goal: (1) link 1950 Census respondents into the broader infrastructure, (2) use new techniques to improve the coverage and accuracy of the linkages currently available for 1940 Census respondents, and (3) use a restricted Social Security Administration file to add new information on respondents’ exact date of birth, county of birth for those born in the U.S., and date of death for the deceased. With our addition of linked cases from 1940 and 1950, population health researchers will be able to use these data to analyze the life-course trajectories of hundreds of millions of Americans over the past century. Researchers will be able to incorporate information on early-life and ancestral experiences (such as parental economic status, childhood environmental exposures, policy conditions, social institutions, and neighborhood characteristics) into investigations of the health, well-being, and mortality of Americans over their lives and across generations.