Current lung cancer screening eligibility guidelines were developed in a civilian population and miss the majority of Veterans who develop lung cancer. The guidelines include heavy smokers aged 50 to 80 years with a history of 20 or more pack-years who either currently smoke or quit within the last 15 years. These criteria capture only 20-35% of lung cancers in the civilian population and in Veterans. Furthermore, Veterans suffer from lung cancer at higher rates than the rest of the United States population, smoke more, and have unique exposures to known causes of lung cancer, including Agent Orange, asbestos, diesel fumes, ionizing radiation, and Open Burn Pit hydrocarbons. Veterans also have additional risk factors for lung cancer, such as race, low socioeconomic status, previous history of cancer, HIV, rheumatoid arthritis, and chronic obstructive pulmonary disease (COPD), each of which has been shown to increase lung cancer risk. Other population-specific models effectively identify at-risk subgroups who may benefit from screening, but none of these models has been validated in Veterans and none considers Veterans’ unique risks. A personalized, Veteran-specific model that adds service-related lung cancer risks and identifies high-risk groups that may benefit from lung cancer screening is needed. The objective of this proposal is to combine general population and Veteran-specific lung cancer risk factors into a Veteran-specific lung cancer screening eligibility model. Our overall hypothesis is that service histories and novel risk factors can be used in a Veteran-specific lung cancer risk model to broaden the population who may benefit from lung cancer screening. This effort to improve Veterans’ health through the early detection of lung cancer with screening has two aims. In Aim 1 we will define and discover novel phenotypes associated with increased lung cancer risk in Veterans that include longitudinal clinical and military service-specific exposures.
We will generate a comprehensive, longitudinal set of lung cancer risk factors from all Veterans who have received care at a VA facility in the last decade. We will use linked Department of Defense service and VA Electronic Health Record (EHR) data to identify service-related exposures and lung cancer risk factors. Using artificial intelligence, we will mine unstructured text data from clinical notes and radiological reports to discover novel data patterns (phenotypes) that help predict future lung cancer diagnosis. We hypothesize that we will accurately determine risk variables used in current eligibility models and discover a set of novel Veteran-specific phenotypes associated with lung cancer risk. In Aim 2 we will build a Veteran-specific lung cancer screening model and compare it to existing screening eligibility criteria and models. We will use a combination of standard lung cancer risk variables, military service-specific risk factors and newly discovered EHR lung cancer risk phenotypes...
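
For illustration only, the current eligibility criteria summarized at the start of this abstract (age 50 to 80, 20 or more pack-years, currently smoking or quit within the last 15 years) can be written as a simple rule. The sketch below is not part of the proposed Veteran-specific model. The class and function names are hypothetical, pack-years are computed with the standard definition (packs per day multiplied by years smoked), and treating the 15-year quit window as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SmokingHistory:
    """Hypothetical record holding only the fields the current criteria use."""
    age: int
    packs_per_day: float
    years_smoked: float
    # None means the person currently smokes.
    years_since_quit: Optional[float] = None


def pack_years(history: SmokingHistory) -> float:
    """Standard definition: packs smoked per day times years smoked."""
    return history.packs_per_day * history.years_smoked


def meets_current_criteria(history: SmokingHistory) -> bool:
    """Apply the age, pack-year, and quit-time rule stated in the abstract."""
    in_age_range = 50 <= history.age <= 80
    heavy_exposure = pack_years(history) >= 20
    # Assumption: "within the last 15 years" is treated as inclusive.
    smokes_or_quit_recently = (
        history.years_since_quit is None or history.years_since_quit <= 15
    )
    return in_age_range and heavy_exposure and smokes_or_quit_recently


# A person who smoked one pack a day for 25 years but quit 18 years ago
# falls outside the rule regardless of any service-related exposure.
former_smoker = SmokingHistory(age=62, packs_per_day=1.0, years_smoked=25,
                               years_since_quit=18)
current_smoker = SmokingHistory(age=62, packs_per_day=1.0, years_smoked=25)
print(meets_current_criteria(former_smoker))
print(meets_current_criteria(current_smoker))
```

The rule takes only age and smoking history as inputs, so service-related exposures and the other risk factors named above cannot change its result.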