CRII: III: Towards Traceable, Affordable and Explainable Automated Feature Generation for Tabular Data

NSF Award Search · 01002425DB NSF RESEARCH & RELATED ACTIVIT · $174,811

Abstract

Features describe the characteristics of objects. For example, "age", "smoking or not", and "years of smoking" are features of a patient that can be used to describe the patient's physical condition and, furthermore, to predict whether she or he is likely to get lung cancer. A combination of features can be more helpful for prediction, e.g., "age" minus "years of smoking" yields a new feature indicating the age at which the patient started smoking. This kind of feature combination is called feature generation. In the big data era, the number of features is enormous, and it is not realistic for human experts to generate features manually. This project will build new technologies to automatically generate new features from existing ones, to better describe the objects and to achieve better prediction performance. Additionally, this project aims to substantially improve the traceability, affordability, and explainability of the generation process. The developed algorithms and tools are expected to generalize to a broad range of scientific and engineering problems, not just feature generation but also other domains such as data pre-processing, social analysis, intelligent transportation systems, healthcare, and the internet of things. This project identifies three research tasks: (i) A Reinforcement Learning (RL) based approach to realize traceability. Two RL agents are used to select appropriate features, and one RL agent is u
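The feature-generation example in the abstract ("age" minus "years of smoking") can be illustrated with a minimal Python sketch. This is not the project's method: the patient records, the `generate_feature` helper, and the name `age_started_smoking` are all hypothetical, introduced only to show what combining two existing features into a new one looks like.

```python
import operator
from typing import Callable, Dict, List

Row = Dict[str, float]


def generate_feature(
    rows: List[Row],
    new_name: str,
    left: str,
    right: str,
    op: Callable[[float, float], float],
) -> List[Row]:
    """Return copies of the rows with one extra feature, op(row[left], row[right]).

    Hypothetical helper for illustration; the input rows are left unmodified.
    """
    return [{**row, new_name: op(row[left], row[right])} for row in rows]


# Made-up patient records (assumed data, not taken from the award text).
patients = [
    {"age": 60, "years_of_smoking": 40},
    {"age": 45, "years_of_smoking": 10},
]

# "age" minus "years of smoking" gives the age at which smoking started.
augmented = generate_feature(
    patients, "age_started_smoking", "age", "years_of_smoking", operator.sub
)

for row in augmented:
    print(row)
```

Passing the operation as an argument makes it easy to try other combinations (for example `operator.truediv` for a ratio); an automated approach would search over such feature pairs and operations rather than relying on a human to pick them.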

Key facts

NSF award ID: 2550105
Awardee: Clemson University (SC)
SAM.gov UEI: H2BMNX7DSKU8
PI: Kunpeng Liu
Primary program: 01002425DB NSF RESEARCH & RELATED ACTIVIT
All programs: INFO INTEGRATION & INFORMATICS, CISE Research Initiation Initiative
Estimated total: $174,811
Funds obligated: $110,820
Transaction type: Standard Grant
Period: 09/01/2025 → 05/31/2027