SaTC: CORE: Small: Improving the Security of Large Language Model-Assisted Coding

NSF Award Search · 01002627DB NSF RESEARCH & RELATED ACTIVIT · $450,000

Abstract

Large Language Model (LLM)-assisted coding, where an LLM automatically generates code based on a developer-specified prompt, is already popular and projected to grow, promising to bolster coding productivity while reducing software development time and effort. However, LLM-generated code can be insecure for a variety of reasons: for example, it may omit critical security checks or contain mistakes that adversaries can exploit. When this insecure code eludes scrutiny and makes its way into production systems, our software infrastructure is at risk.

This project advances the state of research and practice in LLM-assisted coding by using program analysis and verification to generate secure code. The project's novelty lies in introducing and using guardrails, named contexts, that define security properties and guide LLMs toward producing code that meets the corresponding security guarantees. The project's broader significance lies in empowering programmers to understand the security implications of LLM-assisted coding and, in turn, to write more secure code efficiently, helping fuel economic prosperity and increasing national security.

The project consists of three thrusts. The first thrust uses program analysis to bridge the gap between security properties and LLM code generation via contexts and LLM prompts. The second thrust constructs an iterative LLM-centered code generation approach with criteria designed to improve security in each iteration. The third thrust develops a minimization approach, designed to produce minimal code examples in situations where LLM generation fails or the LLM cannot converge toward producing secure code. These approaches are usable in a variety of other settings: scenarios where LLM-generated code needs to be rigorously or formally verified, settings where the LLM-produced code must meet certain specifications from the start, and the widely applicable technique of generating a minimal example when LLM generation fails, so the programmers can easily underst
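
To make the second and third thrusts concrete, the following is a minimal illustrative sketch, not the project's actual method. It assumes that a context can be modeled as a function returning a list of security violations, that the LLM is a plain prompt-to-code callable, and that minimization is done by greedy line removal (a simplified cousin of delta debugging). All names here (`no_eval_context`, `generate_secure`, `minimize`, `fake_llm`) are hypothetical and do not come from the award abstract.

```python
from typing import Callable, List, Optional

# Hypothetical modeling choices (assumptions, not from the abstract):
# a "context" is a check that returns violation messages for a piece of code,
# and a generator is any prompt -> code callable standing in for an LLM.
Check = Callable[[str], List[str]]
Generator = Callable[[str], str]


def no_eval_context(code: str) -> List[str]:
    """Toy security property: generated code must not call eval()."""
    return ["uses eval()"] if "eval(" in code else []


def generate_secure(prompt: str, generate: Generator, check: Check,
                    max_iters: int = 3) -> Optional[str]:
    """Regenerate until the check passes or the iteration budget runs out."""
    for _ in range(max_iters):
        code = generate(prompt)
        violations = check(code)
        if not violations:
            return code
        # Feed the violations back into the next prompt.
        prompt = f"{prompt}\nFix these issues: {'; '.join(violations)}"
    return None  # did not converge toward code that passes the check


def minimize(code: str, check: Check) -> str:
    """Greedily drop lines while the remaining code still violates the check."""
    lines = code.splitlines()
    i = 0
    while i < len(lines):
        candidate = lines[:i] + lines[i + 1:]
        if check("\n".join(candidate)):
            lines = candidate  # line i was not needed to reproduce the issue
        else:
            i += 1  # line i is needed; keep it and move on
    return "\n".join(lines)


def fake_llm(prompt: str) -> str:
    """Stub generator: returns safe code only once asked to fix eval()."""
    if "uses eval()" in prompt:
        return "def parse(s):\n    return int(s)"
    return "def parse(s):\n    x = s.strip()\n    return eval(x)"
```

In this sketch, `generate_secure("parse an int", fake_llm, no_eval_context)` converges on the second iteration, once the violation has been fed back into the prompt. If a generator never converges, `generate_secure` returns `None`, and `minimize` can then shrink the last insecure output to the lines that still trigger the violation. A real system would replace the substring check with program analysis or verification, as the abstract describes.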

Key facts

NSF award ID: 2453331
Awardee: New Jersey Institute of Technology (NJ)
SAM.gov UEI: SGBMHQ7VXNH5
PI: Zhihao Yao
Primary program: 01002627DB NSF RESEARCH & RELATED ACTIVIT
All programs: SaTC: Secure and Trustworthy Cyberspace, Artificial Intelligence (AI), Nat Security, Secure Border & Pub Safety, SMALL PROJECT
Estimated total: $450,000
Funds obligated: $450,000
Transaction type: Standard Grant
Period: 05/15/2026 → 04/30/2029