Whose fault is it anyway? SILC: Safe Integration of LLM-Generated Code
- Authors
Peisen Lin, Yuntong Zhang, Andreea Costea, and Abhik Roychoudhury
- Subjects
Computer Science - Software Engineering
- Abstract
In modern software development, multiple software components, often sourced from different contributors, including AI assistants, are combined to create a cohesive system. Although each component might be individually safe, their composition might not be. At the core of this issue is often a misalignment between the requirements and assumptions made by each component. Once such a misalignment is discovered, it is important to determine which component is accountable for addressing it and to prevent its recurrence. In this work we propose SILC, a framework for localising faults, i.e. assigning blame, and for assigning sanitization obligations to prevent memory issues resulting from the composition of multiple software components. In particular, we show the role Incorrectness Logic can play in automatically extracting implicit non-functional assumptions in auto-generated code and rendering them explicit, in order to detect misalignments with the requirements of existing code. In other words, we approach the problem of code comprehension from a perspective focused on safety properties rather than the traditional one centered on functionality. To this end, we enhance Incorrectness Separation Logic with capabilities for fault tracking and sanitization insertion. We demonstrate the benefits of this framework through experiments on millions of lines of code from open-source projects in which parts of the existing functionality are regenerated by AI assistants. We empirically show that AI assistants produce unsafe code and demonstrate the utility of our framework in proposing appropriate blame and sanitization obligations.
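The kind of assumption misalignment and sanitization obligation described above can be illustrated with a minimal, hypothetical C sketch (the function names and scenario are invented for illustration, not taken from the paper): a component whose implicit contract allows returning NULL on a missing key, composed with caller code that assumed a non-NULL result. The sanitization obligation assigned to the caller is the explicit NULL check.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical auto-generated lookup. Its implicit non-functional
 * assumption ("the key may be absent") only surfaces in its error
 * path: it returns NULL when the key is not found. Each component
 * is safe in isolation. */
static const char *lookup(const char **table, size_t n, const char *key) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i], key) == 0) {
            return table[i];
        }
    }
    return NULL; /* implicit contract: caller must handle a missing key */
}

/* Existing caller code assumed a non-NULL result; dereferencing the
 * result directly would be a memory error under composition. The
 * inserted NULL check is the sanitization obligation. */
static size_t key_length(const char **table, size_t n, const char *key) {
    const char *entry = lookup(table, n, key);
    if (entry == NULL) { /* inserted sanitization */
        return 0;
    }
    return strlen(entry); /* safe: entry is non-NULL on this path */
}
```

Removing the NULL check restores the misalignment: `lookup`'s error path violates the caller's assumption, and the resulting NULL dereference is exactly the class of memory issue the framework aims to blame and sanitize.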
- Published
2024