LLMs for Software Vulnerability Detection: Holy Grail, Pandora's Box, or just a Fad?
Talk by Gianluca Stringhini
Abstract: In this talk, I will discuss the results of our investigation into the ability of Large Language Models (LLMs) to detect software vulnerabilities. I will show that while LLMs are promising, they come with several pitfalls, such as non-determinism in their outputs and unfaithful reasoning. I will then talk about how we can manage these pitfalls to create an agentic framework that is able to take CVEs of known vulnerabilities, set up a working environment with the vulnerable software, and produce a verifiable exploit against the vulnerability. This framework allows the research community to produce benchmarks of known vulnerabilities and their exploits, which are needed to test new defenses. Unfortunately, our framework also shows that automated generation of exploits using LLMs is a real threat that could be weaponized by attackers. I will conclude by discussing the future challenges and opportunities of LLMs for vulnerability detection.
Bio: Gianluca Stringhini is an Associate Professor in the Electrical and Computer Engineering Department at Boston University, holding affiliate appointments in the Computer Science Department and in the Faculty of Computing and Data Sciences. In his research, Gianluca applies a data-driven approach to better understand malicious activity on the Internet. Through the collection and analysis of large-scale datasets, he develops novel and robust mitigation techniques to make the Internet a safer place. Over the years, Gianluca has worked on understanding and mitigating malicious activities like malware, software vulnerabilities, online fraud, influence operations, and coordinated online harassment. He has received multiple prizes, including an NSF CAREER Award in 2020, and his research has won multiple Best Paper Awards. Gianluca has published over 150 peer-reviewed papers, including several in top computer security conferences like IEEE Security and Privacy, CCS, NDSS, and USENIX Security, as well as top measurement, HCI, and Web conferences such as IMC, ICWSM, CHI, CSCW, and WWW.