Artificial Intelligence is becoming more powerful than ever: GitHub Copilot suggests code to developers, and Large Language Model (LLM) plugins will soon assist with many tasks in our daily lives. We can harness the power of AI to solve security problems, but the resulting solutions need to be robust against new attacks and new vulnerabilities.
In this talk, I will first discuss how to develop robust AI techniques for malware detection. Our research finds that, after training an Android malware classifier on one year's worth of data, the F1 score on new test samples quickly dropped from 0.99 to 0.76 after 6 months of deployment. I will present new methods to make machine learning for Android malware detection more effective against data distribution shift. My vision is that continuous learning with a human-in-the-loop setup can achieve robust malware detection. Our results show that, to maintain a steady F1 score over time, we can achieve an 8X reduction in the labels needed from security analysts. Next, I will discuss the potential of using large language models to solve security problems, using vulnerable source code detection as a case study. We propose and release a new vulnerable source code dataset, DiverseVul. Using the new dataset, we study 11 model architectures belonging to 4 families for vulnerability detection. Our results indicate that developing code-specific pre-training tasks is a promising research direction for using LLMs for security. We also demonstrate an important generalization challenge for the deployment of deep learning-based models. In closing, I will discuss security issues of LLMs and future research directions.
Yizheng Chen is an Assistant Professor of Computer Science at the University of Maryland. She works at the intersection of AI and security. Her research focuses on AI for security and the robustness of AI models. Previously, she received her Ph.D. in Computer Science from the Georgia Institute of Technology, and was a postdoc at the University of California, Berkeley and Columbia University. Her work has received an ACM CCS Best Paper Award Runner-up and a Google ASPIRE Award. She is a recipient of the Anita Borg Memorial Scholarship.