A Study on Mixup-Inspired Augmentation Methods for Software Vulnerability Detection

Seyed Shayan Daneshvar; Da Tan; Shaowei Wang; Carson Leung

doi:10.1145/3756681.3757017

What is it about?

This paper examines the effect of representation-level augmentation methods, such as mixup and its variants, on software vulnerability detection. It also proposes a masked variant intended to increase effectiveness and reduce the loss of important information. The results show that these methods are not as effective as simple random oversampling of the vulnerable samples, although they do reach performance on par with state-of-the-art (SOTA) complex generative methods.
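To make the idea of representation-level mixup concrete, here is a minimal sketch of the standard mixup formulation applied to embedding vectors rather than to raw source code. It is an illustration only, not the paper's implementation: the function name `mixup_embeddings`, the `alpha` default, and the use of NumPy arrays are assumptions, and the masked variant proposed in the paper is not shown because this summary does not describe its details.

```python
import numpy as np


def mixup_embeddings(x, y, alpha=0.2, rng=None):
    """Standard mixup on a batch of representation vectors (illustrative).

    x: float array of shape (batch, dim), e.g. code embeddings.
    y: float array of shape (batch,), 1.0 = vulnerable, 0.0 = non-vulnerable.
    alpha: Beta distribution parameter (hypothetical default).
    """
    rng = np.random.default_rng() if rng is None else rng
    # One mixing coefficient per batch, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)
    # Pair every sample with a randomly chosen partner from the same batch.
    perm = rng.permutation(len(x))
    # Convex combination of the representations and of the labels.
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y + (1.0 - lam) * y[perm]
    return x_mixed, y_mixed, lam
```

Each mixed sample is a weighted blend of two existing representations, and its label is the same blend of their labels, so a mix of a vulnerable and a non-vulnerable sample receives a soft label between 0 and 1.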


Why is it important?

It shows that generative methods do not necessarily outperform more basic ways of creating new data points, and that random oversampling may be the more useful option for dealing with the shortage of vulnerable samples.
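For comparison, random oversampling of the vulnerable class can be sketched in a few lines. This is a generic illustration under stated assumptions, not the paper's experimental setup: the function name `random_oversample`, the binary label convention (1 = vulnerable), and the choice to balance the classes exactly are all hypothetical.

```python
import numpy as np


def random_oversample(x, y, rng=None):
    """Duplicate randomly chosen vulnerable samples until classes balance.

    x: array of shape (n, dim); y: integer array of shape (n,), 1 = vulnerable.
    Balancing to a 1:1 ratio is an assumption made for this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    # Nothing to do if there are no vulnerable samples or no shortage.
    if len(pos) == 0 or len(pos) >= len(neg):
        return x, y
    # Draw the missing number of vulnerable samples with replacement.
    extra = rng.choice(pos, size=len(neg) - len(pos), replace=True)
    idx = np.concatenate([np.arange(len(y)), extra])
    return x[idx], y[idx]
```

Unlike mixup or generative approaches, this creates no new data points: it only repeats existing vulnerable samples so that the classifier sees them more often during training.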

Perspectives

This is an interesting finding that complements the results of the VulScriber paper. It shows that non-LLM generative models are not as effective as simpler techniques for dealing with the data shortage problem in vulnerability datasets.
Seyed Shayan Daneshvar
University of Manitoba

This page is a summary of: A Study on Mixup-Inspired Augmentation Methods for Software Vulnerability Detection, June 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3756681.3757017.

Contributors

The following have contributed to this page

Seyed Shayan Daneshvar
University of Manitoba

Data augmentation for Vulnerability Detection via Mixup on code representation
