STraceBERT: Source Code Retrieval using Semantic Application Traces

Claudio Spiess

doi:10.1145/3611643.3617852

What is it about?

Reverse engineering is a critical task in software engineering, yet it is challenging when facing adversarial artifacts, i.e., intentionally obfuscated software. We propose a novel approach that retrieves similar code by looking at the behavior of a program in terms of its core library calls. To do so, we train a custom BERT-style model that learns the semantics of application traces.
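
As a rough illustration of the retrieval idea (not the model from the paper), the sketch below embeds each trace of library calls as a vector and returns the known program whose trace is most similar to a query trace. The `embed` function here is a hypothetical stand-in that simply counts calls and adjacent call pairs; in the actual work a learned BERT-style encoder would take its place. The call names, corpus entries, and function names are all made up for the example.

```python
import math
from collections import Counter


def embed(trace):
    """Hypothetical stand-in for a learned encoder.

    Represents a trace as counts of single calls and adjacent call pairs.
    """
    pairs = [f"{a}>{b}" for a, b in zip(trace, trace[1:])]
    return Counter(list(trace) + pairs)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_trace, corpus, top_k=1):
    """Return the names of the top_k corpus entries most similar to the query."""
    query_vec = embed(query_trace)
    scored = sorted(
        ((cosine(query_vec, embed(trace)), name) for name, trace in corpus.items()),
        reverse=True,
    )
    return [name for _, name in scored[:top_k]]


# Made-up traces of library calls for two known programs.
corpus = {
    "copy_file": ["open", "read", "open", "write", "close", "close"],
    "http_get": ["socket", "connect", "send", "recv", "close"],
}

# Trace observed from an unknown (for example, obfuscated) program.
query = ["open", "read", "write", "close"]

print(retrieve(query, corpus))
```

Because the comparison uses only observed calls, it does not depend on identifier names or code layout, which is the intuition behind using behavior to find similar code in obfuscated software.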


Why is it important?

Using a language model to embed dynamic analysis artifacts for source code retrieval is a little-explored area. This work presents promising results that encourage further research in this direction.

Perspectives

I hope that this work encourages further investigation into how the behavior of applications, in terms of library and system calls, can be used for difficult tasks such as reverse engineering.
Claudio Spiess
University of California Davis

This page is a summary of: STraceBERT: Source Code Retrieval using Semantic Application Traces, November 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3611643.3617852.

Contributors

The following have contributed to this page

Claudio Spiess
University of California Davis
