DGA Detection Using Similarity-Preserving Bloom Encodings

Lasse Nitz; Avikarsha Mandal

doi:10.1145/3590777.3590795

What is it about?

We used an approach from the area of privacy-preserving record linkage to encode training samples for the machine learning-based detection of algorithmically generated domain names, which botnets use to enable communication. The evaluated approach preserves the similarity of data samples, as required, while allowing the encodings to be tuned with respect to the privacy-utility trade-off. We discuss the requirements of different machine learning scenarios as well as the privacy implications of this encoding approach for those scenarios. We further evaluated the encoding approach by training deep learning models on encodings generated with different parameter values and comparing their performance to that of a model trained on cleartext samples.
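To give a rough idea of what a similarity-preserving Bloom encoding can look like, the following is a minimal sketch in the style commonly used in privacy-preserving record linkage: a domain name is split into character q-grams, and each q-gram sets a few bits in a fixed-length bit array. The concrete choices here (bigrams, underscore padding, a filter length `m` of 1024 bits, `k = 4` hash functions derived from SHA-256 via double hashing, and the Dice coefficient as similarity measure) are illustrative assumptions and are not taken from the paper; the function names are hypothetical.

```python
import hashlib


def qgrams(name: str, q: int = 2) -> set[str]:
    """Split a domain name into overlapping character q-grams."""
    padded = f"_{name.lower()}_"  # padding marks the start and end (assumption)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def bloom_encode(name: str, m: int = 1024, k: int = 4, q: int = 2) -> frozenset[int]:
    """Return the positions of the 1-bits in an m-bit Bloom encoding of name."""
    bits = set()
    for gram in qgrams(name, q):
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:16], "big")
        # Forcing h2 to be odd yields k distinct positions when m is a power of two.
        h2 = int.from_bytes(digest[16:], "big") | 1
        for i in range(k):
            bits.add((h1 + i * h2) % m)  # double hashing: i-th hash of the q-gram
    return frozenset(bits)


def dice(a: frozenset[int], b: frozenset[int]) -> float:
    """Dice coefficient of two encodings, given as sets of 1-bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    base = bloom_encode("example.com")
    print(dice(base, bloom_encode("examples.com")))  # names sharing most bigrams
    print(dice(base, bloom_encode("xkqzvbnw.net")))  # names sharing no bigrams
```

Because names that share many q-grams also share many set bits, similar names receive similar encodings, which is the property a model trained on encodings relies on. In a sketch like this, `m`, `k`, and `q` are the kind of parameters that shift the balance between privacy and utility: a shorter filter or more hash functions cause more collisions, which obscures the input further but also blurs similarity.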

Why is it important?

Machine learning has become the go-to solution for many classification tasks. However, its use in scenarios involving sensitive training data, together with the rise of privacy regulations such as the GDPR, has led to concerns about the potential leakage of sensitive information. We contribute to a better understanding of privacy approaches for machine learning by evaluating an approach from the area of privacy-preserving record linkage in the cybersecurity use case of detecting algorithmically generated domains via deep learning. We hope that building bridges between these research areas helps to find innovative solutions for technical privacy protection.

This page is a summary of: DGA Detection Using Similarity-Preserving Bloom Encodings, June 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3590777.3590795.
Contributors

The following have contributed to this page

Lasse Nitz
Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Using an approach for data linkage to enhance privacy in a deep learning cybersecurity use case
