What is it about?

Deep neural networks (DNNs) have achieved remarkable success in a wide range of applications, such as image classification, face recognition, object detection, speech recognition, and machine translation. Despite this success, these models are vulnerable to adversarial attacks. Crafted by adding perturbations to benign inputs, adversarial examples (AEs) can fool DNNs into making wrong predictions, a critical threat in security-sensitive scenarios such as autonomous driving. This research aims to attack DNNs in an imperceptible way, where the perturbations applied to the image are difficult for humans to perceive. Our proposed Saliency Attack identifies the salient regions of an image and, within those regions, generates an interpretable and imperceptible perturbation that deceives DNNs with a high success rate.
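The core idea of confining the perturbation to salient regions can be illustrated with a minimal sketch. This is not the paper's actual algorithm: the function name, the L-infinity budget `eps`, and the hand-drawn square mask are all illustrative assumptions.

```python
import numpy as np

def apply_masked_perturbation(image, perturbation, saliency_mask, eps=8 / 255):
    """Confine a perturbation to salient regions and clip it to a small budget.

    image:          float array in [0, 1], shape (H, W, C)
    perturbation:   float array, same shape as image
    saliency_mask:  binary array, shape (H, W), 1 = salient region
    eps:            illustrative L-infinity budget (not from the paper)
    """
    # Zero out the perturbation everywhere except the salient regions.
    masked = perturbation * saliency_mask[..., None]
    # Keep the perturbation within the budget and the image within valid range.
    masked = np.clip(masked, -eps, eps)
    return np.clip(image + masked, 0.0, 1.0)


# Toy usage: a random image, a random perturbation, and a square "salient" region.
rng = np.random.default_rng(0)
image = rng.random((224, 224, 3)).astype(np.float32)
perturbation = rng.uniform(-1, 1, image.shape).astype(np.float32)
mask = np.zeros((224, 224), dtype=np.float32)
mask[60:160, 60:160] = 1.0  # pretend this square is the salient region
adversarial = apply_masked_perturbation(image, perturbation, mask)
```

Because the perturbation is zero outside the mask, only the regions a human would already attend to are modified, which is what keeps the result interpretable.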


Why is it important?

With the growing use of deep neural networks (DNNs), ensuring their robustness and safety has become increasingly important. For example, because small modifications to traffic signs can cause disastrous results for autonomous vehicles, it is essential to examine the robustness of DNNs against such attacks. This research focuses on the black-box setting, in which attackers can only observe the model's output; this is more practical, and more challenging, than the traditional white-box setting, where the model's internals are fully known. The Saliency Attack proposed in this research achieves a state-of-the-art attack success rate while keeping the perturbation imperceptible and interpretable. As a result, it can serve as a baseline for assessing the robustness of deep neural networks.
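To make the black-box setting concrete, the sketch below wraps a classifier behind a query-only interface: the attacker receives output scores, never gradients or weights, and each query counts against a budget. The `BlackBoxModel` class, the stand-in classifier, and the query budget are hypothetical illustrations, not the interface used in the paper.

```python
import numpy as np

class BlackBoxModel:
    """Illustrative black-box interface: the attacker sees only output scores.

    `predict_fn` stands in for any deployed classifier; gradients and weights
    are never exposed, and every query is counted against a budget.
    """

    def __init__(self, predict_fn, max_queries=10_000):
        self._predict_fn = predict_fn
        self.max_queries = max_queries
        self.query_count = 0

    def query(self, image):
        if self.query_count >= self.max_queries:
            raise RuntimeError("query budget exhausted")
        self.query_count += 1
        return self._predict_fn(image)  # e.g. class scores or just a label


# Toy usage with a stand-in classifier that returns random scores.
rng = np.random.default_rng(0)
model = BlackBoxModel(lambda img: rng.random(1000))
scores = model.query(rng.random((224, 224, 3)))
print(scores.argmax(), model.query_count)
```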

Read the Original

This page is a summary of: Saliency Attack: Towards Imperceptible Black-box Adversarial Attack, ACM Transactions on Intelligent Systems and Technology, February 2023, ACM (Association for Computing Machinery), DOI: 10.1145/3582563.
