Fast Feature Fool - A data independent approach to universal adversarial perturbations

Konda Reddy Mopuri*, Utsav Garg*, R. Venkatesh Babu (*=equal contribution)


Abstract

State-of-the-art object recognition Convolutional Neural Networks (CNNs) are shown to be fooled by image-agnostic perturbations, called universal adversarial perturbations. It is also observed that these perturbations generalize across multiple networks trained on the same target data. However, existing algorithms for crafting them require the training data on which the CNNs were trained and compute adversarial perturbations via complex optimization. The fooling performance of these approaches is directly proportional to the amount of available training data. This makes them unsuitable for practical attacks, since it is unreasonable for an attacker to have access to the training data. In this paper, for the first time, we propose a novel data-independent approach to generate image-agnostic perturbations for a range of CNNs trained for object recognition. We further show that these perturbations are transferable across multiple network architectures trained either on the same or different data. In the absence of data, our method generates universal perturbations efficiently by fooling the features learned at multiple layers, thereby causing CNNs to misclassify. Experiments demonstrate impressive fooling rates and surprising transferability for the proposed universal perturbations generated without any training data.
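The abstract does not spell out the objective, so the following is only a minimal sketch of one way "fooling the features learned at multiple layers" without data could look. It assumes the perturbation alone is fed to the network, that the goal is to raise the mean activation at every layer (here, the sum of log mean activations), and that the perturbation is kept inside an L-infinity ball of radius `xi`. A tiny random dense ReLU network stands in for a CNN; the function names, `xi`, `lr`, and `steps` are all hypothetical and not taken from the paper or its code.

```python
import numpy as np

def layer_activations(delta, weights, biases):
    """Forward pass of a toy ReLU network; returns pre- and post-activations per layer."""
    pre, post = [], []
    h = delta
    for W, b in zip(weights, biases):
        z = W @ h + b
        h = np.maximum(z, 0.0)
        pre.append(z)
        post.append(h)
    return pre, post

def objective_and_grad(delta, weights, biases, eps=1e-8):
    """Sum over layers of log(mean activation), plus its gradient w.r.t. delta."""
    pre, post = layer_activations(delta, weights, biases)
    value = sum(np.log(h.mean() + eps) for h in post)
    grad_h = np.zeros_like(post[-1])
    for i in reversed(range(len(weights))):
        h = post[i]
        # Contribution of this layer's own log-mean term.
        grad_h = grad_h + 1.0 / (h.size * (h.mean() + eps))
        # Backpropagate through the ReLU, then through the linear map.
        grad_z = grad_h * (pre[i] > 0)
        grad_h = weights[i].T @ grad_z
    return value, grad_h

def craft_perturbation(weights, biases, dim, xi=10.0, lr=0.5, steps=200, seed=0):
    """Projected gradient ascent on the activation objective; no input data is used."""
    rng = np.random.default_rng(seed)
    delta = rng.uniform(-xi, xi, size=dim)  # assumed random initialization
    best_delta = delta.copy()
    best_value, _ = objective_and_grad(delta, weights, biases)
    for _ in range(steps):
        _, grad = objective_and_grad(delta, weights, biases)
        # Ascent step, then projection back onto the L-infinity ball.
        delta = np.clip(delta + lr * grad, -xi, xi)
        value, _ = objective_and_grad(delta, weights, biases)
        if value > best_value:
            best_value, best_delta = value, delta.copy()
    return best_delta, best_value

# Toy stand-in for a trained network: two random dense layers.
net_rng = np.random.default_rng(42)
toy_weights = [net_rng.normal(size=(16, 8)), net_rng.normal(size=(4, 16))]
toy_biases = [np.full(16, 0.1), np.full(4, 0.1)]
toy_delta, toy_value = craft_perturbation(toy_weights, toy_biases, dim=8)
```

In this sketch the same `toy_delta` would be added to every input image, which is what makes the perturbation image-agnostic; the clipping step is the only place where its magnitude is constrained.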

Sample perturbations

Reference

@article{mopuri2017fast,
  title={Fast Feature Fool: A data independent approach to universal adversarial perturbations},
  author={Mopuri, Konda Reddy and Garg, Utsav and Babu, R Venkatesh},
  journal={arXiv preprint arXiv:1707.05572},
  year={2017}
}