Literature Survey on Adversarial Attacks and their defense

9 minute read


With the coming up of so many applications on which deep learning is proving to be impactful with high accuracies and precision. it is important to ensure it’s safety and security against adversarial attacks. It has been observed that deep neural networks are susceptible to adversarial attacks even in the form of small perturbations which are not conceivable by humans. My literature survey on this topic consists of the following papers and their details :

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

Before starting with the survey, the paper mentions some important definitions like :

  • Adversarial example / image
  • Adversarial perturbation
  • Adversarial training
  • Adversary
  • Black-box attacks
  • Detector
  • Fooling ratio / rate
  • One-shot/ one-step methods
  • Iterative methods
  • Quasi-imperceptible perturbations
  • Rectifier
  • Targeted attacks
  • Non-targeted attacks
  • Threat model
  • Transfer-ability
  • Universal perturbations
  • White-box attacks

The paper also describes adversarial attacks. The attacks have been classified into two : Attacks for classification and Attacks beyond classification/ recognition. The various attacks that are discussed as follows :

  • Attacks for classification :
    • Box-constrained L-BFGS

    • Fast Gradient Sign Method (FGSM) : The approach exploits the linearity of deep network models. They propose a linearity hypothesis also. The perturbations are restricted with l-infinity and l-2 norm.
    • Basic & Least-Likely-Class Iterative Methods : There is also a modification to this technique relating to one-step target class using Iterative Least-likely Class Method (ILCM) method.
    • Jacobian-based Saliency Map Attack (JSMA)
    • One Pixel Attack
    • Carlini and Wagner Attacks ( C&W)
    • Deep Fool
    • Universal Adversarial Perturbations
    • UPSET and ANGRI
    • Houdini
    • Adversarial Transformation Networks (ATNs)
    • Other miscellaneous attacks
  • Attacks beyond classification/ recognition :
    • Attacks on Auto-encoders and Generative Models

    • Attacks on Recurrent Neural Networks
    • Attacks on Deep Reinforcement Learning
    • Attacks on Semantic Segmentation and Object Detection

The paper further defines some attacks which have been successfully applied to real world problems. Some attacks that are mentioned are as follows :

  • Attacks on Face Attributes -> Fast Flipping Attribute technique.
  • Cell-phone camera attack
  • Road sign attack
  • Generic adversarial 3D objects
  • Cyberspace attacks
  • Robotic Vision & Visual QA attacks

The paper furthers talks about existence of adversarial examples. Then, paper explains about the defense strategies. The strategies are talked about broadly in three directions.

  • Modified training/ input
  • Modified networks : complete and detection only
  • Network add-on : complete and detection only

Various methods under modified training / input strategies are :

  • Brute-force adversarial training
  • Data compression as defense
  • Foveation based defense
  • Data randomization and other methods

Various methods under modifying the network strategies are :

  • Complete approaches :
    • Deep Contractive Networks
    • Gradient regularization/ masking
    • Defensive distillation
    • Biologically inspired protection
    • Parseval Networks
    • Deep Cloak
    • Other miscellaneous approaches
  • Detection only approaches :
    • SafetyNet
    • Detector subnetwork
    • Exploiting convolution filter statistics
    • Additional class augmentation

Various methods under network add-ons are :

  • Defense against universal perturbations.
  • GAN-based defense
  • Detection only approaches :
    • Feature squeezing
    • MagNet
    • Other miscellaneous methods

Hence, this survey shows that adversarial attacks are for real and the deep neural networks are vulnerable to the attacks. However, coming up with robust defense mechanism can help in evading these vulnerabilities.

Adversarial Examples: Attacks and Defenses for Deep Learning

This paper is also a survey based paper which highlights various aspects relating to adversarial attacks and its defense. The paper starts with introducing the Threat Model. This model limits the attack only at testing stage. Also, adversaries only aim at compromising integrity.

The paper also drives through adversarial examples and countermeasures in machine learning. The machine learning techniques states a box-constrained optimization problem. The attacks have been categorized into 3 axes : influence, security violation and specificity.

Like the previous paper, this paper also describes some taxonomy. The threat model is characterized based on following terms :

  • Adversarial Falsification
    • False positive
    • False negative
  • Adversary’s Knowledge
    • White-box
    • Black-box
  • Adversarial Specificity
    • Targeted attacks
    • Non-targeted attacks
  • Attack Frequency
    • One-time attacks
    • Iterative attacks

Similarly, perturbations are also defined in terms of :

  • Perturbation Scope
    • Individual
    • Universal
  • Perturbation Limitation
    • Optimized perturbation
    • Constraint perturbation
  • Perturbation Measurement
    • l-p norms
    • Psychometric perceptual adversarial similarity score (PASS)
  • Datasets considered :
    • MNIST
    • CIFAR-10
    • ImageNet
  • Victim Models :
    • LeNet
    • VGG
    • AlexNet
    • GoogleNet
    • CaffeNet
    • ResNet

Next, paper highlights the various methods for generating adversarial examples. Some of the examples that are covered in this paper are also covered in the previous paper too. The examples are stated as follows :

  • L-BFGS attack
  • Fast Gradient Sign Method
  • Basic Iterative Method and Iterative Least-Likely class
  • Jacobian-Based Saliency Map Attack
  • DeepFool
  • CPPN EA Fool
  • C & W’s attack
  • Zeroth-Order Optimization
  • Universal Perturbation
  • One-Pixel Attack
  • Feature Adversary
  • Hot / Cold
  • Natural GAN
  • Model-Based Ensembling Attack
  • Ground-Truth Attack

The various applications for adversarial examples are also discussed, which are stated as follows :

  • Reinforcement Learning
  • Generative Modeling
  • Face Recognition
  • Object Detection
  • Semantic Segmentation
  • NLP
  • Malware Detection

Some countermeasures for adversarial examples are also discussed. These are categorized into two : reactive and proactive.

Some methods for proactive countermeasures are :

  • Network distillation
  • Adversarial (re)training
  • Classifier robustifying

Some methods for reactive countermeasures are :

  • Adversarial detecting
  • Input Reconstruction
  • Network Verification
  • Ensembling Defenses

The challenges are also discussed. It states about the issues related to transferability, data in-completion, model capability, no robust model, etc: . Issues related to robustness evaluation are also discussed. Thus, all the aspects of adversarial attacks, there defense and issues are discussed as survey in this paper.

Is Deep Learning Safe for Robot Vision?Adversarial Examples against the iCub Humanoid

This paper aims to show the vulnerability of robot-vision system to adversarial examples. Also, it proposes a computationally-efficient countermeasure, inspired from work on classification with the reject option and open-set recognition.

The paper presents iCub Humanoid robot which exploits pre-trained ImageNet deep network for classification and recognition purposes. Moreover, adversarial example generation have been also discussed by dividing them into two classes : error-generic and error-specific, based on the following optimization :

enter image description here

enter image description here

enter image description here

The paper furthers moves to the classifier security to adversarial examples. The paper puts forth the problem of intrinsic vulnerability of the feature representations. Adversarial examples may quite reasonably end up in regions of training distribution while also successfully fooling detection. These samples are often referred to as blind-spot evasion samples. Thus, to avoid this issue, by introducing SVM with RBF kernels as Compact Abating Probability (CAP) models. The decision rule is modified as :

enter image description here

The conceptual representation of the idea is demonstrated as :

enter image description here

This performance of SVM-adv increases for low values dmax, as even if all testing images are only slightly modified in input space, they immediately become blind-spot adversarial examples, ending up in a region which is far from the rest of the data. As the input perturbation increases, such samples are gradually drifted inside a different class, becoming indistinguishable from the samples from the samples of such class.

Thus, the paper nicely provides the vulnerability towards the classification problem and suggests a defensive algorithm too.

Conditional Image Generation with PixelCNN Decoders

This paper proposes the idea of using PixelCNN and PixelRNN for conditional image generation. The basic idea of the architecture is to use auto-regressive connections to model images pixel by pixel. PixelRNN generally gives better performance, but PixelCNN are much faster to train. However, the model has blind spots which makes it vulnerable.

The paper also introduces Gated PixelCNN. PixelCNNs model the joint distribution of pixels over an image x as the following product of conditional distributions, where x_i is as single pixel.

enter image description here

The new architecture that they propose, combines the advantages of both PixelCNN and PixelRNN by modelling the gated activation units as :

enter image description here

The combined architecture is given as :

enter image description here

The Conditional PixelCNN has the following distribution and activation :

enter image description here

enter image description here

Because conditional PixelCNNs have the capacity to model diverse, multi modal image distributions, it is possible to apply them as image decoders in existing neural architectures such as auto encoders.

  • Datasets : ImageNet, CIFAR, Flickr

  • Results :

enter image description here

enter image description here


This paper proposes the idea of purifying the adversarial image to make it fall into the uniform training distribution. They make a hypothesis that the adversarial examples largely lie in the low probability regions of the distribution that generated the data used to train the model.

The paper tells about some attacking methods like Random perturbation (RAND), Fast gradient sign method (FGSM), Basic iterative method (BIM), DeepFool, Carlini-Wagner (CW), etc:

The various defense methods are also mentioned like : Adversarial training, Label smoothing and Feature squeezing.

  • Datasets : Fashion MNIST, CIFAR-10
  • Models : ResNet, VGG
  • PixelCNN is also mentioned.

Adversarial examples are detected using p-values :

enter image description here

enter image description here

where X’ is a test image, while others are training images.

The paper uses the following optimization to return images to the training distribution :

enter image description here

The paper also produces another variant named as adaptive defend which will modify only the images which do not lie in the training distribution.

Application of deep learning to cybersecurity : A survey

The paper starts with addressing the various deep network architectures. For eg : CNN, DBN, RBM, SAE, AE, RNN, LSTM, GAN, etc:

Conceptual DL framework for cybersecurity applications has been presented.

enter image description here

The paper tells about the main branches of applying DL techniques to cybersecurity :

enter image description here

The paper also summarized various PC-based malware detection and analysis :

enter image description here enter image description hereenter image description here

The paper also summarizes Android-based malware detection :

enter image description hereenter image description here

Intrusion detection also has been summarized :

enter image description hereenter image description here

There are other attacks and defense too which are mentioned as phishing detection, spam detection, website defacement detection, etc:.

There are various analysis based on focus area, methodology, generative architectures, discriminative architectures, model applicability, and feature granularity.