COMSW4182 Homework 3


Problem: Adversarial Examples

Adversarial examples are deliberately crafted by attackers to trick machine learning models. These malicious examples are generated by adding small perturbations that are imperceptible to human beings yet extremely effective at influencing machine learning models. In general, adversarial examples are computed by optimizing a given input towards a specified objective. If the objective is to decrease the likelihood of the correct label, the adversarial example is untargeted (i.e., the predicted label can be any wrong label); if the objective is to increase the likelihood of a specified wrong label, the adversarial example is targeted (i.e., the predicted label is the specified wrong label).

In this assignment, you will learn two methods to generate adversarial examples: the fast gradient sign method (FGSM) and projected gradient descent (PGD).

      • FGSM was the first approach proposed for computing adversarial examples. It assumes a linear approximation of the ML model around the point x and defines the adversarial example as follows:

        x^adv = x + ε · sign(∇_x J(x, θ, y))    (0.1)

        Here, J represents the objective function, θ represents the parameters of the neural network, x represents the input, and y represents the corresponding label. sign() takes the sign of the gradient at each dimension, and ε defines the magnitude of the perturbation. The adversarial example is computed by adding the sign of the gradient, scaled by ε, to the input.

      • PGD is essentially an iterative FGSM method with a smaller step size. Further, it projects the perturbation back into the norm bound after each step. Its update rule is defined as:

        X^adv_0 = X,   X^adv_{N+1} = P_{X,ε}(X^adv_N + ε · sign(∇_x J(X^adv_N, θ, y)))    (0.2)

        X^adv_N denotes the value of input X at the N-th iteration, and P_{X,ε} denotes the projection of the (N+1)-th iterate back into the norm bound around X. The adversarial input is initialized to the original input and goes through N iterations of FGSM steps followed by norm-bound projection.
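
To make equations (0.1) and (0.2) concrete, here is a minimal PyTorch sketch of both updates, assuming an untargeted attack, cross-entropy as the objective J, pixel values in [0, 1], and an l∞ norm bound for the PGD projection. The helper names below (fgsm, pgd, model) are illustrative only; in the assignment itself you will generate the attacks with CleverHans rather than your own implementation.

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps):
        """One-step FGSM (eq. 0.1): x^adv = x + eps * sign(grad_x J(x, theta, y))."""
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)        # J(x, theta, y)
        loss.backward()
        x_adv = x + eps * x.grad.sign()            # step along the sign of the gradient
        return x_adv.clamp(0.0, 1.0).detach()      # keep pixels in a valid range

    def pgd(model, x, y, eps, step, n_iter):
        """Iterative FGSM with projection back into the l-inf ball of radius eps (eq. 0.2)."""
        x_adv = x.clone().detach()
        for _ in range(n_iter):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            loss.backward()
            with torch.no_grad():
                x_adv = x_adv + step * x_adv.grad.sign()   # FGSM step with a smaller step size
                x_adv = x + (x_adv - x).clamp(-eps, eps)   # projection P_{X,eps} for the l-inf norm
                x_adv = x_adv.clamp(0.0, 1.0)              # stay in the valid pixel range
        return x_adv.detach()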

CleverHans (https://github.com/cleverhans-lab/cleverhans). CleverHans is a Python library developed by academic researchers to benchmark both adversarial attacks and defenses. It provides implementations of various adversarial attacks on ML models, as well as an implementation of adversarial training to improve the robustness of ML models. In this assignment, you will use CleverHans to construct adversarial examples. Check their tutorials at cleverhans/tutorials/torch/mnist_tutorial.py and cleverhans/tutorials/torch/cifar10_tutorial.py before you get started.
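
For reference, in recent CleverHans releases (the 4.x PyTorch API) the two attacks can typically be invoked roughly as in the sketch below; the tiny model and random batch are placeholders that only show the call pattern, so confirm the exact signatures against the tutorials for the version you install.

    import numpy as np
    import torch
    import torch.nn as nn
    from cleverhans.torch.attacks.fast_gradient_method import fast_gradient_method
    from cleverhans.torch.attacks.projected_gradient_descent import projected_gradient_descent

    # Placeholder model and batch; substitute the pre-trained model and real data
    # loaded by the assignment scripts.
    net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.rand(4, 1, 28, 28)   # MNIST-shaped batch with pixels in [0, 1]

    # Untargeted FGSM and PGD under an l-inf bound of eps = 0.2.
    x_fgsm = fast_gradient_method(net, x, eps=0.2, norm=np.inf, clip_min=0.0, clip_max=1.0)
    x_pgd = projected_gradient_descent(net, x, eps=0.2, eps_iter=0.01, nb_iter=40,
                                       norm=np.inf, clip_min=0.0, clip_max=1.0)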

Specifically, there are two tasks you need to complete:
(a) Construct adversarial examples using FGSM and PGD on the MNIST dataset.

• Install conda and set up a Python 3.7 environment with PyTorch and CleverHans.
• Load a pre-trained MNIST model by running hw3/mnist_exp.py.
• Generate two untargeted adversarial examples using FGSM with the following two settings:
  – l∞ norm, ε = 0.2
  – l2 norm, ε = 2
• Generate two adversarial examples using PGD with the following two settings:
  – l∞ norm, ε = 0.2
  – l2 norm, ε = 2
• Visualize the original input and your FGSM and PGD adversarial inputs using Python matplotlib (see the sketch after this list).
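
One way to organize part (a) is a small helper that runs both attacks under both norm settings and writes a single matplotlib figure. The sketch below is illustrative only: it assumes x is a batch of MNIST images in [0, 1] and net is the pre-trained model from hw3/mnist_exp.py (how that script exposes the model and data is not shown here), and the PGD step size of eps/10 with 40 iterations is an example choice rather than a required setting.

    import numpy as np
    import torch
    import torch.nn as nn
    import matplotlib.pyplot as plt
    from cleverhans.torch.attacks.fast_gradient_method import fast_gradient_method
    from cleverhans.torch.attacks.projected_gradient_descent import projected_gradient_descent

    def attack_and_plot(net, x, out_png="mnist_adv.png"):
        """Run untargeted FGSM/PGD under both norm settings and save a comparison figure."""
        settings = [("l-inf, eps=0.2", np.inf, 0.2), ("l2, eps=2", 2, 2.0)]
        rows = []
        for name, norm, eps in settings:
            x_fgsm = fast_gradient_method(net, x, eps, norm, clip_min=0.0, clip_max=1.0)
            x_pgd = projected_gradient_descent(net, x, eps, eps / 10, 40, norm,
                                               clip_min=0.0, clip_max=1.0)
            rows.append((name, x_fgsm, x_pgd))

        fig, axes = plt.subplots(len(rows), 3, figsize=(9, 6))
        for r, (name, x_fgsm, x_pgd) in enumerate(rows):
            panels = [("original", x), ("FGSM " + name, x_fgsm), ("PGD " + name, x_pgd)]
            for c, (title, img) in enumerate(panels):
                axes[r, c].imshow(img[0, 0].detach().cpu().numpy(), cmap="gray")
                axes[r, c].set_title(title)
                axes[r, c].axis("off")
        fig.tight_layout()
        fig.savefig(out_png)

    # Stand-in model and input just so the sketch runs end to end.
    attack_and_plot(nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)),
                    torch.rand(1, 1, 28, 28))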

Extra Points

• Generate two targeted adversarial examples (from digit 3 to digit 8) using FGSM with the following two settings:
  – l∞ norm, ε = 0.2
  – l2 norm, ε = 2
• Generate two targeted adversarial examples (from digit 3 to digit 8) using PGD with the following two settings:
  – l∞ norm, ε = 0.2
  – l2 norm, ε = 2
• Visualize the original input and your FGSM and PGD adversarial inputs using Python matplotlib (a targeted-attack sketch follows this list).
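
For the targeted extra-credit variants, CleverHans accepts a y argument together with targeted=True, in which case the attack decreases the loss on the specified target label instead of increasing the loss on the true label. A hedged sketch, with stand-ins for the model and for a batch of digit-3 images:

    import numpy as np
    import torch
    import torch.nn as nn
    from cleverhans.torch.attacks.fast_gradient_method import fast_gradient_method
    from cleverhans.torch.attacks.projected_gradient_descent import projected_gradient_descent

    net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # stand-in for the pre-trained model
    x3 = torch.rand(4, 1, 28, 28)                               # stand-in for a batch of digit-3 images
    target = torch.full((x3.shape[0],), 8, dtype=torch.long)    # push predictions toward label 8

    x_fgsm_t = fast_gradient_method(net, x3, 0.2, np.inf, clip_min=0.0, clip_max=1.0,
                                    y=target, targeted=True)
    x_pgd_t = projected_gradient_descent(net, x3, 0.2, 0.01, 40, np.inf,
                                         clip_min=0.0, clip_max=1.0,
                                         y=target, targeted=True)

The same call pattern with ε = 2 and norm = 2 covers the l2 setting.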

(b) Construct adversarial examples using FGSM and PGD on the CIFAR-10 dataset.

• Load a pre-trained CIFAR-10 model by running hw3/cifar10_exp.py.
• Generate two untargeted adversarial examples using FGSM with the following two settings:
  – l∞ norm, ε = 0.2
  – l2 norm, ε = 2
• Generate two adversarial examples using PGD with the following two settings:
  – l∞ norm, ε = 0.2
  – l2 norm, ε = 2
• Visualize the original input and your FGSM and PGD adversarial inputs using Python matplotlib (see the sketch after this list).
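
Unlike the single-channel MNIST images, CIFAR-10 tensors have shape (3, 32, 32) and must be transposed to channels-last before matplotlib can display them (and de-normalized first if your data loader normalizes the inputs). A small sketch with stand-in tensors and an illustrative output file name:

    import matplotlib.pyplot as plt
    import torch

    def show_cifar(img, title, ax):
        """Display one CIFAR-10 tensor of shape (3, 32, 32) on the given axes."""
        img = img.detach().cpu()
        ax.imshow(img.permute(1, 2, 0).clamp(0, 1).numpy())   # CHW -> HWC for imshow
        ax.set_title(title)
        ax.axis("off")

    # x, x_fgsm, x_pgd would come from the CIFAR-10 attacks above; random stand-ins here.
    x, x_fgsm, x_pgd = (torch.rand(3, 32, 32) for _ in range(3))
    fig, axes = plt.subplots(1, 3, figsize=(9, 3))
    for ax, (img, title) in zip(axes, [(x, "original"), (x_fgsm, "FGSM"), (x_pgd, "PGD")]):
        show_cifar(img, title, ax)
    fig.tight_layout()
    fig.savefig("cifar10_adv.png")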

Deliverables: A tar.gz file with the name format hw3_{your_uni}.tar.gz, containing:

• A PDF file containing visualizations of the original inputs and your adversarial inputs
• Part (a) Python source code
• Part (b) Python source code
