1
ImageNet ResNet-50 Adversarial Training for Adversarial Training for Free! Adversarial Training ϵ ϵ s τ Algorithm: K-PGD Input: Training samples X, perturbation bound , step-size , PGD iterations K, learning-rate for epoch = 1 … N do for minibatch do Build forall with PGD: for k = 1 … K do end for update with SGD: end for end for θ x adv x + r; r U(ϵ, ϵ) x adv clip(x adv + ϵ s sign(g adv ), x ϵ, x + ϵ) θ θ τg θ B X x B g θ (x,y)B [ θ l(x adv , y, θ )] g adv ←∇ x l(x adv , y, θ ) x adv Backprop 1 time Backprop K times { for epoch = 1 … N/m do for minibatch do for i = 1 … m do Update with SGD: Use grads calculated at min step for updating end for end for end for θ δ 0 δ clip(δ + ϵ sign(g adv ), ϵ,+ ϵ) θ θ τg θ B X g θ (x,y)B [ θ l(x + δ, y, θ )] g adv ←∇ x l(x + δ, y, θ ) δ ϵ τ Input: Training samples X, perturbation bound , learning-rate , replay m Algorithm: Free-m { Backprop just 1 time! CIFAR-100 (ϵ = 8) CIFAR-10 (ϵ = 8) ImageNet (ϵ = 4) Results: Free vs K-PGD CIFAR ImageNet Tensorflow Pytorch Free-m has smooth loss surface Free-m has interpretable grads nat adv for Free-8 Adversarial examples Modify target images at test time + = Frog Plane Target = Overview Adversarial training (using PGD attacks) is one of the best defenses and wins nearly all defense competitions. But it’s slow, taking 5-50X longer than regular training. This makes it nearly intractable for large problems like ImageNet. Ali Shafahi 1 , Mahyar Najibi 1 , Amin Ghiasi 1 , Zheng Xu 1 , John Dickerson 1 , 1 University of Maryland, 2 Cornell University, and 3 US Naval Academy Christoph Studer 2 , Larry Davis 1 , Gavin Taylor 3 , Tom Goldstein 1 We present a method that adversarially trains with no added cost beyond regular training. Our “free” method gets comparable results to adversarial training on CIFAR, and can adversarially train ImageNet on a desktop computer in just a day!

Adversarial Training for - University Of Maryland · 2019. 10. 27. · 1University of Maryland, 2Cornell University, and 3US Naval Academy Christoph Studer2, Larry Davis1, Gavin Taylor3

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Adversarial Training for - University Of Maryland · 2019. 10. 27. · 1University of Maryland, 2Cornell University, and 3US Naval Academy Christoph Studer2, Larry Davis1, Gavin Taylor3

ImageNet

ResNet-50

Adversarial Training for

Adversarial Training for Free!Adversarial Training

ϵϵs τ

Algorithm: K-PGDInput: Training samples X, perturbation bound ,

step-size , PGD iterations K, learning-rate for epoch = 1 … N do for minibatch do Build forall with PGD: for k = 1 … K do

end for update with SGD:

end forend for

θ

xadv ← x + r; r ← U(−ϵ, ϵ)

xadv ← clip(xadv + ϵs ⋅ sign(gadv), x − ϵ, x + ϵ)

θ ← θ − τgθ

B ⊂ Xx ∈ B

gθ ← 𝔼(x,y)∈B[∇θ l(xadv, y, θ)]

gadv ← ∇xl(xadv, y, θ)

xadv

Backprop 1 time

Backprop K times{

for epoch = 1 … N/m do for minibatch do for i = 1 … m do Update with SGD:

Use grads calculated at min step for updating end for end forend for

θ

δ ← 0

δ ← clip(δ + ϵ ⋅ sign(gadv), − ϵ, + ϵ)

θ ← θ − τgθ

B ⊂ X

gθ ← 𝔼(x,y)∈B[∇θ l(x + δ, y, θ)]gadv ← ∇xl(x + δ, y, θ)

δ

ϵτInput: Training samples X, perturbation bound , learning-rate , replay m

Algorithm: Free-m

{Backprop just1 time!

CIF

AR

-100

(ϵ=

8)

CIFAR-10 (ϵ = 8)

Imag

eNet

(ϵ=

4)

Results: Free vs K-PGD

CIFARImageNet

TensorflowPytorch

Free-m has smooth loss

surface

Free-m has interpretable grads

nat

adv for Free-8

Adversarial examples

Modify target images at test time

+=

Frog

Pla

ne

Target

=

OverviewAdversarial training (using PGD attacks) is one of the best defenses and wins nearly all defense competitions. But it’s slow, taking 5-50X longer than regular training. This makes it nearly intractable for large problems like ImageNet.

Ali Shafahi1, Mahyar Najibi1, Amin Ghiasi1, Zheng Xu1, John Dickerson1,

1University of Maryland, 2Cornell University, and 3US Naval Academy

Christoph Studer2, Larry Davis1, Gavin Taylor3 , Tom Goldstein1

We present a method that adversarially trains with no added cost beyond regular training. Our “free” method gets comparable results to adversarial training on CIFAR, and can adversarially train ImageNet on a desktop computer in just a day!