
Mastering GANs (Generative Adversarial Networks) in One Hour


Page 1

Introduction: Generative Adversarial Networks

Yunjey Choi
Korea University
DAVIAN LAB

Page 2

Speaker Introduction

B.S. in Computer Science & Engineering at Korea University
M.S. student in Computer Science & Engineering at Korea University (current)
Interests: Deep Learning, TensorFlow, PyTorch
GitHub: https://github.com/yunjey

Page 3

Referenced Slides

• Namju Kim. Generative Adversarial Networks (GAN)
  https://www.slideshare.net/ssuser77ee21/generative-adversarial-networks-70896091
• Taehoon Kim. 지적 대화를 위한 깊고 넓은 딥러닝 (Deep and Wide Deep Learning for Intellectual Conversation)
  https://www.slideshare.net/carpedm20/ss-63116251

Page 4

01 Introduction

Page 5

Branches of ML

Machine Learning:
• Supervised Learning: labeled data, direct feedback
• Unsupervised Learning: no labeled data, no feedback, "find hidden structure"
• Semi-supervised Learning
• Reinforcement Learning: no labeled data, delayed feedback, reward signal

Page 6

Supervised Learning

The discriminative model learns how to classify an input into its class.

Input image (64x64x3) → Discriminative Model → class label, e.g. (1) man or (2) woman

Page 7

Unsupervised Learning

The generative model learns the distribution of the training data.

Latent code (100) → Generative Model → Image (64x64x3)

Page 8

Generative Model: Probability Distribution

Probability basics (review): a random variable X and its probability mass function p(x). For example, a loaded six-sided die:

X      1    2    3    4    5    6
P(X)  1/6  1/6  1/6  0/6  1/6  2/6

[Bar chart of the probability mass function p(x) over x.]
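As a quick numerical illustration (not from the slides), we can sample from this pmf and check that the empirical frequencies approach p(x); the variable names below are illustrative:

    import numpy as np

    # The pmf from the table above: note P(X=4) = 0 and P(X=6) = 2/6.
    values = np.array([1, 2, 3, 4, 5, 6])
    probs = np.array([1/6, 1/6, 1/6, 0, 1/6, 2/6])
    samples = np.random.choice(values, size=60000, p=probs)
    # Empirical frequencies approach p(x) as the sample size grows.
    print(np.bincount(samples, minlength=7)[1:] / len(samples))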

Page 9

Generative Model: Probability Distribution

What if x is an actual image in the training data? In that case, x can be represented as a (for example) 64x64x3-dimensional vector.

Page 10

Generative Model: Probability Distribution

There is a probability density function p_data(x) that represents the distribution of actual images.

Page 11

Generative Model: Probability Distribution

Let's take an example with a human face image dataset. Our dataset may contain few images of men with glasses. x1 is a 64x64x3 high-dimensional vector representing a man with glasses, so its probability density value is low.

Page 12

Generative Model: Probability Distribution

Our dataset may contain many images of women with black hair. x2 is a 64x64x3 high-dimensional vector representing a woman with black hair, so its probability density value is high.

Page 13

Generative Model: Probability Distribution

Our dataset may contain very many images of women with blonde hair. x3 is a 64x64x3 high-dimensional vector representing a woman with blonde hair, so its probability density value is very high.

Page 14

Generative Model: Probability Distribution

Our dataset may not contain these strange images. x4 is a 64x64x3 high-dimensional vector representing a very strange image, so its probability density value is almost 0.

Page 15

Generative Model: Probability Distribution

The goal of the generative model is to find a p_model(x) that approximates p_data(x) well.

p_data(x): distribution of actual images
p_model(x): distribution of images generated by the model

Page 16

02 Generative Adversarial Networks

Page 17

Intuition in GAN

Generator: latent code z → fake image G(z).
Discriminator: image → D(x), the probability (0~1) that x came from the real data.

Training with real images x: D(x) may be high.
Training with fake images G(z): D(G(z)) may be low.

Page 18

Intuition in GAN

Training with real images: the discriminator (a neural network) should classify a real image (64x64x3) as real, so D(x) should be close to 1.

Page 19

Intuition in GAN

Training with fake images: the discriminator should classify a fake image generated by the generator (64x64x3) as fake, so D(G(z)) should be close to 0.

Page 20

Intuition in GAN

The generator (a neural network) maps a latent code (100) to a generated image (64x64x3). The generator should create an image that is indistinguishable from real images to deceive the discriminator, so D(G(z)) should be close to 1.

Page 21

Objective Function of GAN

min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

D should maximize V(D, G):
• Training with real images: sample x from the real data distribution and train D to classify real images as real. The first term is maximum when D(x) = 1.
• Training with fake images: sample the latent code z from a Gaussian distribution and train D to classify fake images as fake. The second term is maximum when D(G(z)) = 0.

Page 22

Objective Function of GAN

min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

G should minimize V(D, G). G is independent of the first term, and the second term is minimum when D(G(z)) = 1: train G to deceive D.

Page 23

PyTorch Implementation

z → G → G(z) → D → D(G(z))   (training with fake images)
x → D → D(x)                 (training with real images)

Page 24

PyTorch Implementation

Define the discriminator (input size: 784, hidden size: 128, output size: 1). Assume x is MNIST (784-dimensional); the output is a probability (1-dimensional).
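The slide shows the code as an image; a minimal PyTorch sketch consistent with these sizes (784 → 128 → 1) might look as follows. The LeakyReLU slope is an assumption; the exact code is in the repository linked on page 31.

    import torch.nn as nn

    # Discriminator: 784-dim MNIST image -> probability that the input is real.
    D = nn.Sequential(
        nn.Linear(784, 128),
        nn.LeakyReLU(0.2),
        nn.Linear(128, 1),
        nn.Sigmoid())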

Page 25

PyTorch Implementation

Define the generator (input size: 100, hidden size: 128, output size: 784). The latent code (100-dimensional) is mapped to a generated image (784-dimensional).
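Likewise, a sketch of the generator with the sizes from the slide (100 → 128 → 784); the Tanh output is an assumption (it keeps pixels in [-1, 1], matching images normalized to that range):

    import torch.nn as nn

    # Generator: 100-dim latent code -> 784-dim image.
    G = nn.Sequential(
        nn.Linear(100, 128),
        nn.ReLU(),
        nn.Linear(128, 784),
        nn.Tanh())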

Page 26

PyTorch Implementation

Binary cross entropy loss: BCE(h(x), y) = -y log h(x) - (1 - y) log(1 - h(x))
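In PyTorch this loss is available as nn.BCELoss; a tiny sketch (the example tensors are illustrative):

    import torch
    import torch.nn as nn

    criterion = nn.BCELoss()  # mean of -y*log(h) - (1-y)*log(1-h) over the batch
    h = torch.tensor([0.9, 0.1])  # predictions h(x)
    y = torch.tensor([1.0, 0.0])  # targets y
    print(criterion(h, y))        # small, since predictions match the targets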

Page 27

PyTorch Implementation

Optimizers for D and G.
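A sketch of the two optimizers, one per network so each update step only touches that network's weights; the learning rate is assumed from the DCGAN guideline on page 37:

    import torch

    d_optimizer = torch.optim.Adam(D.parameters(), lr=0.0002)
    g_optimizer = torch.optim.Adam(G.parameters(), lr=0.0002)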

Page 28

PyTorch Implementation

x is a tensor of shape (batch_size, 784).
z is a tensor of shape (batch_size, 100).

Page 29

PyTorch Implementation

Train the discriminator with real images (D(x) gets closer to 1) and with fake images (D(G(z)) gets closer to 0): forward, backward, and gradient descent.
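A sketch of one discriminator update using the pieces above (x of shape (batch_size, 784), z of shape (batch_size, 100); the label tensor names are illustrative):

    import torch

    real_labels = torch.ones(batch_size, 1)
    fake_labels = torch.zeros(batch_size, 1)

    # Train D with real images: push D(x) toward 1.
    d_loss_real = criterion(D(x), real_labels)

    # Train D with fake images: push D(G(z)) toward 0.
    z = torch.randn(batch_size, 100)
    d_loss_fake = criterion(D(G(z)), fake_labels)

    # Forward, backward, and gradient descent.
    d_loss = d_loss_real + d_loss_fake
    d_optimizer.zero_grad()
    d_loss.backward()
    d_optimizer.step()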

Page 30

PyTorch Implementation

Train the generator to deceive the discriminator (D(G(z)) gets closer to 1): forward, backward, and gradient descent.
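And a sketch of the generator update, with names carried over from the sketches above:

    # Train G so the discriminator classifies its samples as real:
    # push D(G(z)) toward 1 by using the real label on fake images.
    z = torch.randn(batch_size, 100)
    g_loss = criterion(D(G(z)), real_labels)

    g_optimizer.zero_grad()
    g_loss.backward()
    g_optimizer.step()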

Page 31

PyTorch Implementation

The complete code can be found here: https://github.com/yunjey/pytorch-tutorial

Page 32

Non-Saturating Game

Objective function of G: min_G E_{z~p_z(z)}[log(1 - D(G(z)))]

At the beginning of training, the discriminator can clearly classify the generated images as fake because their quality is very low. This means that D(G(z)) is almost zero at the early stages of training. For y = log(1 - x), the gradient is relatively small at D(G(z)) = 0, so the generator learns very slowly exactly when it needs the strongest signal.

Page 33

Non-Saturating Game

Solution for the poor gradient: instead of min_G E_{z~p_z(z)}[log(1 - D(G(z)))], use max_G E_{z~p_z(z)}[log D(G(z))]. For y = log(x), the gradient is very large at D(G(z)) = 0. This modification is heuristically motivated.

• Practical usage: apply the binary cross entropy loss with the real label (y = 1) to the fake samples:

min_G E_{z~p_z(z)}[-y log D(G(z)) - (1 - y) log(1 - D(G(z)))]  with y = 1
= min_G E_{z~p_z(z)}[-log D(G(z))]
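In code this is exactly what the generator step sketched earlier does: with y = 1 the binary cross entropy reduces to -log D(G(z)), the non-saturating loss:

    # BCE with target 1: -1*log(h) - 0*log(1-h) = -log D(G(z)).
    g_loss = criterion(D(G(z)), torch.ones(batch_size, 1))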

Page 34

Theory in GAN

• Why do GANs work? Because the objective actually minimizes the distance between the real data distribution and the model distribution.

Objective function of GANs:
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

This is the same as minimizing the Jensen-Shannon divergence:
min_G JSD(p_data || p_g)

where JSD(P || Q) = (1/2) KL(P || M) + (1/2) KL(Q || M), M = (1/2)(P + Q), and KL is the KL divergence. Please see the Appendix for details.
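A small numerical sketch of the JSD definition above for discrete distributions (illustrative, not from the slides); JSD(P || Q) is 0 exactly when P = Q, which is why minimizing it pulls p_g toward p_data:

    import numpy as np

    def kl(p, q):
        # KL divergence for discrete distributions with full support.
        return np.sum(p * np.log(p / q))

    def jsd(p, q):
        m = 0.5 * (p + q)
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    p = np.array([0.1, 0.4, 0.5])
    print(jsd(p, p))                          # 0.0: identical distributions
    print(jsd(p, np.array([0.5, 0.4, 0.1])))  # > 0: different distributions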

Page 35

03 Variants of GAN

Page 36

DCGAN

• Deep Convolutional GAN (DCGAN), 2015: the authors present a model that is still highly preferred.

Radford et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015

Page 37

DCGAN

Discriminator: use (strided) convolutions and Leaky ReLU.
Generator: use deconvolutions (transposed convolutions) and ReLU.

• No pooling layers (strided convolutions instead)
• Use batch normalization
• Adam optimizer (lr=0.0002, beta1=0.5, beta2=0.999), as in the sketch below
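A minimal sketch of a DCGAN-style generator following these guidelines (transposed "deconvolution" layers, batch normalization, ReLU, no pooling); the layer sizes are illustrative, not the paper's exact architecture:

    import torch
    import torch.nn as nn

    # Latent code viewed as a (100, 1, 1) feature map, upsampled step by step.
    G = nn.Sequential(
        nn.ConvTranspose2d(100, 256, kernel_size=4, stride=1, padding=0),  # 1x1 -> 4x4
        nn.BatchNorm2d(256),
        nn.ReLU(),
        nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # 4x4 -> 8x8
        nn.BatchNorm2d(128),
        nn.ReLU(),
        nn.ConvTranspose2d(128, 3, kernel_size=4, stride=2, padding=1),    # 8x8 -> 16x16
        nn.Tanh())
    g_optimizer = torch.optim.Adam(G.parameters(), lr=0.0002, betas=(0.5, 0.999))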

Page 38

DCGAN

• Latent vector arithmetic

Radford et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015

Page 39

LSGAN

• Least Squares GAN (LSGAN): proposes a GAN model that adopts the least squares loss function for the discriminator.

Xudong Mao et al. Least Squares Generative Adversarial Networks, 2016

Page 40

LSGAN

Vanilla GAN vs. LSGAN: remove the sigmoid non-linearity in the last layer of the discriminator.

Page 41

LSGAN

Vanilla GAN vs. LSGAN: the generator is the same as the original.

Page 42

LSGAN

Vanilla GAN vs. LSGAN: replace the discriminator's cross entropy loss with the least squares (L2) loss. As in the original, D(x) gets closer to 1 and D(G(z)) gets closer to 0.

Page 43

LSGAN

Vanilla GAN vs. LSGAN: replace the generator's cross entropy loss with the least squares (L2) loss. As in the original, D(G(z)) gets closer to 1. A sketch of both losses follows below.
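A sketch of the least squares losses under the names used in the earlier PyTorch sketches (and with the sigmoid removed from D's last layer, per page 40):

    import torch
    import torch.nn as nn

    mse = nn.MSELoss()
    ones = torch.ones(batch_size, 1)
    zeros = torch.zeros(batch_size, 1)

    # Discriminator: push D(x) toward 1 and D(G(z)) toward 0, as in the original.
    d_loss = mse(D(x), ones) + mse(D(G(z).detach()), zeros)

    # Generator: push D(G(z)) toward 1, as in the original.
    g_loss = mse(D(G(z)), ones)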

Page 44

LSGAN

• Results (LSUN dataset)

Xudong Mao et al. Least Squares Generative Adversarial Networks, 2016

Page 45

LSGAN

• Results (CelebA)

Page 46

SGAN

• Semi-Supervised GAN

The discriminator ends with (1) an FC layer with softmax whose output is 11-dimensional (10 classes + fake).

Training with real images: given a real image, D should output the one-hot vector representing its class (e.g., 2).
Training with fake images: given a latent vector z and a one-hot class vector (e.g., one representing 5), G produces a fake image, and D should output the one-hot vector representing the fake label. A sketch of the discriminator head follows below.

Augustus Odena et al. Semi-Supervised Learning with Generative Adversarial Networks, 2016
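A sketch of the 11-way discriminator head described above (sizes borrowed from the earlier MLP sketches; names are illustrative). Cross entropy applies the softmax internally and takes class indices, so real images use their digit labels 0-9 and generated images use the fake label 10:

    import torch
    import torch.nn as nn

    D = nn.Sequential(
        nn.Linear(784, 128),
        nn.LeakyReLU(0.2),
        nn.Linear(128, 11))            # (1) FC layer: 10 classes + 1 fake class
    criterion = nn.CrossEntropyLoss()  # softmax + negative log-likelihood

    # Label 10 marks generated images as "fake".
    fake_label = torch.full((batch_size,), 10, dtype=torch.long)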

Page 47

SGAN

• Results (Game Character)

Given one-hot vectors representing the class labels 1-5, the generator can create a character image that takes a certain pose.

The code will be available soon: https://github.com/yunjey

Page 48

ACGAN

• Auxiliary Classifier GAN (ACGAN), 2016: proposes a new method for improved training of GANs using class labels.

Augustus Odena et al. Conditional Image Synthesis with Auxiliary Classifier GANs, 2016

Page 49

ACGAN

• How does it work?

The discriminator performs multi-task learning with two heads: (1) an FC layer with sigmoid that outputs real or fake, and (2) an FC layer with softmax that outputs a one-hot class vector.

Training with real images: given a real image, D should output "real" via head (1) and the one-hot vector representing its class (e.g., 2) via head (2).
Training with fake images: given a latent vector z and a one-hot class vector (e.g., one representing 5), G produces a fake image; D should output "fake" via head (1) and, via head (2), the one-hot vector for the class G was conditioned on.

Page 50

04 Extensions of GAN

Page 51

CycleGAN

• CycleGAN: Unpaired Image-to-Image Translation: presents a GAN model that transfers an image from a source domain A to a target domain B in the absence of paired examples.

Jun-Yan Zhu et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017

Page 52

CycleGAN

• How does it work?

A real image in domain A (a zebra) is translated by the generator GAB into a fake image in domain B. The discriminator DB for domain B sees real images in domain B and these fake images, and outputs real or fake. The generator GAB should generate a horse from the zebra to deceive the discriminator DB.

Page 53

CycleGAN

• How does it work?

GBA then translates the fake image in domain B back to domain A, producing a reconstructed image, and an L2 loss is applied between the reconstructed image and the real image in domain A. This cycle-consistency makes the shape be maintained when GAB generates a horse image from the zebra; see the sketch below.
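A sketch of this cycle-consistency term under illustrative names (the slide uses an L2 reconstruction loss; the CycleGAN paper itself uses L1):

    import torch

    fake_B = G_AB(real_A)  # e.g., zebra -> horse
    rec_A = G_BA(fake_B)   # horse -> reconstructed zebra
    # L2 loss between reconstruction and original keeps the shape intact.
    cycle_loss = torch.mean((rec_A - real_A) ** 2)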

Page 54

CycleGAN

• Results

Jun-Yan Zhu et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017

Page 55

CycleGAN

• Results: SVHN-to-MNIST and MNIST-to-SVHN. Odd columns contain real images and even columns contain generated images.

https://github.com/yunjey/mnist-svhn-transfer

Page 56

StackGAN

• StackGAN: Text to Photo-realistic Image Synthesis

Han Zhang et al. StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, 2016

Page 57

StackGAN

• Generating 128x128 from scratch

A single generator G maps z (100) directly to a 128x128 fake image, which does not guarantee a good result. A discriminator D for 128x128 images compares the fake image (128x128) with a real image (128x128) and outputs 'real' or 'fake'.

Page 58

StackGAN

• Generating 128x128 from 64x64

G1 generates a 64x64 fake image from z (100), and D1, a discriminator for 64x64 images, compares it with a real 64x64 image and outputs 'real' or 'fake'. G2 then upscales the 64x64 image to 128x128 (easier than generating from scratch), and D2, a discriminator for 128x128 images, compares it with a real 128x128 image and outputs 'real' or 'fake'.

Page 59

Latest Work

• Visual Attribute Transfer

Jing Liao et al. Visual Attribute Transfer through Deep Image Analogy, 2017

Page 60

Latest Work

• User-Interactive Image Colorization

Richard Zhang et al. Real-Time User-Guided Image Colorization with Learned Deep Priors, 2017

Page 61

05 Future of GAN

Page 62

Convergence Measure

• Boundary Equilibrium GAN (BEGAN)

David Berthelot et al. BEGAN: Boundary Equilibrium Generative Adversarial Networks, 2017

Page 63

Convergence Measure

• Reconstruction Loss: z → G → G(z), compared against test images x.

Sitao Xiang. On the Effects of Batch and Weight Normalization in Generative Adversarial Networks, 2017

Page 64

Better Upsampling

• Deconvolution Checkerboard Artifacts

http://distill.pub/2016/deconv-checkerboard/

Page 65

Better Upsampling

• Deconvolution vs. Resize-Convolution

http://distill.pub/2016/deconv-checkerboard/

Page 66

GAN in Supervised Learning

• Machine Translation (Seq2Seq)

A seq2seq model encodes the source sentence 'A B C D' and decodes the target '<start> X Y Z' into 'X Y Z <end>'. But should 'ABCD' be translated to 'XYZ'? GANs can also tackle this supervised learning problem.

Page 67

GAN in Supervised Learning

• Machine Translation (GANs)

Training with real sentences: given A (English) and B (Korean), do A and B have the same meaning? The discriminator should say 'yes'.
Training with fake sentences: given A (English) and B (fake Korean, produced by the generator), do A and B have the same meaning? The discriminator should say 'no'. The generator should generate a B that has the same meaning as A to deceive the discriminator.

Lijun Wu et al. Adversarial Neural Machine Translation, 2017

Page 68

Thank you

Page 69

06 Appendix

Page 70

Theory in GAN

Objective function of GANs:

min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

where p_data(x) is the real data distribution over high-dimensional vectors (e.g., 64x64) and p_z(z) is a Gaussian distribution over low-dimensional vectors (e.g., 100).

Page 71

Theory in GAN

Optimal D: fix G to make V a function of D, and get the D that maximizes V(D):

D*(x) = arg max_D V(D) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

Page 72

Theory in GAN

D*(x) = arg max_D E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]
      = arg max_D E_{x~p_data(x)}[log D(x)] + E_{x~p_g(x)}[log(1 - D(x))]

sampling x from p_g, the model distribution of G over high-dimensional vectors (e.g., 64x64), instead of sampling z from p_z over 100-dimensional vectors.

Page 73

Theory in GAN

      = E_{x~p_data(x)}[log D(x)] + E_{x~p_g(x)}[log(1 - D(x))]
      = ∫_x p_data(x) log D(x) dx + ∫_x p_g(x) log(1 - D(x)) dx

by the definition of expectation, E_{x~p(x)}[f(x)] = ∫_x p(x) f(x) dx, integrating over all possible x.

Page 74

Theory in GAN

      = ∫_x p_data(x) log D(x) dx + ∫_x p_g(x) log(1 - D(x)) dx
      = ∫_x [p_data(x) log D(x) + p_g(x) log(1 - D(x))] dx

by the basic property of integrals (the two integrals over the same variable combine into one).

Page 75

Theory in GAN

D*(x) = arg max_D ∫_x [p_data(x) log D(x) + p_g(x) log(1 - D(x))] dx

Now we need to find the D(x) that makes the function inside the integral maximal.

Page 76

Theory in GAN: Theoretical Results

The function inside the integral:

D*(x) = arg max_D [p_data(x) log D(x) + p_g(x) log(1 - D(x))]

Page 77

Theory in GAN: Theoretical Results

Substitute a = p_data(x), y = D(x), b = p_g(x); the function inside the integral becomes

a log y + b log(1 - y)

Page 78

Theory in GAN: Theoretical Results

Differentiate with respect to y = D(x), using d/dx log f(x) = f'(x)/f(x), and note that D(x) cannot affect p_data(x) or p_g(x):

d/dy [a log y + b log(1 - y)] = a/y - b/(1 - y) = (a - (a + b)y) / (y(1 - y))

Page 79

Theory in GAN: Theoretical Results

Find the point where the derivative is 0 (a local extreme):

(a - (a + b)y) / (y(1 - y)) = 0  ⟹  y = a / (a + b)

The function has its maximum value at this point; note that the local maximum is the global maximum when there are no other local extremes.

Page 80

Theory in GAN: Theoretical Results

Substituting back a = p_data(x), y = D(x), b = p_g(x) gives the optimal discriminator:

D*(x) = p_data(x) / (p_data(x) + p_g(x))

Page 81

Theory in GAN

With the optimal D*, min_G max_D V(D, G) = min_G V(D*, G), where

V(D*, G) = E_{x~p_data}[log D*(x)] + E_{x~p_g}[log(1 - D*(x))]
         = ∫_x p_data(x) log [p_data(x) / (p_data(x) + p_g(x))] dx
         + ∫_x p_g(x) log [p_g(x) / (p_data(x) + p_g(x))] dx

Page 82

Theory in GAN

V(D*, G) = ∫_x p_data(x) log [p_data(x) / (p_data(x) + p_g(x))] dx + ∫_x p_g(x) log [p_g(x) / (p_data(x) + p_g(x))] dx
         = -log 4 + log 4 + ∫_x p_data(x) log [p_data(x) / (p_data(x) + p_g(x))] dx + ∫_x p_g(x) log [p_g(x) / (p_data(x) + p_g(x))] dx
         = -log 4 + ∫_x p_data(x) log [2 p_data(x) / (p_data(x) + p_g(x))] dx + ∫_x p_g(x) log [2 p_g(x) / (p_data(x) + p_g(x))] dx
         = -log 4 + KL(p_data || (p_data + p_g)/2) + KL(p_g || (p_data + p_g)/2)
         = -log 4 + 2 · JSD(p_data || p_g)

G should minimize V(D*, G), so optimizing V(D, G) is the same as minimizing JSD(p_data || p_g).
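As a quick numerical check of this result (illustrative, not from the slides): when p_g = p_data, D*(x) = 1/2 everywhere, JSD(p_data || p_g) = 0, and V(D*, G) reaches its minimum -log 4:

    import numpy as np

    p_data = np.array([0.2, 0.3, 0.5])
    p_g = p_data.copy()               # perfect generator: p_g = p_data
    d_star = p_data / (p_data + p_g)  # = 0.5 everywhere
    v = np.sum(p_data * np.log(d_star)) + np.sum(p_g * np.log(1 - d_star))
    print(v, -np.log(4))              # both approximately -1.386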