Transcript
Page 1: Deep Learning for Computer Vision (vllab.ee.ntu.edu.tw/uploads/1/1/1/6/111696467/dlcv_w8.pdf)

Deep Learning for Computer Vision, Fall 2020

http://vllab.ee.ntu.edu.tw/dlcv.html (Public website)

https://cool.ntu.edu.tw/courses/3368 (NTU COOL; for grade, etc.)

Yu-Chiang Frank Wang (王鈺強), Professor

Dept. Electrical Engineering, National Taiwan University

2020/11/10

Page 2:

Week Date Topic Remarks

1 9/15 Course Logistics

2 9/22 Machine Learning 101

3 9/29 Intro to Neural Networks; Convolutional Neural Network (I) HW #1 out

4 10/6 Convolutional Neural Network (II): Visualization & Extensions of CNN

5 10/13 Tutorials on Python, Github, etc. (by TAs) HW #1 due

6 10/20 Visualization of CNN (II); Object Detection & Segmentation

HW #2 out

7 10/27 Image Segmentation; Generative Models

8 11/3 Generative Adversarial Network (GAN)

9 11/10 Transfer Learning for Visual Classification & Synthesis; Representation Disentanglement

HW #2 due; HW #3 out

10 11/17 TBD (CVPR Week)

11 11/24 Recurrent Neural Networks & Transformer

12 12/1 Meta-Learning; Few-Shot and Zero-Shot Classification (I) HW #3 due

13 12/8 Meta-Learning; Few-Shot and Zero-Shot Classification (II) HW #4 out

14 12/15 From Domain Adaptation to Domain Generalization Team-up for Final Projects

15 12/22 Beyond 2D vision (3D and Depth)

16 12/29 Image Inpainting and Outpainting; Guest Lecture HW #4 due

17 1/5 Guest Lectures

1/18-22 Presentation for Final Projects TBD

Page 3:

What to Cover Today…
• Transfer Learning for Visual Classification & Synthesis
  • Visual Classification
    • Domain Adaptation & Adversarial Learning
  • Visual Synthesis
    • Style Transfer
• Representation Disentanglement
  • Supervised vs. unsupervised feature disentanglement

Many slides from Richard Turner, Fei-Fei Li, Yaser Sheikh, Simon Lucey, Kaiming He, and J.-B. Huang

Page 4:

Revisit of CNN for Visual Classification

LeCun & Ranzato, Deep Learning Tutorial, ICML 2013

Page 5:

(Traditional) Machine Learning vs. Transfer Learning

• Machine Learning
  • Collecting/annotating data is typically expensive.

Image Credit: A. Karpathy

Page 6:

(Traditional) Machine Learning vs. Transfer Learning (contโ€™d)

• Transfer Learning
  • Improved learning & understanding in the (target) domain of interest by leveraging knowledge from a different source domain.


Page 7:

https://techcrunch.com/2017/02/08/udacity-open-sources-its-self-driving-car-simulator-for-anyone-to-use/
https://googleblog.blogspot.tw/2014/04/the-latest-chapter-for-self-driving-car.html

• A More Practical Example

Transfer Learning: What, When, and Why?

Page 8:

Transfer Learning


Transfer Learning (taxonomy):
• Inductive Transfer Learning: labeled data are available in the target domain
  • Case 1: no labeled data in the source domain → Self-taught Learning
  • Case 2: labeled data are available in the source domain; source and target tasks are learnt simultaneously → Multi-task Learning
• Transductive Transfer Learning: labeled data are available only in the source domain
  • Domain Adaptation: different domains but a single task
  • Sample Selection Bias / Covariate Shift: assumption of a single domain and single task
• Unsupervised Transfer Learning: no labeled data in both source and target domains

S. J. Pan and Q. Yang, "A survey on transfer learning," IEEE TKDE, 2010.

Page 9:

Domain Adaptation in Transfer Learning

• What's DA?
  • Leveraging info from source to target domains, so that the same learning task across domains can be addressed.
  • Typically all the source-domain data are labeled, while the target-domain data are partially labeled or fully unlabeled.
• Settings
  • Semi-supervised/unsupervised DA: few/no target-domain data are with labels.
  • Imbalanced DA: fewer classes of interest in the target domain.
  • Open/closed-set/universal DA: overlapping label space or not.
  • Homogeneous vs. heterogeneous DA: same/distinct feature types across domains.


Page 10:

Deep Features Are Sufficiently Promising.

• DeCAF
  • Leveraging an auxiliary large dataset to train the CNN.
  • The resulting features exhibit sufficient representation ability.
  • Supporting results on Office+Caltech datasets, etc.

Donahue et al., "DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition," ICML 2014.

Page 11:

Recent Deep Learning Methods for DA

• Deep Domain Confusion (DDC)
• Domain-Adversarial Training of Neural Networks (DANN)
• Adversarial Discriminative Domain Adaptation (ADDA)
• Domain Separation Network (DSN)
• Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks (PixelDA)
• No More Discrimination: Cross City Adaptation of Road Scene Segmenters

Method   | Shared weights   | Adaptation loss   | Generative model
DDC      | ✓                | MMD               | ✗
DANN     | ✓                | Adversarial       | ✗
ADDA     | ✗                | Adversarial       | ✗
DSN      | Partially shared | MMD/Adversarial   | ✗
PixelDA  | ✗                | Adversarial       | ✓

Page 12:

Deep Domain Confusion (DDC)

• Deep Domain Confusion: Maximizing for Domain Invariance
  • Tzeng et al., arXiv:1412.3474, 2014

Page 13:

Deep Domain Confusion (DDC)

shared weights

• Minimize classification loss (together with the MMD-based domain-confusion loss)
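DDC's adaptation term can be made concrete with a small sketch. The following is a hedged, numpy-only illustration (function names are my own, not the authors' code) of the simplest linear-kernel MMD: the squared distance between the mean source and target features of the adaptation layer.

```python
import numpy as np

def linear_mmd2(source_feats, target_feats):
    """Squared MMD with a linear kernel: the squared distance between the
    empirical means of source- and target-domain deep features.
    DDC adds such a term (computed on an adaptation layer) to the
    classification loss to encourage domain-invariant features."""
    mu_s = source_feats.mean(axis=0)
    mu_t = target_feats.mean(axis=0)
    diff = mu_s - mu_t
    return float(diff @ diff)
```

In training, the total objective would be classification loss plus a weighted `linear_mmd2` term; identical feature distributions drive the term to zero.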

Page 14:

Domain Confusion by Domain-Adversarial Training

• Domain-Adversarial Training of Neural Networks (DANN)
  • Y. Ganin et al., ICML 2015
  • Maximize domain confusion = maximize domain classification loss
  • Minimize source-domain data classification loss
  • The derived feature f can be viewed as a disentangled & domain-invariant feature.
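The mechanism DANN uses to maximize domain confusion is the gradient reversal layer (GRL). A minimal sketch (an assumption for illustration, not the authors' code): identity in the forward pass, gradient negated and scaled in the backward pass, so the feature extractor ascends the domain-classification loss that the domain classifier descends.

```python
import numpy as np

def grl_forward(features):
    # Forward pass: pass the features through unchanged.
    return features

def grl_backward(grad_from_domain_classifier, lam=1.0):
    # Backward pass: flip the gradient sign (scaled by lam), so upstream
    # layers are updated to MAXIMIZE the domain-classification loss.
    return -lam * grad_from_domain_classifier
```

In a full framework this would be a custom autograd op; the two functions above capture its entire contract.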

Page 15:

Beyond Domain Confusion

• Domain Separation Network (DSN)
  • Bousmalis et al., NIPS 2016
  • Separate encoders for domain-invariant and domain-specific features
  • Private/common features are disentangled from each other.

Page 16:

Beyond Domain Confusion

• Domain Separation Network, NIPS 2016
• Example results

[Figure: source-domain image Xs with three reconstructions: private + shared features D(Ec(xs) + Ep(xs)), shared feature only D(Ec(xs)), private feature only D(Ep(xs)); target-domain image XT.]

Page 17:

Beyond Domain Confusion

• Domain Separation Network, NIPS 2016
• Example results

[Figure: source-domain image Xs and target-domain image XT.]

Page 18:

What to Cover Today…
• Transfer Learning for Visual Classification & Synthesis
  • Visual Classification
    • Domain Adaptation & Adversarial Learning
  • Visual Synthesis
    • Style Transfer
• Representation Disentanglement
  • Supervised vs. unsupervised feature disentanglement

Many slides from Richard Turner, Fei-Fei Li, Yaser Sheikh, Simon Lucey, Kaiming He, and J.-B. Huang

Page 19:

Transfer Learning for Manipulating Data?

• TL not only addresses cross-domain classification tasks.
• Let's see how we can synthesize and manipulate data across domains.
• As a computer vision guy, let's focus on visual data in this lecture…

Source Domain
Target Domain

Page 20:

Transfer Learning for Image Synthesis

• Cross-Domain Image Translation
  • Pix2pix (CVPR'17): Pairwise cross-domain training data
  • CycleGAN/DualGAN/DiscoGAN: Unpaired cross-domain training data
  • UNIT (NIPS'17): Learning cross-domain image representation (with unpaired training data)
  • DTN (ICLR'17): Learning cross-domain image representation (with unpaired training data)
  • Beyond image translation

Page 21:

A Super Brief Review for Generative Adversarial Networks (GAN)
• Architecture of GAN
• Loss:
L_GAN(G, D) = E_x[log(1 − D(G(x)))] + E_y[log D(y)]

Goodfellow et al., Generative Adversarial Nets, NIPS 2014
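The GAN loss above can be written directly as code. A minimal numpy sketch (illustrative only; in practice D would be a network and the two players alternate gradient steps):

```python
import numpy as np

def gan_loss(d_fake, d_real):
    """L_GAN(G, D) = E[log(1 - D(G(x)))] + E[log D(y)].
    d_fake: discriminator scores on generated samples D(G(x)),
    d_real: discriminator scores on real samples D(y).
    D is trained to maximize this value; G to minimize the first term."""
    return np.mean(np.log(1.0 - d_fake)) + np.mean(np.log(d_real))
```

A perfectly confused discriminator (all scores 0.5) gives 2·log(0.5); a perfect one (fakes scored 0, reals scored 1) gives 0, the maximum.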

Page 22:

Pix2pix

• Image-to-image translation with conditional adversarial networks (CVPR'17)
• Can be viewed as image style transfer

[Figure: Sketch → Photo]

Isola et al., "Image-to-Image Translation with Conditional Adversarial Networks," CVPR 2017.

Page 23:

Pix2pix
• Goal / Problem Setting
  • Image translation across two distinct domains (e.g., sketch vs. photo)
  • Pairwise training data
• Method: Conditional GAN
• Example: Sketch to Photo
  • Generator: input is a sketch, output is a photo
  • Discriminator: input is the concatenation of the input (sketch) and the synthesized/real (photo) image; output is Real or Fake

[Figure: training phase shows the discriminator scoring the concatenation of the input sketch with either the generated or the real photo; testing phase shows the generator mapping an input sketch to a generated photo.]

Isola et al., "Image-to-Image Translation with Conditional Adversarial Networks," CVPR 2017.

Page 24:

Pix2pix

• Learning the model

[Figure: training phase; the discriminator scores the concatenation of the input with the generated (fake) or real image.]

Conditional GAN loss:
L_cGAN(G, D) = E_x[log(1 − D(x, G(x)))] + E_{x,y}[log D(x, y)]

Reconstruction loss:
L_L1(G) = E_{x,y}[ ||y − G(x)||_1 ]

Overall objective function:
G* = arg min_G max_D L_cGAN(G, D) + L_L1(G)

Isola et al., "Image-to-Image Translation with Conditional Adversarial Networks," CVPR 2017.
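The pix2pix objective combines the conditional GAN term with the L1 reconstruction term. A hedged numpy sketch (not the authors' implementation; `lam` is the L1 weight hyperparameter, which the paper sets to a large value):

```python
import numpy as np

def l1_loss(y, g_x):
    # L_L1(G) = E[ ||y - G(x)||_1 ], the pixel-wise reconstruction term
    return np.mean(np.abs(y - g_x))

def cgan_loss(d_fake_pair, d_real_pair):
    # L_cGAN(G, D): the discriminator scores concatenated (input, output)
    # pairs rather than outputs alone.
    return np.mean(np.log(1.0 - d_fake_pair)) + np.mean(np.log(d_real_pair))

def pix2pix_objective(d_fake_pair, d_real_pair, y, g_x, lam=100.0):
    # Overall objective: G* = arg min_G max_D  L_cGAN + lam * L_L1
    return cgan_loss(d_fake_pair, d_real_pair) + lam * l1_loss(y, g_x)
```

The L1 term anchors the output to the paired ground truth, while the adversarial term pushes it toward the realistic image manifold.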

Page 25:

Pix2pix

• Experiment results

Demo page: https://affinelayer.com/pixsrv/

Isola et al., "Image-to-Image Translation with Conditional Adversarial Networks," CVPR 2017.

Page 26:

Transfer Learning for Image Synthesis

• Cross-Domain Image Translation
  • Pix2pix (CVPR'17): Pairwise cross-domain training data
  • CycleGAN/DualGAN/DiscoGAN: Unpaired cross-domain training data
  • UNIT (NIPS'17): Learning cross-domain image representation (with unpaired training data)
  • DTN (ICLR'17): Learning cross-domain image representation (with unpaired training data)
  • Beyond image translation

Page 27:

CycleGAN/DiscoGAN/DualGAN

• CycleGAN (ICCV'17)
  • Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
  • Easier to collect training data
  • More practical

Zhu et al., "Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks," ICCV 2017.

[Figure: paired training data (1-to-1 correspondence) vs. unpaired training data (no correspondence).]

Page 28:

CycleGAN

Zhu et al., "Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks," ICCV 2017.

• Goal / Problem Setting
  • Image translation across two distinct domains
  • Unpaired training data
• Idea
  • Autoencoding-like image translation
  • Cycle consistency between two domains

[Figure: unpaired training data (photos and paintings); cycle consistency: Photo → Painting → Photo and Painting → Photo → Painting.]

Page 29:

CycleGAN

Zhu et al., "Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks," ICCV 2017.

• Method (Example: Photo & Painting)
• Based on 2 GANs
  • First GAN (G1, D1): Photo to Painting
  • Second GAN (G2, D2): Painting to Photo
• Cycle Consistency
  • Photo consistency
  • Painting consistency

[Figure: G1 maps an input photo to a generated painting, which D1 judges real/fake against real paintings; G2 maps an input painting to a generated photo, which D2 judges against real photos.]

Page 30:

CycleGAN

Zhu et al., "Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks," ICCV 2017.

• Method (Example: Photo vs. Painting)
• Based on 2 GANs
  • First GAN (G1, D1): Photo to Painting
  • Second GAN (G2, D2): Painting to Photo
• Cycle Consistency
  • Photo consistency: Photo → Painting → Photo (G1 then G2)
  • Painting consistency: Painting → Photo → Painting (G2 then G1)

Page 31:

CycleGAN

Zhu et al., "Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks," ICCV 2017.

• Learning
• Adversarial Loss
  • First GAN (G1, D1): L_GAN(G1, D1) = E_x[log(1 − D1(G1(x)))] + E_y[log D1(y)]
  • Second GAN (G2, D2): L_GAN(G2, D2) = E_y[log(1 − D2(G2(y)))] + E_x[log D2(x)]

Overall objective function:
G1*, G2* = arg min_{G1,G2} max_{D1,D2} L_GAN(G1, D1) + L_GAN(G2, D2) + L_cyc(G1, G2)

[Figure: x (photo) → G1(x) (generated painting), judged real/fake by D1 against real paintings y; y (painting) → G2(y) (generated photo), judged by D2 against real photos x.]

Page 32:

CycleGAN

Zhu et al., "Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks," ICCV 2017.

• Learning
• Consistency Loss
  • Photo and Painting consistency

L_cyc(G1, G2) = E[ ||G2(G1(x)) − x||_1 + ||G1(G2(y)) − y||_1 ]

Overall objective function:
G1*, G2* = arg min_{G1,G2} max_{D1,D2} L_GAN(G1, D1) + L_GAN(G2, D2) + L_cyc(G1, G2)

Photo consistency: x → G1(x) → G2(G1(x))
Painting consistency: y → G2(y) → G1(G2(y))
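The cycle-consistency term is just two L1 reconstruction errors. A minimal numpy sketch (an illustration under assumed names, not the official implementation), where the reconstructed arrays stand for G2(G1(x)) and G1(G2(y)):

```python
import numpy as np

def cycle_consistency_loss(x, x_cycled, y, y_cycled):
    """L_cyc(G1, G2) = E[ ||G2(G1(x)) - x||_1 + ||G1(G2(y)) - y||_1 ].
    x_cycled: x after the round trip photo -> painting -> photo;
    y_cycled: y after the round trip painting -> photo -> painting."""
    photo_consistency = np.mean(np.abs(x_cycled - x))
    painting_consistency = np.mean(np.abs(y_cycled - y))
    return photo_consistency + painting_consistency
```

Perfect round trips drive the term to zero, which is what lets CycleGAN train without paired data: the two adversarial losses enforce domain realism while this term preserves content.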

Page 33:

CycleGAN

Zhu et al., "Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks," ICCV 2017.

• Example results

Project Page: https://junyanz.github.io/CycleGAN/

Page 34:

Image Translation Using Unpaired Training Data


Zhu et al., "Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks," ICCV 2017.
Kim et al., "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks," ICML 2017.
Yi et al., "DualGAN: Unsupervised Dual Learning for Image-to-Image Translation," ICCV 2017.

• CycleGAN, DiscoGAN, and DualGAN
  • CycleGAN (ICCV'17)
  • DiscoGAN (ICML'17)
  • DualGAN (ICCV'17)

Page 35:

Transfer Learning for Image Synthesis

• Cross-Domain Image Translation
  • Pix2pix (CVPR'17): Pairwise cross-domain training data
  • CycleGAN/DualGAN/DiscoGAN: Unpaired cross-domain training data
  • UNIT (NIPS'17): Learning cross-domain image representation (with unpaired training data)
  • DTN (ICLR'17): Learning cross-domain image representation (with unpaired training data)
  • Beyond image translation

Page 36:

UNIT

• Unsupervised Image-to-Image Translation Networks (NIPS'17)
• Image translation via learning cross-domain joint representation

Liu et al., "Unsupervised Image-to-Image Translation Networks," NIPS 2017.

[Figure: images x1 ∈ X1 and x2 ∈ X2 (e.g., day/night street views) map to a joint latent space Z. Stage 1: encode to the joint space z. Stage 2: generate cross-domain images from z.]

Page 37:

UNIT

• Goal/Problem Setting
  • Image translation across two distinct domains
  • Unpaired training image data
• Idea
  • Based on two parallel VAE-GAN models

Liu et al., "Unsupervised Image-to-Image Translation Networks," NIPS 2017.

Page 38:

UNIT

• Goal/Problem Setting
  • Image translation across two distinct domains
  • Unpaired training image data
• Idea
  • Based on two parallel VAE-GAN models
  • Learning of joint representation across image domains

Liu et al., "Unsupervised Image-to-Image Translation Networks," NIPS 2017.

Page 39:

UNIT

• Goal/Problem Setting
  • Image translation across two distinct domains
  • Unpaired training image data
• Idea
  • Based on two parallel VAE-GAN models
  • Learning of joint representation across image domains
  • Generate cross-domain images from joint representation

Liu et al., "Unsupervised Image-to-Image Translation Networks," NIPS 2017.

Page 40:

UNIT

• Learning

Overall objective function:
G* = arg min_G max_D L_VAE(E1, G1, E2, G2) + L_GAN(G1, D1, G2, D2)

Variational autoencoder loss:
L_VAE(E1, G1, E2, G2) = E[ ||G1(E1(x1)) − x1||² ] + E[ KL(q1(z) || p(z)) ] + E[ ||G2(E2(x2)) − x2||² ] + E[ KL(q2(z) || p(z)) ]

Adversarial loss:
L_GAN(G1, D1, G2, D2) = E[log(1 − D1(G1(z)))] + E[log D1(y1)] + E[log(1 − D2(G2(z)))] + E[log D2(y2)]

[Figure: two parallel VAE-GAN branches (E1, G1, D1) and (E2, G2, D2) share the latent z; G1(z) and G2(z) are the generated images.]
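The two ingredients of the VAE half of the objective can be sketched per branch. This is a hedged, numpy-only illustration (variable and function names are assumptions, not the authors' code): an L2 reconstruction term, plus the closed-form KL divergence pulling q(z) = N(mu, diag(exp(logvar))) toward the standard normal prior p(z).

```python
import numpy as np

def reconstruction_loss(x, x_reconstructed):
    # E[ ||G(E(x)) - x||^2 ]: squared error between input and reconstruction
    return np.mean((x_reconstructed - x) ** 2)

def kl_to_standard_normal(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims:
    # 0.5 * sum( sigma^2 + mu^2 - 1 - log sigma^2 )
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)
```

The KL term is zero exactly when the posterior matches the prior (mu = 0, logvar = 0), which is what ties the two branches to a shared, well-behaved latent space.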

Page 41:

UNIT

• Learning (cont'd): the same overall objective, G* = arg min_G max_D L_VAE(E1, G1, E2, G2) + L_GAN(G1, D1, G2, D2); the figure contrasts the generated samples G1(z), G2(z) with real samples.

Page 42:

UNIT

• Example results

Liu et al., "Unsupervised Image-to-Image Translation Networks," NIPS 2017.

Sunny → Rainy
Rainy → Sunny
Real Street-view → Synthetic Street-view
Synthetic Street-view → Real Street-view

Github Page: https://github.com/mingyuliutw/UNIT

Page 43:

Transfer Learning for Image Synthesis

• Cross-Domain Image Translation
  • Pix2pix (CVPR'17): Pairwise cross-domain training data
  • CycleGAN/DualGAN/DiscoGAN: Unpaired cross-domain training data
  • UNIT (NIPS'17): Learning cross-domain image representation (with unpaired training data)
  • DTN (ICLR'17): Learning cross-domain image representation (with unpaired training data)
  • Beyond image translation

Page 44:

Domain Transfer Networks

• Unsupervised Cross-Domain Image Generation (ICLR'17)
• Goal/Problem Setting
  • Image translation across two domains
  • One-way only translation
  • Unpaired training data
• Idea
  • Apply unified model to learn joint representation across domains.

Taigman et al., "Unsupervised Cross-Domain Image Generation," ICLR 2017.

Page 45:

Domain Transfer Networks

• Unsupervised Cross-Domain Image Generation (ICLR'17)
• Goal/Problem Setting
  • Image translation across two domains
  • One-way only translation
  • Unpaired training data
• Idea
  • Apply unified model to learn joint representation across domains.
  • Consistency observed in image and feature spaces

Taigman et al., "Unsupervised Cross-Domain Image Generation," ICLR 2017.

[Figure: image consistency and feature consistency.]

Page 46:

• Learning
  • Unified model to translate across domains
  • Consistency of feature and image space
  • Adversarial loss

Overall objective function:
G* = arg min_G max_D L_img(G) + L_feat(G) + L_GAN(G, D)

L_img(G) = E_y[ ||g(f(y)) − y||² ]
L_feat(G) = E_x[ ||f(g(f(x))) − f(x)||² ]
L_GAN(G, D) = E_x[log(1 − D(G(x)))] + E_y[log(1 − D(G(y)))] + E_y[log D(y)]

(with G = {f, g}: f maps an image to the feature space, g generates an image from a feature)

Domain Transfer Networks

Page 47:

• Learning: the same objective as above (L_img for image consistency, L_feat for feature consistency, plus the adversarial loss L_GAN), with G = {f, g}.

Domain Transfer Networks

Page 48:

• Learning: the same objective; the figure highlights the translated outputs G(x) and G(y) that the discriminator D scores against real target-domain samples y.

Domain Transfer Networks
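DTN's two consistency terms can be sketched directly from the definitions above. A hedged numpy illustration (names are my own; the toy identity mappings in the test are placeholders, not real networks), with G = {f, g}:

```python
import numpy as np

def dtn_consistency_losses(f, g, x_source, y_target):
    """L_img(G)  = E_y[ ||g(f(y)) - y||^2 ]  : G acts as the identity on
                                               target-domain images.
    L_feat(G) = E_x[ ||f(g(f(x))) - f(x)||^2 ]: the feature of a translated
                                               source image matches the
                                               feature of the original."""
    l_img = np.mean((g(f(y_target)) - y_target) ** 2)
    l_feat = np.mean((f(g(f(x_source))) - f(x_source)) ** 2)
    return l_img, l_feat
```

Together with the adversarial loss, these terms let DTN translate one-way (source to target) without paired data.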

Page 49:


Page 50:

DTN

• Example results

Taigman et al., "Unsupervised Cross-Domain Image Generation," ICLR 2017.

SVHN → MNIST | Photo → Emoji

Page 51:

What to Cover Today…
• Transfer Learning for Visual Classification & Synthesis
  • Visual Classification
    • Domain Adaptation & Adversarial Learning
  • Visual Synthesis
    • Style Transfer
• Representation Disentanglement
  • Supervised vs. unsupervised feature disentanglement

Many slides from Richard Turner, Fei-Fei Li, Yaser Sheikh, Simon Lucey, Kaiming He, and J.-B. Huang

Page 52:

Beyond Image Style Transfer: Learning Interpretable Deep Representations
• Faceapp: putting a smile on your face!
• Deep learning for representation disentanglement
• Interpretable deep feature representation

Input: Mr. Takeshi Kaneshiro

Page 53:

Recall: Generative Adversarial Networks (GAN)

• Architecture of GAN
• Loss

Goodfellow et al., Generative Adversarial Nets, NIPS, 2014

โ„’๐บ๐บ๐บ๐บ๐บ๐บ ๐บ๐บ,๐ท๐ท = ๐”ผ๐”ผ log 1 โˆ’ ๐ท๐ท (๐บ๐บ(๐‘ฅ๐‘ฅ)) + ๐”ผ๐”ผ log๐ท๐ท ๐‘ฆ๐‘ฆ


53

Page 54:

Representation Disentanglement

• Goal
  • Interpretable deep feature representation
  • Disentangle the attribute of interest c from the derived latent representation z
  • Possible solutions: VAE, GAN, or a mix of them…

[Diagram: generator G takes an uninterpretable latent feature z plus an interpretable factor c (e.g., season)]

54

Page 55:

Representation Disentanglement

• Goal
  • Interpretable deep feature representation
  • Disentangle the attribute of interest c from the derived latent representation z

• Supervised setting: from VAE to conditional VAE

55

Page 56:

Representation Disentanglement

• Conditional VAE
  • Given training data x and attribute of interest c, we model the conditional distribution p_θ(x|c).
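A minimal conditional-VAE sketch of this idea: the condition c (here a one-hot attribute) is concatenated to both the encoder input and the decoder input, so the decoder models p(x | z, c). All dimensions and the single-layer networks are illustrative assumptions, not taken from the slides.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x_dim, c_dim, z_dim = 784, 10, 20          # illustrative sizes (e.g., flattened MNIST)

enc = nn.Linear(x_dim + c_dim, 2 * z_dim)  # outputs [mu, log_var] of q(z|x,c)
dec = nn.Linear(z_dim + c_dim, x_dim)      # models p(x|z,c)

def cvae_loss(x, c):
    h = enc(torch.cat([x, c], dim=1))
    mu, log_var = h.chunk(2, dim=1)
    # Reparameterization trick: z = mu + sigma * noise
    z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
    x_hat = torch.sigmoid(dec(torch.cat([z, c], dim=1)))
    recon = F.binary_cross_entropy(x_hat, x, reduction='sum')   # E[-log p(x|z,c)]
    kld = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())  # KL(q || N(0, I))
    return recon + kld

x = torch.rand(8, x_dim)                                  # batch of images in [0, 1]
c = F.one_hot(torch.randint(0, c_dim, (8,)), c_dim).float()  # attribute labels
loss = cvae_loss(x, c)
```

At test time, sampling z ~ N(0, I) and decoding with a chosen c generates an image carrying that attribute.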

https://zhuanlan.zhihu.com/p/25518643 56

Page 57:

Representation Disentanglement

• Conditional VAE
  • Example results

57

Page 58:

Representation Disentanglement

• Conditional GAN
  • Interpretable latent factor c
  • Latent representation z
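The conditional-GAN conditioning can be sketched as below: both the generator and the discriminator receive the interpretable factor c, concatenated with z or with the image. This is a hypothetical minimal sketch; the linear layers and dimensions are illustrative only.

```python
import torch
import torch.nn as nn

z_dim, c_dim, x_dim = 64, 10, 784                       # illustrative sizes

G = nn.Linear(z_dim + c_dim, x_dim)                     # generator G(z, c)
D = nn.Sequential(nn.Linear(x_dim + c_dim, 1), nn.Sigmoid())  # conditional discriminator

z = torch.randn(4, z_dim)
c = torch.eye(c_dim)[torch.randint(0, c_dim, (4,))]     # one-hot condition
fake = G(torch.cat([z, c], dim=1))                      # G conditioned on c
score = D(torch.cat([fake, c], dim=1))                  # D also sees the condition c
```

Because D judges (image, c) pairs, G is pushed to generate images consistent with the given factor c.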

Mirza & Osindero, "Conditional Generative Adversarial Nets," https://arxiv.org/abs/1411.1784  58

Page 59:

Representation Disentanglement

• Goal
  • Interpretable deep feature representation
  • Disentangle the attribute of interest c from the derived latent representation z

• Unsupervised: InfoGAN (Chen et al., NIPS '16)
• Supervised: AC-GAN (Odena et al., ICML '17)

Chen et al., "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets," NIPS 2016.
Odena et al., "Conditional Image Synthesis with Auxiliary Classifier GANs," ICML 2017.  59

Page 60:

AC-GAN

Odena et al., "Conditional Image Synthesis with Auxiliary Classifier GANs," ICML 2017

• Supervised Disentanglement

• Learning
  • Overall objective function:

    G* = arg min_G max_D  โ„’_GAN(G, D) + โ„’_cls(G, D)

  • Adversarial loss:

    โ„’_GAN(G, D) = ๐”ผ[log(1 − D(G(z, c)))] + ๐”ผ[log D(y)]

  • Disentanglement loss:

    โ„’_cls(G, D) = ๐”ผ[−log D_cls(c′|y)] + ๐”ผ[−log D_cls(c|G(z, c))]

    (real data y is classified w.r.t. its own domain label c′; generated data G(z, c) w.r.t. the assigned label c)

[Diagram: G maps (z, c) to G(z, c); D scores real y against G(z, c) and also classifies the label]

60
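The AC-GAN losses can be sketched with a two-headed discriminator: one head scores real vs. fake, and an auxiliary classifier head D_cls predicts the attribute c. This is a hypothetical minimal sketch; the network shapes and variable names are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

z_dim, c_dim, x_dim = 64, 10, 784          # illustrative sizes

G = nn.Linear(z_dim + c_dim, x_dim)        # generator G(z, c)
trunk = nn.Linear(x_dim, 128)              # shared discriminator trunk
adv_head = nn.Linear(128, 1)               # real/fake head
cls_head = nn.Linear(128, c_dim)           # auxiliary classifier D_cls

def D(img):
    h = torch.relu(trunk(img))
    return torch.sigmoid(adv_head(h)), cls_head(h)  # (adversarial score, class logits)

y = torch.rand(4, x_dim)                   # "real" images
y_c = torch.randint(0, c_dim, (4,))        # their ground-truth labels c'
c = torch.randint(0, c_dim, (4,))          # labels assigned to generated samples
z = torch.randn(4, z_dim)
fake = G(torch.cat([z, F.one_hot(c, c_dim).float()], dim=1))  # G(z, c)

adv_real, cls_real = D(y)
adv_fake, cls_fake = D(fake)

eps = 1e-8
loss_gan = torch.log(1 - adv_fake + eps).mean() + torch.log(adv_real + eps).mean()
# L_cls: real data classified by its own label, generated data by the assigned label
loss_cls = F.cross_entropy(cls_real, y_c) + F.cross_entropy(cls_fake, c)
```

Since the supervision pins down what c means, G learns to control that attribute explicitly.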

Page 61:

AC-GAN

Odena et al., "Conditional Image Synthesis with Auxiliary Classifier GANs," ICML 2017

• Supervised Disentanglement

[Example: with z fixed, varying c in G(z, c) produces images with different values of the supervised attribute]

61

Page 62:

InfoGAN

• Unsupervised Disentanglement

• Learning
  • Overall objective function:

    G* = arg min_G max_D  โ„’_GAN(G, D) + โ„’_cls(G, D)

  • Adversarial loss:

    โ„’_GAN(G, D) = ๐”ผ[log(1 − D(G(z, c)))] + ๐”ผ[log D(y)]

  • Disentanglement loss (generated data only, w.r.t. the assigned code):

    โ„’_cls(G, D) = ๐”ผ[−log D_cls(c|G(z, c))]

Chen et al., "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets," NIPS 2016.  62
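The key contrast with AC-GAN: InfoGAN is unsupervised, so the disentanglement term only asks an auxiliary head Q to recover the latent code c from generated images (a variational lower bound on the mutual information between c and G(z, c)); no label on real data is ever used. A hypothetical minimal sketch, with illustrative shapes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

z_dim, c_dim, x_dim = 62, 10, 784          # illustrative sizes

G = nn.Linear(z_dim + c_dim, x_dim)        # generator G(z, c)
trunk = nn.Linear(x_dim, 128)              # shared with the discriminator in practice
Q = nn.Linear(128, c_dim)                  # auxiliary head recovering c from G(z, c)

c = torch.randint(0, c_dim, (4,))          # randomly sampled latent code
z = torch.randn(4, z_dim)
fake = G(torch.cat([z, F.one_hot(c, c_dim).float()], dim=1))

# Mutual-information term: -log Q(c | G(z, c)), applied to generated data only
loss_info = F.cross_entropy(Q(torch.relu(trunk(fake))), c)
```

Because c is sampled rather than labeled, the model decides for itself which factor c captures, which is why there is no guarantee it disentangles a particular semantic.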

Page 63:

InfoGAN

• Unsupervised Disentanglement

Chen et al., "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets," NIPS 2016.

• No guarantee of disentangling particular semantics

Different ๐‘๐‘

Rotation Angle Width

Training process

Different ๐‘๐‘

Time

Loss

63

Page 64:

What to Cover Today…
• Transfer Learning for Visual Classification & Synthesis
  • Visual Classification
    • Domain Adaptation & Adversarial Learning
  • Visual Synthesis
    • Style Transfer
  • Representation Disentanglement
    • Supervised vs. unsupervised feature disentanglement
    • Joint style transfer & feature disentanglement

Many slides from Richard Turner, Fei-Fei Li, Yaser Sheikh, Simon Lucey, Kaiming He, and J.-B. Huang 64

Page 65:

StarGAN

• Goal
  • Unified GAN for multi-domain image-to-image translation

Choi et al. "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation." CVPR 2018

[Figure: traditional cross-domain models vs. the unified multi-domain model (StarGAN)]

65

Page 66:

StarGAN

• Goal
  • Unified GAN for multi-domain image-to-image translation

Choi et al. "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation." CVPR 2018

[Figure: traditional cross-domain models require a separate generator G_ij per domain pair; StarGAN uses one unified G for all domains]

66

Page 67:

StarGAN

• Goal / Problem Setting
  • Single image-translation model across multiple domains
  • Unpaired training data

67

Page 68:

StarGAN

• Goal / Problem Setting
  • Single image-translation model across multiple domains
  • Unpaired training data

• Idea
  • Concatenate the image and the target domain label as input to the generator
  • Auxiliary domain classifier on the discriminator

[Diagram: target-domain label concatenated with the input image]

68

Page 69:

StarGAN

• Goal / Problem Setting
  • Single image-translation model across multiple domains
  • Unpaired training data

• Idea
  • Concatenate the image and the target domain label as input to the generator
  • Auxiliary domain classifier as part of the discriminator

69

Page 70:

StarGAN

• Goal / Problem Setting
  • Single image-translation model across multiple domains
  • Unpaired training data

• Idea
  • Concatenate the image and the target domain label as input to the generator
  • Auxiliary domain classifier on the discriminator
  • Cycle consistency across domains

70

Page 71:

StarGAN

• Goal / Problem Setting
  • Single image-translation model across multiple domains
  • Unpaired training data

• Idea
  • Auxiliary domain classifier as part of the discriminator
  • Concatenate the image and the target domain label as input
  • Cycle consistency across domains

71

Page 72:

StarGAN

• Learning
  • Overall objective function:

    G* = arg min_G max_D  โ„’_GAN(G, D) + โ„’_cls(G, D) + โ„’_cyc(G)

72

Page 73:

StarGAN

• Learning
  • Overall objective function:

    G* = arg min_G max_D  โ„’_GAN(G, D) + โ„’_cls(G, D) + โ„’_cyc(G)

  • Adversarial loss:

    โ„’_GAN(G, D) = ๐”ผ[log(1 − D(G(x, c)))] + ๐”ผ[log D(y)]

[Diagram: input x with target label c produces G(x, c); D distinguishes it from real y]

73

Page 74:

StarGAN

• Learning
  • Overall objective function:

    G* = arg min_G max_D  โ„’_GAN(G, D) + โ„’_cls(G, D) + โ„’_cyc(G)

  • Adversarial loss:

    โ„’_GAN(G, D) = ๐”ผ[log(1 − D(G(x, c)))] + ๐”ผ[log D(y)]

  • Domain classification loss (disentanglement):

    โ„’_cls(G, D) = ๐”ผ[−log D_cls(c′|y)] + ๐”ผ[−log D_cls(c|G(x, c))]

74

Page 75:

StarGAN

• Learning
  • Overall objective function:

    G* = arg min_G max_D  โ„’_GAN(G, D) + โ„’_cls(G, D) + โ„’_cyc(G)

  • Adversarial loss:

    โ„’_GAN(G, D) = ๐”ผ[log(1 − D(G(x, c)))] + ๐”ผ[log D(y)]

  • Domain classification loss (disentanglement):

    โ„’_cls(G, D) = ๐”ผ[−log D_cls(c′|y)] + ๐”ผ[−log D_cls(c|G(x, c))]

    (real data y is classified w.r.t. its own domain label: D_cls(c′|y))

75

Page 76:

StarGAN

• Learning
  • Overall objective function:

    G* = arg min_G max_D  โ„’_GAN(G, D) + โ„’_cls(G, D) + โ„’_cyc(G)

  • Adversarial loss:

    โ„’_GAN(G, D) = ๐”ผ[log(1 − D(G(x, c)))] + ๐”ผ[log D(y)]

  • Domain classification loss (disentanglement):

    โ„’_cls(G, D) = ๐”ผ[−log D_cls(c′|y)] + ๐”ผ[−log D_cls(c|G(x, c))]

    (generated data G(x, c) is classified w.r.t. the assigned label: D_cls(c|G(x, c)))

76

Page 77:

StarGAN

• Learning
  • Overall objective function:

    G* = arg min_G max_D  โ„’_GAN(G, D) + โ„’_cls(G, D) + โ„’_cyc(G)

  • Adversarial loss:

    โ„’_GAN(G, D) = ๐”ผ[log(1 − D(G(x, c)))] + ๐”ผ[log D(y)]

  • Domain classification loss (disentanglement):

    โ„’_cls(G, D) = ๐”ผ[−log D_cls(c′|y)] + ๐”ผ[−log D_cls(c|G(x, c))]

  • Cycle consistency loss:

    โ„’_cyc(G) = ๐”ผ[ ‖G(G(x, c), c_x) − x‖₁ ]

[Diagram: G(x, c) is mapped back with the original domain label c_x, so G(G(x, c), c_x) should reconstruct x]

77

Page 78:

StarGAN

• Learning
  • Overall objective function:

    G* = arg min_G max_D  โ„’_GAN(G, D) + โ„’_cls(G, D) + โ„’_cyc(G)

  • Adversarial loss:

    โ„’_GAN(G, D) = ๐”ผ[log(1 − D(G(x, c)))] + ๐”ผ[log D(y)]

  • Domain classification loss:

    โ„’_cls(G, D) = ๐”ผ[−log D_cls(c′|y)] + ๐”ผ[−log D_cls(c|G(x, c))]

  • Cycle consistency loss:

    โ„’_cyc(G) = ๐”ผ[ ‖G(G(x, c), c_x) − x‖₁ ]

78

Page 79:

StarGAN

• Example results

• In this sense, StarGAN can also be viewed as a representation disentanglement model, not merely an image-translation one.

Choi et al. "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation." CVPR 2018


Github Page: https://github.com/yunjey/StarGAN

79

Page 80:

What to Cover Today…
• Transfer Learning for Visual Classification & Synthesis
  • Visual Classification
    • Domain Adaptation & Adversarial Learning
  • Visual Synthesis
    • Style Transfer
  • Representation Disentanglement
    • Supervised vs. unsupervised feature disentanglement
    • Joint style transfer & feature disentanglement

Many slides from Richard Turner, Fei-Fei Li, Yaser Sheikh, Simon Lucey, Kaiming He, and J.-B. Huang 80

Page 81:

A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation

81

• Learning interpretable representations

Page 82:

A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation

82

• Learning interpretable representations

Page 83:

A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation

83

• Learning interpretable representations

Page 84:

Example Results

• Face image translation

84

Page 85:

Example Results

• Multi-attribute image translation

85

Page 86:

Next Week

86

Guest Lectures:
1. "The Paradigm Shift in AI"
   - 2:20pm-3:10pm
   - Dr. Trista Chen, Chief Scientist of Machine Learning, Inventec Corp.

2. "Sharing on AI Medical Startups" (AI้†ซ็™‚ๅ‰ตๆฅญๅˆ†ไบซ)
   - 3:30pm-4:20pm
   - David Chou, Founder & CEO of Deep01

