57
VAE variants Hao Dong Peking University 1

Lecture 9 VAE variants 50mins - Deep Generative Models

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Lecture 9 VAE variants 50mins - Deep Generative Models

VAE variantsHao Dong

Peking University

1

Page 2: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

2

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 3: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• IWAE• β-VAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

3

VAE variants

Hierarchical representation learning

Temporal representa3on learning

Representation learning

Page 4: Lecture 9 VAE variants 50mins - Deep Generative Models

4

Convolutional Variational Autoencoder

• Limitations of vanilla VAE• The size of weight of fully connected layer == input size x output size• If VAE uses fully connected layers only, will lead to curse of dimensionality

when the input dimension is large (e.g., image).

• Solution

Image is modified from: Deep Clustering with Convolutional Autoencoder. NIPS 2017.

Fully connected layers here(independent variables)

Page 5: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

5

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 6: Lecture 9 VAE variants 50mins - Deep Generative Models

6

Conditional Variational Autoencoder

• Train and inference with labeled data.

Learning structured output representa8on using deep condi8onal genera8ve models. NeurIPS 2015.

Page 7: Lecture 9 VAE variants 50mins - Deep Generative Models

7

Recap: Variational Autoencoder

Auto-Encoding Variational Bayes. Diederik P. Kingma, Max Welling. ICLR 2013

• Recap: Setting up the objective• Maximize P(X)• Set Q(z) to be an arbitrary distribution

Page 8: Lecture 9 VAE variants 50mins - Deep Generative Models

8

Recap: Variational Autoencoder

Auto-Encoding Variational Bayes. Diederik P. Kingma, Max Welling. ICLR 2013

• Recap: Setting up the objective

encoder ideal reconstruction KLD

Page 9: Lecture 9 VAE variants 50mins - Deep Generative Models

9

Conditional Variational Autoencoder

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

• Setting up the objective with labels• Maximize P(Y|X)• Set Q(z) to be an arbitrary distribution

Page 10: Lecture 9 VAE variants 50mins - Deep Generative Models

10

Conditional Variational Autoencoder

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

• Setting up the objective

encoder ideal

reconstruction KLD

Page 11: Lecture 9 VAE variants 50mins - Deep Generative Models

11

Conditional Variational Autoencoder

• Train and inference without labeled data i.e., vanilla VAE

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

Page 12: Lecture 9 VAE variants 50mins - Deep Generative Models

12

Conditional Variational Autoencoder

• Train and inference with labeled data.

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

Page 13: Lecture 9 VAE variants 50mins - Deep Generative Models

13

Conditional Variational Autoencoder

• Train and inference with labeled data.

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

Page 14: Lecture 9 VAE variants 50mins - Deep Generative Models

14

Conditional Variational Autoencoder

• Train and inference with labeled data.

Learning structured output representation using deep conditional generative models. NeurIPS 2015.

Page 15: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

15

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 16: Lecture 9 VAE variants 50mins - Deep Generative Models

16

Before we start

• Disentangled / Factorized representation• Each variable in the inferred latent representation is only sensitive to one

single generative factor and relatively invariant to other factors• Good interpretability and easy generalization to a variety of tasks

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 17: Lecture 9 VAE variants 50mins - Deep Generative Models

17

Before we start

• Unsupervised hierarchical representation learning

Rethinking Style and Content Disentanglement in Variational Autoencoders. ICLR 2018 Workshops.

Page 18: Lecture 9 VAE variants 50mins - Deep Generative Models

18

β-VAE

• Unsupervised representation learning• Augment the original VAE framework with a single hyper-parameter β

that modulates the learning constraints• Impose a limit on the capacity of the latent information channel

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 19: Lecture 9 VAE variants 50mins - Deep Generative Models

19

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 20: Lecture 9 VAE variants 50mins - Deep Generative Models

20

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 21: Lecture 9 VAE variants 50mins - Deep Generative Models

21

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 22: Lecture 9 VAE variants 50mins - Deep Generative Models

22

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 23: Lecture 9 VAE variants 50mins - Deep Generative Models

23

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 24: Lecture 9 VAE variants 50mins - Deep Generative Models

24

β-VAE

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 25: Lecture 9 VAE variants 50mins - Deep Generative Models

25

β-VAE

• Discussion: It is really unsupervised?

• It is unsupervised/self-supervised learning, because it does not need any label data

• It is not fully unsupervised learning, it works because of the inductive bias of the neural network model, the hierarchical design introduces prior knowledge about the data

β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.

Page 26: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

26

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 27: Lecture 9 VAE variants 50mins - Deep Generative Models

27

IWAE

Importance Weighted Autoencoders. ICLR 2016.

• Optimize a tighter lower bound than VAE• VAE just optimizes a lower bound of log P(X)

encoder ideal reconstruction KLD

-ELBO

Page 28: Lecture 9 VAE variants 50mins - Deep Generative Models

28

IWAE

Importance Weighted Autoencoders. ICLR 2016.

• Optimize a tighter lower bound than VAE

Page 29: Lecture 9 VAE variants 50mins - Deep Generative Models

29

IWAE

Importance Weighted Autoencoders. ICLR 2016.

Page 30: Lecture 9 VAE variants 50mins - Deep Generative Models

30

IWAE

Importance Weighted Autoencoders. ICLR 2016.

• Why “Importance weighted”

where

Page 31: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

31

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 32: Lecture 9 VAE variants 50mins - Deep Generative Models

32

Ladder VAE

• To learn hierarchical latent representation• Deep models with several layers of dependent stochastic variables are difficult to train

• Limiting the improvements obtained using these highly expressive models

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 33: Lecture 9 VAE variants 50mins - Deep Generative Models

33

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

encode decode encode decode

Hierarchical VAE Ladder VAE

Page 34: Lecture 9 VAE variants 50mins - Deep Generative Models

34

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 35: Lecture 9 VAE variants 50mins - Deep Generative Models

35

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 36: Lecture 9 VAE variants 50mins - Deep Generative Models

36

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 37: Lecture 9 VAE variants 50mins - Deep Generative Models

37

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 38: Lecture 9 VAE variants 50mins - Deep Generative Models

38

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 39: Lecture 9 VAE variants 50mins - Deep Generative Models

39

Ladder VAE

LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.

Page 40: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

40

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 41: Lecture 9 VAE variants 50mins - Deep Generative Models

41

Progressive + Fade-in VAE

• Discussion

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

encoders decoders

high-level

middle-level

low-level

Can we directly train a hierarchical VAE with ladder structure like that?

Page 42: Lecture 9 VAE variants 50mins - Deep Generative Models

42

Progressive + Fade-in VAE

• Discussion

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

encoders decoders

high-level

middle-level

low-level

Information SHORTCUT problemall information go through the low-level path,other paths are ignored.model is lazy…

Page 43: Lecture 9 VAE variants 50mins - Deep Generative Models

43

Progressive + Fade-in VAE

• Progressive + Fade-in

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

Stage 1: enable high-level path Stage 2: enable high-level and middle paths

Stage 3: enable all paths …

𝛼 increase from 0 to 1 gradually

Page 44: Lecture 9 VAE variants 50mins - Deep Generative Models

44

Progressive + Fade-in VAE

• Results

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

High-level: background and foreground colors

Middle-level: shape

Low-level: …

Page 45: Lecture 9 VAE variants 50mins - Deep Generative Models

45

Progressive + Fade-in VAE

• Results

Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.

High-level Middle-level Low-level

Page 46: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

46

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 47: Lecture 9 VAE variants 50mins - Deep Generative Models

47

VAE in speech

• Learning latent representations for style control and transfer in end-to-end speech synthesis

• RNN as encoder

Learning latent representations for style control and transfer in end-to-end speech synthesis. ICASSP 2019.

Page 48: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

48

VAE variants

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 49: Lecture 9 VAE variants 50mins - Deep Generative Models

49

TD-VAE

• To model temporal information

Temporal Difference VAE. ICLR 2019.

Page 50: Lecture 9 VAE variants 50mins - Deep Generative Models

50

TD-VAE

• State-space model as a Markov Chain model

Temporal Difference VAE. ICLR 2019.

Page 51: Lecture 9 VAE variants 50mins - Deep Generative Models

51

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 52: Lecture 9 VAE variants 50mins - Deep Generative Models

52

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 53: Lecture 9 VAE variants 50mins - Deep Generative Models

53

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 54: Lecture 9 VAE variants 50mins - Deep Generative Models

54

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 55: Lecture 9 VAE variants 50mins - Deep Generative Models

55

TD-VAE

Temporal Difference VAE. ICLR 2019.

Page 56: Lecture 9 VAE variants 50mins - Deep Generative Models

• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)

56

Summary

Hierarchical representation learning

Temporal representation learning

Representation learning

Page 57: Lecture 9 VAE variants 50mins - Deep Generative Models

Thanks

57