VAE variants
Hao Dong
Peking University
1
• Convolutional VAE
• Conditional VAE
• β-VAE
• IWAE
• Ladder VAE
• Progressive + Fade-in VAE
• VAE in speech
• Temporal Difference VAE (TD-VAE)
2
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
• Convolutional VAE
• Conditional VAE
• IWAE
• β-VAE
• Ladder VAE
• Progressive + Fade-in VAE
• VAE in speech
• Temporal Difference VAE (TD-VAE)
3
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
4
Convolutional Variational Autoencoder
• Limitations of the vanilla VAE
• The number of weights in a fully connected layer equals input size × output size
• A VAE built only from fully connected layers therefore suffers from the curse of dimensionality when the input dimension is large (e.g., an image)
• Solution: replace the fully connected layers with convolutional layers
Image is modified from: Deep Clustering with Convolutional Autoencoder. NIPS 2017.
Fully connected layers here (independent variables)
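The parameter-count argument above can be checked with a quick sketch (the 64×64 RGB input and the layer sizes are hypothetical examples, not from the slides):

```python
# Weight count of a fully connected layer is input_size * output_size
# (biases ignored); a conv layer needs only k * k * in_ch * out_ch weights,
# independent of the spatial size of the image.
def fc_params(in_size, out_size):
    return in_size * out_size

def conv_params(k, in_ch, out_ch):
    return k * k * in_ch * out_ch

# Hypothetical example: 64x64 RGB image flattened into a 512-unit hidden layer.
fc = fc_params(64 * 64 * 3, 512)   # 6,291,456 weights
conv = conv_params(3, 3, 64)       # 1,728 weights for a 3x3 conv, 3 -> 64 channels
print(fc, conv)
```

The convolutional layer's weight count does not grow with image resolution, which is why convolutional VAEs scale to images.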
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
5
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
6
Conditional Variational Autoencoder
• Training and inference with labelled data.
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
7
Recap: Variational Autoencoder
Auto-Encoding Variational Bayes. Diederik P. Kingma, Max Welling. ICLR 2013
• Recap: Setting up the objective
• Maximise P(X)
• Set Q(z) to be an arbitrary distribution
8
Recap: Variational Autoencoder
Auto-Encoding Variational Bayes. Diederik P. Kingma, Max Welling. ICLR 2013
• Recap: Setting up the objective
log P(X) − D[Q(z|X) ‖ P(z|X)] = E_{z∼Q}[log P(X|z)] − D[Q(z|X) ‖ P(z)]
(Q(z|X): encoder; P(z|X): ideal posterior; first right-hand term: reconstruction; second: KLD)
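The KLD term has a closed form when Q(z|X) is a diagonal Gaussian and the prior P(z) is a standard normal; a minimal sketch for one sample:

```python
import math

def kl_diag_gaussian(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ): the KLD term of the VAE objective,
    summed over latent dimensions."""
    return 0.5 * sum(m * m + s * s - 1.0 - math.log(s * s)
                     for m, s in zip(mu, sigma))

# The KL is zero exactly when the approximate posterior matches the prior.
print(kl_diag_gaussian([0.0, 0.0], [1.0, 1.0]))  # 0.0
```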
9
Conditional Variational Autoencoder
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
• Setting up the objective with labels
• Maximise P(Y|X)
• Set Q(z) to be an arbitrary distribution
10
Conditional Variational Autoencoder
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
• Setting up the objective
log P(Y|X) − D[Q(z|X,Y) ‖ P(z|X,Y)] = E_{z∼Q}[log P(Y|X,z)] − D[Q(z|X,Y) ‖ P(z|X)]
(Q(z|X,Y): encoder; P(z|X,Y): ideal posterior; first right-hand term: reconstruction; second: KLD)
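One common way to implement the conditioning (an assumption for illustration; the paper studies several architectures) is to concatenate a one-hot label to the encoder and decoder inputs:

```python
# Hypothetical sketch: condition a CVAE by appending a one-hot class label
# to the flattened input before it enters the encoder (and to z before the
# decoder, done the same way).
def one_hot(label, num_classes):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

x = [0.2, 0.7, 0.1]          # hypothetical flattened input
cond_x = x + one_hot(2, 4)   # encoder sees [x, y]
print(cond_x)                # [0.2, 0.7, 0.1, 0.0, 0.0, 1.0, 0.0]
```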
11
Conditional Variational Autoencoder
• Training and inference without labelled data, i.e., the vanilla VAE
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
12
Conditional Variational Autoencoder
• Training and inference with labelled data.
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
13
Conditional Variational Autoencoder
• Training and inference with labelled data.
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
14
Conditional Variational Autoencoder
• Training and inference with labelled data.
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
• Convolutional VAE
• Conditional VAE
• β-VAE
• IWAE
• Ladder VAE
• Progressive + Fade-in VAE
• VAE in speech
• Temporal Difference VAE (TD-VAE)
15
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
16
Before we start
• Disentangled / factorised representation
• Each variable in the inferred latent representation is sensitive to only a single generative factor and relatively invariant to the other factors
• Good interpretability and easy generalisation to a variety of tasks
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
17
Before we start
• Unsupervised hierarchical representation learning
Rethinking Style and Content Disentanglement in Variational Autoencoders. ICLR 2018 Workshops.
18
β-VAE
• Unsupervised representation learning
• Augment the original VAE framework with a single hyper-parameter β that modulates the learning constraints
• Impose a limit on the capacity of the latent information channel
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
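The β-VAE objective keeps the VAE's reconstruction term but scales the KL term by β; a minimal sketch (the example loss values are made up):

```python
def beta_vae_loss(recon_loss, kl, beta=4.0):
    """beta-VAE objective: reconstruction loss plus beta-weighted KL.
    beta > 1 pushes the posterior harder toward the prior, limiting latent
    capacity and encouraging disentanglement; beta = 1 recovers the vanilla VAE."""
    return recon_loss + beta * kl

print(beta_vae_loss(10.0, 2.0, beta=1.0))  # 12.0 (vanilla VAE)
print(beta_vae_loss(10.0, 2.0, beta=4.0))  # 18.0
```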
19
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
20
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
21
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
22
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
23
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
24
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
25
β-VAE
• Discussion: is it really unsupervised?
• It is unsupervised/self-supervised learning, because it does not need any labelled data
• It is not fully unsupervised learning: it works because of the inductive bias of the neural network model, and the hierarchical design introduces prior knowledge about the data
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
• Convolutional VAE
• Conditional VAE
• β-VAE
• IWAE
• Ladder VAE
• Progressive + Fade-in VAE
• VAE in speech
• Temporal Difference VAE (TD-VAE)
26
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
27
IWAE (Importance Weighted Autoencoder)
Importance Weighted Autoencoders. ICLR 2016.
• Optimise a tighter lower bound than the VAE
• The VAE only optimises a lower bound (the ELBO) of log P(X)
log P(X) − D[Q(z|X) ‖ P(z|X)] = E_{z∼Q}[log P(X|z)] − D[Q(z|X) ‖ P(z)] = ELBO
(Q(z|X): encoder; P(z|X): ideal posterior; reconstruction term; KLD term)
The VAE maximises the ELBO, i.e., minimises −ELBO.
28
IWAE
Importance Weighted Autoencoders. ICLR 2016.
• Optimise a tighter lower bound than VAE
29
IWAE
Importance Weighted Autoencoders. ICLR 2016.
30
IWAE
Importance Weighted Autoencoders. ICLR 2016.
• Why “importance weighted”?
The bound is the log of an average of k importance weights, where
w_i = P(X, z_i) / Q(z_i|X), with z_i ∼ Q(z|X)
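The k-sample bound can be sketched directly from the log-weights; the log-sum-exp trick below is a standard numerical-stability device, not something specific to the paper:

```python
import math

def iwae_bound(log_ws):
    """IWAE bound: log( (1/k) * sum_i w_i ), computed stably from the
    log-weights log w_i = log P(X, z_i) - log Q(z_i | X)."""
    k = len(log_ws)
    m = max(log_ws)  # log-sum-exp trick to avoid overflow/underflow
    return m + math.log(sum(math.exp(lw - m) for lw in log_ws)) - math.log(k)

# With k = 1 the bound reduces to the standard single-sample ELBO estimate.
print(iwae_bound([-3.0]))  # -3.0
```

Averaging the weights inside the log (rather than averaging log-weights, as the VAE effectively does) is what makes the bound tighter as k grows.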
• Convolutional VAE
• Conditional VAE
• β-VAE
• IWAE
• Ladder VAE
• Progressive + Fade-in VAE
• VAE in speech
• Temporal Difference VAE (TD-VAE)
31
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
32
Ladder VAE
• Goal: learn a hierarchical latent representation
• Deep models with several layers of dependent stochastic variables are difficult to train
• This limits the improvements obtainable from these highly expressive models
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
33
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
encode decode encode decode
Hierarchical VAE Ladder VAE
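In the Ladder VAE, the encoder's bottom-up Gaussian estimate of each latent layer is combined with the top-down (prior) estimate by precision weighting; a sketch for a single scalar latent (variable names are my own):

```python
def precision_weighted(mu_q, var_q, mu_p, var_p):
    """Combine two Gaussian estimates of the same latent by precision
    weighting, as the Ladder VAE does for each stochastic layer:
    the more certain estimate (higher precision = 1/variance) dominates."""
    prec_q, prec_p = 1.0 / var_q, 1.0 / var_p
    var = 1.0 / (prec_q + prec_p)
    mu = var * (mu_q * prec_q + mu_p * prec_p)
    return mu, var

# Equal precisions -> the combined mean is the simple average of the means.
print(precision_weighted(0.0, 1.0, 2.0, 1.0))  # (1.0, 0.5)
```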
34
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
35
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
36
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
37
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
38
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
39
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
• Convolutional VAE
• Conditional VAE
• β-VAE
• IWAE
• Ladder VAE
• Progressive + Fade-in VAE
• VAE in speech
• Temporal Difference VAE (TD-VAE)
40
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
41
Progressive + Fade-in VAE
• Discussion
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
encoders decoders
high-level
middle-level
low-level
Can we directly train a hierarchical VAE with a ladder structure like this?
42
Progressive + Fade-in VAE
• Discussion
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
encoders decoders
high-level
middle-level
low-level
Information shortcut problem: all information goes through the low-level path and the other paths are ignored; the model is lazy.
43
Progressive + Fade-in VAE
• Progressive + Fade-in
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
Stage 1: enable the high-level path
Stage 2: enable the high-level and middle-level paths
Stage 3: enable all paths …
α increases gradually from 0 to 1
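The fade-in itself is a simple convex blend between the output without the new path and the output with it, with α ramping from 0 to 1 during training; a scalar sketch (the blending form follows the usual progressive-growing recipe):

```python
def fade_in(old_path, new_path, alpha):
    """Blend a newly enabled path into the network output.
    alpha ramps from 0 to 1 over training, so the new path is introduced
    gradually instead of disrupting the already-trained paths."""
    return (1.0 - alpha) * old_path + alpha * new_path

print(fade_in(1.0, 5.0, 0.0))  # 1.0: new path ignored at the start
print(fade_in(1.0, 5.0, 1.0))  # 5.0: new path fully enabled
print(fade_in(1.0, 5.0, 0.5))  # 3.0: halfway through the fade-in
```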
44
Progressive + Fade-in VAE
• Results
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
High-level: background and foreground colors
Middle-level: shape
Low-level: …
45
Progressive + Fade-in VAE
• Results
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
High-level Middle-level Low-level
• Convolutional VAE
• Conditional VAE
• β-VAE
• IWAE
• Ladder VAE
• Progressive + Fade-in VAE
• VAE in speech
• Temporal Difference VAE (TD-VAE)
46
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
47
VAE in speech
• Learning latent representations for style control and transfer in end-to-end speech synthesis
• RNN as encoder
Learning latent representations for style control and transfer in end-to-end speech synthesis. ICASSP 2019.
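An RNN encoder reduces a variable-length sequence to a fixed-size summary from which the latent distribution is then predicted; a toy scalar sketch (a plain tanh RNN with made-up weights, whereas real speech systems use GRU/LSTM cells):

```python
import math

def rnn_encode(xs, w_x=0.5, w_h=0.8, h0=0.0):
    """Toy recurrent encoder: fold a sequence into one final hidden state.
    Each step mixes the current input with the previous state, so the
    final state summarises the whole utterance regardless of its length."""
    h = h0
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)  # one recurrent step
    return h

print(rnn_encode([1.0, 0.0, 1.0]))
```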
• Convolutional VAE
• Conditional VAE
• β-VAE
• IWAE
• Ladder VAE
• Progressive + Fade-in VAE
• VAE in speech
• Temporal Difference VAE (TD-VAE)
48
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
50
TD-VAE
• The state-space model treats the latent states as a Markov chain
Temporal Difference VAE. ICLR 2019.
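A state-space model with Markov latents can be sampled as follows: each latent depends only on the previous latent, and each observation only on its latent. The toy linear-Gaussian dynamics below are hypothetical, chosen only to make the structure concrete:

```python
import random

def rollout(T, transition, emission, z0):
    """Sample from a state-space model: latents form a Markov chain
    z_t ~ p(z_t | z_{t-1}); each observation depends only on its latent,
    x_t ~ p(x_t | z_t)."""
    zs, xs = [z0], []
    for _ in range(T):
        zs.append(transition(zs[-1]))  # Markov transition in latent space
        xs.append(emission(zs[-1]))    # observation from the current latent
    return zs, xs

random.seed(0)
zs, xs = rollout(3,
                 transition=lambda z: 0.9 * z + random.gauss(0, 0.1),
                 emission=lambda z: z + random.gauss(0, 0.05),
                 z0=1.0)
print(len(zs), len(xs))  # 4 3
```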
51
TD-VAE
Temporal Difference VAE. ICLR 2019.
52
TD-VAE
Temporal Difference VAE. ICLR 2019.
53
TD-VAE
Temporal Difference VAE. ICLR 2019.
54
TD-VAE
Temporal Difference VAE. ICLR 2019.
55
TD-VAE
Temporal Difference VAE. ICLR 2019.
• Convolutional VAE
• Conditional VAE
• β-VAE
• IWAE
• Ladder VAE
• Progressive + Fade-in VAE
• VAE in speech
• Temporal Difference VAE (TD-VAE)
56
Summary
Hierarchical representation learning
Temporal representation learning
Representation learning
Thanks
57