ESL 17.3.2-17.4: Graphical Lasso and Boltzmann Machines

June 8, 2015, talk by Shinichi TAMURA

Mathematical Informatics Lab @ NAIST

Today's topics

[Review] Properties of graphical models

What is the Graphical Model

Fitting Gaussian Graphical Models

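[Figure: undirected graph on nodes 1-5; the edges form the cycle 1-2-3-4-5-1]

Sparsity pattern of the precision matrix Θ for this graph (・ = free entry, 0 = missing edge):

・ ・ 0 0 ・
・ ・ ・ 0 0
0 ・ ・ ・ 0
0 0 ・ ・ ・
・ 0 0 ・ ・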

θ_jk = 0 for every pair (j, k) with no edge between nodes j and k
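
A small numerical check can make this constraint concrete. The sketch below assumes an illustrative precision matrix for a five-node cycle (the values 2.0 and 0.5 are made up, not taken from the talk) and a hand-written `invert` helper. It suggests that a zero in Θ does not produce a zero in the covariance W = Θ^{-1}.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical precision matrix for the cycle 1-2-3-4-5-1 (0-based indices here).
p = 5
edges = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)}
theta = [[0.0] * p for _ in range(p)]
for j in range(p):
    theta[j][j] = 2.0
for j, k in edges:
    theta[j][k] = theta[k][j] = 0.5

w = invert(theta)  # covariance matrix W = Theta^{-1}

# Nodes 1 and 3 are not joined, so theta[0][2] is zero, yet w[0][2] is not.
for row in w:
    print(" ".join(f"{v:7.4f}" for v in row))
```

The printed matrix is dense: conditional independence (a zero in Θ) is not the same as marginal independence (a zero in W).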

Block-wise Algorithm for GGM

W−S−Γ=0.

Here W = Θ^{-1}, and Γ holds the Lagrange multipliers of the constraints θ_jk = 0.

[Figure: the graph on nodes 1-5, with the cycle 1-2-3-4-5-1 as edges]

Missing edge (j, k): θ_jk = 0, while γ_jk and w_jk are unknown.
Diagonal entry or present edge (j, k): γ_jk = 0, hence w_jk = s_jk.

Θ =
? ? 0 0 ?
? ? ? 0 0
0 ? ? ? 0
0 0 ? ? ?
? 0 0 ? ?

W =
s11 s12 ?   ?   s15
s21 s22 s23 ?   ?
?   s32 s33 s34 ?
?   ?   s43 s44 s45
s51 ?   ?   s54 s55

Γ =
0 0 ? ? 0
0 0 0 ? ?
? 0 0 0 ?
? ? 0 0 0
0 ? ? 0 0

Note that θ_jk = 0 does not imply w_jk = 0: the entries of W at the missing edges remain unknown.

Partition W, Θ and Γ so that the last row and column are separated from the rest:

W = | W_11    w_12 |    Θ = | Θ_11    θ_12 |    Γ = | Γ_11    γ_12 |
    | w_12^T  w_22 |        | θ_12^T  θ_22 |        | γ_12^T  γ_22 |

The upper-right block of W − S − Γ = 0 is

w_12 − s_12 − γ_12 = 0

Substituting w_12 = W_11 β, where β = −θ_12/θ_22:

W_11 β − s_12 − γ_12 = 0

Dropping the rows and columns that correspond to missing edges (where β_k = 0 and γ_k is unknown) leaves the reduced system

W*_11 β* − s*_12 = 0
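
The step from w_12 to W_11 β uses the sign convention β = −θ_12/θ_22. A short derivation, assuming only WΘ = I and the partition above, could read:

```latex
\begin{aligned}
\begin{pmatrix} W_{11} & w_{12} \\ w_{12}^{T} & w_{22} \end{pmatrix}
\begin{pmatrix} \Theta_{11} & \theta_{12} \\ \theta_{12}^{T} & \theta_{22} \end{pmatrix}
&= \begin{pmatrix} I & 0 \\ 0^{T} & 1 \end{pmatrix} \\
\text{upper-right block:}\quad W_{11}\,\theta_{12} + w_{12}\,\theta_{22} &= 0 \\
\Rightarrow\quad w_{12} &= -W_{11}\,\theta_{12}/\theta_{22} = W_{11}\,\beta,
\qquad \beta = -\theta_{12}/\theta_{22}.
\end{aligned}
```

This is consistent with the final step θ_12 = −β θ_22 of the algorithm.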

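Summary of the algorithm:

1. Initialize W = S + λI. With no penalty in the known-structure problem (λ = 0), this is simply W = S.

2. For each node in turn, solve the reduced system W*_11 β* − s*_12 = 0, pad β* with zeros at the missing edges to obtain β, and update w_12 = W_11 β. Repeat until convergence.

3. In the final cycle, recover θ_12 = −β θ_22, which gives Θ = W^{-1}.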
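
The block-wise updates can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the presenter's code: the helper names `solve` and `fit_known_graph`, the 4-node covariance matrix `S` and the edge set are all hypothetical, and a fixed number of sweeps stands in for a proper convergence test.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def fit_known_graph(S, edges, sweeps=200):
    """Fit W and Theta for a known edge set (0-based node pairs)."""
    p = len(S)
    nbrs = [[k for k in range(p) if k != j and ((j, k) in edges or (k, j) in edges)]
            for j in range(p)]
    W = [row[:] for row in S]                 # initialize W = S
    beta = [[0.0] * p for _ in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            nb = nbrs[j]
            # Reduced system W*_11 beta* = s*_12 over the neighbours of node j.
            sol = solve([[W[a][b] for b in nb] for a in nb], [S[a][j] for a in nb])
            beta[j] = [0.0] * p               # zeros at the missing edges
            for a, v in zip(nb, sol):
                beta[j][a] = v
            # Update w_12 = W_11 beta.
            for a in range(p):
                if a != j:
                    W[a][j] = W[j][a] = sum(W[a][b] * beta[j][b] for b in nb)
    # Final cycle: theta_22 = 1 / (s_22 - w_12^T beta), theta_12 = -beta * theta_22.
    Theta = [[0.0] * p for _ in range(p)]
    for j in range(p):
        theta_jj = 1.0 / (S[j][j] - sum(W[a][j] * beta[j][a] for a in nbrs[j]))
        Theta[j][j] = theta_jj
        for a in nbrs[j]:
            Theta[a][j] = -beta[j][a] * theta_jj
    return W, Theta


# Hypothetical covariance matrix and 4-cycle graph, for illustration only.
S = [[10.0, 1.0, 5.0, 4.0],
     [1.0, 10.0, 2.0, 6.0],
     [5.0, 2.0, 10.0, 3.0],
     [4.0, 6.0, 3.0, 10.0]]
edges = {(0, 1), (1, 2), (2, 3), (3, 0)}
W, Theta = fit_known_graph(S, edges)
for row in W:
    print(" ".join(f"{v:6.2f}" for v in row))
```

In this sketch W agrees with S on the diagonal and on the edges, while the entries at the two missing edges are filled in by the algorithm.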

Today's topics

Graphical Lasso

How to Estimate Graph Structure

θjk

Key Points of the Algorithm

Problem of lasso regularization

Sub-derivative

[Figure: plot of y = f(x) = |x|, with a kink at x = 0]

Sub-derivative of f(x) = |x|:

  {−1}      (x < 0)
  [−1, 1]   (x = 0)
  {1}       (x > 0)

"Sign" versus "sign":

  Sign(θ) = −1         (θ < 0)
  Sign(θ) ∈ [−1, 1]    (θ = 0)
  Sign(θ) = 1          (θ > 0)

Sign(0) may be any value in [−1, 1], whereas sign(0) = 0.
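
The difference between the set-valued "Sign" and the ordinary "sign" can be illustrated with a few lines of Python. The function names below are hypothetical, chosen for this sketch; the sub-derivative is represented as a closed interval (low, high).

```python
def subdifferential_abs(x):
    """Return the sub-derivative of f(x) = |x| as a closed interval (low, high)."""
    if x < 0:
        return (-1.0, -1.0)
    if x > 0:
        return (1.0, 1.0)
    return (-1.0, 1.0)  # every slope in [-1, 1] supports |x| at x = 0


def sign(x):
    """Ordinary sign function, with sign(0) = 0."""
    return (x > 0) - (x < 0)


for x in (-2.0, 0.0, 3.0):
    low, high = subdifferential_abs(x)
    print(f"x = {x:4.1f}  Sign(x) in [{low}, {high}]  sign(x) = {sign(x)}")
```

At every x the ordinary sign lies inside the interval, but only at x = 0 does the interval contain more than one value.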

Gradient Equation

Gradient equation of the penalized log-likelihood, with W = Θ^{-1}:

W − S − λ·Sign(Θ) = 0

w_ii = s_ii + λ  (when the diagonal of Θ is penalized, since θ_ii > 0)

w_ii = s_ii  (when the diagonal of Θ is not penalized)

Block-wise algorithm

Partition Θ and W so that the last row and column are separated from the rest:

Θ = | Θ_11    θ_12 |    W = | W_11    w_12 |
    | θ_12^T  θ_22 |        | w_12^T  w_22 |

The upper-right block of the gradient equation is

w_12 − s_12 − λ·Sign(θ_12) = 0

Because θ_12 = −θ_22 W_11^{-1} w_12 and θ_22 > 0, Sign(θ_12) = −Sign(W_11^{-1} w_12), so

w_12 − s_12 + λ·Sign(W_11^{-1} w_12) = 0

Writing β = W_11^{-1} w_12, so that w_12 = W_11 β:

W_11 β − s_12 + λ·Sign(β) = 0

Coordinate-wise Descent Algorithm

W_11 β − s_12 + λ·Sign(β) = 0

f(β_j): the j-th component of the left-hand side, viewed as a function of β_j alone with the other coordinates held fixed.

[Figure: plot of y = f(β_j); the update for β_j is the point where the graph crosses y = 0]

Solve f(β_j) = 0 for j = 1, 2, …, p−1 in turn, and repeat until convergence.
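
The coordinate-wise update has a closed form via soft-thresholding. The sketch below is one way to write it, under assumptions: `V` stands for W_11 and `u` for s_12 (names introduced here), the 3x3 numbers are invented for illustration, and a fixed number of sweeps replaces a convergence check.

```python
def soft_threshold(x, t):
    """S(x, t) = sign(x) * max(|x| - t, 0)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def lasso_coordinate_descent(V, u, lam, sweeps=200):
    """Solve V beta - u + lam * Sign(beta) = 0 by cyclic coordinate descent."""
    n = len(u)
    beta = [0.0] * n
    for _ in range(sweeps):
        for j in range(n):
            # Partial residual for coordinate j, other coordinates held fixed.
            r = u[j] - sum(V[k][j] * beta[k] for k in range(n) if k != j)
            # The root of f(beta_j) = 0 is a soft-thresholded value.
            beta[j] = soft_threshold(r, lam) / V[j][j]
    return beta


# Hypothetical W_11 block (V) and s_12 vector (u), for illustration only.
V = [[2.0, 0.3, 0.1],
     [0.3, 1.5, 0.2],
     [0.1, 0.2, 1.0]]
u = [0.8, 0.05, -0.4]
lam = 0.1
beta = lasso_coordinate_descent(V, u, lam)
print(beta)
```

In this example the second coefficient is thresholded to exactly zero, which is how the penalty removes edges.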

Overview of the Algorithm

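1. Initialize W = S + λI.

2. For each j in turn, solve f(β_j) = 0 by coordinate descent and update w_12 = W_11 β; repeat until convergence.

3. In the final cycle, recover θ_12 = −β θ_22.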
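
Putting the three steps together, a compact sketch of the whole procedure might look like the following. It is an illustration under assumptions rather than a reference implementation: the function name `graphical_lasso`, the fixed sweep counts and the 3x3 matrix `S` are all hypothetical.

```python
def soft_threshold(x, t):
    """S(x, t) = sign(x) * max(|x| - t, 0)."""
    return x - t if x > t else x + t if x < -t else 0.0


def graphical_lasso(S, lam, outer_sweeps=100, inner_sweeps=100):
    """Return (W, Theta), where W approximates the inverse of Theta."""
    p = len(S)
    # Step 1: W = S + lam * I; the diagonal is never changed afterwards.
    W = [[S[j][k] + (lam if j == k else 0.0) for k in range(p)] for j in range(p)]
    B = [[0.0] * p for _ in range(p)]  # B[j][k]: coefficient beta_k for node j
    for _ in range(outer_sweeps):
        for j in range(p):
            rest = [k for k in range(p) if k != j]
            # Step 2: solve W_11 beta - s_12 + lam * Sign(beta) = 0.
            for _ in range(inner_sweeps):
                for a in rest:
                    r = S[a][j] - sum(W[b][a] * B[j][b] for b in rest if b != a)
                    B[j][a] = soft_threshold(r, lam) / W[a][a]
            # Update w_12 = W_11 beta.
            for a in rest:
                W[a][j] = W[j][a] = sum(W[a][b] * B[j][b] for b in rest)
    # Step 3: theta_22 = 1 / (w_22 - w_12^T beta), theta_12 = -beta * theta_22.
    Theta = [[0.0] * p for _ in range(p)]
    for j in range(p):
        rest = [k for k in range(p) if k != j]
        theta_jj = 1.0 / (W[j][j] - sum(W[a][j] * B[j][a] for a in rest))
        Theta[j][j] = theta_jj
        for a in rest:
            Theta[a][j] = -B[j][a] * theta_jj
    return W, Theta


# Hypothetical sample covariance matrix, for illustration only.
S = [[1.0, 0.5, 0.1],
     [0.5, 1.0, 0.3],
     [0.1, 0.3, 1.0]]
W, Theta = graphical_lasso(S, 0.2)
for row in Theta:
    print(" ".join(f"{v:7.3f}" for v in row))
```

Calling the same function with a penalty larger than every off-diagonal entry of S, for example `graphical_lasso(S, 1.0)`, leaves all off-diagonal entries at zero, that is, an empty graph.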

Setting of λ

[Figure: results for λ = 0, 7, 14, 27 and 36]

Pair-specific penalties λ_jk·|θ_jk|: setting λ_jk = ∞ for a pair (j, k) forces θ_jk = 0, i.e. no edge between nodes j and k.

Treating Unobserved Nodes

Today's topics

Boltzmann Machines

What is the Boltzmann Machine

The Joint Distribution

Z: normalizing constant (partition function) of the joint distribution.

Only pairwise relations are modeled.

X_0 ≡ 1: a constant node, so that the edge weights θ_j0 act as bias terms.
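
For a handful of nodes the joint distribution can be computed by brute force. The sketch below assumes binary nodes taking values in {0, 1}, a probability proportional to exp(Σ θ_jk x_j x_k) over the edges, and made-up weights; the function name `joint_distribution` is hypothetical.

```python
from itertools import product
from math import exp


def joint_distribution(theta, p):
    """Enumerate all 2^p states of p binary nodes.

    theta maps an edge (j, k) to its weight; node 0 is the constant node X_0 = 1.
    Returns the probability of each state and the normalizing constant Z.
    """
    weights = {}
    for x in product((0, 1), repeat=p):
        full = (1,) + x  # prepend X_0 = 1
        energy = sum(w * full[j] * full[k] for (j, k), w in theta.items())
        weights[x] = exp(energy)
    Z = sum(weights.values())
    return {x: w / Z for x, w in weights.items()}, Z


# Hypothetical weights: a bias on node 1 and two pairwise interactions.
theta = {(0, 1): 0.5, (1, 2): 1.0, (2, 3): -1.0}
probs, Z = joint_distribution(theta, 3)
for state, prob in probs.items():
    print(state, round(prob, 4))
```

The loop visits 2^p states, which is why this direct approach is limited to small p.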

The Conditional Distribution

Conditional distribution of X_j given X_{−j}, the values of all the other nodes
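
For {0, 1}-valued nodes this conditional distribution takes a logistic form in which the normalizing constant cancels. The sketch below compares that form with a brute-force ratio; the weights and function names are hypothetical, chosen only for this check.

```python
from itertools import product
from math import exp

# Hypothetical edge weights; node 0 is the constant node X_0 = 1.
theta = {(0, 1): -0.5, (0, 2): 0.2, (0, 3): 0.1, (1, 2): 1.0, (2, 3): -0.8}


def unnormalized(x):
    """exp(sum of theta_jk * x_j * x_k) for a state x of nodes 1..3."""
    full = (1,) + tuple(x)
    return exp(sum(w * full[j] * full[k] for (j, k), w in theta.items()))


def conditional_logistic(j, x):
    """P(X_j = 1 | X_-j) from the logistic form; the entry of x for node j is ignored."""
    full = (1,) + tuple(x)
    eta = 0.0
    for (a, b), w in theta.items():
        if a == j:
            eta += w * full[b]
        elif b == j:
            eta += w * full[a]
    return 1.0 / (1.0 + exp(-eta))


def conditional_bruteforce(j, x):
    """The same probability as a ratio of unnormalized weights (Z cancels)."""
    on = list(x)
    on[j - 1] = 1
    off = list(x)
    off[j - 1] = 0
    return unnormalized(on) / (unnormalized(on) + unnormalized(off))


for x in product((0, 1), repeat=3):
    print(x, round(conditional_logistic(2, x), 4), round(conditional_bruteforce(2, x), 4))
```

Only the edges that touch node j enter the logistic form, so each conditional is cheap to evaluate even when Z is not.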

Estimation for Known Graph

Data: x_i = (x_i1, …, x_ip), i = 1, …, N

The p binary nodes have 2^p joint states, so the cost of exact computation by enumeration grows exponentially with p.

Small p (< 30): exact computation is feasible.

Large p (≥ 30): approximate methods are needed.
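
For small p the maximum-likelihood fit can be sketched with plain gradient ascent, where each gradient component is the gap between an empirical moment and the corresponding model moment of X_j X_k. Everything below is an assumption made for illustration: the toy data set, the step size, the iteration count and the function names.

```python
from itertools import product
from math import exp


def model_moments(theta, p):
    """E[X_j X_k] under the model for every edge, by enumerating all 2^p states."""
    states = [(1,) + x for x in product((0, 1), repeat=p)]
    weights = [exp(sum(w * s[j] * s[k] for (j, k), w in theta.items())) for s in states]
    Z = sum(weights)
    return {e: sum(wt * s[e[0]] * s[e[1]] for wt, s in zip(weights, states)) / Z
            for e in theta}


def fit(data, edges, p, step=0.5, iters=2000):
    """Gradient ascent on the average log-likelihood for a known edge set."""
    rows = [(1,) + tuple(x) for x in data]  # prepend the constant node X_0 = 1
    empirical = {e: sum(r[e[0]] * r[e[1]] for r in rows) / len(rows) for e in edges}
    theta = {e: 0.0 for e in edges}
    for _ in range(iters):
        expected = model_moments(theta, p)
        for e in edges:
            # Gradient component: empirical moment minus model moment.
            theta[e] += step * (empirical[e] - expected[e])
    return theta, empirical


# Toy data: every state once (so the estimate stays finite) plus a few repeats.
data = list(product((0, 1), repeat=3))
data += [(1, 1, 0)] * 3 + [(0, 1, 1)] * 2
# Known graph: bias edges from node 0, plus edges 1-2 and 2-3 (no edge 1-3).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]

theta, empirical = fit(data, edges, p=3)
fitted = model_moments(theta, 3)
gap = max(abs(empirical[e] - fitted[e]) for e in edges)
print(theta)
print("largest moment gap:", gap)
```

Each iteration enumerates all 2^p states, so this exact scheme only suits small p; for larger p the model moments would have to be approximated instead.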

Hidden Nodes

[Figure: graph on nodes 1-4, then extended with a fifth node 5]

Boltzmann Machines

Estimating Graph Structure

Difference from Graphical Lasso

Restricted Boltzmann Machine

Learning RBM

Today's topics
