ESL 17.3.2-17.4
Graphical Lasso & Boltzmann Machines
June 8, 2015
Talk by Shinichi TAMURA
Mathematical Informatics Lab @ NAIST
Today's topics
[Review] Properties of graphical models
What is the Graphical Model
[Review] Properties of graphical models
Fitting Gaussian Graphical Models
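[Figure: undirected graph on nodes 1 to 5, with the corresponding sparsity pattern below; ・ marks a free entry and 0 a structural zero]
・ ・ 0 0 ・
・ ・ ・ 0 0
0 ・ ・ ・ 0
0 0 ・ ・ ・
・ 0 0 ・ ・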
$\theta_{jk} = 0$ for every pair of nodes $j$, $k$ not joined by an edge
[Review] Properties of graphical models
Block-wise Algorithm for GGM
$W - S - \Gamma = 0$, where $W = \Theta^{-1}$ and $\Gamma$ holds the Lagrange multipliers of the zero constraints.
For the 5-node example (? marks an unknown entry):
• No edge between $j$ and $k$: $\theta_{jk} = 0$, while $\gamma_{jk}$ and $w_{jk}$ are unknown.
• Edge between $j$ and $k$, or $j = k$: $\gamma_{jk} = 0$, hence $w_{jk} = s_{jk}$.
• $\theta_{jk} = 0$ does not imply $w_{jk} = 0$.
$$
\Theta = \begin{pmatrix}
? & ? & 0 & 0 & ? \\
? & ? & ? & 0 & 0 \\
0 & ? & ? & ? & 0 \\
0 & 0 & ? & ? & ? \\
? & 0 & 0 & ? & ?
\end{pmatrix},\quad
W = \begin{pmatrix}
s_{11} & s_{12} & ? & ? & s_{15} \\
s_{21} & s_{22} & s_{23} & ? & ? \\
? & s_{32} & s_{33} & s_{34} & ? \\
? & ? & s_{43} & s_{44} & s_{45} \\
s_{51} & ? & ? & s_{54} & s_{55}
\end{pmatrix},\quad
\Gamma = \begin{pmatrix}
0 & 0 & ? & ? & 0 \\
0 & 0 & 0 & ? & ? \\
? & 0 & 0 & 0 & ? \\
? & ? & 0 & 0 & 0 \\
0 & ? & ? & 0 & 0
\end{pmatrix}
$$
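Partition each matrix into one row and column (index 2) and the rest (index 1):
$$
\Theta = \begin{pmatrix} \Theta_{11} & \theta_{12} \\ \theta_{12}^T & \theta_{22} \end{pmatrix},\quad
W = \begin{pmatrix} W_{11} & w_{12} \\ w_{12}^T & w_{22} \end{pmatrix},\quad
\Gamma = \begin{pmatrix} \Gamma_{11} & \gamma_{12} \\ \gamma_{12}^T & \gamma_{22} \end{pmatrix}
$$
$w_{12} - s_{12} - \gamma_{12} = 0$
$W_{11}\beta - s_{12} - \gamma_{12} = 0$, with $\beta = -\theta_{12}/\theta_{22}$
$W^*_{11}\beta^* - s^*_{12} = 0$ (keeping only the rows and columns of nodes joined to the current node by an edge)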
Summary of the iteration:
Initialize: $W = S$
Solve for each row and column in turn: $W^*_{11}\beta^* - s^*_{12} = 0$
Recover: $\theta_{12} = -\beta\,\theta_{22}$
Result: $\Theta = W^{-1}$
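As a rough illustration of the block-wise iteration for a known graph, here is a minimal NumPy sketch. The function name `fit_ggm_known_graph`, the boolean adjacency matrix `adj`, and the stopping rule are hypothetical choices made for this example, and it assumes the unpenalized initialization W = S. For each node it solves the reduced system over that node's neighbours only, pads β with zeros elsewhere, and updates w_12 = W_11 β.

```python
import numpy as np

def fit_ggm_known_graph(S, adj, n_sweeps=100, tol=1e-10):
    """Fit W (covariance) and Theta (precision) with theta_jk = 0 on non-edges.

    S   : (p, p) sample covariance matrix (assumed positive definite)
    adj : (p, p) boolean adjacency matrix, True where an edge is present
    """
    p = S.shape[0]
    W = S.copy()                          # assumption: initialize W = S
    for _ in range(n_sweeps):
        W_old = W.copy()
        for j in range(p):
            rest = np.array([k for k in range(p) if k != j])
            nbrs = rest[adj[j, rest]]     # nodes joined to j by an edge
            beta = np.zeros(p)            # zero on non-neighbours
            if nbrs.size > 0:
                # reduced system W*_11 beta* = s*_12 over the neighbours only
                beta[nbrs] = np.linalg.solve(W[np.ix_(nbrs, nbrs)], S[nbrs, j])
            # update w_12 = W_11 beta
            w12 = W[np.ix_(rest, rest)] @ beta[rest]
            W[rest, j] = w12
            W[j, rest] = w12
        if np.max(np.abs(W - W_old)) < tol:
            break
    Theta = np.linalg.inv(W)              # Theta = W^{-1}
    return W, Theta
```

After convergence, W should agree with S on the diagonal and on the edges, and the inverse of W should be numerically zero on the non-edges.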
Today's topics
Graphical Lasso
How to Estimate Graph Structure
θjk
Graphical Lasso
Key Points of the Algorithm
Graphical Lasso
Problem of lasso regularization
Graphical Lasso
Sub-derivative
[Figure: plot of $y = f(x) = |x|$]
Sub-derivative of $f(x) = |x|$:
$$
\partial f(x) = \begin{cases} \{-1\} & x < 0 \\ [-1, 1] & x = 0 \\ \{1\} & x > 0 \end{cases}
$$
"Sign" (set-valued) versus "sign":
$$
\mathrm{Sign}(\theta) \begin{cases} = -1 & \theta < 0 \\ \in [-1, 1] & \theta = 0 \\ = 1 & \theta > 0 \end{cases}
$$
$\mathrm{Sign}(0)$ can be any value in $[-1, 1]$, whereas $\mathrm{sign}(0) = 0$.
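To make the set-valued Sign concrete, the following small sketch checks the optimality condition x − z + λ·Sign(x) ∋ 0 for the one-dimensional problem of minimizing ½(x − z)² + λ|x|. The helper names `in_Sign` and `soft_threshold` are hypothetical, and the one-dimensional problem is an illustration chosen here rather than something taken from the slides.

```python
def in_Sign(g, theta, tol=1e-12):
    """True if g is a valid element of Sign(theta), the sub-derivative of |theta|."""
    if theta > 0:
        return abs(g - 1.0) <= tol
    if theta < 0:
        return abs(g + 1.0) <= tol
    return -1.0 - tol <= g <= 1.0 + tol    # any value in [-1, 1] is allowed at 0

def soft_threshold(z, lam):
    """Minimizer of 0.5 * (x - z)**2 + lam * |x|."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

lam = 1.0
for z in (3.0, 0.4, -3.0):
    x = soft_threshold(z, lam)
    g = (z - x) / lam                      # the g that solves x - z + lam * g = 0
    print(z, x, in_Sign(g, x))             # the last field is True in each case
```

The middle case shows why the interval at zero matters: the solution is x = 0 and the required g = 0.4 lies inside [−1, 1], which the ordinary sign(0) = 0 could not express.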
Graphical Lasso
Gradient Equation
$W - S - \lambda\,\mathrm{Sign}(\Theta) = 0$
Diagonal entries when the diagonal of $\Theta$ is penalized: $w_{ii} = s_{ii} + \lambda$
Diagonal entries when it is not penalized: $w_{ii} = s_{ii}$
Graphical Lasso
Block-wise algorithm
$$
\Theta = \begin{pmatrix} \Theta_{11} & \theta_{12} \\ \theta_{12}^T & \theta_{22} \end{pmatrix},\quad
W = \begin{pmatrix} W_{11} & w_{12} \\ w_{12}^T & w_{22} \end{pmatrix}
$$
$w_{12} - s_{12} - \lambda\,\mathrm{Sign}(\theta_{12}) = 0$
$\theta_{12} = -\theta_{22} W_{11}^{-1} w_{12}$ and $\theta_{22} > 0$
$w_{12} - s_{12} + \lambda\,\mathrm{Sign}(W_{11}^{-1} w_{12}) = 0$
$\beta = W_{11}^{-1} w_{12}$
$W_{11}\beta - s_{12} + \lambda\,\mathrm{Sign}(\beta) = 0$
Graphical Lasso
Coordinate-wise Descent Algorithm
$W_{11}\beta - s_{12} + \lambda\,\mathrm{Sign}(\beta) = 0$
[Figure: plots of $y = f(\beta_j)$ with the level $y = 0$ marked]
Solve for one coordinate $\beta_j$ at a time, cycling over $j = 1, 2, \ldots, p-1$.
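One way to carry out this coordinate-wise solve is with a soft-thresholding update, β_j ← S(s_j − Σ_{k≠j} V_kj β_k, λ) / V_jj with V = W_11. The sketch below assumes that update; the function names, the sweep limit, and the tolerance are hypothetical choices for illustration.

```python
import numpy as np

def soft_threshold(z, lam):
    """S(z, lam) = sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_coordinate_descent(V, s, lam, n_sweeps=200, tol=1e-10):
    """Solve V beta - s + lam * Sign(beta) = 0 one coordinate at a time.

    V : (m, m) symmetric positive definite matrix (plays the role of W_11)
    s : (m,) vector (plays the role of s_12)
    """
    beta = np.zeros(len(s))
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(len(s)):
            # right-hand side of equation j with the beta_j term taken out
            r = s[j] - V[j] @ beta + V[j, j] * beta[j]
            new = soft_threshold(r, lam) / V[j, j]
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta

V = np.array([[2.0, 0.5], [0.5, 1.0]])
s = np.array([1.0, 0.1])
beta = lasso_coordinate_descent(V, s, lam=0.3)
print(beta)   # approximately (0.35, 0): the second coordinate is thresholded to zero
```

At the solution, a nonzero coordinate satisfies its equation exactly with Sign(β_j) = ±1, while a zero coordinate only needs |(Vβ − s)_j| ≤ λ.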
Graphical Lasso
Overview of the Algorithm
Initialize: $W = S + \lambda I$
Solve by coordinate descent: $f(\beta_j) = 0$
Recover: $\theta_{12} = -\beta\,\theta_{22}$
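Putting the three steps together, a compact NumPy sketch of the whole procedure might look as follows. It is an illustration under assumptions rather than a reference implementation: the function name `graphical_lasso`, the warm-start matrix `B`, the iteration limits, and the use of 1/θ_22 = w_22 − w_12ᵀβ to recover the diagonal are choices made here.

```python
import numpy as np

def graphical_lasso(S, lam, n_sweeps=100, n_inner=200, tol=1e-8):
    """Return (W, Theta) for sample covariance S and penalty lam >= 0."""
    p = S.shape[0]
    W = S + lam * np.eye(p)               # initialize W = S + lam * I
    B = np.zeros((p, p))                  # column j keeps the current beta of node j
    for _ in range(n_sweeps):
        W_old = W.copy()
        for j in range(p):
            rest = np.delete(np.arange(p), j)
            V = W[np.ix_(rest, rest)]     # W_11
            s = S[rest, j]                # s_12
            beta = B[rest, j]             # warm start (fancy indexing gives a copy)
            for _ in range(n_inner):      # coordinate descent on f(beta_k) = 0
                max_change = 0.0
                for k in range(p - 1):
                    r = s[k] - V[k] @ beta + V[k, k] * beta[k]
                    new = np.sign(r) * max(abs(r) - lam, 0.0) / V[k, k]
                    max_change = max(max_change, abs(new - beta[k]))
                    beta[k] = new
                if max_change < tol:
                    break
            B[rest, j] = beta
            w12 = V @ beta                # update w_12 = W_11 beta
            W[rest, j] = w12
            W[j, rest] = w12
        if np.max(np.abs(W - W_old)) < tol:
            break
    Theta = np.zeros((p, p))
    for j in range(p):
        rest = np.delete(np.arange(p), j)
        beta = B[rest, j]
        Theta[j, j] = 1.0 / (W[j, j] - W[rest, j] @ beta)   # assumed: 1/theta_22 = w_22 - w_12^T beta
        Theta[rest, j] = -beta * Theta[j, j]                # theta_12 = -beta * theta_22
    return W, Theta
```

With `lam = 0` the sketch reduces to W = S and Θ = S⁻¹, and with a very large `lam` every off-diagonal entry of Θ is driven to zero, which gives two simple sanity checks.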
Graphical Lasso
Setting of λ
[Figure: results for $\lambda = 0, 7, 14, 27, 36$]
Edge-specific penalties $\lambda_{jk}$ on $\theta_{jk}$: if nodes $j$ and $k$ must not be joined by an edge, set $\lambda_{jk} = \infty$.
Graphical Lasso
Treating Unobserved Nodes
Today's topics
Boltzmann Machines
What is the Boltzmann Machine
Boltzmann Machines
The Joint Distribution
$Z$: normalizing constant
Only pairwise relations are modeled.
$X_0 \equiv 1$ (constant node)
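For a small number of nodes the joint distribution can be tabulated directly. The sketch below assumes the standard pairwise form p(x) = exp(Σ_{j<k} θ_jk x_j x_k) / Z over binary states x ∈ {0, 1}^p, with the constant node X_0 = 1 supplying the bias terms θ_0j. The function name `boltzmann_joint` and the array layout are hypothetical.

```python
import itertools
import numpy as np

def boltzmann_joint(theta):
    """Enumerate all states of a small Boltzmann machine with their probabilities.

    theta : (p+1, p+1) array of pairwise parameters; index 0 is the constant
            node X0 = 1, so theta[0, j] acts as a bias for node j.
            Only the strict upper triangle is used.
    """
    p = theta.shape[0] - 1
    states = np.array(list(itertools.product([0, 1], repeat=p)), dtype=float)
    full = np.hstack([np.ones((len(states), 1)), states])    # prepend X0 = 1
    # sum over pairs j < k of theta_jk * x_j * x_k, one value per state
    energy = np.einsum('ij,jk,ik->i', full, np.triu(theta, 1), full)
    weights = np.exp(energy)
    Z = weights.sum()                                        # normalizing constant
    return states, weights / Z
```

The enumeration visits all 2^p states, so this is only practical for small p.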
Boltzmann Machines
The Conditional Distribution
Conditional distribution of $X_j$ given all other nodes $X_{-j}$
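Under the pairwise form of the joint distribution, the conditional of one node given the rest is logistic: Pr(X_j = 1 | X_{-j} = x_{-j}) = 1 / (1 + exp(−θ_j0 − Σ_k θ_jk x_k)). The sketch below assumes that form and compares it against a brute-force ratio of unnormalized joint weights; both function names and the symmetric, zero-diagonal layout of `theta` are assumptions for illustration.

```python
import numpy as np

def conditional_prob(theta, x, j):
    """Pr(X_j = 1 | X_-j = x_-j) in logistic form.

    theta : (p+1, p+1) symmetric array with zero diagonal; index 0 is X0 = 1
    x     : float array of length p+1 with x[0] = 1 (the value x[j] is ignored)
    """
    others = np.arange(len(x)) != j
    eta = theta[j, others] @ x[others]     # theta_j0 + sum_k theta_jk x_k
    return 1.0 / (1.0 + np.exp(-eta))

def brute_force_conditional(theta, x, j):
    """Same probability from the unnormalized joint; Z cancels in the ratio."""
    def weight(v):
        return np.exp(v @ np.triu(theta, 1) @ v)
    x1, x0 = x.copy(), x.copy()
    x1[j], x0[j] = 1.0, 0.0
    return weight(x1) / (weight(x1) + weight(x0))
```

Because the normalizing constant cancels, the conditional is cheap to evaluate even when the joint is not.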
Boltzmann Machines
Estimation for Known Graph
Observations: $x_i = (x_{i1}, \ldots, x_{ip})$, $i = 1, \ldots, N$
$2^p$ possible states
Small $p$ ($< 30$)
Large $p$ ($\geq 30$)
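For small p, one simple option is gradient ascent on the log-likelihood with the model expectations computed by enumerating all 2^p states. The sketch below assumes the gradient for edge (j, k) is the empirical mean of x_j x_k minus its expectation under the current model; the function name `fit_boltzmann_known_graph`, the step size, and the number of steps are hypothetical, and this is only one of several possible fitting methods.

```python
import itertools
import numpy as np

def fit_boltzmann_known_graph(X, edges, lr=0.5, n_steps=5000):
    """Gradient-ascent fit of a Boltzmann machine on a fixed edge set.

    X     : (N, p) array of 0/1 observations
    edges : list of pairs (j, k) with 0 <= j < k <= p, where 0 is the constant node
    """
    N, p = X.shape
    rows = np.array([j for j, _ in edges])
    cols = np.array([k for _, k in edges])
    data = np.hstack([np.ones((N, 1)), X])
    states = np.array(list(itertools.product([0, 1], repeat=p)), dtype=float)
    full = np.hstack([np.ones((len(states), 1)), states])
    emp = (data[:, rows] * data[:, cols]).mean(axis=0)   # empirical E[x_j x_k]
    feats = full[:, rows] * full[:, cols]                # x_j x_k for every state

    def model_moments(theta):
        logw = feats @ theta
        prob = np.exp(logw - logw.max())                 # stabilized softmax over states
        prob /= prob.sum()
        return prob @ feats                              # model E[x_j x_k]

    theta = np.zeros(len(edges))
    for _ in range(n_steps):
        theta += lr * (emp - model_moments(theta))       # gradient step
    return theta, emp, model_moments(theta)
```

At the maximum the model moments match the empirical moments on every edge, which is a convenient convergence check.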
Boltzmann Machines
Hidden Nodes
[Figure: graph on nodes 1 to 4, then with a fifth node added]
Boltzmann Machines
Estimating Graph Structure
Boltzmann Machines
Difference from Graphical Lasso
Boltzmann Machines
Restricted Boltzmann Machine
Boltzmann Machines
Learning RBM
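A common way to learn an RBM is contrastive divergence with a single Gibbs step (CD-1). The sketch below assumes that scheme, with binary visible and hidden units; the function names, the parameter layout (`W` for visible-to-hidden weights, `b` and `c` for visible and hidden biases), and the learning rate are hypothetical choices for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cd1_step(V, W, b, c, lr, rng):
    """One CD-1 parameter update for a binary RBM.

    V : (N, n_vis) batch of 0/1 visible vectors
    W : (n_vis, n_hid) weights, b : (n_vis,) visible bias, c : (n_hid,) hidden bias
    """
    # positive phase: hidden units given the data
    ph0 = sigmoid(V @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # one Gibbs step: sample visibles, then recompute hidden probabilities
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # difference between data statistics and one-step reconstruction statistics
    N = V.shape[0]
    W = W + lr * (V.T @ ph0 - v1.T @ ph1) / N
    b = b + lr * (V - v1).mean(axis=0)
    c = c + lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

The bipartite structure is what makes each phase a single matrix product: given the visibles, the hidden units are conditionally independent, and vice versa.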
Today's topics