Bayesian Networks
Unit 6 Exact Inference in Bayesian Networks
Fu Jen University Department of Electrical Engineering Wang, Yuan-Kai Copyright
Wang, Yuan-Kai (王元凱), [email protected]
http://www.ykwang.tw
Department of Electrical Engineering, Fu Jen University (輔仁大學電機工程系)
2006~2011
Reference this document as: Wang, Yuan-Kai, “Exact Inference in Bayesian Networks,"
Lecture Notes of Wang, Yuan-Kai, Fu Jen University, Taiwan, 2011.
Goal of This Unit
• Learn to efficiently compute the sum-product form of the inference formula
  P(X | E=e) ∝ Σ_{h∈H} Π_{i=1..n} P(Xi | Pa(Xi))
• Remember: enumeration and multiplication of all P(Xi | Pa(Xi)) are not efficient
• We will learn three other methods for exact inference
Related Units
• Background
  – Probabilistic graphical model
• Next units
  – Approximate inference algorithms
  – Probabilistic inference over time
Self-Study References
• Chapter 14, Artificial Intelligence: A Modern Approach, 2nd ed., by S. Russell & P. Norvig, Prentice Hall, 2003.
• "The generalized distributive law," S. M. Aji and R. J. McEliece, IEEE Trans. on Information Theory, vol. 46, no. 2, 2000.
• "Inference in Bayesian networks," B. D'Ambrosio, AI Magazine, 1999.
• "Probabilistic inference in graphical models," M. I. Jordan & Y. Weiss.
Structure of Related Lecture Notes
• PGM Representation: Unit 5: BN; Unit 9: Hybrid BN; Units 10~15: Naïve Bayes, MRF, HMM, DBN, Kalman filter
• Inference (Query): Unit 6: Exact inference; Unit 7: Approximate inference; Unit 8: Temporal inference
• Learning (Data): Units 16~: MLE, EM (structure learning, parameter learning)
• (Figure: the alarm network B, E → A → J, M with CPTs P(B), P(E), P(A|B,E), P(J|A), P(M|A))
Contents
1. Basics of Graph
2. Sum-Product and Generalized Distributive Law
3. Variable Elimination
4. Belief Propagation
5. Junction Tree
6. Summary
7. Implementation
8. Reference
Four Steps of Inference P(X|e)
• Step 1: Bayes' theorem
  P(X | E=e) = P(X, E=e) / P(E=e) ∝ P(X, E=e)
• Step 2: Marginalization
  P(X, E=e) = Σ_{h∈H} P(X, E=e, H=h)
• Step 3: Conditional independence
  = Σ_{h∈H} Π_{i=1..n} P(Xi | Pa(Xi))
• Step 4: Sum-product computation
  – Exact inference
  – Approximate inference
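The four steps can be sketched in a few lines of Python; this is a minimal illustration on a hypothetical two-node network B → A with made-up CPT numbers (not a network from these slides):

```python
# Hypothetical two-node network B -> A with made-up tabular CPTs.
P_B = {True: 0.001, False: 0.999}
P_A_given_B = {(True, True): 0.95, (True, False): 0.05,
               (False, True): 0.01, (False, False): 0.99}  # keyed (b, a)

def joint(b, a):
    # Chain rule / conditional independence: P(b, a) = P(b) P(a|b)
    return P_B[b] * P_A_given_B[(b, a)]

def posterior(evidence_a):
    # Steps 1-2: P(B | A=a) is proportional to P(B, A=a); the hidden
    # set H is empty here, so no marginalization sum is needed.
    unnorm = {b: joint(b, evidence_a) for b in (True, False)}
    z = sum(unnorm.values())          # step 1's denominator P(A=a)
    return {b: p / z for b, p in unnorm.items()}

post = posterior(True)
```

With more hidden variables, `posterior` would sum the joint over every assignment h ∈ H, which is exactly the sum-product expression of Step 4.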
Five Types of Queries in Inference
• For a probabilistic graphical model G
• Given a set of evidence E=e
• Query the PGM with
  – P(e): Likelihood query
  – arg max P(e): Maximum likelihood query
  – P(X|e): Posterior belief query
  – arg max_x P(X=x|e): Maximum a posteriori (MAP) query (single query variable)
  – arg max_{x1…xk} P(X1=x1, …, Xk=xk|e): Most probable explanation (MPE) query
Brute Force Enumeration
• We can compute P(X|e) by enumerating the joint distribution in O(K^N) time, where K=|Xi|
• By using a BN, we can represent the joint distribution in O(N) space
• (Figure: the alarm network B, E → A → J, M)
Expression Tree of Enumeration: Repeated Computations
• P(b|j,m) = α Σ_E Σ_A P(b) P(E) P(A|b,E) P(j|A) P(m|A)
• (Figure: the evaluation tree expands the sum over E = e, ¬e and, under each branch, the sum over A = a, ¬a; the product subtree P(j|a) P(m|a) is recomputed in every branch — the repeated work that variable elimination removes)
1. Basics of Graph
• Polytree
• Multiply connected networks
• Clique
• Markov network
• Chordal graph
• Induced width
Two Kinds of PGMs
• There are two kinds of probabilistic graphical models (PGMs)
  – Singly connected network (polytree)
  – Multiply connected network
Singly Connected Networks (Polytree)
• Example: the alarm network (Burglary, Earthquake → Alarm → John Calls, Mary Calls)
• Any two nodes are connected by at most one undirected path
• Theorem: inference in a polytree is linear in the node size of the network
  – This assumes a tabular CPT representation
• (Figure: a second polytree example over nodes A–H)
Multiply Connected Networks
• Example: Cloudy → Sprinkler, Rain → WetGrass
• At least two nodes are connected by more than one undirected path
Clique (1/2)
• A clique is a subgraph of an undirected graph that is complete and maximal
  – Complete: fully connected; every node connects to every other node
  – Maximal: not contained in any larger complete subgraph
Clique (2/2)
• Identify cliques: in the example undirected graph over nodes A–H, the maximal cliques are ABD, ADE, ACE, DEF, CEG, EGH
Markov Network (1/2)
• An undirected graph with
  – Hyper-nodes (multi-vertex nodes)
  – Hyper-edges (multi-vertex edges)
• Example: the clique graph with hyper-nodes ABD, ADE, ACE, DEF, CEG, EGH
Markov Network (2/2)
• Every hyper-edge e=(x1…xk) has a potential function f_e(x1…xk)
• The probability distribution is
  P(X1,…,Xn) = (1/Z) Π_{e∈E} f_e(x1,…,xk)
  Z = Σ_{x1} … Σ_{xn} Π_{e∈E} f_e(x1,…,xk)
• Example: for the two hyper-nodes EGH and CEG,
  P(EGH, CEG) = (1/Z) Π_{e∈E} f_e(E,G,H,C)
Chordal Graphs
• An elimination ordering turns the network into an undirected chordal graph
• Maximal cliques are factors in elimination
• Factors in elimination are cliques in the graph
• Complexity is exponential in the size of the largest clique in the graph
• (Figure: the Asia network V, S, T, L, A, B, X, D and its chordal version)
2. Sum-Product and Generalized Distributive Law

P(X | E=e) ∝ Σ_{h∈H} Π_{i=1..n} P(Xi | Pa(Xi))

We obtain this formula from two rules of probability theory:
  Sum rule:     P(x) = Σ_y P(x, y)
  Product rule: P(x, y) = P(x|y) P(y)
The Sum-Product with Generalized Distributive Law

P(X | E=e) ∝ Σ_{h∈H} Π_{i=1..n} P(Xi | Pa(Xi))
           = Σ_{X1} … Σ_{Xk} Π_{i=1..k} P(Xi | Pa(Xi))
           = Σ_{Xk} P(Xk | Pa(Xk)) … Σ_{X1} P(X1 | Pa(X1))

Each sum is pushed inside the product as far as its variable allows: factors that do not mention the summed variable move outside its sum.
Distributive Law for Sum-Product (1/3)

(Σ_i P(xi)) (Σ_j P(xj)) = Σ_i Σ_j P(xi) P(xj)

Σ_i a xi = a Σ_i xi,  e.g.  a x1 + a x2 = a (x1 + x2)

Σ_i Σ_j P(xi | xh) P(xj | xk) = (Σ_i P(xi | xh)) (Σ_j P(xj | xk)) = f1(xh) f2(xk)

Summing Σ_i P(xi | xh) leaves a function of xh alone: variable i is eliminated.
Distributive Law for Sum-Product (2/3)

Σ_i Σ_j P(xi | xh) P(xj | xk) = (Σ_i P(xi | xh)) (Σ_j P(xj | xk))

Example (alarm network):
P(b, j, m) = Σ_e Σ_a P(b) P(e) P(a|b,e) P(j|a) P(m|a)
           = P(b) Σ_e P(e) Σ_a P(a|b,e) P(j|a) P(m|a)

Σ_i Σ_j P(xi | xk) P(xj | xi) = Σ_i P(xi | xk) Σ_j P(xj | xi) = Σ_i P(xi | xk) f(xi) = f(xk)
Distributive Law for Sum-Product (3/3)
ab + ac = a(b+c)
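This identity is the whole trick in numeric form; a tiny check (with arbitrary made-up numbers) shows the factored form computes the same value while trading two multiplications for one:

```python
# Numeric check of ab + ac = a(b+c): same value, fewer multiplications.
a, b, c = 0.3, 0.5, 0.7
naive = a * b + a * c       # 2 multiplications, 1 addition
factored = a * (b + c)      # 1 multiplication, 1 addition
```

Applied across the thousands of terms of a joint distribution, this per-term saving is what turns exponential enumeration into variable elimination.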
Distributive Law for Max-Product

max_i max_j P(xi) P(xj) = (max_i P(xi)) (max_j P(xj))

max_i a xi = a max_i xi,  e.g.  max(a x1, a x2) = a max(x1, x2)

max_i max_j P(xi | xk) P(xj | xk) = (max_i P(xi | xk)) (max_j P(xj | xk))

Recording which value attains each max recovers arg max_i P(xi).
Generalized Distributive Law (1/2)
(Aji and McEliece, 2000)
Generalized Distributive Law (2/2)
(Aji and McEliece, 2000)
• Sum-product semiring:
  – a+0 = 0+a = a
  – a*1 = 1*a = a
  – a*b + a*c = a*(b+c)
• Max-product semiring:
  – max(a,0) = max(0,a) = a (for a ≥ 0)
  – a*1 = 1*a = a
  – max(a*b, a*c) = a*max(b,c)
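Because both semirings satisfy the same laws, one elimination routine serves both queries; a minimal sketch, with made-up single-variable factors, where only the "addition" operator changes:

```python
# The same elimination code works for any commutative semiring:
# sum for marginal-style queries, max for MAP-style queries.
def eliminate(f, g, combine):
    # combine over x of f(x)*g(x): one variable summed/maxed out
    return combine(f[x] * g[x] for x in f)

f = {0: 0.2, 1: 0.8}   # made-up factors over one binary variable
g = {0: 0.9, 1: 0.4}
sum_product = eliminate(f, g, sum)   # marginal-style quantity
max_product = eliminate(f, g, max)   # MAP-style quantity
```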
Marginal to MAP: Max-Product
• Likelihood and posterior queries: sum-product
• Maximum likelihood and MAP queries: replace the sums with max — max-product
• (Illustrated on a small example network over x1…x5)
3. Variable Elimination
• Variable elimination improves the enumeration algorithm by
  – Eliminating repeated calculations
    • Carry out summations right-to-left (bottom-up in the evaluation tree)
    • Store intermediate results (factors) to avoid re-computation
  – Dropping irrelevant variables
Basic Idea
• Write the query in the form
  P(Xn, e) = Σ_{xk} … Σ_{x3} Σ_{x2} Π_i P(xi | pa_i)
• Iteratively
  – Move all irrelevant terms (constants) outside the innermost summation
    Σ_i a_i b c = b c Σ_i a_i
  – Perform the innermost sum, getting a new term: a factor
  – Insert the new term into the product
An Example without Evidence (1/2)
• Network: Cloudy → Sprinkler, Rain → WetGrass
  P(w) = Σ_{r,s,c} P(w|s,r) P(r|c) P(s|c) P(c)
       = Σ_{r,s} P(w|s,r) Σ_c P(r|c) P(s|c) P(c)
       = Σ_{r,s} P(w|s,r) f1(r,s)
• Factor: f1(r,s)
• CPTs:
  P(C) = 0.5
  C  P(S|C): T 0.1, F 0.5
  C  P(R|C): T 0.8, F 0.2
  S R  P(W|S,R): TT 0.99, TF 0.90, FT 0.90, FF 0.00
An Example without Evidence (2/2)
• Tabulate P(R|C) P(S|C) P(C) over all eight assignments of R, S, C, then sum over C:
  R S  f1(R,S) = Σ_c P(R|c) P(S|c) P(c)
• Factor f1(r,s): a factor may be a function or a value
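The tabulate-then-sum step above is easy to reproduce; a short sketch using the CPT numbers from the previous slide:

```python
from itertools import product

# f1(r, s) = sum_c P(r|c) P(s|c) P(c), boolean Cloudy/Sprinkler/Rain.
P_C = {True: 0.5, False: 0.5}
P_S_given_C = {True: 0.1, False: 0.5}   # P(S=True | C=c)
P_R_given_C = {True: 0.8, False: 0.2}   # P(R=True | C=c)

def bern(p_true, value):
    # Probability of `value` for a boolean variable with P(True)=p_true
    return p_true if value else 1.0 - p_true

f1 = {(r, s): sum(bern(P_R_given_C[c], r) * bern(P_S_given_C[c], s) * P_C[c]
                  for c in (True, False))
      for r, s in product((True, False), repeat=2)}
```

Note that the four entries of f1 sum to 1 here only because C was the sole parentless variable summed out; in general a factor is not a distribution.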
An Example with Evidence (1/2)
• Factors: the CPTs of the alarm network, with the rows selected by the evidence j, m
An Example with Evidence (2/2)
• fM(a) = <0.7, 0.1>,  fJ(a) = <0.9, 0.05>
• fA(a,b,e), then fĀJM(b,e) = Σ_a fA(a,b,e) fJ(a) fM(a)

  J M A B E   fM(a)  fJ(a)  fA(a,b,e)  fJM(a,b,e)
  T T T T T   0.7    0.9    0.95       0.7*0.9*0.95
  T T T T F   0.7    0.9    0.95       0.7*0.9*0.95
  T T T F T   0.7    0.9    0.29       0.7*0.9*0.29
  T T T F F   0.7    0.9    0.001      0.7*0.9*0.001
  T T F T T   0.1    0.05   0.05       0.1*0.05*0.05
  T T F T F   0.1    0.05   0.05       0.1*0.05*0.05
  T T F F T   0.1    0.05   0.71       0.1*0.05*0.71
  T T F F F   0.1    0.05   0.999      0.1*0.05*0.999

• Network: Burglary, Earthquake → Alarm → John Calls, Mary Calls
  P(B) = 0.001, P(E) = 0.002
  B E  P(A|B,E): TT 0.95, TF 0.95, FT 0.29, FF 0.001
  A  P(J|A): T 0.90, F 0.05
  A  P(M|A): T 0.70, F 0.01
Basic Operations
• Summing out a variable from a product of factors
  – Move any irrelevant terms (constants) outside the innermost summation
  – Add up submatrices in the pointwise product of the remaining factors
Variable Elimination Algorithm
• (Algorithm pseudocode shown as a figure on the slide)
Irrelevant Variables (1/2)
• Consider the query P(JohnCalls | Burglary = true)
  P(J|b) = α P(b) Σ_e P(e) Σ_a P(a|b,e) P(J|a) Σ_m P(m|a)
• The sum over m is identically 1: Σ_m P(m|a) = 1
• M is irrelevant to the query
Irrelevant Variables (2/2)
• Theorem 1: for the query P(X|E), Y is irrelevant if Y ∉ Ancestors({X} ∪ E)
• In the example P(J|b)
  – X = JohnCalls, E = {Burglary}
  – Ancestors({X} ∪ E) = {Alarm, Earthquake}
  – so MaryCalls is irrelevant
Complexity
• Time and space cost of variable elimination are O(n d^k)
  – n: no. of random variables
  – d: no. of discrete values
  – k: no. of parent nodes
• Polytrees: k is small, so cost is linear
  – If k=1, O(dn)
• Multiply connected networks: O(n d^k), with large k
  – Can reduce 3SAT to variable elimination → NP-hard
  – Equivalent to counting 3SAT models → #P-complete, i.e. strictly harder than NP-complete problems
• k is critical for complexity
Pros and Cons
• Variable elimination is simple and efficient for a single query P(Xi | e)
• But it is less efficient if all the variables are queried: P(X1 | e), …, P(Xk | e)
  – In a polytree network, one would need to issue O(n) queries costing O(n) each: O(n^2)
• The junction tree algorithm extends variable elimination to compute posterior probabilities for all nodes simultaneously
3.1 An Example
• The Asia network:
  Visit to Asia → Tuberculosis; Smoking → Lung Cancer, Bronchitis;
  Tuberculosis, Lung Cancer → Abnormality in Chest → X-Ray, Dyspnea (Bronchitis → Dyspnea)
(Asia network: V, S, T, L, A, B, X, D)

P(v,s,t,l,a,b,x,d) = P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

• We want to infer P(d)
• Need to eliminate: v, s, x, t, l, a, b
• Initial factors: the eight CPTs above
• "Brute force approach": P(d) = Σ_v Σ_s Σ_t Σ_l Σ_a Σ_b Σ_x P(v,s,t,l,a,b,x,d)
• Complexity is exponential: O(K^N)
  – N: size of the graph, number of variables
  – K: number of states for each variable
• Need to eliminate: v, s, x, t, l, a, b
• Current factors: P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
• Eliminate v. Compute: f_v(t) = Σ_v P(v) P(t|v)
  ⇒ f_v(t) P(s) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
• Note: f_v(t) = P(t). In general, the result of elimination is not necessarily a probability term.
  t  f_v(t): T 0.70, F 0.01
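The single step f_v(t) = Σ_v P(v) P(t|v) is one line of code; a sketch with made-up CPT numbers (the slides do not list Asia's tables here, so these values are assumptions):

```python
# One elimination step: sum V out of the product P(v) P(t|v).
P_V = {True: 0.01, False: 0.99}
P_T_given_V = {(True, True): 0.05, (False, True): 0.95,
               (True, False): 0.01, (False, False): 0.99}  # keyed (t, v)

f_v = {t: sum(P_T_given_V[(t, v)] * P_V[v] for v in (True, False))
       for t in (True, False)}
# V is gone: f_v mentions only T, and in this case equals P(t) exactly
# because P(v) P(t|v) is itself the joint P(v, t).
```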
• Need to eliminate: s, x, t, l, a, b
• Current factors: f_v(t) P(s) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
• Eliminate s. Compute: f_s(b,l) = Σ_s P(s) P(b|s) P(l|s)
  ⇒ f_v(t) f_s(b,l) P(a|t,l) P(x|a) P(d|a,b)
• Summing over s results in f_s(b,l), a factor with two arguments: the result of elimination may be a function of several variables.
  b l  f_s(b,l): TT 0.95, TF 0.95, FT 0.29, FF 0.001
• Need to eliminate: x, t, l, a, b
• Current factors: f_v(t) f_s(b,l) P(a|t,l) P(x|a) P(d|a,b)
• Eliminate x. Compute: f_x(a) = Σ_x P(x|a)
  ⇒ f_v(t) f_s(b,l) f_x(a) P(a|t,l) P(d|a,b)
• Note: f_x(a) = 1 for all values of a!
• Need to eliminate: t, l, a, b
• Current factors: f_v(t) f_s(b,l) f_x(a) P(a|t,l) P(d|a,b)
• Eliminate t. Compute: f_t(a,l) = Σ_t f_v(t) P(a|t,l)
  ⇒ f_s(b,l) f_x(a) f_t(a,l) P(d|a,b)
• Need to eliminate: l, a, b
• Current factors: f_s(b,l) f_x(a) f_t(a,l) P(d|a,b)
• Eliminate l. Compute: f_l(a,b) = Σ_l f_s(b,l) f_t(a,l)
  ⇒ f_x(a) f_l(a,b) P(d|a,b)
• Need to eliminate: a, b
• Current factors: f_x(a) f_l(a,b) P(d|a,b)
• Eliminate a and b. Compute:
  f_a(b,d) = Σ_a f_x(a) f_l(a,b) P(d|a,b),   f_b(d) = Σ_b f_a(b,d)
• Result: f_b(d) = P(d)
• A different elimination ordering: a, b, x, t, v, s, l
• Initial factors: P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
• Intermediate terms in this ordering:
  g_a(l,t,d,b,x,s,v), g_b(l,t,d,x,s,v), g_x(l,t,d,s,v), g_t(l,d,s,v), g_v(l,d,s), g_s(l,d), g_l(d)
• In the previous order the intermediate factors were f_v(t), f_s(b,l), f_x(a), f_t(a,l), f_l(a,b), f_a(b,d), f_b(d)
• Both orderings need n = 7 steps, but each step has a different computation size
Short Summary
• Variable elimination is a sequence of rewriting operations
• Computation depends on
  – Number of variables n: each elimination step removes one variable, so we need n elimination steps
  – Size of factors: affected by the order of elimination (discussed in sub-section 3.2)
Dealing with Evidence (1/7)
• How do we deal with evidence?
• Suppose we get evidence V = t, S = f, D = t
• We want to compute P(L, V = t, S = f, D = t)
Dealing with Evidence (2/7)
• We start by writing the factors:
  P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
• Since we know that V = t, we don't need to eliminate V
• Instead, we can replace the factors P(V) and P(T|V) with
  fP(V) = P(V = t)   and   fP(T|V)(T) = P(T | V = t)
• These "select" the appropriate parts of the original factors given the evidence
• Note that fP(V) is a constant, and thus does not appear in the elimination of other variables
Dealing with Evidence (3/7)–(7/7)
• Given evidence V = t, S = f, D = t, compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) P(x|a) fP(d|a,b)(a,b)
• Eliminating x, we get
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) f_x(a) fP(d|a,b)(a,b)
• Eliminating t, we get
  fP(v) fP(s) fP(l|s)(l) fP(b|s)(b) f_x(a) f_t(a,l) fP(d|a,b)(a,b)
• Eliminating a, we get
  fP(v) fP(s) fP(l|s)(l) fP(b|s)(b) f_a(b,l)
• Eliminating b, we get
  fP(v) fP(s) fP(l|s)(l) f_b(l)
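"Setting evidence" is just restricting a table; a sketch with a hypothetical (vars, table) factor representation, not the slides' own notation:

```python
# Restrict a factor to an observed value: the variable disappears and
# the table shrinks to the matching rows.
def restrict(factor, var, value):
    vars_, table = factor
    i = vars_.index(var)
    new_vars = vars_[:i] + vars_[i + 1:]
    new_table = {k[:i] + k[i + 1:]: p
                 for k, p in table.items() if k[i] == value}
    return (new_vars, new_table)

# Hypothetical P(T|V) table; observing V = True leaves a factor over T.
p_t_given_v = (("T", "V"), {(True, True): 0.05, (False, True): 0.95,
                            (True, False): 0.01, (False, False): 0.99})
f_T = restrict(p_t_given_v, "V", True)
```

A factor with all variables observed restricts down to a single constant, matching the note about fP(V) above.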
Complexity (1/2)
• Suppose in one elimination step we compute
  f_x(y1,…,yk) = Σ_x f'_x(x, y1,…,yk)
  where f'_x(x, y1,…,yk) = Π_{i=1..m} f_i(x, y_{i,1},…,y_{i,l_i})
• This requires
  – m·|X|·Π_i |Yi| multiplications: for each value of x, y1,…,yk, we do m multiplications
  – |X|·Π_i |Yi| additions: for each value of y1,…,yk, we do |X| additions
  (|X|: no. of discrete values of X)

Complexity (2/2)
• One elimination step requires
  – m·|X|·Π_i |Yi| multiplications and |X|·Π_i |Yi| additions
  – O(|X|·Π_i |Yi|), since m is a constant (neglected)
  – Or O(d^k) if |X| = |Yi| = d, with k neighbouring variables
• Time and space cost are O(n d^k)
  – n: no. of random variables; d: no. of discrete values; k: no. of parent nodes
• Complexity is exponential in k, the number of variables in a factor
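The operation counts above are easy to tabulate; a small sketch (function name and interface are my own) that reproduces the m·|X|·Π|Yi| multiplications and |X|·Π|Yi| additions:

```python
from math import prod

# Cost of one elimination step: m factors are multiplied over X and its
# neighbours Y1..Yk, then X is summed out.
def step_cost(m, x_size, y_sizes):
    mults = m * x_size * prod(y_sizes)  # m multiplies per joint cell
    adds = x_size * prod(y_sizes)       # |X| additions per Y-assignment
    return mults, adds
```

For example, with m = 2 binary factors and two binary neighbours, the step costs 16 multiplications and 8 additions.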
3.2 Order of Elimination
• How to select "good" elimination orderings in order to reduce complexity
  1. Start by understanding variable elimination via the graph we are working with
  2. Then reduce the problem of finding a good ordering to a well-understood graph-theoretic operation
Undirected Graph Conversion (1/2)
• At each stage of variable elimination, we have an algebraic term that we need to evaluate
• This term is of the form
  P(x1,…,xn) = Σ_{y1} … Σ_{yk} Π_i f_i(Z_i)
  where the Z_i are sets of variables
Undirected Graph Conversion (2/2)
• Plot a graph with an undirected edge X--Y whenever X and Y are arguments of some factor, that is, whenever X and Y appear together in some Z_i
Example
• Consider the "Asia" example; the initial factors are
  P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
• The undirected graph connects the arguments of each factor
• In the first step this graph is just the moralized graph
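Moralization itself is a two-line graph operation: keep every parent-child edge (undirected) and "marry" the co-parents of each node. A sketch on an Asia-shaped parent dict:

```python
from itertools import combinations

# Moralize a DAG given as a dict mapping each child to its parent list.
def moralize(parents):
    edges = set()
    for child, ps in parents.items():
        for p in ps:                      # original edges, undirected
            edges.add(frozenset((p, child)))
        for u, v in combinations(ps, 2):  # marry co-parents
            edges.add(frozenset((u, v)))
    return edges

# Asia: A has parents T and L, D has parents A and B, so moralizing
# adds the edges T--L and A--B.
asia = {"T": ["V"], "L": ["S"], "B": ["S"], "A": ["T", "L"],
        "X": ["A"], "D": ["A", "B"]}
moral_edges = moralize(asia)
```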
Variable Elimination as Change of Graph
• Now we eliminate t from
  P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
  getting
  P(v) P(s) P(l|s) P(b|s) P(x|a) P(d|a,b) f_t(v,a,l)
• The corresponding change in the graph: nodes V, L, A become a clique
Example (1/6)–(6/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
• Eliminating x — new factor f_x(A)
• Eliminating a — new factor f_a(b,t,l); its scope is a clique in the reduced undirected graph
• Eliminating b — new factor f_b(t,l); again a clique in the reduced undirected graph
• Eliminating t — new factor f_t(l)
Elimination and Clique (1/2)
• We can eliminate a variable X by
  1. For all Y, Z such that Y--X and Z--X, add an edge Y--Z
  2. Remove X and all edges adjacent to it
• This procedure creates a clique that contains all the neighbors of X
• After step 1 we have a clique that corresponds to the intermediate factor (before marginalization)
• The cost of the step is exponential in the size of this clique: d^k, giving O(n d^k) overall
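The two-step procedure translates directly into code; a sketch on an adjacency-set graph (the tiny example graph is made up):

```python
# Graph-side view of one elimination step: step 1 marries the
# neighbours of X into a clique, step 2 removes X and its edges.
def eliminate_node(adj, x):
    nbrs = adj.pop(x)                # step 2 (removal) starts here
    for y in nbrs:
        adj[y].discard(x)
        adj[y] |= nbrs - {y}         # step 1: connect all neighbours
    return nbrs | {x}                # the clique whose size drives d^k

adj = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
clique = eliminate_node(adj, "A")
```

Eliminating A here returns the clique {A, B, C} and leaves B--C behind, the fill-in edge that makes the final graph chordal.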
Elimination and Clique (2/2)
• The process of eliminating nodes from an undirected graph gives us a clue to the complexity of inference
• To see this, we examine the graph that contains all of the edges we added during elimination
• The resulting graph is always chordal
Example (1/7)–(7/7)
• Want to compute P(D)
• Moralizing
• Eliminating v: multiply to get f'_v(v,t); result f_v(t)
• Eliminating x: multiply to get f'_x(a,x); result f_x(a)
• Eliminating s: multiply to get f'_s(l,b,s); result f_s(l,b)
• Eliminating t: multiply to get f'_t(a,l,t); result f_t(a,l)
• Eliminating l: multiply to get f'_l(a,b,l); result f_l(a,b)
• Eliminating a, b: multiply to get f'_a(a,b,d); result f(d)
Induced Graphs
• The resulting graph is the induced graph (for this particular ordering)
• Main properties:
  – Every maximal clique in the induced graph corresponds to an intermediate factor in the computation
  – Every factor stored during the process is a subset of some maximal clique in the graph
• These facts hold for any variable elimination ordering on any network
Induced Width (Treewidth)
• The size k of the largest clique in the induced graph is an indicator of the complexity of variable elimination
• w = k - 1 is called the induced width (treewidth) of the graph, relative to the specified ordering
• Finding a good ordering for a graph is equivalent to finding the minimal induced width of the graph
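Since finding the minimal induced width is NP-hard in general, practice relies on heuristics; a sketch of greedy min-degree (one common choice, not the slides' prescribed method): repeatedly eliminate the node with the fewest neighbours and track the largest clique met.

```python
# Greedy min-degree ordering on an adjacency-set graph; returns the
# ordering and the induced width it achieves.
def min_degree_order(adj):
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    order, width = [], 0
    while adj:
        x = min(adj, key=lambda v: len(adj[v]))   # fewest neighbours
        nbrs = adj.pop(x)
        width = max(width, len(nbrs))             # clique size - 1
        for y in nbrs:                            # marry the neighbours
            adj[y].discard(x)
            adj[y] |= nbrs - {y}
        order.append(x)
    return order, width

chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
order, w = min_degree_order(chain)
```

On a chain the heuristic recovers the optimal width 1, matching the treewidth table on the next slide; on loopy graphs it gives only an upper bound.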
Treewidth
• (Figure: the ALARM medical-diagnosis network as an example of a loopy graph)
• From low treewidth to high treewidth:
  – Chains: W = 1
  – Trees (no loops): W = #parents
  – n×n grid (N = n×n nodes): W = O(n) = O(√N)
  – Loopy graphs: W is NP-hard to find (Arnborg, 1985)
Complexity (revisited)
• Time and space cost of variable elimination are O(n d^k)
  – n: no. of random variables; d: no. of discrete values
  – k: no. of parent nodes = treewidth + 1 (W + 1)
• Polytrees: k is small, so cost is linear; if k=1, O(dn)
• Multiply connected networks: O(n d^k), with large k
  – Can reduce 3SAT to variable elimination → NP-hard
  – Equivalent to counting 3SAT models → #P-complete, i.e. strictly harder than NP-complete problems
Elimination on Trees (1/3)
• Suppose we have a tree:
  – A network where each variable has at most one parent
• Then all the factors involve at most two variables: treewidth = 1
• The moralized graph is also a tree
[Figure: a tree over A, B, C, D, E, F, G and its moral graph]
Elimination on Trees (2/3)
• We can maintain the tree structure by eliminating extreme (leaf) variables of the tree
[Figure: successive elimination of leaves from the tree over A, B, C, D, E, F, G]
Elimination on Trees (3/3)
• Formally, for any tree, there is an elimination ordering with treewidth = 1
Theorem
• Inference on trees is linear in the number of variables: O(dn)
Exercise: Variable Elimination
Network: smart and study are parents of prepared; smart, prepared, and fair are parents of pass
p(smart) = .8,  p(study) = .6,  p(fair) = .9

p(prep | smart, study):
           smart  ¬smart
  study     .9     .7
  ¬study    .5     .1

p(pass | smart, prep, fair):
              smart         ¬smart
           prep  ¬prep    prep  ¬prep
  fair      .9    .7       .7    .2
  ¬fair     .1    .1       .1    .1

Query: What is the probability that a student studied, given that they pass the exam?
Variable Elimination Algorithm
• Let X1, ..., Xm be an ordering on the non-query variables
• For i = m, ..., 1:
  – Leave in the summation for Xi only factors mentioning Xi
  – Multiply the factors, getting a factor that contains a number for each value of the variables mentioned, including Xi
  – Sum out Xi, getting a factor f that contains a number for each value of the variables mentioned, not including Xi
  – Replace the multiplied factors in the summation by f

P(X1) = Σ_{X2} ... Σ_{Xm} Π_j P(Xj | Parents(Xj))
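As an illustration, the algorithm above can be sketched on a tiny chain A → B → C, computing P(C) by summing out A and then B. All names and CPT numbers here are illustrative assumptions, not from the lecture:

```python
# Variable elimination on the chain A -> B -> C (binary variables).
# Factors are dicts keyed by variable assignments.

P_A = {0: 0.6, 1: 0.4}                          # P(A)
P_B_given_A = {(0, 0): 0.7, (0, 1): 0.3,        # P(B|A)
               (1, 0): 0.2, (1, 1): 0.8}
P_C_given_B = {(0, 0): 0.9, (0, 1): 0.1,        # P(C|B)
               (1, 0): 0.4, (1, 1): 0.6}

# Eliminate A: f(b) = sum_a P(a) P(b|a)
f_B = {b: sum(P_A[a] * P_B_given_A[(a, b)] for a in (0, 1)) for b in (0, 1)}

# Eliminate B: P(c) = sum_b f(b) P(c|b)
P_C = {c: sum(f_B[b] * P_C_given_B[(b, c)] for b in (0, 1)) for c in (0, 1)}

print(P_C)  # a proper distribution over C: values sum to 1
```

Each elimination step multiplies the factors mentioning the variable and sums it out, exactly as in the loop above.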
3.3 General Graphs
• If the graph is not a polytree:
  – More general networks
  – Usually loopy networks
• Can we do inference on loopy networks by variable elimination?
  – If the network has a cycle, the treewidth for any ordering is greater than 1
  – Its complexity is high, and VE becomes an impractical algorithm
Example (1/2)
• Eliminating A, B, C, D, E, ...
• The resulting graph is chordal with treewidth 2
[Figure: elimination steps on the loopy graph over A, B, C, D, E, F, G, H]
Example (2/2)
• Eliminating H, G, E, C, F, D, E, A
• The resulting graph is chordal with treewidth 3
[Figure: elimination steps on the loopy graph over A, B, C, D, E, F, G, H]
Finding a Good Elimination Order in General Graphs
Theorem:
• Finding an ordering that minimizes the treewidth is NP-hard
However:
• There are reasonable heuristics for finding a "relatively" good ordering
• There are provable approximations to the best treewidth
• If the graph has a small treewidth, there are algorithms that find it in polynomial time
Heuristics for Finding an Elimination Order
• Since the elimination order is NP-hard to optimize, it is common to apply greedy search techniques [Kjaerulff90]
• At each iteration, eliminate the node that would result in the smallest
  – Number of fill-in edges [min-fill]
  – Resulting clique weight [min-weight] (weight of a clique = product of the number of states per node in the clique)
• There are also approximation algorithms [Amir01]
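A minimal sketch of the greedy min-fill heuristic described above; the adjacency representation and tie-breaking (first minimum in insertion order) are illustrative assumptions:

```python
def min_fill_ordering(adj):
    """Greedy min-fill: repeatedly eliminate the node whose elimination
    adds the fewest fill-in edges among its remaining neighbors.
    adj: dict node -> set of neighbor nodes (undirected graph)."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy

    def fill_cost(v):
        # Number of neighbor pairs of v that are not yet connected.
        nbrs = list(adj[v])
        return sum(1 for i in range(len(nbrs)) for j in range(i + 1, len(nbrs))
                   if nbrs[j] not in adj[nbrs[i]])

    order = []
    while adj:
        v = min(adj, key=fill_cost)
        for a in adj[v]:                 # add the fill-in edges
            for b in adj[v]:
                if a != b:
                    adj[a].add(b)
        for a in adj[v]:                 # remove v from the graph
            adj[a].discard(v)
        del adj[v]
        order.append(v)
    return order

# On a chain a-b-c no fill-in edges are ever needed.
print(min_fill_ordering({'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}))  # → ['a', 'b', 'c']
```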
Factorization in Loopy Networks
• Factorizable — probabilistic models with no loop are tractable:
  Σ_a Σ_b Σ_c Σ_d P(x_a, x_b) P(x_b, x_c) P(x_c, x_d)
  = Σ_b P(x_a, x_b) Σ_c P(x_b, x_c) Σ_d P(x_c, x_d)
• Not factorizable — probabilistic models with a loop are not tractable:
  Σ_a Σ_b Σ_c Σ_d P(x_a, x_b, x_c, x_d) does not decompose into local sums
[Figure: a chain a–b–c–d vs. a loop a–b–c–d]
Short Summary
• Variable elimination
  – The actual computation is done in the elimination steps
  – The computation depends on the order of elimination
  – Very sensitive to topology
  – Space cost is of the same order as time cost
• Complexity
  – Polytrees: linear time
  – General graphs: NP-hard
4. Belief Propagation
• Also called–Message passing–Pearl’s algorithm
• Subsections–4.1 Message passing in simple chains–4.2 Message passing in trees–4.3 BP Algorithm–4.4 Message passing in general graphs
What's Wrong with VarElim
• Often we want to query all hidden nodes
• Variable elimination takes O(N² d^k) time to compute P(Xi|e) for all N (hidden) nodes Xi
• Message passing algorithms can do this in O(N d^k) time
Repeated Variable Elimination Leads to Redundant Calculations
[Figure: HMM with hidden chain X1 → X2 → X3 and observations Y1, Y2, Y3]
O(N² K²) time to compute all N marginals:
P(x1 | y1:3) ∝ P(x1) P(y1|x1) Σ_{x2} P(x2|x1) P(y2|x2) Σ_{x3} P(x3|x2) P(y3|x3)
P(x2 | y1:3) ∝ P(y2|x2) Σ_{x1} P(x2|x1) P(x1) P(y1|x1) Σ_{x3} P(x3|x2) P(y3|x3)
P(x3 | y1:3) ∝ P(y3|x3) Σ_{x2} P(x3|x2) P(y2|x2) Σ_{x1} P(x2|x1) P(x1) P(y1|x1)
Belief Propagation• Belief propagation (BP) operates by sending
beliefs/messages between nearby variables in the graphical model
• It works like variable elimination
4.1 Message Passing in Simple Chains
• Likelihood query (query without evidence)– P(X1), P(Xn), P(Xk)– P(Xj , Xk)
• Posterior query (query with evidence)– P(X1|Xn), P(Xn|X1), – P(Xk|X1), P(Xk|Xn),– P(X1|Xk), P(Xn|Xk),– P(Xk|Xj)
• Maximum A Posterior (MAP) query– arg max P(Xk|Xj)
X1 Xk Xn......
Sum-Product of the Simple Chain (1/2)
Chain: X1 → ... → Xk → ... → Xn
P(Xk) = Σ_{X1} ... Σ_{Xk-1} Σ_{Xk+1} ... Σ_{Xn} P(X1, ..., Xk, ..., Xn)
      = Σ_{X1} ... Σ_{Xk-1} Σ_{Xk+1} ... Σ_{Xn} Π_i P(Xi | Pa(Xi))
      = [Σ_{Xk+1} P(Xk+1|Xk) ... Σ_{Xn} P(Xn|Xn-1)] · [Σ_{X1} P(X1) P(X2|X1) ... Σ_{Xk-1} P(Xk|Xk-1)]
Sum-Product of the Simple Chain (2/2)
Chain: X1 → ... → Xj → ... → Xk → ... → Xn
P(Xj, Xk) = Σ_{Xi : i = 1..n, i ≠ j,k} Π_i P(Xi | Pa(Xi))
          = Σ_{Xi : i = 1..n, i ≠ j,k} P(X1) P(X2|X1) ... P(Xn|Xn-1)
4.1.1 Likelihood Query
X1 X2 X3 Xn...• P(Xn) or P(xn) : Forward passing
X1 X2 X3 Xn...• P(X1) or P(x1) : Backward passing
X1 X2 Xk Xn......
• P(Xk) or P(xk) : Forward-Backward passing
Forward Passing (1/6)
• P(e):  chain A → B → C → D → E
P(e) = Σ_d Σ_c Σ_b Σ_a P(a) P(b|a) P(c|b) P(d|c) P(e|d)
     = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) Σ_a P(a) P(b|a)
Forward Passing (2/6)
• Now we can perform the innermost summation:
P(e) = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) Σ_a P(a) P(b|a)
     = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) p(b)
• This summation is exactly
  – A variable elimination step
  – We say: send a CPT P(b) to compute the next innermost summation
  – The sent CPT P(b) is called a belief, or message:
m_AB(b) = P(b) = Σ_a P(a) P(b|a) = Σ_a f(a, b)
Forward Passing (3/6)
• Rearranging and then summing again:
P(e) = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) p(b)
     = Σ_d P(e|d) Σ_c P(d|c) p(c)
     = Σ_d P(e|d) Σ_c P(d|c) m_BC(c)
m_BC(c) = P(c) = Σ_b P(c|b) m_AB(b)
[Chain A → B → C → D → E with messages m_AB(B), m_BC(C), m_CD(D), m_DE(E)]
Forward Passing (4/6)
• How do we compute P(Xn)?
P(xn) = Σ_{x1} Σ_{x2} ... P(x1) P(x2|x1) P(x3|x2) ... P(xn|xn-1)
• Actually, we recursively compute m_{k-1,k}(xk):
m_{k-1,k}(xk) = Σ_{xk-1} P(xk|xk-1) m_{k-2,k-1}(xk-1) = Σ_{xk-1} P(xk|xk-1) P(xk-1)
• m_{k-1,k}(xk) is called a belief, or message
[Chain X1 → X2 → X3 → ... → Xn with messages m_{1,2}(x2), m_{2,3}(x3), ..., m_{n-1,n}(xn)]
Forward Passing (5/6)
m_{k-1,k}(xk) = Σ_{xk-1} P(xk|xk-1) m_{k-2,k-1}(xk-1)
[Table: for each value of xk ∈ {T, F}, multiply each row P(xk|xk-1) by the incoming message P(xk-1) = m_{k-2,k-1}(xk-1) and sum over xk-1 to get P(xk) = m_{k-1,k}(xk)]
Advantage:
• After P(Xn), all P(Xk) are also obtained
• Computes the beliefs of all variables at once, in O(N d^k) time
Forward Passing (6/6)
m_{k-1,k}(xk) = Σ_{xk-1} P(xk|xk-1) m_{k-2,k-1}(xk-1)
P(e) = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) Σ_a P(a) P(b|a)
     = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) m_AB(b)
• Base case — when Xk has no parents:
m_{k-1,k}(xk) = P(xk),  i.e. m_{0,1}(x1) = P(x1)
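The forward recursion can be sketched for a binary chain as follows; the transition CPTs are illustrative:

```python
def forward_marginals(prior, transitions):
    """Forward passing on a chain X1 -> X2 -> ... -> Xn.
    prior: list P(X1); transitions: list of row-stochastic matrices
    T[i][j] = P(X_{t+1}=j | X_t=i). Returns P(Xk) for every k."""
    m = list(prior)                  # m_{0,1} = P(X1)
    marginals = [m]
    for T in transitions:
        # m_{k-1,k}(xk) = sum_{xk-1} P(xk|xk-1) m_{k-2,k-1}(xk-1)
        m = [sum(m[i] * T[i][j] for i in range(len(m)))
             for j in range(len(T[0]))]
        marginals.append(m)
    return marginals

T = [[0.9, 0.1], [0.2, 0.8]]         # the same CPT at every step
ms = forward_marginals([0.5, 0.5], [T, T])
print(ms[-1])                        # P(X3) — one pass yields every P(Xk)
```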
Backward Passing (1/4)
• P(a):  chain A → B → C → D → E
P(a) = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) Σ_e P(e|d)
     = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) f(d)
• Eliminating variable e, we get f(d)
  – We call it a belief/message sent from E to D:
m_ED(d) = f(d) = Σ_e P(e|d) = 1
Backward Passing (2/4)
• Eliminating d, we get f(c)
  – We call it a belief/message sent from D to C:
m_DC(c) = f(c) = Σ_d P(d|c) m_ED(d)   ( = 1 )
P(a) = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) m_ED(d)
     = P(a) Σ_b P(b|a) Σ_c P(c|b) f(c)
Backward Passing (3/4)
P(a) = P(a) Σ_b P(b|a) Σ_c P(c|b) m_DC(c)
• Eliminating c:
P(a) = P(a) Σ_b P(b|a) f(b) = P(a) Σ_b P(b|a) m_CB(b)
• Eliminating b:
P(a) = P(a) f(a) = P(a) m_BA(a),  with m_BA(a) = 1
Backward Passing (4/4)
m_{k+1,k}(xk) = Σ_{xk+1} P(xk+1|xk) m_{k+2,k+1}(xk+1)
• Base case — when Xk has no child:
m_{k+1,k}(xk) = 1,  i.e. m_{n+1,n}(xn) = 1
P(a) = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) Σ_e P(e|d)
     = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) m_ED(d),  m_ED(d) = Σ_e P(e|d) = 1
Comparison
• Forward:  m_{k-1,k}(xk) = Σ_{xk-1} P(xk|xk-1) m_{k-2,k-1}(xk-1),  with m_{k-1,k}(xk) = P(xk) if Xk has no parents
• Backward: m_{k+1,k}(xk) = Σ_{xk+1} P(xk+1|xk) m_{k+2,k+1}(xk+1),  with m_{k+1,k}(xk) = 1 if Xk has no child
Forward-Backward Passing (1/3)
• P(Xk):
P(xk) = [Σ_{xk+1} P(xk+1|xk) ... Σ_{xn} P(xn|xn-1)] · [Σ_{xk-1} P(xk|xk-1) ... Σ_{x1} P(x2|x1) P(x1)]
      = m_{k+1,k}(xk) · m_{k-1,k}(xk)
Forward-Backward Passing (2/3)
• P(xk) = m_{k-1,k}(xk) · m_{k+1,k}(xk), where
m_{k-1,k}(xk) = Σ_{xk-1} P(xk|xk-1) m_{k-2,k-1}(xk-1)
m_{k+1,k}(xk) = Σ_{xk+1} P(xk+1|xk) m_{k+2,k+1}(xk+1)
Forward-Backward Passing (3/3)
• P(Xn) as forward-backward passing:
P(xn) = m_{n-1,n}(xn) · m_{n+1,n}(xn),  with m_{n+1,n}(xn) = 1
• P(X1) as forward-backward passing:
P(x1) = m_{0,1}(x1) · m_{2,1}(x1),  with m_{0,1}(x1) = P(x1)
Exercise• P(X1, Xn)• P(Xj, Xk)
X1 Xj Xk Xn.........
4.1.2 Posterior Query
• P(Xn|x1): forward passing
• P(X1|xn): backward passing
• P(Xk|x1, xn): forward-backward passing
Backward Passing (1/6)
• A query P(A|e) ∝ P(A, e)
  – Variable elimination in chains with evidence; chain A → B → C → D → e
P(A, e) = Σ_b Σ_c Σ_d P(A, b, c, d, e)
        = Σ_b Σ_c Σ_d P(A) P(b|A) P(c|b) P(d|c) P(e|d)
        = P(A) Σ_b P(b|A) Σ_c P(c|b) Σ_d P(d|c) P(e|d)
Backward Passing (2/6)
• Eliminating d, we get f(c, e)
  – We call it a belief/message sent from D to C:
m_DC(c) = f(c, e) = Σ_d P(d|c) P(e|d) = Σ_d P(d|c) m_ED(d)
m_ED(d) = P(e|d) · 1 = P(e|d) m_FE(e),  with m_FE(e) = 1
P(A, e) = P(A) Σ_b P(b|A) Σ_c P(c|b) Σ_d P(d|c) P(e|d)
        = P(A) Σ_b P(b|A) Σ_c P(c|b) f(c, e)
Backward Passing (3/6)
• Eliminating c, we get:
P(A, e) = P(A) Σ_b P(b|A) Σ_c P(c|b) m_DC(c)
        = P(A) Σ_b P(b|A) f(b, e)
m_CB(b) = f(b, e) = Σ_c P(c|b) f(c, e) = Σ_c P(c|b) m_DC(c)
Backward Passing (4/6)
• Finally, we eliminate b:
P(A, e) = P(A) Σ_b P(b|A) m_CB(b)
        = P(A) f(A, e)
m_BA(A) = f(A, e) = Σ_b P(b|A) f(b, e) = Σ_b P(b|A) m_CB(b)
Backward Passing (5/6)
• Given Xn = xn, how do we compute P(X1, xn) for P(X1|xn)?
P(X1, xn) = Σ_{x2} ... Σ_{xn-1} P(x1) Π_i P(xi+1|xi) = P(x1) m_{2,1}(x1) = m_{0,1}(x1) m_{2,1}(x1)
m_{k+1,k}(xk) = Σ_{xk+1} P(xk+1|xk) m_{k+2,k+1}(xk+1)
m_{0,1}(x1) = P(x1), since X1 has no parents
Backward Passing (6/6)
P(a, e) = Σ_b Σ_c Σ_d P(a) P(b|a) P(c|b) P(d|c) P(e|d)
        = P(a) Σ_b P(b|a) Σ_c P(c|b) m_DC(c)
m_DC(c) = Σ_d P(d|c) P(e|d) = Σ_d P(d|c) m_ED(d)
4.1.3 Short Summary
• Messages can be computed recursively:
Forward:  m_{k-1,k}(xk) = Σ_{xk-1} P(xk|xk-1) m_{k-2,k-1}(xk-1)
Backward: m_{k+1,k}(xk) = Σ_{xk+1} P(xk+1|xk) m_{k+2,k+1}(xk+1)
Belief of Any Node
P(xk) = m_{k-1,k}(xk) · m_{k+1,k}(xk)
P(xk|e) ∝ m_{k-1,k}(xk) · m_{k+1,k}(xk)
where
m_{k-1,k}(xk) = Σ_{xk-1} P(xk|xk-1) m_{k-2,k-1}(xk-1)
m_{k+1,k}(xk) = Σ_{xk+1} P(xk+1|xk) m_{k+2,k+1}(xk+1)
Three Special Cases of Message
• Node xk without parents:  m_{k-1,k}(xk) = P(xk)
• Node xk without child:    m_{k+1,k}(xk) = 1
• Node xk with evidence: the outgoing messages reduce to the CPT slices at the observed value,
  m_{k,k+1} = P(xk+1|xk) and m_{k,k-1} = P(xk|xk-1) with xk fixed
4.2 Message Passing in Trees
• Markov chain: P(xk) = m_{k-1,k}(xk) · m_{k+1,k}(xk)
• Markov tree: a node Xk can receive messages from several neighbors, e.g. Xk-1, Xk+1, and Xm:
P(xk) = m_{k-1,k}(xk) · m_{k+1,k}(xk) · m_{m,k}(xk) = Π_{j ∈ N(xk)} m_{j,k}(xk)
Two Examples
[Figure: a simple tree over x1, ..., x5 and a general tree]
Message Passing in Simple Tree (1/3)
[Tree: x1 → x2 → x3, with x3 → x4 and x3 → x5; message m_{1,2}(x2) is sent from x1 to x2]
Message Passing in Simple Tree (2/3)
m_{3,5}(x5) = Σ_{x3} P(x5|x3) m_{2,3}(x3) m_{4,3}(x3)
[Tree with messages m_{1,2}(x2), m_{2,3}(x3), m_{4,3}(x3) flowing toward x5]
Message Passing in Simple Tree (3/3)
m_{3,5}(x5) = Σ_{x3} P(x5|x3) m_{2,3}(x3) m_{4,3}(x3)
P(x3) = m_{2,3}(x3) m_{4,3}(x3) m_{5,3}(x3)
• In general:
P(xi) = Π_{xj ∈ Neighbor(xi)} m_{j,i}(xi)
m_{j,k}(xk) = Σ_{xj} P(xk|xj) Π_{xi ∈ Neighbor(xj)\{xk}} m_{i,j}(xj)
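The general message rule above can be sketched as a recursive sum-product over an undirected tree. Here unary potentials phi and symmetric pairwise potentials psi stand in for the priors and CPTs; the two-node example at the end uses illustrative numbers:

```python
def message(j, i, adj, phi, psi, cache):
    """m_{j->i}(x_i) = sum_{x_j} phi_j(x_j) psi(x_j, x_i)
                       prod_{k in N(j)\{i}} m_{k->j}(x_j)."""
    if (j, i) in cache:
        return cache[(j, i)]
    out = []
    for xi in range(2):                       # binary variables
        s = 0.0
        for xj in range(2):
            t = phi[j][xj] * psi[(j, i)][xj][xi]
            for k in adj[j]:
                if k != i:                    # all neighbors of j except i
                    t *= message(k, j, adj, phi, psi, cache)[xj]
            s += t
        out.append(s)
    cache[(j, i)] = out
    return out

def belief(i, adj, phi, psi):
    """P(x_i) ∝ phi_i(x_i) * product of incoming messages."""
    b = list(phi[i])
    for k in adj[i]:
        m = message(k, i, adj, phi, psi, {})
        b = [b[x] * m[x] for x in range(2)]
    z = sum(b)
    return [x / z for x in b]

adj = {0: [1], 1: [0]}
phi = {0: [0.5, 0.5], 1: [0.8, 0.2]}
psi = {(0, 1): [[0.9, 0.1], [0.1, 0.9]],
       (1, 0): [[0.9, 0.1], [0.1, 0.9]]}
print(belief(0, adj, phi, psi))   # → [0.74, 0.26]
```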
Message Passing in HMM (1/3)
[HMM: hidden chain X1 → X2 → X3 with observations Y1, Y2, Y3]
• Filtering (forward algorithm): P(X3|y1:3)
Message Passing in HMM (2/3)
• Smoothing
  – P(X1|y1:3): backward algorithm
  – P(X2|y1:3): forward-backward algorithm
[HMM: hidden chain X1 → X2 → X3 with observations Y1, Y2, Y3]
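The smoothing computation above can be sketched as a compact forward-backward pass for a 2-state HMM; all parameters are illustrative:

```python
def forward_backward(prior, trans, emit, obs):
    """Smoothed posteriors P(X_t | y_1:n) for a discrete HMM.
    prior: P(X1); trans[r][s] = P(X_{t+1}=s | X_t=r);
    emit[s][y] = P(Y=y | X=s); obs: list of observed symbols."""
    n, S = len(obs), len(prior)
    alpha = [[0.0] * S for _ in range(n)]     # forward messages
    beta = [[1.0] * S for _ in range(n)]      # backward messages
    for s in range(S):
        alpha[0][s] = prior[s] * emit[s][obs[0]]
    for t in range(1, n):
        for s in range(S):
            alpha[t][s] = emit[s][obs[t]] * sum(
                alpha[t-1][r] * trans[r][s] for r in range(S))
    for t in range(n - 2, -1, -1):
        for s in range(S):
            beta[t][s] = sum(trans[s][r] * emit[r][obs[t+1]] * beta[t+1][r]
                             for r in range(S))
    post = []
    for t in range(n):                        # normalize alpha * beta
        w = [alpha[t][s] * beta[t][s] for s in range(S)]
        z = sum(w)
        post.append([x / z for x in w])
    return post

post = forward_backward([0.5, 0.5],
                        [[0.7, 0.3], [0.3, 0.7]],
                        [[0.9, 0.1], [0.2, 0.8]],
                        [0, 0, 1])
print(post)
```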
Message Passing without Evidence
Belief(xj) = P(xj) = Π_{xi ∈ Neighbor(xj)} m_{i,j}(xj)
m_{j,k}(xk) = Σ_{xj} P(xk|xj) Π_{xi ∈ Neighbor(xj)\{xk}} m_{i,j}(xj)
[Figure: node xj receives messages m_{i1,j}(xj), ..., m_{im,j}(xj) from its neighbors and sends m_{j,k}(xk)]
Message Passing with Evidence (1/2)
• Given a set of evidence e = e⁺ ∪ e⁻
• The node x splits the network into two disjoint parts:
P(e⁺, e⁻|x) = P(e⁺|x) P(e⁻|x)
(conditional independence in a polytree)
Message Passing with Evidence (2/2)
• Given a set of evidence e = e⁺ ∪ e⁻
• The belief Belief(x) of a node x is
Belief(x) = P(x|e) = P(x|e⁺, e⁻)
          = α' P(e⁺, e⁻|x) P(x) = α' P(e⁺|x) P(e⁻|x) P(x)
          ∝ P(e⁻|x) P(x|e⁺) = λ(x) π(x)
Factorization
Probabilistic models with no loop are factorizable:
Σ_a Σ_b Σ_c Σ_d P(x_a, x_b) P(x_b, x_c) P(x_c, x_d)
= Σ_b P(x_a, x_b) Σ_c P(x_b, x_c) Σ_d P(x_c, x_d)
[Figure: chain a–b–c–d]
Marginal to MAP: Max Product
[Tree over x1, ..., x5: the sum-product messages m_{1,2}(x2), m_{2,3}(x3), m_{4,3}(x3), m_{3,5}(x5) are replaced by max-product messages]
Sum-Product vs. Max-Product
• Sum-product computes marginals using the rule:
m_{j,k}(xk) = Σ_{xj} P(xk|xj) Π_{xi ∈ Neighbor(xj)\{xk}} m_{i,j}(xj)
• Max-product computes max-marginals using the rule:
m_{j,k}(xk) = max_{xj} P(xk|xj) Π_{xi ∈ Neighbor(xj)\{xk}} m_{i,j}(xj)
• Same algorithm on different semirings: (+, ×) vs. (max, ×) [Shafer90, Bistarelli97, Goodman99, Aji00]
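On a chain, swapping Σ for max gives the Viterbi-style MAP computation. A sketch with backpointers to recover the MAP assignment (CPTs illustrative):

```python
def map_chain(prior, trans_list):
    """Max-product on a chain: same recursion as sum-product with
    sum replaced by max; backpointers recover the MAP path."""
    m = list(prior)
    bps = []                                  # one backpointer array per step
    for T in trans_list:
        new, bp = [], []
        for j in range(len(T[0])):
            scores = [m[i] * T[i][j] for i in range(len(m))]
            best = max(range(len(scores)), key=scores.__getitem__)
            new.append(scores[best])
            bp.append(best)
        m, bps = new, bps + [bp]
    state = max(range(len(m)), key=m.__getitem__)   # best final state
    path = [state]
    for bp in reversed(bps):                  # follow backpointers
        state = bp[state]
        path.append(state)
    return list(reversed(path))

T = [[0.9, 0.1], [0.2, 0.8]]    # sticky transitions
print(map_chain([0.5, 0.5], [T, T]))   # → [0, 0, 0]
```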
4.3 Pearl's BP Algorithm
• The forward-backward algorithm can be generalized to apply to any tree-like graph (one with no loops)
• For now, we assume pairwise potentials
[Pearl88, Shafer90, Yedidia01, etc.]
Basic Idea• 2 passes : Collect and Distribute
rootroot
Collect Evidence
rootroot
Distribute Evidence
Figure from P. Green
Collect Evidence: Absorb Messages
Belief(xj) = P(xj) = Π_{xi ∈ Neighbor(xj)} m_{i,j}(xj)
[Figure: node xj absorbs incoming messages m_{i1,j}(xj), ..., m_{im,j}(xj) from its neighbors]
Distribute Evidence: Send Messages
m_{j,k}(xk) = Σ_{xj} P(xk|xj) Π_{xi ∈ Neighbor(xj)\{xk}} m_{i,j}(xj)
[Figure: node xj sends outgoing messages m_{j,i1}(xi1), ..., m_{j,im}(xim) to its neighbors]
Initialization
• For nodes with evidence e:
  – λ(xi) = 1 wherever xi = ei ; 0 otherwise
  – π(xi) = 1 wherever xi = ei ; 0 otherwise
• For nodes without parents:
  – π(xi) = p(xi), the prior probabilities
• For nodes without children:
  – λ(xi) = 1 uniformly (normalize at the end)
Centralized Protocol
• Collect to root (post-order), then distribute from root (pre-order)
• Computes all N marginals in 2 passes over the graph
[Figure: a rooted tree with nodes numbered in collect order, then in distribute order]
Distributed Protocol
• Computes all N marginals in O(N) parallel updates
[Figure: collect and distribute phases as parallel message updates]
Propagation Example in a Tree
• This example requires five time periods to reach equilibrium (Pearl, 1988, p. 174)
[Figure: data enters at two leaves; messages are collected and then distributed through the tree]
Properties of BP
• Exact inference for polytrees
  – Each node separates the polytree into 2 disjoint components
• On a polytree, the BP algorithm converges in time linearly proportional to the number of nodes
  – Work done at a node is proportional to the size of its CPT
  – Hence BP is linear in the number of network parameters
• For general graphs
  – Exact inference is NP-hard
  – Approximate inference is also NP-hard
4.4 Message Passing in General Graphs
• Belief propagation is only guaranteed to be correct for polytrees (trees)
• Most probabilistic graphs
  – Are not polytrees
  – Have many loops
• For such graphs we cannot factorize the joint probability P(X1, ..., Xn) into a sum-product of P(Xi|Pa(Xi)) terms
Loopy Belief Propagation
• Applying BP to graphs with loops (cycles) can give the wrong answer, because it overcounts evidence
• In practice, it often works well (e.g., error-correcting codes)
[Figure: the loopy network Cloudy → Sprinkler, Cloudy → Rain, Sprinkler → WetGrass, Rain → WetGrass]
Factorization in Loopy Networks
• Factorizable — probabilistic models with no loop are tractable:
  Σ_a Σ_b Σ_c Σ_d P(x_a, x_b) P(x_b, x_c) P(x_c, x_d)
  = Σ_b P(x_a, x_b) Σ_c P(x_b, x_c) Σ_d P(x_c, x_d)
• Not factorizable — probabilistic models with a loop are not tractable:
  Σ_a Σ_b Σ_c Σ_d P(x_a, x_b, x_c, x_d) does not decompose into local sums
[Figure: a chain a–b–c–d vs. a loop a–b–c–d]
Two Methods• Loopy Belief Propagation
– Approximate Inference• Clustering (Join Tree, Junction Tree)
– Combine multiple nodes into a hyper-node• Transform loopy graph into polytree
– Then perform belief propagation
Loopy Belief Propagation• If BP is used on graphs with loops,
messages may circulate indefinitely• Empirically, a good approximation is
still achievable– Stop after fixed # of iterations– Stop when no significant change in beliefs– If solution is not oscillatory but converges,
it usually is a good approximation• Example: Turbo Codes
Clustering
• The general graph should be converted to a junction tree, by clustering nodes
• Message passing in the general graph = Message passing in the junction tree
5. Junction Tree
• Also known as –Clustering algorithm–Join tree algorithm
• Sub-sections–5.1 Junction tree algorithm–5.2 Example: create join tree
Basic Idea
• Join individual nodes to form cluster nodes
• The resulting network becomes a polytree
  – Singly connected
  – Undirected
• Inference is performed in the polytree
• This reduces the cost of inferring all variables to O(n)
  – n is the size of the modified network (the polytree)
An Example (1/2)
• A multiply connected network
An Example (2/2)
• A polytree obtained by combining Sprinkler and Rain
Why Junction Tree
• More efficient inference of all variables
  – Than variable elimination
  – For some PGMs (multiply connected networks)
• We avoid cycles if we
  – Turn highly-interconnected subsets of the nodes into "hypernodes" (clusters)
5.1 Junction Tree Algorithm
Step 1: Graph transformation
  (a) Moralize
  (b) Triangulate
  (c) Identify cliques
  (d) Build junction tree
Step 2: Initialization (of values)
  (a) Set up potentials
  (b) Propagate potentials
Step 3: Update beliefs
  (a) Insert evidence into the junction tree
  (b) Propagate potentials
Steps 1 and 2 are performed only once
Step 1: Graph Transformation
DAG → 1(a) Moralize → Moral Graph → 1(b) Triangulate → Triangulated Graph → 1(c) Identify cliques → Hypernodes of Cliques → 1(d) Build junction tree → Junction Tree
Step 1(a) - Moralize (1/2)
• Add undirected edges between all co-parents which are not currently joined
  – "Marrying" the parents
[Figure: the DAG over A, ..., H before and after adding moral edges]
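The marrying step can be sketched as follows; the graph representation (a parents dict, edges as frozensets) is an illustrative assumption:

```python
def moralize(parents):
    """Moralize a DAG: connect all pairs of co-parents, then drop edge
    directions. parents: dict node -> list of parent nodes.
    Returns undirected edges as a set of frozensets."""
    edges = set()
    for child, ps in parents.items():
        for p in ps:                          # drop directions
            edges.add(frozenset((p, child)))
        for i in range(len(ps)):              # marry co-parents
            for j in range(i + 1, len(ps)):
                edges.add(frozenset((ps[i], ps[j])))
    return edges

# The v-structure A -> C <- B gains the moral edge A-B.
print(moralize({'A': [], 'B': [], 'C': ['A', 'B']}))
```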
Step 1(a) - Moralize (2/2)
• Drop the directions of the arcs
[Figure: directed graph → undirected moral graph over A, ..., H]
Step 1(b) - Triangulate
• An undirected graph is triangulated iff every cycle of length > 3 contains an edge (a chord) connecting two nonadjacent nodes of the cycle
[Figure: a non-triangulated graph (NO) vs. a triangulated graph (YES) over A, ..., H]
Step 1(c) - Identify Cliques
• A clique is a subgraph of an undirected graph that is complete and maximal
[Figure: the triangulated graph over A, ..., H with cliques ABD, ADE, ACE, DEF, CEG, EGH]
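For graphs as small as these examples, the maximal cliques can be found by brute force (real toolkits use Bron–Kerbosch); this sketch assumes the same edge representation as above:

```python
from itertools import combinations

def maximal_cliques(nodes, edges):
    """All complete-and-maximal subsets of nodes.
    edges: set of frozenset pairs. Exponential — for tiny graphs only."""
    def is_clique(s):
        return all(frozenset(p) in edges for p in combinations(s, 2))
    cliques = [set(s) for r in range(1, len(nodes) + 1)
               for s in combinations(nodes, r) if is_clique(s)]
    # Keep only cliques not strictly contained in a larger clique.
    return [c for c in cliques if not any(c < d for d in cliques)]

edges = {frozenset(e) for e in [('a', 'b'), ('b', 'c'), ('a', 'c'), ('c', 'd')]}
print(maximal_cliques(['a', 'b', 'c', 'd'], edges))
```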
Properties of Junction Tree
• Each node is a cluster (nonempty set) of variables
• Running intersection property:
  – Given two clusters X and Y, all clusters on the path between X and Y contain X ∩ Y
• Separator sets (sepsets):
  – The intersection of two adjacent clusters
  – Example: clusters ABD and ADE have sepset AD; clusters ADE and DEF have sepset DE
Step 1(d) - Build Junction Tree (1/2)
• A junction tree is a clique graph that
  – is an undirected tree
  – contains all the cliques
  – satisfies the running intersection property
[Figure: cliques ABD, ADE, ACE, CEG, DEF, EGH linked through sepsets AD, AE, CE, DE, EG]
Step 1(d) - Build Junction Tree (2/2)
• In the junction tree, cliques become vertices and sepsets label the edges
• Example: ceg ∩ egh = eg
[Figure: junction tree with clique vertices abd, ade, ace, ceg, def, egh and sepsets ad, ae, ce, de, eg]
Step 2: Initialization
DAG → (Step 1) → Junction Tree → 2(a) Set up potentials → Inconsistent Junction Tree → 2(b) Propagate potentials → Consistent Junction Tree → Marginalization → P(V = v | E = e)
Potentials
DEFINITION: A potential φ_A over a set of variables X_A is a function that maps each instantiation x_A into a non-negative real number. We denote the number that φ_A maps x_A to by φ_A(x_A).
Example: a potential φ_abc over the set of vertices {a, b, c}, where X_a has four states and X_b and X_c each have three states.
A joint probability is a special case of a potential, in which Σ_{x_A} φ_A(x_A) = 1.
Decomposable Distribution
DEFINITION: A probability distribution is said to be decomposable with respect to a graph G = (V, E) if G is triangulated and, for any clusters A and B with separator C,
X_A ⊥ X_B | X_C
Steps 1 & 2 of the junction tree algorithm guarantee this property!
Factorization of Potentials
THEOREM: Given a decomposable probability distribution P(X_V) on the graph G = (V, E), it can be written as the product of all clique potentials divided by the product of all sepset potentials:
P(X_V) = Π_{all cliques C} φ_C / Π_{all sepsets S} φ_S
Step 2(a) – Set Up Potentials
1. For each cluster C and sepset S: φ_C ← 1, φ_S ← 1
2. For each vertex u in the BN, select a parent cluster C s.t. C ⊇ fa(u) = {u} ∪ pa(u). Include the conditional probability P(X_u | X_pa(u)) into φ_C:
φ_C ← φ_C · P(X_u | X_pa(u))
"PROOF:"
Π_{all cliques C} φ_C / Π_{all sepsets S} φ_S = Π_{all vertices u} P(X_u | X_pa(u)) / 1 = P(X_V)
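Step 2(a) can be sketched numerically. The chain A → B → C and its CPTs below are hypothetical (not from the slides); its triangulated graph has cliques {A,B}, {B,C} and sepset {B}:

```python
import numpy as np

# Hypothetical CPTs for a binary chain A -> B -> C.
P_A  = np.array([0.6, 0.4])                   # P(A)
P_BA = np.array([[0.9, 0.1], [0.2, 0.8]])     # P(B|A), rows indexed by A
P_CB = np.array([[0.7, 0.3], [0.5, 0.5]])     # P(C|B), rows indexed by B

# 1. Initialize every clique and sepset potential to 1.
phi_AB = np.ones((2, 2))
phi_BC = np.ones((2, 2))
phi_B  = np.ones(2)

# 2. Multiply each CPD into one clique containing its family.
phi_AB *= P_A[:, None] * P_BA    # families {A} and {A,B} fit in clique {A,B}
phi_BC *= P_CB                   # family {B,C} fits in clique {B,C}

# Product of clique potentials over the sepset potential is the joint:
joint = phi_AB[:, :, None] * phi_BC[None, :, :] / phi_B[None, :, None]
assert np.isclose(joint.sum(), 1.0)
```

This mirrors the "PROOF" line: after step 2(a), the clique/sepset quotient equals the product of all CPDs, i.e. the joint.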
Before propagation, the potentials in the junction tree are not consistent with each other, i.e., if we use marginalization to get the probability distribution of a variable X_u, we get different results depending on which clique we use.
(Figure: JT with cliques abd, ade, ace, ceg, egh, def and sepsets ad, ae, ce, de, eg.)
P(X_a) = Σ_{d,e} φ_ade = (0.02, 0.43, 0.31, 0.12)
P(X_a) = Σ_{c,e} φ_ace = (0.12, 0.33, 0.11, 0.03)
The potentials might not even sum to one, i.e., they are not joint probability distributions.
Step 2(b) - Propagate Potentials
Message passing from clique A to clique B:
1. Projection: project the potential of A onto the sepset S_AB
2. Absorption: absorb the potential of S_AB into B
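In code, one projection/absorption step is a marginalization followed by a broadcasted multiply with the new/old sepset ratio (the Hugin-style update). The binary potentials below are made up for illustration:

```python
import numpy as np

# Message pass from clique A = {a,b} to clique B = {b,c} through sepset S = {b}.
phi_A = np.array([[0.3, 0.1], [0.2, 0.4]])   # potential over (a, b)
phi_S = np.ones(2)                            # sepset potential over b
phi_B = np.array([[0.5, 0.5], [0.6, 0.4]])   # potential over (b, c)

# 1. Projection: marginalize A's potential onto the sepset.
phi_S_old = phi_S
phi_S = phi_A.sum(axis=0)                     # sum out a, leaving b

# 2. Absorption: B multiplies in the ratio of new to old sepset potential.
phi_B = phi_B * (phi_S / phi_S_old)[:, None]  # broadcast over c
```

After this single pass, B agrees with A on the sepset variable b; global propagation repeats this along every edge in both directions.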
Global Propagation
1. COLLECT-EVIDENCE: messages 1-5
2. DISTRIBUTE-EVIDENCE: messages 6-10
(Both methods are recursive.)
(Figure: messages 1-10 over the JT with cliques abd, ade, ace, ceg, egh, def and sepsets ad, ae, ce, de, eg; propagation starts at clique abd.)
A Priori Distribution
After global propagation, the potentials are consistent, and marginalization gives the probability distributions of the variables.
Short Summary
(Figure: the DAG over vertices a..h is transformed into the JT with cliques abd, ade, ace, ceg, egh, def and sepsets ad, ae, ce, de, eg, and messages 1-10 are propagated.)
1. For each cluster C and sepset S: φ_C ← 1, φ_S ← 1
2. For each vertex u in the BN, select a parent cluster C s.t. C ⊇ fa(u). Include the conditional probability P(X_u | X_pa(u)) into φ_C:
φ_C ← φ_C · P(X_u | X_pa(u))
Junction Tree Algorithm
Step 1: Transformation of graph
  (a) Moralize
  (b) Triangulate
  (c) Identify cliques
  (d) Build junction tree
Step 2: Initialization (of values)
  (a) Set up potentials
  (b) Propagate potentials
Step 3: Update beliefs
  (a) Insert evidence into the junction tree
  (b) Propagate potentials
Steps 1 and 2 are performed only once.
Step 3(a) - Insert Evidence into JT
• Evidence is new information about a r.v. that changes our belief about its distribution.
  Ex. Before receiving evidence: P(X_u) = (0.14, 0.43, 0.31, 0.12)
• Hard evidence: the r.v. is instantiated (observed).
  X_u = x_u ⇒ P(X_u) := (0, 0, 1, 0)
• Soft evidence: everything else.
  X_u < x_1 ⇒ P(X_u) := (0.5, 0.5, 0, 0)
Hard Evidence as a Likelihood
If we observe the variable X_u to be x_u, the likelihood function becomes:
Λ_Xu(x_u) = 1 if x_u is the observed value, 0 otherwise
Ex: X_u ∈ {0,1,2,3} and we observe X_u = 2 ⇒ Λ_Xu = (0, 0, 1, 0)
For all unobserved variables X_v we make the likelihood function constant:
Λ_Xv(x_v) = 1/n for all x_v, where n is the number of states of X_v
Ex: X_v ∈ {0,1,2} unobserved ⇒ Λ_Xv = (0.33, 0.33, 0.33)
Modify initialization step 2(a) to include this!
Entering Observations
1. For each observation X_u = x_u:
  (a) Encode the observation as a likelihood Λ_Xu
  (b) Identify one clique C that contains u and update φ_C as: φ_C ← φ_C · Λ_Xu
Step 3(b): Propagate potentials
To make the potentials in the junction tree consistent, perform a global update.
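Steps (a)-(b) above can be sketched in a few lines; the variable sizes and the clique chosen here are hypothetical:

```python
import numpy as np

# (a) Encode observations as likelihood vectors.
lam_u = np.array([0.0, 0.0, 1.0, 0.0])  # hard evidence: 4-state Xu observed to be 2
lam_v = np.full(3, 1/3)                 # unobserved 3-state Xv: constant likelihood

# (b) Multiply the likelihood into one clique potential containing u.
phi = np.ones((4, 5))                   # hypothetical clique over (u, w); Xw has 5 states
phi = phi * lam_u[:, None]              # rows with Xu != 2 become zero
```

A subsequent global propagation then spreads the zeroed rows through the whole tree, so every clique becomes consistent with the evidence.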
Short Summary
(Figure: on the DAG over a..h, 1. observe X_d = x_d and X_g = x_g; then, in steps 2-4, the observations are entered into the junction tree as likelihoods and propagated.)
5.2 Example: Create Junction Tree
HMM with 2 time steps: X1 → X2, X1 → Y1, X2 → Y2
Junction Tree: (X1,Y1) --X1-- (X1,X2) --X2-- (X2,Y2)
Initialization

Variable | Associated cluster | Potential function
X1       | X1,Y1              | φ_{X1,Y1} ← P(X1)
Y1       | X1,Y1              | φ_{X1,Y1} ← P(X1) P(Y1|X1)
X2       | X1,X2              | φ_{X1,X2} ← P(X2|X1)
Y2       | X2,Y2              | φ_{X2,Y2} ← P(Y2|X2)

Junction Tree: (X1,Y1) --X1-- (X1,X2) --X2-- (X2,Y2)
Collect Evidence (1/2)
• Choose an arbitrary root clique, e.g. (X1,X2), where all potential functions will be collected.
• Recursively ask neighboring cliques for messages:
1. Call (X1,Y1):
  – 1. Projection: φ_{X1} ← Σ_{Y1} φ_{X1,Y1} = Σ_{Y1} P(X1,Y1) = P(X1)
  – 2. Absorption: φ_{X1,X2} ← φ_{X1,X2} · φ_{X1}/φ_{X1}^old = P(X2|X1) P(X1) = P(X1,X2)
Collect Evidence (2/2)
2. Call (X2,Y2):
  – 1. Projection: φ_{X2} ← Σ_{Y2} φ_{X2,Y2} = Σ_{Y2} P(Y2|X2) = 1
  – 2. Absorption: φ_{X1,X2} ← φ_{X1,X2} · φ_{X2}/φ_{X2}^old = P(X1,X2)
Junction Tree: (X1,Y1) --X1-- (X1,X2) --X2-- (X2,Y2)
Distribute Evidence (1/2)
• Pass messages recursively to neighboring nodes.
• Pass message from (X1,X2) to (X1,Y1):
  – 1. Projection: φ_{X1} ← Σ_{X2} φ_{X1,X2} = Σ_{X2} P(X1,X2) = P(X1)
  – 2. Absorption: φ_{X1,Y1} ← φ_{X1,Y1} · φ_{X1}/φ_{X1}^old = P(X1,Y1) · P(X1)/P(X1) = P(X1,Y1)
Distribute Evidence (2/2)
• Pass message from (X1,X2) to (X2,Y2):
  – 1. Projection: φ_{X2} ← Σ_{X1} φ_{X1,X2} = Σ_{X1} P(X1,X2) = P(X2)
  – 2. Absorption: φ_{X2,Y2} ← φ_{X2,Y2} · φ_{X2}/φ_{X2}^old = P(Y2|X2) · P(X2)/1 = P(X2,Y2)
Junction Tree: (X1,Y1) --X1-- (X1,X2) --X2-- (X2,Y2)
Inference with Evidence (1/2)
• Assume we want to compute P(X2 | Y1=0, Y2=1) (state estimation).
• Assign likelihoods to the potential functions during initialization:
  φ_{X1,Y1} = P(X1, Y1=0) if Y1 = 0; 0 if Y1 = 1
  φ_{X2,Y2} = P(Y2=1 | X2) if Y2 = 1; 0 if Y2 = 0
Inference with Evidence (2/2)
• Repeating the same steps as in the previous case, we obtain:
  φ_{X1,Y1} = P(X1, Y1=0, Y2=1) if Y1 = 0; 0 if Y1 = 1
  φ_{X1} = P(X1, Y1=0, Y2=1)
  φ_{X1,X2} = P(X1, Y1=0, X2, Y2=1)
  φ_{X2} = P(Y1=0, X2, Y2=1)
  φ_{X2,Y2} = P(Y1=0, X2, Y2=1) if Y2 = 1; 0 if Y2 = 0
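The whole worked example can be run end to end. The sketch below uses invented binary CPTs (the slides give no numbers), enters the evidence Y1=0, Y2=1 as likelihoods, collects to the root (X1,X2), and checks the result against brute-force enumeration:

```python
import numpy as np

# Hypothetical CPTs for the 2-step HMM.
P_X1   = np.array([0.5, 0.5])
P_X2X1 = np.array([[0.8, 0.2], [0.3, 0.7]])  # P(X2|X1), rows indexed by X1
P_YX   = np.array([[0.9, 0.1], [0.2, 0.8]])  # P(Y|X),  rows indexed by X

# Initialization with evidence Y1=0, Y2=1 multiplied in as likelihoods:
phi_X1Y1 = P_X1[:, None] * P_YX * np.array([1.0, 0.0])  # zero out Y1=1
phi_X1X2 = P_X2X1.copy()
phi_X2Y2 = P_YX * np.array([0.0, 1.0])                  # zero out Y2=0

# Collect to root (X1,X2): project each leaf onto its sepset, then absorb.
msg_X1 = phi_X1Y1.sum(axis=1)     # sepset X1: P(X1, Y1=0)
msg_X2 = phi_X2Y2.sum(axis=1)     # sepset X2: P(Y2=1 | X2)
phi_X1X2 *= msg_X1[:, None]
phi_X1X2 *= msg_X2[None, :]

# Root now holds P(X1, X2, Y1=0, Y2=1); normalize its X2 marginal.
post_X2 = phi_X1X2.sum(axis=0)
post_X2 /= post_X2.sum()

# Sanity check against brute-force enumeration of the full joint.
joint = (P_X1[:, None, None, None] * P_YX[:, None, :, None] *
         P_X2X1[:, :, None, None] * P_YX[None, :, None, :])
brute = joint[:, :, 0, 1].sum(axis=0)
assert np.allclose(post_X2, brute / brute.sum())
```

Note that with evidence the root potential is an unnormalized joint over X and the evidence, so the final normalization implements the division by P(Y1=0, Y2=1).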
An Example
• To perform exact inference in an arbitrary graph, convert it to a junction tree and then perform belief propagation.
• A jtree is a tree whose nodes are sets and which has the jtree property: all sets containing any given variable form a connected subgraph (a variable cannot appear in 2 disjoint places).
(Figure: the network C → S, C → R, S → W, R → W is moralized and made into the jtree CSR --SR-- SRW.)
Maximal cliques = { {C,S,R}, {S,R,W} }
Separators = { {C,S,R} ∩ {S,R,W} = {S,R} }
Making a Junction Tree
(Figure: a DAG G over A..F is moralized into G^M, triangulated into G^T using elimination order f,d,e,c,b,a, and the maximal cliques {a,b,c}, {b,c,e}, {b,e,f}, {b,d} are found. The junction graph connects overlapping cliques with edge weights W_ij = |C_i ∩ C_j|; its maximum spanning tree is the jtree. [Jensen94])
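The clique-to-jtree step can be sketched in plain Python. The cliques below are the ones on the slide; Kruskal's algorithm over descending weights stands in for the max-spanning-tree step:

```python
from itertools import combinations

# Maximal cliques of the triangulated graph (from the slide).
cliques = [frozenset('abc'), frozenset('bce'), frozenset('bef'), frozenset('bd')]

# Candidate junction-graph edges, weighted by W_ij = |Ci ∩ Cj|.
edges = [(len(ci & cj), ci, cj)
         for ci, cj in combinations(cliques, 2) if ci & cj]

# Kruskal on descending weights with union-find yields the max spanning tree.
parent = {c: c for c in cliques}
def find(c):
    while parent[c] != c:
        c = parent[c]
    return c

jtree = []
for w, ci, cj in sorted(edges, key=lambda e: -e[0]):
    ri, rj = find(ci), find(cj)
    if ri != rj:
        parent[ri] = rj
        jtree.append((ci, cj, ci & cj))   # third element is the sepset
```

For a triangulated graph, any maximum-weight spanning tree of the junction graph satisfies the jtree (running-intersection) property.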
Clique Potentials
(Figure: the network C, S, R, W and its jtree CSR --SR-- SRW drawn as a factor graph; square nodes are factors.)
Each model clique potential gets assigned to one jtree clique potential.
Each observed variable assigns a delta function to one jtree clique potential: if we observe W = w*, set E(w) = δ(w, w*); else E(w) = 1.
Separator Potentials
(Figure: jtree CSR --SR-- SRW drawn as a factor graph; square nodes are factors.)
Separator potentials enforce consistency between neighboring cliques on common variables.
BP on a Jtree
• A jtree is an MRF with pairwise potentials.
• Each (clique) node potential contains CPDs and local evidence.
• Each edge potential acts like a projection function.
• We do a forwards (collect) pass, then a backwards (distribute) pass.
• The result is the Hugin / Shafer-Shenoy algorithm.
(Figure: messages 1-4 over the jtree CSR --SR-- SRW.)
BP on a Jtree (collect)
(Jtree: CSR --SR-- SRW)
Initial clique potentials contain CPDs and evidence.
BP on a Jtree (collect)
(Jtree: CSR --SR-- SRW)
The message from clique to separator marginalizes the belief (projects it onto the intersection) [remove c].
BP on a Jtree (collect)
(Jtree: CSR --SR-- SRW)
Separator potentials get the marginal belief from their parent clique.
BP on a Jtree (collect)
(Jtree: CSR --SR-- SRW)
The message from separator to clique expands the marginal [add w].
BP on a Jtree (collect)
(Jtree: CSR --SR-- SRW)
The root clique has now seen all the evidence.
BP on a Jtree (distribute)
(Jtree: CSR --SR-- SRW, shown before and after the distribute pass.)
BP on a Jtree (distribute)
(Jtree: CSR --SR-- SRW)
Marginalize out w and exclude the old evidence (ec, er).
BP on a Jtree (distribute)
(Jtree: CSR --SR-- SRW)
Combine upstream and downstream evidence.
BP on a Jtree (distribute)
(Jtree: CSR --SR-- SRW)
Add c and exclude the old evidence (ec, er).
BP on a Jtree (distribute)
(Jtree: CSR --SR-- SRW)
Combine upstream and downstream evidence.
Partial Beliefs
(Jtree: CSR --SR-- SRW; evidence on R is now added here.)
• The "beliefs"/messages at intermediate stages (before finishing both passes) may not be meaningful, because a given clique may not have "seen" all the model potentials/evidence (and hence may not be normalizable).
• This can cause problems when messages may fail (e.g. in sensor nets).
• One must reparameterize using the decomposable model to ensure meaningful partial beliefs. [Paskin04]
6. Summary
• Variable elimination
  – Good concept of sum-product computation
  – Not good for computing the beliefs of many nodes
• Belief propagation
  – Good for computing the beliefs of many nodes, on poly-trees
  – Not good for general graphs
• Junction tree
  – Good for computing the beliefs of many nodes, on general graphs
Three Methods Are Closely Related
• Variable elimination provides the basic ideas of BP and the junction tree
  – Belief/message propagation/passing ↔ factor elimination
  – Clustering ↔ factor elimination
• Junction tree is the converging algorithm
• Message passing provides a unified formula
7. Implementation

Algorithm            | PNL                  | GeNIe
Enumeration          | v (Naïve)            |
Variable Elimination |                      |
Belief Propagation   | v (Pearl)            | v (Polytree)
Junction Tree        | v                    | v (Clustering)
Direct Sampling      |                      | v (Logic)
Likelihood Sampling  | v (LWSampling)       | v (Likelihood sampling)
MCMC Sampling        | v (GibbsWithAnneal)  | (other 5 samplings)
8. References
• S. M. Aji and R. J. McEliece, "The generalized distributive law," IEEE Trans. on Information Theory, vol. 46, no. 2, 2000.
• F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Trans. on Information Theory, vol. 47, no. 2, 2001.
Recent Books
• R. E. Neapolitan, Learning Bayesian Networks, Prentice Hall, 2004.
• C. Borgelt and R. Kruse, Graphical Models: Methods for Data Analysis and Mining, Wiley, 2002.
• D. Edwards, Introduction to Graphical Modelling, 2nd ed., Springer, 2000.
• S. L. Lauritzen, Graphical Models, Oxford, 1996.
• M. I. Jordan (ed.), Learning in Graphical Models, MIT Press, 2001.
Probabilistic Inference Using Bayesian Networks
• Introductory article: M. Henrion, "An introduction to algorithms for inference in belief nets," in M. Henrion, R. Shachter, L. Kanal, and J. Lemmer (eds.), Uncertainty in Artificial Intelligence 5, Amsterdam: North Holland, 1990.
• Textbook with the HUGIN system: F. Jensen, An Introduction to Bayesian Networks, New York: Springer-Verlag, 1996.
• R. Neal, "Connectionist learning of belief networks," Artificial Intelligence, 56:71-113, 1992.
General Probabilistic Inference
• J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.
• E. Castillo, J. M. Gutierrez, and A. S. Hadi, Expert Systems and Probabilistic Network Models, Springer, 1997.
• R. Neapolitan, Probabilistic Reasoning in Expert Systems: Theory and Algorithms, New York: John Wiley & Sons, 1990.
• A special issue on "Uncertainty in AI" of the Communications of the ACM, vol. 38, no. 3, March 1995.
• G. Shafer and J. Pearl (eds.), Readings in Uncertain Reasoning, San Francisco: Morgan Kaufmann, 1990.