
Bayesian Networks

Unit 6 Exact Inference in Bayesian Networks

Fu Jen University Department of Electrical Engineering Wang, Yuan-Kai Copyright

Wang, Yuan-Kai (王元凱), ykwang@mails.fju.edu.tw

http://www.ykwang.tw

Department of Electrical Engineering, Fu Jen Univ. (輔仁大學電機工程系)

2006~2011

Reference this document as: Wang, Yuan-Kai, “Exact Inference in Bayesian Networks,"

Lecture Notes of Wang, Yuan-Kai, Fu Jen University, Taiwan, 2011.


Bayesian Networks Unit - Exact Inference in BN p. 2

Goal of This Unit
• Learn to efficiently compute the sum-product of the inference formula
  P(X|E=e) = α Σ_{h∈H} Π_{i=1~n} P(Xi | Pa(Xi))
  – Remember: enumeration and multiplication of all P(Xi|Pa(Xi)) are not efficient
  – We will learn three other methods for exact inference

Related Units
• Background
  – Probabilistic graphical model
• Next units
  – Approximate inference algorithms
  – Probabilistic inference over time

Self-Study References
• Chapter 14, Artificial Intelligence: A Modern Approach, 2nd ed., by S. Russell & P. Norvig, Prentice Hall, 2003.
• "The generalized distributive law," S. M. Aji and R. J. McEliece, IEEE Trans. on Information Theory, vol. 46, no. 2, 2000.
• "Inference in Bayesian networks," B. D'Ambrosio, AI Magazine, 1999.
• "Probabilistic inference in graphical models," M. I. Jordan & Y. Weiss.

Structure of Related Lecture Notes

Problem → PGM Representation
  – Unit 5: BN; Unit 9: Hybrid BN
  – Units 10~15: Naïve Bayes, MRF, HMM, DBN, Kalman filter
Query → Inference
  – Unit 6: Exact inference; Unit 7: Approximate inference; Unit 8: Temporal inference
Data → Learning
  – Structure learning; Parameter learning
  – Units 16~: MLE, EM

[Figure: example BN with nodes B, E, A, J, M and factors P(B), P(E), P(A|B,E), P(J|A), P(M|A)]

Contents

1. Basics of Graph ................................................ 11
2. Sum-Product and Generalized Distributive Law ................... 20
3. Variable Elimination ........................................... 29
4. Belief Propagation ............................................. 96
5. Junction Tree ................................................. 157
6. Summary ....................................................... 212
7. Implementation ................................................ 214
8. Reference ..................................................... 215


Four Steps of Inference P(X|e)
• Step 1: Bayes' theorem
  P(X|E=e) = P(X, E=e) / P(E=e) = α P(X, E=e)
• Step 2: Marginalization
  = α Σ_{h∈H} P(X, E=e, H=h)
• Step 3: Conditional independence
  = α Σ_{h∈H} Π_{i=1~n} P(Xi | Pa(Xi))
• Step 4: Sum-product computation
  – Exact inference
  – Approximate inference

Five Types of Queries in Inference
• For a probabilistic graphical model G
• Given a set of evidence E=e
• Query the PGM with
  – P(e): likelihood query
  – arg max P(e): maximum likelihood query
  – P(X|e): posterior belief query
  – arg max_x P(X=x|e) (single query variable): maximum a posteriori (MAP) query
  – arg max_{x1...xk} P(X1=x1, ..., Xk=xk|e): most probable explanation (MPE) query

Brute Force Enumeration
• We can compute
  P(X|e) = α Σ_{h∈H} Π_{i=1~n} P(Xi | Pa(Xi))
  in O(K^N) time, where K = |Xi|
• By using a BN, we can represent the joint distribution in O(N) space

[Figure: burglary network with nodes B, E, A, J, M]
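As an illustration of this O(K^N) enumeration (a sketch I've added, not part of the original slides), the posterior P(B | j, m) for the burglary network can be computed by summing the full joint. The CPT values are taken from the tables later in this unit; the variable names and dictionary layout are my own.

```python
from itertools import product

# CPTs of the burglary network (values from the tables in this unit;
# note this slide set uses P(A|B=T,E=F) = 0.95).
P_B = {True: 0.001, False: 0.999}          # P(B)
P_E = {True: 0.002, False: 0.998}          # P(E)
P_A = {(True, True): 0.95, (True, False): 0.95,
       (False, True): 0.29, (False, False): 0.001}   # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}            # P(J=T | A)
P_M = {True: 0.70, False: 0.01}            # P(M=T | A)

def joint(b, e, a, j, m):
    """P(b,e,a,j,m) = P(b) P(e) P(a|b,e) P(j|a) P(m|a)."""
    pa = P_A[(b, e)] if a else 1.0 - P_A[(b, e)]
    pj = P_J[a] if j else 1.0 - P_J[a]
    pm = P_M[a] if m else 1.0 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

def posterior_b(j=True, m=True):
    """P(B | J=j, M=m) by brute-force enumeration of the hidden vars."""
    num = {b: sum(joint(b, e, a, j, m)
                  for e, a in product([True, False], repeat=2))
           for b in (True, False)}
    z = num[True] + num[False]
    return {b: p / z for b, p in num.items()}

post = posterior_b()
```

Every additional hidden variable doubles the number of joint terms, which is exactly why the rest of this unit looks for something better.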

Expression Tree of Enumeration: Repeated Computations
• P(b|j,m) = α Σ_E Σ_A P(b) P(E) P(A|b,E) P(j|A) P(m|A)

[Figure: evaluation tree of the nested sums over E and A; the subexpressions P(j|a)P(m|a) and P(e)P(a|b,e) appear once per branch, so the same products are recomputed in every branch]

1. Basics of Graph
• Polytree
• Multiply connected networks
• Clique
• Markov network
• Chordal graph
• Induced width

Two Kinds of PGMs
• There are two kinds of probabilistic graphical models (PGMs)
  – Singly connected network
    • Polytree
  – Multiply connected network

Singly Connected Networks (Polytree)
• Any two nodes are connected by at most one undirected path
• Theorem
  – Inference in a polytree is linear in the node size of the network
  – This assumes a tabular CPT representation

[Figures: the burglary network (Burglary, Earthquake, Alarm, John Calls, Mary Calls) and a polytree over nodes A–H]

Multiply Connected Networks
• At least two nodes are connected by more than one undirected path

[Figure: the sprinkler network — Cloudy, Sprinkler, Rain, WetGrass]

Clique (1/2)
• A clique is a subgraph of an undirected graph that is complete and maximal
  – Complete:
    • Fully connected
    • Every node connects to every other node
  – Maximal:
    • Not contained in any larger complete subgraph

Clique (2/2)
• Identify cliques

[Figure: undirected graph over nodes A–H]

• Cliques: ABD, ADE, ACE, DEF, CEG, EGH
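The clique identification above can be checked by brute force. The edge set below is reconstructed from the listed cliques (the original figure is not fully recoverable, so treat the edges as an assumption):

```python
from itertools import combinations

nodes = "ABCDEFGH"
# Edges implied by the cliques ABD, ADE, ACE, DEF, CEG, EGH (assumed).
edges = {frozenset(e) for e in
         ["AB", "AD", "BD", "AE", "DE", "AC", "CE",
          "DF", "EF", "CG", "EG", "EH", "GH"]}

def complete(sub):
    """True if every pair of nodes in sub is connected."""
    return all(frozenset(p) in edges for p in combinations(sub, 2))

def maximal_cliques():
    cliques = []
    for r in range(2, len(nodes) + 1):
        for sub in combinations(nodes, r):
            # complete and maximal: no outside node extends the subgraph
            if complete(sub) and not any(
                    complete(sub + (v,)) for v in nodes if v not in sub):
                cliques.append("".join(sub))
    return sorted(cliques)

print(maximal_cliques())   # ['ABD', 'ACE', 'ADE', 'CEG', 'DEF', 'EGH']
```

Brute force over all 2^8 subsets is fine here; real toolkits use Bron–Kerbosch-style enumeration instead.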

Markov Network (1/2)
• An undirected graph with
  – Hyper-nodes (multi-vertex nodes)
  – Hyper-edges (multi-vertex edges)

[Figure: hyper-nodes ABD, ADE, ACE, DEF, CEG, EGH connected by hyper-edges]

Markov Network (2/2)
• Every hyper-edge e=(x1...xk) has a potential function fe(x1...xk)
• The probability distribution is
  P(X1,...,Xn) = (1/Z) Π_{e∈E} fe(x1,...,xk)
  Z = Σ_{x1} ... Σ_{xn} Π_{e∈E} fe(x1,...,xk)
• Example with hyper-nodes EGH and CEG:
  P(EGH, CEG) = (1/Z) Π_{e∈E} fe(E,G,H,C)
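A toy numeric version of this distribution shows how Z normalizes the product of hyper-edge potentials. The potential tables here are made up for illustration, since the slide does not give them:

```python
from itertools import product

# Binary variables C, E, G, H with the two hyper-edges from the slide.
def f_egh(e, g, h):            # arbitrary positive potential (assumed)
    return 1.0 + e + 2 * g + h

def f_ceg(c, e, g):            # arbitrary positive potential (assumed)
    return 1.0 + 2 * c + e + g

# Partition function: sum the potential product over all assignments.
Z = sum(f_egh(e, g, h) * f_ceg(c, e, g)
        for c, e, g, h in product([0, 1], repeat=4))

def prob(c, e, g, h):
    """P(c,e,g,h) = (1/Z) * f_EGH(e,g,h) * f_CEG(c,e,g)."""
    return f_egh(e, g, h) * f_ceg(c, e, g) / Z

total = sum(prob(*v) for v in product([0, 1], repeat=4))
```

Dividing by Z is what turns an arbitrary positive product into a probability distribution; `total` comes out as 1 by construction.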

Chordal Graphs
• Elimination ordering ⇒ undirected chordal graph
• Maximal cliques are factors in elimination
• Factors in elimination are cliques in the graph
• Complexity is exponential in the size of the largest clique in the graph

[Figure: the Asia network (V, S, T, L, A, B, X, D) and its chordal graph]

2. Sum-Product and Generalized Distributive Law

P(X|E=e) = α Σ_{h∈H} Π_{i=1~n} P(Xi | Pa(Xi))

We obtain this formula from two rules of probability theory:
  Sum rule: P(x) = Σ_y P(x, y)
  Product rule: P(x, y) = P(x|y) P(y)

The Sum-Product with Generalized Distributive Law

P(X|E=e) = α Σ_{h∈H} Π_{i=1~n} P(Xi | Pa(Xi))
= Σ_{X1} ... Σ_{Xk} Π_{i=1~k} P(Xi | Pa(Xi))
= Σ_{X1} ... Σ_{Xk} P(X1 | Pa(X1)) ... P(Xk | Pa(Xk))
= Σ_{X1} P(X1 | Pa(X1)) ... Σ_{Xk} P(Xk | Pa(Xk))
(by the distributive law, each sum is pushed inside past the factors that do not mention its variable)

Distributive Law for Sum-Product (1/3)
• Σ_i Σ_j P(xi) P(xj) = (Σ_i P(xi)) (Σ_j P(xj))
  – Σ_i a·xi = a·Σ_i xi, since a·x1 + a·x2 = a(x1 + x2)
  – Σ_i Σ_j xi·xj = (Σ_i xi)(Σ_j xj)
  – Each bracketed sum becomes a factor: f1(xh)·f2(xk)
• Σ_i P(xh|xi) P(xi) = P(xh)
  – Variable i is eliminated
• Σ_i Σ_j P(xi|xh) P(xj|xk) = (Σ_i P(xi|xh)) (Σ_j P(xj|xk))

Distributive Law for Sum-Product (2/3)
• Σ_i Σ_j P(xi|xh) P(xj|xk) = (Σ_i P(xi|xh)) (Σ_j P(xj|xk))
• Example:
  P(b|j,m) ∝ Σ_e Σ_a P(b) P(e) P(a|b,e) P(j|a) P(m|a)
  = P(b) Σ_e P(e) Σ_a P(a|b,e) P(j|a) P(m|a)
• Σ_i Σ_j P(xi|xk) P(xj|xi) = Σ_i P(xi|xk) Σ_j P(xj|xi)
  = Σ_i P(xi|xk) f(xi) = f(xk)


Distributive Law for Sum-Product (3/3)

ab + ac = a(b+c)
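A quick numeric check of ab + ac = a(b + c) and of its nested-sum form, with illustrative values of my own:

```python
# One factor pulled out of a sum: a*b1 + a*b2 + ... = a*(b1 + b2 + ...)
a, bs = 3.0, [1.0, 2.0, 5.0]
assert sum(a * b for b in bs) == a * sum(bs)

# Double sum factorizes into a product of single sums:
# sum_i sum_j x_i * y_j = (sum_i x_i) * (sum_j y_j)
xs, ys = [0.2, 0.8], [0.1, 0.4, 0.5]
double = sum(x * y for x in xs for y in ys)    # |X|*|Y| multiplications
factored = sum(xs) * sum(ys)                   # one multiplication
assert abs(double - factored) < 1e-12
```

The saving generalizes: the left form costs |X|·|Y| products, the right form |X|+|Y| additions plus one product, which is exactly the saving variable elimination exploits.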

Distributive Law for Max-Product
• max_i max_j P(xi) P(xj) = (max_i P(xi)) (max_j P(xj))
  – max_i a·xi = a·max_i xi, since max(a·x1, a·x2) = a·max(x1, x2)
  – max_i max_j xi·xj = (max_i xi)(max_j xj)
• The same law applies to arg max_i P(xi)
• max_i max_j P(xi|xh) P(xj|xk) = (max_i P(xi|xh)) (max_j P(xj|xk))

Generalized Distributive Law (1/2)
Aji and McEliece, 2000

Generalized Distributive Law (2/2)
Aji and McEliece, 2000
• Sum-product:
  – a + 0 = 0 + a = a
  – a * 1 = 1 * a = a
  – a*b + a*c = a*(b + c)
• Max-product:
  – max(a, 0) = max(0, a) = a
  – a * 1 = 1 * a = a
  – max(a*b, a*c) = a*max(b, c)
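The two semirings can share one elimination routine; swapping the "addition" operator switches a marginal query into a max query. A small sketch of my own:

```python
def eliminate(table, add):
    """Combine all entries of a unary factor with the semiring's add
    operator: sum(...) gives sum-product, max(...) gives max-product."""
    vals = list(table.values())
    out = vals[0]
    for v in vals[1:]:
        out = add(out, v)
    return out

factor = {0: 0.1, 1: 0.6, 2: 0.3}                  # an unnormalized belief
marginal = eliminate(factor, lambda x, y: x + y)   # sum-product: 1.0
best = eliminate(factor, max)                      # max-product: 0.6
```

This is the point of the generalized distributive law: any (add, multiply) pair satisfying the semiring axioms above supports the same factorized elimination.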

Marginal to MAP: Max Product
• Likelihood & posterior queries: sum-product
• Maximum likelihood & MAP queries: max-product

[Figure: factor graph over x1, ..., x5]

3. Variable Elimination
• Variable elimination improves the enumeration algorithm by
  – Eliminating repeated calculations
    • Carry out summations right-to-left (bottom-up in the evaluation tree)
    • Storing intermediate results (factors) to avoid re-computation
  – Dropping irrelevant variables

Basic Idea
• Write the query in the form
  P(Xn, e) = Σ_{xk} ... Σ_{x3} Σ_{x2} Π_i P(xi | pai)
• Iteratively
  – Move all irrelevant terms (constants) outside the innermost summation
    Σ_i ai·b·c = b·c·Σ_i ai
  – Perform the innermost sum, getting a new term: a factor
  – Insert the new term into the product
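The idea above can be sketched as a small variable-elimination routine. This is an illustrative implementation of my own, not the algorithm figure from these slides; factors are (scope, table) pairs over boolean variables:

```python
from itertools import product

def multiply(f1, f2):
    """Pointwise product of two factors over the union of their scopes."""
    s1, t1 = f1
    s2, t2 = f2
    scope = s1 + [v for v in s2 if v not in s1]
    table = {}
    for vals in product([True, False], repeat=len(scope)):
        asg = dict(zip(scope, vals))
        table[vals] = (t1[tuple(asg[v] for v in s1)] *
                       t2[tuple(asg[v] for v in s2)])
    return scope, table

def sum_out(factor, var):
    """Sum a variable out of a factor (the innermost summation)."""
    scope, table = factor
    i = scope.index(var)
    new_table = {}
    for vals, p in table.items():
        key = vals[:i] + vals[i + 1:]
        new_table[key] = new_table.get(key, 0.0) + p
    return scope[:i] + scope[i + 1:], new_table

def eliminate(factors, order):
    """Eliminate variables in the given order; return the final factor."""
    for var in order:
        related = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        prod = related[0]
        for f in related[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Sprinkler network of the next slides: P(W) by eliminating C, S, R.
T, F = True, False
P_C = (["C"], {(T,): 0.5, (F,): 0.5})
P_S = (["C", "S"], {(T, T): 0.1, (T, F): 0.9, (F, T): 0.5, (F, F): 0.5})
P_R = (["C", "R"], {(T, T): 0.8, (T, F): 0.2, (F, T): 0.2, (F, F): 0.8})
P_W = (["S", "R", "W"], {(T, T, T): 0.99, (T, T, F): 0.01,
                         (T, F, T): 0.90, (T, F, F): 0.10,
                         (F, T, T): 0.90, (F, T, F): 0.10,
                         (F, F, T): 0.00, (F, F, F): 1.00})

scope, table = eliminate([P_C, P_S, P_R, P_W], ["C", "S", "R"])
print(scope, table[(T,)])   # ['W'] ≈ 0.6471
```

Only the factors mentioning the current variable are multiplied before summing it out, which is exactly the "move constants outside the innermost sum" step of the slide.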

An Example without Evidence (1/2)

[Figure: sprinkler network — Cloudy, Sprinkler, Rain, WetGrass]

P(w) = Σ_{c,s,r} P(w|s,r) P(r|c) P(s|c) P(c)
     = Σ_{s,r} P(w|s,r) Σ_c P(r|c) P(s|c) P(c)
     = Σ_{s,r} P(w|s,r) f1(s,r)

Factor f1(s,r)

P(C) = 0.5

S R | P(W|S,R)
T T | 0.99
T F | 0.90
F T | 0.90
F F | 0.00

C | P(S|C)
T | 0.1
F | 0.5

C | P(R|C)
T | 0.8
F | 0.2

An Example without Evidence (2/2)

R S C | P(R|C) P(S|C) P(C)
T T T |
T T F |
T F T |
T F F |
F T T |
F T F |
F F T |
F F F |

R S | f1(R,S) = Σ_c P(R|C) P(S|C) P(C)
T T |
T F |
F T |
F F |

Factor f1(r,s)
A factor may be
• A function
• A value
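Factor f1 can be filled in numerically from the CPTs on the previous slide (a check I've added):

```python
# f1(r,s) = sum_c P(r|c) P(s|c) P(c), with the CPT values from the slide.
P_C_true = 0.5                  # P(C=T); P(C=F) = 0.5
P_S = {True: 0.1, False: 0.5}   # P(S=T | C)
P_R = {True: 0.8, False: 0.2}   # P(R=T | C)

def f1(r, s):
    total = 0.0
    for c in (True, False):
        pc = P_C_true if c else 1.0 - P_C_true
        pr = P_R[c] if r else 1.0 - P_R[c]
        ps = P_S[c] if s else 1.0 - P_S[c]
        total += pr * ps * pc
    return total

print(f1(True, True))   # 0.8*0.1*0.5 + 0.2*0.5*0.5 = 0.09
```

The four values of f1 sum to 1 here only because C has no other children; in general a factor is not a probability distribution.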


An Example with Evidence (1/2)

Factors

An Example with Evidence (2/2)
• fM(a) = <0.7, 0.1>
• fJ(a) = <0.9, 0.05>
• fA(a,b,e)
• fĀJM(b,e) = Σ_a fA(a,b,e) fJ(a) fM(a)

J M A B E | fM(a) | fJ(a) | fA(a,b,e) | fM(a)·fJ(a)·fA(a,b,e)
T T T T T |  0.7  |  0.9  |   0.95    | 0.7*0.9*0.95
T T T T F |  0.7  |  0.9  |   0.95    | 0.7*0.9*0.95
T T T F T |  0.7  |  0.9  |   0.29    | 0.7*0.9*0.29
T T T F F |  0.7  |  0.9  |   0.001   | 0.7*0.9*0.001
T T F T T |  0.1  |  0.05 |   0.05    | 0.1*0.05*0.05
T T F T F |  0.1  |  0.05 |   0.05    | 0.1*0.05*0.05
T T F F T |  0.1  |  0.05 |   0.71    | 0.1*0.05*0.71
T T F F F |  0.1  |  0.05 |   0.999   | 0.1*0.05*0.999

[Figure: burglary network — Burglary, Earthquake, Alarm, John Calls, Mary Calls]

P(B) = 0.001
P(E) = 0.002

B E | P(A|B,E)
T T | 0.95
T F | 0.95
F T | 0.29
F F | 0.001

A | P(J|A)
T | 0.90
F | 0.05

A | P(M|A)
T | 0.70
F | 0.01

Basic Operations
• Summing out a variable from a product of factors
  – Move any irrelevant terms (constants) outside the innermost summation
  – Add up submatrices in the pointwise product of the remaining factors


Variable Elimination Algorithm


Irrelevant Variables (1/2)
• Consider the query P(JohnCalls | Burglary = true)
  – P(J|b) = α P(b) Σ_e P(e) Σ_a P(a|b,e) P(J|a) Σ_m P(m|a)
  – The sum over m is identically 1: Σ_m P(m|a) = 1
  – M is irrelevant to the query

Irrelevant Variables (2/2)
• Theorem 1: for P(X|E),
  Y is irrelevant if Y ∉ Ancestors({X} ∪ E)
• In the example P(J|b)
  – X = JohnCalls, E = {Burglary}
  – Ancestors({X} ∪ E) = {Alarm, Earthquake}
  – so MaryCalls is irrelevant
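Theorem 1's pruning rule is easy to implement: keep only the query, the evidence, and their ancestors. A sketch of my own over the burglary network:

```python
# Parent lists for the burglary network.
parents = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}

def ancestors(nodes):
    """All ancestors of the given set of nodes."""
    seen, stack = set(), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def relevant(query, evidence):
    """Variables kept for P(query | evidence): {X} ∪ E ∪ Ancestors."""
    keep = {query, *evidence}
    return keep | ancestors(keep)

print(sorted(relevant("J", {"B"})))   # ['A', 'B', 'E', 'J'] — M is pruned
```

This is the simple ancestor test of the slide; a tighter prune would also apply d-separation given the evidence.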

Complexity
• Time and space cost of variable elimination are O(n·d^k)
  – n: no. of random variables
  – d: no. of discrete values
  – k: no. of parent nodes
• Polytrees: k is small → linear
  – If k=1, O(n·d)
• Multiply connected networks:
  – O(n·d^k), k can be large
  – Can reduce 3SAT to variable elimination
    • NP-hard
  – Equivalent to counting 3SAT models
    • #P-complete, i.e. strictly harder than NP-complete problems
k is critical for complexity

Pros and Cons
• Variable elimination is simple and efficient for a single query P(Xi | e)
• But it is less efficient if all the variables are queried: P(X1 | e), ..., P(Xk | e)
  – In a polytree network, one would need to issue O(n) queries costing O(n) each: O(n^2)
• The junction tree algorithm extends variable elimination to compute posterior probabilities for all nodes simultaneously

3.1 An Example
• The Asia network

[Figure: Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Abnormality in Chest, Bronchitis, X-Ray, Dyspnea]

[Figure: Asia network — V, S, T, L, A, B, X, D]

P(v,s,t,l,a,b,x,d) = P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

• We want to infer P(d)
• Need to eliminate: v, s, x, t, l, a, b

Initial factors: P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

"Brute force approach":
P(d) = Σ_v Σ_s Σ_t Σ_l Σ_a Σ_b Σ_x P(v,s,t,l,a,b,x,d)

Complexity is exponential: O(K^N)
• N: size of the graph, number of variables
• K: number of states for each variable

[Figure: Asia network — V, S, T, L, A, B, X, D]

• We want to infer P(d)
• Need to eliminate: v, s, x, t, l, a, b

Initial factors: P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

Eliminate: v
Compute: fv(t) = Σ_v P(v) P(t|v)
⇒ fv(t) P(s) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

Note: fv(t) = P(t). In general, the result of elimination is not necessarily a probability term.

t | fv(t)
T | 0.70
F | 0.01

[Figure: Asia network — V, S, T, L, A, B, X, D]

• We want to infer P(d)
• Need to eliminate: s, x, t, l, a, b

• Current factors: fv(t) P(s) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

Eliminate: s
Compute: fs(b,l) = Σ_s P(s) P(b|s) P(l|s)
⇒ fv(t) fs(b,l) P(a|t,l) P(x|a) P(d|a,b)

• Summing on s results in fs(b,l)
• A factor with two arguments
• The result of elimination may be a function of several variables

b l | fs(b,l)
T T | 0.95
T F | 0.95
F T | 0.29
F F | 0.001

[Figure: Asia network — V, S, T, L, A, B, X, D]

• We want to infer P(d)
• Need to eliminate: x, t, l, a, b

• Current factors: fv(t) fs(b,l) P(a|t,l) P(x|a) P(d|a,b)

Eliminate: x
Compute: fx(a) = Σ_x P(x|a)
⇒ fv(t) fs(b,l) fx(a) P(a|t,l) P(d|a,b)

Note: fx(a) = 1 for all values of a!

[Figure: Asia network — V, S, T, L, A, B, X, D]

• We want to infer P(d)
• Need to eliminate: t, l, a, b

• Current factors: fv(t) fs(b,l) fx(a) P(a|t,l) P(d|a,b)

Eliminate: t
Compute: ft(a,l) = Σ_t fv(t) P(a|t,l)
⇒ fs(b,l) fx(a) ft(a,l) P(d|a,b)

[Figure: Asia network — V, S, T, L, A, B, X, D]

• We want to infer P(d)
• Need to eliminate: l, a, b

• Current factors: fs(b,l) fx(a) ft(a,l) P(d|a,b)

Eliminate: l
Compute: fl(a,b) = Σ_l fs(b,l) ft(a,l)
⇒ fx(a) fl(a,b) P(d|a,b)

[Figure: Asia network — V, S, T, L, A, B, X, D]

• We want to infer P(d)
• Need to eliminate: a, b

• Current factors: fx(a) fl(a,b) P(d|a,b)

Eliminate: a, b
Compute: fa(b,d) = Σ_a fx(a) fl(a,b) P(d|a,b)
         fb(d) = Σ_b fa(b,d)
⇒ fb(d)

[Figure: Asia network — V, S, T, L, A, B, X, D]

• A different elimination ordering: a, b, x, t, v, s, l
• Initial factors: P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

Intermediate factors for this ordering:
  ga(l,t,d,b,x,s,v), gb(l,t,d,x,s,v), gx(l,t,d,s,v), gt(l,d,s,v), gv(l,d,s), gs(l,d), gl(d)

Intermediate factors in the previous ordering (v, s, x, t, l, a, b):
  fv(t), fs(b,l), fx(a), ft(a,l), fl(a,b), fa(b,d), fb(d)

Both orderings need n = 7 elimination steps, but each step has a different computation size.
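The difference between the two orderings can be measured by simulating elimination on the moral graph and recording the largest factor (the eliminated variable plus its current neighbors) each step creates. The edge set below is the moralized Asia graph, and the routine is a sketch of my own:

```python
# Moralized Asia graph: original edges plus marriages T-L and A-B.
moral_edges = {frozenset(e) for e in [
    ("V", "T"), ("S", "L"), ("S", "B"), ("T", "A"), ("L", "A"),
    ("T", "L"), ("A", "X"), ("A", "D"), ("B", "D"), ("A", "B")]}

def max_factor_size(edges, order):
    """Largest factor created when eliminating variables in order."""
    edges = set(edges)
    worst = 0
    for v in order:
        nbrs = {u for e in edges if v in e for u in e if u != v}
        worst = max(worst, len(nbrs) + 1)    # factor over v and neighbors
        for a in nbrs:                        # neighbors become a clique
            for b in nbrs:
                if a != b:
                    edges.add(frozenset((a, b)))
        edges = {e for e in edges if v not in e}
    return worst

good = ["V", "S", "X", "T", "L", "A", "B"]   # ordering of the slides above
bad = ["A", "B", "X", "T", "V", "S", "L"]    # ordering of this slide
print(max_factor_size(moral_edges, good),
      max_factor_size(moral_edges, bad))     # 3 6
```

Eliminating A first couples T, L, X, D, and B into a single six-variable factor, which is why the second ordering is so much more expensive.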

Short Summary
• Variable elimination is a sequence of rewriting operations
• Computation depends on
  – Number of variables n
    • Each elimination step removes one variable
    • So we need n elimination steps
  – Size of factors
    • Affected by the order of elimination
    • Discussed in sub-section 3.2

Dealing with Evidence (1/7)
• How do we deal with evidence?
• Suppose we get evidence V = t, S = f, D = t
• We want to compute P(L, V = t, S = f, D = t)

[Figure: Asia network — V, S, T, L, A, B, X, D]

Dealing with Evidence (2/7)
• We start by writing the factors:
  P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
• Since we know that V = t, we don't need to eliminate V
• Instead, we can replace the factors P(V) and P(T|V) with
  fP(V) = P(V = t)
  fP(T|V)(T) = P(T | V = t)
• These "select" the appropriate parts of the original factors given the evidence
• Note that fP(V) is a constant, and thus does not appear in the elimination of other variables

[Figure: Asia network — V, S, T, L, A, B, X, D]

Dealing with Evidence (3/7)
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) P(x|a) fP(d|a,b)(a,b)

[Figure: Asia network — V, S, T, L, A, B, X, D]

Dealing with Evidence (4/7)
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) P(x|a) fP(d|a,b)(a,b)
• Eliminating x, we get
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) fx(a) fP(d|a,b)(a,b)

[Figure: Asia network — V, S, T, L, A, B, X, D]

Dealing with Evidence (5/7)
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) P(x|a) fP(d|a,b)(a,b)
• Eliminating x, we get
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) fx(a) fP(d|a,b)(a,b)
• Eliminating t, we get
  fP(v) fP(s) fP(l|s)(l) fP(b|s)(b) ft(a,l) fx(a) fP(d|a,b)(a,b)

[Figure: Asia network — V, S, T, L, A, B, X, D]

Dealing with Evidence (6/7)
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) P(x|a) fP(d|a,b)(a,b)
• Eliminating x, we get
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) fx(a) fP(d|a,b)(a,b)
• Eliminating t, we get
  fP(v) fP(s) fP(l|s)(l) fP(b|s)(b) ft(a,l) fx(a) fP(d|a,b)(a,b)
• Eliminating a, we get
  fP(v) fP(s) fP(l|s)(l) fP(b|s)(b) fa(b,l)

[Figure: Asia network — V, S, T, L, A, B, X, D]

Dealing with Evidence (7/7)
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) P(x|a) fP(d|a,b)(a,b)
• Eliminating x, we get
  fP(v) fP(s) fP(t|v)(t) fP(l|s)(l) fP(b|s)(b) P(a|t,l) fx(a) fP(d|a,b)(a,b)
• Eliminating t, we get
  fP(v) fP(s) fP(l|s)(l) fP(b|s)(b) ft(a,l) fx(a) fP(d|a,b)(a,b)
• Eliminating a, we get
  fP(v) fP(s) fP(l|s)(l) fP(b|s)(b) fa(b,l)
• Eliminating b, we get
  fP(v) fP(s) fP(l|s)(l) fb(l)

[Figure: Asia network — V, S, T, L, A, B, X, D]

Complexity (1/2)
• Suppose in one elimination step we compute
  fx(y1, ..., yk) = Σ_x f'x(x, y1, ..., yk)
  where f'x(x, y1, ..., yk) = Π_{i=1~m} fi(x, y_{i,1}, ..., y_{i,li})
• This requires
  – m·|X|·Π_i |Yi| multiplications
    • For each value of x, y1, ..., yk, we do m multiplications
  – |X|·Π_i |Yi| additions
    • For each value of y1, ..., yk, we do |X| additions
|X|: no. of discrete values of X

Complexity (2/2)
• One elimination step requires
  – m·|X|·Π_i |Yi| multiplications
  – |X|·Π_i |Yi| additions
  – O(|X|·Π_i |Yi|), m is a constant (neglected)
  – Or O(d^k) if |X| = |Yi| = d, k: no. of parent nodes
• Time and space cost are O(n·d^k)
  – n: no. of random variables
  – d: no. of discrete values
  – k: no. of parent nodes
Complexity is exponential in the number of variables k in a factor

3.2 Order of Elimination
• How to select "good" elimination orderings in order to reduce complexity
  1. Start by understanding variable elimination via the graph we are working with
  2. Then reduce the problem of finding a good ordering to a well-understood graph-theoretic operation

Undirected Graph Conversion (1/2)
• At each stage of variable elimination, we have an algebraic term that we need to evaluate
• This term is of the form
  P(x1, ..., xn) = Σ_{y1} ... Σ_{yk} Π_i fi(Zi)
  where the Zi are sets of variables

Undirected Graph Conversion (2/2)
• Build an undirected graph where
  – If X, Y are arguments of some factor
    • That is, if X, Y are in some Zi
  – There is an undirected edge X--Y
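For Bayesian-network factors P(Xi | Pa(Xi)), this construction is exactly moralization: connect each node to its parents and "marry" the parents. A sketch of my own over the Asia network:

```python
from itertools import combinations

# Asia network parent lists.
parents = {"V": [], "S": [], "T": ["V"], "L": ["S"], "B": ["S"],
           "A": ["T", "L"], "X": ["A"], "D": ["A", "B"]}

def moralize(parents):
    """Edge X--Y whenever X, Y share a factor scope {child} ∪ parents."""
    edges = set()
    for child, ps in parents.items():
        scope = [child] + ps
        for u, v in combinations(scope, 2):
            edges.add(frozenset((u, v)))
    return edges

edges = moralize(parents)
print(frozenset(("T", "L")) in edges)   # True — marriage edge from P(A|T,L)
```

The marriage edges (T--L from P(A|T,L), A--B from P(D|A,B)) are what distinguish the moral graph from the plain undirected skeleton.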

Example
• Consider the "Asia" example
• The initial factors are
  P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
• The undirected graph is [Figure: moral graph of the Asia network]
• In the first step this graph is just the moralized graph

Variable Elimination: Change of Graph
• Now we eliminate t, getting
  ft(v,a,l) = Σ_t P(t|v) P(a|t,l)
  ⇒ P(v) P(s) P(l|s) P(b|s) ft(v,a,l) P(x|a) P(d|a,b)
• The corresponding change in the graph: nodes V, L, A become a clique

[Figure: moral graph of the Asia network before and after eliminating T]

Example (1/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing

[Figure: Asia network and its moral graph]

Example (2/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence

[Figure: Asia network and its reduced undirected graph]

Example (3/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
• Eliminating x
  – New factor fx(A)

[Figure: Asia network and its reduced undirected graph]

Example (4/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
• Eliminating x
• Eliminating a
  – New factor fa(b,t,l)
  – A clique in the reduced undirected graph

[Figure: Asia network and its reduced undirected graph]

Example (5/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
• Eliminating x
• Eliminating a
• Eliminating b
  – New factor fb(t,l)
  – A clique in the reduced undirected graph

[Figure: Asia network and its reduced undirected graph]

Example (6/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
• Eliminating x
• Eliminating a
• Eliminating b
• Eliminating t
  – New factor ft(l)

[Figure: Asia network and its reduced undirected graph]

Elimination and Clique (1/2)
• We can eliminate a variable X by
  1. For all Y, Z s.t. Y--X and Z--X, add an edge Y--Z
  2. Remove X and all edges adjacent to it
• This procedure creates a clique that contains all the neighbors of X
• After step 1 we have a clique that corresponds to the intermediate factor (before marginalization)
• The cost of the step is exponential in the size of this clique: d^k in O(n·d^k)

Elimination and Clique (2/2)
• The process of eliminating nodes from an undirected graph gives us a clue to the complexity of inference
• To see this, we will examine the graph that contains all of the edges we added during the elimination
• The resulting graph is always chordal

Example (1/7)
• Want to compute P(D)
• Moralizing

[Figure: Asia network and its moral graph]

Example (2/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
  – Multiply to get f'v(v,t)
  – Result fv(t)

[Figure: Asia network and its reduced undirected graph]

Example (3/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
  – Multiply to get f'x(a,x)
  – Result fx(a)

[Figure: Asia network and its reduced undirected graph]

Example (4/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
• Eliminating s
  – Multiply to get f's(l,b,s)
  – Result fs(l,b)

[Figure: Asia network and its reduced undirected graph]

Example (5/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
• Eliminating s
• Eliminating t
  – Multiply to get f't(a,l,t)
  – Result ft(a,l)

[Figure: Asia network and its reduced undirected graph]

Example (6/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
• Eliminating s
• Eliminating t
• Eliminating l
  – Multiply to get f'l(a,b,l)
  – Result fl(a,b)

[Figure: Asia network and its reduced undirected graph]

Example (7/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
• Eliminating s
• Eliminating t
• Eliminating l
• Eliminating a, b
  – Multiply to get f'a(a,b,d)
  – Result f(d)

[Figure: Asia network and its reduced undirected graph]

Induced Graphs
• The resulting graph is the induced graph (for this particular ordering)
• Main properties:
  – Every maximal clique in the induced graph corresponds to an intermediate factor in the computation
  – Every factor stored during the process is a subset of some maximal clique in the graph
• These facts are true for any variable elimination ordering on any network

[Figure: induced graph of the Asia network]

Induced Width (Treewidth)
• The size k of the largest clique in the induced graph is
  – An indicator of the complexity of variable elimination
• w = k - 1 is called
  – The induced width (treewidth) of the graph
  – According to the specified ordering
• Finding a good ordering for a graph is equivalent to finding the minimal induced width of the graph


Treewidth

[Figure: the ALARM network (MINVOLSET, VENTMACH, DISCONNECT, INTUBATION, …, CVP, BP) as an example of a loopy graph]

Low treewidth vs. high treewidth:
• Chains: w = 1
• Trees (no loops): w = #parents
• n×n grid (N = n²): w = O(n) = O(√N)
• Loopy graphs: w = NP-hard to find [Arnborg85]


Complexity
• Time and space cost of variable elimination are O(d^k n)
– n: number of random variables
– d: number of discrete values per variable
– k: size of the largest factor = treewidth + 1 (w + 1)
• Polytrees: k is small, linear
– If k = 1, O(dn)
• Multiply connected networks:
– O(d^k n), k is large
– Can reduce 3SAT to variable elimination
• NP-hard
– Equivalent to counting 3SAT models
• #P-complete, i.e. strictly harder than NP-complete problems


Elimination on Trees (1/3)
• Suppose we have a tree, that is
– a network where each variable has at most one parent
• Then all the factors involve at most two variables: treewidth = 1
• The moralized graph is also a tree

[Figure: a tree over A, B, C, D, E, F, G and its moralized graph]


Elimination on Trees (2/3)
• We can maintain the tree structure by eliminating extreme (leaf) variables in the tree

[Figure: the tree over A..G after successive leaf eliminations]


Elimination on Trees (3/3)
• Formally, for any tree, there is an elimination ordering with treewidth = 1

Theorem
• Inference on trees is linear in the number of variables: O(dn)


Exercise: Variable Elimination

[Network: smart and study are parents of prepared; smart, prepared, and fair are parents of pass]

p(smart) = .8, p(study) = .6, p(fair) = .9

p(prep|…):        smart   ¬smart
  study            .9      .7
  ¬study           .5      .1

p(pass|…):             smart            ¬smart
                    prep   ¬prep     prep   ¬prep
  fair               .9     .7        .7     .2
  ¬fair              .1     .1        .1     .1

Query: What is the probability that a student studied, given that they pass the exam?
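As a check on the exercise, the query can be answered by brute-force enumeration of the joint (the inefficient baseline that variable elimination improves on). This is a minimal sketch; the variable names and the CPT encoding below are my own, and the ¬fair row of p(pass|…) is read as 0.1 for every combination, as in the table.

```python
from itertools import product

p_smart, p_study, p_fair = 0.8, 0.6, 0.9
# p_prep[(smart, study)] = P(prepared | smart, study)
p_prep = {(True, True): .9, (False, True): .7, (True, False): .5, (False, False): .1}
# P(pass | smart, prep) when the exam is fair; 0.1 whenever it is unfair
p_pass_fair = {(True, True): .9, (True, False): .7, (False, True): .7, (False, False): .2}

def joint(smart, study, prep, fair, passed):
    """Joint probability of one full assignment, as a product of the CPT entries."""
    pr = (p_smart if smart else 1 - p_smart)
    pr *= (p_study if study else 1 - p_study)
    pr *= (p_fair if fair else 1 - p_fair)
    q = p_prep[(smart, study)]
    pr *= (q if prep else 1 - q)
    r = p_pass_fair[(smart, prep)] if fair else 0.1
    pr *= (r if passed else 1 - r)
    return pr

# P(study, pass) and P(pass), summing out the hidden variables
num = sum(joint(sm, True, pr, fa, True) for sm, pr, fa in product([True, False], repeat=3))
den = sum(joint(sm, st, pr, fa, True) for sm, st, pr, fa in product([True, False], repeat=4))
print(num / den)  # P(study | pass)
```

The same answer should come out of variable elimination with any ordering of the non-query variables.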


Variable Elimination Algorithm
• Let X1, …, Xm be an ordering on the non-query variables

  Σ_{X1} Σ_{X2} … Σ_{Xm} Π_j P(Xj | Parents(Xj))

• For i = m, …, 1
– Leave in the summation for Xi only factors mentioning Xi
– Multiply the factors, getting a factor that contains a number for each value of the variables mentioned, including Xi
– Sum out Xi, getting a factor f that contains a number for each value of the variables mentioned, not including Xi
– Replace the multiplied factors in the summation
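The loop above can be sketched directly in code. This is a minimal illustration, not a production implementation: factors are plain dicts of my own design (`vars` plus a `table` keyed by assignments), and all variables are assumed binary to keep it short.

```python
from itertools import product

# A factor is {"vars": [names], "table": {assignment_tuple: value}}

def multiply(f1, f2):
    """Pointwise product of two factors over the union of their variables."""
    vs = list(dict.fromkeys(f1["vars"] + f2["vars"]))
    table = {}
    for assign in product([0, 1], repeat=len(vs)):  # binary variables for simplicity
        a = dict(zip(vs, assign))
        p1 = f1["table"][tuple(a[v] for v in f1["vars"])]
        p2 = f2["table"][tuple(a[v] for v in f2["vars"])]
        table[assign] = p1 * p2
    return {"vars": vs, "table": table}

def sum_out(f, var):
    """Marginalize one variable out of a factor."""
    vs = [v for v in f["vars"] if v != var]
    table = {}
    for assign, p in f["table"].items():
        a = dict(zip(f["vars"], assign))
        key = tuple(a[v] for v in vs)
        table[key] = table.get(key, 0.0) + p
    return {"vars": vs, "table": table}

def variable_elimination(factors, order):
    """Eliminate the variables in `order`: multiply the factors mentioning
    each variable, sum it out, and put the result back in the pool."""
    for x in order:
        related = [f for f in factors if x in f["vars"]]
        rest = [f for f in factors if x not in f["vars"]]
        prod = related[0]
        for f in related[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, x)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result
```

For a two-node chain A → B with P(A=1) = 0.3, P(B=1|A=1) = 0.9, P(B=1|A=0) = 0.2, eliminating A yields the marginal P(B=1) = 0.3·0.9 + 0.7·0.2 = 0.41.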


3.3 General Graphs
• If the graph is not a polytree
– More general networks
– Usually loopy networks
• Can we do inference in loopy networks by variable elimination?
– If the network has a cycle, the treewidth for any ordering is greater than 1
– Its complexity is high
– VE becomes an impractical algorithm


Example (1/2)
• Eliminating A, B, C, D, E, …
• The resulting graph is chordal with treewidth 2

[Figure: the loopy graph over A, B, C, D, E, F, G, H at successive stages of this elimination]


Example (2/2)
• Eliminating H, G, E, C, F, D, B, A
• The resulting graph is chordal with treewidth 3

[Figure: the same graph over A..H at successive stages of this elimination]


Find Good Elimination Order in General Graphs

Theorem:
• Finding an ordering that minimizes the treewidth is NP-hard

However,
• There are reasonable heuristics for finding "relatively" good orderings
• There are provable approximations to the best treewidth
• If the graph has a small treewidth, there are algorithms that find it in polynomial time


Heuristics for Finding an Elimination Order
• Since the elimination order is NP-hard to optimize,
• It is common to apply greedy search techniques [Kjaerulff90]
• At each iteration, eliminate the node that would result in the smallest
– number of fill-in edges [min-fill]
– resulting clique weight [min-weight] (weight of a clique = product of the number of states per node in the clique)
• There are also approximation algorithms [Amir01]
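The min-fill heuristic mentioned above is easy to sketch. A hypothetical `min_fill_order` below works on an undirected adjacency map (such as a moral graph); the function name and representation are my own.

```python
def min_fill_order(adj):
    """Greedy elimination ordering by the min-fill heuristic: repeatedly
    eliminate the node whose removal adds the fewest fill-in edges.
    `adj` maps node -> set of neighbour nodes (undirected graph)."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    order = []
    while adj:
        def fill_cost(v):
            # number of neighbour pairs of v not already connected
            nbrs = list(adj[v])
            return sum(1 for i in range(len(nbrs)) for j in range(i + 1, len(nbrs))
                       if nbrs[j] not in adj[nbrs[i]])
        v = min(adj, key=fill_cost)
        for a in adj[v]:                 # add the fill-in edges (marry neighbours)
            for b in adj[v]:
                if a != b:
                    adj[a].add(b)
        for a in adj[v]:                 # remove v from the graph
            adj[a].discard(v)
        order.append(v)
        del adj[v]
    return order
```

On a tree every step has fill cost 0, so leaves are eliminated first and no fill-in edges are ever added, matching the treewidth-1 claim for trees.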


Factorization in Loopy Networks

Factorizable — probabilistic models with no loop are tractable (chain a–b–c–d):

Σ_a Σ_b Σ_c Σ_d P(x_a,x_b) P(x_b,x_c) P(x_c,x_d)
 = Σ_a Σ_b P(x_a,x_b) Σ_c P(x_b,x_c) Σ_d P(x_c,x_d)

Not factorizable — probabilistic models with a loop are not tractable (loop a–b–c–d–a):

Σ_a Σ_b Σ_c Σ_d P(x_{a,b,c,d}); the product of pairwise factors P(x_a,x_b) P(x_b,x_c) P(x_c,x_d) P(x_d,x_a) cannot be pushed inside nested sums.


Short Summary
• Variable elimination
– Actual computation is done in the elimination steps
– Computation depends on the order of elimination
– Very sensitive to topology
– Space = time
• Complexity
– Polytrees: linear time
– General graphs: NP-hard


4. Belief Propagation
• Also called
– Message passing
– Pearl's algorithm
• Subsections
– 4.1 Message passing in simple chains
– 4.2 Message passing in trees
– 4.3 BP algorithm
– 4.4 Message passing in general graphs


What's Wrong with VarElim
• Often we want to query all hidden nodes
• Variable elimination takes O(N² d^k) time to compute P(Xi|e) for all (hidden) nodes Xi
• Message passing algorithms can do this in O(N d^k) time


Repeated Variable Elimination Leads to Redundant Calculations

[Figure: an HMM with hidden chain X1–X2–X3 and observations Y1, Y2, Y3]

O(N² K²) time to compute all N marginals:

P(x1 | y1:3) ∝ P(x1) P(y1|x1) Σ_{x2} P(x2|x1) P(y2|x2) Σ_{x3} P(x3|x2) P(y3|x3)
P(x2 | y1:3) ∝ P(y2|x2) Σ_{x1} P(x2|x1) P(y1|x1) P(x1) Σ_{x3} P(x3|x2) P(y3|x3)
P(x3 | y1:3) ∝ P(y3|x3) Σ_{x2} P(x3|x2) P(y2|x2) Σ_{x1} P(x2|x1) P(y1|x1) P(x1)


Belief Propagation
• Belief propagation (BP) operates by sending beliefs/messages between nearby variables in the graphical model
• It works like variable elimination


4.1 Message Passing in Simple Chains

X1 → … → Xk → … → Xn

• Likelihood query (query without evidence)
– P(X1), P(Xn), P(Xk)
– P(Xj, Xk)
• Posterior query (query with evidence)
– P(X1|Xn), P(Xn|X1)
– P(Xk|X1), P(Xk|Xn)
– P(X1|Xk), P(Xn|Xk)
– P(Xk|Xj)
• Maximum A Posteriori (MAP) query
– arg max P(Xk|Xj)


Sum-Product of the Simple Chain (1/2)

X1 → … → Xk → … → Xn

P(x_k) = Σ_{x_1,…,x_{k-1},x_{k+1},…,x_n} P(x_1, …, x_k, …, x_n)
       = Σ Π_{i=1..n} P(x_i | Pa(X_i))
       = Σ P(x_1) P(x_2|x_1) ⋯ P(x_n|x_{n-1})
       = [ Σ_{x_{k-1}} P(x_k|x_{k-1}) ⋯ Σ_{x_2} P(x_3|x_2) Σ_{x_1} P(x_2|x_1) P(x_1) ]
         × [ Σ_{x_{k+1}} P(x_{k+1}|x_k) ⋯ Σ_{x_n} P(x_n|x_{n-1}) ]


Sum-Product of the Simple Chain (2/2)

X1 → … → Xk → … → Xn

P(x_k | x_j) ∝ P(x_j, x_k)
  = Σ_{ {x_i | i = 1,…,n; i ≠ j,k} } Π_i P(x_i | Pa(X_i))
  = Σ_{ {x_i | i = 1,…,n; i ≠ j,k} } P(x_1) P(x_2|x_1) ⋯ P(x_n|x_{n-1})


4.1.1 Likelihood Query

X1 → X2 → X3 → … → Xn

• P(Xn) or P(xn): forward passing
• P(X1) or P(x1): backward passing
• P(Xk) or P(xk): forward-backward passing


Forward Passing (1/6)
• P(e), for the chain A → B → C → D → E:

P(e) = Σ_d Σ_c Σ_b Σ_a P(a) P(b|a) P(c|b) P(d|c) P(e|d)
     = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) Σ_a P(a) P(b|a)


Forward Passing (2/6)
• Now we can perform the innermost summation:

P(e) = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) Σ_a P(a) P(b|a)
     = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) p(b)

• This summation is exactly
– a variable elimination step
– We say: send a CPT p(b) to compute the next innermost summation
– The sent CPT p(b) is called a belief, or message:

m_AB(b) = p(b) = Σ_a P(a) P(b|a) = Σ_a f(a,b)


Forward Passing (3/6)
• Rearranging and then summing again:

P(e) = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) p(b)
     = Σ_d P(e|d) Σ_c P(d|c) p(c)

m_BC(c) = p(c) = Σ_b P(c|b) m_AB(b)

• Continuing gives m_CD(d) and finally m_DE(e):

P(e) = Σ_d P(e|d) m_CD(d) = m_DE(e)


Forward Passing (4/6)
• How do we compute P(Xn)?

P(x_n) = Σ_{x_1} Σ_{x_2} ⋯ Σ_{x_{n-1}} P(x_1) P(x_2|x_1) P(x_3|x_2) ⋯ P(x_n|x_{n-1})

• Actually, we recursively compute m_{k-1,k}(x_k):

m_{k-1,k}(x_k) = Σ_{x_{k-1}} P(x_k|x_{k-1}) m_{k-2,k-1}(x_{k-1})

so that P(x_n) = m_{n-1,n}(x_n). m_{k-1,k}(x_k) is called a belief, or message.


Forward Passing (5/6)

m_{k-1,k}(x_k) = Σ_{x_{k-1}} P(x_k|x_{k-1}) m_{k-2,k-1}(x_k-1)

[Table: tabulating m_{k-1,k}(x_k) for x_k ∈ {T, F} from P(x_k|x_{k-1}) and m_{k-2,k-1}(x_{k-1})]

Advantage:
• After P(Xn), all P(Xk) are also obtained
• Computes the beliefs of all variables at once, in O(N d^k)


Forward Passing (6/6)

m_{k-1,k}(x_k) = Σ_{x_{k-1}} P(x_k|x_{k-1}) m_{k-2,k-1}(x_{k-1})

• Base case, when X_k has no parents:

m_{k-1,k}(x_k) = P(x_k)   if X_k has no parents

e.g. m_{0,1}(x_1) = P(x_1), so for the chain A → B → C → D → E:

P(e) = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) Σ_a P(a) P(b|a)
     = Σ_d P(e|d) Σ_c P(d|c) Σ_b P(c|b) m_AB(b)
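The forward recursion above is a few lines of code. This is a minimal sketch under my own naming: `prior` is P(X1), `trans[i][j]` is P(X_k = j | X_{k-1} = i), and without evidence each forward message is exactly the marginal P(X_k).

```python
def forward_messages(prior, trans, n):
    """Forward passing on a chain X1 -> ... -> Xn:
    m_{k-1,k}(x_k) = sum_{x_{k-1}} P(x_k|x_{k-1}) m_{k-2,k-1}(x_{k-1}),
    base case m_{0,1}(x_1) = P(x_1).
    Returns [P(X_1), P(X_2), ..., P(X_n)]."""
    m = list(prior)                      # m_{0,1} = P(X_1)
    marginals = [list(m)]
    for _ in range(n - 1):
        # one message update: multiply by the CPT and sum out the previous node
        m = [sum(trans[i][j] * m[i] for i in range(len(m))) for j in range(len(m))]
        marginals.append(list(m))
    return marginals
```

A single sweep gives every P(X_k) on the way to P(X_n), which is the "beliefs of all variables at once" advantage noted on the previous slide.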


Backward Passing (1/4)
• P(a), for the chain A → B → C → D → E:

P(a) = Σ_b Σ_c Σ_d Σ_e P(a) P(b|a) P(c|b) P(d|c) P(e|d)
     = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) Σ_e P(e|d)

• Eliminating variable e, we get f(d)
– We call it a belief/message sent from E to D:

m_ED(d) = f(d) = Σ_e P(e|d)   ( = 1 )

P(a) = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) f(d)


Backward Passing (2/4)
• Eliminating d, we get f(c)
– We call it a belief/message sent from D to C:

m_DC(c) = f(c) = Σ_d P(d|c) m_ED(d)   ( = 1 )

P(a) = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) m_ED(d)
     = P(a) Σ_b P(b|a) Σ_c P(c|b) f(c)


Backward Passing (3/4)
• Eliminating c:

P(a) = P(a) Σ_b P(b|a) Σ_c P(c|b) m_DC(c) = P(a) Σ_b P(b|a) m_CB(b)

• Eliminating b:

P(a) = P(a) Σ_b P(b|a) m_CB(b) = P(a) f(a) = P(a) m_BA(a),   with m_BA(a) = 1


Backward Passing (4/4)

m_{k+1,k}(x_k) = Σ_{x_{k+1}} P(x_{k+1}|x_k) m_{k+2,k+1}(x_{k+1})

• Base case, when X_k has no children:

m_{k+1,k}(x_k) = 1   if X_k has no child

e.g. m_{n+1,n}(x_n) = 1, and for the chain A → B → C → D → E:

P(a) = Σ_b Σ_c Σ_d Σ_e P(a) P(b|a) P(c|b) P(d|c) P(e|d)
     = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) m_ED(d),   m_ED(d) = Σ_e P(e|d) = 1


Comparison

Forward (X1 → … → Xn):
m_{k-1,k}(x_k) = Σ_{x_{k-1}} P(x_k|x_{k-1}) m_{k-2,k-1}(x_{k-1})
m_{k-1,k}(x_k) = P(x_k)   if X_k has no parents

Backward:
m_{k+1,k}(x_k) = Σ_{x_{k+1}} P(x_{k+1}|x_k) m_{k+2,k+1}(x_{k+1})
m_{k+1,k}(x_k) = 1   if X_k has no child


Forward-Backward Passing (1/3)
• P(Xk): the forward message m_{k-1,k}(x_k) and the backward message m_{k+1,k}(x_k) meet at Xk

P(x_k) = ( Σ_{x_{k-1}} P(x_k|x_{k-1}) ⋯ Σ_{x_1} P(x_2|x_1) P(x_1) )
         × ( Σ_{x_{k+1}} P(x_{k+1}|x_k) ⋯ Σ_{x_n} P(x_n|x_{n-1}) )
       = m_{k-1,k}(x_k) × m_{k+1,k}(x_k)


Forward-Backward Passing (2/3)
• P(Xk)

m_{k-1,k}(x_k) = Σ_{x_{k-1}} P(x_k|x_{k-1}) m_{k-2,k-1}(x_{k-1})
m_{k+1,k}(x_k) = Σ_{x_{k+1}} P(x_{k+1}|x_k) m_{k+2,k+1}(x_{k+1})

P(x_k) = m_{k-1,k}(x_k) m_{k+1,k}(x_k)


Forward-Backward Passing (3/3)
• P(Xn) as forward-backward passing:

P(x_n) = m_{n-1,n}(x_n) m_{n+1,n}(x_n) = m_{n-1,n}(x_n),   since m_{n+1,n}(x_n) = 1

• P(X1) as forward-backward passing:

P(x_1) = m_{0,1}(x_1) m_{2,1}(x_1),   with m_{0,1}(x_1) = P(x_1)
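Combining the two recursions gives all marginals in one forward sweep and one backward sweep. A minimal sketch under my own naming (`trans[i][j]` = P(X_k = j | X_{k-1} = i)); without evidence the backward messages are all ones, since Σ_{x_{k+1}} P(x_{k+1}|x_k) = 1.

```python
def chain_marginals(prior, trans, n):
    """All P(X_k) on a chain via P(x_k) = m_{k-1,k}(x_k) * m_{k+1,k}(x_k)."""
    d = len(prior)
    # forward messages: m_{0,1} = prior, then multiply by CPT and sum out
    fwd = [list(prior)]
    for _ in range(n - 1):
        fwd.append([sum(trans[i][j] * fwd[-1][i] for i in range(d)) for j in range(d)])
    # backward messages: m_{n+1,n} = 1, then sum out the next node
    bwd = [[1.0] * d]
    for _ in range(n - 1):
        bwd.append([sum(trans[i][j] * bwd[-1][j] for j in range(d)) for i in range(d)])
    bwd.reverse()
    return [[f * b for f, b in zip(fwd[k], bwd[k])] for k in range(n)]
```

With evidence, the backward messages would carry the observation likelihoods instead of ones, which is exactly the smoothing computation in the HMM slides below.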


Exercise
• P(X1, Xn)
• P(Xj, Xk)

X1 → … → Xj → … → Xk → … → Xn


4.1.2 Posterior Query
• P(Xn|x1): forward passing — x1 → X2 → X3 → … → Xn
• P(X1|xn): backward passing — X1 → X2 → X3 → … → xn
• P(Xk) or P(xk): forward-backward passing


Backward Passing (1/6)
• A query P(A|e) ∝ P(A,e)
– Variable elimination in chains with evidence; chain A → B → C → D → e:

P(A,e) = Σ_b Σ_c Σ_d P(A, b, c, d, e)
       = Σ_b Σ_c Σ_d P(A) P(b|A) P(c|b) P(d|c) P(e|d)
       = P(A) Σ_b P(b|A) Σ_c P(c|b) Σ_d P(d|c) P(e|d)


Backward Passing (2/6)
• Eliminating d, we get P(e|c)
– We call it a belief/message sent from D to C:

m_DC(c) = f(c,e) = Σ_d P(d|c) P(e|d)

P(A,e) = P(A) Σ_b P(b|A) Σ_c P(c|b) Σ_d P(d|c) P(e|d)
       = P(A) Σ_b P(b|A) Σ_c P(c|b) f(c,e)

m_ED(d) = P(e|d) · 1 = P(e|d) m_FE(e),   with m_FE(e) = 1


Backward Passing (3/6)
• Eliminating c, we get:

P(A,e) = P(A) Σ_b P(b|A) Σ_c P(c|b) m_DC(c)
       = P(A) Σ_b P(b|A) f(b,e)

m_CB(b) = f(b,e) = Σ_c P(c|b) f(c,e) = Σ_c P(c|b) m_DC(c)


Backward Passing (4/6)
• Finally, we eliminate b:

P(A,e) = P(A) Σ_b P(b|A) m_CB(b)
       = P(A) f(A,e)

m_BA(a) = f(A,e) = Σ_b P(b|A) f(b,e) = Σ_b P(b|A) m_CB(b)


Backward Passing (5/6)
• Given Xn = xn
• How do we compute P(X1, xn) for P(X1|xn)?

P(X1, xn) = Σ_{x_i, i=2,…,n-1} P(x_1) P(x_2|x_1) ⋯ P(x_n|x_{n-1}) = P(x_1) m_{2,1}(x_1)

using the backward recursion m_{k+1,k}(x_k) = Σ_{x_{k+1}} P(x_{k+1}|x_k) m_{k+2,k+1}(x_{k+1}), and m_{0,1}(x_1) = P(x_1) since X1 has no parents; thus P(x_1, x_n) = m_{0,1}(x_1) m_{2,1}(x_1)


Backward Passing (6/6)
For the chain A → B → C → D → e:

m_DC(c) = Σ_d P(e|d) P(d|c) = Σ_d P(d|c) m_ED(d)

P(a,e) = P(a) Σ_b P(b|a) Σ_c P(c|b) Σ_d P(d|c) P(e|d)
       = P(a) Σ_b P(b|a) Σ_c P(c|b) m_DC(c)


4.1.3 Short Summary
• Messages can be recursively computed, meeting at any Xk of the chain X1 → … → Xk-1 → Xk → Xk+1 → … → Xn:

m_{k-1,k}(x_k) = Σ_{x_{k-1}} P(x_k|x_{k-1}) m_{k-2,k-1}(x_{k-1})
m_{k+1,k}(x_k) = Σ_{x_{k+1}} P(x_{k+1}|x_k) m_{k+2,k+1}(x_{k+1})


Belief of Any Node

m_{k-1,k}(x_k) = Σ_{x_{k-1}} P(x_k|x_{k-1}) m_{k-2,k-1}(x_{k-1})
m_{k+1,k}(x_k) = Σ_{x_{k+1}} P(x_{k+1}|x_k) m_{k+2,k+1}(x_{k+1})

P(x_k) = m_{k-1,k}(x_k) m_{k+1,k}(x_k)
P(x_k | e) ∝ m_{k-1,k}(x_k) m_{k+1,k}(x_k)


Three Special Cases of Messages
• Node Xk without parents: m_{k-1,k}(x_k) = P(x_k)
• Node Xk without children: m_{k+1,k}(x_k) = 1
• Node Xk with evidence x_k = e_k: the sum over x_k collapses, so the outgoing messages reduce to m_{k,k+1}(x_{k+1}) = P(x_{k+1}|x_k) and m_{k,k-1}(x_{k-1}) = P(x_k|x_{k-1}), evaluated at x_k = e_k


4.2 Message Passing in Trees

Markov chain:
P(x_k) = m_{k-1,k}(x_k) m_{k+1,k}(x_k)

Markov tree: a node Xk may have several neighbours (e.g. Xk-1, Xk+1, Xm), each sending a message:

P(x_k) = m_{k-1,k}(x_k) m_{k+1,k}(x_k) m_{m,k}(x_k)
       = Π_{j ∈ N(x_k)} m_{j,k}(x_k)


Two Examples

[Figure: a simple tree over x1, …, x5 (edges x1–x2, x2–x3, x4–x3, x3–x5) and a general tree]


Message Passing in Simple Tree (1/3)

[Figure: the simple tree over x1, …, x5; x1 sends m_{1,2}(x2) to x2]


Message Passing in Simple Tree (2/3)

[Figure: messages m_{1,2}(x2), m_{2,3}(x3), m_{4,3}(x3) flow toward x3, which sends m_{3,5}(x5)]

m_{3,5}(x_5) = Σ_{x_3} P(x_5|x_3) m_{2,3}(x_3) m_{4,3}(x_3)


Message Passing in Simple Tree (3/3)

m_{3,5}(x_5) = Σ_{x_3} P(x_5|x_3) m_{2,3}(x_3) m_{4,3}(x_3)

P(x_3) = m_{2,3}(x_3) m_{4,3}(x_3) m_{5,3}(x_3)

In general:

P(x_i) = Π_{x_j ∈ Neighbor(x_i)} m_{j,i}(x_i)

m_{j,k}(x_k) = Σ_{x_j} P(x_k|x_j) Π_{x_i ∈ Neighbor(x_j) \ {x_k}} m_{i,j}(x_j)
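The two general rules above translate almost directly into a recursive sketch. The representation is my own (neighbour lists, pairwise tables `pair[(j,k)][x_j][x_k]` playing the role of P(x_k|x_j), and unary terms for leaf priors); recomputing messages recursively is wasteful but keeps the code short.

```python
def message(j, k, neighbors, pair, unary, d=2):
    """m_{j,k}(x_k) = sum_{x_j} unary_j(x_j) pair(x_j, x_k)
                      * prod_{i in N(j) \\ {k}} m_{i,j}(x_j)"""
    out = [0.0] * d
    for xj in range(d):
        p = unary[j][xj]
        for i in neighbors[j]:
            if i != k:                       # absorb messages from all other neighbours
                p *= message(i, j, neighbors, pair, unary, d)[xj]
        for xk in range(d):
            out[xk] += p * pair[(j, k)][xj][xk]
    return out

def belief(k, neighbors, pair, unary, d=2):
    """P(x_k) as the normalized product of all incoming messages."""
    b = list(unary[k])
    for j in neighbors[k]:
        m = message(j, k, neighbors, pair, unary, d)
        b = [b[x] * m[x] for x in range(d)]
    z = sum(b)
    return [v / z for v in b]
```

On a tree the recursion always moves away from the query node, so it terminates; Pearl's two-pass protocol (collect/distribute, below) avoids the repeated recomputation.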


Message Passing in HMM (1/3)

[Figure: an HMM with hidden chain X1–X2–X3 and observations Y1, Y2, Y3]

Filtering (forward algorithm): P(X3|y1:3)


Message Passing in HMM (2/3)
• Smoothing
– P(X1|y1:3): backward algorithm
– P(X2|y1:3): backward algorithm

[Figure: the same HMM, with messages flowing toward X1 and toward X2]


Message Passing without Evidence

Belief(x_j) = P(x_j) = Π_{x_i ∈ Neighbor(x_j)} m_{i,j}(x_j)

m_{j,k}(x_k) = Σ_{x_j} P(x_k|x_j) Π_{x_i ∈ Neighbor(x_j) \ {x_k}} m_{i,j}(x_j)

[Figure: node x_j absorbing messages m_{i1,j}(x_j), …, m_{im,j}(x_j) from its neighbours x_i1, …, x_im]


Message Passing with Evidence (1/2)
• Given a set of evidence e = e+ ∪ e−
• The node x splits the network into two disjoint parts, so by conditional independence in the polytree:

P(e+, e− | x) = P(e+ | x) P(e− | x)


Message Passing with Evidence (2/2)
• Given a set of evidence e = e+ ∪ e−
• The belief Belief(x) of a node x is

Belief(x) = P(x | e+, e−)
          = α' P(e+, e− | x) P(x)
          = α' P(e+ | x) P(e− | x) P(x)
          = α P(e− | x) P(x | e+)
          = α λ(x) π(x)


Factorization

Probabilistic models with no loop are factorizable (chain a–b–c–d):

Σ_a Σ_b Σ_c Σ_d P(x_a,x_b) P(x_b,x_c) P(x_c,x_d)
 = Σ_a Σ_b P(x_a,x_b) Σ_c P(x_b,x_c) Σ_d P(x_c,x_d)


Marginal to MAP: Max-Product

[Figure: the simple tree over x1, …, x5; the sum-product messages m_{1,2}, m_{2,3}, m_{4,3}, m_{3,5} are replaced by their max-product counterparts]


Sum-Product vs. Max-Product
• Sum-product computes marginals using this rule:

m_{j,k}(x_k) = Σ_{x_j} P(x_k|x_j) Π_{x_i ∈ Neighbor(x_j) \ {x_k}} m_{i,j}(x_j)

• Max-product computes max-marginals using the rule:

m_{j,k}(x_k) = max_{x_j} P(x_k|x_j) Π_{x_i ∈ Neighbor(x_j) \ {x_k}} m_{i,j}(x_j)

• Same algorithm on different semirings: (+, ×, 0, 1) and (max, ×, 0, 1) [Shafer90, Bistarelli97, Goodman99, Aji00]
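On a chain the semiring swap is literally one changed operation. A minimal sketch under my own naming (`trans[i][j]` = P(X_k = j | X_{k-1} = i)): replacing `sum` by `max` in the forward recursion turns marginals into max-marginals.

```python
def max_product_chain(prior, trans, n):
    """Max-product messages on a chain X1 -> ... -> Xn: the sum-product
    recursion with the sum replaced by max (the (max, x) semiring).
    Returns, for each value of x_n, the max over x_1..x_{n-1} of the joint."""
    m = list(prior)
    for _ in range(n - 1):
        m = [max(trans[i][j] * m[i] for i in range(len(m))) for j in range(len(m))]
    return m
```

Recovering the arg max assignment (the MAP configuration) additionally requires recording, at each step, which value of x_{k-1} achieved the max, and backtracking through those pointers; the sketch omits that bookkeeping.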


4.3 Pearl's BP Algorithm
• The forward-backward algorithm can be generalized to apply to any tree-like graph (one with no loops)
• For now, we assume pairwise potentials

[Pearl88, Shafer90, Yedidia01, etc.]


Basic Idea
• 2 passes: collect and distribute

[Figure: collect evidence toward the root, then distribute evidence from the root; figure from P. Green]


Collect Evidence: Absorb Messages

Belief(x_j) = P(x_j) = Π_{x_i ∈ Neighbor(x_j)} m_{i,j}(x_j)

[Figure: node x_j absorbing messages m_{i1,j}(x_j), …, m_{im,j}(x_j) from its neighbours]


Distribute Evidence: Send Messages

m_{j,k}(x_k) = Σ_{x_j} P(x_k|x_j) Π_{x_i ∈ Neighbor(x_j) \ {x_k}} m_{i,j}(x_j)

[Figure: node x_j sending messages m_{j,i1}(x_i1), …, m_{j,im}(x_im) to its neighbours]


Initialization
• For nodes with evidence e
– λ(xi) = 1 wherever xi = ei; 0 otherwise
– π(xi) = 1 wherever xi = ei; 0 otherwise
• For nodes without parents
– π(xi) = p(xi), the prior probabilities
• For nodes without children
– λ(xi) = 1 uniformly (normalize at the end)


Centralized Protocol

[Figure: collect to the root (post-order), then distribute from the root (pre-order), with messages numbered 1–5 in each phase]

Computes all N marginals in 2 passes over the graph


Distributed Protocol

[Figure: collect and distribute phases running as parallel updates]

Computes all N marginals in O(N) parallel updates


Propagation Example in a Tree
• The example requires five time periods to reach equilibrium (Pearl, 1988, p. 174)

[Figure: data enters at two leaves; the collect and distribute phases propagate it through the tree]


Properties of BP
• Exact inference for polytrees
– Each node separates the polytree into 2 disjoint components
• On a polytree, the BP algorithm converges in time linearly proportional to the number of nodes
– Work done in a node is proportional to the size of its CPT
– Hence BP is linear in the number of network parameters
• For general graphs
– Exact inference is NP-hard
– Approximate inference is NP-hard


4.4 Message Passing in General Graphs
• Belief propagation is only guaranteed to be correct for polytrees (trees)
• Most probabilistic graphs
– are not polytrees
– have many loops
• We cannot factorize the joint probability P(X1,…,Xn) into a nested sum-product of the P(Xi|Pa(Xi))


Loopy Belief Propagation
• Applying BP to graphs with loops (cycles) can give the wrong answer, because it overcounts evidence
• In practice, it often works well (e.g., error-correcting codes)

[Figure: the loopy Cloudy–Sprinkler–Rain–WetGrass network]


Factorization in Loopy Networks

Factorizable — probabilistic models with no loop are tractable (chain a–b–c–d):

Σ_a Σ_b Σ_c Σ_d P(x_a,x_b) P(x_b,x_c) P(x_c,x_d)
 = Σ_a Σ_b P(x_a,x_b) Σ_c P(x_b,x_c) Σ_d P(x_c,x_d)

Not factorizable — probabilistic models with a loop are not tractable (loop a–b–c–d–a):

Σ_a Σ_b Σ_c Σ_d P(x_{a,b,c,d}); the product of pairwise factors P(x_a,x_b) P(x_b,x_c) P(x_c,x_d) P(x_d,x_a) cannot be pushed inside nested sums.


Two Methods
• Loopy belief propagation
– Approximate inference
• Clustering (join tree, junction tree)
– Combine multiple nodes into a hyper-node
– Transform the loopy graph into a polytree
– Then perform belief propagation


Loopy Belief Propagation
• If BP is used on graphs with loops, messages may circulate indefinitely
• Empirically, a good approximation is still achievable
– Stop after a fixed number of iterations
– Stop when there is no significant change in beliefs
– If the solution is not oscillatory but converges, it usually is a good approximation
• Example: turbo codes


Clustering
• The general graph should be converted to a junction tree, by clustering nodes
• Message passing in the general graph = message passing in the junction tree


5. Junction Tree
• Also known as
– Clustering algorithm
– Join tree algorithm
• Sub-sections
– 5.1 Junction tree algorithm
– 5.2 Example: create join tree


Basic Idea
• Join individual nodes to form cluster nodes
• The resulting network becomes a polytree
– Singly connected
– Undirected
• Inference is performed in the polytree
• Reduces the cost of inferring all variables to O(n)
– n is the size of the modified network (the polytree)


An Example (1/2)
• A multiply connected network


An Example (2/2)
• A polytree obtained by combining Sprinkler and Rain


Why Junction Tree
• More efficient inference of all variables
– than variable elimination
– for some PGMs (multiply connected networks)
• Avoids cycles if we
– turn highly-interconnected subsets of the nodes into "hypernodes" (clusters)


5.1 Junction Tree Algorithm
Step 1: Graph transformation
(a) Moralize
(b) Triangulate
(c) Identify cliques
(d) Build junction tree
Step 2: Initialization (of values)
(a) Set up potentials
(b) Propagate potentials
Step 3: Update beliefs
(a) Insert evidence into the junction tree
(b) Propagate potentials
Steps 1 and 2 are performed only once


Step 1: Graph Transformation

DAG → [1(a) Moralize] → Moral Graph → [1(b) Triangulate] → Triangulated Graph → [1(c) Identify cliques] → Hypernodes of Cliques → [1(d) Build junction tree] → Junction Tree


Step 1(a) - Moralize (1/2)
• Add undirected edges between all co-parents which are not currently joined
– "Marrying" the parents

[Figure: the DAG over A, B, C, D, E, F, G, H before and after adding the co-parent edges]
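Moralization is mechanical enough to sketch in a few lines. The function name and the `{child: [parents]}` input format below are my own choices.

```python
def moralize(parents):
    """Moral graph of a DAG given as {node: [parents]}:
    marry all co-parents of each node, then drop edge directions.
    Returns an undirected adjacency map {node: set_of_neighbours}."""
    nodes = set(parents)
    for ps in parents.values():
        nodes.update(ps)
    adj = {v: set() for v in nodes}
    for child, ps in parents.items():
        for p in ps:                      # keep every parent-child edge, undirected
            adj[child].add(p)
            adj[p].add(child)
        for i, a in enumerate(ps):        # marry co-parents pairwise
            for b in ps[i + 1:]:
                adj[a].add(b)
                adj[b].add(a)
    return adj
```

For the v-structure A → C ← B, marrying the parents adds the edge A–B, so the moral graph is the triangle A–B–C.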


Step 1(a) - Moralize (2/2)
• Drop the directions of the arcs

[Figure: the directed graph over A..H, its undirected version, and the resulting moral graph]


Step 1(b) - Triangulate
• An undirected graph is triangulated
• iff every cycle of length > 3 contains an edge connecting two nonadjacent nodes

[Figure: a non-triangulated graph over A..H (NO) and its triangulation (YES)]


Step 1(c) - Identify Cliques
• A clique is a subgraph of an undirected graph that is complete and maximal

[Figure: the triangulated graph over A..H and its cliques: EGH, CEG, DEF, ACE, ADE, ABD]


Properties of Junction Tree
• Each node is a cluster (nonempty set) of variables
• Running intersection property:
– Given two clusters X and Y, all clusters on the path between X and Y contain X ∩ Y
• Separator sets (sepsets):
– Intersection of the adjacent clusters

Example: clusters ABD and ADE are joined by sepset AD; clusters ADE and DEF by sepset DE


Step 1(d) - Build Junction Tree (1/2)
• A junction tree is a clique graph that
– is an undirected tree
– contains all the cliques
– satisfies the running intersection property

[Figure: the cliques EGH, CEG, DEF, ACE, ADE, ABD arranged as the junction tree ABD –AD– ADE –AE– ACE –CE– CEG –EG– EGH, with DEF attached to ADE via sepset DE]


Step 1(d) - Build Junction Tree (2/2)
• In the junction tree GJT, cliques become vertices and sepsets label the edges
(Figure: cliques abd, ade, ace, ceg, def, egh joined by sepsets ad, ae, ce, de, eg)
Ex: ceg ∩ egh = eg


Junction Tree Algorithm
Step 1: Transformation of graph
(a) Moralize
(b) Triangulate
(c) Identify cliques
(d) Build junction tree
Step 2: Initialization (of values)
(a) Set up potentials
(b) Propagate potentials
Step 3: Update beliefs
(a) Insert evidence into the junction tree
(b) Propagate potentials
Steps 1 and 2 are performed only once


Step 2: Initialization
DAG → (Step 1) → Junction Tree
2(a) Set up potentials → Inconsistent Junction Tree
2(b) Propagate potentials → Consistent Junction Tree
Marginalization → P(V = v | E = e)


Potentials
DEFINITION: A potential φA over a set of variables XA is a function that maps each instantiation xA to a non-negative real number. We denote the number that φA maps xA to by φA(xA).
Ex: a potential φabc over the set of vertices {a,b,c}, where Xa has four states and Xb and Xc have three states each.

A joint probability is a special case of a potential where ΣxA φA(xA) = 1.
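Concretely, a potential is just a table of non-negative numbers indexed by joint states. A minimal sketch (the helper name and the state counts mirror the example above but are otherwise made up):

```python
# A potential over a set of variables is a table of non-negative numbers,
# here a dict mapping each joint instantiation to a value.
from itertools import product

def make_potential(card, value=1.0):
    """card: dict var -> number of states. Returns a uniform potential."""
    names = sorted(card)
    return {key: value
            for key in product(*[range(card[n]) for n in names])}

phi = make_potential({'a': 4, 'b': 3, 'c': 3})   # phi_abc: 4 * 3 * 3 = 36 entries
print(len(phi))
print(all(v >= 0 for v in phi.values()))          # non-negativity holds
```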


Decomposable Distribution

DEFINITION: A probability distribution is said to be decomposable with respect to a graph G = (V, E) if G is triangulated and it holds for any clusters A and B with separator C that

XA ⊥ XB | XC

Steps 1 & 2 of the junction tree algorithm guarantee this property!


Factorization of Potentials

THEOREM: A decomposable probability distribution P(XV) on the graph G = (V, E) can be written as the product of all clique potentials divided by the product of all sepset potentials:

P(XV) = Π(all cliques C) φC(XC) / Π(all sepsets S) φS(XS)


Step 2(a) – Set Up Potentials
1. For each cluster C and sepset S:
φC ← 1, φS ← 1
2. For each vertex u in the BN, select a parent cluster C s.t. C ⊇ fa(u) = {u} ∪ pa(u). Include the conditional probability P(Xu | Xpa(u)) into φC:
φC ← φC · P(Xu | Xpa(u))

"PROOF":
Π(all vertices u) P(Xu | Xpa(u)) = P(XV), and since every sepset potential equals 1,
P(XV) = Π(all cliques C) φC / Π(all sepsets S) φS
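The only nontrivial part of step 2 is finding, for each vertex, a clique that covers its family fa(u) = {u} ∪ pa(u). A sketch (the cliques and families below follow the running a–h example, but the helper name is an assumption):

```python
# Sketch of step 2(a): each CPD P(Xu | pa(u)) must be multiplied into one
# clique that contains the whole family {u} | pa(u).
def assign_families(cliques, families):
    """cliques: list of frozensets; families: dict u -> family set {u} | pa(u).
    Returns dict u -> index of a clique containing the family."""
    assignment = {}
    for u, fam in families.items():
        for i, c in enumerate(cliques):
            if fam <= c:            # clique covers the family
                assignment[u] = i
                break
        else:
            raise ValueError(f"no clique contains the family of {u}")
    return assignment

cliques = [frozenset('abd'), frozenset('ade'), frozenset('ace')]
families = {'a': {'a'}, 'b': {'b', 'a'}, 'd': {'d', 'a', 'b'}, 'e': {'e', 'a', 'd'}}
print(assign_families(cliques, families))
```

Each variable's CPD then multiplies the potential of its assigned clique; all other potentials stay at 1, which is exactly what the "PROOF" above relies on.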


After initialization, the potentials in the junction tree are not consistent with each other, i.e. if we use marginalization to get the probability distribution of a variable Xu, we get different results depending on which clique we use.
(Figure: junction tree with cliques abd, ade, ace, ceg, def, egh and sepsets ad, ae, ce, de, eg)

P(Xa) = Σ(d,e) φade = (0.02, 0.43, 0.31, 0.12)
P(Xa) = Σ(c,e) φace = (0.12, 0.33, 0.11, 0.03)

The potentials might not even sum to one, i.e. they are not joint probability distributions.


Step 2(b) - Propagate Potentials
Message passing from clique A to clique B:
1. Projection: project the potential of A onto the sepset SAB
2. Absorption: absorb the potential of SAB into B
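The projection/absorption pair can be sketched concretely with table potentials. This is a minimal Hugin-style message pass under the assumption that potentials are dicts keyed by tuples in a fixed variable order (the variable names and numbers are made up):

```python
# Hugin-style message pass between two cliques sharing a sepset.
# Potentials are dicts keyed by tuples of values in a fixed variable order.
def marginalize(phi, vars_, keep):
    """Sum phi (defined over vars_) down onto the variables in keep."""
    idx = [vars_.index(v) for v in keep]
    out = {}
    for key, val in phi.items():
        k = tuple(key[i] for i in idx)
        out[k] = out.get(k, 0.0) + val
    return out

def pass_message(phi_a, vars_a, phi_s_old, vars_s, phi_b, vars_b):
    """Project clique A onto sepset S, then absorb into clique B."""
    phi_s = marginalize(phi_a, vars_a, vars_s)              # 1. projection
    idx = [vars_b.index(v) for v in vars_s]
    phi_b_new = {}
    for key, val in phi_b.items():
        k = tuple(key[i] for i in idx)
        ratio = phi_s[k] / phi_s_old[k] if phi_s_old[k] else 0.0
        phi_b_new[key] = val * ratio                        # 2. absorption
    return phi_s, phi_b_new

# Tiny binary example: clique A over {x,y}, sepset {x}, clique B over {x,z}.
phi_a = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
phi_s_old = {(0,): 1.0, (1,): 1.0}                          # sepset starts at 1
phi_b = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}
phi_s, phi_b2 = pass_message(phi_a, ['x', 'y'], phi_s_old, ['x'],
                             phi_b, ['x', 'z'])
print(phi_s)   # sepset now holds the marginal of A over x
```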


Global Propagation
1. COLLECT-EVIDENCE: messages 1-5
2. DISTRIBUTE-EVIDENCE: messages 6-10
(both methods are recursive)
(Figure: the ten messages on the junction tree abd, ade, ace, ceg, def, egh; propagation starts at clique abd)


A Priori Distribution
After global propagation the potentials are consistent, and marginalization gives the probability distributions of the variables.


Short Summary
(Figure: the DAG over a–h, its junction tree with cliques abd, ade, ace, ceg, def, egh and sepsets ad, ae, ce, de, eg, and the message order 1-10)

1. For each cluster C and sepset S:
φC ← 1, φS ← 1
2. For each vertex u in the BN, select a parent cluster C s.t. C ⊇ fa(u). Include the conditional probability P(Xu | Xpa(u)) into φC:
φC ← φC · P(Xu | Xpa(u))


Junction Tree Algorithm
Step 1: Transformation of graph
(a) Moralize
(b) Triangulate
(c) Identify cliques
(d) Build junction tree
Step 2: Initialization (of values)
(a) Set up potentials
(b) Propagate potentials
Step 3: Update beliefs
(a) Insert evidence into the junction tree
(b) Propagate potentials
Steps 1 and 2 are performed only once


Step 3(a) - Insert Evidence into JT
• Evidence is new information about a r.v. that changes our belief about its distribution
Ex. Before receiving evidence: P(Xu) = (0.14, 0.43, 0.31, 0.12)
• Hard evidence: the r.v. is instantiated (observed)
Xu = xu ⇒ P(Xu) := (0, 0, 1, 0)
• Soft evidence: everything else
Xu < x1 ⇒ P(Xu) := (0.5, 0.5, 0, 0)


Hard Evidence as a Likelihood
If we observe the variable Xu to be xu, the likelihood function becomes:
λXu(xu) = 1 when xu is the observed value, 0 otherwise
Ex: Xu ∈ {0,1,2,3} and we observe Xu = 2 ⇒ λXu = (0, 0, 1, 0)

For all unobserved variables Xv we make the likelihood function constant:
λXv(xv) = 1/n for all xv, where n is the number of states of Xv
Ex: Xv ∈ {0,1,2} unobserved ⇒ λXv = (0.33, 0.33, 0.33)

Modify the initialization step 2(a) to include this!


Entering Observations
1. For each observation Xu = xu:
(a) Encode the observation as a likelihood λXu
(b) Identify one clique C that contains u and update φC as:
φC ← φC · λXu

Step 3(b) - Propagate Potentials
To make the potentials in the junction tree consistent, perform a global update
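Multiplying a hard-evidence likelihood into a clique simply zeroes out all entries inconsistent with the observation. A sketch (variable names and values are hypothetical):

```python
# Step 3(a) sketch: a hard observation Xu = xu is a 0/1 likelihood vector;
# multiplying it into a clique zeroes every entry inconsistent with it.
def insert_evidence(phi, vars_, u, observed_value):
    """Zero out all entries of phi inconsistent with Xu = observed_value."""
    i = vars_.index(u)
    return {key: (val if key[i] == observed_value else 0.0)
            for key, val in phi.items()}

phi = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}
phi_e = insert_evidence(phi, ['x', 'y'], 'y', 1)
print(phi_e)  # entries with y != 1 are zeroed
```

A subsequent global propagation then spreads the effect of the evidence to every other clique.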


Short Summary
(Figure: on the graph over a–h we observe Xd = xd and Xg = xg; the observations are encoded as likelihoods, multiplied into cliques containing d and g, and then propagated globally)


5.2 Example: Create Junction Tree

HMM with 2 time steps: X1 → X2, X1 → Y1, X2 → Y2

Junction Tree: X1,Y1 --X1-- X1,X2 --X2-- X2,Y2


Initialization

Variable | Associated Cluster | Potential function
X1       | X1,Y1              | φX1,Y1 ← P(X1)
Y1       | X1,Y1              | φX1,Y1 ← P(X1) P(Y1 | X1)
X2       | X1,X2              | φX1,X2 ← P(X2 | X1)
Y2       | X2,Y2              | φX2,Y2 ← P(Y2 | X2)

Junction Tree: X1,Y1 --X1-- X1,X2 --X2-- X2,Y2


Collect Evidence (1/2)
• Choose an arbitrary clique, e.g. X1,X2, where all potential functions will be collected.
• Call neighboring cliques recursively for messages:
• 1. Call X1,Y1:
– 1. Projection: φX1 ← ΣY1 φX1,Y1 = ΣY1 P(X1,Y1) = P(X1)
– 2. Absorption: φX1,X2 ← φX1,X2 · φX1 / φX1(old) = P(X2 | X1) · P(X1) / 1 = P(X1,X2)


Collect Evidence (2/2)
• 2. Call X2,Y2:
– 1. Projection: φX2 ← ΣY2 φX2,Y2 = ΣY2 P(Y2 | X2) = 1
– 2. Absorption: φX1,X2 ← φX1,X2 · φX2 / φX2(old) = P(X1,X2) · 1 / 1 = P(X1,X2)
(Junction tree: X1,Y1 --X1-- X1,X2 --X2-- X2,Y2)


Distribute Evidence (1/2)
• Pass messages recursively to neighboring nodes
• Pass message from X1,X2 to X1,Y1:
– 1. Projection: φX1 ← ΣX2 φX1,X2 = ΣX2 P(X1,X2) = P(X1)
– 2. Absorption: φX1,Y1 ← φX1,Y1 · φX1 / φX1(old) = P(X1,Y1) · P(X1) / P(X1) = P(X1,Y1)


Distribute Evidence (2/2)
• Pass message from X1,X2 to X2,Y2:
– 1. Projection: φX2 ← ΣX1 φX1,X2 = ΣX1 P(X1,X2) = P(X2)
– 2. Absorption: φX2,Y2 ← φX2,Y2 · φX2 / φX2(old) = P(Y2 | X2) · P(X2) / 1 = P(X2,Y2)
(Junction tree: X1,Y1 --X1-- X1,X2 --X2-- X2,Y2)


Inference with Evidence (1/2)
• Assume we want to compute P(X2 | Y1=0, Y2=1) (state estimation)
• Assign likelihoods to the potential functions during initialization:

φX1,Y1 = 0 if Y1 = 1; P(X1, Y1=0) if Y1 = 0
φX2,Y2 = 0 if Y2 = 0; P(Y2=1 | X2) if Y2 = 1


Inference with Evidence (2/2)

• Repeating the same steps as in the previous case, we obtain:

φX1,Y1 = 0 if Y1 = 1; P(X1, Y1=0, Y2=1) if Y1 = 0
φX1 = P(X1, Y1=0, Y2=1)
φX1,X2 = P(X1, X2, Y1=0, Y2=1)
φX2 = P(X2, Y1=0, Y2=1)
φX2,Y2 = 0 if Y2 = 0; P(X2, Y1=0, Y2=1) if Y2 = 1

Normalizing φX2 over X2 yields the desired P(X2 | Y1=0, Y2=1).
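The two-slice HMM example can be checked numerically end to end: run the junction-tree collect pass with the evidence multiplied in, then compare the normalized root marginal against brute-force enumeration. The CPT numbers below are made up for illustration; all variables are binary:

```python
# Numeric check of the two-slice HMM example: junction tree inference of
# P(X2 | Y1=0, Y2=1) versus brute-force enumeration. CPT values are invented.
P_X1 = [0.6, 0.4]
P_X2gX1 = [[0.7, 0.3], [0.2, 0.8]]        # P(X2 | X1)
P_YgX  = [[0.9, 0.1], [0.3, 0.7]]         # P(Y | X), same CPT for both slices

# Initialization with evidence Y1=0, Y2=1 already multiplied in:
phi_x1y1 = [P_X1[x1] * P_YgX[x1][0] for x1 in range(2)]          # over X1
phi_x1x2 = [[P_X2gX1[x1][x2] for x2 in range(2)] for x1 in range(2)]
phi_x2y2 = [P_YgX[x2][1] for x2 in range(2)]                      # over X2

# Collect to the root clique (X1,X2); sepsets start at 1, so absorption
# just multiplies in the projected messages.
msg_x1 = phi_x1y1[:]                      # projection of (X1,Y1) onto X1
msg_x2 = phi_x2y2[:]                      # projection of (X2,Y2) onto X2
phi_x1x2 = [[phi_x1x2[a][b] * msg_x1[a] * msg_x2[b] for b in range(2)]
            for a in range(2)]

# Marginalize the root clique onto X2 and normalize: P(X2 | Y1=0, Y2=1)
phi_x2 = [phi_x1x2[0][b] + phi_x1x2[1][b] for b in range(2)]
z = sum(phi_x2)
post = [p / z for p in phi_x2]

# Brute-force check by enumerating X1
brute = [sum(P_X1[a] * P_YgX[a][0] * P_X2gX1[a][b] for a in range(2))
         * P_YgX[b][1] for b in range(2)]
zb = sum(brute)
brute = [p / zb for p in brute]
print(all(abs(p - q) < 1e-12 for p, q in zip(post, brute)))  # True
```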



An Example
• To perform exact inference in an arbitrary graph, convert it to a junction tree, and then perform belief propagation.
• A jtree is a tree whose nodes are sets, and which has the jtree property: all sets which contain any given variable form a connected graph (a variable cannot appear in 2 disjoint places).
(Figure: the network C → S, C → R, S → W, R → W is moralized, then made into the jtree CSR --SR-- SRW)

Maximal cliques = { {C,S,R}, {S,R,W} }
Separators = { {C,S,R} ∩ {S,R,W} = {S,R} }


Making a Junction Tree
(Figure: a DAG over A–F is moralized into GM, then triangulated into GT with elimination order f,d,e,c,b,a)

Max cliques of GT: {a,b,c}, {b,c,e}, {b,e,f}, {b,d}
Form the jgraph with edge weights Wij = |Ci ∩ Cj|, then take a max spanning tree to get the jtree:
{a,b,c} --2-- {b,c,e} --2-- {b,e,f}, with {b,d} attached by a weight-1 edge

Pipeline: G -> moralize -> GM -> triangulate -> GT -> find max cliques -> jgraph -> max spanning tree -> jtree [Jensen94]
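The max-spanning-tree step can be sketched directly on the cliques from this slide. A minimal Kruskal-style sketch (the helper name is an assumption; for cliques of a triangulated graph, a maximum spanning tree under Wij = |Ci ∩ Cj| satisfies the running intersection property):

```python
# Build a junction tree by taking a maximum spanning tree of the clique
# graph weighted by sepset sizes Wij = |Ci & Cj|.
def junction_tree(cliques):
    """Kruskal-style max spanning tree; returns edges (i, j, sepset)."""
    n = len(cliques)
    edges = sorted(((len(cliques[i] & cliques[j]), i, j)
                    for i in range(n) for j in range(i + 1, n)),
                   reverse=True)                     # heaviest edges first
    parent = list(range(n))                          # union-find forest
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj and w > 0:
            parent[ri] = rj
            tree.append((i, j, cliques[i] & cliques[j]))   # edge + its sepset
    return tree

cliques = [frozenset('abc'), frozenset('bce'), frozenset('bef'), frozenset('bd')]
tree = junction_tree(cliques)
for i, j, sep in tree:
    print(sorted(cliques[i]), sorted(cliques[j]), sorted(sep))
```

On these four cliques the tree has three edges, including the two weight-2 edges with sepsets {b,c} and {b,e}.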


Clique Potentials
(Figure: factor-graph view of the jtree CSR --SR-- SRW for the C,S,R,W network; square nodes are factors)
• Each model clique potential gets assigned to one jtree clique potential
• If we observe W = w*, set E(w) = δ(w, w*), else E(w) = 1
• Each observed variable assigns a delta function to one jtree clique potential


Separator Potentials
(Figure: jtree CSR --SR-- SRW; square nodes are factors)
• Separator potentials enforce consistency between neighboring cliques on common variables


BP on a Jtree
• A jtree is an MRF with pairwise potentials
• Each (clique) node potential contains CPDs and local evidence
• Each edge potential acts like a projection function
• We do a forwards (collect) pass, then a backwards (distribute) pass
• The result is the Hugin / Shafer-Shenoy algorithm
(Figure: message order 1-4 on the jtree CSR --SR-- SRW)


BP on a Jtree (collect)
Initial clique potentials contain CPDs and evidence


BP on a Jtree (collect)
Message from clique to separator marginalizes the belief (projects onto the intersection) [remove c]


BP on a Jtree (collect)
Separator potentials get the marginal belief from their parent clique


BP on a Jtree (collect)
Message from separator to clique expands the marginal [add w]


BP on a Jtree (collect)
The root clique has seen all the evidence


BP on a Jtree (distribute)
(Figure: the jtree CSR --SR-- SRW before and after the distribute pass)


BP on a Jtree (distribute)
Marginalize out w and exclude old evidence (ec, er)


BP on a Jtree (distribute)
Combine upstream and downstream evidence


BP on a Jtree (distribute)
Add c and exclude old evidence (ec, er)


BP on a Jtree (distribute)
Combine upstream and downstream evidence


Partial Beliefs
(Figure: evidence on R now added to the jtree CSR --SR-- SRW)
• The "beliefs"/messages at intermediate stages (before finishing both passes) may not be meaningful, because a given clique may not have "seen" all the model potentials/evidence (and hence may not be normalizable)
• This can cause problems when messages may fail (e.g. sensor nets)
• One must reparameterize using the decomposable model to ensure meaningful partial beliefs [Paskin04]


6. Summary
• Variable elimination
– Good concept of sum-product computation
– Not good for computing the beliefs of many nodes
• Belief propagation
– Good for computing the beliefs of many nodes on poly-trees
– Not good for general graphs
• Junction tree
– Good for computing the beliefs of many nodes on general graphs


Three Methods Are Closely Related
• Variable elimination provides the basic ideas of BP and junction tree
– Belief / Message <-> Factor
– Propagation / Passing <-> Elimination
– Clustering <-> Elimination, Factor
• Junction tree is the converging algorithm
• Message passing provides the unified formula


7. Implementation

Algorithm            | PNL                    | GeNIe
Enumeration          | v (Naive)              | -
Variable Elimination | -                      | -
Belief Propagation   | v (Pearl)              | v (Polytree)
Junction Tree        | v                      | v (Clustering)
Direct Sampling      | -                      | v (Logic)
Likelihood Sampling  | v (LWSampling)         | v (Likelihood sampling)
MCMC Sampling        | v (GibbsWithAnnealing) | (other 5 samplings)


8. References
• S. M. Aji and R. J. McEliece, "The generalized distributive law," IEEE Trans. on Information Theory, vol. 46, no. 2, 2000.
• F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Trans. on Information Theory, vol. 47, no. 2, 2001.


Recent Books
• R. E. Neapolitan, Learning Bayesian Networks, Prentice Hall, 2004.
• C. Borgelt and R. Kruse, Graphical Models: Methods for Data Analysis and Mining, Wiley, 2002.
• D. Edwards, Introduction to Graphical Modelling, 2nd ed., Springer, 2000.
• S. L. Lauritzen, Graphical Models, Oxford, 1996.
• M. I. Jordan (ed.), Learning in Graphical Models, MIT, 2001.


Probabilistic Inference Using Bayesian Network

• Introductory article: M. Henrion, "An introduction to algorithms for inference in belief nets," in M. Henrion, R. Shachter, L. Kanal, and J. Lemmer (eds.), Uncertainty in Artificial Intelligence 5, Amsterdam: North-Holland, 1990.
• Textbook with HUGIN system: F. Jensen, An Introduction to Bayesian Networks, New York: Springer-Verlag, 1996.
• R. Neal, "Connectionist learning of belief networks," Artificial Intelligence, 56:71-113, 1991.


General Probabilistic Inference
• J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.
• E. Castillo, J. M. Gutierrez, and A. S. Hadi, Expert Systems and Probabilistic Network Models, Springer, 1997.
• R. Neapolitan, Probabilistic Reasoning in Expert Systems: Theory and Algorithms, New York: John Wiley & Sons, 1990.
• A special issue on "Uncertainty in AI" of the Communications of the ACM, vol. 38, no. 3, March 1995.
• G. Shafer and J. Pearl (eds.), Readings in Uncertain Reasoning, San Francisco: Morgan Kaufmann, 1990.

Recommended