Graph Embedding (GE) & Marginal Fisher Analysis (MFA)
Outline
1. Introduction
2. System Flowchart
3. Dimensionality Reduction - Graph Embedding
   3.1 Cost Function - Intrinsic Graph / Penalty Graph
   3.2 Linearization
   3.3 Example: LDA
4. Marginal Fisher Analysis
5. Experiment Results
1. Introduction

We present a general framework called Graph Embedding (GE).

In graph embedding, the underlying merits and shortcomings of different dimensionality reduction schemes, existing or new, are revealed by differences in the design of their intrinsic and penalty graphs and their types of embedding.

We also present a novel dimensionality reduction algorithm, Marginal Fisher Analysis (MFA).
2. Face Recognition Flowchart

Notation:
N : # of images (200; 20 images per person)
$N_c$ : # of people (10)
m : image size (24×24)
$X = [x_1, \ldots, x_N] \in R^{m \times N} = R^{(24 \times 24) \times 200}$ : training image set
$w \in R^{m \times 37} = R^{(24 \times 24) \times 37}$ : unitary linear projection vector
k : # of nearest neighbors ($k_1$ : k-NN for the intrinsic graph, $k_2$ : k-NN for the penalty graph)

Training process:
1. Training image set: X
2.1 MFA space creation: w (determined by $k_1$ and $k_2$)
2.2 Projection to MFA space: $Y = w^\top X$, $Y \in R^{m' \times N} = R^{37 \times 200}$
3. k-NN classification

Test process:
1. Test image: $x_{test} \in R^{(24 \times 24) \times 1}$
2. Projection to MFA space: $y_{test} = w^\top x_{test}$, $y_{test} \in R^{37 \times 1}$
3. k-NN classification (using Y and k) → classification result
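To make the flowchart concrete, here is a minimal sketch of the projection and classification steps in Python/NumPy. The MFA training step that produces w is covered in Section 4; here a random matrix stands in for the learned projection, and the nearest-neighbor rule uses k = 1.

```python
import numpy as np

# Dimensions from the flowchart: 24x24 images, 200 training images, 37 MFA axes.
m, N, d = 24 * 24, 200, 37

rng = np.random.default_rng(0)
X = rng.standard_normal((m, N))        # training image set, one image per column
labels = np.repeat(np.arange(10), 20)  # 10 people, 20 images per person
w = rng.standard_normal((m, d))        # stand-in for the learned MFA projection

# Training: project all training images into the MFA space, Y = w^T X.
Y = w.T @ X                            # Y in R^{37 x 200}

# Test: project a test image and classify it with 1-NN in the MFA space.
x_test = rng.standard_normal((m, 1))   # x_test in R^{576 x 1}
y_test = w.T @ x_test                  # y_test in R^{37 x 1}
dists = np.linalg.norm(Y - y_test, axis=0)
print("predicted person:", labels[np.argmin(dists)])
```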
3. Graph Embedding

We now introduce the dimensionality reduction problem from the new point of view of graph embedding. For a dimensionality reduction problem, we require an intrinsic graph G and, optionally, a penalty graph as input.

Let G = {X, W} be an undirected weighted graph (edges are two-way) with vertex set $X = [x_1, x_2, \ldots, x_N]$ (N nodes) and similarity (weight) matrix $W \in R^{N \times N}$.

W:
1. Symmetric matrix
2. May be negative

Example: $X = [x_1\; x_2\; x_3\; x_4]$, $W \in R^{4 \times 4}$.
3. Graph Embedding: Laplacian Matrix

For G = {X, W}, where W is the weight matrix (also called the similarity matrix), the Laplacian matrix is the degree matrix minus the adjacency (weight) matrix:

$$L = D - W, \qquad D_{ii} = \sum_{j \ne i} W_{ij} \qquad (2)$$

Example:

$$W = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}, \quad L = D - W = \begin{bmatrix} 2 & 0 & -1 & -1 \\ 0 & 1 & 0 & -1 \\ -1 & 0 & 2 & -1 \\ -1 & -1 & -1 & 3 \end{bmatrix}$$
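The example above is easy to check numerically; a minimal NumPy verification of Eq. (2):

```python
import numpy as np

# The 4-vertex example from this slide: symmetric 0/1 similarity matrix W.
W = np.array([[0, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 0, 1],
              [1, 1, 1, 0]])

D = np.diag(W.sum(axis=1))  # degree matrix: D_ii = sum_j W_ij -> diag(2, 1, 2, 3)
L = D - W                   # Laplacian matrix, Eq. (2)
print(L)
```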
3.1 Cost Function (1/2)

Our graph-preserving criterion is given as follows:

$$y^* = \arg\min_{y^\top B y = d} \sum_{i \ne j} \|y_i - y_j\|^2 W_{ij} = \arg\min_{y^\top B y = d} y^\top L y \qquad (3)$$

For samples $x_i$ and $x_j$ with larger (positive) similarity, the distance $\|y_i - y_j\|^2$ is forced to be small; for samples $x_i$ and $x_j$ with smaller (negative) similarity, $\|y_i - y_j\|^2$ is allowed to be large.

B typically is the Laplacian matrix of a penalty graph $G^p$: $B = L^p = D^p - W^p$.

Use Lagrange multipliers to solve: with $u(y) = y^\top L y - \lambda (y^\top B y - d)$, setting $\partial u / \partial y = 0$ gives $L y = \lambda B y$, so $y$ must be an eigenvector of $B^{-1} L$.

Here $y^\top L y$ comes from the intrinsic graph (the term being minimized) and $y^\top B y = d$ comes from the penalty graph (the regularization constraint); the embedding $y$ is unknown, while $L$ and $B$ are known.
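A small sketch of how one might solve Eq. (3) numerically, assuming B is invertible: the optimum is the eigenvector of the generalized problem $Ly = \lambda By$ with the smallest eigenvalue.

```python
import numpy as np
from scipy.linalg import eig

def graph_embedding_1d(L, B):
    """Minimize y^T L y subject to y^T B y = d via L y = lambda * B y."""
    eigvals, eigvecs = eig(L, B)      # generalized eigenproblem
    order = np.argsort(eigvals.real)
    # Smallest-eigenvalue eigenvector; for a graph Laplacian L this may be
    # the trivial constant vector, which one would skip in practice.
    return eigvecs[:, order[0]].real
```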
3.1 Cost Function: Intrinsic Graph in LDA

$$W_{ij} = \begin{cases} \dfrac{1}{n_c}, & \text{if } c_i = c_j = c \\ 0, & \text{if } c_i \ne c_j \end{cases}$$

Example: if we have 6 images of 2 people, $X = [\underbrace{x_1\, x_2\, x_3\, x_4}_{n_{c_1}=4}\; \underbrace{x_5\, x_6}_{n_{c_2}=2}]$, $N = 6$:

$$W = \begin{bmatrix} 0 & \frac14 & \frac14 & \frac14 & 0 & 0 \\ \frac14 & 0 & \frac14 & \frac14 & 0 & 0 \\ \frac14 & \frac14 & 0 & \frac14 & 0 & 0 \\ \frac14 & \frac14 & \frac14 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac12 \\ 0 & 0 & 0 & 0 & \frac12 & 0 \end{bmatrix} \in R^{6 \times 6}$$

$$D = \mathrm{diag}\!\left(\tfrac34, \tfrac34, \tfrac34, \tfrac34, \tfrac12, \tfrac12\right) \in R^{6 \times 6}$$

$$L = D - W = \begin{bmatrix} \frac34 & -\frac14 & -\frac14 & -\frac14 & 0 & 0 \\ -\frac14 & \frac34 & -\frac14 & -\frac14 & 0 & 0 \\ -\frac14 & -\frac14 & \frac34 & -\frac14 & 0 & 0 \\ -\frac14 & -\frac14 & -\frac14 & \frac34 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac12 & -\frac12 \\ 0 & 0 & 0 & 0 & -\frac12 & \frac12 \end{bmatrix} \in R^{6 \times 6}$$

$W_{ii}$ is not important, because in GE the sums always run over $i \ne j$.
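A sketch of building this LDA intrinsic graph in NumPy, reproducing the 6-image example:

```python
import numpy as np

# LDA intrinsic graph: W_ij = 1/n_c when x_i and x_j share class c, else 0.
labels = np.array([1, 1, 1, 1, 2, 2])  # the slide's example: n_c1 = 4, n_c2 = 2

n = len(labels)
W = np.zeros((n, n))
for c in np.unique(labels):
    idx = np.where(labels == c)[0]
    W[np.ix_(idx, idx)] = 1.0 / len(idx)
np.fill_diagonal(W, 0.0)               # W_ii irrelevant: GE sums over i != j

D = np.diag(W.sum(axis=1))             # diag(3/4, 3/4, 3/4, 3/4, 1/2, 1/2)
L = D - W
```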
3.1 Cost Function: Penalty Graph

We define the intrinsic graph to be the graph G itself.

A penalty graph $G^p = \{X, W^p\}$ is:
1. a graph whose vertices X are the same as those of G;
2. a graph whose edge weight matrix $W^p$ corresponds to the similarity characteristics that are to be suppressed in the dimension-reduced feature space;
3. in effect, a constraint (penalty graph = constraint).
3.1 Cost Function: Penalty Graph in LDA

$$W^p_{ij} = \frac{1}{N}$$

Example: if we have 6 images of 2 people, $X = [\underbrace{x_1\, x_2\, x_3\, x_4}_{n_{c_1}=4}\; \underbrace{x_5\, x_6}_{n_{c_2}=2}]$, $N = 6$:

$$W^p = \frac16 \mathbf{1}\mathbf{1}^\top \in R^{6 \times 6} \quad \text{(every entry equals } \tfrac16\text{)}$$

$$D^p = I \in R^{6 \times 6}$$

$$B = D^p - W^p = I - W^p \in R^{6 \times 6} \quad \text{(diagonal entries } \tfrac56\text{, off-diagonal entries } -\tfrac16\text{)}$$

Maximizing the between-class covariance is equivalent to maximizing the data covariance. Here, the LDA penalty graph plays the role of the PCA intrinsic graph: it considers only the between-class scatter.
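The corresponding penalty graph for the same example, as a short sketch:

```python
import numpy as np

# LDA penalty graph: W^p_ij = 1/N for every pair, so D^p = I and
# B = I - W^p (diagonal 5/6, off-diagonal -1/6 for N = 6).
N = 6
Wp = np.full((N, N), 1.0 / N)
Dp = np.diag(Wp.sum(axis=1))  # row sums are 1, so D^p is the identity
B = Dp - Wp
```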
3.2 Linearization

Assuming that the low-dimensional vector representations of the vertices can be obtained from a linear projection as $y_i = w^\top x_i$, where $w$ is the unitary projection vector, the objective function in (3) becomes:

$$w^* = \arg\min_{\substack{w^\top X B X^\top w = d \\ \text{or } w^\top w = d}} \sum_{i \ne j} \|w^\top x_i - w^\top x_j\|^2 W_{ij} = \arg\min_{\substack{w^\top X B X^\top w = d \\ \text{or } w^\top w = d}} w^\top X L X^\top w \qquad (4)$$

Solution: with $u(w) = w^\top X L X^\top w - \lambda (w^\top X B X^\top w - d)$, setting $\partial u / \partial w = 0$ gives $X L X^\top w = \lambda X B X^\top w$, so $w$ is a generalized eigenvector; w can be solved by singular value decomposition (SVD).
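A sketch of the linearized solver via the generalized eigenproblem, assuming $X B X^\top$ is invertible (in practice a PCA preprocessing step is often used to guarantee this):

```python
import numpy as np
from scipy.linalg import eig

def linear_graph_embedding(X, L, B, n_components):
    """Solve (X L X^T) w = lambda (X B X^T) w for the smallest eigenvalues."""
    A = X @ L @ X.T
    C = X @ B @ X.T
    eigvals, eigvecs = eig(A, C)
    order = np.argsort(eigvals.real)
    return eigvecs[:, order[:n_components]].real  # columns = projection axes w
```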
4.1 Marginal Fisher Analysis (MFA) (1/3)

[Fig. 4: The adjacency relationships of the intrinsic and penalty graphs for the Marginal Fisher Analysis algorithm.]

Intrinsic graph (within-class): k-NN with $k_1 = 5$, neighbors taken from the same class.
Penalty graph (between-class): marginal k-NN with $k_2 = 2$, pairs taken from different classes.
4.1 Marginal Fisher Analysis (MFA) (2/3)

Cost function: minimize the within-class compactness (intrinsic graph) while maximizing the between-class separability (penalty graph):

$$w^* = \arg\min_w \frac{\tilde{S}_c}{\tilde{S}_p} = \arg\min_w \frac{w^\top X (D - W) X^\top w}{w^\top X (D^p - W^p) X^\top w} \qquad (15)$$

Intrinsic graph:

$$\tilde{S}_c = \sum_i \sum_{i \in N_{k_1}(j) \text{ or } j \in N_{k_1}(i)} \|w^\top x_i - w^\top x_j\|^2 = 2\, w^\top X (D - W) X^\top w \qquad (13)$$

$$W_{ij} = \begin{cases} 1, & \text{if } i \in N_{k_1}(j) \text{ or } j \in N_{k_1}(i) \\ 0, & \text{else} \end{cases}$$

$N_{k_1}(i)$ indicates the index set of the $k_1$ nearest neighbors of the sample $x_i$ in the same class (here k-NN with $k_1 = 5$, in the same class).
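A sketch of building the intrinsic graph of Eq. (13), with samples stored as columns of X and labels given as a NumPy array:

```python
import numpy as np

def mfa_intrinsic_graph(X, labels, k1=5):
    """W_ij = 1 if i is among the k1 same-class neighbors of j, or vice versa."""
    n = X.shape[1]
    W = np.zeros((n, n))
    for i in range(n):
        same = np.where(labels == labels[i])[0]
        same = same[same != i]                     # same class, excluding i
        d = np.linalg.norm(X[:, same] - X[:, [i]], axis=0)
        for j in same[np.argsort(d)[:k1]]:
            W[i, j] = W[j, i] = 1.0                # symmetric "or" condition
    return W
```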
4.1 Marginal Fisher Analysis (MFA) (3/3)

Penalty graph:

$$\tilde{S}_p = \sum_i \sum_{(i,j) \in P_{k_2}(c_i) \text{ or } (i,j) \in P_{k_2}(c_j)} \|w^\top x_i - w^\top x_j\|^2 = 2\, w^\top X (D^p - W^p) X^\top w \qquad (14)$$

$$W^p_{ij} = \begin{cases} 1, & \text{if } (i,j) \in P_{k_2}(c_i) \text{ or } (i,j) \in P_{k_2}(c_j) \\ 0, & \text{else} \end{cases}$$

How to decide the $k_1$- and $k_2$-nearest neighbors: standard k-nearest neighbors (k-NN). $P_{k_2}(c)$ is the set of data pairs that are the $k_2$ nearest pairs among the set $\{(i, j) : i \in \pi_c,\, j \notin \pi_c\}$, where $\pi_c$ denotes the index set belonging to the $c$th class.
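And a sketch of the penalty graph of Eq. (14): for each class, the $k_2$ closest pairs that straddle the class boundary are connected.

```python
import numpy as np

def mfa_penalty_graph(X, labels, k2=2):
    """W^p_ij = 1 for the k2 nearest (in-class, out-of-class) pairs of each class."""
    n = X.shape[1]
    Wp = np.zeros((n, n))
    for c in np.unique(labels):
        inside = np.where(labels == c)[0]
        outside = np.where(labels != c)[0]
        # pairwise distances between in-class and out-of-class samples
        d = np.linalg.norm(X[:, inside, None] - X[:, None, outside], axis=0)
        for f in np.argsort(d, axis=None)[:k2]:    # k2 smallest pair distances
            i, j = np.unravel_index(f, d.shape)
            Wp[inside[i], outside[j]] = Wp[outside[j], inside[i]] = 1.0
    return Wp
```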
4.1 Q&A (1/2)

Q1: MFA: how to decide $k_1$ and $k_2$?

A1:
$k_1$ (within-class): sample five values between 2 and $\min_c(n_c - 1)$, and choose the value with the best MFA performance. $n_c$: # of images per class (subject). (We directly use 5.)

$k_2$ (between-class): choose the best $k_2$ between 20 and $8 N_c$ at sampled intervals of 20. $N_c$: # of classes (subjects). (We directly use 20.)
4.1 Q&A (2/2)

MFA: comparison with LDA.

Advantages:
1. The number of available projection directions (axes) is much larger than that of LDA, so MFA finds more significant axes and gives better classification results. MFA: $\mathrm{rank}(B^{-1}L)$; LDA: $N_c - 1$.
2. There is no assumption on the data distribution, so MFA is more general for discriminant analysis; LDA assumes the data are approximately Gaussian distributed. Data distribution: MFA: non-linear; LDA: linear (Gaussian).
3. The interclass margin can better characterize the separability of different classes than the interclass scatter in LDA. MFA: maximize the margin; LDA: maximize the difference between class means.

Disadvantage:
LDA has been extended to incremental LDA; for MFA, no incremental version exists yet.

[Figure: the margin between the positive and negative classes.]
5. Experiments: Database (1/2)

Database: Yale B
People: 10
# Images: 30 per person; 20 images are randomly selected for training, and the remaining 10 are used for testing (G30/P20)
Image size: 24×24
Variations: variable illumination, cropped faces
5. Experiment Results

k-NN         k = 1           k = 3           k = 5
Error rate   11.72% ± 3.58   11.04% ± 3.66   11.10% ± 3.27

For each k, run 100 times and calculate the mean and standard deviation.
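A sketch of this evaluation protocol, where `run_pipeline` is a hypothetical stand-in for the whole MFA + k-NN pipeline and returns the predicted labels for the test split:

```python
import numpy as np

def evaluate(images, labels, run_pipeline, k, runs=100, seed=0):
    """Mean/std of the error rate over repeated random 20-train/10-test splits."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(runs):
        train_idx, test_idx = [], []
        for c in np.unique(labels):              # 20 train / 10 test per person
            idx = rng.permutation(np.where(labels == c)[0])
            train_idx.extend(idx[:20])
            test_idx.extend(idx[20:30])
        train_idx, test_idx = np.array(train_idx), np.array(test_idx)
        preds = run_pipeline(images, labels, train_idx, test_idx, k)
        errors.append(np.mean(preds != labels[test_idx]))
    return np.mean(errors), np.std(errors)
```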
References

1. P. Belhumeur, J. Hespanha, and D. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, pp. 711-720, 1997.
2. T.K. Kim, S.F. Wong, B. Stenger, J. Kittler, and R. Cipolla, "Incremental Linear Discriminant Analysis Using Sufficient Spanning Set Approximations," CVPR, pp. 1-8, 2007.
3. S. Yan, D. Xu, B. Zhang, H. Zhang, Q. Yang, and S. Lin, "Graph Embedding and Extensions: A General Framework for Dimensionality Reduction," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 29, No. 1, pp. 40-51, 2007.