Non-Parametric Estimation (Lecturer: 虞台文)


Page 1

Non-Parametric Estimation

Lecturer: 虞台文

Page 2

Contents
- Introduction
- Parzen Windows
- k_n-Nearest-Neighbor Estimation
- Classification Techniques
  - The Nearest-Neighbor Rule (1-NN)
  - The k-Nearest-Neighbor Rule (k-NN)
- Distance Metrics

Page 3

Non-Parametric Estimation

Introduction

Page 4

Facts
- Classical parametric densities are unimodal, whereas many practical problems involve multimodal densities.
- Common parametric forms rarely fit the densities actually encountered in practice.

Page 5

Goals
- Estimate class-conditional densities $p(\mathbf{x} \mid \omega_i)$
- Estimate posterior probabilities $P(\omega_i \mid \mathbf{x})$

Page 6

Density Estimation

Draw $n$ samples from $p(\mathbf{x})$. For a region $\mathcal{R}$, the probability of landing in $\mathcal{R}$ is
$$P_{\mathcal{R}} = \int_{\mathcal{R}} p(\mathbf{x}')\,d\mathbf{x}'$$
Assume $p(\mathbf{x})$ is continuous and $\mathcal{R}$ is small, so that
$$P_{\mathcal{R}} = \int_{\mathcal{R}} p(\mathbf{x}')\,d\mathbf{x}' \approx p(\mathbf{x})\,V$$
where $V$ is the volume of $\mathcal{R}$.

Randomly take $n$ samples and let $K$ denote the number of samples inside $\mathcal{R}$. Then $K \sim B(n, P_{\mathcal{R}})$:
$$P(K = k) = \binom{n}{k} P_{\mathcal{R}}^{\,k} (1 - P_{\mathcal{R}})^{\,n-k}, \qquad E[K] = nP_{\mathcal{R}}$$

Page 7

Density Estimation

As before, $P_{\mathcal{R}} = \int_{\mathcal{R}} p(\mathbf{x}')\,d\mathbf{x}' \approx p(\mathbf{x})\,V$ and $E[K] = nP_{\mathcal{R}}$.

Let $k_{\mathcal{R}}$ denote the number of samples observed in $\mathcal{R}$. Using $k_{\mathcal{R}} \approx E[K]$,
$$P_{\mathcal{R}} \approx k_{\mathcal{R}}/n \quad\Longrightarrow\quad p(\mathbf{x}) \approx \frac{k_{\mathcal{R}}/n}{V}$$

Page 8

Density Estimation

$$p(\mathbf{x}) \approx \frac{k/n}{V}$$

What items can be controlled? How?

Use subscript $n$ to take the sample size into account:
$$p_n(\mathbf{x}) = \frac{k_n/n}{V_n}$$

We hope that $\lim_{n\to\infty} p_n(\mathbf{x}) = p(\mathbf{x})$. For this, we should have:
1. $\lim_{n\to\infty} V_n = 0$
2. $\lim_{n\to\infty} k_n = \infty$
3. $\lim_{n\to\infty} k_n/n = 0$
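The counting estimate $p(\mathbf{x}) \approx (k/n)/V$ can be checked numerically. Below is a minimal sketch in Python (the data and the helper name `density_estimate` are made up for illustration): count the samples falling in a small interval around $x$ and divide by the interval's volume.

```python
import random

def density_estimate(samples, x, half_width):
    # p(x) ~ (k/n) / V: the fraction of samples falling in the region
    # R = [x - h, x + h], divided by the region's volume V = 2h.
    k = sum(1 for s in samples if abs(s - x) <= half_width)
    return (k / len(samples)) / (2 * half_width)

random.seed(0)
samples = [random.random() for _ in range(10000)]   # Uniform(0, 1): true p(x) = 1
print(density_estimate(samples, 0.5, 0.05))         # should be close to 1.0
```

With 10,000 uniform samples the estimate sits near the true density 1; shrinking the interval raises the variance, shrinking it too slowly raises the bias, which is exactly the trade-off the three conditions above govern.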

Page 9

Two Approaches

Start from $p_n(\mathbf{x}) = \dfrac{k_n/n}{V_n}$ with the conditions $\lim_{n\to\infty} V_n = 0$, $\lim_{n\to\infty} k_n = \infty$, $\lim_{n\to\infty} k_n/n = 0$.

What items can be controlled? How?
- Parzen Windows: control $V_n$
- $k_n$-Nearest-Neighbor: control $k_n$

Page 10

Two Approaches
- Parzen Windows
- k_n-Nearest-Neighbor

Page 11

Non-Parametric Estimation

Parzen Windows

Page 12

Window Function

$$\varphi(\mathbf{u}) = \begin{cases} 1 & |u_j| \le 1/2,\ j = 1, 2, \dots, d \\ 0 & \text{otherwise} \end{cases}$$

$\varphi(\mathbf{u})$ is a unit hypercube centered at the origin, with
$$\int \varphi(\mathbf{u})\,d\mathbf{u} = 1$$

Page 13

Window Function

Scaling the argument by $h_n$ gives a hypercube of edge length $h_n$:
$$\varphi(\mathbf{x}/h_n) = \begin{cases} 1 & |x_j| \le h_n/2,\ j = 1, 2, \dots, d \\ 0 & \text{otherwise} \end{cases}$$

Page 14

Window Function

Centering the hypercube at $\mathbf{x}_i$:
$$\varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h_n}\right) = \begin{cases} 1 & |x_j - x_{ij}| \le h_n/2,\ j = 1, 2, \dots, d \\ 0 & \text{otherwise} \end{cases}$$

Page 15

Parzen-Window Estimation

Given samples $\mathcal{D} = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\}$, let $k_n$ be the number of samples inside the hypercube of edge $h_n$ centered at $\mathbf{x}$:
$$k_n = \sum_{i=1}^{n} \varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h_n}\right)$$

The Parzen-window estimate is then
$$p_n(\mathbf{x}) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{V_n}\,\varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h_n}\right), \qquad V_n = h_n^d$$
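The hypercube estimate above translates directly into code. This is a minimal sketch (the data, bandwidth, and function names are made up for illustration):

```python
def phi(u):
    # Unit hypercube window: 1 when every coordinate lies in [-1/2, 1/2].
    return 1.0 if all(abs(uj) <= 0.5 for uj in u) else 0.0

def parzen_estimate(x, data, h):
    # p_n(x) = (1/n) * sum_i (1/V_n) * phi((x - x_i)/h_n), with V_n = h^d.
    d, n = len(x), len(data)
    V = h ** d
    total = sum(phi([(xj - xij) / h for xj, xij in zip(x, xi)]) for xi in data)
    return total / (n * V)

data = [[0.0], [0.2], [0.25], [1.0], [1.1]]
print(parzen_estimate([0.2], data, 0.5))   # 3 of 5 samples fall inside the cube
```

Here three of the five samples land in the interval of length 0.5 around 0.2, giving (3/5)/0.5 = 1.2.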

Page 16

Generalization

The window need not be a hypercube; any $\varphi$ works as long as $p_n$ remains a density. The requirement is
$$\int p_n(\mathbf{x})\,d\mathbf{x} = 1$$

Indeed, substituting $\mathbf{u} = (\mathbf{x} - \mathbf{x}_i)/h_n$ (so that $d\mathbf{x} = h_n^d\,d\mathbf{u} = V_n\,d\mathbf{u}$):
$$\int \frac{1}{n}\sum_{i=1}^{n} \frac{1}{V_n}\,\varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h_n}\right) d\mathbf{x} = \frac{1}{n}\sum_{i=1}^{n} \int \varphi(\mathbf{u})\,d\mathbf{u} = 1$$

So it suffices that $\int \varphi(\mathbf{u})\,d\mathbf{u} = 1$. Note that $h_n$ is an important parameter; it depends on the sample size.

Page 17

Interpolation Parameter

Define
$$\delta_n(\mathbf{x}) = \frac{1}{V_n}\,\varphi\!\left(\frac{\mathbf{x}}{h_n}\right)$$
so that $p_n(\mathbf{x}) = \frac{1}{n}\sum_{i=1}^{n} \delta_n(\mathbf{x} - \mathbf{x}_i)$.

As $h_n \to 0$, $\delta_n(\mathbf{x})$ approaches a Dirac delta function.

Page 18

Example

Parzen-window estimates for five samples:
$$p_n(\mathbf{x}) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{V_n}\,\varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h_n}\right)$$

Page 19

Convergence Conditions

To assure convergence, i.e.,
$$\lim_{n\to\infty} E[p_n(\mathbf{x})] = p(\mathbf{x}) \quad\text{and}\quad \lim_{n\to\infty} \mathrm{Var}[p_n(\mathbf{x})] = 0,$$
we have the following additional constraints:
$$\sup_{\mathbf{u}} \varphi(\mathbf{u}) < \infty$$
$$\lim_{\|\mathbf{u}\|\to\infty} \varphi(\mathbf{u}) \prod_{i=1}^{d} u_i = 0$$
$$\lim_{n\to\infty} V_n = 0, \qquad \lim_{n\to\infty} nV_n = \infty$$

Page 20

Illustrations

One-dimensional case, Gaussian window:
$$\varphi(u) = \frac{1}{\sqrt{2\pi}}\,e^{-u^2/2}$$
$$p_n(x) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{h_n}\,\varphi\!\left(\frac{x - x_i}{h_n}\right)$$
with samples drawn from $X \sim N(0,1)$ and $h_n = h_1/\sqrt{n}$.
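The one-dimensional Gaussian-window estimate, with the shrinking width $h_n = h_1/\sqrt{n}$, can be sketched as follows (a minimal illustration; the three-point sample is made up):

```python
import math

def gaussian_parzen(x, data, h1):
    # 1-D Parzen estimate with a Gaussian window phi(u) = exp(-u^2/2)/sqrt(2*pi)
    # and a width shrinking with sample size: h_n = h1 / sqrt(n).
    n = len(data)
    hn = h1 / math.sqrt(n)
    return sum(math.exp(-0.5 * ((x - xi) / hn) ** 2) / math.sqrt(2 * math.pi)
               for xi in data) / (n * hn)

data = [-1.0, 0.0, 1.0]            # made-up sample for illustration
print(gaussian_parzen(0.0, data, h1=1.0))
```

Unlike the hypercube window, the Gaussian window yields a smooth estimate; $h_1$ plays the role of the initial window width discussed below.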

Page 21

Illustrations

One-dimensional case (continued), Gaussian window with $h_n = h_1/\sqrt{n}$:
$$p_n(x) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{h_n}\,\varphi\!\left(\frac{x - x_i}{h_n}\right)$$

Page 22

Illustrations

Two-dimensional case:
$$p_n(\mathbf{x}) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{h_n^2}\,\varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h_n}\right), \qquad h_n = h_1/\sqrt{n}$$

Page 23

Classification Example

Smaller window vs. larger window.

Page 24

Choosing the Window Function

- $V_n$ must approach zero as $n \to \infty$, but at a rate slower than $1/n$, e.g., $V_n = V_1/\sqrt{n}$.
- The value of the initial volume $V_1$ is important.
- In some cases, a cell volume proper for one region may be unsuitable in a different region.

Page 25

PNN (Probabilistic Neural Network)

Use a Gaussian window
$$\varphi(\mathbf{u}) = \frac{1}{(2\pi)^{d/2}\,|\boldsymbol{\Sigma}|^{1/2}} \exp\!\left(-\tfrac{1}{2}\,\mathbf{u}^T \boldsymbol{\Sigma}^{-1} \mathbf{u}\right)$$

With $\boldsymbol{\Sigma} = \mathbf{I}$ and training set $\mathcal{D} = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\}$:
$$\varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_k}{h_n}\right) = \frac{1}{(2\pi)^{d/2}} \exp\!\left(-\frac{(\mathbf{x} - \mathbf{x}_k)^T(\mathbf{x} - \mathbf{x}_k)}{2h_n^2}\right)$$

Hence
$$p_n(\mathbf{x}) = \frac{1}{n}\sum_{k=1}^{n} \frac{1}{V_n}\,\varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_k}{h_n}\right) = \frac{1}{n}\sum_{k=1}^{n} \frac{1}{V_n (2\pi)^{d/2}} \exp\!\left(-\frac{\mathbf{x}^T\mathbf{x} - 2\mathbf{x}_k^T\mathbf{x} + \mathbf{x}_k^T\mathbf{x}_k}{2h_n^2}\right)$$

Page 26

PNN (Probabilistic Neural Network)

In the expansion of $p_n(\mathbf{x})$, the factor $\exp(-\mathbf{x}^T\mathbf{x}/2h_n^2)$ and the constant $1/(V_n(2\pi)^{d/2})$ are irrelevant for discriminant analysis, while $\mathbf{x}_k^T\mathbf{x}_k/2$ acts as a bias. Writing $\mathbf{w}_k = \mathbf{x}_k$ and
$$net_k = \mathbf{w}_k^T \mathbf{x}, \qquad a_k(net_k) = \exp\!\left(\frac{net_k - \mathbf{x}_k^T\mathbf{x}_k/2}{h_n^2}\right),$$
we obtain, up to factors irrelevant for discriminant analysis,
$$p_n(\mathbf{x}) \propto \sum_{k=1}^{n} a_k(net_k)$$
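The pattern-unit computation can be sketched as follows. This is a simplified illustration assuming patterns normalized to unit length (so $\|\mathbf{x} - \mathbf{w}_k\|^2 = 2(1 - net_k)$ and the Gaussian window reduces to $\exp((net_k - 1)/h^2)$); the prototypes, labels, and function names are made up:

```python
import math

def normalize(v):
    s = math.sqrt(sum(vj * vj for vj in v))
    return [vj / s for vj in v]

def pnn_classify(x, prototypes, labels, h):
    # With unit-length patterns, ||x - w_k||^2 = 2(1 - net_k), so the
    # Gaussian window becomes a_k = exp((net_k - 1) / h^2), net_k = w_k . x.
    x = normalize(x)
    scores = {}
    for w, label in zip(prototypes, labels):
        w = normalize(w)
        net = sum(wj * xj for wj, xj in zip(w, x))
        scores[label] = scores.get(label, 0.0) + math.exp((net - 1.0) / h ** 2)
    return max(scores, key=scores.get)   # class with maximum summed activation

protos = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
labels = ['A', 'A', 'B']
print(pnn_classify([1.0, 0.05], protos, labels, h=0.5))
```

Each stored pattern acts as one hidden unit; summing the activations per class gives the (unnormalized) class-conditional density estimates that the output units compare.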

Page 27

PNN (Probabilistic Neural Network)

[Figure: a single pattern unit.] Pattern unit $k$ receives inputs $x_1, x_2, \dots, x_d$ through weights $w_{k1}, w_{k2}, \dots, w_{kd}$ (with $\mathbf{w}_k = \mathbf{x}_k$), computes $net_k = \mathbf{w}_k^T\mathbf{x}$, and outputs the activation $a_k(net_k)$. Summing the activations gives
$$p_n(\mathbf{x}) \propto \sum_{k=1}^{n} a_k(net_k)$$

Page 28

PNN (Probabilistic Neural Network)

[Figure: the full network.] Partition the training set by class, $\mathcal{D} = \{\mathcal{D}_1, \mathcal{D}_2, \dots, \mathcal{D}_c\}$, and connect each pattern unit only to the output unit of its own class. From inputs $x_1, x_2, \dots, x_d$, output unit $i$ then computes $o_i \propto p(\mathbf{x} \mid \omega_i)$.

Page 29

PNN (Probabilistic Neural Network)

Assign each pattern to the class whose output unit $o_i \propto p(\mathbf{x} \mid \omega_i)$ has the maximum value.

1. Pros and cons of the approach?
2. How to deal with prior probabilities?

Page 30

Non-Parametric Estimation

k_n-Nearest-Neighbor Estimation

Page 31

Basic Concept

Let the cell volume depend on the training data. To estimate $p(\mathbf{x})$, center a cell about $\mathbf{x}$ and let it grow until it captures $k_n$ samples, where $k_n$ is some specified function of $n$, e.g., $k_n = \sqrt{n}$.
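The growing-cell idea can be sketched in one dimension (a minimal illustration; the data and the helper name `knn_density` are made up): sort the distances to $x$ and take the $k$-th smallest as the cell's radius.

```python
def knn_density(x, samples, k):
    # Grow an interval about x until it captures k samples; then
    # p(x) ~ (k/n) / V with V the interval's length (1-D case).
    dists = sorted(abs(s - x) for s in samples)
    radius = dists[k - 1]            # distance to the k-th nearest sample
    return (k / len(samples)) / (2 * radius)

samples = [0.0, 0.1, 0.2, 0.4, 0.9]  # made-up data for illustration
print(knn_density(0.15, samples, k=3))
```

In contrast to the Parzen approach, the volume adapts to the local sample density: dense regions give small cells, sparse regions give large ones.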

Page 32

Example

k_n = 5

Page 33

Example

$k_n = \sqrt{n}$

Page 34

Estimation of A Posteriori Probabilities

$P_n(\omega_i \mid \mathbf{x}) = {?}$

Page 35

Estimation of A Posteriori Probabilities

Place a cell of volume $V_n$ around $\mathbf{x}$ capturing $k$ samples, $k_i$ of which belong to class $\omega_i$. Then
$$p_n(\mathbf{x}, \omega_i) = \frac{k_i/n}{V_n}$$
and
$$P_n(\omega_i \mid \mathbf{x}) = \frac{p_n(\mathbf{x}, \omega_i)}{\sum_{j=1}^{c} p_n(\mathbf{x}, \omega_j)} = \frac{k_i}{k}$$

Page 36

Estimation of A Posteriori Probabilities

$$P_n(\omega_i \mid \mathbf{x}) = \frac{p_n(\mathbf{x}, \omega_i)}{\sum_{j=1}^{c} p_n(\mathbf{x}, \omega_j)} = \frac{k_i}{k}$$

The value of $V_n$ or $k_n$ can be determined based on the Parzen-window or the $k_n$-nearest-neighbor technique.
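The ratio $k_i/k$ is simple to compute directly (a minimal 1-D sketch with made-up data; `knn_posterior` is a hypothetical helper name): take the $k$ nearest samples and return the label fractions among them.

```python
from collections import Counter

def knn_posterior(x, samples, labels, k):
    # P_n(w_i | x) = k_i / k: the fraction of the k nearest samples
    # that carry label w_i.
    order = sorted(range(len(samples)), key=lambda i: abs(samples[i] - x))
    counts = Counter(labels[i] for i in order[:k])
    return {label: c / k for label, c in counts.items()}

samples = [0.0, 0.1, 0.2, 1.0, 1.1]
labels  = ['a', 'a', 'b', 'b', 'b']
print(knn_posterior(0.05, samples, labels, k=3))
```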

Page 37

Non-Parametric Estimation

Classification Techniques
• The Nearest-Neighbor Rule
• The k-Nearest-Neighbor Rule

Page 38

The Nearest-Neighbor Rule

Given a set of labeled prototypes $\mathcal{D} = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\}$, classify $\mathbf{x}$ with the label of its nearest prototype $\mathbf{x}'$.
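The rule itself is a one-liner over the prototype set (a minimal sketch; the prototypes and labels are made up for illustration):

```python
def nearest_neighbor(x, prototypes, labels):
    # 1-NN rule: label x with the class of the closest prototype.
    # Squared Euclidean distance suffices: the minimizer is the same.
    def d2(a, b):
        return sum((aj - bj) ** 2 for aj, bj in zip(a, b))
    best = min(range(len(prototypes)), key=lambda i: d2(x, prototypes[i]))
    return labels[best]

protos = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
labels = ['red', 'blue', 'red']
print(nearest_neighbor([0.2, 0.1], protos, labels))
```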

Page 39

The Nearest-Neighbor Rule

Voronoi Tessellation

Page 40

Error Rate

Bayes (optimum) rule: choose $\omega_m = \arg\max_i P(\omega_i \mid \mathbf{x})$, so
$$P^*(error \mid \mathbf{x}) = 1 - P(\omega_m \mid \mathbf{x})$$
$$P^*(error) = \int P^*(error \mid \mathbf{x})\,p(\mathbf{x})\,d\mathbf{x}$$

Page 41

Error Rate

1-NN: the labeled samples are $\mathcal{D} = \{(\mathbf{x}_1, \theta_1), (\mathbf{x}_2, \theta_2), \dots, (\mathbf{x}_n, \theta_n)\}$ with $\theta_i \in \{\omega_1, \dots, \omega_c\}$, and $\mathbf{x}$ is classified with the label of its nearest neighbor $\mathbf{x}'$.

An error occurs when the true classes of $\mathbf{x}$ and $\mathbf{x}'$ differ:
$$P(error \mid \mathbf{x}, \mathbf{x}') = 1 - \sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})\,P(\omega_i \mid \mathbf{x}')$$
$$P(error \mid \mathbf{x}) = \int P(error \mid \mathbf{x}, \mathbf{x}')\,p(\mathbf{x}' \mid \mathbf{x})\,d\mathbf{x}'$$

Compare with the optimum: $P^*(error \mid \mathbf{x}) = 1 - P(\omega_m \mid \mathbf{x})$, $P^*(error) = \int P^*(error \mid \mathbf{x})\,p(\mathbf{x})\,d\mathbf{x}$.

Page 42

Error Rate

$$P(error \mid \mathbf{x}) = \int P(error \mid \mathbf{x}, \mathbf{x}')\,p(\mathbf{x}' \mid \mathbf{x})\,d\mathbf{x}' = {?}$$

As $n \to \infty$, the nearest neighbor $\mathbf{x}'$ converges to $\mathbf{x}$, i.e., $p(\mathbf{x}' \mid \mathbf{x}) \to \delta(\mathbf{x}' - \mathbf{x})$, so
$$P(error \mid \mathbf{x}) = 1 - \sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})\,P(\omega_i \mid \mathbf{x}') \;\longrightarrow\; 1 - \sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})^2$$

Page 43

Error Rate

1-NN asymptotic error rate:
$$P(error \mid \mathbf{x}) = 1 - \sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})^2$$
$$P(error) = \int P(error \mid \mathbf{x})\,p(\mathbf{x})\,d\mathbf{x} = \int \left[1 - \sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})^2\right] p(\mathbf{x})\,d\mathbf{x}$$

Page 44

Error Bounds

Bayes: $P^*(error) = \int [1 - P(\omega_m \mid \mathbf{x})]\,p(\mathbf{x})\,d\mathbf{x}$
1-NN: $P(error) = \int \left[1 - \sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})^2\right] p(\mathbf{x})\,d\mathbf{x}$

Consider the most complex classification case: $P(\omega_i \mid \mathbf{x}) = 1/c$ for all $i$. Then
$$P(error) = \int (1 - 1/c)\,p(\mathbf{x})\,d\mathbf{x} = 1 - 1/c = P^*(error)$$
In general, how are $P^*(error)$ and $P(error)$ related?

Page 45

Error Bounds

Bayes: $P^*(error) = \int [1 - P(\omega_m \mid \mathbf{x})]\,p(\mathbf{x})\,d\mathbf{x}$
1-NN: $P(error) = \int \left[1 - \sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})^2\right] p(\mathbf{x})\,d\mathbf{x}$

Consider the opposite case: $P(\omega_m \mid \mathbf{x}) \approx 1$. Split the sum:
$$\sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})^2 = P(\omega_m \mid \mathbf{x})^2 + \sum_{i \ne m} P(\omega_i \mid \mathbf{x})^2$$
To find the upper bound on $P(error \mid \mathbf{x})$, minimize this sum. With $P^*(error \mid \mathbf{x}) = 1 - P(\omega_m \mid \mathbf{x})$, the first term is
$$P(\omega_m \mid \mathbf{x})^2 = [1 - P^*(error \mid \mathbf{x})]^2 = 1 - 2P^*(error \mid \mathbf{x}) + P^*(error \mid \mathbf{x})^2$$
The second term is minimized when all of its elements have the same value, i.e.,
$$P(\omega_i \mid \mathbf{x}) = \frac{1 - P(\omega_m \mid \mathbf{x})}{c - 1} = \frac{P^*(error \mid \mathbf{x})}{c - 1}, \qquad i \ne m$$
Therefore
$$\sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})^2 \ge 1 - 2P^*(error \mid \mathbf{x}) + \frac{c}{c - 1}\,P^*(error \mid \mathbf{x})^2$$

Page 46

Error Bounds

Continuing from
$$\sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})^2 \ge 1 - 2P^*(error \mid \mathbf{x}) + \frac{c}{c - 1}\,P^*(error \mid \mathbf{x})^2,$$
we get
$$P(error \mid \mathbf{x}) = 1 - \sum_{i=1}^{c} P(\omega_i \mid \mathbf{x})^2 \le 2P^*(error \mid \mathbf{x}) - \frac{c}{c - 1}\,P^*(error \mid \mathbf{x})^2 \le 2P^*(error \mid \mathbf{x})$$
Integrating against $p(\mathbf{x})$:
$$\int P(error \mid \mathbf{x})\,p(\mathbf{x})\,d\mathbf{x} \le 2\int P^*(error \mid \mathbf{x})\,p(\mathbf{x})\,d\mathbf{x}$$
$$\Longrightarrow\quad P^*(error) \le P(error) \le 2P^*(error)$$

Page 47

Error Bounds

$$P^*(error) \le P(error) \le 2P^*(error)$$

The nearest-neighbor rule is a suboptimal procedure, but its asymptotic error rate is never worse than twice the Bayes rate.

Page 48

Error Bounds

Page 49

The k-Nearest-Neighbor Rule

k = 5: assign the pattern to the class that wins the majority among its k nearest prototypes.
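The majority vote can be sketched as follows (a minimal 1-D illustration with made-up data; `knn_classify` is a hypothetical helper name):

```python
from collections import Counter

def knn_classify(x, prototypes, labels, k):
    # k-NN rule: find the k closest prototypes and assign the label
    # that wins the majority vote among them.
    def d2(a, b):
        return sum((aj - bj) ** 2 for aj, bj in zip(a, b))
    order = sorted(range(len(prototypes)), key=lambda i: d2(x, prototypes[i]))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]

protos = [[0.0], [0.1], [0.2], [1.0], [1.1]]
labels = ['a', 'a', 'b', 'b', 'b']
print(knn_classify([0.15], protos, labels, k=3))
```

With k = 1 this reduces to the nearest-neighbor rule; larger k smooths the decision at the cost of coarser boundaries.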

Page 50

Error Bounds

Page 51

Computation Complexity

The computational complexity of the nearest-neighbor algorithm (in both time and space) has received a great deal of analysis.
- Space: O(dn) to store the n prototypes of a training set. Remedies: editing, pruning, or condensing.
- Time: O(dn) to search the nearest neighbor of a d-dimensional test point x. Remedies: partial distance, search trees.

Page 52

Partial Distance

Use the following fact to discard far-away prototypes early: for $r \le d$,
$$D_r(\mathbf{a}, \mathbf{b}) = \left(\sum_{k=1}^{r} (a_k - b_k)^2\right)^{1/2} \le \left(\sum_{k=1}^{d} (a_k - b_k)^2\right)^{1/2} = D_d(\mathbf{a}, \mathbf{b})$$
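A sketch of the pruning loop this inequality enables (hypothetical data and function name; squared distances are used throughout, which preserves the ordering):

```python
def partial_distance_search(x, prototypes):
    # Accumulate the squared distance coordinate by coordinate and abandon
    # a prototype as soon as the partial sum exceeds the best full distance
    # found so far (valid because D_r(a,b) <= D_d(a,b) for r <= d).
    best_i, best_d2 = -1, float('inf')
    for i, p in enumerate(prototypes):
        acc = 0.0
        for xj, pj in zip(x, p):
            acc += (xj - pj) ** 2
            if acc >= best_d2:       # already farther than the current best
                break
        else:                        # all d coordinates done and acc < best_d2
            best_i, best_d2 = i, acc
    return best_i

protos = [[3.0, 4.0], [1.0, 1.0], [0.0, 0.5]]
print(partial_distance_search([0.0, 0.0], protos))   # index of the nearest prototype
```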

Page 53

Editing Nearest Neighbor

Given a set of points, a Voronoi diagram is a partition of space into regions, within which all points are closer to some particular node than to any other node.

Page 54

Delaunay Triangulation

If two Voronoi regions share a boundary, the nodes of these regions are connected with an edge. Such nodes are called the Voronoi neighbors (or Delaunay neighbors).

Page 55

The Decision Boundary

The circled prototypes are redundant.

Page 56

The Edited Training Set

Page 57

Editing: The Voronoi Diagram Approach

1. Compute the Delaunay triangulation of the training set.
2. Visit each node, marking it if all of its Delaunay neighbors belong to the same class as the current node.
3. Delete all marked nodes, exiting with the remaining ones as the edited training set.

Page 58

Editing: Other Approaches

- The Gabriel Graph Approach
- The Relative Neighbour Graph Approach

References:
- Binay K. Bhattacharya, Ronald S. Poulsen, Godfried T. Toussaint, "Application of Proximity Graphs to Editing Nearest Neighbor Decision Rule", International Symposium on Information Theory, Santa Monica, 1981.
- T.M. Cover, P.E. Hart, "Nearest Neighbor Pattern Classification", IEEE Transactions on Information Theory, vol. IT-13, no. 1, 1967, pp. 21-27.
- V. Klee, "On the Complexity of d-Dimensional Voronoi Diagrams", Archiv der Mathematik, vol. 34, 1980, pp. 75-80.
- Godfried T. Toussaint, "The Relative Neighborhood Graph of a Finite Planar Set", Pattern Recognition, vol. 12, no. 4, 1980, pp. 261-268.

Page 59

Non-Parametric Estimation

Distance Metrics

Page 60

Nearest-Neighbor Classifier

Distance measurement is an important factor for a nearest-neighbor classifier, e.g., to achieve invariant pattern recognition. Illustrated here: the effect of changing units.

Page 61

Nearest-Neighbor Classifier

Distance measurement is an important factor for a nearest-neighbor classifier, e.g., to achieve invariant pattern recognition. Illustrated here: the effect of translation.

Page 62

Properties of a Distance Metric

- Nonnegativity: $D(\mathbf{a}, \mathbf{b}) \ge 0$
- Reflexivity: $D(\mathbf{a}, \mathbf{b}) = 0$ iff $\mathbf{a} = \mathbf{b}$
- Symmetry: $D(\mathbf{a}, \mathbf{b}) = D(\mathbf{b}, \mathbf{a})$
- Triangle Inequality: $D(\mathbf{a}, \mathbf{b}) + D(\mathbf{b}, \mathbf{c}) \ge D(\mathbf{a}, \mathbf{c})$

Page 63

Minkowski Metric (Lp Norm)

$$\|\mathbf{x}\|_p = \left(\sum_{i=1}^{d} |x_i|^p\right)^{1/p}$$

1. L1 norm: Manhattan or city-block distance
$$L_1(\mathbf{a}, \mathbf{b}) = \|\mathbf{a} - \mathbf{b}\|_1 = \sum_{i=1}^{d} |a_i - b_i|$$

2. L2 norm: Euclidean distance
$$L_2(\mathbf{a}, \mathbf{b}) = \|\mathbf{a} - \mathbf{b}\|_2 = \left(\sum_{i=1}^{d} |a_i - b_i|^2\right)^{1/2}$$

3. L∞ norm: chessboard distance
$$L_\infty(\mathbf{a}, \mathbf{b}) = \|\mathbf{a} - \mathbf{b}\|_\infty = \lim_{p\to\infty}\left(\sum_{i=1}^{d} |a_i - b_i|^p\right)^{1/p} = \max_i |a_i - b_i|$$
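All three cases fit one small function (a minimal sketch; `minkowski` is a hypothetical helper name):

```python
def minkowski(a, b, p):
    # L_p distance: (sum_i |a_i - b_i|^p)^(1/p).
    # p = float('inf') gives the chessboard (max) distance.
    diffs = [abs(ai - bi) for ai, bi in zip(a, b)]
    if p == float('inf'):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

a, b = [0.0, 0.0], [3.0, 4.0]
print(minkowski(a, b, 1), minkowski(a, b, 2), minkowski(a, b, float('inf')))
```

For the pair (0,0) and (3,4) this gives the familiar 7 (city block), 5 (Euclidean), and 4 (chessboard), showing how the choice of p changes which prototype counts as "nearest".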

Page 64

Minkowski Metric (Lp Norm)

$$\|\mathbf{x}\|_p = \left(\sum_{i=1}^{d} |x_i|^p\right)^{1/p}$$