Introduction to Support Vector Machine, Lucas Xu, September 4, 2012

Support Vector Machine


DESCRIPTION

rudimentary quick intro to SVM


Page 1: Support Vector Machine

Introduction to Support Vector Machine

Lucas Xu

September 4, 2012


Page 2: Support Vector Machine

1 Classifier

2 Hyper-Plane

3 Convex Optimization

4 Kernel

5 Application


Page 3: Support Vector Machine

Classifier

Attributes and Class Labels

Training Data

$$S = \{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}, \quad x^{(i)} \in \mathbb{R}^d, \quad y^{(i)} \in \{-1, 1\}$$


Page 4: Support Vector Machine

Classifier

Umeng Gender Classification Data

user    app1  app2  ...  appd  gender
user1   1     0     ...  0     male
user2   0     1     ...  1     female
...     ...   ...   ...  ...   ...
usern   1     1     ...  1     female

Each app belongs to one category; there are about 20 categories.

Categories are mutually exclusive.


Page 5: Support Vector Machine

Classifier

Umeng Gender Classification Data

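$$S = \{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}, \quad x^{(i)} \in \mathbb{R}^d, \quad y^{(i)} \in \{-1, 1\}$$

$x^{(i)}_k \in \{0, 1\}$: 0 means not installed, 1 means installed on the device

$1 \le k \le d$, $d \approx 30{,}000$ (about 30,000 apps)

$y^{(i)} \in \{\text{male}, \text{female}\}$, encoded as the two labels in $\{-1, 1\}$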
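A minimal sketch of how one user's installed apps could be turned into the 0/1 attribute vector described above. The app names, the `app_index` vocabulary and the `encode_user` helper are hypothetical and only illustrate the encoding; the real data has about 30,000 columns.

```python
from typing import Dict, List, Set


def encode_user(installed: Set[str], app_index: Dict[str, int]) -> List[int]:
    """Return the attribute vector x: x[k] = 1 if app k is installed, else 0."""
    x = [0] * len(app_index)
    for app in installed:
        k = app_index.get(app)
        if k is not None:  # apps outside the vocabulary are ignored
            x[k] = 1
    return x


# Hypothetical vocabulary with d = 4 apps.
app_index = {"app1": 0, "app2": 1, "app3": 2, "app4": 3}
x_user1 = encode_user({"app1", "app4"}, app_index)
print(x_user1)  # [1, 0, 0, 1]
```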


Page 6: Support Vector Machine

Hyper-Plane

Figure : Hyper Plane

The hyper-plane: $w^T x + b = 0$

Classification function: $h_{w,b}(x) = g(w^T x + b)$

$$g(z) = \begin{cases} 1 & \text{if } z \ge 0 \\ -1 & \text{otherwise} \end{cases}$$
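The classification function can be sketched in a few lines of Python. The weights `w`, the offset `b` and the test points are made-up values chosen for illustration, not parameters from the slides.

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def g(z: float) -> int:
    """Threshold function: 1 if z >= 0, otherwise -1."""
    return 1 if z >= 0 else -1


def h(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Classify x by which side of the hyper-plane w.x + b = 0 it falls on."""
    return g(dot(w, x) + b)


w, b = [1.0, -1.0], -0.5          # assumed hyper-plane
print(h(w, b, [2.0, 0.0]))        # w.x + b = 1.5, so the label is 1
print(h(w, b, [0.0, 1.0]))        # w.x + b = -1.5, so the label is -1
```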


Page 7: Support Vector Machine

Hyper-Plane

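Functional margin: $\hat{\gamma}^{(i)} = y^{(i)}(w^T x^{(i)} + b)$

Scaling: rescaling $(w, b)$ changes $\hat{\gamma}^{(i)}$ without moving the hyper-plane, so impose the normalization condition $\|w\| = 1$.

Geometric margin:

$$\gamma^{(i)} = y^{(i)}\left(\left(\frac{w}{\|w\|}\right)^T x^{(i)} + \frac{b}{\|w\|}\right)$$

$\gamma^{(i)}$ should be a large positive number to increase the prediction confidence.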
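A small numeric sketch of the two margins, under made-up values for `w`, `b` and one training example. It illustrates why the scaling issue matters: doubling `(w, b)` doubles the functional margin but leaves the geometric margin unchanged.

```python
import math
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def functional_margin(w, b, x, y) -> float:
    """y * (w.x + b): positive when x is on the correct side."""
    return y * (dot(w, x) + b)


def geometric_margin(w, b, x, y) -> float:
    """Functional margin divided by ||w||: signed distance to the hyper-plane."""
    return functional_margin(w, b, x, y) / math.sqrt(dot(w, w))


w, b = [3.0, 4.0], -5.0            # assumed hyper-plane, ||w|| = 5
x, y = [3.0, 4.0], 1               # assumed training example

print(functional_margin(w, b, x, y))   # 20.0
print(geometric_margin(w, b, x, y))    # 4.0

# Rescale (w, b) by 2: same hyper-plane, different functional margin.
w2, b2 = [2 * wk for wk in w], 2 * b
print(functional_margin(w2, b2, x, y))  # 40.0
print(geometric_margin(w2, b2, x, y))   # 4.0
```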


Page 8: Support Vector Machine

Hyper-Plane

Definition

The geometric margin of $(w, b)$ with respect to the training dataset $S$:

$$\gamma = \min_{i=1,\dots,m} \gamma^{(i)}$$
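As a sketch, the margin of a hyper-plane with respect to a dataset is the smallest per-example geometric margin. The four-point dataset `S` and both hyper-planes below are invented for illustration.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (x, y) with y in {-1, 1}


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def dataset_margin(w: Sequence[float], b: float, S: List[Example]) -> float:
    """Minimum geometric margin over all training examples."""
    norm = math.sqrt(dot(w, w))
    return min(y * (dot(w, x) + b) / norm for x, y in S)


S = [([2.0, 0.0], 1), ([3.0, 1.0], 1), ([0.0, 0.0], -1), ([-1.0, 2.0], -1)]

good = dataset_margin([1.0, 0.0], -1.0, S)   # separates S, margin 1.0
bad = dataset_margin([-1.0, 0.0], 1.0, S)    # flipped orientation, margin -2.0
print(good, bad)
```

A negative value means at least one example lies on the wrong side of the hyper-plane.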


Page 9: Support Vector Machine

Hyper-Plane

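The optimal margin classifier (intuition): find a decision boundary that maximizes the margin.

$$\begin{aligned} \max_{\gamma, w, b} \quad & \gamma \\ \text{s.t.} \quad & y^{(i)}(w^T x^{(i)} + b) \ge \gamma, \quad i = 1, \dots, m \\ & \|w\| = 1. \end{aligned}$$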

Figure : Hyper Plane

How to solve?


Page 10: Support Vector Machine

Hyper-Plane

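Normalization constraint: fix the functional margin at $\hat{\gamma} = 1$, so the geometric margin is $1 / \|w\|$.

$$\begin{aligned} \max_{w, b} \quad & \frac{1}{\|w\|} \\ \text{s.t.} \quad & y^{(i)}(w^T x^{(i)} + b) \ge 1, \quad i = 1, \dots, m \end{aligned}$$

Maximizing $1 / \|w\|$ is equivalent to minimizing $\frac{1}{2}\|w\|^2$:

$$\begin{aligned} \min_{w, b} \quad & \frac{1}{2}\|w\|^2 \\ \text{s.t.} \quad & y^{(i)}(w^T x^{(i)} + b) \ge 1, \quad i = 1, \dots, m \end{aligned}$$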
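The sketch below is not a solver. It only checks the constraints $y^{(i)}(w^T x^{(i)} + b) \ge 1$ for a few hand-picked candidates on a two-point toy dataset and compares $\frac{1}{2}\|w\|^2$ among the feasible ones. The dataset and candidate names are assumptions; the point is that the feasible candidate with the smallest $\|w\|$ has the widest margin $1/\|w\|$.

```python
import math
from typing import Dict, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def is_feasible(w, b, S: List[Example], tol: float = 1e-9) -> bool:
    """True if every example has functional margin at least 1."""
    return all(y * (dot(w, x) + b) >= 1 - tol for x, y in S)


def objective(w) -> float:
    """The quantity (1/2) * ||w||^2."""
    return 0.5 * dot(w, w)


S = [([2.0, 0.0], 1), ([0.0, 0.0], -1)]            # toy dataset
candidates: Dict[str, Tuple[List[float], float]] = {
    "wide": ([1.0, 0.0], -1.0),
    "narrow": ([2.0, 0.0], -2.0),
    "infeasible": ([0.5, 0.0], -0.5),
}

feasible = {name: wb for name, wb in candidates.items() if is_feasible(*wb, S)}
best = min(feasible, key=lambda name: objective(feasible[name][0]))
for name, (w, b) in feasible.items():
    print(name, objective(w), 1 / math.sqrt(dot(w, w)))  # objective, margin
print("best:", best)  # best: wide
```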


Page 11: Support Vector Machine

Hyper-Plane

Convex function

Convex set

So-called Quadratic Programming. There are many software packages to solve the problem.

Basic ideas for Support Vector Machine: DONE!

More efficient solution?


Page 16: Support Vector Machine

Convex Optimization

Primal Problem:

$$\begin{aligned} \min_{w, b} \quad & \frac{1}{2}\|w\|^2 \\ \text{s.t.} \quad & y^{(i)}(w^T x^{(i)} + b) \ge 1, \quad i = 1, \dots, m \end{aligned}$$


Page 17: Support Vector Machine

Convex Optimization

Lagrangian for the original problem:

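$$\min_{w, b} \max_{\alpha : \alpha_i \ge 0} L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{m} \alpha_i \left[ y^{(i)}(w^T x^{(i)} + b) - 1 \right]$$

⇓

Under the K.K.T. conditions, it transforms to its dual problem:

$$\begin{aligned} \max_{\alpha} \quad & W(\alpha) = \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{m} y^{(i)} y^{(j)} \alpha_i \alpha_j \langle x^{(i)}, x^{(j)} \rangle \\ \text{s.t.} \quad & \alpha_i \ge 0, \quad i = 1, \dots, m \\ & \sum_{i=1}^{m} \alpha_i y^{(i)} = 0 \end{aligned}$$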
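A sketch that evaluates the dual objective $W(\alpha)$ and checks the two dual constraints on a two-point toy dataset (an assumption, not data from the slides). For this dataset the constraint $\sum_i \alpha_i y^{(i)} = 0$ forces $\alpha_1 = \alpha_2 = a$, and $W$ reduces to $2a - 2a^2$, which peaks at $a = 0.5$.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def dual_objective(alpha: Sequence[float], S: List[Example]) -> float:
    """W(alpha) = sum_i alpha_i - 1/2 sum_ij y_i y_j alpha_i alpha_j <x_i, x_j>."""
    quad = 0.0
    for a_i, (x_i, y_i) in zip(alpha, S):
        for a_j, (x_j, y_j) in zip(alpha, S):
            quad += y_i * y_j * a_i * a_j * dot(x_i, x_j)
    return sum(alpha) - 0.5 * quad


def dual_feasible(alpha: Sequence[float], S: List[Example], tol: float = 1e-9) -> bool:
    """Check alpha_i >= 0 and sum_i alpha_i y_i = 0."""
    balance = sum(a * y for a, (_, y) in zip(alpha, S))
    return all(a >= -tol for a in alpha) and abs(balance) <= tol


S = [([2.0, 0.0], 1), ([0.0, 0.0], -1)]   # toy dataset
for a in (0.25, 0.5, 0.75):
    print(a, dual_objective([a, a], S))   # 0.375, 0.5, 0.375
```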


Page 18: Support Vector Machine

Convex Optimization

Solutions:

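$$w^* = \sum_{i=1}^{m} \alpha_i y^{(i)} x^{(i)}$$

$$b^* = -\frac{\max_{i : y^{(i)} = -1} w^{*T} x^{(i)} + \min_{i : y^{(i)} = 1} w^{*T} x^{(i)}}{2}$$

Predict:

$$\begin{aligned} g(x) &= w^T x + b \\ &= \left( \sum_{i=1}^{m} \alpha_i y^{(i)} x^{(i)} \right)^T x + b \\ &= \sum_{i=1}^{m} \alpha_i y^{(i)} \langle x^{(i)}, x \rangle + b \end{aligned}$$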
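A sketch of recovering $w^*$ and $b^*$ from the multipliers and of predicting with inner products only. The two-point dataset and the value `alpha = [0.5, 0.5]` are assumptions: for this toy set the dual objective along the feasible line $\alpha_1 = \alpha_2 = a$ is $2a - 2a^2$, so $a = 0.5$ is its maximizer.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def recover_w(alpha: Sequence[float], S: List[Example]) -> List[float]:
    """w* = sum_i alpha_i y_i x_i."""
    w = [0.0] * len(S[0][0])
    for a, (x, y) in zip(alpha, S):
        for k, xk in enumerate(x):
            w[k] += a * y * xk
    return w


def recover_b(w: Sequence[float], S: List[Example]) -> float:
    """b* = -(max over negatives of w.x + min over positives of w.x) / 2."""
    neg = max(dot(w, x) for x, y in S if y == -1)
    pos = min(dot(w, x) for x, y in S if y == 1)
    return -(neg + pos) / 2


def decision_value(alpha, b, S: List[Example], x: Sequence[float]) -> float:
    """sum_i alpha_i y_i <x_i, x> + b, using inner products only."""
    return sum(a * y * dot(x_i, x) for a, (x_i, y) in zip(alpha, S)) + b


S = [([2.0, 0.0], 1), ([0.0, 0.0], -1)]   # toy dataset
alpha = [0.5, 0.5]                        # assumed dual solution for S

w_star = recover_w(alpha, S)
b_star = recover_b(w_star, S)
x_new = [3.0, 5.0]
print(w_star, b_star)                             # [1.0, 0.0] -1.0
print(decision_value(alpha, b_star, S, x_new))    # 2.0
print(dot(w_star, x_new) + b_star)                # 2.0, same value via w*
```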


Page 19: Support Vector Machine

Kernel

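For most $i$, $\alpha_i = 0$.

The examples $(x^{(i)}, y^{(i)})$ with $\alpha_i > 0$ are called support vectors.

Prediction only needs the inner products $\langle x^{(i)}, x \rangle$.

If we map the feature space $(x^{(i)}_1, x^{(i)}_2, \dots, x^{(i)}_k)$ to another, higher-dimensional space $(z^{(i)}_1, z^{(i)}_2, \dots, z^{(i)}_l)$ with $z = \phi(x)$, the inner product becomes $\langle \phi(x^{(i)}), \phi(x) \rangle$.

With a kernel we can easily compute $\langle z^{(i)}, z \rangle = K(x^{(i)}, x)$ from the original vectors, without forming $\phi$ explicitly.

Use a slightly different notation:

$$K(x, y) = \langle \phi(x), \phi(y) \rangle$$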

Intuitive Explanation: Measure of Similarities
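A sketch of the identity $K(x, y) = \langle \phi(x), \phi(y) \rangle$ for one concrete case chosen for illustration: the degree-2 polynomial kernel $(\langle x, y \rangle + 1)^2$ on 2-D inputs, whose explicit feature map has six coordinates. The input vectors are made up.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def phi(x: Sequence[float]) -> List[float]:
    """Explicit feature map of (<x, y> + 1)^2 for 2-D input."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]


def poly2_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Same inner product, computed in the original 2-D space."""
    return (dot(x, y) + 1.0) ** 2


x, y = [1.0, 2.0], [3.0, -1.0]
explicit = dot(phi(x), phi(y))   # inner product in the 6-D space
implicit = poly2_kernel(x, y)    # kernel evaluation in the 2-D space
print(explicit, implicit)        # both equal 4 up to rounding
```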


Page 20: Support Vector Machine

Kernel

Definition

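Mercer kernel: $K$ is symmetric positive semi-definite, i.e. for any finite set of points the Gram matrix $K_{ij} = K(x^{(i)}, x^{(j)})$ is positive semi-definite.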
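A numerical spot check of the positive semi-definite property, assuming an RBF kernel with a made-up `gamma` and random 2-D points. It tests $z^T K z \ge 0$ for random vectors $z$, which is a necessary condition only and not a proof.

```python
import math
import random
from typing import Callable, List, Sequence


def rbf(x: Sequence[float], y: Sequence[float], gamma: float = 0.5) -> float:
    """RBF kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def gram(points: List[Sequence[float]], kernel: Callable) -> List[List[float]]:
    """Gram matrix K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def quadratic_form(K: List[List[float]], z: Sequence[float]) -> float:
    """z^T K z."""
    n = len(z)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))


rng = random.Random(0)  # fixed seed so the check is repeatable
points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(5)]
K = gram(points, rbf)

# Smallest z^T K z seen over 200 random directions z.
worst = min(
    quadratic_form(K, [rng.gauss(0, 1) for _ in points]) for _ in range(200)
)
print(worst >= -1e-9)  # True
```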


Page 21: Support Vector Machine

Kernel

Primitive: $\langle x, y \rangle$

Polynomial: $(\langle x, y \rangle + 1)^d$

RBF: $\exp(-\gamma \|x - y\|^2)$

Sigmoid: $\tanh(\kappa \langle x, y \rangle + c)$

String

Tree
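The four closed-form kernels in the list can be sketched as plain Python functions. The default parameter values (`degree`, `gamma`, `kappa`, `c`) are arbitrary choices for the example, and the string and tree kernels are omitted. Note that the sigmoid form is not positive semi-definite for every choice of `kappa` and `c`.

```python
import math
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(x, y) -> float:
    """Primitive kernel: <x, y>."""
    return dot(x, y)


def polynomial_kernel(x, y, degree: int = 2) -> float:
    """(<x, y> + 1)^degree."""
    return (dot(x, y) + 1.0) ** degree


def rbf_kernel(x, y, gamma: float = 1.0) -> float:
    """exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def sigmoid_kernel(x, y, kappa: float = 1.0, c: float = 0.0) -> float:
    """tanh(kappa * <x, y> + c)."""
    return math.tanh(kappa * dot(x, y) + c)


x, y = [1.0, 0.0], [0.0, 1.0]   # orthogonal unit vectors
print(linear_kernel(x, y))      # 0.0
print(polynomial_kernel(x, y))  # 1.0
print(rbf_kernel(x, y))         # exp(-2), about 0.135
print(sigmoid_kernel(x, y))     # 0.0
```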


Page 27: Support Vector Machine

Apply to Umeng Gender Classification

Problem Description

Classify the gender of a user based on the apps (s)he installed and the categories of those apps.

Kernel Design

$$K(x, y) = \sum_{i,j=0}^{m} \phi(x_i, y_j)$$

$$\phi(x_i, y_j) = \begin{cases} (1 + w)\, x_i y_j & \text{if } i = j \\ x_i y_j & \text{if } i \ne j \text{ but the same category} \\ 0 & \text{if not the same category} \end{cases}$$

$w \ge 0$ is the extra weight applied when two users have installed the same app; it defaults to 1.0.
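A direct, unoptimized sketch of the kernel as defined above, summing $\phi(x_i, y_j)$ over all app pairs. The `category` list (one category name per app index), the function name and the three-app example are hypothetical; the slides do not give an implementation.

```python
from typing import Sequence


def umeng_kernel(
    x: Sequence[int],
    y: Sequence[int],
    category: Sequence[str],
    w: float = 1.0,
) -> float:
    """Sum phi(x_i, y_j) over all pairs (i, j) of app indices."""
    total = 0.0
    for i, xi in enumerate(x):
        if xi == 0:
            continue  # phi is 0 whenever x_i = 0
        for j, yj in enumerate(y):
            if yj == 0 or category[i] != category[j]:
                continue  # not installed, or different categories
            if i == j:
                total += (1.0 + w) * xi * yj  # same app: extra weight w
            else:
                total += xi * yj              # different apps, same category
    return total


category = ["game", "game", "social"]  # hypothetical category per app
user_a = [1, 1, 0]
user_b = [1, 0, 1]

print(umeng_kernel(user_a, user_b, category))          # 3.0
print(umeng_kernel(user_a, user_a, category))          # 6.0
print(umeng_kernel(user_a, user_b, category, w=0.0))   # 2.0
```

For about 30,000 apps the double loop is slow; grouping apps by category would avoid most of the pairs, but that optimization is left out of this sketch.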

Experiment Result


Page 28: Support Vector Machine

Apply to Umeng Gender Classification

$$(x_1, x_2, \dots, x_m)^T$$

$$(w \cdot x_1, w \cdot x_2, \dots, w \cdot x_m, c_1, c_2, \dots, c_{20})^T$$

ci counts the number of apps belonging to category i


Page 29: Support Vector Machine

References

Book: Christopher Bishop – PRML Chapter 7: Section 7.1

Slides: Andrew Moore – Support Vector Machines

Video: Bernhard Scholkopf – Kernel Methods

Video: Liva Ralaivola – Introduction to Kernel Methods

Video: Colin Campbell – Introduction to Support Vector Machines

Video: Alex Smola – Kernel Methods and Support Vector Machines

Video: Partha Niyogi – Introduction to Kernel Methods

Many more videos on kernel-related topics here

http://www.seas.harvard.edu/courses/cs281/
