MELJUN CORTES IBM SPSS SVM


  • 7/30/2019 MELJUN CORTES IBM SPSS SVM


    SVM Models (Support Vector Machine Models)

    A Classification and Regression Technique

    These are my personal notes only, so they are not complete. Most of the content was taken from the IBM SPSS Modeler training manual. Please refer to the training manual for a complete discussion.


    SVM:

    A classification and regression technique that is used to predict either a categorical or a continuous outcome field.

    It is suited to analyzing data with a large number of predictor fields.


    SVM:

    It maps data into a high-dimensional space where the data points can be categorized or predicted accurately, even if there is no easy way to separate the points in the original space.

    It uses a kernel function to map the data from the original space into the new space.

    It does not provide output in the form of an equation with coefficients on the predictor fields.


    Assume that the X and Y axes represent two predictors, while the circles and squares represent the two categories of a target field we wish to predict.

    There is no simple straight line that can separate the categories, but the curve drawn around the squares shows that there is a complex curve that will completely separate the two categories.

    SVM was developed to handle difficult classification and prediction problems where simple linear models are unable to accurately separate the categories of an outcome field.


    Central task of SVM:

    Transform the data so that a hyperplane can be used to separate the points. The mathematical function used for the transformation is known as a kernel function.

    After the transformation, the squares and circles can be separated by a straight line in the new two-dimensional space.

    [Figure panels: Original Data, Transformed Data]
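    As a rough illustration of the transformation idea, the sketch below uses a hand-written feature map instead of a real kernel. The toy points, the map (x, y) -> (x², y²), and the threshold of 1.0 are all assumptions invented for this example; an actual SVM applies its kernel implicitly and learns the boundary from the data.

```python
def feature_map(point):
    """Hypothetical explicit transformation: (x, y) -> (x**2, y**2)."""
    x, y = point
    return (x * x, y * y)


def linear_side(z, threshold=1.0):
    """In the transformed space the boundary is the straight line u + v = threshold."""
    u, v = z
    return "square" if u + v < threshold else "circle"


# Toy data (invented): squares cluster near the origin, circles surround them,
# so no straight line separates them in the original (x, y) space.
squares = [(0.5, 0.0), (0.0, -0.6), (-0.4, 0.4), (0.3, 0.3)]
circles = [(2.0, 0.0), (0.0, 2.2), (-1.5, 1.5), (1.6, -1.4)]

predicted_squares = [linear_side(feature_map(p)) for p in squares]
predicted_circles = [linear_side(feature_map(p)) for p in circles]

print(predicted_squares)
print(predicted_circles)
```

    The circular boundary x² + y² = 1 in the original space becomes the straight line u + v = 1 after the mapping, which is why a linear separator is enough in the new space.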


    The filled-in circles and squares are the cases that lie on the boundary between the two classes.

    These filled-in points are all the data needed to separate the two categories. They are called support vectors because they support the solution and the definition of the boundary.

    [Figure: Transformed Data]

    SVM models were developed in the machine learning tradition, where the technique was called a support vector machine, hence the model name.
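    To make the role of the support vectors concrete, here is a minimal sketch of how a trained SVM scores a new point using only its support vectors. The two support vectors, their weights (alpha), and the bias are hand-derived for a two-point toy problem and are assumptions for illustration; they do not come from Modeler.

```python
def linear_kernel(a, b):
    """Plain dot product; any other kernel function could be substituted here."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Hypothetical trained model: (support vector, class label, weight alpha).
support_vectors = [
    ((1.0, 1.0), +1, 0.25),
    ((3.0, 3.0), -1, 0.25),
]
bias = 2.0


def decision_value(x, kernel=linear_kernel):
    """Weighted sum of kernel values between x and each support vector, plus the bias."""
    return sum(alpha * label * kernel(sv, x) for sv, label, alpha in support_vectors) + bias


def predict(x):
    """The sign of the decision value gives the predicted class."""
    return +1 if decision_value(x) >= 0 else -1


print(decision_value((1.0, 1.0)), decision_value((3.0, 3.0)))
print(predict((0.0, 0.0)), predict((4.0, 4.0)))
```

    No other training cases appear in the scoring step, which is consistent with the statement that the support vectors are all the data needed to define the boundary.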


    There is more than one straight line (hyperplane) that could be used to separate the two categories.

    SVM models try to find the best hyperplane: the one that maximizes the margin (separation) between the categories while balancing the tradeoff of potentially overfitting the data.

    The narrower the margin between the support vectors, the more accurate the model will be on the current (training) data, but the greater the risk of overfitting.
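    The choice among several separating hyperplanes can be sketched numerically. The example below assumes the usual scaling in which the closest points satisfy |w·x + b| = 1, so that the margin width is 2/‖w‖. The two data points and the two candidate hyperplanes are invented for illustration.

```python
import math

# Toy data (invented): one point per class.
points = [((1.0, 1.0), +1), ((3.0, 3.0), -1)]

# Two hypothetical hyperplanes (w, b) that both separate the points.
candidates = {
    "vertical": ((-1.0, 0.0), 2.0),
    "diagonal": ((-0.5, -0.5), 2.0),
}


def separates(w, b):
    """True if every point is on its correct side with label * (w.x + b) >= 1."""
    return all(
        label * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1.0
        for x, label in points
    )


def margin_width(w):
    """Margin width 2 / ||w|| under the scaling described above."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


for name, (w, b) in candidates.items():
    print(name, separates(w, b), round(margin_width(w), 3))

best = max(candidates, key=lambda name: margin_width(candidates[name][0]))
print("widest margin:", best)
```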


    Misclassified cases

    Circles or squares can fall on the wrong side of the boundary defined by the support vectors. These cases are classified in error.

    SVM attempts to maximize the margin between the support vectors while minimizing this error.
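    One common way to measure this error is the hinge loss, which is zero for cases safely outside the margin and grows as a case moves toward or past the boundary. The sketch below assumes that measure; the hyperplane and the sample points are invented, and the notes do not state which error measure Modeler uses internally.

```python
def hinge_loss(w, b, x, label):
    """max(0, 1 - label * (w.x + b)): zero when the case is outside the margin."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - label * score)


def describe(w, b, x, label):
    """Translate the hinge loss into a plain-language status."""
    loss = hinge_loss(w, b, x, label)
    if loss == 0.0:
        return "correct, outside the margin"
    if loss < 1.0:
        return "correct, but inside the margin"
    return "on the boundary or misclassified"


# Hypothetical hyperplane x = 0; class +1 is expected on the right-hand side.
w, b = (1.0, 0.0), 0.0

print(describe(w, b, (2.0, 0.0), +1))
print(describe(w, b, (0.5, 0.0), +1))
print(describe(w, b, (-1.0, 0.0), +1))
```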


    Kernel Function

    The mathematical function used for the transformation is known as the kernel function. SVM in Modeler supports the following kernel types:

    Linear: Simple function that works well when nonlinear relationships in the data are minimal.

    Polynomial: A more complex function that allows for higher-order terms.

    RBF (Radial Basis Function): Equivalent to the neural network of this type. Can fit highly nonlinear data.

    Sigmoid: Equivalent to a two-layer neural network. Can also fit highly nonlinear data.
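    The four kernel types can be written out as small functions. The formulas below follow the forms commonly used by SVM libraries such as LIBSVM; that Modeler uses exactly these forms is an assumption, and the default parameter values are placeholders rather than Modeler's defaults. The parameter names gamma, bias, and degree are chosen to echo the expert options described later in these notes.

```python
import math


def dot(x, z):
    """Dot product of two equal-length sequences."""
    return sum(xi * zi for xi, zi in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, gamma=1.0, bias=0.0, degree=3):
    return (gamma * dot(x, z) + bias) ** degree


def rbf_kernel(x, z, gamma=0.1):
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, z, gamma=1.0, bias=0.0):
    return math.tanh(gamma * dot(x, z) + bias)


a, b = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(a, b))
print(polynomial_kernel(a, b, gamma=1.0, bias=1.0, degree=2))
print(rbf_kernel(a, b))
print(sigmoid_kernel(a, b, gamma=0.01))
```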


    Radial Basis Function

    A radial basis function (RBF) is a real-valued function whose value depends only on the distance from the origin, so that $\varphi(\mathbf{x}) = \varphi(\lVert \mathbf{x} \rVert)$; or alternatively on the distance from some other point $\mathbf{c}$, called a center, so that $\varphi(\mathbf{x}, \mathbf{c}) = \varphi(\lVert \mathbf{x} - \mathbf{c} \rVert)$. Any function $\varphi$ that satisfies the property $\varphi(\mathbf{x}) = \varphi(\lVert \mathbf{x} \rVert)$ is a radial function.
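    A short sketch of the radial property, using the Gaussian as one familiar example of a radial basis function. The center, the gamma value, and the test points are assumptions chosen so that both points lie at distance 5 from the center.

```python
import math


def gaussian_rbf(x, center, gamma=1.0):
    """Gaussian radial basis function: depends on x only through its distance to center."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-gamma * squared_distance)


center = (1.0, 2.0)
point_a = (4.0, 6.0)   # offset (3, 4) from the center, distance 5
point_b = (-4.0, 2.0)  # offset (-5, 0) from the center, distance 5

value_a = gaussian_rbf(point_a, center, gamma=0.1)
value_b = gaussian_rbf(point_b, center, gamma=0.1)
print(value_a, value_b)
```

    The two points lie in different directions from the center, yet they receive the same value because only the distance matters.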

    Sigmoid Function

    Many natural processes and complex system learning curves display a history-dependent progression from small beginnings that accelerates and approaches a climax over time. For lack of complex descriptions, a sigmoid function is often used. A sigmoid curve is produced by a mathematical function having an "S" shape. Often, sigmoid function refers to the special case of the logistic function, defined by the formula

    $S(t) = \frac{1}{1 + e^{-t}}$
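    The logistic function is easy to evaluate directly, which helps to see the "S" shape. The sketch below is only an illustration; the sample inputs are arbitrary, and the last line checks the standard identity tanh(t) = 2·S(2t) − 1, which relates the logistic curve to the hyperbolic tangent often used in sigmoid kernels.

```python
import math


def logistic(t):
    """Logistic function S(t) = 1 / (1 + e^(-t)); fine for moderate |t|."""
    return 1.0 / (1.0 + math.exp(-t))


# Values rise slowly, accelerate around t = 0, then level off near 1.
for t in (-6.0, -2.0, 0.0, 2.0, 6.0):
    print(t, round(logistic(t), 4))

# tanh is a rescaled and shifted logistic curve.
t = 0.7
print(math.isclose(math.tanh(t), 2.0 * logistic(2.0 * t) - 1.0))
```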

    SVM Node Model Option

    If a partition field is defined, this option ensures that data from only the training partition is used to build the model.


    SVM Node Expert Option

    If selected, the probabilities for each possible value of a set or flag target field are displayed for each record processed by the node.

    If not selected, only the probability of the predicted value is displayed for set or flag target fields.


    SVM Node Expert Option

    Stopping criteria: determines when to stop the optimization algorithm.

    Values range from 1.0E-1 to 1.0E-6; the default is 1.0E-3.

    Reducing the value results in a more accurate model, but the model will take longer to train.


    Regularization parameter (C): controls the trade-off between maximizing the margin and minimizing the training error.

    Its value normally ranges from 1 to 10 (with 10 as the default).

    Increasing the value improves the classification accuracy (or reduces the regression error) for the training data, but this can also lead to overfitting.
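    The trade-off can be sketched with the standard soft-margin objective, 0.5·w² + C·(sum of hinge losses), on one-dimensional toy data. Everything here is an assumption for illustration: the data, the two hand-picked candidate boundaries, and the use of this particular objective. A real optimizer searches over all boundaries rather than comparing two.

```python
def objective(w, b, data, C):
    """Soft-margin objective for a 1-D boundary: margin term plus C times total hinge loss."""
    margin_term = 0.5 * w * w
    error_term = sum(max(0.0, 1.0 - label * (w * x + b)) for x, label in data)
    return margin_term + C * error_term


# Toy data (invented): the positive case at x = -1.0 sits close to the negatives.
data = [(-3.0, -1), (-2.0, -1), (-1.0, +1), (2.0, +1), (3.0, +1)]

# Two hypothetical candidates (w, b): a wide margin that misclassifies the
# awkward case, and a narrow margin that classifies every case correctly.
candidates = {"wide": (0.5, 0.0), "narrow": (2.0, 3.0)}


def preferred(C):
    """Name of the candidate with the lower objective for a given C."""
    return min(candidates, key=lambda name: objective(*candidates[name], data, C))


for C in (1.0, 10.0):
    print(C, preferred(C))
```

    With a small C the wide margin wins despite its training error; with a large C the narrow, error-free boundary wins, which matches the statement that increasing the value improves accuracy on the training data.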


    Epsilon is used when the target variable is continuous.

    Errors in the model predictions are accepted if they are smaller than this value.

    Increasing epsilon may result in faster modeling, but at the expense of accuracy.
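    The behavior described for epsilon corresponds to what is usually called an epsilon-insensitive loss in support vector regression. The sketch below assumes that form; the actual and predicted values are made up, and 0.1 is only an illustrative epsilon.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon=0.1):
    """Errors smaller than epsilon cost nothing; larger errors cost the excess."""
    return max(0.0, abs(actual - predicted) - epsilon)


# Invented (actual, predicted) pairs for a continuous target.
pairs = [(10.0, 10.05), (10.0, 10.5), (10.0, 8.0)]

for epsilon in (0.1, 1.0):
    losses = [epsilon_insensitive_loss(a, p, epsilon) for a, p in pairs]
    print(epsilon, [round(loss, 2) for loss in losses])
```

    Raising epsilon turns more of the errors into zeros, so there is less for the optimizer to correct, at the cost of a looser fit.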


    There are four kernel types:

    Linear: works well when nonlinear relationships in the data are minimal.

    Polynomial: allows higher-order terms.

    Radial Basis Function (RBF): can fit nonlinear data.

    Sigmoid: S-shaped function; the term often refers to the special case of the logistic function.


    RBF gamma is enabled only when the RBF kernel is used. Its value should normally be between 3/k and 6/k, where k is the number of input fields.

    Increasing the value improves the classification accuracy (or reduces the regression error) for the training data.
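    The 3/k to 6/k guideline is simple arithmetic, sketched here as a small helper. The function name is hypothetical and the field count of 12 is just an example.

```python
def rbf_gamma_range(k):
    """Suggested (low, high) range for RBF gamma given k input fields."""
    if k <= 0:
        raise ValueError("k must be a positive number of input fields")
    return (3.0 / k, 6.0 / k)


low, high = rbf_gamma_range(12)
print(low, high)  # 0.25 0.5
```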


    Gamma is enabled only when the polynomial or sigmoid kernel is used.

    Increasing the value improves the classification accuracy (or reduces the regression error) for the training data.


    Bias is enabled only if the polynomial or sigmoid kernel is used. It sets the coefficient (coef0) value of the kernel function.


    Degree is enabled only if the polynomial kernel is used. It controls the complexity (dimension) of the mapping space.
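    To give a feel for how degree drives the dimension of the mapping space, the sketch below counts the distinct monomials of total degree at most d in k input fields, which is the binomial coefficient C(k + d, d). This count applies to a polynomial kernel with a nonzero bias term, which is an assumption here; the function name is hypothetical.

```python
import math


def polynomial_feature_count(k, degree):
    """Number of monomials of total degree <= degree in k variables: C(k + degree, degree)."""
    return math.comb(k + degree, degree)


# Two inputs, degree 2: the terms 1, x, y, x^2, x*y, y^2.
print(polynomial_feature_count(2, 2))

# The count grows quickly with the degree for a fixed number of inputs.
for degree in (1, 2, 3):
    print(degree, polynomial_feature_count(13, degree))
```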


    Example: Predicting Loyal Customers

    Target Variable: Loyal (leave/stay)

    LONGDIST - time spent on long distance calls per month
    International - time spent on international calls per month
    LOCAL - time spent on local calls per month
    Dropped - number of dropped calls
    Pay_mthd - payment method of the monthly telephone bill
    LocalBillType - tariff for locally based calls
    LongdistanceBillType - tariff for long distance calls
    Age
    Sex
    Status - marital status
    Children - number of children
    Est_income - estimated income
    Car_owner - car owner


    Example 2: Predicting whether customers will accept a new cash-card offering.

    Has a mortgage?
    Has life insurance?
    Has a credit card?
    Has a debit card?
    Uses the mobile bank service?
    Has a current account?
    Has internet access to the account?
    Has a personal loan?
    Has savings?
    Has used a Cash Point in the last week?
    Has hit the overdraft limit during the last year?
    Has an ISA account?
    Age in years
    How long as a customer?
    Accept_CashCard - Accept the new cash card