Rotation Forest: A New Classifier Ensemble Method
National Chiao Tung University, Institute of Electronics, 蕭晴駿
2007.3.7
Juan J. Rodríguez and Ludmila I. Kuncheva


Page 1

Rotation Forest: A New Classifier Ensemble Method

National Chiao Tung University, Institute of Electronics, 蕭晴駿, 2007.3.7

Juan J. Rodríguez and Ludmila I. Kuncheva

Page 2

Outline

Introduction
Rotation forests
Experimental results
Conclusions

Page 3

Outline

Introduction
Rotation forests
Experimental results
Conclusions

Page 4

Introduction(1)

Why classifier ensembles? Combine the predictions of multiple classifiers instead of relying on a single classifier.

Motivation:
- reduce variance: less dependent on the peculiarities of a single training set
- reduce bias: learn a more expressive concept class than a single classifier

Page 5

Introduction(2)

Key step:

formation of an ensemble of diverse classifiers from a single training set

It’s necessary to modify the data set (Bagging, Boosting) or the learning method (Random Forest) to create different classifiers

Performance evaluation:

diversity, accuracy

Page 6

Bagging(1)

Page 7

Bagging(2)

Bootstrap sampling:
- the individual classifiers have high classification accuracy
- but low diversity

1. for m = 1 to M   // M: number of iterations
   a) draw (with replacement) a bootstrap sample Sm of the data
   b) learn a classifier Cm from Sm
2. for each test example
   a) try all classifiers Cm
   b) predict the class that receives the highest number of votes
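The two steps above can be sketched in plain Python. This is a minimal illustration, not the WEKA implementation; the toy 1-nearest-neighbor base learner and the one-feature data set are assumptions made for the example:

```python
import random
from collections import Counter

def nn_classifier(sample):
    """Toy base learner: 1-nearest-neighbor on the bootstrap sample."""
    def predict(x):
        nearest = min(sample, key=lambda pt: abs(pt[0] - x))
        return nearest[1]
    return predict

def bagging(data, M, seed=0):
    """Step 1: learn M classifiers, each on a bootstrap sample."""
    rng = random.Random(seed)
    classifiers = []
    for _ in range(M):
        boot = [rng.choice(data) for _ in range(len(data))]  # draw with replacement
        classifiers.append(nn_classifier(boot))
    return classifiers

def vote(classifiers, x):
    """Step 2: predict the class that receives the most votes."""
    votes = Counter(c(x) for c in classifiers)
    return votes.most_common(1)[0][0]

# One-feature toy data: small values are class 'a', large values class 'b'.
data = [(1, 'a'), (2, 'a'), (3, 'a'), (7, 'b'), (8, 'b'), (9, 'b')]
ensemble = bagging(data, M=11)
print(vote(ensemble, 2))   # expected 'a'
print(vote(ensemble, 8))   # expected 'b'
```

Because each classifier sees a slightly different bootstrap sample, their errors differ, and majority voting averages them out.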

Page 8

Boosting

Basic idea:

- later classifiers focus on examples that were misclassified by earlier classifiers

- weight the predictions of the classifiers with their error
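The reweighting idea can be sketched as a single AdaBoost-style round. This is a simplified sketch of the weight update only (the full algorithm also trains a classifier per round); the example weights and correctness flags are made up:

```python
import math

def update_weights(weights, correct):
    """One AdaBoost-style round: up-weight misclassified examples.

    weights: current example weights.
    correct: booleans saying whether the current classifier got each example right.
    Returns the renormalized weights and the classifier's vote weight alpha.
    """
    err = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    alpha = 0.5 * math.log((1 - err) / err)   # low error -> large vote weight
    new = [w * math.exp(-alpha if ok else alpha)
           for w, ok in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new], alpha

weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]           # one example misclassified
weights, alpha = update_weights(weights, correct)
print([round(w, 3) for w in weights])         # misclassified example gains weight
```

After the update, the misclassified example carries half of the total weight, so the next classifier is forced to focus on it.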

Page 9

Bagging vs. Boosting

Making the classifiers diverse tends to reduce their individual accuracy: the accuracy-diversity dilemma.

AdaBoost creates less accurate individual classifiers by forcing them to concentrate on difficult objects and ignore the rest of the data, but the resulting large diversity boosts the ensemble performance.

Page 10

Outline

Introduction
Rotation forests
Experimental results
Conclusions

Page 11

Rotation Forest(1)

Rotation Forest transforms the data set while preserving all information.

PCA is used to transform the data, applied to:
- a subset of the instances
- a subset of the classes
- a subset of the features: low computation, low storage

Page 12

Page 13

Rotation Forest(2)

Base classifiers are decision trees, hence "Forest".

PCA is a simple rotation of the coordinate axes, hence "Rotation Forest".

Page 14

Method(1)

X: the objects in the training data set

x = [x1, x2, …, xn]T a data point with n features

$$X = \begin{bmatrix} x_1^1 & x_2^1 & \cdots & x_n^1 \\ \vdots & & & \vdots \\ x_1^N & x_2^N & \cdots & x_n^N \end{bmatrix} \qquad (N \times n \text{ matrix})$$

Y = [y1, y2, …, yN]^T: class labels, taking values from c classes

Page 15

Method(2)

Given:
- L: the number of classifiers in the ensemble (D1, D2, …, DL)
- F: the feature set
- X, Y: the training data and labels

All classifiers can be trained in parallel.

Page 16

Method(3)

For i = 1 … L (to construct the training set for classifier Di):

Split the feature set F into K subsets Fi,1, Fi,2, …, Fi,K (Fi,j, j = 1…K), each with M = n/K features.

Page 17

Method(3)

For j = 1 … K

(Take the subsets F1,1, F1,2, …, F1,K of the first classifier as the example.)

X1,1: data set X restricted to the features in F1,1

Eliminate a random subset of classes. Select a bootstrap sample from X1,1 to obtain X'1,1.

Run PCA on X'1,1 using only its M features.

Principal components: $a_{1,1}^{(1)}, \ldots, a_{1,1}^{(M_1)}$
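The sampling steps on this slide can be sketched as follows. This is a minimal stdlib illustration with made-up toy data; the 75% bootstrap fraction is an assumption taken from the original paper's default, not stated on the slide:

```python
import random

def sample_for_pca(rows, labels, frac=0.75, seed=0):
    """Keep a random non-empty subset of classes (i.e., eliminate the rest),
    then draw a bootstrap sample (with replacement) of `frac` of the
    remaining rows. PCA would then be run on the returned sample."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    keep = rng.sample(classes, rng.randint(1, len(classes)))
    kept = [r for r, y in zip(rows, labels) if y in keep]
    size = max(1, int(frac * len(kept)))
    return [rng.choice(kept) for _ in range(size)]

# Toy 2-feature data with three classes.
rows = [[0.1, 1.0], [0.2, 0.9], [5.0, 5.1], [5.2, 4.9], [9.0, 0.2]]
labels = ['a', 'a', 'b', 'b', 'c']
sample = sample_for_pca(rows, labels)
print(len(sample))
```

Note that the class elimination and bootstrap only affect which rows PCA sees; all rows are still rotated and used for training later.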

Page 18

Method(4)

Arrange the principal components for all j on the diagonal to obtain the rotation matrix

$$R_1 = \begin{bmatrix} a_{1,1}^{(1)}, a_{1,1}^{(2)}, \ldots, a_{1,1}^{(M_1)} & [0] & \cdots & [0] \\ [0] & a_{1,2}^{(1)}, a_{1,2}^{(2)}, \ldots, a_{1,2}^{(M_2)} & \cdots & [0] \\ \vdots & \vdots & \ddots & \vdots \\ [0] & [0] & \cdots & a_{1,K}^{(1)}, a_{1,K}^{(2)}, \ldots, a_{1,K}^{(M_K)} \end{bmatrix}$$

Rearrange the rows of $R_1$ so as to match the order of features in F to obtain $R_1^a$.

Build classifier $D_1$ using $X R_1^a$ as its training set.
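Assembling the block-diagonal matrix and rotating the data can be sketched in pure Python. The per-subset "principal components" here are stand-in 2x2 plane rotations rather than the output of an actual PCA, so the example stays self-contained:

```python
import math

def block_diagonal(blocks):
    """Place each subset's component matrix on the diagonal, zeros elsewhere."""
    n = sum(len(b) for b in blocks)
    R = [[0.0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, v in enumerate(row):
                R[offset + i][offset + j] = v
        offset += len(b)
    return R

def matmul(X, R):
    """Rotate the data: the training set for D_i is X R_i (rows are objects)."""
    return [[sum(x[k] * R[k][j] for k in range(len(R)))
             for j in range(len(R[0]))] for x in X]

def rot2(theta):
    """Stand-in for PCA axes on a 2-feature subset: a plane rotation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# n = 4 features split into K = 2 subsets of M = 2 features each.
R = block_diagonal([rot2(0.5), rot2(1.2)])
X = [[1.0, 0.0, 0.0, 1.0]]
Xr = matmul(X, R)
print(len(Xr[0]))   # still 4 features: nothing is discarded
```

Because every block is an orthogonal rotation, X R has the same dimensionality (and, per object, the same norm) as X: the transformation preserves all information, which is the point of the method.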

Page 19

How It Works ?

Diversity:
- Each decision tree uses a different set of axes.
- Trees are sensitive to rotation of the axes.

Accuracy:
- No principal components are discarded.
- The whole data set is used to train each classifier (with differently extracted features).

Page 20

Outline

Introduction
Rotation forests
Experimental results
Conclusions

Page 21

Experimental Results(1)

Experimental settings:

1. Bagging, AdaBoost, and Random Forest were kept at their default values in WEKA

2. for Rotation Forest, M is fixed to be 3

3. all ensemble methods have the same L

4. base classifier: tree classifier J48 (WEKA)

5. database: UCI Machine Learning Repository

(WEKA: Waikato Environment for Knowledge Analysis)

Page 22

Database

Page 23

Experimental Results(2)

TABLE 2 Classification Accuracy and Standard Deviation of J48 and Ensemble Methods without Pruning

Results are averaged over 15 runs of 10-fold cross-validation.

Page 24

Experimental Results(3)

Fig. 1. Percentage diagram for the four studied ensemble methods with unpruned J48 trees. (Chart segments: 3.03%, 24.24%, 3.03%, 69.70%.)

Page 25

Experimental Results (4)

Fig. 2. Comparison of accuracy of Rotation Forest ensemble (RF) and the best accuracy from any of a single tree, Bagging, Boosting, and Random Forest ensembles.

Page 26

Diversity-Error Diagram

Pairwise diversity measures were chosen. Kappa (κ) evaluates the level of agreement between two classifier outputs.

Diversity-error diagram:
- x-axis: κ for the pair (Di, Dj)
- y-axis: averaged individual error of Di and Dj, Ei,j = (Ei + Ej)/2

Small values of κ indicate better diversity; small values of Ei,j indicate better accuracy.
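Computing κ and Ei,j for one classifier pair can be sketched as below. This is a minimal illustration on made-up outputs, using Cohen's kappa on the two classifiers' predicted labels; the outputs d_i, d_j and the truth vector are assumptions for the example:

```python
from collections import Counter

def kappa(out1, out2):
    """Cohen's kappa between two classifiers' predicted labels:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(out1)
    observed = sum(a == b for a, b in zip(out1, out2)) / n
    c1, c2 = Counter(out1), Counter(out2)
    chance = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / n ** 2
    return (observed - chance) / (1 - chance)

def pair_error(out1, out2, truth):
    """E_ij = (E_i + E_j) / 2, the averaged individual error of the pair."""
    e1 = sum(a != t for a, t in zip(out1, truth)) / len(truth)
    e2 = sum(b != t for b, t in zip(out2, truth)) / len(truth)
    return (e1 + e2) / 2

truth = ['a', 'a', 'b', 'b']
d_i   = ['a', 'a', 'b', 'a']   # hypothetical outputs of classifier D_i
d_j   = ['a', 'b', 'b', 'b']   # hypothetical outputs of classifier D_j
print(round(kappa(d_i, d_j), 3), pair_error(d_i, d_j, truth))
```

Each point in a kappa-error diagram is one such (κ, Ei,j) pair; an ensemble of L classifiers contributes L(L-1)/2 points.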

Page 27

Experimental Results (5)

Rotation Forest has the potential to improve on diversity significantly without compromising the individual accuracy

Fig. 3. Kappa-error diagrams for the vowel-n data set.

Page 28

Experimental Results (6)

Rotation Forest is not as diverse as the other ensembles, but clearly has the most accurate individual classifiers.

Rotation Forest is similar to Bagging, but more accurate and diverse.

Fig. 4. Kappa-error diagrams for the waveform data set.

Page 29

Conclusions

Rotation Forest transforms the data with different axes while completely preserving the information, achieving both diversity and accuracy.

Rotation Forest opens a scope for ensemble methods "on the side of Bagging".

Page 30

References

J.J. Rodríguez, L.I. Kuncheva, and C.J. Alonso, "Rotation Forest: A New Classifier Ensemble Method," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1619-1630, Oct. 2006.

J.J. Rodríguez and C.J. Alonso, "Rotation-Based Ensembles," Proc. Current Topics in Artificial Intelligence: 10th Conference of the Spanish Association for Artificial Intelligence, LNAI 3040, Springer, 2004, pp. 498-506.

J. Fürnkranz, "Ensemble Classifiers" (class notes).