Supervised and Traditional Term Weighting Methods for Automatic Text Categorization. Presenter: Cheng-Han Tsai. Authors: Man Lan, Chew Lim Tan, Senior Member, IEEE, Jian Su, and Yue Lu, Member, IEEE. TPAMI, 2009. Intelligent Database Systems Lab, National Yunlin University of Science and Technology.

Page 1: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Intelligent Database Systems Lab

National Yunlin University of Science and Technology (國立雲林科技大學)

Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Presenter: Cheng-Han Tsai

Authors: Man Lan, Chew Lim Tan, Senior Member, IEEE, Jian Su, and Yue Lu, Member, IEEE

TPAMI, 2009

Page 2: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization


Outline

· Motivation
· Objectives
· Methodology
· Experiments
· Conclusions
· Comments

Page 3: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Motivation

· The widely used tf·idf method has not shown uniformly good performance across different data sets


Page 4: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Objectives

· To propose a new, simple supervised term weighting method that improves the terms' discriminating power for the text categorization (TC) task
  ─ Do supervised term weighting methods perform better than unsupervised ones for TC?
  ─ Does the difference between supervised and unsupervised methods depend on the learning algorithm used?
  ─ Why is the new supervised method, tf·rf, effective for TC?


Page 5: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Methodology

[Slide figure: text categorization; tf·rf]
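The slide names tf·rf without showing its formula. As a minimal sketch, assuming the relevance frequency definition rf = log2(2 + a / max(1, c)) from the Lan et al. paper, where a is the number of positive-category training documents containing the term and c is the number of negative-category training documents containing it, the weight of a term in a document could be computed as below. The function names `rf` and `tf_rf` and the toy counts are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def rf(a: int, c: int) -> float:
    """Relevance frequency (assumed definition): log2(2 + a / max(1, c)).

    a: positive-category documents containing the term
    c: negative-category documents containing the term
    """
    return math.log2(2 + a / max(1, c))


def tf_rf(doc_tokens, positive_df, negative_df):
    """Weight each term in one document by raw term frequency times rf.

    positive_df / negative_df map a term to its document frequency in the
    positive / negative category of the training set (hypothetical inputs).
    """
    tf = Counter(doc_tokens)
    return {
        term: count * rf(positive_df.get(term, 0), negative_df.get(term, 0))
        for term, count in tf.items()
    }


# Toy counts: "oil" is concentrated in the positive category, "the" is not.
weights = tf_rf(
    ["oil", "oil", "the"],
    positive_df={"oil": 6, "the": 2},
    negative_df={"oil": 1, "the": 2},
)
```

In this toy case "oil" receives a higher rf than "the" because it appears in far more positive than negative documents; unlike idf, the factor depends on the category labels, which is what makes the scheme supervised.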

Page 6: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Methodology

Page 7: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Methodology

Page 8: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Methodology

Page 9: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Experiments

Page 10: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Experiments

Page 11: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Experiments

Page 12: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Experiments

Page 13: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Experiments

Page 14: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Conclusions

· Not all supervised term weighting methods (e.g., tf·χ² and tf·ig) are superior to unsupervised methods

· Choosing a suitable learning algorithm matters more than the choice of term weighting method

· The best performance of tf·rf has been analyzed and explained through cross-method comparison, cross-classifier validation, and cross-corpus validation
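One possible way to see why a supervised factor such as χ² can behave differently from rf is to compare them on a 2×2 term/category contingency table. The sketch below is illustrative only: it assumes the standard χ² statistic for a 2×2 table and the rf definition log2(2 + a / max(1, c)) from the Lan et al. paper, with a, b the positive-category documents that do and do not contain the term, and c, d the negative-category documents that do and do not contain it. The counts are made up and the function names are hypothetical.

```python
import math


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic of a 2x2 term/category table.

    a: positive docs with the term     b: positive docs without it
    c: negative docs with the term     d: negative docs without it
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def rf(a: int, c: int) -> float:
    """Relevance frequency (assumed definition): log2(2 + a / max(1, c))."""
    return math.log2(2 + a / max(1, c))


# Term A is concentrated in the positive category, term B in the negative one.
term_a = dict(a=50, b=50, c=5, d=95)
term_b = dict(a=5, b=95, c=50, d=50)

chi_a, chi_b = chi_square(**term_a), chi_square(**term_b)
rf_a, rf_b = rf(term_a["a"], term_a["c"]), rf(term_b["a"], term_b["c"])
```

With these made-up counts χ² assigns both terms the same score, since it measures dependence in either direction, while rf is larger for the term concentrated in the positive category. This is only one illustration and does not reproduce the paper's experiments.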


Page 15: Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Comments

· Advantages
  ─ The paper is clearly structured

· Applications
  ─ Text categorization