
Page 1: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Intelligent Database Systems Lab

國立雲林科技大學 National Yunlin University of Science and Technology


Mining Positive and Negative Patterns for Relevance Feature Discovery

Presenter: Cheng-Hui Chen   Authors: Yuefeng Li, Abdulmohsen Algarni, Ning Zhong

KDD 2010

Page 2: Mining Positive and Negative Patterns for  Relevance Feature  Discovery


Outlines
· Motivation
· Objectives
· Methodology
· Experiments
· Conclusions
· Comments

Page 3: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Motivation
· Over the years, people have often held the hypothesis that pattern-based methods should perform better than term-based ones in describing user preferences, but many experiments do not support this hypothesis.
· Many text mining methods consider only terms' distributions.


Page 4: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Objectives
· The innovative technique presented in the paper makes a breakthrough for this difficulty.
· To propose considering both terms' distributions and their specificities when using them for text mining and classification.


Page 5: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Methodology


· Frequency weight
· Specificity weight
· New weight
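
The figure behind these three labels did not survive extraction; the rest of the deck indicates that a frequency-based term weight and a specificity-based weight are combined into a new term weight. A minimal Python sketch of one such combination (the linear form and the alpha parameter are assumptions for illustration, not the paper's formula):

def combined_weight(freq_weight, spe_weight, alpha=1.0):
    # Illustrative only: mix a frequency-based weight with a specificity-based
    # weight; the additive form and alpha are assumptions, not the paper's rule.
    return freq_weight + alpha * spe_weight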

Page 6: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Definitions
· Frequent pattern
─ Absolute support: sup_a(X) = |coverset(X)|, the number of paragraphs of a document that contain termset X
─ Relative support: sup_r(X) = |coverset(X)| / (number of paragraphs in the document)
─ A termset X is called frequent if sup_a (or sup_r) >= min_sup
· Closed pattern
─ Cls(X) = termset(coverset(X))
─ A frequent termset X is called closed if and only if X = Cls(X)
· Closed sequential pattern
─ A frequent sequential pattern X is called closed if sup_a(X1) < sup_a(X) for all super-patterns X1 ⊃ X
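
A minimal Python sketch of these definitions, assuming a document is represented as a list of paragraphs and each paragraph as a list of terms (the per-paragraph view follows the pattern taxonomy model; all names are illustrative):

def coverset(X, paragraphs):
    # Indices of the paragraphs that contain every term of termset X.
    return [i for i, p in enumerate(paragraphs) if set(X) <= set(p)]

def sup_a(X, paragraphs):
    # Absolute support: number of paragraphs covering X.
    return len(coverset(X, paragraphs))

def sup_r(X, paragraphs):
    # Relative support: fraction of paragraphs covering X.
    return sup_a(X, paragraphs) / len(paragraphs)

def cls(X, paragraphs):
    # Closure Cls(X) = termset(coverset(X)): terms shared by all covering paragraphs.
    covered = coverset(X, paragraphs)
    if not covered:
        return set()
    common = set(paragraphs[covered[0]])
    for i in covered[1:]:
        common &= set(paragraphs[i])
    return common

def is_frequent(X, paragraphs, min_sup):
    return sup_r(X, paragraphs) >= min_sup

def is_closed(X, paragraphs):
    return set(X) == cls(X, paragraphs)

# Example: "mining patterns" covers two of the three paragraphs and is closed.
paras = [["mining", "positive", "patterns"],
         ["mining", "negative", "patterns"],
         ["relevance", "feature", "discovery"]]
print(sup_a(["mining", "patterns"], paras))      # 2
print(is_closed(["mining", "patterns"], paras))  # True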


Page 7: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

The deploying method
· To improve the efficiency of pattern taxonomy mining (PTM), an algorithm SPMining(D+, min_sup) is used to discover closed sequential patterns.
─ For a given term t, its support (or weight) in the discovered patterns can be described as follows:
─ The following rank is assigned to every incoming document d to decide its relevance:
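
The support and ranking equations on this slide were images and are missing from the extracted text. As a hedged sketch of how a deploying method typically turns discovered patterns into term weights and then scores incoming documents (it reuses sup_r from the sketch on the previous slide; the length normalization and the plain sum are assumptions, not necessarily the paper's exact equations):

def deploy_term_support(discovered_patterns, paragraphs_by_doc):
    # discovered_patterns: {doc_id: list of patterns (tuples of terms)} mined from D+.
    # Each pattern spreads its relative support over its terms; dividing by the
    # pattern length is an illustrative normalization choice.
    support = {}
    for doc_id, patterns in discovered_patterns.items():
        paras = paragraphs_by_doc[doc_id]
        for p in patterns:
            share = sup_r(p, paras) / len(p)
            for t in p:
                support[t] = support.get(t, 0.0) + share
    return support

def rank(document_terms, support):
    # Relevance score of an incoming document d: sum of the deployed supports
    # of the terms that appear in d.
    return sum(support.get(t, 0.0) for t in set(document_terms))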


Page 8: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Mining Algorithms
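
The SPMining pseudocode on this slide was an image and is not recoverable from the text. SPMining is the paper's efficient miner; as a stand-in, here is a deliberately naive brute-force version that computes the same kind of output (frequent closed sequential patterns over a document's paragraphs) on tiny inputs. It is only an illustration and does not reflect SPMining's pruning strategy.

from itertools import combinations

def is_subsequence(pattern, seq):
    # True if `pattern` occurs in order (not necessarily contiguously) in `seq`.
    it = iter(seq)
    return all(term in it for term in pattern)

def frequent_sequential_patterns(paragraphs, min_sup):
    # Brute force: enumerate every subsequence of every paragraph and count the
    # paragraphs that support it. Exponential, so for illustration only.
    counts = {}
    for para in paragraphs:
        seen = set()
        for length in range(1, len(para) + 1):
            for idxs in combinations(range(len(para)), length):
                seen.add(tuple(para[i] for i in idxs))
        for p in seen:
            counts[p] = counts.get(p, 0) + 1
    return {p: c for p, c in counts.items() if c >= min_sup}

def closed_sequential_patterns(frequent):
    # A frequent pattern is closed if no longer frequent super-pattern has the
    # same support.
    return {p: c for p, c in frequent.items()
            if not any(len(q) > len(p) and cq == c and is_subsequence(p, q)
                       for q, cq in frequent.items())}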


Page 9: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Specificity of low-level features
· We define the specificity of a given term t in the training set D = D+ ∪ D− as follows:
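
The specificity formula on this slide was an image. A common form in this line of work contrasts a term's coverage of the positive documents D+ with its coverage of the negative documents D−; the sketch below uses spe(t) = (|{d ∈ D+ : t ∈ d}| − |{d ∈ D− : t ∈ d}|) / |D+|, and the exact normalization should be treated as an assumption rather than the paper's equation.

def specificity(term, pos_docs, neg_docs):
    # pos_docs (D+) and neg_docs (D-): lists of documents, each a set of terms.
    # Assumed form: positive coverage minus negative coverage, scaled by |D+|.
    cov_pos = sum(1 for d in pos_docs if term in d)
    cov_neg = sum(1 for d in neg_docs if term in d)
    return (cov_pos - cov_neg) / len(pos_docs)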


Page 10: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Revision of discovered features
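
The revision step is only shown graphically in the original slides. Going by the conclusions, which say low-level terms are classified into three categories, a plausible sketch is to split terms by specificity into positive specific, general, and negative specific groups and revise their deployed weights accordingly; the threshold and the multiplicative update below are illustrative assumptions, not the paper's exact revision rule.

def classify_terms(spe, theta=0.2):
    # Hypothetical three-way split of low-level terms by specificity score spe(t).
    positive_specific = {t for t, s in spe.items() if s >= theta}
    negative_specific = {t for t, s in spe.items() if s <= -theta}
    general = set(spe) - positive_specific - negative_specific
    return positive_specific, general, negative_specific

def revise_weights(weights, spe, theta=0.2, boost=1.5, penalty=0.5):
    # Sketch of a revision: strengthen positive specific terms, weaken negative
    # specific ones, leave general terms unchanged (the update form is assumed).
    positive_specific, general, negative_specific = classify_terms(spe, theta)
    revised = {}
    for t, w in weights.items():
        if t in positive_specific:
            revised[t] = w * boost
        elif t in negative_specific:
            revised[t] = w * penalty
        else:
            revised[t] = w
    return revised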


Page 11: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Revision Algorithms


Page 12: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Experiments
· Data
─ This research uses Reuters Corpus Volume 1 (RCV1) and the 50 assessor topics to evaluate the proposed model.
· Compared methods
─ Up-to-date pattern mining methods
─ Well-known term-based methods


Page 13: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Experiments
· The well-known term-based methods
─ The Rocchio model
─ BM25
─ SVM


Page 14: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Experiments


Page 15: Mining Positive and Negative Patterns for  Relevance Feature  Discovery


Experiments

Page 16: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Experiments


Page 17: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Experiments


Page 18: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Experiments


Page 19: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Conclusions
· Compared with the state-of-the-art models, the experiments on RCV1 and TREC topics demonstrate that the effectiveness of relevance feature discovery can be significantly improved by the proposed approach.
· The paper recommends classifying low-level terms into three categories in order to largely improve the performance of the revision.


Page 20: Mining Positive and Negative Patterns for  Relevance Feature  Discovery

Comments
· Advantages
─ The effectiveness of relevance feature discovery can be significantly improved by the proposed approach.
· Drawback
─ …
· Applications
─ Text mining
─ Classification
