δ-Tolerance Closed Frequent Itemsets. James Cheng, Yiping Ke, and Wilfred Ng, ICDM’06. Presenter: 林靜怡, 2007/03/15

Page 1: δ-Tolerance Closed Frequent Itemsets

δ-Tolerance Closed Frequent Itemsets

James Cheng, Yiping Ke, and Wilfred Ng, ICDM’06

Presenter: 林靜怡, 2007/03/15

Page 2: δ-Tolerance Closed Frequent Itemsets

Introduction

The number of Frequent Itemsets (FIs) mined is often too large

Closed Frequent Itemsets (CFIs) and Maximal Frequent Itemsets

δ-Tolerance CFIs

Page 3: δ-Tolerance Closed Frequent Itemsets

Introduction

Page 4: δ-Tolerance Closed Frequent Itemsets

Introduction

where Y is X’s smallest superset that has the greatest frequency

The frequency of a pruned FI can be accurately estimated as the average frequency of the pruned FIs that have the same size and are covered by the same superset

Example: for ab, ac, and ad, the estimate is (106 + 108 + 107) / 3 = 107
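The averaging in this example can be checked with a short Python sketch. Pairing each of the three frequencies with ab, ac, and ad by listing order is an assumption, and the function name is illustrative only.

```python
from statistics import mean

# Frequencies from the slide's example; pairing each value with an itemset
# by listing order is an assumption.
pruned_fis = {"ab": 106, "ac": 108, "ad": 107}


def estimate_pruned_frequency(frequencies):
    """Average frequency of pruned FIs of the same size under the same superset."""
    return mean(frequencies.values())


print(estimate_pruned_frequency(pruned_fis))
```

The three pruned FIs then share a single estimate instead of each storing its own frequency.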

Page 5: δ-Tolerance Closed Frequent Itemsets

Introduction

δ-Tolerance CFIs (δ-TCFIs)

CFI2TCFI

MineTCFI

Page 6: δ-Tolerance Closed Frequent Itemsets

δ-Tolerance Closed Frequent Itemsets

Page 7: δ-Tolerance Closed Frequent Itemsets

Example

Example 2: Referring to the 15 FIs in Figure 1, let δ = 0.04; then the set of 0.04-TCFIs is {b, bd, bcd, abcd}.

The FI a is not a 0.04-TCFI since

Page 8: δ-Tolerance Closed Frequent Itemsets

Frequency Estimation

F: the set of all FIs

T: the set of all δ-TCFIs, for a given δ

The frequency of an FI X ∈ F can be estimated from the frequency of its supersets in T

Page 9: δ-Tolerance Closed Frequent Itemsets

Frequency Estimation

The closest superset of a is ac, that of ac is acd, and that of acd is abcd

The closest superset of ab is abc
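A minimal sketch of how such a chain of closest supersets could be followed. It assumes that "closest superset" means the superset with exactly one more item that has the greatest frequency, as the introduction's wording suggests. The frequencies for ab, ac, and ad reuse the values 106, 108, and 107 from the introduction's example (paired by listing order, which is an assumption). Every other frequency is made up for illustration and is not taken from Figure 1, and the function name is hypothetical.

```python
def closest_superset(itemset, frequency):
    """Return the superset with one extra item that has the greatest frequency."""
    target = frozenset(itemset)
    candidates = [
        other for other in frequency
        if len(other) == len(target) + 1 and target < other
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: frequency[s])


# ab, ac, ad reuse the frequencies from the introduction's example;
# all other values are hypothetical and NOT taken from Figure 1.
frequency = {
    frozenset("ab"): 106,
    frozenset("ac"): 108,
    frozenset("ad"): 107,
    frozenset("abc"): 103,   # hypothetical
    frozenset("abd"): 101,   # hypothetical
    frozenset("acd"): 105,   # hypothetical
    frozenset("abcd"): 100,  # hypothetical
}

# Follow the chain of closest supersets starting from the single item a.
chain = []
current = frozenset("a")
while current is not None:
    chain.append("".join(sorted(current)))
    current = closest_superset(current, frequency)
print(" -> ".join(chain))
```

With these illustrative values the chain starting at a passes through ac and acd before ending at abcd, and the closest superset of ab is abc, matching the slide.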

Page 10: δ-Tolerance Closed Frequent Itemsets

Frequency Estimation

Example 3: When δ = 0.04,

abcd is the closest δ-TCFI superset of all its subsets that contain the item a

bcd is the closest δ-TCFI superset of bc, cd, and c

Page 11: δ-Tolerance Closed Frequent Itemsets
Page 12: δ-Tolerance Closed Frequent Itemsets
Page 13: δ-Tolerance Closed Frequent Itemsets

CFI2TCFI

Page 14: δ-Tolerance Closed Frequent Itemsets

MineTCFI

The search for the supersets of an itemset in the set of δ-TCFIs already discovered

In mining CFIs, the subset testing can be efficiently processed by an FP-tree-like structure

A conditional δ-TCFI tree is created corresponding to the conditional FP-tree, in each of the recursive pattern-growth procedure calls.
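To illustrate the idea of searching for supersets in a tree-shaped index, here is a minimal sketch using a plain prefix tree over a fixed item order. It is not the δ-TCFI tree from the paper: it has no header table, no δ-TCFI links, and no conditional trees, and the class and method names are hypothetical. The stored itemsets are the 0.04-TCFIs listed in Example 2.

```python
class PrefixTree:
    """Stores itemsets as paths over a fixed item order (an illustrative stand-in)."""

    def __init__(self, item_order):
        self.rank = {item: i for i, item in enumerate(item_order)}
        self.root = {}

    def _sorted(self, itemset):
        return sorted(itemset, key=self.rank.__getitem__)

    def insert(self, itemset):
        node = self.root
        for item in self._sorted(itemset):
            node = node.setdefault(item, {})  # shared prefixes reuse existing nodes

    def has_superset(self, itemset):
        """True if some stored itemset contains every item of `itemset`."""
        return self._search(self.root, self._sorted(itemset))

    def _search(self, node, remaining):
        if not remaining:
            return True
        wanted = self.rank[remaining[0]]
        for item, child in node.items():
            if item == remaining[0]:
                # The next required item is on this path: consume it.
                if self._search(child, remaining[1:]):
                    return True
            elif self.rank[item] < wanted:
                # An earlier item may be skipped on the way to the one we need.
                if self._search(child, remaining):
                    return True
            # Children ranked after `wanted` cannot lead to it, so they are pruned.
        return False


tree = PrefixTree("abcde")
for tcfi in ["b", "bd", "bcd", "abcd"]:  # the 0.04-TCFIs from Example 2
    tree.insert(tcfi)

print(tree.has_superset("cd"))
print(tree.has_superset("ae"))
```

Because every path follows the same item order, a branch can be abandoned as soon as its item is ranked after the next required item, which keeps the superset test from scanning all stored itemsets.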

Page 15: δ-Tolerance Closed Frequent Itemsets

MineTCFI

Page 16: δ-Tolerance Closed Frequent Itemsets

The δ-TCFI Tree

Page 17: δ-Tolerance Closed Frequent Itemsets

Update and Construction of δ-TCFI Tree

Sort the items in Y in the order of the items in the tree's header

The sorted Y is inserted into the δ-TCFI tree

If a prefix of the sorted Y already appears as a path in the tree, we share the prefix but change the δ-TCFI link

Assume the link currently points to W; then the link will point to Z if either

Page 18: δ-Tolerance Closed Frequent Itemsets

Update and Construction of δ-TCFI Tree

Page 19: δ-Tolerance Closed Frequent Itemsets

Closure-Based Pruning

Page 20: δ-Tolerance Closed Frequent Itemsets
Page 21: δ-Tolerance Closed Frequent Itemsets

Experiment

Intel P4 3.2GHz CPU, 2GB RAM

Datasets: the real datasets from the popular FIMI Dataset Repository

Page 22: δ-Tolerance Closed Frequent Itemsets
Page 23: δ-Tolerance Closed Frequent Itemsets
Page 24: δ-Tolerance Closed Frequent Itemsets