Clustering Aggregation. Nir Geffen 021537980, Yotam Margolin 039719729. Supervisor: Professor Zeev Volkovich. ORT BRAUDE COLLEGE – SE DEPT. 16.01.2012


Page 1: Clustering Aggregation

Clustering Aggregation

Nir Geffen 021537980
Yotam Margolin 039719729

Supervisor: Professor Zeev Volkovich

ORT BRAUDE COLLEGE – SE DEPT.

16.01.2012

Page 2: Clustering Aggregation

Purposes

Our goal is to study the results of different clustering ensemble techniques and to present the distinction between the cluster ensemble and clustering aggregation approaches via a self-learning methodology, implemented for image segmentation.

Page 3: Clustering Aggregation

Table of Contents

Introduction
What Does It Do?
Clustering
Spectral Clustering
Cluster Ensembles
Consensus
HGPA
MCLA
Volkovich-Yahalom
Main Algorithm
SE Documents

Page 4: Clustering Aggregation

What does it do?

[Diagram with the stages: Preprocess; HGPA, MCLA, VYCAA; Comparison]

Page 5: Clustering Aggregation

Introduction – Clustering

Clustering is an unsupervised learning method that partitions a given data set into subsets called clusters, so that items in the same cluster are similar to each other while items in different clusters are not.
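As one concrete instance of this definition, here is a minimal sketch of k-means (Lloyd's algorithm), the baseline named on later slides. The function name `kmeans`, the fixed initial centers, and the toy points are assumptions made for illustration; they are not taken from the project.

```python
def kmeans(points, centers, iters=20):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        # Assignment step: each point goes to its closest center (squared distance).
        labels = [
            min(range(len(centers)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers

# Toy data (illustrative only): two well-separated groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
labels, centers = kmeans(points, centers=[(0, 0), (9, 9)])
```

The result depends on the initial centers, which is one reason several partitions are produced and combined later in the deck.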


Page 6: Clustering Aggregation

Introduction – Spectral Clustering

What is wrong with classic clustering?

Spectral Clustering
Eigenvectors and eigenvalues
Noise removed

[Figure panels: Clustered by K-Means; Clustered by Spectral]
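The slide only names the ingredients, so the following is a simplified sketch of how eigenvectors enter spectral clustering, under stated assumptions: one-dimensional points, a Gaussian affinity with width `sigma`, and a two-way split read from the sign of the second eigenvector of the normalized affinity matrix (found by power iteration). The full method in reference [3] uses the top k eigenvectors followed by k-means; all names here are illustrative.

```python
import math

def spectral_bipartition(points, sigma=1.0, iters=200):
    """Two-way spectral split of 1-D points (simplified sketch)."""
    n = len(points)
    # Gaussian affinity; self-similarity is set to zero.
    w = [[0.0 if i == j else
          math.exp(-(points[i] - points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]  # assumed positive for every point
    # Normalized affinity M = D^-1/2 W D^-1/2, shifted to (M + I) / 2
    # so that all eigenvalues are non-negative and power iteration is stable.
    m = [[(w[i][j] / math.sqrt(deg[i] * deg[j]) + (1.0 if i == j else 0.0)) / 2
          for j in range(n)] for i in range(n)]
    # The leading eigenvector is known in closed form: sqrt(deg), normalized.
    norm = math.sqrt(sum(deg))
    top = [math.sqrt(d) / norm for d in deg]
    v = [float(i) for i in range(n)]  # deterministic starting vector
    for _ in range(iters):
        # Deflate the leading eigenvector, then apply the matrix and renormalize.
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]
        v = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # The sign pattern of the second eigenvector gives the two groups.
    return [0 if a < 0 else 1 for a in v]

# Toy data (illustrative only): two tight groups on a line.
labels = spectral_bipartition([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
```

Power iteration is used only to keep the sketch dependency-free; a linear algebra library would normally compute the eigenvectors directly.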

Page 7: Clustering Aggregation

Introduction – Cluster Ensembles

Since no single clustering algorithm is agreed to be superior on every data set, a common practice is to construct several cluster solutions and to aggregate them.

We use the consensus function approach to combine the resulting partitions into a new one, in order to increase the robustness of the clustering process.

[Figure labels: EM, Spectral, K-Means]

Page 8: Clustering Aggregation

Consensus

Algorithms that solve the cluster ensemble problem are also known as consensus functions; most of them rely on graph theory.


Page 9: Clustering Aggregation

Consensus II

Cluster-based Similarity Partitioning Algorithm (CSPA): simple; considered the brute-force approach.

Hyper Graph Partitioning Algorithm (HGPA): balanced; not always optimal.

Meta-CLustering Algorithm (MCLA): high-end solution; yields robust results.
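To make the "brute-force" remark about CSPA concrete, here is a small sketch of the pairwise similarity it works from, as described in reference [2]: the fraction of clusterings in which two objects share a cluster. The function name `co_association` and the toy labelings are assumptions; the graph-partitioning step that CSPA then applies to this matrix is not shown.

```python
def co_association(labelings):
    """Entry [i][j] is the fraction of labelings that put objects i and j together."""
    r = len(labelings)
    n = len(labelings[0])
    return [
        [sum(lab[i] == lab[j] for lab in labelings) / r for j in range(n)]
        for i in range(n)
    ]

# Three hypothetical clusterings of four objects (label values are arbitrary).
labelings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
similarity = co_association(labelings)
```

The matrix has one entry per pair of objects, so its size grows quadratically with the data set, which is the cost the slide alludes to.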


Page 10: Clustering Aggregation

Consensus III

A criterion by which to choose a specific consensus function is ANMI. ANMI (Average Normalized Mutual Information) is defined as the average NMI between the final clustering and each of the individual clustering solutions.

Mutual information is bounded by the entropies, I(X,Y) ≤ min(H(X), H(Y)), so normalizing by the entropies yields a score between 0 and 1.
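A short sketch of how NMI and ANMI can be computed from label lists, assuming the normalization of reference [2], NMI = I(X,Y) / sqrt(H(X) H(Y)). The function names and the choice to return 0 when one labeling has zero entropy are assumptions made for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(X) of a labeling, in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information I(X,Y) between two labelings of the same objects."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (nij / n) * math.log(nij * n / (count_a[x] * count_b[y]))
        for (x, y), nij in joint.items()
    )

def nmi(a, b):
    """I(X,Y) normalized by the geometric mean of the two entropies."""
    ha, hb = entropy(a), entropy(b)
    if ha == 0 or hb == 0:
        return 0.0  # convention chosen here for single-cluster labelings
    return mutual_information(a, b) / math.sqrt(ha * hb)

def anmi(candidate, ensemble):
    """Average NMI between a candidate clustering and every ensemble member."""
    return sum(nmi(candidate, member) for member in ensemble) / len(ensemble)
```

NMI ignores the label names themselves: `nmi([0, 0, 1, 1], [1, 1, 0, 0])` is 1 up to rounding, because the two labelings describe the same partition.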


Page 11: Clustering Aggregation

HGPA – Steps

1. Create a hyper-graph (hyper-edges are the clusters from all clusterings).
2. Repeat K-1 times: obtain a sub-set (cluster) by min-cutting the hyper-graph while maintaining a vertex imbalance of at most 5%.
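The two steps above can be sketched on a toy ensemble. This is an illustration under assumptions, not the project's implementation: the exhaustive search over exactly balanced splits stands in for a real hyper-graph partitioner and for the 5% imbalance tolerance, and it is only feasible for a handful of objects. All function names are hypothetical.

```python
from itertools import combinations

def hyperedges(labelings):
    """Step 1: one hyper-edge (set of object indices) per cluster of every labeling."""
    edges = []
    for labels in labelings:
        for c in sorted(set(labels)):
            edges.append(frozenset(i for i, lab in enumerate(labels) if lab == c))
    return edges

def cut_size(edges, side):
    """Number of hyper-edges that have objects on both sides of the split."""
    return sum(1 for e in edges if 0 < len(e & side) < len(e))

def best_bisection(edges, n):
    """Step 2, once: the balanced split that cuts the fewest hyper-edges (brute force)."""
    sides = (frozenset(c) for c in combinations(range(n), n // 2))
    best = min(sides, key=lambda s: cut_size(edges, s))
    return best, cut_size(edges, best)

# Three hypothetical clusterings of six objects.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
edges = hyperedges(labelings)
side, cut = best_bisection(edges, 6)
```

For K clusters the split would be repeated K-1 times on the resulting parts, as the slide states.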

Page 12: Clustering Aggregation

MCLA – Steps

1. Create hyper-graph G.
2. Expand hyper-edges (create a meta-graph from G).
3. Collapse the meta-graph (cluster the meta-graph into K clusters).
4. Compete for objects.
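A partial sketch of these steps, assuming the formulation in reference [2]: meta-graph edges are weighted by the Jaccard similarity between hyper-edges, and each object finally joins the meta-cluster in which it has the highest average membership. The grouping of hyper-edges into meta-clusters (step 3) is passed in as `meta_labels`, a hypothetical input, because the graph-partitioning step is not reproduced here; ties go to the lowest index in this sketch.

```python
def jaccard(a, b):
    """Jaccard similarity between two non-empty hyper-edges."""
    return len(a & b) / len(a | b)

def meta_graph(edges):
    """Step 2: pairwise Jaccard weights between all hyper-edges."""
    return [[jaccard(a, b) for b in edges] for a in edges]

def compete_for_objects(edges, meta_labels, n_objects):
    """Step 4: assign each object to the meta-cluster with highest association."""
    k = max(meta_labels) + 1
    assoc = [[0.0] * n_objects for _ in range(k)]
    counts = [0] * k
    for edge, m in zip(edges, meta_labels):
        counts[m] += 1
        for i in edge:
            assoc[m][i] += 1
    def score(m, i):
        # Average membership of object i over the hyper-edges in meta-cluster m.
        return assoc[m][i] / counts[m] if counts[m] else 0.0
    return [max(range(k), key=lambda m: score(m, i)) for i in range(n_objects)]

# Hyper-edges from two hypothetical clusterings of six objects.
edges = [frozenset({0, 1, 2}), frozenset({3, 4, 5}),
         frozenset({0, 1}), frozenset({2, 3, 4, 5})]
weights = meta_graph(edges)
# Assumed outcome of step 3: edges 0 and 2 form one meta-cluster, 1 and 3 the other.
labels = compete_for_objects(edges, meta_labels=[0, 1, 0, 1], n_objects=6)
```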

Page 13: Clustering Aggregation

Volkovich-Yahalom - Purpose

Use various partitions of the same data set in order to define a new metric on the data set.

Using the new metric as an enhanced input for a clustering algorithm will produce better and more robust partitions.

This process can be applied repeatedly, where in each step the metric is updated using the original data as well as the new cluster partition.


Page 14: Clustering Aggregation

Volkovich-Yahalom – Steps

Given m partitions and the original dataset, returns a new clustering.

1. For each partition, calculate the new metric learned from its clustering results.
2. Combine the learned metrics by statistical means to complete R.
3. Calculate S as the square root of R.
4. Cluster the original data using the new metric S and k, the desired number of clusters.

Page 15: Clustering Aggregation

Main Algorithm - Steps

1. Produce r individual spectral partitions.
2. Use MCLA to obtain Sc_MCLA(x).
3. Use HGPA to obtain Sc_HGPA(x).
4. Use Volkovich-Yahalom to obtain Sc_VYCAA(x).
5. By the ANMI criterion, choose the final decision Sc*(x) from Sc_MCLA(x), Sc_HGPA(x) and Sc_VYCAA(x).
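The final selection step can be sketched as follows, assuming ANMI is the mean NMI against the r individual partitions and the candidate with the highest ANMI wins. The function `select_by_anmi` is hypothetical, and the labelings below are made-up placeholders that only exercise the code; they are not outputs of MCLA, HGPA or VYCAA.

```python
import math
from collections import Counter

def nmi(a, b):
    """Normalized mutual information, I(X,Y) / sqrt(H(X) H(Y))."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    def h(counts):
        return -sum(v / n * math.log(v / n) for v in counts.values())
    mi = sum(v / n * math.log(v * n / (ca[x] * cb[y])) for (x, y), v in cab.items())
    denom = math.sqrt(h(ca) * h(cb))
    return mi / denom if denom > 0 else 0.0

def select_by_anmi(candidates, ensemble):
    """Return the name of the candidate with the highest ANMI, plus all scores."""
    scores = {
        name: sum(nmi(labels, member) for member in ensemble) / len(ensemble)
        for name, labels in candidates.items()
    }
    return max(scores, key=scores.get), scores

# Placeholder data: r = 3 individual partitions and three candidate consensus labelings.
ensemble = [[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1], [1, 1, 1, 0, 0, 0]]
candidates = {
    "MCLA": [0, 0, 0, 1, 1, 1],
    "HGPA": [0, 0, 1, 1, 1, 0],
    "VYCAA": [0, 1, 0, 1, 0, 1],
}
best, scores = select_by_anmi(candidates, ensemble)
```

Because ANMI needs only the partitions themselves, the comparison works without ground-truth labels.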


Page 16: Clustering Aggregation

GUI – Main window

[Screenshot: GUI 1]

Page 17: Clustering Aggregation

GUI – Results window

[Screenshot: GUI 2]

Page 18: Clustering Aggregation

SE Documents

[Figure: SE documents]

Page 19: Clustering Aggregation

SE Documents

[Figure: SE documents]

Page 20: Clustering Aggregation

SE Documents

[Figure: SE documents]

Page 21: Clustering Aggregation

Test Plan

Unit tests are our first line in the test plan (Test Driven Development).
Coding conventions.
Lint the code for errors such as dead code or uninitialized pointers.
Usage and code coverage tests.


Page 22: Clustering Aggregation

References

[1] Z. Volkovich, O. Yahalom, "Clustering aggregation via the self-learning approach", work in preparation, 2010-2012.
[2] A. Strehl, J. Ghosh, "Cluster Ensembles – A Knowledge Reuse Framework for Combining Multiple Partitions", Journal of Machine Learning Research 3 (2002) 583-617.
[3] Y. Ng, M. Jordan, Y. Weiss, "On spectral clustering: analysis and an algorithm", 2002, Advances in Neural Information Processing Systems 14: proceedings of the 2002 conference, Sec. 2, 849.
[4] X. Ma, W. Wan, L. Jiao, "Spectral Clustering Ensemble for Image Segmentation", 2009, GEC '09 Proceedings of the first ACM/SIGEVO Summit on Genetic and Evolutionary Computation.
[5] I. S. Dhillon, Y. Guan, B. Kulis, "Kernel k-means, Spectral Clustering and Normalized Cuts", 2004, http://www.cs.utexas.edu/~kulis/pubs/spectral_kdd.pdf
[6] E. David, "Spectral Clustering", 2008, Image Processing Seminar.


Page 23: Clustering Aggregation

THE END!

Thank you for listening!