marilynn-garrett
8 Unsupervised method
G201449015
0. unsupervised method
cluster analysis: finds groups in your data with similar characteristics (hierarchical clustering, k-means)
association rules: finds elements or properties in the data that tend to occur together
1. cluster analysis
Methods: hierarchical clustering, k-means
1.1 hierarchical clustering
dist(), hclust()
dist(x, method, diag, upper)
x : data (numeric matrix)
method : euclidean, maximum, manhattan, binary, minkowski
diag : T/F
upper : T/F
hclust(d, method)
d : distance data
method : ward.D, ward.D2, single, complete, average, median, centroid
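The two calls can be tried together as a minimal sketch. It assumes the built-in iris measurements as stand-in data (the slides' own dataset is not shown), and the variable names x, d and fit are illustrative only.

```r
# Stand-in data (assumption): the four numeric columns of the built-in
# iris dataset, scaled so that no single variable dominates the distances.
x <- scale(iris[, 1:4])

# Pairwise Euclidean distances between the rows.
d <- dist(x, method = "euclidean")

# Agglomerative hierarchical clustering with Ward's method.
fit <- hclust(d, method = "ward.D")

# fit$merge records the n - 1 merge steps; plot(fit) would draw the dendrogram.
print(fit)
```

Other values of method (single, complete, average, ...) can be swapped in to compare the resulting trees.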
cutree() : cuts the tree into groups.
cutree(tree, k, h)
tree : hclust tree data
k : number of groups
h : height at which to cut the tree
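A short sketch of cutting a tree both ways, by number of groups and by height. The iris data and the choice of k = 3 and of half the maximum merge height are assumptions made for illustration.

```r
# Stand-in data (assumption): scaled iris measurements.
x   <- scale(iris[, 1:4])
fit <- hclust(dist(x), method = "ward.D")

# Cut by number of groups: one cluster label per row.
groups_k <- cutree(fit, k = 3)
print(table(groups_k))

# Cut by height instead; half of the largest merge height is an
# arbitrary illustrative threshold.
groups_h <- cutree(fit, h = max(fit$height) / 2)
print(table(groups_h))
```

Only one of k or h is normally supplied; k fixes the number of groups, while h lets the tree decide how many groups exist below that height.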
visualizing clusters in 2 dimensions
Project the data onto the first 2 principal components (PCA) and plot them.
prcomp() : performs principal components analysis.
prcomp(x)
x : data
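A possible sketch of the 2-dimensional projection, assuming iris as stand-in data and three clusters from hclust(); the names pca and proj are illustrative.

```r
# Stand-in data and clustering (assumptions for illustration).
x      <- scale(iris[, 1:4])
groups <- cutree(hclust(dist(x), method = "ward.D"), k = 3)

# Principal components analysis; pca$x holds the rotated coordinates.
pca  <- prcomp(x)
proj <- pca$x[, 1:2]   # keep the first two principal components

# Scatter plot coloured by cluster (drawn only in an interactive session).
if (interactive()) {
  plot(proj, col = groups, pch = 19, xlab = "PC1", ylab = "PC2")
}
print(head(proj))
```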
bootstrap evaluation of clusters : clusterboot()
library(fpc)
clusterboot(data, clustermethod)
data : data matrix
clustermethod : clustering method
kmeansCBI(data, k) : kmeans clustering
hclustCBI(data, k, method) : agglomerative hierarchical clustering
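A hedged sketch of a clusterboot() run with the hierarchical interface, assuming the fpc package is installed. The iris data, B = 20 resamplings and k = 3 are illustrative choices, not values from the slides.

```r
library(fpc)

# Stand-in data (assumption): scaled iris measurements.
x <- scale(iris[, 1:4])

# Bootstrap evaluation of a Ward hierarchical clustering with 3 clusters.
# B is kept small here so the sketch runs quickly.
cb <- clusterboot(x, B = 20, clustermethod = hclustCBI,
                  method = "ward.D", k = 3,
                  count = FALSE, seed = 1)

groups <- cb$result$partition  # cluster label per row
print(cb$bootmean)             # mean Jaccard similarity per cluster
print(cb$bootbrd)              # how often each cluster dissolved
```

Higher bootmean values indicate clusters that reappear more consistently across resamplings.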
Each original cluster is compared with the clusters found in each resampling.
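Calinski-Harabasz index : ratio of between-cluster variance to within-cluster variance.
W = WSS(k) / (n-k)
B = BSS(k) / (k-1)
ch = B / W; choose the k that maximizes ch.
good cluster : small WSS(k), large BSS(k)
WSS (total within sum of squares) : sum of squared distances from each point to its cluster centroid
BSS (between sum of squares) : TSS - WSS(k)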
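The index can be computed by hand in base R. The sketch below follows the W, B and ch definitions, assuming iris as stand-in data and a Ward tree; the helper names sq_dist_to_centroid and ch_index are hypothetical.

```r
# Stand-in data and tree (assumptions for illustration).
x   <- scale(iris[, 1:4])
fit <- hclust(dist(x), method = "ward.D")

# Sum of squared distances from the rows of m to their centroid.
sq_dist_to_centroid <- function(m) sum(sweep(m, 2, colMeans(m))^2)

# ch = (BSS / (k - 1)) / (WSS / (n - k)), with BSS = TSS - WSS.
ch_index <- function(x, groups) {
  n   <- nrow(x)
  k   <- length(unique(groups))
  tss <- sq_dist_to_centroid(x)
  wss <- sum(sapply(split(seq_len(n), groups), function(idx) {
    sq_dist_to_centroid(x[idx, , drop = FALSE])
  }))
  bss <- tss - wss
  (bss / (k - 1)) / (wss / (n - k))
}

# Evaluate the index for several cuts of the same tree.
ks <- 2:6
ch <- sapply(ks, function(k) ch_index(x, cutree(fit, k = k)))
print(data.frame(k = ks, ch = ch))
```

The k with the largest ch in the printed table would be the candidate under this criterion.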
1.2 k-means algorithm
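Partitions the data into K clusters.
kmeans()
kmeans(x, centers, iter.max, nstart, algorithm, trace)
x : data
centers : number of clusters (k)
iter.max : maximum number of iterations
nstart : number of random sets
algorithm : Hartigan-Wong, Lloyd, Forgy, MacQueen
trace : T/F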
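A minimal kmeans() call using the arguments listed, assuming iris as stand-in data; centers = 3 and the other settings are illustrative values.

```r
# Stand-in data (assumption): scaled iris measurements.
x <- scale(iris[, 1:4])

# Fix the random seed so the random starting centers are reproducible.
set.seed(1)

km <- kmeans(x, centers = 3, iter.max = 100, nstart = 10,
             algorithm = "Hartigan-Wong")

print(km$size)     # number of points in each cluster
print(km$centers)  # cluster centroids (in scaled units)
groups <- km$cluster
```

Using nstart greater than 1 reruns the algorithm from several random starts and keeps the best result, which makes the outcome less sensitive to a poor initial choice of centers.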
bootstrap evaluation of clusters : clusterboot()
library(fpc)
clusterboot(data, clustermethod)
data : data matrix
clustermethod : clustering method
kmeansCBI(data, k) : kmeans clustering
hclustCBI(data, k, method) : agglomerative hierarchical clustering
choosing k : kmeansruns()
kmeansruns(data, krange, criterion)
data : data
krange : range of k values to try
criterion :
ch : Calinski-Harabasz Index
asw : average silhouette width
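a = mean distance from a point to the other points in its own cluster
b = mean distance from the point to the points in the nearest other cluster
silhouette width = 1 - a/b if a < b, b/a - 1 if a > b, 0 if a = b; asw is the average over all points
2. Association rules
confidence(X => Y) = support(union(X, Y)) / support(X)
support(X) = (number of transactions containing X) / T, where T is the total number of transactions
10%, 60%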
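Support and confidence can be checked by hand on a toy example. The baskets below are invented for illustration, and the helper names support and confidence are hypothetical base R functions, not part of any package.

```r
# Toy transactions (invented for illustration).
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# support(X): share of transactions that contain every item in X.
support <- function(items, transactions) {
  mean(sapply(transactions, function(t) all(items %in% t)))
}

# confidence(X => Y) = support(union(X, Y)) / support(X).
confidence <- function(lhs, rhs, transactions) {
  support(union(lhs, rhs), transactions) / support(lhs, transactions)
}

print(support("bread", transactions))             # 4 of 5 baskets
print(support(c("bread", "milk"), transactions))  # 3 of 5 baskets
print(confidence("bread", "milk", transactions))  # (3/5) / (4/5)
```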
apriori(data, parameter)
data : data
parameter : support, confidence
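A hedged sketch of an apriori() run, assuming the arules package is installed. The toy baskets and the thresholds (support 0.3, confidence 0.6) are invented for illustration.

```r
library(arules)

# Toy baskets (invented for illustration), converted to transactions.
baskets <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)
trans <- as(baskets, "transactions")

# Mine rules with minimum support and confidence; minlen = 2 skips
# rules with an empty left-hand side.
rules <- apriori(trans,
                 parameter = list(support = 0.3, confidence = 0.6,
                                  minlen = 2),
                 control = list(verbose = FALSE))

# Show the rules (lhs, rhs, support, confidence, lift), highest lift first.
inspect(sort(rules, by = "lift"))
```

Lowering support or confidence returns more rules; raising them keeps only the stronger ones.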
inspect(x)
sort rules by lift; lhs, rhs : left-hand side and right-hand side of each rule
apriori(data, parameter, appearance)
data : data
parameter : support, confidence
appearance : restricts which items may appear in the lhs or rhs
inspect(x)