Published by Rina Anthony. Modified 9 years ago.
1
Clustering
2
Definition: Clustering is “the process of organizing objects into groups whose members are similar in some way”. A cluster is therefore a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters.
3
Clustering is the grouping of records, observations, or cases into classes of objects that are similar to one another. Clustering algorithms include EM (expectation-maximization) and Fuzzy C-Means.
4
Clustering: Main Features
Clustering is a data mining technique.
Usage:
–Statistical Data Analysis
–Machine Learning
–Data Mining
–Pattern Recognition
–Image Analysis
–Bioinformatics
5
The Notion of a Cluster Can Be Ambiguous
How many clusters?
[Figure panels: Four Clusters; Two Clusters; Six Clusters]
6
Distance-Based Method
In this case we can easily identify the 4 clusters into which the data can be divided. The similarity criterion is distance: two or more objects belong to the same cluster if they are “close” according to a given distance measure. This is called distance-based clustering.
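To make the idea of "closeness" concrete, here is a minimal Python sketch. It assumes Euclidean distance, which the slide does not specify, and the names `euclidean`, `nearest_center`, and the sample `centers` are hypothetical, chosen only for illustration.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension
    (an assumed distance measure; the slide only says 'a given distance')."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest_center(point, centers):
    """Index of the center that is closest to `point`."""
    return min(range(len(centers)), key=lambda i: euclidean(point, centers[i]))

# Two hypothetical cluster centers.
centers = [(0.0, 0.0), (10.0, 10.0)]

print(euclidean((0, 0), (3, 4)))        # 5.0
print(nearest_center((1, 2), centers))  # 0
print(nearest_center((9, 8), centers))  # 1
```

Any other distance function could be swapped in for `euclidean` without changing `nearest_center`.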
7
Limitations of K-means: Non-globular Shapes
[Figure panels: Original Points; K-means (2 Clusters)]
8
Limitations of K-means: Differing Sizes
[Figure panels: Original Points; K-means (3 Clusters)]
9
Types of Clustering
–Hierarchical: finding new clusters using previously found ones
–Partitional: finding all clusters at once
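The hierarchical idea of building new clusters from previously found ones can be sketched as a simple agglomerative procedure: start with one cluster per point and repeatedly merge the two closest clusters. The sketch below is only an illustration; it assumes Euclidean distance and single linkage (cluster distance = distance between the closest pair of members), neither of which is stated in the slides, and the function names are hypothetical.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points (assumed distance measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def single_link(a, b):
    """Distance between clusters a and b: their closest pair of members."""
    return min(euclidean(p, q) for p in a for q in b)

def agglomerate(points, target):
    """Merge the two closest clusters until only `target` clusters remain."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > target:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # new cluster built from two old ones
        del clusters[j]                          # j > i, so index i stays valid
    return clusters

points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(agglomerate(points, 3))
# [[(0, 0), (0, 1)], [(5, 5), (5, 6)], [(10, 0)]]
```

Recording the order of the merges, rather than stopping at a target count, would give the information a dendrogram displays.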
10
Partitional Clustering
[Figure panels: Original Points; A Partitional Clustering]
11
Hierarchical Clustering
[Figure panels: Traditional Hierarchical Clustering; Non-traditional Hierarchical Clustering; Traditional Dendrogram; Non-traditional Dendrogram]
12
K-Means Clustering Algorithm
Steps of the K-Means algorithm:
1. Decide how many clusters to form: k clusters.
2. Arbitrarily choose k of the existing records as the initial cluster centers.
3. For each record, determine its nearest cluster center.
4. Update the cluster centers.
5. Determine the nearest cluster center for each record again, and repeat until the ratio value no longer increases.
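The steps above can be sketched in plain Python as follows. This is a minimal illustration, not a definitive implementation, and it rests on several assumptions: Euclidean distance is used, centers are updated to the mean of their assigned records, and the loop stops when the centers no longer move (a common substitute for the ratio-based stopping rule in step 5). All function names and the sample `records` are hypothetical.

```python
import math
import random

def euclidean(p, q):
    """Straight-line distance between two points (assumed distance measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def assign(records, centers):
    """Step 3: index of the nearest center for every record."""
    return [min(range(len(centers)), key=lambda i: euclidean(r, centers[i]))
            for r in records]

def update(records, labels, centers):
    """Step 4: move each center to the mean of the records assigned to it."""
    new_centers = []
    for i, old in enumerate(centers):
        members = [r for r, lab in zip(records, labels) if lab == i]
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(old)  # an empty cluster keeps its previous center
    return new_centers

def kmeans(records, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(records, k)   # step 2: arbitrary initial centers
    for _ in range(max_iter):          # step 5: repeat steps 3 and 4
        labels = assign(records, centers)
        new_centers = update(records, labels, centers)
        if new_centers == centers:     # assumed stopping rule: centers unchanged
            break
        centers = new_centers
    return centers, assign(records, centers)

records = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, labels = kmeans(records, k=2)   # step 1: k = 2
print(centers)
print(labels)
```

Which cluster receives index 0 depends on the random initial centers, so only the grouping itself, not the label numbers, should be relied on.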
13
Formulas
Distance between two points: [formula not preserved in the transcript]
Between Cluster Variation (BCV): $\mathrm{BCV} = d(m_1, m_2) + d(m_1, m_3) + d(m_2, m_3)$
Here, $d(m_i, m_j)$ denotes the distance from $m_i$ to $m_j$.
Within Cluster Variation (WCV): $\mathrm{WCV} = \sum (\text{minimum distance to a cluster center})^2$
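As a rough illustration of these two quantities, the sketch below computes BCV as the sum of distances between every pair of cluster centers, which reduces to the three-term formula when there are three centers. It computes WCV as the sum, over all records, of the squared distance to the nearest center. Both the Euclidean distance and the summation over all records are assumptions, and the data and function names are hypothetical.

```python
import math
from itertools import combinations

def euclidean(p, q):
    """Straight-line distance between two points (assumed distance measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def bcv(centers):
    """Between cluster variation: sum of distances over all pairs of centers."""
    return sum(euclidean(a, b) for a, b in combinations(centers, 2))

def wcv(records, centers):
    """Within cluster variation: squared distance from each record to its
    nearest center, summed over all records (assumed reading of the slide)."""
    return sum(min(euclidean(r, c) for c in centers) ** 2 for r in records)

records = [(1, 1), (1, 3), (7, 1), (7, 3)]
centers = [(1, 2), (7, 2)]

print(bcv(centers))                          # 6.0
print(wcv(records, centers))                 # 4.0
print(bcv(centers) / wcv(records, centers))  # 1.5
```

A larger BCV/WCV value indicates centers that are far apart relative to how tightly the records sit around them, which is presumably the ratio the K-Means stopping step monitors.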