
# Clustering

## Presentation transcript

Clustering

Definition: Clustering is “the process of organizing objects into groups whose members are similar in some way”. A cluster is therefore a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters.

Clustering is the grouping of records, observations, or cases into classes of objects that are similar to one another. Clustering algorithms include EM and Fuzzy C-Means.

Clustering Main Features

Clustering is a data mining technique.

Usage:
- Statistical Data Analysis
- Machine Learning
- Data Mining
- Pattern Recognition
- Image Analysis
- Bioinformatics

The Notion of a Cluster Can Be Ambiguous: how many clusters? (Figure panels: Two Clusters, Four Clusters, Six Clusters.)

Distance-based method: In this case we can easily identify the 4 clusters into which the data can be divided. The similarity criterion is distance: two or more objects belong to the same cluster if they are “close” according to a given distance. This is called distance-based clustering.
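To make "close according to a given distance" concrete, here is a minimal sketch. It assumes Euclidean distance, which the slide does not specify, and the helper names `euclidean` and `nearest_group` as well as the four representative points are hypothetical, chosen only for illustration.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest_group(point, representatives):
    """Index of the representative point that is closest to `point`."""
    return min(range(len(representatives)),
               key=lambda i: euclidean(point, representatives[i]))

# Four made-up representative points, one per group (not taken from the slides).
representatives = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

print(nearest_group((1.0, 2.0), representatives))  # group of the point nearest (0, 0)
print(nearest_group((9.0, 8.5), representatives))  # group of the point nearest (10, 10)
```

Any other distance function could be swapped in for `euclidean`; the grouping rule itself stays the same.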

Limitations of K-means: Non-globular Shapes. (Figure panels: Original Points; K-means (2 Clusters).)

Limitations of K-means: Differing Sizes. (Figure panels: Original Points; K-means (3 Clusters).)

Types of Clustering

- Hierarchical: finding new clusters using previously found ones
- Partitional: finding all clusters at once
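The phrase "finding new clusters using previously found ones" can be illustrated with a small bottom-up (agglomerative) sketch. The slides do not name a specific method; single linkage with Euclidean distance is an assumption made here, and the function names and sample points are hypothetical. It relies on `math.dist`, available in Python 3.8 and later.

```python
import math

def single_link(a, b):
    """Smallest pairwise distance between two clusters (single linkage, an assumed choice)."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, target):
    """Start with one cluster per point and merge the two closest
    clusters repeatedly until only `target` clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > target:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # new cluster built from two earlier ones
        del clusters[j]
    return clusters

# Made-up sample points, not taken from the slides.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(agglomerate(points, 3))
print(agglomerate(points, 2))
```

A partitional method, by contrast, fixes the number of clusters up front and produces all of them in one pass of optimization.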

Partitional Clustering. (Figure panels: Original Points; A Partitional Clustering.)

Hierarchical Clustering. (Figure panels: Traditional Hierarchical Clustering; Traditional Dendrogram; Non-traditional Hierarchical Clustering; Non-traditional Dendrogram.)

K-Means Clustering Algorithm

Steps of the K-Means algorithm:

1. Decide how many clusters to form; call this number k.
2. Arbitrarily pick k of the existing records as the initial cluster centers.
3. For each record, determine its nearest cluster center.
4. Update the cluster centers.
5. Determine the nearest cluster center for each record again, and repeat until the ratio value no longer increases.
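The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presenter's implementation: it assumes Euclidean distance, updates each center to the mean of its assigned records, and stops when the assignments no longer change, which is a common stopping rule swapped in for the ratio criterion of step 5. The function names and the eight sample records are hypothetical. It relies on `math.dist` (Python 3.8 and later).

```python
import math

def assign(records, centers):
    """Step 3: index of the nearest center for every record."""
    return [min(range(len(centers)), key=lambda c: math.dist(r, centers[c]))
            for r in records]

def update(records, labels, centers):
    """Step 4: move each center to the mean of the records assigned to it."""
    new_centers = []
    for c, old in enumerate(centers):
        members = [r for r, lab in zip(records, labels) if lab == c]
        if not members:  # an empty cluster keeps its previous center
            new_centers.append(old)
            continue
        new_centers.append(tuple(sum(m[d] for m in members) / len(members)
                                 for d in range(len(old))))
    return new_centers

def k_means(records, initial_centers, max_iter=100):
    """Steps 2 to 5; stops when assignments are stable (assumed stopping rule)."""
    centers = list(initial_centers)
    labels = assign(records, centers)
    for _ in range(max_iter):
        centers = update(records, labels, centers)
        new_labels = assign(records, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels

# Made-up records; k = 2, with two of the records chosen as initial centers.
records = [(1, 3), (3, 3), (4, 3), (5, 3), (1, 2), (4, 2), (1, 1), (2, 1)]
centers, labels = k_means(records, initial_centers=[(1, 1), (2, 1)])
print(centers)
print(labels)
```

Passing the initial centers explicitly keeps the run reproducible; in practice step 2 would pick them at random, and different picks can lead to different final clusters.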

Formula for the distance between two points: (formula not captured in the transcript)

Between Cluster Variation (BCV):

$$\mathrm{BCV} = d(m_1, m_2) + d(m_1, m_3) + d(m_2, m_3)$$

Here $d(m_i, m_j)$ denotes the distance from $m_i$ to $m_j$.

Within Cluster Variation (WCV):

$$\mathrm{WCV} = \sum (\text{minimum distance to a cluster center})^2$$
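A small worked computation of the two quantities may help. This sketch rests on assumptions that are not in the slides: distance is taken to be Euclidean, BCV is generalized to the sum of distances over all pairs of centers (which matches the three-center formula), WCV sums the squared distance from each record to its nearest center, and the "ratio" mentioned in the algorithm steps is read as BCV/WCV. The sample centers and records are made up.

```python
import math
from itertools import combinations

def bcv(centers):
    """Sum of distances between every pair of cluster centers."""
    return sum(math.dist(a, b) for a, b in combinations(centers, 2))

def wcv(records, centers):
    """Sum over records of the squared distance to the nearest center."""
    return sum(min(math.dist(r, c) for c in centers) ** 2 for r in records)

# Made-up values: three centers m1, m2, m3 and three records.
centers = [(0, 0), (3, 4), (0, 8)]
records = [(0, 1), (3, 4), (0, 6)]

ratio = bcv(centers) / wcv(records, centers)  # assumed meaning of "the ratio"
print(bcv(centers), wcv(records, centers), ratio)
```

Under this reading, a larger ratio means centers that are far apart relative to how tightly records sit around them, so iterating until the ratio stops increasing amounts to stopping once the clustering no longer improves.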
