
1. Basic Concepts 2. Statistics in Cluster Analysis 3. Steps in Cluster Analysis a. Formulate the Problem b. Select a Distance or Similarity Measure c. Select.



2 Topics
1. Basic Concepts
2. Statistics in Cluster Analysis
3. Steps in Cluster Analysis
   a. Formulate the problem
   b. Select a distance or similarity measure
   c. Select a clustering procedure
   d. Decide on the number of clusters
   e. Interpret and profile the clusters
   f. Assess reliability and validity

3 Basic Concepts
Cluster analysis is a technique for grouping objects or cases into relatively homogeneous groups called clusters.
Cluster analysis is also often called:
- Classification analysis
- Numerical taxonomy
Clustering in practice often differs from the ideal clustering.
Differences between discriminant analysis and cluster analysis:

4 Ideal Clustering Situation
[Figure: scatter plot with axes Variable 1 and Variable 2]

5 Clustering Situation in Practice
[Figure: scatter plot with axes Variable 1 and Variable 2]

6 Uses of Cluster Analysis
Examples:
- Market segmentation
- Understanding buyer behavior
- Identifying new product opportunities
- Selecting test markets
- Reducing data

7 Statistics in Cluster Analysis
- Agglomeration schedule
- Cluster centroid
- Cluster centers
- Cluster membership
- Dendrogram
- Distances between cluster centers
- Icicle diagram

8 Steps in Cluster Analysis
1. Formulate the problem
2. Select a distance or similarity measure
3. Select a clustering procedure
4. Decide on the number of clusters
5. Interpret and profile the clusters
6. Assess reliability and validity

9 Formulate the Problem
Example: group consumers according to their attitudes toward shopping. Based on previous research, six attitude variables were identified. Consumers were asked to state their degree of agreement with the following statements on a seven-point scale:
V1 = Shopping is fun.
V2 = Shopping is bad for your budget.
V3 = I combine shopping with eating out.
V4 = I try to get the best buys while shopping.
V5 = I don't care about shopping.
V6 = You can save a lot of money by comparing prices.
The data obtained from 20 respondents are as follows:

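10 Raw Data
Case No.  V1  V2  V3  V4  V5  V6
 1         6   4   7   3   2   3
 2         2   3   1   4   5   4
 3         7   2   6   4   1   3
 4         4   6   4   5   3   6
 5         1   3   2   2   6   4
 6         6   4   6   3   3   4
 7         5   3   6   3   3   4
 8         7   3   7   4   1   4
 9         2   4   3   3   6   3
10         3   5   3   6   4   6
11         1   3   2   3   5   3
12         5   4   5   4   2   4
13         2   2   1   5   4   4
14         4   6   4   6   4   7
15         6   5   4   2   1   4
16         3   5   4   6   4   7
17         4   4   7   2   2   5
18         3   7   2   6   4   3
19         4   6   3   7   2   7
20         2   3   2   4   7   2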

11 Select a Distance or Similarity Measure
Because the objective of clustering is to group similar objects together, some measure is needed to assess how different or similar the objects are. Commonly used measures are:
- Euclidean distance: the square root of the sum of the squared differences in values for each variable.
- City-block (Manhattan) distance: the sum of the absolute differences in values for each variable.
- Chebychev distance: the maximum absolute difference in values on any variable.
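The three measures can be illustrated with a minimal Python sketch. The two rating vectors are respondents 1 and 2 from the raw data slide; the function names are illustrative and do not come from the presentation.

```python
import math

# Ratings on V1..V6 for respondents 1 and 2, copied from the raw data slide
case_1 = [6, 4, 7, 3, 2, 3]
case_2 = [2, 3, 1, 4, 5, 4]

def euclidean(a, b):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def city_block(a, b):
    """Sum of absolute differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    """Largest absolute difference on any single variable."""
    return max(abs(x - y) for x, y in zip(a, b))

print(euclidean(case_1, case_2))   # 8.0
print(city_block(case_1, case_2))  # 16
print(chebychev(case_1, case_2))   # 6
```

The three measures rank pairs of cases differently in general, which is one reason the choice of measure can change the resulting clusters.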

12 Classification of Clustering Procedures
Clustering Procedures
- Hierarchical
  - Agglomerative
    - Linkage methods: single, complete, average
    - Variance methods: Ward's method
    - Centroid methods
  - Divisive
- Nonhierarchical
  - Sequential threshold
  - Parallel threshold
  - Optimizing partitioning

13 Linkage Methods
- Single linkage: the minimum distance between Cluster 1 and Cluster 2
- Complete linkage: the maximum distance between Cluster 1 and Cluster 2
- Average linkage: the average distance between Cluster 1 and Cluster 2
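As a rough sketch, the three linkage rules differ only in how they summarize the distances between every pair of cases taken from the two clusters. The grouping below (cases 1 and 3 against cases 2 and 5 from the raw data) is chosen only for illustration, and Euclidean distance is an assumption.

```python
import math
from itertools import product

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Illustrative clusters built from raw-data rows (grouping is an assumption)
cluster_1 = [[6, 4, 7, 3, 2, 3], [7, 2, 6, 4, 1, 3]]  # cases 1 and 3
cluster_2 = [[2, 3, 1, 4, 5, 4], [1, 3, 2, 2, 6, 4]]  # cases 2 and 5

# Distance for every (case in cluster 1, case in cluster 2) pair
pairwise = [euclidean(a, b) for a, b in product(cluster_1, cluster_2)]

single_linkage = min(pairwise)                    # closest pair
complete_linkage = max(pairwise)                  # farthest pair
average_linkage = sum(pairwise) / len(pairwise)   # mean over all pairs

print(single_linkage, average_linkage, complete_linkage)
```

By construction the single-linkage distance is never larger than the average-linkage distance, which in turn is never larger than the complete-linkage distance.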

14 Other Agglomerative Clustering Methods
- Ward's procedure
- Centroid method

15 Hierarchical Clustering Output

16 Vertical Icicle Plot
[Figure: vertical icicle plot; axes: case number and number of clusters]

17 Dendrogram Using Ward's Method
[Figure: dendrogram; horizontal scale: Rescaled Distance Cluster Combine, 0 to 25; rows: case label and sequence number]

18 Cluster Membership
Number of members per cluster
Cluster  4 clusters  3 clusters  2 clusters
1        8           8           8
2        6           6           12
3        5           6
4        1

19 Decide on the Number of Clusters
Guidelines for deciding on the number of clusters:
- Theoretical, conceptual, or practical considerations may suggest a certain number of clusters.
- In hierarchical clustering, the distances at which clusters are combined can be used as criteria. This information can be obtained from the agglomeration schedule or from the dendrogram.
- In nonhierarchical clustering, the ratio of within-group variance to between-group variance can be plotted against the number of clusters. The point at which an elbow or a sharp bend occurs indicates an appropriate number of clusters.
- The relative sizes of the clusters should be meaningful. A simple frequency count of the Cluster Membership table shows that a three-cluster solution results in clusters with eight, six, and six elements. However, if we go to a four-cluster solution, the cluster sizes are eight, six, five, and one. It is not meaningful to have a cluster with only one case.
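The frequency count in the last guideline can be sketched in a few lines of Python. The three-cluster membership list (cases 1 to 20, in order) is copied from the cluster membership column of the Excel worksheet slide; the four-cluster sizes are the ones quoted in the guideline. `has_singleton` is a hypothetical helper name.

```python
from collections import Counter

# Three-cluster membership for cases 1..20, from the Excel worksheet slide
membership_3 = [1, 2, 1, 3, 2, 1, 1, 1, 2, 3,
                2, 1, 2, 3, 1, 3, 1, 3, 3, 2]

sizes_3 = Counter(membership_3)        # frequency count per cluster
sizes_4 = {1: 8, 2: 6, 3: 5, 4: 1}     # sizes quoted for the four-cluster solution

def has_singleton(sizes):
    """True when some cluster contains exactly one case."""
    return any(n == 1 for n in sizes.values())

print(dict(sizes_3))            # {1: 8, 2: 6, 3: 6}
print(has_singleton(sizes_3))   # False
print(has_singleton(sizes_4))   # True
```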

20 Cluster Centroids
Means per variable
Cluster No.  V1     V2     V3     V4     V5     V6
1            5.750  3.625  6.000  3.125  1.750  3.875
2            1.667  3.000  1.833  3.500  5.500  3.333
3            3.500  5.833  3.333  6.000  3.500  6.000
Cluster centroid values can be obtained from K-Means Cluster processing (see Final Cluster Centers).

21 Computing Cluster Centroids with MS Excel
Resp. No.   V1    V2    V3    V4    V5    V6    Cluster membership
1           6     4     7     3     2     3     1
3           7     2     6     4     1     3     1
6           6     4     6     3     3     4     1
7           5     3     6     3     3     4     1
8           7     3     7     4     1     4     1
12          5     4     5     4     2     4     1
15          6     5     4     2     1     4     1
17          4     4     7     2     2     5     1
Centroid 1  5.75  3.63  6     3.13  1.88  3.88
2           2     3     1     4     5     4     2
5           1     3     2     2     6     4     2
9           2     4     3     3     6     3     2
11          1     3     2     3     5     3     2
13          2     2     1     5     4     4     2
20          2     3     2     4     7     2     2
Centroid 2  1.67  3     1.83  3.5   5.5   3.33
4           4     6     4     5     3     6     3
10          3     5     3     6     4     6     3
14          4     6     4     6     4     7     3
16          3     5     4     6     4     7     3
18          3     7     2     6     4     3     3
19          4     6     3     7     2     7     3
Centroid 3  3.5   5.83  3.33  6     3.5   6
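The same column averages can be reproduced outside a spreadsheet. The sketch below assumes the raw data and the membership column exactly as transcribed on the slides; `centroid` is a hypothetical helper name.

```python
# case number -> ratings on V1..V6 (from the raw data slide)
data = {
    1: [6, 4, 7, 3, 2, 3],   2: [2, 3, 1, 4, 5, 4],   3: [7, 2, 6, 4, 1, 3],
    4: [4, 6, 4, 5, 3, 6],   5: [1, 3, 2, 2, 6, 4],   6: [6, 4, 6, 3, 3, 4],
    7: [5, 3, 6, 3, 3, 4],   8: [7, 3, 7, 4, 1, 4],   9: [2, 4, 3, 3, 6, 3],
    10: [3, 5, 3, 6, 4, 6],  11: [1, 3, 2, 3, 5, 3],  12: [5, 4, 5, 4, 2, 4],
    13: [2, 2, 1, 5, 4, 4],  14: [4, 6, 4, 6, 4, 7],  15: [6, 5, 4, 2, 1, 4],
    16: [3, 5, 4, 6, 4, 7],  17: [4, 4, 7, 2, 2, 5],  18: [3, 7, 2, 6, 4, 3],
    19: [4, 6, 3, 7, 2, 7],  20: [2, 3, 2, 4, 7, 2],
}

# case number -> cluster (membership column of the worksheet)
membership = {1: 1, 2: 2, 3: 1, 4: 3, 5: 2, 6: 1, 7: 1, 8: 1, 9: 2, 10: 3,
              11: 2, 12: 1, 13: 2, 14: 3, 15: 1, 16: 3, 17: 1, 18: 3, 19: 3, 20: 2}

def centroid(cluster):
    """Mean of each variable over the cases assigned to `cluster`."""
    rows = [data[case] for case in data if membership[case] == cluster]
    return [round(sum(col) / len(rows), 3) for col in zip(*rows)]

for k in (1, 2, 3):
    print(k, centroid(k))
```

With the raw data as transcribed, the V5 mean for cluster 1 comes out as 1.875, which corresponds to the 1.88 shown in the worksheet.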

22 Interpret and Profile the Clusters
From the Cluster Centroids table:
- Cluster 1 has relatively high values on V1 (Shopping is fun) and V3 (I combine shopping with eating out), so it can be labeled "fun-loving and concerned shoppers".
- Cluster 2 has a relatively high value on V5 (I don't care about shopping), so it can be labeled "apathetic shoppers".
- Cluster 3 has relatively high values on V2 (Shopping is bad for my budget), V4 (I try to get the best buys while shopping), and V6 (You can save a lot of money by comparing prices), so it can be labeled "economical shoppers".

23 Assess Reliability and Validity
Formal procedures for assessing the reliability and validity of clustering results are complex. The following procedures are adequate for checking the quality of clustering results:
1. Perform cluster analysis on the same data using different distance measures. Compare the results across measures to determine the stability of the solutions.
2. Use different methods of clustering and compare the results.
3. Split the data randomly into halves. Perform clustering separately on each half. Compare the cluster centroids across the two subsamples.
4. Delete variables randomly. Perform clustering based on the reduced set of variables. Compare the results with those obtained by clustering based on the entire set of variables.
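Comparing two solutions, as in procedure 2, needs one extra step because cluster numbers are arbitrary: cluster 1 in one solution may be cluster 3 in another. The sketch below is one simple way to do it, assuming both solutions cover the same cases; it tries every relabeling and keeps the one with the most matches. The first list is the membership column from the Excel worksheet slide, the second is the case listing from the nonhierarchical results; `best_agreement` is a hypothetical helper name.

```python
from itertools import permutations

# Cluster numbers for cases 1..20 in two solutions taken from the slides
worksheet_membership = [1, 2, 1, 3, 2, 1, 1, 1, 2, 3,
                        2, 1, 2, 3, 1, 3, 1, 3, 3, 2]
case_listing_membership = [3, 2, 3, 1, 2, 3, 3, 3, 2, 1,
                           2, 3, 2, 1, 3, 1, 3, 1, 1, 2]

def best_agreement(labels_a, labels_b):
    """Return (share of matching cases, relabeling of b) for the best relabeling."""
    names_b = sorted(set(labels_b))
    best_share, best_mapping = 0.0, {}
    for perm in permutations(sorted(set(labels_a)), len(names_b)):
        mapping = dict(zip(names_b, perm))  # label in b -> label in a
        hits = sum(a == mapping[b] for a, b in zip(labels_a, labels_b))
        share = hits / len(labels_a)
        if share > best_share:
            best_share, best_mapping = share, mapping
    return best_share, best_mapping

share, mapping = best_agreement(worksheet_membership, case_listing_membership)
print(share, mapping)  # 1.0 {1: 3, 2: 2, 3: 1}
```

Trying every permutation is only practical for a handful of clusters, which is the situation here.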

24 Results of Nonhierarchical Clustering
Initial Cluster Centers
Cluster  V1      V2      V3      V4      V5      V6
1        4.0000  6.0000  3.0000  7.0000  2.0000  7.0000
2        2.0000  3.0000  2.0000  4.0000  7.0000  2.0000
3        7.0000  2.0000  6.0000  4.0000  1.0000  3.0000
Classification Cluster Centers
Cluster  V1      V2      V3      V4      V5      V6
1        3.8135  5.8992  3.2522  6.4891  2.5149  6.6957
2        1.8507  3.0234  1.8327  3.7864  6.4436  2.5056
3        6.3558  2.8356  6.1576  3.6736  1.3047  3.2010
Case Listing of Cluster Membership
Case ID  Cluster  Distance    Case ID  Cluster  Distance
1        3        1.780       2        2        2.254
3        3        1.174       4        1        1.882
5        2        2.525       6        3        2.340
7        3        1.862       8        3        1.410
9        2        1.843       10       1        2.112
11       2        1.923       12       3        2.400
13       2        3.382       14       1        1.772
15       3        3.605       16       1        2.137
17       3        3.760       18       1        4.421
19       1        0.853       20       2        0.813
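For orientation, a plain batch k-means started from the initial cluster centers above can be sketched as follows. This is an assumption about the general approach, not necessarily the exact procedure the statistics package used to produce these tables. Clusters are indexed 0 to 2 in the code, corresponding to clusters 1 to 3 on the slide.

```python
import math

# Raw data for cases 1..20 (V1..V6), as transcribed on the raw data slide
data = [
    [6, 4, 7, 3, 2, 3], [2, 3, 1, 4, 5, 4], [7, 2, 6, 4, 1, 3], [4, 6, 4, 5, 3, 6],
    [1, 3, 2, 2, 6, 4], [6, 4, 6, 3, 3, 4], [5, 3, 6, 3, 3, 4], [7, 3, 7, 4, 1, 4],
    [2, 4, 3, 3, 6, 3], [3, 5, 3, 6, 4, 6], [1, 3, 2, 3, 5, 3], [5, 4, 5, 4, 2, 4],
    [2, 2, 1, 5, 4, 4], [4, 6, 4, 6, 4, 7], [6, 5, 4, 2, 1, 4], [3, 5, 4, 6, 4, 7],
    [4, 4, 7, 2, 2, 5], [3, 7, 2, 6, 4, 3], [4, 6, 3, 7, 2, 7], [2, 3, 2, 4, 7, 2],
]

# Initial cluster centers from the slide
initial_centers = [
    [4, 6, 3, 7, 2, 7],
    [2, 3, 2, 4, 7, 2],
    [7, 2, 6, 4, 1, 3],
]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(row, centers):
    """Index of the center closest to `row`."""
    return min(range(len(centers)), key=lambda k: euclidean(row, centers[k]))

def k_means(rows, centers, max_iter=100):
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(row, centers) for row in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        updated = []
        for k, old_center in enumerate(centers):
            members = [row for row, lab in zip(rows, labels) if lab == k]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(old_center)  # keep the center of an empty cluster
        centers = updated
    return labels, centers

labels, centers = k_means(data, initial_centers)
sizes = [labels.count(k) for k in range(3)]
print(sizes)  # [6, 6, 8]
```

On this data the assignments settle after the first pass, and the resulting cluster sizes match the number of cases per cluster reported with the final cluster centers.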

25 Final Cluster Centers
Cluster  V1      V2      V3      V4      V5      V6
1        3.5000  5.8333  3.3333  6.0000  3.5000  6.0000
2        1.6667  3.0000  1.8333  3.5000  5.5000  3.3333
3        5.7500  3.6250  6.0000  3.1250  1.7500  3.8750
Distances between Final Cluster Centers
Cluster  1       2       3
1        0.0000
2        5.5678  0.0000
3        5.7353  6.9944  0.0000
Analysis of Variance
Variable  Cluster MS  df  Error MS  df  F        p
V1        29.1083     2   0.6078    17  47.8879  .000
V2        13.5458     2   0.6299    17  21.5047  .000
V3        31.3917     2   0.8333    17  37.6700  .000
V4        15.7125     2   0.7279    17  21.5848  .000
V5        24.1500     2   0.7353    17  32.8440  .000
V6        12.1708     2   1.0711    17  11.3632  .001
Number of Cases in each Cluster
Cluster  Unweighted Cases  Weighted Cases
1        6                 6
2        6                 6
3        8                 8
Missing  0
Total    20                20
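The distances between the final cluster centers are ordinary Euclidean distances between the rows of the Final Cluster Centers table. A short check, using the rounded values from the slide, might look like this (`math.dist` requires Python 3.8 or later):

```python
import math
from itertools import combinations

# Final cluster centers (V1..V6), rounded values as shown on the slide
final_centers = {
    1: [3.5000, 5.8333, 3.3333, 6.0000, 3.5000, 6.0000],
    2: [1.6667, 3.0000, 1.8333, 3.5000, 5.5000, 3.3333],
    3: [5.7500, 3.6250, 6.0000, 3.1250, 1.7500, 3.8750],
}

# Euclidean distance for every pair of clusters
distances = {
    (i, j): math.dist(final_centers[i], final_centers[j])
    for i, j in combinations(final_centers, 2)
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

Because the inputs are rounded to four decimals, the results agree with the table only up to rounding.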

