Analysis of Variance (ANOVA)
Penjelasan Umum Seringkali kita ingin menguji apakah tiga atau lebih populasi memiliki rata-rata yg sama. Contoh: Apakah bahan bakar/km yg digunakan untuk beberapa merek mobil sama? Apakah pendapatan pekerja pada beberapa lapangan pekerjaan sama? Atau apakah biaya produksi yg menggunakan beberapa proses yg berbeda adalah sama?
Penjelasan Umum Kita dapat menggunakan cara seperti yg lalu untuk menguji kesamaan rata-rata dua populasi, tetapi hal tersebut akan memakan waktu dan perhitungan yg lebih lama. Contoh: jika ada 5 pop, maka ada 5C2 cara/perhitungan yg harus dilakukan. Untuk itu kita dapat melakukan uji secara simultan /keseluruhan populasi tersebut dengan menggunakan distribusi F dan metoda yg disebut ANOVA (Analysis of Variance)
One-Way Analysis of Variance Assumptions Populations are normally distributed Populations have equal variances Samples are randomly and independently drawn
Hipotesis untuk One-Way ANOVA Seluruh rata-rata populasi adalah sama Artinya: Tidak ada efek treatment (tidak ada keragaman rata-rata dalam kelompok) Minimal salah satu rata-rata populasi ada yang tidak sama Artinya: Terdapat efek treatment (terdapat keragaman rata-rata dalam kelompok) Tidak berarti bahwa semua rata-rata populasi tidak sama (beberapa pasang mungkin sama)
The Null Hypothesis is True One-Factor ANOVA All Means are the same: The Null Hypothesis is True (No Treatment Effect)
One-Factor ANOVA At least one mean is different: (continued) At least one mean is different: The Null Hypothesis is NOT true (Treatment Effect is present) or
Partitioning the Variation
Partitioning the Variation Total variation can be split into two parts: SST = SSB + SSW SST = Total Sum of Squares SSB = Sum of Squares Between SSW = Sum of Squares Within
Partitioning the Variation (continued) SST = SSB + SSW Total Variation = jumlah kuadrat total (SST), yang mengukur keragaman total dalam data Between-Sample Variation = keragaman antar kelompok populasi (SSB) Within-Sample Variation = kergaman di dalam masing-masing kelompok populasi (SSW)
Partition of Total Variation Total Variation (SST) Variation Due to Factor (SSB) + Variation Due to Random Sampling (SSW) = Commonly referred to as: Sum of Squares Between Sum of Squares Among Sum of Squares Explained Among Groups Variation Commonly referred to as: Sum of Squares Within Sum of Squares Error Sum of Squares Unexplained Within Groups Variation
Total Sum of Squares SST = SSB + SSW Dimana: SST = Total sum of squares (Jumlah kuadrat total) k = jumlah populasi (kelompok, level atau treatment) ni = sample size dari populasi ke-i xij = pengamatan ke-jth dari populasi ke-i x = rata-rata total (rata-rata dari seluruh data)
Total Variation (continued)
Sum of Squares Between SST = SSB + SSW Dimana: SSB = Sum of squares between k = jumlah populasi (kelompok, level, atau treatment) ni = sample size dari populasi ke-i xi = rata-rata sample dari populasi ke-i x = rata-rata total (rata-rata dari seluruh data)
Between-Group Variation Variation Due to Differences Among Groups Mean Square Between = SSB/degrees of freedom
Between-Group Variation (continued)
Sum of Squares Within SST = SSB + SSW Dimana: SSW = Sum of squares within k = jumlah populasi (kelompok, level, atau treatment) ni = sample size dari populasi ke-i xi = rata-rata sample dari populasi ke-i xij = pengamatan ke-jth dari populasi ke-i
Within-Group Variation Summing the variation within each group and then adding over all groups Mean Square Within = SSW/degrees of freedom
Within-Group Variation (continued)
One-Way ANOVA Table Source of Variation SS df MS F ratio Between Samples SSB MSB SSB k - 1 MSB = F = k - 1 MSW Within Samples SSW SSW N - k MSW = N - k SST = SSB+SSW Total N - 1 k = jumlah populasi (kelompok, level, atau treatment) N = jumlah seluruh pengamatan df = derajat bebas
One-Factor ANOVA F Test Statistic H0: μ1= μ2 = … = μ k HA: At least two population means are different Test statistic MSB is mean squares between variances MSW is mean squares within variances Degrees of freedom df1 = k – 1 (k = number of populations) df2 = N – k (N = sum of sample sizes from all populations)
Interpreting One-Factor ANOVA F Statistic The F statistic is the ratio of the between estimate of variance and the within estimate of variance The ratio must always be positive df1 = k -1 will typically be small df2 = N - k will typically be large The ratio should be close to 1 if H0: μ1= μ2 = … = μk is true The ratio will be larger than 1 if H0: μ1= μ2 = … = μk is false
One-Factor ANOVA F Test Example Club 1 Club 2 Club 3 254 234 200 263 218 222 241 235 197 237 227 206 251 216 204 You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance?
One-Factor ANOVA Example: Scatter Diagram Distance 270 260 250 240 230 220 210 200 190 Club 1 Club 2 Club 3 254 234 200 263 218 222 241 235 197 237 227 206 251 216 204 • • • • • • • • • • • • • • • 1 2 3 Club
One-Factor ANOVA Example Computations Club 1 Club 2 Club 3 254 234 200 263 218 222 241 235 197 237 227 206 251 216 204 x1 = 249.2 x2 = 226.0 x3 = 205.8 x = 227.0 n1 = 5 n2 = 5 n3 = 5 N = 15 k = 3 SSB = 5 [ (249.2 – 227)2 + (226 – 227)2 + (205.8 – 227)2 ] = 4716.4 SSW = (254 – 249.2)2 + (263 – 249.2)2 +…+ (204 – 205.8)2 = 1119.6 MSB = 4716.4 / (3-1) = 2358.2 MSW = 1119.6 / (15-3) = 93.3
One-Factor ANOVA Example Solution Test Statistic: Decision: Conclusion: H0: μ1 = μ2 = μ3 HA: μi not all equal = .05 df1= 2 df2 = 12 Critical Value: F = 3.885 Reject H0 at = 0.05 There is evidence that at least one μi differs from the rest = .05 Do not reject H0 Reject H0 F = 25.275 F.05 = 3.885
Uji Wilayah Berganda μ = Dari hasil pengujian kesamaan rata-rata populasi dgn ANOVA, jika keputusan adalah menolak Ho. Maka kita dapati kesimpulan bahwa tidak semua µ sama (paling sedikit ada dua yang tidak sama). Namun kita tidak tahu mana yang berbeda. Untuk mencari µ mana yang berbeda nyata → UJI WILAYAH BERGANDA DUNCAN DAN UJI TUKEY x μ 1 = 2 3
Uji Duncan Prosedur: Urutkan rata-rata sampel untuk masing-masing populasi (kelompok) dari yang terkecil hingga terbesar Hitung wilayah nyata terpendek dari berbagai rata-rata
Uji Duncan Kriteria pengujian Bandingkan selisih kedua rata-rata yang ingin dilihat perbedaannya dengan kriteria sbb: xi – xj ≤ Rp (Tidak berbeda nyata) xi – xj > Rp (Berbeda nyata)
Contoh: Uji Duncan 1. Urutkan rata-rata sampel: Club 1 Club 2 Club 3 254 234 200 263 218 222 241 235 197 237 227 206 251 216 204 2. Hitung wilayah nyata terpendek dari berbagai rata-rata: α = 0.05, df = 12
Contoh: Uji Duncan Bandingkan selisih rata-rata dengan Rp:
Uji Tukey-Kramer Dimana: q = Nilai dari standardized range table dengan df = k dan N - k MSW = Mean Square Within ni dan nj = Sample sizes dari populasi (kelompok) ke-i & ke-j
Contoh: Uji Tukey-Kramer 1. Compute absolute mean differences: Club 1 Club 2 Club 3 254 234 200 263 218 222 241 235 197 237 227 206 251 216 204 2. Find the q value from the table Tukey with k and N - k degrees of freedom for the desired level of
Contoh: Uji Tukey-Kramer 3. Compute Critical Range: 4. Compare: 5. All of the absolute mean differences are greater than critical range. Therefore there is a significant difference between each pair of means at 5% level of significance.