Expectation Maximization
Coin flipping experiment
We are given two coins, A and B, with unknown biases θ_A and θ_B. Coin A lands heads with probability θ_A and tails with probability 1 - θ_A; likewise, coin B lands heads with probability θ_B and tails with probability 1 - θ_B.
The following procedure is repeated five times: pick coin A or coin B at random (with equal probability) and toss it 10 times, for a total of 50 tosses.
Goal: estimate θ_A and θ_B. In statistics, this is called maximum likelihood estimation.
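The experiment described above can be simulated with a few lines of Python. This is only a sketch: the function name `simulate_experiment`, the seed, and the "true" biases 0.8 and 0.5 are hypothetical choices made for illustration and do not come from the slides.

```python
import random

def simulate_experiment(theta_a, theta_b, n_sets=5, n_flips=10, seed=0):
    """Return one (coin, heads, tails) tuple per set of tosses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(n_sets):
        coin = rng.choice("AB")  # pick coin A or B with equal probability
        bias = theta_a if coin == "A" else theta_b
        # Each toss is heads with probability `bias`.
        heads = sum(rng.random() < bias for _ in range(n_flips))
        results.append((coin, heads, n_flips - heads))
    return results

# Assumed true biases, chosen only for illustration.
rows = simulate_experiment(theta_a=0.8, theta_b=0.5)
for coin, heads, tails in rows:
    print(coin, heads, tails)
```

Keeping the coin label in each tuple gives the complete data case; dropping it gives the incomplete data case.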
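Maximum likelihood for complete data
When all random variables are known, that is, both the outcome of every coin flip and the type of coin used for each set of flips, we have the complete data case. In the complete data case, θ_A and θ_B can be computed as follows:
θ_A = (number of heads from coin A) / (total number of flips of coin A)
θ_B = (number of heads from coin B) / (total number of flips of coin B)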
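In the complete data case the estimate is simply the fraction of heads among the flips of each coin. A minimal sketch follows; the function name `complete_data_mle` and the labelled rows are hypothetical, since the slides do not list the data.

```python
def complete_data_mle(rows):
    """rows: iterable of (coin, heads, tails) with coin in {"A", "B"}."""
    heads = {"A": 0, "B": 0}
    flips = {"A": 0, "B": 0}
    for coin, h, t in rows:
        heads[coin] += h
        flips[coin] += h + t
    # Fraction of heads per coin; skip a coin that was never picked.
    return {c: heads[c] / flips[c] for c in heads if flips[c] > 0}

# Hypothetical labelled data: five sets of 10 tosses each.
labelled_rows = [("B", 5, 5), ("A", 9, 1), ("A", 8, 2), ("B", 4, 6), ("A", 7, 3)]
estimates = complete_data_mle(labelled_rows)
print(estimates)
```

With these assumed rows, coin A has 24 heads in 30 flips and coin B has 9 heads in 20 flips.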
Incomplete data case
What if some variable is not observed (incomplete data)? Here the outcome of every coin flip is known, but the type of coin used for each set of flips is not. In this case, θ_A and θ_B can be estimated with expectation maximization (EM).
Expectation Maximization
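E step
Example: for the second row, 9H1T (9 heads, 1 tail), which coin is more likely to have produced it? Starting from the initial parameter values, define:
P(H9T1 | A): the probability of observing 9 heads and 1 tail when the coin is A.
P(H9T1 | B): the probability of observing 9 heads and 1 tail when the coin is B.
P(A | H9T1): the probability that the coin is A given that 9 heads and 1 tail were observed = 0.8
P(B | H9T1): the probability that the coin is B given that 9 heads and 1 tail were observed = 0.2
Because both coins are picked with equal probability, P(A | H9T1) = P(H9T1 | A) / (P(H9T1 | A) + P(H9T1 | B)), and likewise for B. The values 0.8 and 0.2 are therefore normalized weights that sum to 1, not the raw likelihoods.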
Since we calculated that proportion to be 0.8:0.2, a contribution of 0.8 ⋅ (9, 1) = (7.2, 0.8) is added to the column for coin A and a contribution of 0.2 ⋅ (9, 1) = (1.8, 0.2) is added to the column for coin B. Together, they add up to (9, 1), since we obtained the weights by normalizing their sum to 1. Thus, the more likely it seems, according to the current bias estimates, that this row was produced by coin A, the more of it we add to the column for coin A. Note that we are not calculating an expectation value in the columns; we are merely adding up fractions of heads and tails in proportion to the likelihood that they came from each coin. In the end we take the ratio of heads to total flips in each column to get a new bias estimate, so the heads and tails in an individual column do not need to add up to anything in particular.
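The weighting of a single row can be sketched as follows. The slides do not state the initial parameter values, so the values θ_A = 0.6 and θ_B = 0.5 used here are an assumption, chosen because they reproduce weights close to 0.8:0.2 for the 9H1T row; the function name `e_step_row` is likewise hypothetical.

```python
def e_step_row(heads, tails, theta_a, theta_b):
    """Split one row of (heads, tails) between coins A and B."""
    # Likelihood of the row under each coin. The binomial coefficient is
    # the same for both coins and cancels in the normalization below.
    like_a = theta_a ** heads * (1 - theta_a) ** tails
    like_b = theta_b ** heads * (1 - theta_b) ** tails
    # Equal prior for both coins, so the posterior is the normalized likelihood.
    weight_a = like_a / (like_a + like_b)
    weight_b = 1.0 - weight_a
    contrib_a = (weight_a * heads, weight_a * tails)
    contrib_b = (weight_b * heads, weight_b * tails)
    return weight_a, weight_b, contrib_a, contrib_b

# Assumed initial parameters (not given in the slides).
w_a, w_b, c_a, c_b = e_step_row(9, 1, theta_a=0.6, theta_b=0.5)
print(round(w_a, 2), round(w_b, 2))
print(c_a, c_b)
```

The unrounded weights differ slightly from 0.8 and 0.2, so the contributions are close to, but not exactly, (7.2, 0.8) and (1.8, 0.2).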
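M step
For each coin, divide the accumulated heads in its column by the accumulated heads plus tails to obtain its new bias estimate. Then repeat the E step and the M step with the new bias values for several iterations, until the bias estimates converge.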
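Putting the E step and the M step together gives a loop like the one below. It is a sketch under stated assumptions rather than the slides' own implementation: the five rows of (heads, tails), the starting values 0.6 and 0.5, the stopping tolerance, and the name `em_two_coins` are all hypothetical.

```python
def em_two_coins(rows, theta_a, theta_b, max_iter=100, tol=1e-6):
    """Run EM for two coins; rows is a list of (heads, tails) per set."""
    for iteration in range(1, max_iter + 1):
        heads_a = tails_a = heads_b = tails_b = 0.0
        # E step: split every row between the coins by posterior weight.
        for heads, tails in rows:
            like_a = theta_a ** heads * (1 - theta_a) ** tails
            like_b = theta_b ** heads * (1 - theta_b) ** tails
            weight_a = like_a / (like_a + like_b)
            weight_b = 1.0 - weight_a
            heads_a += weight_a * heads
            tails_a += weight_a * tails
            heads_b += weight_b * heads
            tails_b += weight_b * tails
        # M step: new bias = accumulated heads / accumulated flips.
        new_a = heads_a / (heads_a + tails_a)
        new_b = heads_b / (heads_b + tails_b)
        converged = max(abs(new_a - theta_a), abs(new_b - theta_b)) < tol
        theta_a, theta_b = new_a, new_b
        if converged:
            break
    return theta_a, theta_b, iteration

# Assumed data: heads and tails for five sets of 10 tosses, coin labels unknown.
rows = [(5, 5), (9, 1), (8, 2), (4, 6), (7, 3)]
first_a, first_b, _ = em_two_coins(rows, 0.6, 0.5, max_iter=1)
final_a, final_b, n_iter = em_two_coins(rows, 0.6, 0.5)
print(round(first_a, 2), round(first_b, 2))
print(round(final_a, 2), round(final_b, 2), n_iter)
```

With these assumed inputs the first update moves the estimates to roughly 0.71 and 0.58. EM converges to a local optimum, so different starting values can lead to different final estimates.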
Expectation Maximization for Soft Clustering