Bayesian: Single Parameter

1 Bayesian: Single Parameter
Prof. Nur Iriawan, PhD. Department of Statistics, FMIPA, ITS, Surabaya, 21 February 2006

2 Frequentist vs. Bayesian (Casella and Berger, 1987)
Frequentist group: builds on classical methods such as MLE, the method of moments, UMVUE, and MSE. An analytical approach is always taken as the solution.
Bayesian group: builds on Bayesian methods. It relies on numerical approaches and intensive computation, and inference is based mainly on the most probable outcome.
Nur Iriawan, Bayesian Modeling, PENS – ITS

3 Bayes' Theorem (Thomas Bayes, 1702-1761)

4 The Bayesian Model (Box and Tiao, 1973), (Zellner, 1971), (Gelman, Stern, Carlin, and Rubin, 1995)
The model rests on the proportional form posterior ∝ likelihood × prior, that is, $p(\theta \mid x) \propto p(x \mid \theta)\,p(\theta)$. The data, expressed as a likelihood, are used to update the prior information into posterior information that is ready to be used for inference.

5 Bayesian: Parameters Are Also Treated as Variables
In the Bayesian approach, every parameter in the model is treated as a variable. Reasoning in terms of the full conditional distribution is used to study the characteristics of each parameter. The notation for the data likelihood is kept distinct from the notation for the full conditional distribution.

6 Motivation for the Bayesian Approach: Bayes' Theorem (Thomas Bayes)
In another form, if the observations are independent random variables with parameter θ, then P(B), the marginal probability of the data, is a constant with respect to θ.
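For reference, the two forms mentioned on slides 3 to 6 can be written out as follows. This is the textbook statement of Bayes' theorem, not a copy of the slide; the symbols $f(x_i \mid \theta)$ for the data density and $p(\theta)$ for the prior are notation assumed here.

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
\qquad\Longrightarrow\qquad
p(\theta \mid x_1,\dots,x_n)
  = \frac{\prod_{i=1}^{n} f(x_i \mid \theta)\,p(\theta)}{p(x_1,\dots,x_n)}
  \;\propto\; \prod_{i=1}^{n} f(x_i \mid \theta)\,p(\theta)
```

The denominator plays the role of P(B): it does not depend on θ, which is why the posterior is usually written only up to proportionality.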

7 Example: the Icy Road Case
Ice: Is there an icy road? Values {Yes, No}. Initial probabilities (.7, .3).
Watson: Does Watson have a car crash? Probabilities (.8, .2) if Ice=Yes, (.1, .9) if Ice=No.
How do we reflect changes in our belief as we make observations?
These next couple of slides are a brief primer on Bayes nets as they can be used in assessment. In this small example, we have just one student-model variable, “Level of Proficiency.” It has two levels, expert and novice, that we can’t see directly. What we can see is an examinee take a patient history, and we can determine whether it was adequate or inadequate.

8 Icy Road: Conditional Probabilities
Conditional probabilities p(Watson | Ice):

Ice    Watson=Yes    Watson=No
Yes    .8            .2
No     .1            .9

The first row holds p(Watson=yes|Ice=yes) and p(Watson=no|Ice=yes).

Here are the conditional probabilities for an expert. In applied work, for a specific inferential problem, structure and conditional probabilities may be taken as given. Where do conditional probabilities come from? We’ll talk about this a little later. For this example, we can think of them as coming from an expert where we observed a large group of people known to be experts, and another large group of people known to be novices, take patient histories in this particular setting. We observed that 80% of the experts took adequate histories. This figure captures reasoning from proficiency to observable. If we were going to observe another expert solution, then absent other information we’d put an 80% probability on observing them take an adequate history. Similarly, we saw only 40% of the novices take adequate histories.

9 Icy Road: Likelihoods
For the observation Watson = yes, the likelihoods are p(Watson=yes|Ice=yes) = .8 and p(Watson=yes|Ice=no) = .1. Note the 8/1 ratio.

10 Icy Road: Bayes Theorem: If Watson = yes -- Before Normalizing
Prior * Likelihood ∝ Posterior

Ice    Prior    Likelihood p(Watson=yes|Ice)    Prior * Likelihood
Yes    .7       .8                              .56
No     .3       .1                              .03

Sum = .59. Need to divide through by this ‘normalizing constant’ to get probabilities.

11 Icy Road: Bayes Theorem: If Watson = yes
Prior * Likelihood ∝ Posterior

Ice    Prior    Likelihood p(Watson=yes|Ice)    Prior * Likelihood    Posterior
Yes    .7       .8                              .56                   .95
No     .3       .1                              .03                   .05

Posterior probabilities -- each term in the product divided through by the normalizing constant .59.
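The update on slides 10 and 11 can be reproduced in a few lines. This is a minimal sketch using the numbers given on the slides; the function name `posterior` and the dictionary layout are illustrative choices, not part of the original deck.

```python
# Prior over Ice and likelihood of observing Watson = yes, as given on the slides.
prior = {"yes": 0.7, "no": 0.3}
likelihood_watson_yes = {"yes": 0.8, "no": 0.1}


def posterior(prior, likelihood):
    """Return (normalizing constant, normalized posterior) for a discrete variable."""
    unnormalized = {k: prior[k] * likelihood[k] for k in prior}
    norm = sum(unnormalized.values())  # sum of prior * likelihood
    return norm, {k: v / norm for k, v in unnormalized.items()}


norm, post = posterior(prior, likelihood_watson_yes)
print(round(norm, 2), {k: round(v, 2) for k, v in post.items()})
# prints: 0.59 {'yes': 0.95, 'no': 0.05}
```

Observing a crash raises the belief in an icy road from .7 to about .95.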

12 Example: the Normal Case
What is the natural representation of a Normal(μ,σ²), or N(μ,σ²), distribution? Which representation is the most representative?

13 What is the difference between the following representations?

14 Plot of the variables x, μ, and σ in the full conditional Normal

15 Interval vs. Highest Posterior Density (HPD) (Box and Tiao, 1973), (Gelman et al., 1995), (Iriawan, 2001)
In the frequentist approach, the confidence interval is constructed from the sampling distribution of the estimator. In the Bayesian approach, the interval is approximated by the HPD region.

16 Equal-Density Representation (Iriawan, 2001)

17 Compromise in Control Charts

18 HPD for the Individuals Control Chart

Control chart (1-α) x 100%    Lower control limit    Upper control limit
95.0                          71.3953                109.481
97.5                          64.4857                110.915
99.0                          55.3356                112.775

19 Example: the Bernoulli Case
As in the Normal case above, x ~ Ber(x; p) is written as $f(x; p) = p^{x}(1-p)^{1-x}$ for $x \in \{0, 1\}$, where the frequentist approach treats p as a constant. What if p varies because the observations are made in different situations and places? Under the Bayesian principle, p is treated as a variable so that the model can accommodate such conditions.

20 Suppose p varies according to a Beta(α,β) distribution, as follows:
What will happen?

21 Suppose one Bernoulli observation has been made; the posterior distribution is then as follows:

22 Following the definition of the Beta function, the denominator can be evaluated as follows:

23 Hence the posterior distribution of p after this single observation is
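For reference, the derivation outlined on slides 20 to 23 can be written out as below. This is a sketch assuming the standard Beta(α, β) density as the prior and a single Bernoulli observation x ∈ {0, 1}; the notation B(·,·) for the Beta function is an assumption of this sketch.

```latex
\begin{aligned}
f(p) &= \frac{1}{B(\alpha,\beta)}\, p^{\alpha-1}(1-p)^{\beta-1}, \qquad 0 < p < 1,\\[4pt]
f(p \mid x) &= \frac{p^{x}(1-p)^{1-x}\, p^{\alpha-1}(1-p)^{\beta-1}}
                   {\int_0^1 p^{\alpha+x-1}(1-p)^{\beta-x}\,dp}
             = \frac{p^{\alpha+x-1}(1-p)^{\beta-x}}{B(\alpha+x,\ \beta-x+1)},\\[4pt]
p \mid x &\sim \mathrm{Beta}(\alpha + x,\ \beta + 1 - x).
\end{aligned}
```

The prior normalizing constant cancels between numerator and denominator, and the remaining integral is a Beta function, which is the step slide 22 refers to.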

24 Bayes Estimator
The Bayesian estimate of p can be obtained by minimizing a loss function. Several loss functions can be used; here we use the quadratic loss function, which is consistent with the mean squared error (MSE). In general, θ is estimated with the Bayes approach as follows (Carlin and Louis, 1996; Elfessi and Reineke, 2001):

25 Taking the expectation with respect to the posterior distribution gives

26 As before, the integral is solved by introducing the new variables α* = α + x + 1 and β* = β − x + 1. The integral then gives the following result:

27 Using the following simplification, we obtain
an equivalent form of the estimator. Recall this result in the later discussion of compromising Bayesian with classical approaches.

28 Extending this result to n Bernoulli trials with y successes, where y is the number of successes among the individual Bernoulli observations x, gives a corresponding posterior and estimate for p. Recall this result in the later discussion of compromising Bayesian with classical approaches.
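For reference, the estimator described on slides 24 to 28 can be sketched as follows, assuming quadratic loss (so the Bayes estimator is the posterior mean) and a Beta(α, β) prior. The weighted-average form in the second line is one common way to write the simplification; whether it matches the slide's exact layout is an assumption.

```latex
\begin{aligned}
E(p \mid x)
  &= \frac{B(\alpha + x + 1,\ \beta - x + 1)}{B(\alpha + x,\ \beta - x + 1)}
   = \frac{\alpha + x}{\alpha + \beta + 1}\\[4pt]
  &= \frac{\alpha+\beta}{\alpha+\beta+1}\cdot\frac{\alpha}{\alpha+\beta}
     \;+\; \frac{1}{\alpha+\beta+1}\cdot x,\\[8pt]
p \mid y &\sim \mathrm{Beta}(\alpha + y,\ \beta + n - y),
\qquad
E(p \mid y) = \frac{\alpha + y}{\alpha + \beta + n}.
\end{aligned}
```

The second line shows the posterior mean as a weighted average of the prior mean α/(α+β) and the observation x, which is the sense in which the Bayes estimator compromises between prior and data.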

29 Priors and the Bayesian Method (Gelman et al., 1995)
Because the parameter θ is treated as a variable, in the Bayesian approach θ takes values in a domain Θ with density f(θ). This density is called the prior distribution of θ. When the prior information is combined with the current data, X, to form the posterior of θ, computing the posterior becomes straightforward: it only requires the conditional density of θ given X = x. Criticism of the Bayesian approach usually focuses on the “legitimacy and desirability” of treating θ as a random variable and on how well its prior distribution is defined or chosen.

30 The Ideal Form of Prior, Likelihood, and Posterior
[Figure: a proper/conjugate prior and the resulting posterior plotted against θ]

31 What if the prior is chosen as follows?
A prior chosen like this is a misleading prior, so the form of the posterior will be unclear.
[Figure: prior, likelihood, and posterior plotted against θ]

32 A Prior with the Same Density over the Whole Domain
[Figure: likelihood, improper prior, and posterior plotted against θ]

33 Interpretations of the Prior Distribution
As a frequency distribution.
As a normative and objective representation of what it is rational to believe about a parameter.
As a subjective representation of how a person views a parameter according to their own judgment.

34 The Prior as a Frequency Distribution
Sometimes the value of a parameter is generated from the mode of the pattern of earlier data, whether that pattern is symmetric or not. In an industrial inspection process, defect data from the previous batch are usually used as an estimate of the prior information for the next batch. The prior usually has a physical meaning that matches the frequency with which the data occur.

35 The Normative/Objective Interpretation of a Prior
The central problem in making a prior interpretable is how to choose a prior distribution for an unknown parameter that still fits the physical problem at hand. If θ can only take values in a certain range, it is reasonable to use a prior with uniform density (equally likely / uniformly distributed). The interpretation is that every value is given the same chance of being selected to support the likelihood in forming the posterior. A prior can have a very odd meaning if it is chosen wrongly.

36 Priors for Continuous Parameters
Invariance arguments. For example, for a Normal mean μ, one can argue that all intervals (a, a+h) must have the same prior probability for any given h and a. This means that every point in such an interval has the same chance of being selected, which leads to a uniform prior (an “improper prior”). For a scale parameter σ, all intervals (a, ka) have the same prior probability, which implies that the prior is proportional to 1/σ. Again, this yields an improper prior.

37 Types of Prior
Conjugate prior vs. non-conjugate prior (Box and Tiao, 1973; Gelman et al., 1995; Tanner, 1996; Zellner, 1971): relates to the form of the likelihood model for the data.
Proper prior vs. improper prior (Jeffreys prior): relates to how weight or density is assigned at each point, uniformly distributed or not.
Informative prior vs. non-informative prior: relates to whether the pattern or frequency distribution of the data is already known.
Pseudo-prior (Carlin and Chib, 1995): relates to assigning values matched to the results of a frequentist analysis (e.g. regression with OLS).

38 Continuous Parameters
A uniform prior is usually used (at least if the parameter space is of finite extent). But if θ is uniform, then a non-linear function of θ, g(θ), will not be uniform. Example: take p(θ) = 1 for θ > 0 and re-parameterize θ through a non-linear function; the change-of-variables formula then gives a non-constant density for the new parameter. “Ignorance about θ” does not imply “ignorance about g.” The notion of prior “ignorance” may be untenable.

39 Turning this process around slightly, Bayesian analysis assumes that we can make some kind of probability statement about parameters before we start. The sample is then used to update our prior distribution.

40 First, suppose the prior can be represented as a probability density function p(θ), where θ is the parameter under study. Based on the sample X (the likelihood function), we can update the prior distribution using Bayes' rule.

41 Some Conjugate Priors

42 The Jeffreys Prior (single parameter)
The Jeffreys prior is given by $p(\theta) \propto I(\theta)^{1/2}$, where $I(\theta) = -E\left[\partial^2 \log p(x \mid \theta) / \partial\theta^2\right]$ is the expected Fisher information. This is invariant to transformation in the sense that all parametrizations lead to the same prior. One can also argue that it is uniform for a parametrization where the likelihood is completely determined (see Box and Tiao, 1973, Section 1.3).

43 Example: Jeffreys Prior for the Binomial
The result is a beta distribution with parameters ½ and ½.
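For reference, the Beta(½, ½) result can be derived as follows. This is the standard calculation for x ~ Binomial(n, p), using E(x) = np; it is a sketch rather than a reproduction of the slide.

```latex
\begin{aligned}
\log p(x \mid p) &= \text{const} + x \log p + (n - x)\log(1 - p),\\[4pt]
\frac{\partial^2 \log p(x \mid p)}{\partial p^2}
  &= -\frac{x}{p^2} - \frac{n - x}{(1-p)^2},\\[4pt]
I(p) &= \frac{np}{p^2} + \frac{n - np}{(1-p)^2} = \frac{n}{p(1-p)},\\[4pt]
p(p) &\propto I(p)^{1/2} \propto p^{-1/2}(1-p)^{-1/2},
\end{aligned}
```

which is the kernel of a Beta(½, ½) density.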

44 Other Examples of Jeffreys’ Priors

45 Improper Priors ⇒ Trouble with the Posterior (sometimes)
Suppose Y1, ..., Yn are independently normally distributed with constant variance σ² and with a mean that depends on the parameters γ, β, and ρ. Suppose it is known that ρ is in [0,1], ρ is uniform on [0,1], and γ, β, and σ have improper priors. Then for any observations y, the marginal posterior density of ρ is proportional to an expression in which h is bounded and has no zeroes in [0,1]. This posterior is an improper distribution on [0,1]!

46 Improper prior usually ⇒ proper posterior

47 Another example: improper prior ⇒ proper posterior

48 Subjective Degrees of Belief
Probability represents a subjective degree of belief held by a particular person at a particular time. There are various techniques for eliciting subjective priors, for example Good’s device of imaginary results. Take a binomial experiment with a beta prior with a=b. “Imagine” the experiment yields 1 tail and n-1 heads. How large should n be in order that we would just give odds of 2 to 1 in favor of a head occurring next? (E.g. n = 4 implies a=b=1.)
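The arithmetic behind Good's device can be checked directly. This is a minimal sketch assuming a Beta(a, a) prior and the usual posterior predictive probability of a head, (a + heads) / (2a + n); the function names are hypothetical.

```python
from fractions import Fraction


def prob_next_head(a, n):
    """P(next toss is a head) after n-1 heads and 1 tail, under a Beta(a, a) prior."""
    return Fraction(a + n - 1, 2 * a + n)


def implied_a(n):
    """Solve (a + n - 1) / (2a + n) = 2/3 for a, i.e. odds of 2 to 1 for a head."""
    return n - 3


print(prob_next_head(1, 4))  # prints: 2/3
print(implied_a(4))          # prints: 1
```

Setting the predictive probability to 2/3 gives a = n − 3, which reproduces the slide's statement that n = 4 implies a = b = 1.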

49 Problems with Subjectivity
What if the prior and the likelihood disagree substantially? The subjective prior cannot be “wrong” but may be based on a misconception. The model may be substantially wrong. Hierarchical models are often used in practice:

50 Hierarchical Model: Example for the Binomial Case
[Diagram: hierarchical model with nodes Gamma(c, d), Gamma(g, h), Gamma(e, f), Beta(a, b), Poisson(λ), and Binomial(n, p)]

51 General Comments
Determination of subjective priors is difficult. It is difficult to assess the usefulness of a subjective posterior. Don’t be misled by the term “subjective”; all data analyses involve appreciable personal elements.

52 Once again: An example with a continuous variable: A beta-binomial example
The setup: We are flipping a biased coin, where the probability of heads p could be anywhere between 0 and 1. We are interested in p. We will have two sources of information:
Prior beliefs, which we will express as a beta distribution, and
Data, which will come in the form of counts of heads in 10 independent flips.

53 An example with a continuous variable: A beta-binomial example--the Prior Distribution
Let’s suppose we think it is more likely that the coin is close to fair, so p is probably nearer to .5 than it is to either 0 or 1. We don’t have any reason to think it is biased toward either heads or tails, so we’ll want a prior distribution that is symmetric around .5. We’re not very sure about what p might be: say, about as sure as only 6 observations. This corresponds to 3 pseudo-counts of H and 3 of T, which, if we want to use a beta distribution to express this belief, corresponds to beta(4,4):

54 An example with a continuous variable: A beta-binomial example--the Prior Distribution
Beta. Defined on [0,1]. Conjugate prior for the probability parameter in Bernoulli & binomial models.
p ~ dbeta(4,4), with density $f(p) \propto p^{a-1}(1-p)^{b-1}$, a = b = 4. Here p is the variable (the “success probability”), 1−p is the failure probability, a−1 = 3 is the pseudo-count of successes, and b−1 = 3 is the pseudo-count of failures. The exponents set the shape, or “prior sample info.”
Mean(p) = a/(a+b) = .5
Variance(p) = ab/((a+b)²(a+b+1)) ≈ .0278
Mode(p) = (a−1)/(a+b−2) = .5

55 An example with a continuous variable: A beta-binomial example--the Likelihood
Next we will flip the coin ten times. Assuming the same true (but unknown to us) value of p is in effect for each of ten independent trials, we can use the binomial distribution to model the probability of getting any number of heads, i.e., $p(r \mid p) = \binom{10}{r} p^{r}(1-p)^{10-r}$. Here r, the variable, is the count of observed successes, 10−r is the count of observed failures, p is the “success probability” parameter, and 1−p is the failure probability.

56 An example with a continuous variable: A beta-binomial example--the Likelihood
We flip the coin ten times, and observe 7 heads; i.e., r=7. The likelihood is obtained now using the same form as in the preceding slide, except now r is fixed at 7 and we are interested in the relative value of this function at different possible values of p: $p(r = 7 \mid p) \propto p^{7}(1-p)^{3}$.

57 An example with a continuous variable: Obtaining the posterior by Bayes Theorem
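General form: $p(y \mid x^*) \propto p(x^* \mid y)\,p(y)$, that is, posterior ∝ likelihood × prior. In our example, 7 plays the role of x*, and p plays the role of y.
Before normalizing: $p(p \mid r = 7) \propto p^{7}(1-p)^{3} \cdot p^{3}(1-p)^{3} = p^{10}(1-p)^{6}$.
After normalizing: $p(p \mid r = 7) = \frac{\Gamma(18)}{\Gamma(11)\,\Gamma(7)}\, p^{10}(1-p)^{6}$.
Now, how can we get an idea of what this means we believe about p after combining our prior belief and our observations?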

58 An example with a continuous variable: In pictures
[Figure: three density plots showing Prior x Likelihood ∝ Posterior]

59 An example with a continuous variable: Using the fact that we have conjugate distributions
Now $p(p \mid r = 7) \propto p^{10}(1-p)^{6}$. This is just the kernel of a beta(11,7) distribution. This is rather special. The data were observed in accordance with a probability function which would have that same mathematical form as a likelihood once data are observed. We chose a prior distribution (in this case, a beta distribution) which would combine with the likelihood just so as to produce another distribution in the same parametric family (another beta distribution), just with updated parameters. We can work out its summary statistics:
Mean(p) = 11/18 ≈ .611 (prior was .5)
Variance(p) ≈ .0125 (prior was .0278)
Mode(p) = 10/16 = .625 (prior was .5)
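The conjugate update and its summary statistics can be computed directly. This is a minimal sketch using the standard beta formulas; the helper names `update` and `beta_summary` are illustrative.

```python
def update(a, b, heads, tails):
    """Conjugate beta-binomial update: add observed counts to the beta parameters."""
    return a + heads, b + tails


def beta_summary(a, b):
    """Mean, variance, and mode of a Beta(a, b) distribution (mode valid for a, b > 1)."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    mode = (a - 1) / (a + b - 2)
    return mean, var, mode


a_post, b_post = update(4, 4, heads=7, tails=3)  # prior beta(4,4), 7 heads in 10 flips
prior_stats = beta_summary(4, 4)
post_stats = beta_summary(a_post, b_post)
print(a_post, b_post)
print([round(s, 4) for s in prior_stats])
print([round(s, 4) for s in post_stats])
```

The posterior parameters come out as (11, 7), and the posterior mean sits between the prior mean .5 and the observed proportion .7.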

60 An example with a continuous variable: Using BUGS
What BUGS does in this simple problem with one variable is to sample lots of values from the posterior distribution for p; that is, its distribution as determined first with information from the prior, but further conditional on the observed data. The summary statistics from the draws are Mean(p), Variance(p), and Mode(p), reported alongside the prior values; the variance from the draws is .1116² ≈ .0125.
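The idea of summarizing a posterior from draws can be imitated without BUGS. The sketch below is not BUGS and does not use Gibbs sampling: it assumes the beta(11,7) posterior from the conjugate analysis and draws from it directly with Python's standard library. The seed and the number of draws are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)                # fixed seed so the run is repeatable
draws = [rng.betavariate(11, 7) for _ in range(20000)]

post_mean = statistics.fmean(draws)    # Monte Carlo estimate of Mean(p)
post_var = statistics.pvariance(draws) # Monte Carlo estimate of Variance(p)
print(round(post_mean, 3), round(post_var, 4))
```

With this many draws, the estimates should land close to the exact values 11/18 and about .0125.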

61 An example with a continuous variable: Using BUGS
BUGS setup for this problem:

62 Looking ahead to sampling-based approaches with many variables
BUGS = Bayesian inference Using Gibbs Sampling.
Basic idea: Model the multi-parameter problem in terms of assemblies of distributions and functions for all data and all parameters (taking advantage of conditional dependence whenever possible). E.g., p(Data|x,y) p(x|z) p(y) p(z). (*)
Observe Data*; the posterior p(x,y,z|Data*) is proportional to (*). It is hard to evaluate the normalizing constant, but ...

63 Looking ahead to sampling-based approaches with many variables
Can draw values from “full conditional” distributions:
Start with a possible value for each variable in cycle 0.
In cycle t+1,
Draw x_{t+1} from p(x|Y=y_t, Z=z_t, Data*)
Draw y_{t+1} from p(y|X=x_{t+1}, Z=z_t, Data*)
Draw z_{t+1} from p(z|X=x_{t+1}, Y=y_{t+1}, Data*)
Under suitable conditions, these series of draws will come to approximate draws from the actual true joint posterior for all the parameters.
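The cycle of full-conditional draws can be sketched on a toy target. The example below is an illustration only: instead of the three-variable posterior on the slide, it assumes a standard bivariate normal with correlation rho, whose full conditionals are known to be x | y ~ N(rho·y, 1 − rho²) and symmetrically for y. The function names, burn-in length, and seed are hypothetical choices.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_draws, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho ** 2)     # conditional standard deviation
    x, y = 0.0, 0.0                    # cycle 0: arbitrary starting values
    draws = []
    for t in range(burn_in + n_draws):
        x = rng.gauss(rho * y, sd)     # draw x_{t+1} given y_t
        y = rng.gauss(rho * x, sd)     # draw y_{t+1} given x_{t+1}
        if t >= burn_in:               # discard the burn-in cycles
            draws.append((x, y))
    return draws


def sample_correlation(draws):
    """Pearson correlation of the sampled (x, y) pairs."""
    n = len(draws)
    mx = sum(x for x, _ in draws) / n
    my = sum(y for _, y in draws) / n
    sxy = sum((x - mx) * (y - my) for x, y in draws)
    sxx = sum((x - mx) ** 2 for x, _ in draws)
    syy = sum((y - my) ** 2 for _, y in draws)
    return sxy / math.sqrt(sxx * syy)


draws = gibbs_bivariate_normal(rho=0.8, n_draws=20000)
print(round(sample_correlation(draws), 2))
```

Each update uses the most recent value of the other variable, exactly as in the cycle described above; the sample correlation of the retained draws should be close to the target value of rho.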

64 Inference in a chain
Recursive representation: p(u,v,x,y,z) = p(z|y,x,v,u) p(y|x,v,u) p(x|v,u) p(v|u) p(u) = p(z|y) p(y|x) p(x|v) p(v|u) p(u).
[Diagram: chain U → V → X → Y → Z with links p(v|u), p(x|v), p(y|x), p(z|y)]

65 Inference in a chain
Suppose we learn the value of X: start here, by revising belief about X.

66 Inference in a chain
Propagate information down the chain using conditional probabilities: from updated belief about X, use conditional probability to revise belief about Y.

67 Inference in a chain
Propagate information down the chain using conditional probabilities: from updated belief about Y, use conditional probability to revise belief about Z.

68 Inference in a chain
Propagate information up the chain using Bayes Theorem: from updated belief about X, use Bayes Theorem to revise belief about V.

69 Inference in a chain
Propagate information up the chain using Bayes Theorem: from updated belief about V, use Bayes Theorem to revise belief about U.
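The propagation steps on slides 65 to 69 can be traced numerically. The sketch below assumes binary variables and made-up numbers: a prior `p_u` and a single conditional probability table `cpt` reused for every link. None of these values come from the slides.

```python
# Hypothetical numbers for a binary chain U -> V -> X -> Y -> Z.
p_u = [0.6, 0.4]                      # p(u)
cpt = [[0.9, 0.1], [0.2, 0.8]]        # cpt[parent][child], reused for every link
x_obs = 1                             # suppose we learn X = 1

# Down the chain: conditional probability.
belief_y = cpt[x_obs]                                              # p(y | X=1)
belief_z = [sum(belief_y[y] * cpt[y][z] for y in range(2)) for z in range(2)]

# Up the chain: Bayes theorem.
prior_v = [sum(p_u[u] * cpt[u][v] for u in range(2)) for v in range(2)]   # p(v)
unnorm_v = [prior_v[v] * cpt[v][x_obs] for v in range(2)]                 # p(v) p(X=1|v)
belief_v = [w / sum(unnorm_v) for w in unnorm_v]                          # p(v | X=1)

# p(u | X=1) = sum_v p(u | v) p(v | X=1), with p(u | v) = p(v | u) p(u) / p(v).
belief_u = [
    sum(belief_v[v] * cpt[u][v] * p_u[u] / prior_v[v] for v in range(2))
    for u in range(2)
]
print([round(b, 3) for b in belief_z], [round(b, 3) for b in belief_u])
```

Beliefs about Y and Z only need the forward tables, while beliefs about V and U require inverting a link with Bayes theorem, which mirrors the down and up steps described above.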

70 Inference in singly-connected nets
Singly connected: There is never more than one path from one variable to another variable. Chains and trees are singly connected. Can use repeated applications of Bayes theorem and conditional probability to propagate evidence (Pearl, early 1980s).
[Diagram: singly-connected net over the variables U, V, X, Y, Z]

71 Posterior Summaries
Mean, median, mode, percentile, etc. Central 95% interval versus highest posterior density region (normal mixture example…)
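The difference between a central interval and an HPD region can be seen from posterior draws. The sketch below assumes draws from a Beta(11, 7) posterior (the coin example elsewhere in this deck) and approximates the HPD region by the shortest window containing 95% of the sorted draws, which is appropriate only for a unimodal posterior. The function name and seed are hypothetical.

```python
import random


def central_and_hpd(draws, percent=95):
    """Return (central interval, shortest interval), each covering `percent` of the draws."""
    s = sorted(draws)
    n = len(s)
    k = n * percent // 100                 # number of draws each interval must contain
    lo = (n - k) // 2                      # equal tails for the central interval
    central = (s[lo], s[lo + k - 1])
    # Slide a window of k draws and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    hpd = (s[best], s[best + k - 1])
    return central, hpd


rng = random.Random(1)
draws = [rng.betavariate(11, 7) for _ in range(10000)]
central, hpd = central_and_hpd(draws)
print([round(v, 3) for v in central], [round(v, 3) for v in hpd])
```

Because the central interval is one of the candidate windows, the shortest interval can never be wider; for a skewed posterior it shifts toward the mode.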

72 Bayesian Confidence Intervals
Apart from providing an alternative procedure for estimation, the Bayesian approach provides a direct procedure for the formulation of parameter confidence intervals. Returning to the simple case of a single coin toss, the probability density function of the estimator becomes:

73 As previously discussed, try a = b = 1.4968; the Bayesian estimator of p is:

74 However, using the posterior distribution function, we can also compute the probability that the value of p is less than .5 given a head. Please verify this result! Hence, we have a very formal statement of confidence intervals as P(0.3 < p < 0.7).

75 Prediction: “Posterior Predictive Density” of a future observation
Binomial example: n=20, x=12, a=1, b=1.
[Figure: posterior predictive density of a future observation ỹ]
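For the binomial example, the posterior predictive distribution has a closed form. The sketch below assumes the usual conjugate result: with a Beta(1, 1) prior and 12 successes in 20 trials the posterior is Beta(13, 9), and the number of successes in m future trials follows a beta-binomial distribution. The number of future trials `m` is a hypothetical value, since the slide does not state one.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def predictive_pmf(k, m, a_post, b_post):
    """P(k successes in m future trials) under a Beta(a_post, b_post) posterior."""
    log_p = log_beta(a_post + k, b_post + m - k) - log_beta(a_post, b_post)
    return math.comb(m, k) * math.exp(log_p)


a_post, b_post = 1 + 12, 1 + (20 - 12)   # Beta(13, 9)
m = 10                                   # hypothetical number of future trials
pmf = [predictive_pmf(k, m, a_post, b_post) for k in range(m + 1)]
print(round(predictive_pmf(1, 1, a_post, b_post), 4))  # P(next single trial succeeds)
```

For a single future trial the predictive probability of success is the posterior mean, 13/22.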

76 Prediction for Univariate Normal

77 Prediction for Univariate Normal
The posterior predictive distribution is Normal.

78 Prediction for a Poisson

79 On the Compromise of Bayesian to Classical Estimation (presented at the South-East Asia Stat & Math Muslim Society Conference)
Nur Iriawan, Statistics Department of Institut Teknologi Sepuluh Nopember, Jl. Arief Rahman Hakim Sukolilo, Surabaya 60111, Indonesia

80 Example on Exponential
Suppose x is exponentially distributed. The MLE of its parameter is:

81 Using the Bayesian approach, the prior of the parameter is
specified first, and the likelihood is formed from the data. The posterior of the parameter given the data X then follows.

82 The Bayes estimator for the parameter can be derived using

84 Numerical Calculation
One thousand data points were generated from an Exponential distribution. The classical MLE gives the following result (using MINITAB):

85 Using WinBUGS, the Bayes estimator is

86 Revisiting the Binomial Result
The Bayes estimator obtained is $\hat{p}_B = (y + \alpha)/(n + \alpha + \beta)$, whereas the classical approach gives $\hat{p} = y/n$. What if α = β = 0? The Bayes estimator then coincides with the classical one. With these values the beta prior becomes the improper Beta(0, 0) prior, $p(p) \propto 1/[p(1-p)]$ (Haldane's prior); this differs from the Jeffreys prior Beta(½, ½) derived on slide 43.

87 Summary
The Bayesian estimator used here, reported as the posterior mean, is generated from an improper prior distribution. It has been shown that when there is no prior information about the model parameter and a constant or Jeffreys’ prior is used, the resulting estimator gives a compromise between the Bayesian and classical estimators.

88 Numerical Integration: Monte Carlo Method (Law and Kelton, 2000)
Suppose we want to compute the integral $I = \int_a^b g(x)\,dx$. If g(x) is complicated, I is hard to evaluate analytically. With a numerical approach, I can be obtained quite simply. The procedure is as follows:

89 Compute the expectation of Y as follows
Define a new random variable $Y = (b-a)\,g(X)$, with X uniform on the interval (a,b), i.e. X ~ U(a,b). The expectation of Y is $E(Y) = (b-a)\,E[g(X)] = (b-a)\int_a^b g(x)\,\frac{1}{b-a}\,dx = I$.

90 Hence the integral I can be approximated numerically
Since E(Y) = I, the integral can be approximated numerically by $\hat{I} = \frac{b-a}{n}\sum_{i=1}^{n} g(X_i)$. That is: generate data from the Uniform distribution, plug the values into g(x), sum them, take the average, and multiply by (b − a) to estimate the integral.
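The procedure on slides 88 to 90 can be sketched as below. The integrand sin(x) on (0, π), whose exact integral is 2, is an example chosen here for checking; it does not come from the slides. The function name `mc_integrate` and the seed are likewise illustrative.

```python
import math
import random


def mc_integrate(g, a, b, n, seed=0):
    """Estimate the integral of g over (a, b) as (b - a) times the mean of g(X), X ~ U(a, b)."""
    rng = random.Random(seed)
    total = sum(g(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n


estimate = mc_integrate(math.sin, 0.0, math.pi, n=100000)
print(round(estimate, 2))
```

The estimate approaches the exact value as n grows, with an error that shrinks roughly like 1/√n.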

91 How much data must be generated?
Data must be generated until the running average reaches convergence.
[Figure: running average against the number of generated values, with the burn-in period marked]

92 Another way to estimate the integral: random number generators (RNG)
Types of random number generator (RNG):
Inverse transformation
Composition
Convolution
Acceptance-Rejection (AR)
Adaptive Acceptance-Rejection (AAR)

93 Inverse Transformation
Requirement: the function has a closed-form CDF.
The method is as follows: generate u ~ U(0,1) and return $x = F^{-1}(u)$.
[Figure: CDF F(x), with u on the vertical axis between 0 and 1 mapped to x on the horizontal axis]
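A minimal sketch of the inverse transformation, assuming an Exponential target with CDF F(x) = 1 − exp(−x/mean), which inverts to x = −mean·ln(1 − u). The choice of distribution, the mean of 2.0, and the function name are assumptions made for the example.

```python
import math
import random


def exponential_inverse_transform(mean, rng):
    """Draw one Exponential(mean) variate by inverting its CDF."""
    u = rng.random()                    # u ~ U(0, 1), always below 1
    return -mean * math.log(1.0 - u)    # x = F^{-1}(u)


rng = random.Random(0)
sample = [exponential_inverse_transform(2.0, rng) for _ in range(50000)]
sample_mean = sum(sample) / len(sample)
print(round(sample_mean, 2))
```

The sample mean should be close to the chosen mean of 2.0.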

94 Composition (Mixture form)
Consider a density f(x) made up of two regions: data in region I are generated from a (half-)Normal distribution, and data in region II from an Exponential distribution.
[Figure: density f(x) with region I (half-Normal) and region II (Exponential)]

95 Convolution
Take, for example, an Erlang distribution with shape parameter m: its data are generated by convolving (summing) m generated Exponential variates.
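The convolution method for the Erlang case can be sketched as follows. The code assumes m independent Exponential variates with a common mean `mean_each` (a name introduced here), and uses the identity that the sum of −mean·ln(u_i) equals −mean·ln of the product of the u_i.

```python
import math
import random


def erlang_by_convolution(m, mean_each, rng):
    """Draw one Erlang variate as the sum of m Exponential(mean_each) variates."""
    product = 1.0
    for _ in range(m):
        product *= 1.0 - rng.random()   # each factor lies in (0, 1]
    return -mean_each * math.log(product)


rng = random.Random(0)
sample = [erlang_by_convolution(3, 2.0, rng) for _ in range(50000)]
sample_mean = sum(sample) / len(sample)
print(round(sample_mean, 2))
```

With m = 3 and a mean of 2.0 per component, the Erlang mean is 6.0, so the sample mean should be close to that.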

96 Acceptance-Rejection (AR)
Works very well for functions that may or may not be a proper pdf. It can accommodate functions without a closed-form CDF. The procedure is as follows:
[Figure: target f(x) under a majorizing function t(x), with the proposal density r(x); the region under f(x) is "Accept" and the region between f(x) and t(x) is "Reject"]

97 AR Algorithm
Generate x ~ r(x).
Generate u ~ U(0,1).
If u ≤ f(x)/t(x), then accept x.
Else reject x.
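The AR algorithm can be sketched on a concrete target. The example assumes the target f is the Beta(4, 4) density, 140·x³(1 − x)³ on [0, 1], a constant majorizing function t(x) = f(0.5) = 2.1875, and a U(0, 1) proposal r(x); these choices and the names `f`, `T_MAX`, and `accept_reject` are illustrative, not taken from the slides.

```python
import random


def f(x):
    """Target density: Beta(4, 4), i.e. 140 x^3 (1 - x)^3 on [0, 1]."""
    return 140.0 * x ** 3 * (1.0 - x) ** 3


T_MAX = f(0.5)   # constant majorizing function t(x) = max f(x) = 2.1875


def accept_reject(rng):
    """Draw one value from f by acceptance-rejection with a U(0, 1) proposal."""
    while True:
        x = rng.random()           # generate x ~ r(x) = U(0, 1)
        u = rng.random()           # generate u ~ U(0, 1)
        if u <= f(x) / T_MAX:      # accept x with probability f(x) / t(x)
            return x               # otherwise reject and try again


rng = random.Random(0)
sample = [accept_reject(rng) for _ in range(20000)]
sample_mean = sum(sample) / len(sample)
print(round(sample_mean, 2))
```

On average, about 1/T_MAX ≈ 46% of the proposals are accepted; a majorizing function that hugs f more tightly would waste fewer draws.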

