Logistic Models for Ordinal Data (Ordinal Regression)


Logistic Models for Ordinal Data (Ordinal Regression). Categorical Data Analysis, Session X.

Ordinal Regression. Uses an ordinal variable: its values can be ranked, but the actual distance between values is unknown. The ordinal logistic model for a single independent variable X is $\ln \theta_j = \alpha_j - \beta X$, $j = 1, \dots, J - 1$, where $J$ is the number of categories and $\theta_j = P(Y \le j) / P(Y > j)$ is the cumulative odds at threshold $j$. A larger coefficient indicates an association with higher scores.
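As an illustration of the model above, the following minimal sketch turns thresholds and a slope into cumulative and per-category probabilities. The threshold values, the slope, and the function names are hypothetical, chosen only to show how a positive coefficient shifts probability toward higher categories.

```python
import math

def cumulative_probs(alphas, beta, x):
    """P(Y <= j | x) for each threshold j under ln(theta_j) = alpha_j - beta * x."""
    return [1.0 / (1.0 + math.exp(-(a - beta * x))) for a in alphas]

def category_probs(alphas, beta, x):
    """P(Y = j | x) for every category, by differencing the cumulative probabilities."""
    cum = cumulative_probs(alphas, beta, x) + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Hypothetical values: 3 thresholds (4 ordered categories) and a positive slope.
alphas = [-1.0, 0.5, 2.0]
beta = 0.8

p_x0 = category_probs(alphas, beta, 0)
p_x1 = category_probs(alphas, beta, 1)
print([round(p, 3) for p in p_x0])
print([round(p, 3) for p in p_x1])
```

With a positive slope, every cumulative probability $P(Y \le j)$ is smaller at X = 1 than at X = 0, so the highest category becomes more likely.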

Ordinal Regression When you see a positive coefficient for a dichotomous factor, you know that higher scores are more likely for the first category. A negative coefficient tells you that lower scores are more likely. For a continuous variable, a positive coefficient tells you that as the values of the variable increase, the likelihood of larger scores increases. An association with higher scores means smaller cumulative probabilities for lower scores, since they are less likely to occur.

Ordinal Regression. Each logit has its own $\alpha_j$ (threshold value) but shares the same coefficient $\beta$, meaning that the effect of the independent variable is the same across the different logit functions (the proportional odds model). Example: a respondent satisfaction survey with answer options ranging from strongly agree to strongly disagree, where the ordinal variable is the dependent variable. In SPSS this is Ordinal Regression, or PLUM (Polytomous Universal Model).

Ordinal Regression with SPSS. For the dependent variable, SPSS models the probability of being at each level or below (not at each level or above). By default, SPSS takes the last category as the reference category. Example: initial English class level (Y) by gender (X; boys = 0, girls = 1), using LSYPE.sav. Menu: Analyze > Regression > Ordinal.

Gender   Level 3   Level 4   Level 5   Level 6   Level 7
Boys         967      1372      2835      1500       503
Girls        462       904      2780      2015       828

We compare the final model (model with all explanatory variables) against the baseline (model without any explanatory variables) to see whether it has significantly improved the fit to the data. The statistically significant chi-square statistic (p<.0005) indicates that the Final model gives a significant improvement over the baseline intercept-only model.

The Deviance (-2LL) Statistic. Deviance measures how much variation the logistic regression model leaves unexplained; the higher the deviance, the less accurate the model. The test statistic is $\chi^2 = (-2LL_{baseline}) - (-2LL_{new})$ with degrees of freedom $df = k_{new} - k_{baseline}$, where $k$ is the number of parameters in each model. If the new model explains the data better than the baseline, there should be a significant reduction in deviance, which can be tested against the chi-square distribution (giving a p-value).
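The deviance comparison can be sketched in a few lines of Python. The deviance values and parameter counts below are hypothetical, not taken from the SPSS output, and `chi2_sf` is a small helper written for this sketch (a standard-library substitute for a statistics package) that is valid only for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # closed form for df = 1
    else:
        q, k = math.exp(-half), 2              # closed form for df = 2
    # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def likelihood_ratio_test(dev_baseline, dev_new, k_baseline, k_new):
    """Return (chi-square, df, p-value) for the drop in deviance (-2LL)."""
    chi2 = dev_baseline - dev_new
    df = k_new - k_baseline
    return chi2, df, chi2_sf(chi2, df)

# Hypothetical deviances: intercept-only model (4 thresholds) vs. one added predictor.
chi2, df, p = likelihood_ratio_test(120.0, 110.0, k_baseline=4, k_new=5)
print(chi2, df, p)
```

A small p-value here would indicate that the added predictor significantly reduces the deviance relative to the baseline model.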

Caveats: the chi-square statistic tends to be significant in large samples and is sensitive to empty cells. Use a stricter p-value threshold (for example 0.01) and use the pseudo $R^2$.

These statistics are intended to test whether the observed data are consistent with the fitted model. We start from the null hypothesis that the fit is good. If we do not reject this hypothesis (i.e. if the p-value is large), we conclude that the data and the model predictions are similar and that we have a good model. Here, the pseudo $R^2$ values (e.g. Nagelkerke = 3.1%) indicate that gender explains a relatively small proportion of the variation between students in their attainment.

Parameter estimates is the core table, showing the relationship between the explanatory variables and the outcome variable. Thresholds are not interpreted; they are only the intercepts, the points (in logits) at which students are predicted into the higher category. The odds of being at level 6 or below (level = 6) are the complement of the odds of being at level 7; the odds of being at level 5 or below (level = 5) are the complement of the odds of being at level 6 or above; and so on.

Proportional odds principle. Girls = reference category; y = a - bx. 1/0.53 = 1.88; equivalently, 1/1.88 = 0.53.

Test of parallel lines. This test compares the ordinal model, which has one set of coefficients for all thresholds (labelled Null Hypothesis), to a model with a separate set of coefficients for each threshold (labelled General). If the general model gives a significantly better fit to the data than the ordinal (proportional odds) model (i.e. if p < .05), then we are led to reject the assumption of proportional odds. OR (girls as the base) = exp(-.629) = 0.53; OR (boys as the base) = exp(.629) = 1.88.

The Proportional Odds (PO) assumption. Cumulative proportion = just the cumulative percentage. Cumulative odds = p/(1 - p): the odds of reaching level 7 are 1347/(14463 - 1347), which is about 0.10, and the odds of being at level 6 or above are 4918/9545 = 0.52. Cumulative logit = ln(cumulative odds).

The effect of the explanatory variable is consistent, or proportional, across the different thresholds (in SPSS, the parallel lines assumption).

Girls tend to attain higher levels than boys.

In general, the odds for girls are always higher than those for boys. The OR varies across the category thresholds; if these ORs do not differ significantly, the relationship between gender and English level can be summarised by the single OR from the ordinal regression.
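To see how the odds ratio varies by threshold, the sketch below recomputes the cumulative odds directly from the gender by level counts given earlier in the transcript. It follows the SPSS convention of modelling the odds of being at each level or below; the function and variable names are illustrative only.

```python
levels = [3, 4, 5, 6, 7]
boys = [967, 1372, 2835, 1500, 503]
girls = [462, 904, 2780, 2015, 828]

def cumulative_odds(counts):
    """Odds of being at or below each level (all levels except the highest)."""
    total = sum(counts)
    odds, running = [], 0
    for c in counts[:-1]:
        running += c
        odds.append(running / (total - running))
    return odds

odds_boys = cumulative_odds(boys)
odds_girls = cumulative_odds(girls)
odds_ratios = [b / g for b, g in zip(odds_boys, odds_girls)]

for level, b, g, r in zip(levels, odds_boys, odds_girls, odds_ratios):
    print(f"level <= {level}: boys {b:.3f}, girls {g:.3f}, OR (boys vs girls) {r:.2f}")
```

The four threshold-specific odds ratios fall between roughly 1.8 and 2.2, which is the kind of spread the proportional odds model summarises with a single value.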

Ordinal Regression with Several Independent Variables. A study was conducted to examine the factors that influence a person's decision to apply to a higher level of education. Students were asked whether they "will not apply", "don't know", or "will apply" to the next level, so the outcome variable has three categories (0, 1, 2). Data were also collected on parental education (whether the parents' highest education is a bachelor's degree; 0, 1), type of educational institution (public or private; 0, 1), and GPA. Dataset: ologit.sav.

PLUM apply WITH pared public gpa
  /LINK=LOGIT
  /PRINT=FIT PARAMETER SUMMARY TPARALLEL.

Odds Ratio (the exponential of the Estimate). Thresholds are usually not included in the interpretation of proportional ORs. For pared, a one-unit increase (from 0 to 1) makes the odds of "will apply" versus "don't know" and "will not apply" combined 2.85 times greater, holding all other variables in the model constant. Likewise, the odds of "don't know" and "will apply" combined versus "will not apply" are 2.85 times greater. For a one-unit increase in GPA, the odds of "will apply" versus "will not apply" and "don't know" combined are 1.85 times greater.

Parental education and GPA are positively associated with the tendency to apply to a higher level of education. For a one-unit increase in parental education, the expected log odds of moving to the next higher category of apply increase by 1.05. For a one-unit increase in GPA, the expected log odds of moving to the next higher category of apply increase by 0.62. Public has no significant effect on apply.
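The link between the log-odds estimates and the odds ratios is just exponentiation, as this short sketch shows. It uses the rounded estimates quoted above (1.05 and 0.62), so the results can differ in the second decimal from odds ratios computed from unrounded coefficients.

```python
import math

# Rounded log-odds estimates quoted in the text (assumed to be rounded coefficients).
estimates = {"pared": 1.05, "gpa": 0.62}

# A proportional odds ratio is the exponential of the log-odds coefficient.
odds_ratios = {name: math.exp(b) for name, b in estimates.items()}

for name, value in odds_ratios.items():
    print(f"{name}: exp({estimates[name]}) = {value:.2f}")
```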

Example: A random sample of Vermont citizens was asked to rate the work of criminal judges in the state. The scale was Poor (1), Only fair (2), Good (3), and Excellent (4). At the same time, they had to report whether somebody in their household had been a crime victim within the last 3 years (1 = Yes, 2 = No). Dataset: vermont.sav. Do people with a history of victimisation and people without one hold the same view of how justice is administered?

Additional variables: sex, age (two categories), and education (5 categories).

Logistic Regression vs. Loglinear Models. Logistic regression is a statistical model used for a categorical dependent (response) variable. A loglinear model is used when there are at least two response variables in a contingency table; the model describes the pattern of association among a set of categorical response variables.

Loglinear models differ from logistic regression in three respects: the distribution of the categorical variable is Poisson rather than binomial; the link function is the log rather than the logit; and the predictions are estimated cell counts of the contingency table rather than logit values of the dependent variable.

Correspondence between Loglinear and Logit Models. A loglinear model and a logit model have the same structure for the association between the dependent (response) variable and the independent (explanatory) variables. The loglinear model additionally contains the most general interaction term for the relationships among the explanatory variables.

Correspondence between Loglinear and Logit Models. For an I x J x 2 table, the logit model is

$$\log\frac{m_{ij1}}{m_{ij2}} = \alpha + \beta_i^{A} + \beta_j^{B}$$

The response Y is associated with factors A and B, with the effect of each variable the same at every level of the other factor. The corresponding loglinear model contains the associations $\lambda_{ik}^{AY}$ and $\lambda_{jk}^{BY}$, plus $\lambda_{ij}^{AB}$ for the relationship between the factors. The resulting model is (AB, AY, BY).

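Correspondence between Loglinear and Logit Models. That the loglinear model (AB, AY, BY) implies the logit model can be shown as follows:

$$\log\frac{m_{ij1}}{m_{ij2}} = \log m_{ij1} - \log m_{ij2}$$

$$= \left(\mu + \lambda_i^{A} + \lambda_j^{B} + \lambda_1^{Y} + \lambda_{ij}^{AB} + \lambda_{i1}^{AY} + \lambda_{j1}^{BY}\right) - \left(\mu + \lambda_i^{A} + \lambda_j^{B} + \lambda_2^{Y} + \lambda_{ij}^{AB} + \lambda_{i2}^{AY} + \lambda_{j2}^{BY}\right)$$

$$= \left(\lambda_1^{Y} - \lambda_2^{Y}\right) + \left(\lambda_{i1}^{AY} - \lambda_{i2}^{AY}\right) + \left(\lambda_{j1}^{BY} - \lambda_{j2}^{BY}\right)$$

Assuming the sum-to-zero constraints

$$\sum_k \lambda_k^{Y} = \sum_k \lambda_{ik}^{AY} = \sum_k \lambda_{jk}^{BY} = 0,$$

we have $\lambda_1^{Y} = -\lambda_2^{Y}$, $\lambda_{i1}^{AY} = -\lambda_{i2}^{AY}$, and $\lambda_{j1}^{BY} = -\lambda_{j2}^{BY}$.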

Correspondence between Loglinear and Logit Models. The logit model therefore simplifies to

$$\log\frac{m_{ij1}}{m_{ij2}} = 2\lambda_1^{Y} + 2\lambda_{i1}^{AY} + 2\lambda_{j1}^{BY}$$
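The identity can be checked numerically. The sketch below builds log expected counts for a 2 x 3 x 2 table from the loglinear model (AB, AY, BY) and compares the resulting logit with the closed form. All parameter values are hypothetical and were chosen only so that the terms involving Y sum to zero over k.

```python
# Hypothetical lambda parameters for a 2 x 3 x 2 table (A x B x Y),
# chosen to satisfy the sum-to-zero constraints over the Y index k.
mu = 3.0
lam_A = [0.4, -0.4]
lam_B = [0.3, -0.1, -0.2]
lam_Y = [0.25, -0.25]
lam_AB = [[0.1, -0.05, -0.05], [-0.1, 0.05, 0.05]]
lam_AY = [[0.2, -0.2], [-0.2, 0.2]]
lam_BY = [[0.15, -0.15], [-0.05, 0.05], [-0.10, 0.10]]

def log_m(i, j, k):
    """Log expected count under the loglinear model (AB, AY, BY)."""
    return (mu + lam_A[i] + lam_B[j] + lam_Y[k]
            + lam_AB[i][j] + lam_AY[i][k] + lam_BY[j][k])

max_gap = 0.0
for i in range(2):
    for j in range(3):
        logit_from_loglinear = log_m(i, j, 0) - log_m(i, j, 1)
        logit_closed_form = 2 * lam_Y[0] + 2 * lam_AY[i][0] + 2 * lam_BY[j][0]
        max_gap = max(max_gap, abs(logit_from_loglinear - logit_closed_form))

# Largest discrepancy between the two expressions over all (i, j) cells.
print(max_gap)
```

The terms $\mu$, $\lambda_i^{A}$, $\lambda_j^{B}$ and $\lambda_{ij}^{AB}$ cancel in the difference, which is why only the terms involving Y remain in the logit.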