Data Mining: 7. Estimation Algorithms

Data Mining: 7. Estimation Algorithms
Romi Satria Wahono
romi@romisatriawahono.net
http://romisatriawahono.net/dm
WA/SMS: +6281586220090
http://romisatriawahono.net

Romi Satria Wahono
- SD Sompok Semarang (1987)
- SMPN 8 Semarang (1990)
- SMA Taruna Nusantara Magelang (1993)
- B.Eng, M.Eng and Ph.D in Software Engineering from Saitama University, Japan (1994-2004)
- Universiti Teknikal Malaysia Melaka (2014)
- Research Interests: Software Engineering, Machine Learning
- Founder and Coordinator of IlmuKomputer.Com
- Researcher at LIPI (2004-2007)
- Founder and CEO of PT Brainmatics Cipta Informatika

Course Outline
1. Introduction to Data Mining
2. The Data Mining Process
3. Data Preprocessing
4. Classification Algorithms
5. Clustering Algorithms
6. Association Algorithms
7. Estimation Algorithms

7. Estimation Algorithms
7.1 Linear Regression
7.2 Neural Network
7.3 Support Vector Machine

7.1 Linear Regression

7.1.1 Steps of the Linear Regression Algorithm

Steps of the Linear Regression Algorithm
1. Prepare the data
2. Identify the attribute and the label
3. Compute X², Y², XY and the total of each
4. Compute a and b using the given equations
5. Build the simple linear regression model

1. Data Preparation
Date   Average Room Temperature (X)   Number of Defects (Y)
1      24                             10
2      22                             5
3      21                             6
4      20                             3
5      22                             6
6      19                             4
7      20                             5
8      23                             9
9      24                             11
10     25                             13

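2. Identify the Attribute and the Label
Y = a + bX
Where:
Y = dependent variable (the label)
X = independent variable (the attribute)
a = constant (intercept)
b = regression coefficient (slope); the size of the response produced by the independent variable
n = number of observations; the sums run over the n observed (x, y) pairs

\[ a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} \]
\[ b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} \]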
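
The two formulas above can also be written in terms of the sample means, which is often easier to check by hand. This is standard algebra rather than something stated in the slides; the symbols \bar{x} and \bar{y} (the means of the observed x and y values) are notation introduced here.

```latex
\[
b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}
  = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum x^2 - n\,\bar{x}^2},
\qquad
a = \bar{y} - b\,\bar{x},
\qquad
\bar{x} = \frac{\sum x}{n},\quad \bar{y} = \frac{\sum y}{n}.
\]
```

The identity a = \bar{y} - b\bar{x} says that the fitted line always passes through the point of means (\bar{x}, \bar{y}).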

3. Compute X², Y², XY and the Total of Each
X = average room temperature, Y = number of defects
Date    X     Y     X²     Y²     XY
1       24    10    576    100    240
2       22    5     484    25     110
3       21    6     441    36     126
4       20    3     400    9      60
5       22    6     484    36     132
6       19    4     361    16     76
7       20    5     400    25     100
8       23    9     529    81     207
9       24    11    576    121    264
10      25    13    625    169    325
Total   220   72    4876   618    1640
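
As a rough sketch, the columns and totals in this step can be produced with a few lines of Python. The ten (X, Y) pairs below are the observations consistent with the totals in the table (ΣX = 220, ΣY = 72, ΣX² = 4876, ΣY² = 618, ΣXY = 1640); the variable names are illustrative and not part of the slides.

```python
# (X, Y) pairs: average room temperature and number of defects for dates 1-10.
observations = [(24, 10), (22, 5), (21, 6), (20, 3), (22, 6),
                (19, 4), (20, 5), (23, 9), (24, 11), (25, 13)]

# One row per date: X, Y, X^2, Y^2, XY.
rows = [(x, y, x * x, y * y, x * y) for x, y in observations]

# Column-wise totals, in the same order as the row fields.
totals = tuple(sum(column) for column in zip(*rows))

print(f"{'Date':>5} {'X':>5} {'Y':>5} {'X^2':>5} {'Y^2':>5} {'XY':>5}")
for date, row in enumerate(rows, start=1):
    print(f"{date:>5} " + " ".join(f"{value:>5}" for value in row))
print(f"{'Total':>5} " + " ".join(f"{value:>5}" for value in totals))
```

The totals computed this way are the only inputs the formulas for a and b need.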

4. Compute a and b Using the Given Equations
Computing the constant (a):
\[ a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} = \frac{(72)(4876) - (220)(1640)}{10(4876) - (220)^2} = \frac{-9728}{360} \approx -27.02 \]
Computing the regression coefficient (b):
\[ b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} = \frac{10(1640) - (220)(72)}{10(4876) - (220)^2} = \frac{560}{360} \approx 1.56 \]
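
The calculation above can be checked with a short function that takes the totals as inputs. This is a minimal sketch; the function name simple_linear_regression and its parameter names are chosen here for illustration.

```python
def simple_linear_regression(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (a, b) for Y = a + bX from the summary totals."""
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Happens when every X value is the same, so no slope can be fitted.
        raise ValueError("all X values are identical; the slope is undefined")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


# Totals taken from the worked example: n = 10 observations.
a, b = simple_linear_regression(n=10, sum_x=220, sum_y=72, sum_x2=4876, sum_xy=1640)
print(f"a = {a:.2f}, b = {b:.2f}")  # prints "a = -27.02, b = 1.56"
```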

5. Build the Simple Linear Regression Model
Y = a + bX
Y = -27.02 + 1.56X

Testing
Predict the number of production defects when the room temperature (variable X) is high, for example 30°C:
Y = -27.02 + 1.56X
Y = -27.02 + 1.56(30) = 19.78
If the target number of production defects (variable Y) is at most 5 units, what room temperature is needed to reach that target?
5 = -27.02 + 1.56X
1.56X = 5 + 27.02
X = 32.02 / 1.56
X ≈ 20.53
So the predicted room temperature that best meets the production defect target is about 20.53°C.
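
Both questions use the same fitted line, once forwards and once solved for X. A small sketch, assuming the rounded coefficients a = -27.02 and b = 1.56 from the model; the function names are illustrative.

```python
A = -27.02  # constant (intercept) of the fitted model
B = 1.56    # regression coefficient (slope) of the fitted model


def predict_defects(temperature_c):
    """Forward use of the model: Y = a + bX."""
    return A + B * temperature_c


def temperature_for_defects(target_defects):
    """Inverse use of the model: solve Y = a + bX for X."""
    return (target_defects - A) / B


print(round(predict_defects(30), 2))         # 19.78
print(round(temperature_for_defects(5), 2))  # 20.53
```

The exact quotient 32.02 / 1.56 is 20.5256..., which rounds to 20.53 and truncates to 20.52.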

7.1.2 CRISP-DM Case Study
Heating Oil Consumption - Estimation (Matthew North, Data Mining for the Masses, 2012, Chapter 8 Estimation, pp. 127-140)
Dataset: HeatingOil-Training.csv and HeatingOil-Scoring.csv

CRISP-DM

Context and Perspective
- Sarah, the regional sales manager, is back for more help.
- Business is booming and her sales team is signing up thousands of new clients. She wants to be sure the company can meet this new level of demand, so she is now hoping we can help her do some prediction as well.
- She knows that there is some correlation between the attributes in her data set (things like temperature, insulation, and occupant ages), and she is now wondering whether she can use the previous data set to predict heating oil usage for new customers.
- These new customers have not begun consuming heating oil yet, and there are a lot of them (42,650 to be exact). She wants to know how much oil she should expect to keep in stock to meet their demand.
- Can she use data mining to examine household attributes and known past consumption quantities to anticipate and meet her new customers’ needs?

1. Business Understanding
- Sarah’s new data mining objective is pretty clear: she wants to anticipate demand for a consumable product.
- We will use a linear regression model to help her with her desired predictions.
- She has data: 1,218 observations that give an attribute profile for each home, along with those homes’ annual heating oil consumption.
- She wants to use this data set as training data to predict the usage that 42,650 new clients will bring to her company.
- She knows that these new clients’ homes are similar in nature to her existing client base, so the existing customers’ usage behavior should serve as a solid gauge for predicting future usage by new customers.
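
The train-then-score idea described above can be sketched with an ordinary least-squares fit on several attributes. This is only an illustration of the technique, not the tool or workflow used in the case study: the helper names fit_linear and predict are hypothetical, and the toy rows are invented numbers that follow an exact linear rule, not values from HeatingOil-Training.csv.

```python
from typing import List, Sequence


def fit_linear(rows: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    design = [[1.0, *row] for row in rows]  # leading 1.0 models the intercept
    k = len(design[0])
    # Augmented normal equations: (X^T X | X^T y).
    aug = [
        [sum(d[i] * d[j] for d in design) for j in range(k)]
        + [sum(d[i] * t for d, t in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("attributes are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


def predict(weights: Sequence[float], row: Sequence[float]) -> float:
    """Apply the fitted model to one row of attribute values."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


# Invented toy training rows: (Insulation, Temperature) -> Heating_Oil.
training_rows = [(2, 30), (5, 40), (8, 35), (4, 60), (9, 50)]
training_oil = [220, 170, 150, 140, 110]

weights = fit_linear(training_rows, training_oil)  # "training" step
new_home = (6, 45)                                 # hypothetical scoring row
estimate = predict(weights, new_home)              # "scoring" step
print(round(estimate, 2))
```

In the case study the same pattern applies at a larger scale: the model is fitted on the homes with known consumption and then applied to the new clients' homes, whose consumption is unknown.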

2. Data Understanding
We create a data set composed of the following attributes:
- Insulation: This is a density rating, ranging from one to ten, indicating the thickness of each home’s insulation. A home with a density rating of one is poorly insulated, while a home with a density of ten has excellent insulation.
- Temperature: This is the average outdoor ambient temperature at each home for the most recent year, measured in degrees Fahrenheit.
- Heating_Oil: This is the total number of units of heating oil purchased by the owner of each home in the most recent year.
- Num_Occupants: This is the total number of occupants living in each home.
- Avg_Age: This is the average age of those occupants.
- Home_Size: This is a rating, on a scale of one to eight, of the home’s overall size. The higher the number, the larger the home.
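
The attribute descriptions above imply simple range checks that can be applied when loading the data. The sketch below assumes the CSV header uses exactly the attribute names listed (the actual column order and header of HeatingOil-Training.csv are not shown in the slides), the function name load_homes is hypothetical, and the two sample rows are made up for illustration.

```python
import csv
import io

COLUMNS = ["Insulation", "Temperature", "Heating_Oil",
           "Num_Occupants", "Avg_Age", "Home_Size"]

# Allowed ranges for the two rating attributes, from the descriptions above.
RATING_RANGES = {"Insulation": (1, 10), "Home_Size": (1, 8)}


def load_homes(text):
    """Parse CSV text into dicts of floats, rejecting out-of-range ratings."""
    homes = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        home = {name: float(row[name]) for name in COLUMNS}
        for name, (low, high) in RATING_RANGES.items():
            if not low <= home[name] <= high:
                raise ValueError(
                    f"line {line_no}: {name}={home[name]} is outside {low}-{high}")
        homes.append(home)
    return homes


# Made-up sample rows, only to show the expected shape of the input.
SAMPLE = """Insulation,Temperature,Heating_Oil,Num_Occupants,Avg_Age,Home_Size
4,74,132,4,23.8,4
9,43,263,2,56.7,6
"""

homes = load_homes(SAMPLE)
print(len(homes), homes[0]["Insulation"])
```

To read a real file instead of a string, the text could come from open(path, newline="").read().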

3. Data Preparation
A CSV data set for this chapter’s example is available for download at the book’s companion web site (https://sites.google.com/site/dataminingforthemasses/).

4. Modeling

5. Evaluation

6. Deployment

7.2 Neural Network

7.3 Support Vector Machine

Post-Test
1. Explain the difference between data, information and knowledge!
2. Explain what you know about data mining!
3. Name the main roles of data mining!
4. Name uses of data mining in various fields!
5. What knowledge or patterns can we obtain from the data below?
NIM    Gender  UN Score  School   IPS1  IPS2  IPS3  IPS4  ...  Graduated On Time
10001  M       28        SMAN 2   3.3   3.6   2.89  2.9        Yes
10002  F       27        SMAN 7   4.0   3.2   3.8   3.7        No
10003 24 SMAN 1 2.7 3.4 3.5
10004 26.4 SMAN 3
11000 23.4 SMAN 5 2.8 3.1
