Jeff Howbert Introduction to Machine Learning Winter 2012 1 Classification Nearest Neighbor.

Slides:



Advertisements
Presentasi serupa
Menggambarkan Data: Tabel Frekuensi, Distribusi Frekuensi, dan Presentasi Grafis Chapter 2.
Advertisements

Developing Knowledge Management dalam perusahaan Week 10 – Pert 19 & 20 (Off Class Session)
PERSAMAAN DAN PERTIDAKSAMAAN
Pengujian Hipotesis untuk Satu dan Dua Varians Populasi
3. Economic Returns to Land Resources: Theories of Land Rent
Mata Kuliah : ALGORITMA dan STRUKTUR DATA 1.
PEMOGRAMAN BERBASIS JARINGAN
Modeling Data in the Organization
GRAPHICAL SOLUTION OF LINEAR PROGRAMMING PROBLEMS
LOCAL AREA NETWORK (LAN)
TRIP GENERATION.
Evaluasi Ujian Ketrampilan dan Laporan. 14:30 – 14:10 : Penjelasan Tugas14:10 – 15:10 : Kerja Mandiri15:10 – 16: 00 : Diskusi Alokasi waktu.
Materi Analisa Perancangan System.
Peta Kontrol (Untuk Data Variabel)
1 Pertemuan > Desain fisik basis data Matakuliah: >/ > Tahun: > Versi: >
1 Pertemuan 21 Pompa Matakuliah: S0634/Hidrologi dan Sumber Daya Air Tahun: 2006 Versi: >
INTRODUCTION TO PSYCHOLOGY. Recommended Literature 1. Introduction to Psychology : Gateway to Mind and Behavior by Dennis Coon and John O. Mitterer 2.
ESTIMATION AND ROONDING OF NUMBERS
PERULANGANPERULANGAN. 2 Flow of Control Flow of Control refers to the order that the computer processes the statements in a program. –Sequentially; baris.
Tugas-Tugas.
Slide 3-1 Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Revised by IB & SAM, Fasilkom UI, 2005 Exercises Apa saja komponen utama.
Estimasi Prob. Density Function dengan EM Sumber: -Forsyth & Ponce Chap. 7 -Standford Vision & Modeling Sumber: -Forsyth & Ponce Chap. 7 -Standford Vision.
Introduction to The Design & Analysis of Algorithms
PROSES PADA WINDOWS Pratikum SO. Introduksi Proses 1.Program yang sedang dalam keadaan dieksekusi. 2.Unit kerja terkecil yang secara individu memiliki.
M. Suwarso Kegiatan Lembaga Standarisasi Internasional Dalam Hal Telepon Internet Telepon Internet.
Review Operasi Matriks
TEKNOLOGI WIRELESS Modul 1 - Teknologi Wireless.
Applications of Matrix and Linear Transformation in Geometric and Computational Problems by Algebra Research Group Dept. of Mathematics Course 2.
Restricting and Sorting Data
Pengantar/pengenalan (Introduction)
Interface Nur Hayatin, S.ST Jurusan Teknik Informatika Universitas Muhammadiyah Malang Sem Genap 2010.
Bilqis1 Pertemuan bilqis2 Sequences and Summations Deret (urutan) dan Penjumlahan.
Menjelaskan sifat – sifat komponen elektronika aktif dan pasif
Risk Management.
Implementing an REA Model in a Relational Database
Pertemuan 3 Menghitung: Nilai rata-rata (mean) Modus Median
Analysis of Variance (ANOVA)
Pendugaan Parameter part 2
MEMORY Bhakti Yudho Suprapto,MT. berfungsi untuk memuat program dan juga sebagai tempat untuk menampung hasil proses bersifat volatile yang berarti bahwa.
SUBPROGRAM IN PASCAL Function.
3 nd Meeting Chemical Analysis Steps and issues STEPS IN CHEMICAL ANALYSIS 1. Sampling 2. Preparation 3. Testing/Measurement 4. Data analysis 2. Error.
Basisdata Pertanian. After completing this lesson, you should be able to do the following Identify the available group functions Describe the use of group.
1 Magister Teknik Perencanaan Universitas Tarumanagara General View On Graduate Program Urban & Real Estate Development (February 2009) Dr.-Ing. Jo Santoso.
Switch. Perluasan dari bridge Arsitektur switch: – Store and forward.
Departemen Ilmu Komputer FMIPA IPB
LOGO Manajemen Data Berdasarkan Komputer dengan Sistem Database.
Lecture 1 Introduction to C# Erick Pranata © Sekolah Tinggi Teknik Surabaya 1.
Linked List dan Double Linked List
Definisi VLAN Pemisahan jaringan secara logis yang dilakukan pada switch Pada tradisional switch, dalam satu switch menunjukkan satu segmentasi LAN.
3.1 © 2007 by Prentice Hall OVERVIEW Information Systems, Organizations, and Strategy.
Diagnose device problems that connected to the Wide Area Network Identify problems Through the Symptoms that arise HOME.
SMPN 2 DEMAK GRADE 7 SEMESTER 2
Introduction to Softcomputing Son Kuswadi Robotic and Automation Based on Biologically- inspired Technology (RABBIT) Electronic Engineering Polytechnic.
1. 2 Work is defined to be the product of the magnitude of the displacement times the component of the force parallel to the displacement W = F ║ d F.
Lecture 8 Set and Dictionary Sandy Ardianto & Erick Pranata © Sekolah Tinggi Teknik Surabaya 1.
MAINTENANCE AND REPAIR OF RADIO RECEIVER Competency : Repairing of Radio Receiver.
© 2009 Fakultas Teknologi Informasi Universitas Budi Luhur Jl. Ciledug Raya Petukangan Utara Jakarta Selatan Website:
PERSAMAAN DAN PERTIDAKSAMAAN
PENJUMLAHAN GAYA TUJUAN PEMBELAJARAN:
SISTEM TERDISTRIBUSI (SILABUS dan Introduction to Distributed Systems)
© 2007 Cisco Systems, Inc. All rights reserved.Cisco Public ITE PC v4.0 Chapter 1 1 Pengalamatan Jaringan – IPv4 Dosen Pengampu: Resi Utami Putri, S.Kom.,
TCP, THREE-WAY HANDSHAKE, WINDOW
Retrosintetik dan Strategi Sintesis
Web Teknologi I (MKB511C) Minggu 12 Page 1 MINGGU 12 Web Teknologi I (MKB511C) Pokok Bahasan: – Text processing perl-compatible regular expression/PCRE.
Lecture 2 Introduction to C# - Object Oriented Sandy Ardianto & Erick Pranata © Sekolah Tinggi Teknik Surabaya 1.
Lecture 9 Single Linked List Sandy Ardianto & Erick Pranata © Sekolah Tinggi Teknik Surabaya 1.
Applied Multivariate Analysis
Presented By : Group 2. A solution of an equation in two variables of the form. Ax + By = C and Ax + By + C = 0 A and B are not both zero, is an ordered.
Classification Supervised learning.
PRODI MIK | FAKULTAS ILMU-ILMU KESEHATAN
Transcript presentasi:

Jeff Howbert Introduction to Machine Learning Winter Classification Nearest Neighbor

Jeff Howbert Introduction to Machine Learning Winter Instance based classifiers Store the training samples Use training samples to predict the class label of unseen samples

Jeff Howbert Introduction to Machine Learning Winter l Examples: –Rote learner  memorize entire training data  perform classification only if attributes of test sample match one of the training samples exactly –Nearest neighbor  use k “closest” samples (nearest neighbors) to perform classification Instance based classifiers

Jeff Howbert Introduction to Machine Learning Winter l Basic idea: –If it walks like a duck, quacks like a duck, then it’s probably a duck Nearest neighbor classifiers training samples test sample compute distance choose k of the “nearest” samples

Jeff Howbert Introduction to Machine Learning Winter Nearest neighbor classifiers Requires three inputs: 1.The set of stored samples 2.Distance metric to compute distance between samples 3.The value of k, the number of nearest neighbors to retrieve

Jeff Howbert Introduction to Machine Learning Winter Nearest neighbor classifiers To classify unknown record: 1.Compute distance to other training records 2.Identify k nearest neighbors 3.Use class labels of nearest neighbors to determine the class label of unknown record (e.g., by taking majority vote)

Jeff Howbert Introduction to Machine Learning Winter Definition of nearest neighbor k-nearest neighbors of a sample x are datapoints that have the k smallest distances to x

Jeff Howbert Introduction to Machine Learning Winter nearest neighbor Voronoi diagram

Jeff Howbert Introduction to Machine Learning Winter l Compute distance between two points: –Euclidean distance l Options for determining the class from nearest neighbor list –Take majority vote of class labels among the k-nearest neighbors –Weight the votes according to distance  example: weight factor w = 1 / d 2 Nearest neighbor classification

Jeff Howbert Introduction to Machine Learning Winter l Choosing the value of k: –If k is too small, sensitive to noise points –If k is too large, neighborhood may include points from other classes Nearest neighbor classification

Jeff Howbert Introduction to Machine Learning Winter l Scaling issues –Attributes may have to be scaled to prevent distance measures from being dominated by one of the attributes –Example:  height of a person may vary from 1.5 m to 1.8 m  weight of a person may vary from 90 lb to 300 lb  income of a person may vary from $10K to $1M Nearest neighbor classification

Jeff Howbert Introduction to Machine Learning Winter l Problem with Euclidean measure: –High dimensional data  curse of dimensionality –Can produce counter-intuitive results Nearest neighbor classification… vs d =  one solution: normalize the vectors to unit length

Jeff Howbert Introduction to Machine Learning Winter l k-Nearest neighbor classifier is a lazy learner –Does not build model explicitly. –Unlike eager learners such as decision tree induction and rule-based systems. –Classifying unknown samples is relatively expensive. l k-Nearest neighbor classifier is a local model, vs. global model of linear classifiers. Nearest neighbor classification

Jeff Howbert Introduction to Machine Learning Winter l Contoh Contoh Kasus X1 = Ketahanan Asam (detik) X2 = Kekuatan (Kg / meter persegi) Y = Klasifikasi 77 Buruk 7744Buruk 3344Baik 1144Baik

Jeff Howbert Introduction to Machine Learning Winter Tentukan parameter K = jumlah tetangga terdekat Misalkan menggunakan K = 3 2. Hitung jarak antara permintaan dan contoh- contoh training semua Koordinat query instance adalah (3, 7), kemudian untuk menghitung jarak, kita menghitung jarak kuadrat yang lebih cepat (tanpa akar kuadrat) Penyelesaian

Jeff Howbert Introduction to Machine Learning Winter X1 = Ketahanan Asam (detik) X2 = Kekuatan (Kg / meter persegi) Jarak Kuadrat untuk Query- Instance (3,7)

Jeff Howbert Introduction to Machine Learning Winter Urutkan jarak dan tentukan tetangga terdekat berdasarkan jarak terdekat ke-K X1 = Ketahanan Asam (detik) X2 = Kekuatan (Kg / meter persegi) Jarak Kuadrat untuk Query- Instance (3,7) Urutan Jarak Terdekat Apakah termasuk dalam 3- Nearest Neighbors? 773Yes 744No 341Yes 142

Jeff Howbert Introduction to Machine Learning Winter l X1 = Ketahanan Asam (detik) X2 = Kekuatan (Kg / meter persegi) Jarak Kuadrat untuk Query- Instance (3,7) Urutan Jarak Terdekat Apakah termasuk dalam 3- Nearest Neighbors? Y = Kategori dari Nearest Neighbor 773YesBuruk 744No- 341YesBaik 142YesBaik

Jeff Howbert Introduction to Machine Learning Winter Gunakan mayoritas sederhana dari kategori tetangga terdekat sebagai nilai prediksi query- instance Terdapat 2 baik dan 1 buruk pada masing- masing kategori, dimana 2 > 1 maka kita menyimpulkan bahwa kertas tisu baru yang lulus uji laboratorium dengan X1 = 3 dan X2 = 7 adalah termasuk dalam kategori baik.