INTRODUCTION TO DATA MINING

Presentation transcript:

INTRODUCTION TO DATA MINING
Fakultas Informatika – Telkom University, 10/11/2017

Outline
- Background of data mining
- What data mining is, and why it is needed
- Data mining tasks
- Data mining functionality
- Relationship between data mining systems and database systems, data warehouse systems, and business intelligence
- Issues in data mining

Our learning approach: Student-Centered Learning

Background of Data Mining (1)
Abundance of data: automated tools and database technology generate data continuously, so the volume recorded in databases and other storage media keeps growing.

Background of Data Mining (2)
Although data is extremely abundant, very little of it is processed into knowledge.
The solution: data warehousing and data mining
- Data warehouse and OLAP (online analytical processing)
- Extraction of interesting knowledge (rules, regularities, patterns, constraints, etc.) from data stored in large databases

Top 10 Largest Databases, 2012

No. | Organization | Amount of data
1 | World Data Centre for Climate | 20 terabytes of web data; 6 petabytes of additional data
2 | National Energy Research Scientific Computing Center | 2.8 petabytes of data; operated by 2,000 computational scientists
3 | AT&T | 23 terabytes of information; 1.9 trillion phone call records
4 | Google | 1 million searches per day

Notes:
10. Not even the digital age can prevent the world's largest library from ending up on this list. The Library of Congress (LC) boasts more than 130 million items ranging from cook books to colonial newspapers to U.S. government proceedings. It is estimated that the text portion of the Library of Congress would comprise 20 terabytes of data. The LC expands at a rate of 10,000 items per day and takes up close to 530 miles of shelf space.
9. Portions of the CIA database available to the public include the Freedom of Information Act (FOIA) Electronic Reading Room, The World Fact Book, and various other intelligence-related publications.
8. Amazon
7. YouTube
6. LexisNexis
5. Sprint. Sprint is one of the world's largest telecommunication companies: it offers mobile services to more than 53 million subscribers and, prior to being sold in May of 2006, offered local and long-distance land line packages. Large telecommunication companies like Sprint are notorious for having immense databases to keep track of all of the calls taking place on their network. Sprint's database processes more than 365 million call detail records and operational measurements per day. The Sprint database is spread across 2.85 trillion database rows, making it the database with the largest number of rows (data insertions, if you will) in the world. At its peak, the database is subjected to more than 70,000 call detail record insertions per second.
4. Google
3. AT&T. Similar to Sprint, the United States' oldest telecommunications company AT&T [...]
2. The second largest database in the world belongs to the National Energy Research Scientific Computing Center (NERSC) in Oakland, California. NERSC is owned and operated by the Lawrence Berkeley National Laboratory and the U.S. Department of Energy.

Source: http://www.siliconindia.com/news/enterpriseit/Top-10-Largest-Databases-in-the-World-nid-118891-cid-7.html

Global Data Growth (1)
[Figure. Source: Tan, 2004]

Global Data Growth (2)
The amount of data stored in various media doubled in three years, from 1999 to 2002. The amount of data put into storage in 2002, five exabytes (one exabyte is one quintillion bytes), was equal to the contents of half a million new libraries, each containing a digitised version of the print collection of the entire US Library of Congress (Lyman and Varian, UC Berkeley, 2003).

Global Data Growth (3)
"It is projected that just four years from now, the world's information base will be doubling in size every 11 hours. So rapid is the growth in the global stock of digital data that the very vocabulary used to indicate quantities has had to expand to keep pace. A decade or two ago, professional computer users and managers worked in kilobytes and megabytes. Now school children have access to laptops with tens of gigabytes of storage, and network managers have to think in terms of the terabyte (1,000 gigabytes) and the petabyte (1,000 terabytes). Beyond those lie the exabyte, zettabyte and yottabyte, each a thousand times bigger than the last."
(IBM Global Technology Services white paper, published July 2006, titled "The toxic terabyte: How data-dumping threatens business efficiency")


Data Mining?

Just a joke...

Definitions of Data Mining
- "Data mining is an iterative process within which progress is defined by discovery, through either automatic or manual methods." [Kantardzic, 2003]
- "Data mining (DM) is the extraction of hidden predictive information from large databases (DBs). With the automatic discovery of knowledge implicit within DBs, DM uses sophisticated statistical analysis and modeling techniques to uncover patterns and relationships hidden in organizational DBs." [Wang, 2003]
- "Data mining refers to extracting or 'mining' knowledge from large amounts of data." [Han, 2005]
- "Non-trivial extraction of implicit, previously unknown and potentially useful information from data." [Tan, 2003]

Origins of Data Mining
Data mining grew out of several disciplines. It aims to improve on traditional techniques so that they can handle:
- Very large volumes of data
- High-dimensional data
- Heterogeneous data with differing characteristics

So, what is data mining?
Key points:
- It is non-trivial and iterative
- It discovers knowledge or information in large volumes of data
- Data mining is the core of the Knowledge Discovery in Databases (KDD) process

Data Mining & the KDD Process
[Figure: the stages of the KDD process, from raw data to knowledge]
Databases -> Data Cleaning and Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation -> Knowledge
Source: Han, 2004
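The stages in the KDD figure can be sketched as a chain of small functions. This is only an illustrative toy, not the method from the slides: the records, field names, and helper functions (`clean`, `integrate`, `select`, `mine`, `evaluate`) are all hypothetical, and the "mining" step is a trivial count standing in for a real algorithm.

```python
from collections import Counter

# Two hypothetical source "databases" holding purchase records.
store_a = [
    {"customer": "c1", "item": "bread", "qty": 2},
    {"customer": "c2", "item": "milk", "qty": None},   # missing value
    {"customer": "c1", "item": "bread", "qty": 2},     # exact duplicate
]
store_b = [
    {"customer": "c3", "item": "Bread ", "qty": 1},    # inconsistent spelling
    {"customer": "c4", "item": "milk", "qty": 3},
]


def clean(records):
    """Data cleaning: drop rows with missing qty, normalize item names, remove duplicates."""
    seen, result = set(), []
    for r in records:
        if r["qty"] is None:
            continue
        row = (r["customer"], r["item"].strip().lower(), r["qty"])
        if row not in seen:
            seen.add(row)
            result.append({"customer": row[0], "item": row[1], "qty": row[2]})
    return result


def integrate(*sources):
    """Data integration: merge the cleaned sources into one 'warehouse' list."""
    return [r for source in sources for r in source]


def select(warehouse, fields):
    """Selection: keep only the task-relevant fields."""
    return [{f: r[f] for f in fields} for r in warehouse]


def mine(task_data):
    """Data mining (toy stand-in): total units sold per item."""
    counts = Counter()
    for r in task_data:
        counts[r["item"]] += r["qty"]
    return counts


def evaluate(patterns, min_qty):
    """Pattern evaluation: keep only patterns that reach a chosen threshold."""
    return {item: qty for item, qty in patterns.items() if qty >= min_qty}


warehouse = integrate(clean(store_a), clean(store_b))
task_data = select(warehouse, ["item", "qty"])
knowledge = evaluate(mine(task_data), min_qty=3)
print(knowledge)
```

In practice the process is iterative: a weak result at the evaluation stage usually sends the analyst back to cleaning or selection.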

Types of Data in Data Mining
- Databases, data warehouses, transactional databases
- Data streams and sensor data
- Time-series data, temporal data, sequence data
- Structured data, graphs, social networks, and linked databases
- Object-relational databases
- Spatial and spatiotemporal data
- Multimedia databases
- Text databases
- The World Wide Web


Architecture of a Data Mining System
[Figure: layered architecture, from top to bottom]
- Graphical User Interface
- Pattern Evaluation
- Data Mining Engine
- Database or Data Warehouse Server
- Data cleaning, integration, and selection
- Data sources: Database, Data Warehouse, World Wide Web, other information repositories
A Knowledge Base sits alongside the layers.

Relationship between DM, DB, and DW
To be used optimally, a data mining (DM) system should be coupled with a database (DB) system and a data warehouse (DW) system.
- No coupling: not recommended, for example flat-file processing.
- Loose coupling: for example, fetching data from the DB/DW.
- Semi-tight coupling: improves DM performance by implementing data mining primitives inside the DB/DW system, for example sorting, indexing, aggregation, histogram analysis, multiway join, etc.
- Tight coupling: a uniform processing environment in which DM is integrated into the DB/DW system; mining queries are optimized based on the mining query, indexing, query processing methods, etc.
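The difference between loose and semi-tight coupling can be illustrated with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the `sales` table and its rows are hypothetical, and a plain `SUM ... GROUP BY` stands in for the aggregation primitive that a semi-tightly coupled system would run inside the database.

```python
import sqlite3

# Hypothetical in-memory database with a small sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("west", 5.0), ("east", 20.0), ("west", 15.0)],
)

# Loose coupling: the database only hands over raw rows;
# the aggregation is done by the mining program itself.
totals_loose = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    totals_loose[region] = totals_loose.get(region, 0.0) + amount

# Semi-tight coupling: the aggregation primitive runs inside the database,
# so only the summarized result crosses the boundary.
totals_semi = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)

conn.close()
print(totals_loose)
print(totals_semi)
```

Both variants give the same totals; the semi-tight version moves less data out of the database, which is where the performance gain mentioned above comes from.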

Data Mining & Business Intelligence
[Figure: pyramid of layers; the potential to support business decisions increases toward the top]
- Making decisions
- Data presentation: visualization techniques
- Data mining: information discovery
- Data exploration: OLAP, MDA, statistical analysis, querying and reporting
- Data warehouses / data marts
- Data sources: paper, files, information providers, database systems, OLTP
Roles shown alongside the layers, from top to bottom: end user, business analyst, data analyst, DBA.


Data Mining Tasks
Predictive methods: use some variables to predict unknown or future values of other variables. Examples:
- Classification
- Regression
- Deviation detection
Descriptive methods: find human-interpretable patterns that describe the data. Examples:
- Clustering
- Association rule discovery
- Sequential pattern discovery
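To make the two task families concrete, here is a small sketch with one example of each. The data and function names are hypothetical, and the techniques (1-nearest-neighbor classification and a one-dimensional k-means with k=2) are chosen only because they are short; the slides do not prescribe them.

```python
# Hypothetical labeled data: (monthly_spend, label).
labeled = [(10, "low"), (12, "low"), (48, "high"), (55, "high")]


def predict_1nn(x, examples):
    """Predictive task (classification): return the label of the closest known example."""
    nearest = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest[1]


def cluster_two_groups(values, iterations=10):
    """Descriptive task (clustering): 1-D k-means with k=2,
    initialized at the smallest and largest value."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer center (ties go to the first).
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[idx].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups


print(predict_1nn(50, labeled))                        # uses known labels
print(cluster_two_groups([10, 12, 48, 55, 11, 50]))    # no labels needed
```

The predictive function needs labeled examples to answer a question about a new value, while the clustering function only describes structure already present in unlabeled data.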


Data Mining Functionality (1)
- Classification and prediction
- Frequent patterns, association, correlation, and causality
- Cluster analysis
- Outlier analysis
- Trend and evolution analysis
- Statistical analysis

Data Mining Applications (1)
- Market analysis and management: target marketing, customer relationship management (CRM), market basket analysis, cross-selling, market segmentation
- Risk analysis and management: forecasting, customer retention, quality control, competitive analysis
- Fraud detection and management
- Text mining (newsgroups, email, documents) and web analysis
- Intelligent query answering
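Market basket analysis, listed above, is commonly expressed through the support and confidence of item combinations. The following is a minimal brute-force sketch, assuming a hypothetical list of transactions; it enumerates every item pair rather than using an optimized algorithm such as Apriori.

```python
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


def frequent_pairs(transactions, min_support):
    """Brute force: check every pair of items against the support threshold."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        s = support(set(pair), transactions)
        if s >= min_support:
            result[pair] = s
    return result


print(frequent_pairs(transactions, min_support=0.5))
print(confidence({"beer"}, {"diapers"}, transactions))
```

Here every basket containing beer also contains diapers, so the rule beer -> diapers has confidence 1.0, while the reverse rule is weaker because diapers also appear without beer.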

Data Mining Applications (2)
- Marketing and sales promotion
- Supermarket shelf management
- Inventory management
- Medical diagnosis
- Collaborative filtering
- Business intelligence
- Network intrusion detection
- Spam detection, etc.


Major Issues
How should the mining methodology be chosen? This is difficult because:
- Data types differ
- The expected performance, in terms of effectiveness, efficiency, and scalability, may differ for each methodology
- Pattern evaluation, that is, the measurement of "interestingness", differs
- Missing values, noise, etc. must be handled
What form should interaction with the user take?
- Data mining query languages and ad-hoc mining
- Expression and visualization of data mining results
Applications and social impact
- Protection of data security, integrity, and privacy
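As one illustration of the missing-value and noise issue, the sketch below fills gaps with the median of the observed values and then applies a sliding median filter. The sensor readings and function names are hypothetical, and median imputation and median smoothing are just two simple options among many; the slides do not recommend a specific technique.

```python
from statistics import median

# Hypothetical sensor readings: None marks a missing value, 95.0 is a noisy spike.
readings = [21.0, None, 22.0, 95.0, 23.0, None, 22.0]


def impute_missing(values):
    """Replace each None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def smooth_median(values, window=3):
    """Dampen noise with a sliding median; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


filled = impute_missing(readings)
smoothed = smooth_median(filled)
print(filled)
print(smoothed)
```

The median is used for imputation because the mean would be pulled upward by the spike; whether a spike is noise or a genuinely interesting outlier depends on the application.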
