Data Mining Applications: Data Mining Seminar

Presentation transcript:

Data Mining Applications. Data Mining Seminar: Business Trouble and Industrial Applications. Data Mining Lab, Industrial Engineering, Universitas Islam Indonesia, 10 May 2008. Budi Santosa

Contents: Introduction; Data; Association rules; Classification; Clustering; Data mining applications; Commercial tools; Conclusion. 4/24/2017 Budi Santosa

Introduction: What is data mining? Why do we need to mine data? What kinds of data can we mine?

Definition of data mining: Data mining combines statistical data-analysis methods with algorithms for processing large data sets. It is the process of finding important information or patterns in large databases, and it is part of the Knowledge Discovery in Databases (KDD) process. It involves the exploration and analysis of large quantities of data, using automatic or semi-automatic tools, to discover meaningful patterns and rules. These patterns allow a company to better understand its customers and to improve its marketing, sales, and customer support operations.


Why data mining? Data collection is growing explosively. Data is stored in data warehouses. Data is increasingly accessible from the Web and intranets. We need more effective ways to use this data for decision support than traditional query languages alone.

What kinds of data? Data warehouses; transactional databases; advanced database systems (spatial and temporal, time-series, multimedia, text, WWW, …). Figure labels: Structure (3D anatomy), Function (1D signal), Metadata (annotation).

Working with data: Most data mining algorithms work only on numeric data, so all data should be represented as numbers before an algorithm can be applied. Whether the data is sales figures, crime rates, text, or images, we have to find a suitable way to transform it into numbers.
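One common way to turn a categorical attribute into numbers is one-hot encoding. The sketch below is only an illustration of that idea: the function name `one_hot` and the sample values are assumptions, not something the slides prescribe.

```python
def one_hot(values):
    """Encode a list of category labels as 0/1 indicator rows."""
    categories = sorted(set(values))  # fixed column order
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical categorical column (illustrative values only).
lenses = ["soft", "hard", "none", "soft"]
categories, encoded = one_hot(lenses)
print(categories)  # column names
print(encoded)     # one indicator row per input value
```

Each row now has exactly one 1, so distance-based or linear algorithms can consume the attribute without imposing an artificial order on the categories.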

Knowledge Discovery and Data Mining: Non-trivial extraction of implicit, previously unknown, and potentially useful information from databases. The knowledge discovery process consists of the following phases:

Data mining tasks. Prediction (predictive): how will a given attribute of the data behave in the future? Covers time series, pattern sequences, and independent-dependent relations. Classification: assign data to categories based on existing labeled samples (discrete labels); includes feature selection. Clustering (descriptive): group objects without labeled samples to serve as examples. Association: object association.

Association Rules. Goal: produce rules that relate the presence of one set of items to the presence of another set of items. Example:

Association Rules: market-basket model. Look for combinations of products that are bought together. Place SHOES near SOCKS so that a customer who buys one is more likely to buy the other. Transaction: someone buys several items, forming an itemset, at the supermarket.
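The strength of a rule such as "shoes implies socks" is usually measured by its support and confidence. The sketch below computes both; the transactions are hypothetical and the function names are chosen only for this illustration.

```python
# Hypothetical market-basket transactions (illustrative only, not from the seminar).
transactions = [
    {"shoes", "socks", "belt"},
    {"shoes", "socks"},
    {"shoes", "hat"},
    {"socks", "hat"},
    {"shoes", "socks", "hat"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

rule_support = support({"shoes", "socks"}, transactions)
rule_confidence = confidence({"shoes"}, {"socks"}, transactions)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

A rule is typically kept only when both values exceed user-chosen thresholds.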

Classification [Figure: decision tree that splits on married (yes/no), salary, account balance, and age; leaves are poor risk, fair risk, and good risk]

Decision tree induction example. Table columns: RID, Married, Salary, Acct balance, Age, Loanworthy (class attribute); six records, RID 1 to 6. Expected information for the full set: I(3,3) = 1. Entropy per attribute: E(Married) = 0.92, E(Salary) = 0.33, E(A.balance) = 0.82, E(Age) = 0.81. Information gain, Gain(A) = I - E(A): Gain(Married) = 0.08, Gain(Salary) = 0.67, Gain(A.balance) = 0.18, Gain(Age) = 0.19. Salary has the highest gain, so it becomes the root split: >=50k gives class “yes” {1,2}; <20k gives class “no” {4,5}; 20k..50k is split again on age, where <25 gives class “no” {3} and >=25 gives class “yes” {6}.
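The Salary figures on this slide can be checked with a few lines of code. The sketch below uses only the class labels readable from the tree (records 1, 2 are “yes”; 4, 5 are “no”; 3 is “no”; 6 is “yes”); the function names are assumptions made for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def expected_information(partitions):
    """Weighted average entropy after splitting into the given partitions."""
    total = sum(len(p) for p in partitions)
    return sum(len(p) / total * entropy(p) for p in partitions)

# Class labels per Salary branch, as read off the slide's tree.
salary_partitions = [
    ["yes", "yes"],  # >=50k: records 1 and 2
    ["no", "no"],    # <20k: records 4 and 5
    ["no", "yes"],   # 20k..50k: records 3 and 6
]
all_labels = [label for part in salary_partitions for label in part]

info = entropy(all_labels)                      # I(3,3)
e_salary = expected_information(salary_partitions)
gain_salary = info - e_salary                   # Gain(A) = I - E(A)
print(round(info, 2), round(e_salary, 2), round(gain_salary, 2))
```

The same two functions apply to any other attribute once its partitions are known.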

Classification [Figure: a classifier model is learned from a training set with categorical and continuous attributes and a class label, then applied to a test set]

Text Classification [Figure: a classifier model is learned from a training set of texts with class labels, then applied to a test set]

Clustering: Clustering is the process of grouping similar objects into one cluster. The objects can come from databases of customers, products, genes, students, and so on.

Clustering: some concepts. One very important aspect is the choice of similarity measure. For numeric data, distance-based similarity functions are often used: the Euclidean metric (Euclidean distance), the Minkowski metric, and the Manhattan metric. Other measures: correlation, cosine, covariance. Methods: hierarchical, k-means, fuzzy, SOM, support vector clustering.
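The distance measures named above are closely related: Euclidean and Manhattan distance are the Minkowski metric with p = 2 and p = 1. A minimal sketch in plain Python follows; the function names are chosen for this illustration.

```python
from math import sqrt

def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def euclidean(x, y):
    return minkowski(x, y, 2)

def manhattan(x, y):
    return minkowski(x, y, 1)

def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)))
print(cosine_similarity((1, 0), (0, 1)))
```

Distances grow as objects become less similar, whereas cosine similarity grows as they become more similar, so a clustering routine must be told which convention applies.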

Clusters

Data mining applications: weather; business; microbiology; market analysis; manufacturing and production; fraud detection and detection of unusual patterns (outliers); telecommunications; financial transactions.

Data mining applications: text mining (newsgroups, email, documents) and Web mining; DNA and bio-data analysis; disease outcomes; effectiveness of treatments; identifying new drugs.

Weather [Figure: radar elevation view; labels: Chandler, 54 km, 180 km]

[Figure: azimuth angle measured from north; labels: Chandler, 54 km] The WSR-88D records a digital database containing three variables: velocity (V), reflectivity (Z), and spectrum width (W).

MDA Algorithm for Tornado Detection: The current Mesocyclone Detection Algorithm (MDA) was created at the National Severe Storms Laboratory (NSSL), Oklahoma, to work with native variables derived from the WSR-88D. To detect circulations associated with vortices that spin up into tornadoes, the velocity data are exploited. The data are measured for circulation depth, height above the ground, strength of the circulation, shear (change in wind speed or direction with distance), etc. By relaxing previous threshold values, the MDA can detect weaker circulations that may eventually spin up into mesocyclones, thereby enhancing the probability of detection.

MDA Attributes
1. base (m) [0-12000]
2. depth (m) [0-13000]
3. strength rank [0-25]
4. low-level diameter (m) [0-15000]
5. maximum diameter (m) [0-15000]
6. height of maximum diameter (m) [0-12000]
7. low-level rotational velocity (m/s) [0-65]
8. maximum rotational velocity (m/s) [0-65]
9. height of maximum rotational velocity (m) [0-12000]
10. low-level shear (m/s/km) [0-175]
11. maximum shear (m/s/km) [0-175]
12. height of maximum shear (m) [0-12000]
13. low-level gate-to-gate velocity difference (m/s) [0-130]
14. maximum gate-to-gate velocity difference (m/s) [0-130]
15. height of maximum gate-to-gate velocity difference (m) [0-12000]
16. core base (m) [0-12000]
17. core depth (m) [0-9000]
18. age (min) [0-200]
19. strength index (MSI), weighted by average density of integrated layer [0-13000]
20. strength index (MSIr) "rank" [0-25]
21. relative depth (%) [0-100]
22. low-level convergence (m/s) [0-70]
23. mid-level convergence (m/s) [0-70]

Medical: Can I wear contact lenses? Possible outputs: none, soft, hard. The decision is based on: age, spectacle prescription, astigmatism, and tear production rate.

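Examples. Table columns: age, prescription, astigmatism, tear production rate, lenses. Values appearing in the table: age (young, pre-presbyopic, presbyopic); prescription (myope, hypermetrope); astigmatism (no, yes); tear production rate (reduced, normal); lenses (none, soft, hard). Astigmatism is a vision condition that causes blurred vision due either to the irregular shape of the cornea, the clear front cover of the eye, or sometimes to the curvature of the lens inside the eye.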

Classification procedures: a set of “if-then” rules; a decision tree; a neural network; SVM, LSVM, LS-SVM; LDA; KNN; minimax probability machine; analytic center machine; relevance vector machine.

If-then procedure
If age = young and astigmatic = no and tear production rate = normal then recommendation = soft
If age = pre-presbyopic and astigmatic = no and tear production rate = normal then recommendation = soft
If age = presbyopic and spectacle prescription = myope and astigmatic = no then recommendation = none
If spectacle prescription = hypermetrope and astigmatic = no and tear production rate = normal then recommendation = soft
If spectacle prescription = myope and astigmatic = yes and tear production rate = normal then recommendation = hard
If age = young and astigmatic = yes and tear production rate = normal then recommendation = hard
If age = pre-presbyopic and spectacle prescription = hypermetrope and astigmatic = yes then recommendation = none
If age = presbyopic and spectacle prescription = hypermetrope and astigmatic = yes then recommendation = none
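The eight rules translate directly into a first-match rule list. The sketch below is one possible encoding: the function name, the argument encoding, and the choice to return `None` when no rule fires are assumptions, since the slide gives no default rule.

```python
def recommend_lenses(age, prescription, astigmatic, tear_rate):
    """Apply the slide's if-then rules in order; return None if no rule fires.

    age: "young", "pre-presbyopic" or "presbyopic"
    prescription: "myope" or "hypermetrope"
    astigmatic: bool
    tear_rate: "normal" or "reduced"
    """
    normal = tear_rate == "normal"
    if age == "young" and not astigmatic and normal:
        return "soft"
    if age == "pre-presbyopic" and not astigmatic and normal:
        return "soft"
    if age == "presbyopic" and prescription == "myope" and not astigmatic:
        return "none"
    if prescription == "hypermetrope" and not astigmatic and normal:
        return "soft"
    if prescription == "myope" and astigmatic and normal:
        return "hard"
    if age == "young" and astigmatic and normal:
        return "hard"
    if age == "pre-presbyopic" and prescription == "hypermetrope" and astigmatic:
        return "none"
    if age == "presbyopic" and prescription == "hypermetrope" and astigmatic:
        return "none"
    return None  # assumption: the slide lists no catch-all rule

print(recommend_lenses("young", "myope", False, "normal"))
print(recommend_lenses("young", "myope", True, "normal"))
```

Rule sets like this are easy to audit, which is one reason they remain popular next to less transparent classifiers.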

Decision tree

Regression. Regression is similar to classification: first, construct a model; second, use the model to predict an unknown value. Methods: linear and multiple regression; non-linear regression, neural networks, SVR. Regression is different from classification: classification predicts a categorical class label, whereas regression models continuous-valued functions.

Business. Example: credit card users can be clustered by how they use the card: frequent/seldom usage; domestic/foreign transactions; high/low amounts of money; transactions of a specific type; … For each cluster, a fraud detection system can be developed, or other products can be offered.

Credit
Attribute 1 (qualitative): Status of existing checking account
A11: ... < 0 DM
A12: 0 <= ... < 200 DM
A13: ... >= 200 DM / salary assignments for at least 1 year
A14: no checking account
Attribute 2 (numerical): Duration in months
Attribute 3 (qualitative): Credit history
A30: no credits taken / all credits paid back duly
A31: all credits at this bank paid back duly
A32: existing credits paid back duly till now
A33: delay in paying off in the past
A34: critical account / other credits existing (not at this bank)

Attribute 4 (qualitative): Purpose
A40: car (new)
A41: car (used)
A42: furniture/equipment
A43: radio/television
A44: domestic appliances
A45: repairs
A46: education
A47: (vacation - does not exist?)
A48: retraining
A49: business
A410: others

Attribute 15 (qualitative): Housing
A151: rent
A152: own
A153: for free
Attribute 16 (numerical): Number of existing credits at this bank
Attribute 17 (qualitative): Job
A171: unemployed / unskilled - non-resident
A172: unskilled - resident
A173: skilled employee / official
A174: management / self-employed / highly qualified employee / officer
Data table columns: checking account, duration, credit history, purpose, amount, …, good or bad

Cross Selling: Cross selling is another important data mining application. What is the best additional offer or best next offer (BNO) for each customer? For example, a bank wants to be able to sell automobile insurance when a customer takes out a car loan. The bank might decide to acquire a full-service insurance agency.

Paying Claims: A major manufacturer of diesel engines must also service engines under warranty. Warranty claims come in from all around the world. Data mining is used to determine rules for routing claims: some are automatically approved, others require further research. Result: the manufacturer saves millions of dollars. Data mining also enables insurance companies and the federal government to save millions of dollars by not paying fraudulent medical insurance claims.

Finding Prospects: A cellular phone company wanted to introduce a new service. They wanted to know which customers were the most likely prospects. Data mining identified “sphere of influence” as a key indicator of likely prospects. Sphere of influence is the number of different telephone numbers that someone calls.
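Given the definition above, sphere of influence is a distinct count per caller. The sketch below derives it from call records; the record layout and all values are hypothetical, chosen only to illustrate the computation.

```python
from collections import defaultdict

# Hypothetical call records: (caller, number called). Illustrative only.
calls = [
    ("A", "111"), ("A", "222"), ("A", "111"),
    ("B", "333"),
    ("C", "111"), ("C", "222"), ("C", "333"), ("C", "444"),
]

def sphere_of_influence(calls):
    """Number of different telephone numbers each caller has called."""
    contacts = defaultdict(set)
    for caller, callee in calls:
        contacts[caller].add(callee)  # a set ignores repeat calls
    return {caller: len(numbers) for caller, numbers in contacts.items()}

influence = sphere_of_influence(calls)
print(influence)
```

The resulting count can then be used as one input feature when scoring prospects.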

Anticipating Customer Needs: Clustering is an undirected data mining technique that finds groups of similar items. Based on previous purchase patterns, customers are placed into groups. Customers in each group are assumed to have an affinity for the same types of products. New product recommendations can be generated automatically based on new purchases made by the group. This is sometimes called collaborative filtering.

Microbiology

Microarray Problem [Figure: knowledge discovery in databases (KDD) workflow for microarrays; labels: biology application domain, experiment design and hypothesis, microarray experiment, image analysis, data warehouse, data mining, data analysis, validation]

Data Mining for Manufacturing: Enterprise Resource Planning (ERP) systems generate large volumes of data. Examples of data sources in manufacturing include: schedules; production capacity, efficiency, failures, etc.; manufacturing parameters; process quality; process plans.

Generating Data in an ERP System


Methodology for the Selection of Manufacturing Processes with Data Mining
The learning stage focuses on discovering knowledge from manufacturing processes:
Step 1: Similar parts and processes are grouped into clusters.
Step 2: Relevant processes are associated with each cluster.
The exploitation stage takes advantage of the clusters to improve the efficiency of generating process plans for new parts:
Step 3: A new part to be manufactured is matched with a suitable cluster.
Step 4: The new part is assigned the relevant process plan.
The specialization stage adapts the relevant process for the new part:
Step 5: The relevant process is adapted to the new part.
Step 6: The new process plan data is incorporated into the database.


Data Mining to Select Suppliers: input feature set of a performance measure for suppliers
Feature  Content
F1   Quality of material (0, 1, 2, 3)
F2   Track record (0, 1, 2, 3)
F3   Technical ability (0, 1, 2)
F4   Tools and equipment (0, 1, 2, 3)
F5   Safety practices (0, 1, 2, 3)
F6   Deliveries/shipments (0, 1, 2, 3)
F7   Conformance to standards (0, 1, 2)
F8   Applicability of product (0, 1, 2)
F9   Product development (0, 1)
F10  Warranty (0/1)
F11  Warehousing (0, 1, 2)
F12  Reliability (%)
F13  Efficiency (%)
F14  Dependability (0, 1, 2)
F15  Frequency of rejects (time/year)
F16  Failure rate (%)
F17  Offered price (0, 1, 2, 3)
F18  Responsiveness to bidding (0, 1, 2)

Sampoerna factory: Planning starts from demand forecasting. The demand forecast indicates: which materials are needed, how much of each material is needed, and how labor should be allocated. Which variables are needed? Price, promotion value, competitors' promotions, customer age, and past demand. Approach: a hybrid of time series forecasting and causal relations.

Sequential Pattern Analysis: Given a set of sequences, find the complete set of frequent subsequences. Applications of sequential patterns: customer shopping sequences (first buy a computer, then a CD-ROM, and then a digital camera, within 3 months); weblog click streams; telephone calling patterns.
SID  sequence
10   <a(abc)(ac)d(cf)>
20   <(ad)c(bc)(ae)>
30   <(ef)(ab)(df)cb>
40   <eg(af)cbc>
Given support threshold min_sup = 2, <(ab)c> is a sequential pattern.
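The claim that <(ab)c> meets min_sup = 2 can be verified by counting the sequences that contain it. The sketch below represents each sequence as a list of itemsets; the helper names are assumptions made for this illustration, and this is a support check, not a full mining algorithm.

```python
def contains(sequence, pattern):
    """True if pattern is a subsequence of sequence.

    Each pattern element must be a subset of a distinct sequence element,
    with the order of elements preserved (greedy left-to-right matching).
    """
    pos = 0
    for element in pattern:
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern element must match a later element
    return True

# The four sequences from the slide, keyed by SID.
database = {
    10: [set("a"), set("abc"), set("ac"), set("d"), set("cf")],
    20: [set("ad"), set("c"), set("bc"), set("ae")],
    30: [set("ef"), set("ab"), set("df"), set("c"), set("b")],
    40: [set("e"), set("g"), set("af"), set("c"), set("b"), set("c")],
}

pattern = [set("ab"), set("c")]  # <(ab)c>
supporting = [sid for sid, seq in database.items() if contains(seq, pattern)]
pattern_support = len(supporting)
print(supporting, pattern_support)
```

Only sequences 10 and 30 contain an element with both a and b followed later by c, which gives the support of 2 stated on the slide.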

Other examples. Direct mailing: who should be offered a particular product? Remote sensing: determining water pollution from spectral images. Load forecasting: predicting demand for electric power. Intelligent ATMs: how much cash will be there tomorrow? City planning: identifying groups of houses according to their house type, value, and geographical location.

Another example: Some years ago, UPS had a labor dispute that led to a strike. FedEx saw its volume increase. After the strike, FedEx's volume fell. FedEx identified the customers who had switched to it during the strike and then switched away again; these customers had gone back to UPS. FedEx made special offers to these customers to win them back.

Co-location Patterns: Can you find co-location patterns in the following sample dataset? Answer: and


How Data Mining Helps in Marketing Campaigns: Improves profit by limiting the campaign to the most likely responders. Reduces costs by excluding individuals least likely to respond. Uses RFM: recency, frequency, monetary.

How Data Mining Helps in Marketing Campaigns (continued): Predicts response rates to help staff call centers, manage inventory control, etc. Identifies the most important channel for each customer. Discovers patterns in customer data.

Data Mining is about Creating Models: A model takes a number of inputs, which often come from databases, and produces one or more outputs. Sometimes the purpose is to build the best model; the best model yields the most accurate output, and such a model may be viewed as a black box. Sometimes the purpose is to better understand what is happening; this model is more like a gray box.

Confusion Matrix (or Correct Classification Matrix): There are 1000 records in the model set.
                Actual Yes   Actual No
Predicted Yes   800          50
Predicted No    50           100
When the model predicts Yes, it is right 800/850 = 94% of the time. When the model predicts No, it is right 100/150 = 67% of the time.

Confusion Matrix (continued): The model is correct 800 times in predicting Yes. The model is correct 100 times in predicting No. The model is wrong 100 times in total. The overall prediction accuracy is 900/1000 = 90%.
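The percentages on these two slides follow directly from the four cell counts. The sketch below recomputes them; the variable names are chosen for this illustration.

```python
# Cell counts from the slide's confusion matrix.
tp = 800  # predicted Yes, actual Yes
fp = 50   # predicted Yes, actual No
fn = 50   # predicted No, actual Yes
tn = 100  # predicted No, actual No

total = tp + fp + fn + tn
right_when_yes = tp / (tp + fp)   # share of Yes predictions that are correct
right_when_no = tn / (tn + fn)    # share of No predictions that are correct
accuracy = (tp + tn) / total      # share of all predictions that are correct

print(f"{right_when_yes:.0%} {right_when_no:.0%} {accuracy:.0%}")
```

Overall accuracy alone hides the fact that the model is much less reliable when it predicts No, which is why the per-prediction rates are reported separately.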

Regression performance measures: MSE, SSE, MAPE, MAD, R².
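The slide only names these measures, so the sketch below uses their usual textbook definitions, which is an assumption: SSE as the sum of squared errors, MSE as SSE divided by n, MAD as the mean absolute deviation of the errors, MAPE as the mean absolute percentage error, and R² as 1 - SSE/SST. The data values are hypothetical.

```python
def regression_metrics(actual, predicted):
    """Common regression error measures (MAPE requires non-zero actual values)."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    sse = sum(e * e for e in errors)
    mse = sse / n
    mad = sum(abs(e) for e in errors) / n
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    sst = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1 - sse / sst
    return {"SSE": sse, "MSE": mse, "MAD": mad, "MAPE": mape, "R2": r2}

# Hypothetical observations and model outputs (illustrative only).
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 39]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

SSE, MSE, and MAD are in the units of the target (or its square), while MAPE and R² are scale-free and therefore easier to compare across data sets.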

No Substitute for Human Intelligence: Data mining is a tool to achieve goals. The goal is better service to customers. Only people know what to predict. Only people can make sense of rules. Only people can make sense of visualizations. Only people know what is reasonable, legal, and tasteful. Human decision makers are critical to the data mining process.

Data Mining Uses Data from the Past to Effect Future Action: Analyze available data (from the past). Discover patterns, facts, and associations. Apply this knowledge to future actions.

Is the Past Relevant? Does past data contain the important business drivers (e.g., demographic data)? Is the business environment from the past relevant to the future? In the e-commerce era, what we know about the past may not be relevant to tomorrow; users of the web have changed since the late 1990s. Are the data mining models created from past data relevant to the future? Have critical assumptions changed?

CRM Requires Learning and More: Form a learning relationship with your customers. Notice their needs (on-line transaction processing systems). Remember their preferences (decision support data warehouse). Learn how to serve them better (data mining). Act to make customers more profitable.

The Corporate Memory: Several years ago, Lands' End could not recognize regular Christmas shoppers. Some people generally don't shop from catalogs but spend hundreds of dollars every Christmas; if you only store 6 months of history, you will miss them. Victoria's Secret builds customer loyalty with a no-hassle returns policy. Some “loyal customers” return several expensive outfits each month; they are really “loyal renters”.

The Importance of Channels: Channels are the way a company interfaces with its customers. Examples: direct mail; email; banner ads; telemarketing; customer service centers; messages on receipts. Key data about customers come from channels.

Channels (continued): Channels are the source of data. Channels are the interface to customers. Channels enable a company to get a particular message to a particular customer. Channel management is a challenge in organizations. CRM is about serving customers through all channels.

They Sometimes Get Their Man: The FBI handles numerous, complex cases such as the Unabomber case. Leads come in from all over the country. The FBI and other law enforcement agencies sift through thousands of reports from field agents looking for some connection. Data mining plays a key role in FBI forensics.

Example research papers:
An application of data mining for marketing in telecommunication
Application of data mining to customer profile analysis in the power electricity
Conditional market segmentation by neural networks
Cluster analysis in industrial market
Marketing segmentation using support vector
Using data mining for manufacturing process selection
Data mining application in credit card business
…..

A Customer is an Account: More often, a customer is an account. Retail banking: checking account, mortgage, auto loan, … Telecommunications: long distance, local, ISP, mobile, … Insurance: auto policy, homeowners, life insurance, … Utilities. The account-level view of a customer also misses the boat, since each customer can have multiple accounts.

The Customer’s Lifecycle. Childhood: birth, school, graduation, … Young adulthood: choose career, move away from parents, … Family life: marriage, buy house, children, divorce, … Retirement: sell home, travel, hobbies, … Much marketing effort is directed at each stage of life.

The Customer’s Lifecycle is Unpredictable: It is difficult to identify the appropriate events. Graduation and retirement may be easy; marriage and parenthood are not so easy; many events are “one-time”. Companies miss or lose track of valuable information: a man moves; a woman gets married, changes her last name, and merges her accounts with her spouse. It is hard to track your customers so closely but, to the extent that you can, many marketing opportunities arise.

Customers Evolve Over Time: Customers begin as prospects. Prospects indicate interest: they fill out credit card applications, apply for insurance, or visit your website. They become new customers. After repeated purchases or usage, they become established customers. Eventually, they become former customers, either voluntarily or involuntarily.

Business Processes Organize Around the Customer Lifecycle [Figure: lifecycle diagram; processes: acquisition, activation, relationship management, winback; customer states: prospect, new customer, established customer (high value, high potential, low value), former customer (voluntary churn, forced churn)]

Different Events Occur Throughout the Lifecycle: Prospects receive marketing messages. When they respond, they become new customers. They make initial purchases. They become established customers and are targeted by cross-sell and up-sell campaigns. Some customers are forced to leave (cancel). Some leave (cancel) voluntarily. Others simply stop using the product (e.g., a credit card). Winback/collection campaigns follow.

Different Data is Available Throughout the Lifecycle: The purpose of data warehousing is to keep this data around for decision-support purposes. Charles Schwab wants to handle all of its customers’ investment dollars. Schwab observed that customers started with small investments.

Different Data is Available Throughout the Lifecycle (continued): By reviewing the history of many customers, Schwab discovered that customers who transferred large amounts into their Schwab accounts did so soon after joining. After a few months, the marketing cost could not be justified. Schwab’s marketing strategy changed as a result.

Different Models are Appropriate at Different Stages: prospect acquisition; prospect product propensity; best next offer; forced churn; voluntary churn. Bottom line: we use data mining to predict certain events during the customer lifecycle.

Examples: Prediction uses data from the past to make predictions about future events (“likelihoods” and “probabilities”). Profiling characterizes past events and assumes that the future is similar to the past (“similarities”). Description and visualization find patterns in past data and assume that the future is similar to the past.

Preventing Customer Attrition: We use the noun churn as a synonym for attrition and the verb churn as a synonym for leave. Why study attrition? It is a well-defined problem; it has a clear business value; we know our customers and which ones are valuable; we can rely on internal data; the problem is well-suited to predictive modeling.

When You Know Who is Likely to Leave, You Can …: Focus on keeping high-value customers. Focus on keeping high-potential customers. Allow low-potential customers to leave, especially if they are costing money. Don’t intervene in every case. The topic should be called “managing customer attrition”.

Free Tools: Weka (Waikato Environment for Knowledge Analysis) is a Java-based data mining tool developed at the University of Waikato. RapidMiner, http://www.rapidminer.com

Commercial tools
Oracle Data Miner http://www.oracle.com/technology/products/bi/odm/odminer.html
Data To Knowledge http://alg.ncsa.uiuc.edu/do/tools/d2k
SAS http://www.sas.com/
Clementine http://spss.com/clemetine/
Intelligent Miner http://www-306.ibm.com/software/data/iminer/

Conclusion: Data mining is a “decision support” process in which we search for patterns of information in data. This technique can be used on many types of data.
