Upload
francine-cox
View
256
Download
0
Embed Size (px)
Citation preview
Data Mining資料探勘
1
1012DM01MI4
Thu 9, 10 (16:10-18:00) B216
Introduction to Data Mining資料探勘導論
Min-Yuh Day戴敏育
Assistant Professor專任助理教授
Dept. of Information Management, Tamkang University淡江大學 資訊管理學系
http://mail. tku.edu.tw/myday/2013-02-21
淡江大學 101學年度第 2學期課程教學計畫表(2013.02 - 2013.06)
• 課程名稱:資料探勘 (Data Mining)• 授課教師:戴敏育 (Min-Yuh Day)• 開課系級:資管四 P (TLMXB4P)• 開課資料:選修 單學期 2 學分 (2 Credits,
Elective)
• 上課時間:週四 9,10 (Thu 16:10-18:00)• 上課教室: B216
2
Data Mining at the Intersection of Many Disciplines
Management Science & Information Systems
Databases
Pattern Recognition
MachineLearning
MathematicalModeling
DATAMINING
Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 3
Knowledge Discovery (KDD) Process
Data Cleaning
Data Integration
Databases
Data Warehouse
Task-relevant Data
Selection
Data Mining
Pattern Evaluation
4Source: Han & Kamber (2006)
Data mining: core of knowledge discovery process
Data WarehouseData Mining and Business Intelligence
Increasing potentialto supportbusiness decisions End User
Business Analyst
DataAnalyst
DBA
Decision Making
Data Presentation
Visualization Techniques
Data MiningInformation Discovery
Data ExplorationStatistical Summary, Querying, and Reporting
Data Preprocessing/Integration, Data Warehouses
Data SourcesPaper, Files, Web documents, Scientific experiments, Database Systems
5Source: Han & Kamber (2006)
Business Pressures–Responses–Support Model
Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 6
課程簡介• 本課程介紹資料探勘 (Data Mining) 的基礎概念及應用技術。
• 課程內容包括– 資料探勘導論– 關連分析– 分類與預測– 分群分析– SAS 企業資料採礦實務與認證– 資料探勘個案分析與實作
7
Course Introduction• This course introduces the fundamental concepts and
applications technology of data mining. • Topics include
– Introduction to Data Mining – Association Analysis – Classification and Prediction– Cluster Analysis– Data Mining Using SAS Enterprise Miner– Case Study and Implementation of Data Mining
8
課程目標(Objective)
• 瞭解及應用資料探勘基本概念與技術。
• Understand and apply the fundamental concepts and technology of data mining
9
週次 日期 內容 (Subject/Topics)1 102/02/21 資料探勘導論 (Introduction to Data Mining)2 102/02/28 和平紀念日 ( 放假一天 )
(Peace Memorial Day) (No Classes)3 102/03/07 關連分析 (Association Analysis)4 102/03/14 分類與預測 (Classification and Prediction)5 102/03/21 分群分析 (Cluster Analysis)6 102/03/28 SAS 企業資料採礦實務 (Data Mining Using SAS Enterprise Miner)
7 102/04/04 清明節、兒童節 ( 放假一天 ) (Children's Day, Tomb Sweeping Day)(No Classes)
8 102/04/11 個案分析與實作一 (SAS EM 分群分析 ) : Banking Segmentation (Cluster Analysis – K-Means using SAS EM)
課程大綱 (Syllabus)
10
週次 日期 內容 (Subject/Topics)9 102/04/18 期中報告 (Midterm Presentation)10 102/04/25 期中考試週 11 102/05/02 個案分析與實作二 (SAS EM 關連分析 ) : Web Site Usage Associations ( Association Analysis using SAS EM)
12 102/05/09 個案分析與實作三 (SAS EM 決策樹、模型評估 ) : Enrollment Management Case Study (Decision Tree, Model Evaluation using SAS EM)
13 102/05/16 個案分析與實作四 (SAS EM 迴歸分析、類神經網路 ) : Credit Risk Case Study (Regression Analysis, Artificial Neural Network using SAS EM)
14 102/05/23 期末專題報告 (Term Project Presentation)15 102/05/30 畢業考試週
課程大綱 (Syllabus)
11
教學方法與評量方法• 教學方法
– 講述、討論、實作• 評量方法
– 實作、報告、上課表現
12
教材課本• 講義 (Slides)• 參考書籍
– Applied Analytics Using SAS Enterprise Mining, Jim Georges, Jeff Thompson and Chip Wells, 2010, SAS
– Decision Support and Business Intelligence Systems, Ninth Edition, Efraim Turban, Ramesh Sharda, Dursun Delen, 2011, Pearson
– 決策支援與企業智慧系統,九版, Efraim Turban 等著,李昇暾審定, 2011 ,華泰
13
作業與學期成績計算方式• 批改作業篇數
– 2 篇( Team Term Project )• 學期成績計算方式
– 期中評量: 30 % ( 期中報告 ) – 期末評量: 30 % ( 期末報告 )– 其他(課堂參與及報告討論表現): 40 %
14
Team Term Project• Term Project Topics
– Data mining – Web mining– Business Intelligence
• 3-5 人為一組– 分組名單於 2013.03.07 ( 四 ) 課程下課時繳交– 由班代統一收集協調分組名單
15
A Taxonomy for Data Mining TasksData Mining
Prediction
Classification
Regression
Clustering
Association
Link analysis
Sequence analysis
Learning Method Popular Algorithms
Supervised
Supervised
Supervised
Unsupervised
Unsupervised
Unsupervised
Unsupervised
Decision trees, ANN/MLP, SVM, Rough sets, Genetic Algorithms
Linear/Nonlinear Regression, Regression trees, ANN/MLP, SVM
Expectation Maximization, Apriory Algorithm, Graph-based Matching
Apriory Algorithm, FP-Growth technique
K-means, ANN/SOM
Outlier analysis Unsupervised K-means, Expectation Maximization (EM)
Apriory, OneR, ZeroR, Eclat
Classification and Regression Trees, ANN, SVM, Genetic Algorithms
Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 16
The Evolution of BI Capabilities
Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 17
A High-Level Architecture of BI
Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 18
Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
19Source: http://www.amazon.com/Mining-Social-Web-Analyzing-Facebook/dp/1449388345
Web Mining Success Stories• Amazon.com, Ask.com, Scholastic.com, …• Website Optimization Ecosystem
Web Analytics
Voice of Customer
Customer Experience Management
Customer Interaction on the Web
Analysis of Interactions Knowledge about the Holistic View of the Customer
Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 20
Top 10 CIO Technology Priorities in 2013
Top 10 Technology Priorities Ranking
Analytics and business intelligence 1Mobile technologies 2
Cloud computing (SaaS, IaaS, PaaS) 3
Collaboration technologies (workflow) 4
Legacy modernization 5
IT management 6CRM 7
Virtualization 8
Security 9
ERP Applications 10
21Source: Gartner Executive Programs (January 2013)
http://www.gartner.com/newsroom/id/2304615
Top 10 CIO Business Priorities in 2013Top 10 Business Priorities Ranking
Increasing enterprise growth 1Delivering operational results 2
Reducing enterprise costs 3
Attracting and retaining new customers 4
Improving IT applications and infrastructure 5
Creating new products and services (innovation) 6Improving efficiency 7
Attracting and retaining the workforce 8
Implementing analytics and big data 9
Expanding into new markets and geographies 10
22Source: Gartner Executive Programs (January 2013)
http://www.gartner.com/newsroom/id/2304615
Summary• This course introduces the fundamental concepts and
applications technology of data mining. • Topics include
– Introduction to Data Mining – Association Analysis – Classification and Prediction– Cluster Analysis– Data Mining Using SAS Enterprise Miner– Case Study and Implementation of Data Mining
23
Contact Information戴敏育 博士 (Min-Yuh Day, Ph.D.) 專任助理教授淡江大學 資訊管理學系
電話: 02-26215656 #2347傳真: 02-26209737研究室: i716 ( 覺生綜合大樓 )地址: 25137 新北市淡水區英專路 151 號Email : [email protected]網址: http://mail.tku.edu.tw/myday/
24