
Data Collection and Analysis ( 資料收集與分析 ) By C. L. Hsieh Department of Industrial Management Aletheia University



Introduction

• "You can observe a lot just by watching."
• Data gathering results in a conceptual model of how the system operates.
• Data gathering should avoid ending up with lots of data but very little useful information.

Questions for Data Gathering

• What is the best procedure to follow?
• What types of data should be gathered?
• What sources should be used?
• What types of analyses should be performed on the data?
• How do you select the right probability distribution to represent the data?
• How should data be documented?

Guidelines for Data Gathering

• Identify triggering events
  • Identify the causes or conditions that trigger the activities, e.g. the causes of downtime: failure, idleness, unavailability of stock, ...
• Look for common groupings
  • Reduce the data to common behaviors and patterns
  • Identify general categories

Guidelines for Data Gathering (continued)

• Focus on key impact factors
  • Avoid information with little impact (e.g. off-hour performance, extremely rare downtime, negligible move times)
• Separate input variables from response variables
  • Input variables define how the system works
  • Response variables do not "drive" model behavior

Guidelines for Data Gathering (continued)

• Focus on essence rather than substance
  • Capture cause-effect relationships and ignore meaningless details
  • Focus on the activity of using resources or the delay of entity flow (system abstraction)
• Isolate actual activity times
  • Exclude any extra time spent waiting

Steps to Gathering Data

1. Determine data requirements
2. Identify data sources
3. Collect the data
4. Make assumptions
5. Analyze the data
6. Document and approve the data

Determining Data Requirements

• Structural data
  • Cover all the objects in the system to be modeled
  • Describe the layout of the system
  • Identify the items to be processed (e.g. entities, resources, locations, ...)

Determining Data Requirements (continued)

• Operational data
  • Explain how the system operates
  • Describe when, where and how events and activities take place
  • Consist of the logic information about the system, e.g. routing, schedules, downtime behavior and resource allocation

Determining Data Requirements (continued)

• Numerical data
  • Provide quantitative information about the system
  • Some values are easy to get and some are not
  • e.g. capacities, arrival rates, activity times, ...

Determining Data Requirements (continued)

• Use of a questionnaire (for a sample, see p. 103)
  • A questionnaire helps gather the right information
  • If sample data are not available, it is useful to get at least estimates of the minimum, most likely, and maximum values until more precise data are obtained

Identifying Data Sources

Good sources of data:
• Historical records (e.g. production volumes, sales volumes)
• System documentation (e.g. production plans, facility layout plans)
• Personal observation (e.g. work sampling, time and motion studies)
• Personal interviews (e.g. operating methods, repair procedures, scheduling)
• Comparison with similar systems
• Vendor claims (e.g. processing times, reliability of new machines)
• Design estimates (e.g. processing times, move times)
• Research literature

Collecting the Data

• Defining entity flow
  • Entity flow establishes a skeletal framework to which additional data can be attached
  • Follow the entity movement
  • Use an entity flow diagram (EFD)
  • Note the difference between an entity flow diagram and a process flowchart

Collecting the Data (continued)

Difference between an entity flow diagram and a process flowchart:

Process flowchart
• Shows the logical sequence of activities
• Defines what happens

Entity flow diagram
• Shows the physical movement of entities
• Defines where it happens

Developing a Description of Operation

• Description of operation
  • Explains how entities are processed and provides the details of the EFD
• Requirements
  • Time and resource requirements of the activity or operation
  • Where, when and in what quantities entities get routed next
  • Time and resource requirements for moving to the next location

[Figure: Entity flow diagram for patient processing]

[Table: Process description for patient processing]

Defining Incidental Details

• Incidental details (downtimes, setups and work priorities) are not essential to the basic model, but they are necessary for a complete and accurate model
• Once a basic model has been constructed, any numerical values (e.g. activity times, arrival rates) should be firmed up

Making Assumptions

• A simulation cannot run with incomplete data, so assumptions are required for any unknown future conditions
• Assumptions must make sense in the overall operation of the model. Seeing absurd behavior may tell us that certain assumptions do not make sense


Making Assumptions (continued)

• Sensitivity analysis assesses the influence of an assumption on the validity of a model. Cases to consider:
  • Best or most optimistic case
  • Worst or most pessimistic case
  • Most likely or best-guess case
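As a minimal sketch of a three-case sensitivity check, the Python below runs a toy model once per case and reports the spread of the response. The model (`hours_to_finish`), the batch size and the three minutes-per-item values are hypothetical and only illustrate the idea; they are not taken from the slides.

```python
# Hypothetical toy model: hours one station needs to process a whole batch.
def hours_to_finish(batch_size: int, minutes_per_item: float) -> float:
    return batch_size * minutes_per_item / 60.0

# Assumed (illustrative) values for the uncertain input, in minutes per item.
cases = {"best": 4.0, "most likely": 5.0, "worst": 8.0}

def sensitivity(batch_size: int, cases: dict) -> dict:
    """Run the model once per case and return the response for each case."""
    return {name: hours_to_finish(batch_size, value) for name, value in cases.items()}

results = sensitivity(120, cases)
spread = results["worst"] - results["best"]
for name, hours in results.items():
    print(f"{name:>11}: {hours:.1f} h")
print(f"spread between worst and best case: {spread:.1f} h")
```

A large spread relative to the most likely result suggests that the assumption deserves better data before the model is trusted.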

Statistical Analysis of Numerical Data

• Data should be analyzed to ascertain their suitability for use
• Data characteristics:
  • Independence (randomness)
  • Homogeneity (the data come from the same distribution)
  • Stationarity (the distribution of the data does not change over time)

Statistical Analysis of Numerical Data (continued)

• Stat::Fit in ProModel can automatically analyze and test data for use in a simulation
• Parameters (common descriptive statistics):
  • Mean: the average of the data
  • Median: the value of the middle observation
  • Mode: the value with the greatest frequency

Descriptive Statistics

• Parameters (continued):
  • Standard deviation: a measure of the average deviation from the mean
  • Variance: the square of the standard deviation
  • Coefficient of variation: the standard deviation divided by the mean
  • Skewness: a measure of symmetry
  • Kurtosis: a measure of flatness or peakedness
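The statistics listed above can be computed with Python's standard library. This is a sketch only: the sample `times` is made up, and skewness and kurtosis use simple moment estimates (kurtosis reported as excess over the normal value of 3), which may differ slightly from the estimators a tool such as Stat::Fit uses.

```python
import statistics

def describe(data):
    """Return the common descriptive statistics for a list of numbers."""
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    # Central moments with an n divisor, used for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "std_dev": sd,
        "variance": statistics.variance(data),
        "coef_of_variation": sd / mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis
    }

times = [2, 3, 3, 4, 5, 3, 6, 2, 3, 9]  # hypothetical activity times
stats = describe(times)
for name, value in stats.items():
    print(f"{name:>18}: {value:.3f}")
```

A positive skewness, as in this sample, indicates a long right tail, which is common for activity times.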

Suitability of Data for Use

• Test for independence: data are independent if the value of one observation is not influenced by the value of another observation
• Test for homogeneity: the data come from the same distribution
• Test for stationarity: the distribution of the data does not change over time

Test for Independence

• Scatter plot
  • A plot of adjacent points in the sequence of observed values, plotted against each other
  • Each point is a pair of consecutive observations (X_i, X_{i+1}), i = 1, ..., n-1
  • Positively correlated X_i's → positively sloped trend line
  • Negatively correlated X_i's → negatively sloped trend line
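A sketch of how the scatter-plot pairs can be built and their trend line checked numerically. The two series are hypothetical, and the least-squares slope is used here only as a stand-in for inspecting the plot by eye.

```python
def lag_pairs(xs):
    """Return the consecutive pairs (x_i, x_{i+1}) used in the scatter plot."""
    return list(zip(xs[:-1], xs[1:]))

def trend_slope(pairs):
    """Least-squares slope of the line fitted through the (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

trending = [1, 2, 3, 4, 5, 6, 7, 8]     # each value follows the previous one
alternating = [1, 9, 2, 8, 3, 7, 4, 6]  # high values follow low values

print(trend_slope(lag_pairs(trending)))     # positive slope
print(trend_slope(lag_pairs(alternating)))  # negative slope
```

For independent data the points scatter without a visible trend and the slope is close to zero.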

Test for Independence (continued)

• Autocorrelation plot
  • If observations in a sample are independent, they are also uncorrelated
  • Assumes that the data are taken from a stationary process
  • The measure of autocorrelation is called rho (ρ) (see p. 104)
  • ρ lies in the interval [-1, 1], i.e. -1 <= ρ <= 1
  • If ρ is near either extreme, 1 or -1, the data are autocorrelated
  • If ρ is near 0, there is little or no correlation in the data
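The sketch below estimates ρ at a given lag with the usual sample autocorrelation formula. This is an assumption: the formula referenced on p. 104 is not reproduced in the slides and may normalize slightly differently.

```python
def autocorrelation(xs, lag=1):
    """Sample autocorrelation of xs at the given lag (a common estimator)."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    num = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return num / denom

alternating = [1, -1] * 10   # strongly negatively autocorrelated
ramp = list(range(20))       # strongly positively autocorrelated

print(autocorrelation(alternating))  # close to -1
print(autocorrelation(ramp))         # close to +1
```

Plotting `autocorrelation(xs, lag)` for lag = 1, 2, 3, ... gives the autocorrelation plot; values that stay near 0 at every lag are consistent with independence.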

Test for Independence (continued)

• Runs test
  • A run in a series of observations is an uninterrupted sequence of numbers showing the same trend, e.g. a run "up" or a run "down"

Test for Independence (continued)

• Types of runs tests: if there are too many or too few runs, the randomness of the series is rejected
  • Median test: measures the number of runs (sequences of numbers) above and below the median
  • Turning point test: measures the number of times the series changes direction
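Both runs tests can be sketched in a few lines. The expected values and variances below are the standard large-sample results for a random series (an assumption about which form of the tests is intended); a z-score beyond roughly ±1.96 rejects randomness at the 5% level.

```python
import math
import statistics

def median_runs_test(xs):
    """Runs above/below the median: return (runs, expected runs, z-score)."""
    med = statistics.median(xs)
    signs = [x > med for x in xs if x != med]  # drop values equal to the median
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n1 = sum(signs)
    n2 = len(signs) - n1
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return runs, expected, (runs - expected) / math.sqrt(variance)

def turning_point_test(xs):
    """Direction changes: return (turning points, expected count, z-score)."""
    n = len(xs)
    points = sum(
        (xs[i - 1] < xs[i] > xs[i + 1]) or (xs[i - 1] > xs[i] < xs[i + 1])
        for i in range(1, n - 1)
    )
    expected = 2 * (n - 2) / 3
    variance = (16 * n - 29) / 90
    return points, expected, (points - expected) / math.sqrt(variance)

steady_increase = list(range(1, 21))  # hypothetical, clearly non-random series
print(median_runs_test(steady_increase))    # too few runs
print(turning_point_test(steady_increase))  # too few turning points
```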

Test for Homogeneity

• Test for identically distributed data: tests whether the data sets come from the same distribution
• Examples of non-homogeneous data sets:
  • Activity times that are longer or shorter depending on the type of entity being processed
  • Inter-arrival times that vary in length depending on the time of the day or week

Test for Homogeneity (continued)

• Visually inspect the distribution to see if it has more than one mode (p. 118, Fig. 5.9)
• Analysis of variance (ANOVA) for normally distributed data
• Two-sample test, chi-square multi-sample test, Kruskal-Wallis nonparametric test, ...
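As one illustration of the nonparametric option, the sketch below computes the Kruskal-Wallis H statistic by hand (mid-ranks for ties, no tie correction). The two samples are hypothetical activity times for two entity types; with two groups, H is compared against the chi-square critical value with 1 degree of freedom (3.841 at the 5% level).

```python
def kruskal_wallis_h(*samples):
    """Return the Kruskal-Wallis H statistic (no tie correction)."""
    pooled = sorted(x for s in samples for x in s)
    # Average 1-based rank for each distinct value (mid-rank for ties).
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n_total = len(pooled)
    rank_term = sum(sum(rank_of[x] for x in s) ** 2 / len(s) for s in samples)
    return 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

type_a = [4.1, 3.9, 4.3, 4.0, 4.2]  # hypothetical times for entity type A
type_b = [6.0, 5.8, 6.3, 6.1, 5.9]  # hypothetical times for entity type B

h = kruskal_wallis_h(type_a, type_b)
print(f"H = {h:.3f}; homogeneous at 5% level: {h <= 3.841}")
```

Here H exceeds 3.841, so the two entity types would be modeled with separate activity-time distributions.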

Test for Homogeneity (continued)

• One type of non-homogeneous data occurs when the distribution changes over time
• Examples of time-changing distributions:
  • Learning curve
  • Non-stationary or time-variant data
  • Arrival rate of customers to a service facility

Approaches for Stationary Data

• Non-stationary data can be detected by plotting subgroups of data that occur within successive time intervals (Fig. 5.10)
• Run Stat::Fit and see which distribution best fits each data set. If the same distribution fits both, the same population is assumed
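A small sketch of the subgrouping idea: observations are grouped by successive time interval and each group is summarized, so a drift across intervals becomes visible. The `(time, value)` observations are hypothetical, and comparing subgroup means is a simplification of comparing the full subgroup distributions.

```python
import statistics

def subgroup_means(observations, interval_length):
    """Group (time, value) pairs by successive time interval and average each group."""
    groups = {}
    for t, value in observations:
        groups.setdefault(int(t // interval_length), []).append(value)
    return {k: statistics.mean(v) for k, v in sorted(groups.items())}

# Hypothetical inter-arrival times (minutes), stamped with the hour of observation.
observations = [(0.2, 6.0), (0.7, 5.0), (1.1, 3.0), (1.6, 2.0), (2.3, 1.0), (2.8, 1.5)]

means = subgroup_means(observations, interval_length=1.0)
for interval, mean in means.items():
    print(f"hour {interval}: mean inter-arrival time {mean:.2f} min")
```

The steadily falling means suggest a time-varying arrival rate, so a single distribution for the whole period would be a poor fit.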

Distribution Fitting

Three ways of representing data:
• Original data record
  • The data set is usually not large enough
• Empirical distribution (characterizes the data)
  • Continuous frequency distribution: the percentage of values that fall within given intervals

Distribution Fitting (continued)

• Empirical distribution (characterizes the data)
  • Discrete frequency distribution: the percentage of times a particular value occurs
  • Drawbacks:
    • An insufficient sample size may create an artificial bias
    • It fails to capture rare extreme values that may exist in the population from which the data were sampled
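Both kinds of empirical frequency distribution can be tabulated directly from the raw data. The sketch below uses made-up batch sizes and service times; the bin origin, width and count are choices the analyst has to make.

```python
from collections import Counter

def discrete_frequency(values):
    """Percentage of times each distinct value occurs."""
    n = len(values)
    return {v: 100.0 * c / n for v, c in sorted(Counter(values).items())}

def continuous_frequency(values, lower, width, bins):
    """Percentage of values falling in each interval [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * bins
    for v in values:
        idx = int((v - lower) // width)
        if 0 <= idx < bins:  # values outside the binned range are ignored
            counts[idx] += 1
    n = len(values)
    return [((lower + i * width, lower + (i + 1) * width), 100.0 * c / n)
            for i, c in enumerate(counts)]

batch_sizes = [1, 2, 2, 3, 2, 1, 2, 3, 2, 2]                # hypothetical counts
service_times = [0.5, 1.2, 1.8, 2.4, 2.6, 3.1, 0.9, 1.5]    # hypothetical minutes

print(discrete_frequency(batch_sizes))
print(continuous_frequency(service_times, lower=0.0, width=1.0, bins=4))
```

With only eight service times, no value above 3.1 can ever be sampled from this table, which is the second drawback listed above.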

Distribution Fitting (continued)

• Theoretical distribution
  • Fit a theoretical distribution to the data
  • Random variates generated from the probability distribution provide the simulated random values
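One common way to generate random variates is the inverse-transform method, sketched here for an exponential distribution with a given mean. This is an illustration of the general idea, not necessarily how any particular simulation package generates its variates; the mean of 5.0 and the seed are arbitrary.

```python
import math
import random

def exponential_variate(mean, u):
    """Map a uniform number u in [0, 1) to an exponential variate by inverting the CDF."""
    # CDF: F(x) = 1 - exp(-x / mean)  =>  x = -mean * ln(1 - u)
    return -mean * math.log(1.0 - u)

rng = random.Random(42)  # fixed seed so the run is repeatable
sample = [exponential_variate(5.0, rng.random()) for _ in range(10_000)]
sample_mean = sum(sample) / len(sample)
print(f"sample mean = {sample_mean:.3f} (theoretical mean 5.0)")
```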

Distribution Fitting (continued)

• Theoretical distribution
  • Fitting a theoretical distribution to sample data smooths artificial irregularities
  • It ensures that extreme values are included
  • Most simulation software provides utilities for fitting distributions to numerical data

Theoretical Distribution

• Uniform distribution (see p. 124)
  • X ~ U(a, b) with EX = (a+b)/2, VarX = (b-a)^2/12
  • Used as a "first" model for a quantity that is felt to be randomly varying between a and b but about which little else is known

Theoretical Distribution (continued)

• Triangular distribution (see p. 124)
  • X ~ Triang(a, m, b) with EX = (a+m+b)/3, VarX = (a^2+m^2+b^2-am-ab-bm)/18
  • Used as a rough model and a good approximation in the absence of data
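When only minimum, most likely and maximum estimates are available, they map directly onto the triangular parameters a, m and b. The sketch below checks the moment formulas against simulated draws; the three estimates are hypothetical. Note that Python's `random.triangular` takes its arguments in the order (low, high, mode).

```python
import random

def triangular_moments(a, m, b):
    """Theoretical mean and variance of Triang(a, m, b)."""
    mean = (a + m + b) / 3
    variance = (a * a + m * m + b * b - a * m - a * b - b * m) / 18
    return mean, variance

a, m, b = 2.0, 4.0, 9.0  # hypothetical min, most likely, max (minutes)
mean, variance = triangular_moments(a, m, b)

rng = random.Random(1)  # fixed seed so the run is repeatable
draws = [rng.triangular(a, b, m) for _ in range(20_000)]  # order: low, high, mode
sample_mean = sum(draws) / len(draws)

print(f"theoretical mean {mean:.3f}, variance {variance:.3f}")
print(f"sample mean      {sample_mean:.3f}")
```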

Theoretical Distribution (continued)

• Normal distribution (see p. 125)
  • X ~ N(μ, σ^2) with EX = μ, VarX = σ^2
  • Symmetric (bell-shaped curve)
  • Physical measurements: height, length, ...
  • Certain activity times

Theoretical Distribution (continued)

• Poisson distribution (p. 126)
  • X ~ Po(λ) with EX = λ, VarX = λ
  • Used for the number of events that occur in an interval of time when the events occur at a constant rate
  • e.g. number of items in a batch of random size
  • e.g. number of items demanded from an inventory

Theoretical Distribution (continued)

• Exponential distribution (p. 126)
  • X ~ Exp(μ) with EX = μ, VarX = μ^2
  • Used frequently for inter-arrival times of "customers" to a system when arrivals occur at a constant rate, or for the time to failure of a piece of equipment
  • If occurrences happen at a rate of Po(λ), the time between occurrences is Exp(1/λ)
  • Exp(μ) is memoryless (useful for events that occur independently of one another)
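The link between Po(λ) counts and Exp(1/λ) gaps can be checked with a short simulation. This sketch generates exponential inter-arrival times with mean 1/λ and counts the arrivals in each unit interval; the rate of 3 per interval and the seed are arbitrary. `random.expovariate` takes the rate λ, not the mean.

```python
import random

def arrivals_per_interval(rate, intervals, rng):
    """Generate exponential inter-arrival times and count arrivals in each unit interval."""
    counts = [0] * intervals
    t = rng.expovariate(rate)  # expovariate takes the rate, so the mean gap is 1/rate
    while t < intervals:
        counts[int(t)] += 1
        t += rng.expovariate(rate)
    return counts

rng = random.Random(7)  # fixed seed so the run is repeatable
counts = arrivals_per_interval(rate=3.0, intervals=5_000, rng=rng)
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)

print(f"mean arrivals per interval:     {mean_count:.3f}")
print(f"variance of arrivals per interval: {var_count:.3f}")
```

Both the mean and the variance of the counts should come out close to λ, as expected for a Poisson distribution.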

Theoretical Distribution

Gamma Distribution
X ~ Gamma(α, β) with E[X] = αβ, Var(X) = αβ²
Used for the time to complete some task, e.g. customer service or machine repair
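
Because E[X] = αβ and Var(X) = αβ², the two parameters can be recovered from a sample mean and variance as β = Var(X) / E[X] and α = E[X]² / Var(X). The following sketch assumes hypothetical repair times with α = 2 and β = 3 minutes; it uses random.gammavariate, whose arguments are shape and scale in the same sense as the slide. This moment-matching step is a simple illustration, not the estimation method used by any particular fitting package.

```python
import random
from statistics import mean, variance

alpha, beta = 2.0, 3.0   # hypothetical shape and scale (minutes)
rng = random.Random(2024)
repair_times = [rng.gammavariate(alpha, beta) for _ in range(50000)]

m = mean(repair_times)       # should be near alpha * beta
v = variance(repair_times)   # should be near alpha * beta ** 2

# Method-of-moments estimates obtained by solving the two equations.
beta_hat = v / m
alpha_hat = m * m / v
print(f"mean = {m:.3f}, variance = {v:.3f}")
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```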

Theoretical Distribution

Beta Distribution
X ~ Beta(α1, α2)
Used as a rough model in the absence of data
Distribution of a random proportion, e.g. the proportion of defective items in a shipment
Also used for the time to complete a task in a PERT network
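
One common way to use a beta distribution for a PERT task time is to rescale it from [0, 1] to the interval between an optimistic time a and a pessimistic time b, with shape parameters chosen so that the mean equals (a + 4m + b) / 6 for a most likely time m. The sketch below assumes that convention (α1 = 1 + 4(m - a)/(b - a), α2 = 1 + 4(b - m)/(b - a)); the slides do not state which parameterization the course uses, and the task times 4, 4.5 and 6 minutes are made up.

```python
import random
from statistics import mean

def pert_beta_sample(a, m, b, rng):
    """Draw one task time from a beta distribution rescaled to [a, b].
    Shape parameters follow the usual PERT convention (an assumption here)."""
    alpha1 = 1 + 4 * (m - a) / (b - a)
    alpha2 = 1 + 4 * (b - m) / (b - a)
    return a + (b - a) * rng.betavariate(alpha1, alpha2)

rng = random.Random(5)
a, m, b = 4.0, 4.5, 6.0   # hypothetical optimistic, most likely, pessimistic times
times = [pert_beta_sample(a, m, b, rng) for _ in range(20000)]

pert_mean = (a + 4 * m + b) / 6
sample_mean = mean(times)
print(f"sample mean = {sample_mean:.3f}, PERT mean = {pert_mean:.3f}")
```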

Theoretical Distribution

Weibull Distribution
X ~ Weibull(α, β) (shape α, scale β)
Exp(β) = Weibull(1, β)
Used for the time to complete some task or the time to failure of a piece of equipment
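
The identity Exp(β) = Weibull(1, β) can be seen from the inverse-transform formulas. With the CDF F(x) = 1 - exp(-(x/β)^α), a uniform draw u maps to x = β(-ln(1 - u))^(1/α), which reduces to the exponential formula -β ln(1 - u) when α = 1. The sketch below feeds the same uniform numbers to both formulas; the function names and the scale 2.5 are illustrative assumptions.

```python
import math
import random

def weibull_inverse(u, alpha, beta):
    """Inverse CDF of Weibull(shape alpha, scale beta) for u in [0, 1)."""
    return beta * (-math.log(1.0 - u)) ** (1.0 / alpha)

def exponential_inverse(u, beta):
    """Inverse CDF of an exponential distribution with mean beta."""
    return -beta * math.log(1.0 - u)

rng = random.Random(7)
us = [rng.random() for _ in range(5)]
beta = 2.5   # hypothetical scale (mean time to failure)

weibull_draws = [weibull_inverse(u, 1.0, beta) for u in us]
exp_draws = [exponential_inverse(u, beta) for u in us]
max_gap = max(abs(w - e) for w, e in zip(weibull_draws, exp_draws))
print(f"largest difference between the two sets of draws: {max_gap:.2e}")
```

Note that Python's random.weibullvariate(alpha, beta) takes the scale first and the shape second, the reverse of the notation on this slide.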

Fitting Theoretical Distribution

Stat::Fit does a reasonable job of fitting data and ranks the candidate distributions (p. 127)
Fitting is a trial-and-error process
A goodness-of-fit test evaluates each fitted distribution to ascertain its relative goodness of fit
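
The trial-and-error idea can be sketched by fitting several candidate distributions to the same data and ranking them by a goodness-of-fit statistic. The example below is not Stat::Fit's procedure; it is a minimal stand-in that assumes three candidates (exponential, normal, uniform), fits each by simple moment or range matching, and ranks them by the Kolmogorov-Smirnov distance between the empirical and fitted CDFs. The data are simulated service times with an assumed mean of 5.

```python
import math
import random
from statistics import mean, stdev

def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF of data and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

rng = random.Random(1)
data = [rng.expovariate(1 / 5.0) for _ in range(500)]   # simulated, mean 5

m, s = mean(data), stdev(data)
lo, hi = min(data), max(data)

# Candidate CDFs, each fitted to the data in the simplest possible way.
candidates = {
    "exponential": lambda x: 1 - math.exp(-x / m),
    "normal": lambda x: 0.5 * (1 + math.erf((x - m) / (s * math.sqrt(2)))),
    "uniform": lambda x: (x - lo) / (hi - lo),
}

scores = {name: ks_statistic(data, cdf) for name, cdf in candidates.items()}
ranking = sorted(scores, key=scores.get)   # smallest distance first
for name in ranking:
    print(f"{name:12s} D = {scores[name]:.3f}")
```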

Fitting Theoretical Distribution

Two common goodness-of-fit tests: the χ² test and the Kolmogorov-Smirnov test
If little data are available, a goodness-of-fit test is unlikely to reject any candidate distribution
It is a good idea to look at a graphical display such as a histogram before making decisions
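
A χ² goodness-of-fit statistic and a quick text histogram can both be computed in a few lines. The sketch below assumes an exponential candidate fitted by its sample mean and uses k equal-probability bins, so each bin expects n/k observations; with one estimated parameter the statistic is compared against a χ² distribution with k - 2 degrees of freedom. The data, bin counts and function names are illustrative assumptions.

```python
import math
import random

def chi_square_exponential(data, k):
    """Chi-square statistic for an exponential fit with k equal-probability bins."""
    n = len(data)
    m = sum(data) / n   # fitted mean
    # Bin edges are the j/k quantiles of the fitted exponential.
    edges = [-m * math.log(1 - j / k) for j in range(1, k)]
    observed = [0] * k
    for x in data:
        idx = sum(1 for e in edges if x > e)   # number of edges below x
        observed[idx] += 1
    expected = n / k
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return stat, observed

def text_histogram(data, bins=8, width=40):
    """Return bin counts and printable bars for a rough visual check."""
    lo, hi = min(data), max(data)
    step = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        counts[min(int((x - lo) / step), bins - 1)] += 1
    peak = max(counts)
    lines = [f"{lo + i * step:7.2f} | " + "#" * round(width * c / peak)
             for i, c in enumerate(counts)]
    return counts, lines

rng = random.Random(3)
data = [rng.expovariate(1 / 5.0) for _ in range(200)]   # simulated, mean 5

stat, observed = chi_square_exponential(data, k=10)
counts, lines = text_histogram(data)
print(f"chi-square statistic = {stat:.2f} with {10 - 2} degrees of freedom")
print("\n".join(lines))
```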

Data Absence

Most likely or mean value
  About 10 customer arrivals per hour
  Approximately 20 minutes to assemble parts
  Around five machine failures per day
Minimum and maximum values
  1.5 to 3 minutes to inspect items
  5 to 10 customer arrivals per hour
  4 to 6 minutes to set up a machine
Minimum, most likely, and maximum values can easily be set up as a triangular distribution
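
When only rough estimates are available, the standard library can generate values directly from them. The sketch below uses the 4 to 6 minute set-up time from the slide and adds a most likely value of 4.5 minutes, which is an assumption made for illustration. It also treats the 1.5 to 3 minute inspection time as uniform, one simple reading of a minimum-and-maximum estimate rather than something the slide prescribes. For a triangular distribution the mean is (min + most likely + max) / 3.

```python
import random
from statistics import mean

rng = random.Random(9)

# Set-up time: minimum and maximum from the slide, mode is hypothetical.
low, mode, high = 4.0, 4.5, 6.0
# random.triangular takes its arguments in the order (low, high, mode).
setup_times = [rng.triangular(low, high, mode) for _ in range(20000)]

# Inspection time: only a range is known, so a uniform draw is assumed.
inspection_times = [rng.uniform(1.5, 3.0) for _ in range(20000)]

triangular_mean = (low + mode + high) / 3
setup_mean = mean(setup_times)
inspection_mean = mean(inspection_times)
print(f"set-up mean = {setup_mean:.3f}, theoretical = {triangular_mean:.3f}")
print(f"inspection mean = {inspection_mean:.3f}")
```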

Summary

Data should be collected systematically
Three types of data: structural, operational, and numerical
A questionnaire is a good way to request information

Summary (continued)

Numerical data for random variables should be analyzed to test for independence and homogeneity
A theoretical distribution should be fitted to the data whenever possible
Data should be documented, reviewed, and approved
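
One simple check for independence is the lag-1 autocorrelation of the observations in the order they were collected: for independent data it should be close to zero, roughly within ±2/√n. The sketch below is only one possible check, and the two series are simulated for illustration: one independent, and one where each value depends on the previous value.

```python
import random

def lag1_autocorrelation(xs):
    """Sample correlation between consecutive observations."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

rng = random.Random(11)

# Independent observations, e.g. simulated service times with mean 5.
independent = [rng.expovariate(0.2) for _ in range(2000)]

# Dependent observations: each value carries over 80% of the previous one.
dependent = [0.0]
for _ in range(1999):
    dependent.append(0.8 * dependent[-1] + rng.gauss(0, 1))

r_ind = lag1_autocorrelation(independent)
r_dep = lag1_autocorrelation(dependent)
threshold = 2 / len(independent) ** 0.5   # rough band for "close to zero"
print(f"independent series: r1 = {r_ind:.3f} (band ±{threshold:.3f})")
print(f"dependent series:   r1 = {r_dep:.3f}")
```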