1
Jim Stuart, Manager, Applied Statistics
Eastman Chemical Company
Kevin White, Senior Statistician
Voridian, a Division of Eastman Chemical Company
2
Statistical Thinking...
A habitual way of looking at work that:
recognizes all activities as PROCESSES,
recognizes that all processes have VARIABILITY,
uses DATA to understand variation, and to drive effective DECISION MAKING.
3
Outline
Process Thinking Principles
8 Lessons for Visualizing Variability
Databased Decision Making
– Control Charts
– Special Cause Rules
– Change Point Analysis
4
Managerial Data Should:
• Summarize performance on what is key to business success.
• Provide history of how the business has performed.
• Help predict the future.
• Provide the foundation for improvement. (Gap Identification)
• Provide a signal for reinforcement of accomplishments.
• Serve as a means for holding the gains.
Process Thinking
5
Principles for Selecting Measures
• Sufficient number to adequately cover all the important facets of the business. If it isn’t important to the business, don’t track it.
• Each measure should impact at least one stakeholder including suppliers, publics, investors, customers, or employees (SPICE).
• Needs to be an appropriate mix of leading and lagging measures.
• Lend themselves to charts that are easy to read and interpret.
• Measures should be analyzed for appropriateness if the situation or strategy changes.
Well-charted measures can provide a mechanism for concise communication with stakeholders.
6
Indicators of Performance
[Diagram: Inputs → Process → Outputs (Products or Services) → Customers, with seven numbered gauges placed along the flow. Gauge labels:]
• Customer Satisfaction
• Customer Dissatisfaction
• Product and Service Quality
• Process Quality/Reliability (Leading Indicators of Product and Service Quality)
• Quality of Inputs, Supplier Quality
• Financial/Cost
• People Health, Safety & Environmental
7
His project is 10% over budget…
Good News?
Bad News?
No Earthly Idea?
8
Statistical Thinking Lesson #1: “It Depends”
• Was the budget set at the best current estimate or was it a “guaranteed not to exceed number”?
• What are the implications for financial planning if everyone uses guaranteed not to exceed numbers?
• What would you suspect if a particular project manager finished every project exactly on schedule?
9
Statistical Thinking Lesson #2: “Variation Happens…At Least It Should”
[Figure: Distribution of Project Cost Variances. Three histograms, one each for Manager A, Manager B, and Manager C. Horizontal axis: cost variance relative to the estimated cost, with the overrun side marked. Vertical axis: number of projects.]
10
Statistical Thinking Lesson #3: “Show Data in Time Order”
[Figure: four time-order plots. Vertical axis 185 to 215; horizontal axis observations 1 to 301.]
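A minimal sketch of why time order matters, using hypothetical data (the values, the size of the shift, and the function name `longest_run_one_side` are all assumptions for illustration). The two series contain exactly the same values, so every histogram and summary statistic is identical; only the order differs.

```python
import random
import statistics

def longest_run_one_side(values):
    """Length of the longest run of consecutive points on one side of the mean."""
    center = statistics.fmean(values)
    longest = current = 0
    previous_side = 0
    for v in values:
        side = 1 if v > center else -1 if v < center else 0
        if side != 0 and side == previous_side:
            current += 1
        else:
            current = 1 if side != 0 else 0
        previous_side = side
        longest = max(longest, current)
    return longest

rng = random.Random(1)
# Hypothetical data: 150 points around 198 followed by 150 points around 202.
in_time_order = ([rng.gauss(198, 1) for _ in range(150)]
                 + [rng.gauss(202, 1) for _ in range(150)])
shuffled = in_time_order[:]
rng.shuffle(shuffled)

# Same values, so any histogram or average is identical for both series.
assert sorted(in_time_order) == sorted(shuffled)

# Only the time-ordered series keeps the long runs that reveal the shift.
print("longest run, time order:", longest_run_one_side(in_time_order))
print("longest run, shuffled:  ", longest_run_one_side(shuffled))
```

A long run on one side of the average is the kind of pattern that only a time-ordered plot can show.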
11
Statistical Thinking Lesson #4: “Beware Your Axes”
The selection of the scale of your vertical axis can have a profound effect on the interpretation by the audience…particularly if it is not their data.
[Figure: Daily Sales in Thousands plotted twice against days 1 to 151, once with the vertical axis running from 193 to 207 and once with it running from 0 to 200.]
12
Statistical Thinking Lesson #5: “Don’t Over-Summarize”
Collect and display data at sufficient frequency to understand the variation, and beware the trappings of bar charts!
[Figure: the same measure shown two ways. A bar chart of 10 summarized periods (vertical axis 0 to 250) is captioned “Management Review and Presentation (I’m OK, you’re OK)”. A time-series plot of observations 1 to 301 (vertical axis 185 to 215) is captioned “Opportunity Seeking, Improvement Motivating (Let’s fix the dips!)”.]
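The effect can be illustrated with a small sketch on hypothetical numbers (a steady value of 200 with a one-day dip to 185 every 30 days; none of these figures come from the slides). Averaging into ten bars hides every dip.

```python
from statistics import fmean

# Hypothetical daily values: steady at 200 with a one-day dip to 185 every 30 days.
daily = [185.0 if day % 30 == 29 else 200.0 for day in range(300)]

# Summarize into 10 bars of 30 days each, as a bar chart would.
bars = [fmean(daily[i:i + 30]) for i in range(0, len(daily), 30)]

print("bar heights:", bars)           # all bars look the same
print("lowest daily value:", min(daily))  # the dips exist only in the daily series
```

Every bar has the same height, so the summarized view gives no hint that there is anything to fix.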
13
I’m OK, You’re OK Slide
[Figure: a single summary bar for 2001; vertical axis 0.0 to 250.0.]
Summary presentations utilizing averages, ranges or histograms should not mislead the user into taking action that would not have been taken had the data been presented as a time series.
14
Statistical Thinking Lesson #6: “Display History to Provide Context”
Plot sufficient history to visualize trends relative to the variation.
[Figure: chart template.
Title: Chart Title, Brief Description (Measure & Scope)
Vertical axis: Measure (Clear Description & Units), 0 to 500
Horizontal axis: Month/Year, January 1998 through November 2001
Annotations: Goal, Comparison 1, Comparison 2, GOOD
Footer fields: Source of Data, How Measure is Calculated; Population (i.e., Entire Company, All U.S., etc.); Date Prepared; Key Result Area; Stakeholder; List of Supporting Information (Tables, Bar Charts, Pie Charts, etc.)]
15
Statistical Thinking Lesson #7: “Provide Comparisons to Enable Gap Analysis”
Relevant comparisons should be placed in the appropriate locations on the graph.
[Figure: chart template repeated from Lesson #6 (vertical axis 0 to 500; Month/Year from January 1998 through November 2001), showing Goal, Comparison 1, and Comparison 2.]
16
Statistical Thinking Lesson #8: “Use Moving Averages with Caution”
• Helps visualize trends through the noise.
• Length should cover expected cycles. Annual is most common.
• Tend to be sluggish.
• Can generate the appearance of cycles or shifts which are not truly present.
– Cannot use run rules to signal special causes
• Control limits for moving averages can be calculated, but prefer not to place them on the graph itself.
[Figure: Chart Title. Vertical axis: KRM (Clear Description and Units), 0 to 500. Horizontal axis: Month/Year, January 1996 through November 1998. Lines: Goal, Comparison 1, Comparison 2.]
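A minimal sketch of why run rules break down on moving averages, assuming purely random, independent data and a 12-point window (both are assumptions for illustration). Successive moving averages share most of their points, so they become strongly correlated even though the underlying data contains no cycles or shifts.

```python
import random
from statistics import fmean

def moving_average(values, window):
    """Trailing moving average; the first value uses the first `window` points."""
    return [fmean(values[i - window + 1:i + 1]) for i in range(window - 1, len(values))]

def lag1_autocorrelation(values):
    """Correlation between each point and the one that follows it."""
    center = fmean(values)
    num = sum((a - center) * (b - center) for a, b in zip(values, values[1:]))
    den = sum((v - center) ** 2 for v in values)
    return num / den

rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(2000)]   # hypothetical noise, no real signal
smoothed = moving_average(noise, 12)

print("lag-1 autocorrelation, raw data:      ", round(lag1_autocorrelation(noise), 2))
print("lag-1 autocorrelation, moving average:", round(lag1_autocorrelation(smoothed), 2))
```

The raw series should show a correlation near zero and the smoothed series a correlation close to one, which is what produces long runs and apparent cycles on a moving average chart.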
17
The Headlines Scream - Great News!
18
What We All Imagine…‘cause this is what newspaper graphs look like
[Figure: newspaper-style graph with a horizontal axis running from 1995 to 1998.]
19
Projected 1998
Reality
20
Statistical Thinking...
A habitual way of looking at work that:
recognizes all activities as PROCESSES,
recognizes that all processes have VARIABILITY,
uses DATA to understand variation, and to drive effective DECISION MAKING.
21
Databased Decision Making
Managers are routinely faced with interpreting their metrics and making a real-time decision as to whether the latest data point tells them to do something.
• Good graphical depiction goes a long way
• Seasoned managers can see signals through the noise
• Statistics can take the subjectivity out of such decisions
• One size does not fit all
22
Control Charts - The Two Mistakes
The False Alarm - Interpreting noise as a signal
The Missed Alarm - Failure to detect a signal
23
Control Charts in Data Rich Environments
Control limits set at 3 standard errors
Approximately 0.3% risk of a false alarm
The risk of the missed alarm is often overlooked
In parts manufacturing, greater sensitivity can be obtained by giving consideration to the selection of the rational subgroup
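The false alarm figure can be checked with a short calculation. This is a sketch that assumes independent, normally distributed points; the function names are illustrative and not from the slides.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def signal_probability(shift, limit_width=3.0):
    """Chance that a single point falls outside the control limits.

    `shift` is the true mean shift in standard errors. With shift = 0 the
    result is the false alarm risk per point.
    """
    inside = normal_cdf(limit_width - shift) - normal_cdf(-limit_width - shift)
    return 1.0 - inside

false_alarm = signal_probability(0.0)
print(f"False alarm risk per point: {false_alarm:.2%}")
print(f"Average points between false alarms: {1 / false_alarm:.0f}")

# The often overlooked side: limits alone are slow to catch a modest shift.
print(f"Average points to detect a 1.0 standard error shift: "
      f"{1 / signal_probability(1.0):.0f}")
```

Under the normality assumption the per-point risk works out to about 0.27%, which matches the approximate 0.3% quoted above.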
24
The “Run of 8” Rule
Sometimes 7 and sometimes 9
Also provides low risk of false alarms
Used with 3 standard error limits, sensitivity is improved
Takes 8 points to initiate signal
Many other rules are also often used in the data rich environment for greater sensitivity but the tradeoff is a higher false alarm rate.
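A minimal sketch of how the two rules could be checked together. The function name and parameters are assumptions, and a "run" is taken to mean consecutive points on the same side of the centerline.

```python
def first_signal(points, centerline, sigma, limit_width=3.0, run_length=8):
    """Return the 1-based position of the first special-cause signal, or None.

    A signal is either a point beyond `limit_width` standard errors or
    `run_length` consecutive points on the same side of the centerline.
    """
    run = 0
    last_side = 0
    for index, x in enumerate(points, start=1):
        if abs(x - centerline) > limit_width * sigma:
            return index
        side = 1 if x > centerline else -1 if x < centerline else 0
        if side != 0 and side == last_side:
            run += 1
        else:
            run = 1 if side != 0 else 0   # a point on the centerline breaks the run
        last_side = side
        if run >= run_length:
            return index
    return None

# Hypothetical data: a small sustained shift that never crosses the 3 standard error limit.
print(first_signal([0.4, -0.2, 0.9, 1.1, 0.7, 1.3, 0.8, 1.0, 1.2, 0.6], 0.0, 1.0))
```

Setting `run_length` to 7 or 9 gives the variants mentioned above.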
25
Average Run Lengths for Typical Data Rich Environments
[Figure: average number of points to detect a shift (0 to 400) versus mean shift in standard errors (0.00 to 3.00). Two curves: 3 Std. Error Limits Only, and 3 Std. Error Limits + Run of 8.]
26
Average Run Lengths for Typical Data Rich Environments (Reduced Scale)
[Figure: average number of points to detect a shift (0.00 to 20.00) versus mean shift in standard errors (1.40 to 3.00). Two curves: 3 Std. Error Limits Only, and 3 Std. Error Limits + Run of 8.]
27
Why These Rules Work for Data Rich Environments
High false alarm rates would lead to wasted time doing investigation and possibly excessive process adjustments.
Poor sensitivity is often an acceptable trade-off for a lower false alarm rate
And the next point is never far behind
28
Why Managerial Data Is Different
The Obvious - less frequent data
Detection of large process shifts is not as important
Actions taken are different
Improve mindset, not maintain
29
Traditional Rules Applied to Low Frequency Managerial Data
A shift of 1.5 standard errors takes eight points on average to detect
This is little comfort if dealing with monthly managerial data
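The average run length for a given shift can be estimated by simulation. The sketch below assumes independent, normally distributed points, a centerline of 0, a standard error of 1, and a shift that is already present at the first point; the function names, trial count, and seed are illustrative choices.

```python
import random

def run_length(rng, shift, limit_width=3.0, run_rule=8, max_points=100_000):
    """Number of points until the first signal for a process shifted by `shift`."""
    run, last_side = 0, 0
    for n in range(1, max_points + 1):
        x = rng.gauss(shift, 1.0)          # centerline 0, standard error 1
        if abs(x) > limit_width:
            return n                       # point beyond the control limits
        side = 1 if x > 0 else -1
        run = run + 1 if side == last_side else 1
        last_side = side
        if run_rule is not None and run >= run_rule:
            return n                       # run on one side of the centerline
    return max_points

def average_run_length(shift, trials=20_000, seed=0, **rule):
    """Monte Carlo estimate of the average number of points needed to signal."""
    rng = random.Random(seed)
    return sum(run_length(rng, shift, **rule) for _ in range(trials)) / trials

combined = average_run_length(1.5)
limits_only = average_run_length(1.5, run_rule=None)
print(f"3 std. error limits + run of 8: {combined:.1f} points")
print(f"3 std. error limits only:       {limits_only:.1f} points")
```

Under these assumptions the combined rule should come out near the eight points quoted above, and limits alone should take roughly twice as long. With monthly data, eight points means about eight months before the shift is flagged.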
30
The Individuals Chart
An excellent all-purpose tool
Very robust - low false alarms for virtually any data distribution (typically < 1%)
A single option for managers will get more use
But, don’t forget the poor sensitivity
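The slides do not say how the limits are computed. The sketch below uses the conventional moving range method for an individuals chart, which is an assumption here: the standard error is estimated as the average moving range divided by 1.128. The data values are hypothetical.

```python
from statistics import fmean

def individuals_limits(values, width=3.0):
    """Return (lower limit, centerline, upper limit) for an individuals chart."""
    center = fmean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = fmean(moving_ranges)
    sigma_hat = mr_bar / 1.128   # d2 constant for moving ranges of two points
    return center - width * sigma_hat, center, center + width * sigma_hat

# Hypothetical monthly values of a managerial measure.
monthly = [13.5, 14.1, 12.9, 13.8, 13.2, 14.4, 13.0, 13.6, 13.9, 13.3, 14.0, 13.1]
lower, center, upper = individuals_limits(monthly)
print(f"LCL = {lower:.2f}, centerline = {center:.2f}, UCL = {upper:.2f}")
```

The `width` parameter is exposed so that narrower limits, such as 2.5 or 2.0 standard errors, can be tried for greater sensitivity.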
31
Sensitivity for Managerial Data
Data is usually individual observations (cannot subgroup)
With traditional special cause rules, there is no control over risk of the missed alarm
User can control the width of the control limits
User can employ some modified run rules
These modifications do come with a higher false alarm rate!
32
Alternative Special Cause Rule Sets
A - Control limits set at 2.5 std. errors from the centerline.
B - Control limits set at 2.5 std. errors from the centerline plus two points past 1.5 standard errors.
C - Control limits set at 2.0 std. errors from the centerline.
D - Control limits set at 2.0 std. errors from the centerline plus a run of 6.
E - Control limits set at 2.0 std. errors from the centerline plus three points past 1.0 standard errors.
F - Runs of 6 consecutive points on one side of the centerline.
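For the rule sets that consist of a single rule (A, C, and F), the average run length with no shift has a simple closed form. This is a sketch assuming independent, normally distributed points; sets B, D, and E combine rules and are easier to evaluate by simulation, so they are left out here.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def arl_limits_only(limit_width):
    """No-shift ARL when the only rule is a point beyond the limits."""
    return 1.0 / (2.0 * (1.0 - normal_cdf(limit_width)))

def arl_run_only(run_length):
    """No-shift ARL for a run of `run_length` points on one side of the centerline,
    assuming each point is equally likely to fall on either side."""
    return 2 ** run_length - 1

in_control_arl = {
    "A": arl_limits_only(2.5),   # 2.5 std. error limits
    "C": arl_limits_only(2.0),   # 2.0 std. error limits
    "F": arl_run_only(6),        # run of 6
}
for rule_set, arl in in_control_arl.items():
    print(f"Rule set {rule_set}: about {arl:.1f} points between false alarms")
```

These values should land close to the "ARL for No Shift" figures reported with the false alarm table later in the deck.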
33
Why Alternative Rules?
Greater sensitivity is desired with an acceptable number of false alarms
What’s acceptable? It depends (See Lesson #1)!
– The data frequency
– Time to do investigation
– Importance of detecting quickly
– Magnitude of change deemed important
34
Average Run Lengths for Alternative Rules - Chart #1
[Figure: average number of points to detect a shift (0 to 100) versus mean shift in standard errors (0.00 to 3.00). Two curves: (A) 2.5 Std Error Limits Only, and (B) 2.5 Std Error Limits + 2 Past 1.5.]
35
Average Run Lengths for Alternative Rules - Chart #2
[Figure: average number of points to detect a shift (0 to 85) versus mean shift in standard errors (0.00 to 2.00). Four curves: (C) 2 Std. Error Limits Only, (D) 2 Std. Error Limits + Run of 6, (E) 2 Std. Error Limits + 3 Past 1, and (F) Run of 6 Only.]
36
Average False Alarms Per Year
Special cause rule sets (columns):
3SE = 3 Std. Error Limits
3SE+R8 = 3 Std. Error Limits + Run of 8
(A) = 2.5 Std. Error Limits
(B) = 2.5 Std. Error Limits plus two past 1.5 Std. Errors
(C) = 2 Std. Error Limits
(D) = 2 Std. Error Limits plus Run of 6
(E) = 2 Std. Error Limits plus three past 1 Std. Error
(F) = Run of 6

Data Frequency    Obs. Per Year    3SE      3SE+R8   (A)      (B)      (C)      (D)      (E)      (F)
Hourly            8760             23.64    57.48    108.59   171.50   401.28   509.01   435.60   139.45
4 Hours           2190             5.91     14.37    27.15    42.87    100.32   127.25   108.90   34.86
8 Hours           1095             2.96     7.19     13.57    21.44    50.16    63.63    54.45    17.43
Daily             365              0.99     2.40     4.52     7.15     16.72    21.21    18.15    5.81
Weekly            52               0.14     0.34     0.64     1.02     2.38     3.02     2.59     0.83
Monthly           12               0.03     0.08     0.15     0.23     0.55     0.70     0.60     0.19
Quarterly         4                0.01     0.03     0.05     0.08     0.18     0.23     0.20     0.06
ARL for No Shift                   370.50   152.40   80.67    51.08    21.83    17.21    20.11    62.82
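The entries in this table appear to follow from one relationship: expected false alarms per year equals observations per year divided by the ARL for no shift. The sketch below reproduces a few cells from the ARL values in the table; the dictionary keys are shorthand labels chosen for this example.

```python
# ARL for no shift, copied from the table.
ARL_NO_SHIFT = {
    "3 Std. Error Limits": 370.50,
    "3 Std. Error Limits + Run of 8": 152.40,
    "A": 80.67, "B": 51.08, "C": 21.83, "D": 17.21, "E": 20.11, "F": 62.82,
}
OBS_PER_YEAR = {"Hourly": 8760, "4 Hours": 2190, "8 Hours": 1095, "Daily": 365,
                "Weekly": 52, "Monthly": 12, "Quarterly": 4}

def false_alarms_per_year(frequency, rule_set):
    """Expected false alarms per year for a data frequency and rule set."""
    return OBS_PER_YEAR[frequency] / ARL_NO_SHIFT[rule_set]

for frequency in ("Hourly", "Daily", "Monthly"):
    row = ", ".join(f"{rule}: {false_alarms_per_year(frequency, rule):.2f}"
                    for rule in ("3 Std. Error Limits", "A", "D"))
    print(f"{frequency:8s} {row}")
```

Read this way, the table says that even the most sensitive rule set (D) produces less than one false alarm per year on monthly data, while the same rule set on hourly data would produce hundreds.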
37
Situational Recommendations
Situation                                         Recommendation
Goal is to Maintain                               Use 3 Std Error Limits and consider Run of 6
Goal is to Improve                                Alternative Rules
Strong Slope in the Metric                        Predictable Slope - Place limits around sloped center line.
Highly Unstable Process - Where’s the Average?    Control Charts will not apply.
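One way to place limits around a sloped center line is sketched below. The approach is an assumption, not taken from the slides: fit a least-squares line through the data, then estimate the standard error from the moving ranges of the residuals around that line.

```python
from statistics import fmean

def trended_limits(values, width=3.0):
    """Return a (lower, center, upper) tuple for each point around a fitted trend line."""
    n = len(values)
    t = list(range(n))
    t_bar, y_bar = fmean(t), fmean(values)
    slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, values))
             / sum((ti - t_bar) ** 2 for ti in t))
    intercept = y_bar - slope * t_bar
    fitted = [intercept + slope * ti for ti in t]
    residuals = [y - f for y, f in zip(values, fitted)]
    moving_ranges = [abs(b - a) for a, b in zip(residuals, residuals[1:])]
    sigma_hat = fmean(moving_ranges) / 1.128   # same constant as an individuals chart
    return [(f - width * sigma_hat, f, f + width * sigma_hat) for f in fitted]

# Hypothetical metric that grows by about 2 units per period with some noise.
metric = [2 * i + (1 if i % 2 else -1) for i in range(20)]
for period, (lower, center, upper) in enumerate(trended_limits(metric)[:3]):
    print(period, round(lower, 2), round(center, 2), round(upper, 2))
```

Special cause rules are then applied to the distance of each point from the sloped line rather than from a flat average.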
38
Change Point Analysis
The general principle is Monte Carlo simulation
Advantages include:
– Very easy to use
– Detects mean and variation changes
– Excellent graphics
39
Change Point Analysis
Confidence Levels for the probability a change is real
Confidence Levels for when the change occurred.
Handles any type of data
More sensitive than control charts
Not confused by outliers
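A minimal sketch of one common way to do change point analysis by simulation, for a single change in the mean. It combines a cumulative sum (CUSUM) of deviations from the average with random reordering of the data. Whether this matches the software behind the examples that follow is an assumption; the function names and the data are hypothetical.

```python
import random
from statistics import fmean

def cusum_range(values):
    """Range (max minus min) of the cumulative sum of deviations from the mean."""
    center = fmean(values)
    total, highest, lowest = 0.0, 0.0, 0.0
    for v in values:
        total += v - center
        highest = max(highest, total)
        lowest = min(lowest, total)
    return highest - lowest

def change_confidence(values, bootstraps=1000, seed=0):
    """Fraction of random reorderings whose CUSUM range is smaller than observed."""
    rng = random.Random(seed)
    observed = cusum_range(values)
    reordered = list(values)
    smaller = 0
    for _ in range(bootstraps):
        rng.shuffle(reordered)           # reordering = sampling without replacement
        if cusum_range(reordered) < observed:
            smaller += 1
    return smaller / bootstraps

def estimated_change_index(values):
    """Index of the last point before the change: where |CUSUM| peaks."""
    center = fmean(values)
    total, best_index, best_size = 0.0, 0, -1.0
    for i, v in enumerate(values):
        total += v - center
        if abs(total) > best_size:
            best_index, best_size = i, abs(total)
    return best_index

# Hypothetical monthly data with a shift from about 10 to about 14 after month 12.
data = [10.0, 11.0, 9.0] * 4 + [14.0, 15.0, 13.0] * 4
print("confidence a change occurred:", change_confidence(data))
print("last point before the change:", estimated_change_index(data))
```

If the data had no real change, random reorderings would often produce a CUSUM range as large as the observed one and the confidence would be low. Finding several changes is usually done by splitting the series at the estimated change point and repeating the analysis on each segment.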
40
Change Point Analysis Example Graph
[Figure: Change-Point Analysis of % of AR$ Past Due. Vertical axis: % of AR$ Past Due (6 to 22). Horizontal axis: Month/Yr, Jan-1990 to Jun-1996.]
41
Change Point Analysis Example Table
Table of Significant Changes for % of AR$ Past Due
Confidence Level = 90%, Confidence Interval = 95%, Bootstraps = 1000, Sampling Without Replacement

Month/Yr    Confidence Interval     Conf. Level    From      To        Level
Dec-1990    (Nov-1990, Feb-1991)    98%            13.527    16.8      2
Jul-1991    (Jun-1991, Oct-1991)    98%            16.8      13.083    1
Jun-1993    (Feb-1992, Jan-1994)    94%            13.083    11.306    3
Dec-1994    (Sep-1994, May-1995)    100%           11.306    14.392    2
42
Change Point Analysis Serial Dependency
[Figure: Change Point Analysis of $ Accounts Receivable. Vertical axis: $ Accounts Receivable (220000 to 440000). Horizontal axis: Month/Yr, Jan-1990 to Jun-1996.]
43
Conclusions
Process thinking and careful selection of measures can help keep managers focused
Appropriate plotting is 90% of the battle
Traditional control charts may not be optimal
Alternative special cause rule sets should be considered
Change point analysis may be the closest thing to an all-purpose tool for managers.
44
REFERENCES
1) Stuart J, White K, Methods for Handling Low Frequency Managerial Data, 2002 ASQ Annual Quality Congress Proceedings
2) Balestracci D, Data “Sanity”: Statistical Thinking Applied to Everyday Data, ASQ Statistics Division Special Publication, Summer 1998 (Available through ASQ, www.asqstatdiv.org)
3) Britz G, Emerling D, Hare L, Hoerl R, Shade J, Statistical Thinking, ASQ Statistics Division Special Publication, Summer 1996 (Available through ASQ, www.asqstatdiv.org)
4) Leitnaker MG, Using the Power of Statistical Thinking, ASQ Statistics Division Special Publication, Summer 2000 (Available through ASQ, www.asqstatdiv.org)
5) Wheeler DJ, Understanding Variation: The Key to Managing Chaos, Knoxville, TN: SPC Press, Inc., 1986 (www.spcpress.com)
6) Taylor WA, “Change-Point Analysis: A Powerful New Tool For Detecting Changes,” WEB: www.variation.com/cpa/tech/changepoint.html, 2000
7) Wheeler DJ, Building Continual Improvement, Knoxville, TN: SPC Press, Inc., 1998 (www.spcpress.com)