1

Jim Stuart
Manager, Applied Statistics
Eastman Chemical Company
[email protected]

Kevin White
Senior Statistician
Voridian, a Division of Eastman Chemical Company
[email protected]

2

Statistical Thinking...

A habitual way of looking at work that:

recognizes all activities as PROCESSES,

recognizes that all processes have VARIABILITY,

uses DATA to understand variation, and to drive effective DECISION MAKING.

3

Outline

Process Thinking Principles

8 Lessons for Visualizing Variability

Databased Decision Making

– Control Charts

– Special Cause Rules

– Change Point Analysis

4

Managerial Data Should:

• Summarize performance on what is key to business success.

• Provide history of how the business has performed.

• Help predict the future.

• Provide the foundation for improvement. (Gap Identification)

• Provide a signal for reinforcement of accomplishments.

• Serve as a means for holding the gains.

Process Thinking

5

Principles for Selecting Measures

• Sufficient number to adequately cover all the important facets of the business. If it isn’t important to the business, don’t track it.

• Each measure should impact at least one stakeholder including suppliers, publics, investors, customers, or employees (SPICE).

• Needs to be an appropriate mix of leading and lagging measures.

• Lend themselves to charts that are easy to read and interpret.

• Measures should be analyzed for appropriateness if the situation or strategy changes.

Well-charted measures can provide a mechanism for concise communication with stakeholders.

6

Indicators of Performance

[Diagram: seven numbered gauges placed along the flow Inputs, Process, Outputs (Products or Services), Customers. Gauge labels: Customer Satisfaction; Customer Dissatisfaction; Product and Service Quality; Process Quality/Reliability (Leading Indicators of Product and Service Quality); Quality of Inputs; Supplier Quality; Financial/Cost; People, Health, Safety & Environmental.]

7

His project is 10% over budget…

Good News?

Bad News?

No Earthly Idea?

8

Statistical Thinking Lesson #1: “It Depends”

• Was the budget set at the best current estimate or was it a “guaranteed not to exceed number”?

• What are the implications for financial planning if everyone uses “guaranteed not to exceed” numbers?

• What would you suspect if a particular project manager finished every project exactly on schedule?

9

Statistical Thinking Lesson #2: “Variation Happens…At Least It Should”

[Figure: Distribution of Project Cost Variances. Three histograms, one each for MANAGER A, MANAGER B, and MANAGER C. Vertical axis: Number of Projects (0 to 120). Horizontal axes: variance from ESTIMATED COST, spanning roughly -30 to 20, with the overrun side labeled.]

10

Statistical Thinking Lesson #3: “Show Data in Time Order”

[Figure: four time-series plots, each with a vertical axis from 185 to 215 and a horizontal axis of observations 1 to 301.]

11

Statistical Thinking Lesson #4: “Beware Your Axes”

The selection of the scale of your vertical axis can have a profound effect on the interpretation by the audience…particularly if it is not their data.

[Figure: two plots of Daily Sales in Thousands over observations 1 to 151. One uses a vertical axis from 193 to 207; the other uses a vertical axis from 0 to 200.]

12

Statistical Thinking Lesson #5: “Don’t Over-Summarize”

Collect and display data at sufficient frequency to understand the variation, and beware the trappings of bar charts!

[Figure: a bar chart of 10 summarized values (vertical axis 0 to 250), captioned “Management Review And Presentation (I’m OK, you’re OK)”, shown with a time-series plot of observations 1 to 301 (vertical axis 185 to 215), captioned “Opportunity Seeking, Improvement Motivating (Let’s fix the dips!)”.]

13

I’m OK, You’re OK Slide

[Figure: a summary bar chart for 2001 on a vertical axis from 0.0 to 250.0.]

Summary presentations utilizing averages, ranges or histograms should not mislead the user into taking action that would not have been taken if the data had been presented as a time series.

14

Statistical Thinking Lesson #6: “Display History to Provide Context”

Plot sufficient history to visualize trends relative to the variation.

[Chart template: monthly time series with the horizontal axis (Month/Year) labeled from January 1998 through November 2001 and a vertical axis from 0 to 500.

Title: Chart Title, Brief Description (Measure & Scope)

Vertical axis label: Measure (Clear Description & Units)

Markers on the plot: Goal, Comparison 1, Comparison 2, GOOD

Footer fields: Source of Data, How Measure is Calculated; Population (i.e., Entire Company, All U.S., etc.); Date Prepared; Key Result Area; Stakeholder; List of Supporting Information (Tables, Bar Charts, Pie Charts, etc.)]

15

Statistical Thinking Lesson #7: “Provide Comparisons to Enable Gap Analysis”

Relevant comparisons should be placed in the appropriate locations on the graph.

[Chart template: monthly time series with the horizontal axis (Month/Year) labeled from January 1998 through November 2001 and a vertical axis from 0 to 500, with Goal, Comparison 1, and Comparison 2 marked on the plot. Title, axis label, GOOD marker, and footer fields (source of data, population, date prepared, key result area, stakeholder, list of supporting information) are laid out as placeholders.]

16

Statistical Thinking Lesson #8: “Use Moving Averages with Caution”

• Help visualize trends through the noise.

• Length should cover expected cycles. Annual is most common.

• Tend to be sluggish.

• Can generate the appearance of cycles or shifts which are not truly present.

– Cannot use run rules to signal special causes

• Control limits for moving averages can be calculated, but prefer not to place them on the graph itself.

[Chart template: monthly time series with the horizontal axis (Month/Year) labeled from January 1996 through November 1998 and a vertical axis from 0 to 500. Vertical axis label: KRM (Clear Description and Units). Markers on the plot: Goal, Comparison 1, Comparison 2.]
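The "sluggish" behavior of a moving average can be seen with a small sketch. The series below is made up for illustration (12 months at 100 followed by a step to 112), and the function name `moving_average` is hypothetical. A 12-point trailing average needs a full year to catch up with a shift that happened in a single month.

```python
def moving_average(values, window):
    """Trailing moving average: one value for each full window of points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest point, drop the oldest one.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Illustrative data only: a sudden step from 100 to 112 after month 12.
series = [100.0] * 12 + [112.0] * 12
smoothed = moving_average(series, 12)

for month, value in enumerate(smoothed, start=12):
    print(f"month {month}: 12-month average = {value:.1f}")
```

The raw data jump in one step, while the averaged line climbs one unit per month. Successive moving-average points also share most of their data, which is consistent with the caution above against applying run rules to them.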

17

The Headlines Scream - Great News!

18

What We All Imagine… ‘cause this is what newspaper graphs look like

[Figure: newspaper-style graph with a horizontal axis from 1995 to 1998.]

19

Projected 1998

Reality

20

A habitual way of looking at work that:

recognizes all activities as PROCESSES,

recognizes that all processes have VARIABILITY,

uses DATA to understand variation, and to drive effective DECISION MAKING.

Statistical Thinking...

21

Databased Decision Making

Managers are routinely faced with interpreting their metrics and making a real-time decision as to whether the latest data point tells them to do something.

Good graphical depiction goes a long way

Seasoned managers can see signals through the noise

Statistics can take the subjectivity out of such decisions

One size does not fit all

22

Control Charts - The Two Mistakes

The False Alarm - Interpreting noise as a signal

The Missed Alarm - Failure to detect a signal

23

Control Charts in Data Rich Environments

Control limits set at 3 standard errors

Approximately 0.3% risk of a false alarm on each point

The risk of the missed alarm is often overlooked

In parts manufacturing, greater sensitivity can be obtained by giving consideration to the selection of the rational subgroup
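The 0.3% figure can be checked with a short calculation. This sketch assumes independent, normally distributed points, which the slides do not state explicitly; the function names are hypothetical. It also shows how the risk grows when the limits are narrowed.

```python
import math


def false_alarm_probability(limit_width):
    """Chance that one in-control point falls outside +/- limit_width
    standard errors, assuming a normal distribution."""
    return math.erfc(limit_width / math.sqrt(2.0))


def in_control_arl(limit_width):
    """Average number of points between false alarms (limits only)."""
    return 1.0 / false_alarm_probability(limit_width)


for width in (3.0, 2.5, 2.0):
    risk = false_alarm_probability(width)
    print(f"{width:.1f} std. error limits: "
          f"risk per point {risk:.2%}, ARL {in_control_arl(width):.1f}")
```

Under these assumptions the 3 standard error limits give roughly one false alarm in every 370 points.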

24

The “Run of 8” Rule

Sometimes 7 and sometimes 9

Also provides low risk of false alarms

Used with 3 standard error limits, sensitivity is improved

Takes 8 points to initiate signal

Many other rules are also often used in the data rich environment for greater sensitivity but the tradeoff is a higher false alarm rate.
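A minimal sketch of a run rule check follows. The function name and the treatment of points that fall exactly on the centerline (they break the run) are assumptions, not something the slides specify.

```python
def first_run_signal(values, centerline, run_length=8):
    """Return the 0-based index of the point that completes a run of
    `run_length` consecutive points on one side of the centerline,
    or None if no such run occurs."""
    side = 0    # +1 above the centerline, -1 below, 0 exactly on it
    count = 0
    for index, value in enumerate(values):
        current = (value > centerline) - (value < centerline)
        if current != 0 and current == side:
            count += 1
        else:
            # Side changed (or point is on the centerline): restart the run.
            side = current
            count = 1 if current != 0 else 0
        if count >= run_length:
            return index
    return None


# Illustrative data: two points that alternate, then eight in a row above 0.
example = [1, -1] + [1] * 8
print(first_run_signal(example, centerline=0))
```

Passing `run_length=7` or `run_length=9` covers the "sometimes 7 and sometimes 9" variants.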

25

Average Run Lengths for Typical Data Rich Environments

[Figure: Average # Points to Detect Shift (0 to 400) versus Mean Shift (Standard Errors) (0.00 to 3.00). Two curves: 3 Std. Error Limits Only; 3 Std. Error Limits + Run of 8.]

26

Average Run Lengths for Typical Data Rich Environments (Reduced Scale)

[Figure: Average # Points to Detect Shift (0.00 to 20.00) versus mean shift (1.40 to 3.00). Two curves: 3 Std. Error Limits Only; 3 Std. Error Limits + Run of 8.]

27

Why These Rules Work for Data Rich Environments

High false alarm rates would lead to wasted time doing investigation and possibly excessive process adjustments.

Poor sensitivity is often an acceptable trade-off for a lower false alarm rate

And the next point is never far behind

28

Why Managerial Data Is Different

The Obvious - less frequent data

Detection of large process shifts is not as important

Actions taken are different

Improve mindset, not maintain

29

Traditional Rules Applied to Low Frequency Managerial Data

A shift of 1.5 standard errors takes eight points on average to detect

This is little comfort if dealing with monthly managerial data
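The "eight points on average" figure can be approximated by simulation. The sketch below assumes independent, normally distributed points with a standard error of 1 and a sustained mean shift; the function names, trial count, and seed are arbitrary choices for illustration.

```python
import random


def run_length(shift, limit=3.0, run=8, rng=random, max_points=100_000):
    """Number of points until the first signal in one simulated series.

    A signal is a point beyond +/- `limit`, or `run` consecutive points
    on the same side of the centerline (taken as 0)."""
    side, count = 0, 0
    for n in range(1, max_points + 1):
        z = rng.gauss(shift, 1.0)
        if abs(z) > limit:
            return n
        current = 1 if z > 0 else -1
        count = count + 1 if current == side else 1
        side = current
        if count >= run:
            return n
    return max_points


def average_run_length(shift, trials=4000, seed=1, **rules):
    """Monte Carlo estimate of the average run length for a given shift."""
    rng = random.Random(seed)
    total = sum(run_length(shift, rng=rng, **rules) for _ in range(trials))
    return total / trials


print("3 std. error limits + run of 8, shift 1.5:",
      round(average_run_length(1.5), 1))
print("3 std. error limits only, shift 1.5:",
      round(average_run_length(1.5, run=10**9), 1))
```

With monthly data, each point in the estimate is a month of waiting, which is the concern raised on this slide.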

30

The Individuals Chart

An excellent all-purpose tool

Very robust - low false alarms for virtually any data distribution (typically < 1%)

A single option for managers will get more use

But, don’t forget the poor sensitivity
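For reference, here is a sketch of how individuals chart limits are conventionally computed from the average moving range. The slides do not spell out the calculation, so the use of the standard 2.66 factor (3 divided by 1.128) is an assumption about the intended method, and the data are made up.

```python
def individuals_limits(values):
    """Return (lower limit, centerline, upper limit) for an individuals
    chart, using the average two-point moving range."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    average_mr = sum(moving_ranges) / len(moving_ranges)
    # 2.66 = 3 / 1.128, the usual constant for moving ranges of two points.
    half_width = 2.66 * average_mr
    return center - half_width, center, center + half_width


# Illustrative monthly values only.
lower, center, upper = individuals_limits([10, 12, 11, 13, 12])
print(f"LCL={lower:.2f}  CL={center:.2f}  UCL={upper:.2f}")
```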

31

Sensitivity for Managerial Data

Data is usually individual observations (cannot subgroup)

With traditional special cause rules, there is no control over risk of the missed alarm

User can control the width of the control limits

User can employ some modified run rules

These modifications do come with a higher false alarm rate!

32

Alternative Special Cause Rule Sets

A - Control limits set at 2.5 std. errors from the centerline.

B - Control limits set at 2.5 std. errors from the centerline plus two points past 1.5 standard errors.

C - Control limits set at 2.0 std. errors from the centerline.

D - Control limits set at 2.0 std. errors from the centerline plus a run of 6.

E - Control limits set at 2.0 std. errors from the centerline plus three points past 1.0 standard errors.

F - Runs of 6 consecutive points on one side of the centerline.

33

Why Alternative Rules?

Greater sensitivity is desired with an acceptable number of false alarms

What’s acceptable? It depends (See Lesson #1)!

– The data frequency

– Time to do investigation

– Importance of detecting quickly

– Magnitude of change deemed important

34

Average Run Lengths for Alternative Rules - Chart #1

[Figure: Average # Points to Detect Shift (0 to 100) versus mean shift (0.00 to 3.00). Two curves: (A) 2.5 Std Error Limits Only; (B) 2.5 Std Error Limits + 2 Past 1.5.]

35

Average Run Lengths for Alternative Rules - Chart #2

[Figure: Average # Points to Detect Shift (0 to 85) versus Mean Shift (Standard Errors) (0.00 to 2.00). Four curves: (C) 2 Std. Error Limits Only; (D) 2 Std. Error Limits + Run of 6; (E) 2 Std. Error Limits + 3 Past 1; (F) Run of 6 Only.]

36

Average False Alarms Per Year

SPECIAL CAUSE RULE SET (column key)

3 SE = 3 Std. Error Limits
3 SE + R8 = 3 Std. Error Limits + Run of 8
(A) = 2.5 Std. Error Limits
(B) = 2.5 Std. Error Limits plus two past 1.5 Std. Errors
(C) = 2 Std. Error Limits
(D) = 2 Std. Error Limits plus Run of 6
(E) = 2 Std. Error Limits plus three past 1 Std. Error
(F) = Run of 6

Data Frequency    Obs. Per Year   3 SE     3 SE + R8   (A)      (B)      (C)      (D)      (E)      (F)
Hourly            8760            23.64    57.48       108.59   171.50   401.28   509.01   435.60   139.45
4 Hours           2190            5.91     14.37       27.15    42.87    100.32   127.25   108.90   34.86
8 Hours           1095            2.96     7.19        13.57    21.44    50.16    63.63    54.45    17.43
Daily             365             0.99     2.40        4.52     7.15     16.72    21.21    18.15    5.81
Weekly            52              0.14     0.34        0.64     1.02     2.38     3.02     2.59     0.83
Monthly           12              0.03     0.08        0.15     0.23     0.55     0.70     0.60     0.19
Quarterly         4               0.01     0.03        0.05     0.08     0.18     0.23     0.20     0.06
ARL for No Shift                  370.50   152.40      80.67    51.08    21.83    17.21    20.11    62.82
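The entries in this table appear to follow from dividing the observations per year by the no-shift ARL for each rule set. The sketch below reproduces a few of them from the values printed on the slide; the dictionary and function names are hypothetical.

```python
# No-shift average run lengths, copied from the table.
ARL_NO_SHIFT = {
    "3 Std. Error Limits": 370.50,
    "3 Std. Error Limits + Run of 8": 152.40,
    "(D) 2 Std. Error Limits plus Run of 6": 17.21,
    "(F) Run of 6": 62.82,
}

# Observations per year, copied from the table.
OBS_PER_YEAR = {"Hourly": 8760, "Daily": 365, "Monthly": 12}


def false_alarms_per_year(obs_per_year, arl_no_shift):
    """Expected false alarms per year = points per year / points per alarm."""
    return obs_per_year / arl_no_shift


for frequency, n in OBS_PER_YEAR.items():
    for rule, arl in ARL_NO_SHIFT.items():
        print(f"{frequency:8s} {rule:40s} "
              f"{false_alarms_per_year(n, arl):7.2f}")
```

Read this way, a rule set that would be far too noisy for hourly data (hundreds of false alarms a year) produces less than one false alarm a year on monthly data.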

37

Situational Recommendations

Situation                                        Recommendation
Goal is to Maintain                              Use 3 Std Error Limits and consider Run of 6
Goal is to Improve                               Alternative Rules
Strong Slope in the Metric                       Predictable Slope - Place limits around sloped center line.
Highly Unstable Process - Where’s the Average?   Control Charts will not apply.

38

Change Point Analysis

The general principle is Monte Carlo simulation

Advantages include:

– Very easy to use

– Detects mean and variation changes

– Excellent graphics

39

Change Point Analysis

Confidence Levels for the probability a change is real

Confidence Levels for when the change occurred.

Handles any type of data

More sensitive than control charts

Not confused by outliers
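One common way to carry out this kind of analysis is a CUSUM of deviations from the mean combined with random reordering of the data, as described in the change-point article by Taylor listed in the references. The sketch below assumes that approach and handles only a single shift in the mean. It is not confirmed by the slides to be the exact procedure behind the examples that follow, and the function names and data are hypothetical.

```python
import random


def cusum(values):
    """Cumulative sum of deviations from the overall mean, starting at 0."""
    mean = sum(values) / len(values)
    total, path = 0.0, [0.0]
    for value in values:
        total += value - mean
        path.append(total)
    return path


def change_point(values, bootstraps=1000, seed=0):
    """Return (index of first point after the estimated change, confidence).

    Confidence is the share of random reorderings (sampling without
    replacement) whose CUSUM range is smaller than the observed range."""
    rng = random.Random(seed)
    path = cusum(values)
    observed_range = max(path) - min(path)

    shuffled = list(values)
    smaller = 0
    for _ in range(bootstraps):
        rng.shuffle(shuffled)
        boot = cusum(shuffled)
        if max(boot) - min(boot) < observed_range:
            smaller += 1

    # path[m] is the CUSUM after m points, so the m that is furthest from
    # zero is the 0-based index of the first point after the change.
    start = max(range(len(path)), key=lambda m: abs(path[m]))
    return start, smaller / bootstraps


# Illustrative data only: 12 months near 10, then 12 months near 14.
data = [10.0] * 12 + [14.0] * 12
print(change_point(data))
```

Because the test compares the observed ordering against reorderings of the same values, it makes no assumption about the shape of the distribution, which fits the "handles any type of data" point above.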

40

Change Point Analysis: Example Graph

[Figure: Change-Point Analysis of % of AR$ Past Due. Vertical axis: % of AR$ Past Due (6 to 22). Horizontal axis: Month/Yr, Jan-1990 to Jun-1996.]

41

Change Point Analysis: Example Table

Table of Significant Changes for % of AR$ Past Due

Confidence Level = 90%, Confidence Interval = 95%, Bootstraps = 1000, Sampling Without Replacement

Month/Yr    Confidence Interval     Conf. Level   From     To       Level
Dec-1990    (Nov-1990, Feb-1991)    98%           13.527   16.8     2
Jul-1991    (Jun-1991, Oct-1991)    98%           16.8     13.083   1
Jun-1993    (Feb-1992, Jan-1994)    94%           13.083   11.306   3
Dec-1994    (Sep-1994, May-1995)    100%          11.306   14.392   2

42

Change Point Analysis: Serial Dependency

[Figure: Change Point Analysis of $ Accounts Receivable. Vertical axis: $ Accounts Receivable (220000 to 440000). Horizontal axis: Month/Yr, Jan-1990 to Jun-1996.]

43

Conclusions

Process thinking and careful selection of measures can help keep managers focused

Appropriate plotting is 90% of the battle

Traditional control charts may not be optimal

Alternative special cause rule sets should be considered

Change point analysis may be the closest thing to an all-purpose tool for managers.

44

REFERENCES

1) Stuart J, White K, Methods for Handling Low Frequency Managerial Data, 2002 ASQ Annual Quality Congress Proceedings

2) Balestracci D, Data “Sanity”: Statistical Thinking Applied to Everyday Data, ASQ Statistics Division Special Publication, Summer 1998 (Available through ASQ, www.asqstatdiv.org)

3) Britz G, Emerling D, Hare L, Hoerl R, Shade J, Statistical Thinking, ASQ Statistics Division Special Publication, Summer 1996 (Available through ASQ, www.asqstatdiv.org)

4) Leitnaker MG, Using the Power of Statistical Thinking, ASQ Statistics Division Special Publication, Summer 2000 (Available through ASQ, www.asqstatdiv.org)

5) Wheeler DJ, Understanding Variation: The Key to Managing Chaos, Knoxville, TN: SPC Press, Inc., 1986 (www.spcpress.com)

6) Taylor WA, "Change-Point Analysis: A Powerful New Tool For Detecting Changes," WEB: www.variation.com/cpa/tech/changepoint.html, 2000

7) Wheeler DJ, Building Continual Improvement, Knoxville, TN: SPC Press, Inc., 1998 (www.spcpress.com)
