
VYSOKÁ ŠKOLA BÁŇSKÁ – TECHNICKÁ UNIVERZITA OSTRAVA

FAKULTA METALURGIE A MATERIÁLOVÉHO INŽENÝRSTVÍ

Computer Aided Quality Management II

Study Support

prof. Ing. Jiří Plura, CSc.

Ing. Pavel Klaput, Ph.D.

Ostrava 2015

Jiří Plura, Pavel Klaput Computer-Aided Quality management II


Title: Computer Aided Quality Management II

Code:

Authors: prof. Ing. Jiří Plura, CSc., Ing. Pavel Klaput, Ph.D.

Edition: first, 2015

Number of pages: 50

Academic materials for the Management of industrial systems study programme at the

Faculty of Metallurgy and Materials Engineering.

Proofreading has not been performed.

Execution: VŠB - Technical University of Ostrava



TABLE OF CONTENTS

1. BASICS OF WORKING WITH MINITAB
1.1 CHARACTERISTICS OF THE MINITAB
1.2 OPERATORS AND MINITAB FUNCTIONS
1.3 CREATING NEW VARIABLES
1.4 DATA SORTING
1.5 SORTING DATA (FILTRATION)

2. EXPLORATORY DATA ANALYSIS
2.1 BOXPLOT
2.2 DOTPLOT
2.3 HISTOGRAM
2.4 STEM AND LEAF DISPLAY
2.5 PROBABILITY PLOT
2.6 SCATTERPLOT
2.7 BASIC DESCRIPTIVE STATISTICS
2.8 TESTING NORMALITY
2.9 FINDING A SUITABLE THEORETICAL MODEL OF PROBABILITY DISTRIBUTION

3. APPLICATION OF SELECTED QUALITY MANAGEMENT TOOLS
3.1 CAUSE AND EFFECT DIAGRAM (ISHIKAWA DIAGRAM, FISHBONE DIAGRAM)
3.2 PARETO ANALYSIS
3.3 MEASUREMENT SYSTEM ANALYSIS
3.3.1 Analysis of the linearity and bias of a measurement system
3.3.2 Analysis of the repeatability and reproducibility of a measurement system
3.4 PROCESS CAPABILITY ANALYSIS
3.5 STATISTICAL PROCESS CONTROL



STUDY INSTRUCTIONS

This study support is intended for the subject Computer-Aided Quality Management II, which is taught in the first semester of the follow-up master's study in the branch Quality Management.

PREREQUISITES

Mathematical Statistics, Basic Statistical Methods of Quality Management, Planning Quality I

SUBJECT OBJECTIVE AND LEARNING RESULTS

The objective of the subject is to master advanced applications of computer-aided quality management. Students become acquainted with selected programmes and their capabilities and learn how to solve quality management tasks using the Minitab programme.

AFTER STUDYING THE SUBJECT, STUDENTS SHOULD BE ABLE:

Knowledge results:

• to categorize the types of software used in quality management

• to identify the possibilities of using the Minitab programme in quality management.

Skill results:

• to apply the Minitab programme for processing data

• to apply the Minitab programme in completing tasks in quality management areas

• to interpret the achieved results.

WE RECOMMEND THE FOLLOWING PROCEDURES IN STUDYING EACH CHAPTER:

When working with this study support, it is advisable to proceed in logical sequence and, at the same time, to practise the individual topics in the Minitab programme.

COMMUNICATION METHOD WITH TEACHERS:

The teacher assigns three projects, drawn from selected quality management tasks, to be completed at home using the Minitab programme. The projects will be evaluated within 14 days of submission and the results will be sent to students by email through the IS.

CONSULTATION WILL BE CARRIED OUT WITH THE GUARANTOR OF THE SUBJECT OR LECTURER:

• during consulting hours,

• after making an appointment by email or by telephone.

Guarantor of the subject: prof. Ing. Jiří Plura, CSc.

Lecturer: prof. Ing. Jiří Plura, CSc.

Contacts: [email protected]; [email protected]




INTRODUCTION

Just as requirements for the quality of products and services constantly increase, so do the demands placed on graduates of the Quality Management branch of study. Today's quality managers and specialists in this area cannot do without computer aids, because an important part of their work is the analysis and processing of data about the quality of products and processes, and the application of suitable methods for planning and improving quality and for solving problems.

The subject Computer Aided Quality Management II is one of the important subjects of the follow-up master's study in the branch Quality Management. In it, students become acquainted with the possibilities of computer aids for solving tasks in the field of quality management and learn how to work with selected statistical software for analysing and processing data and for applying selected quality management methods.

Over the last decade, domestic and foreign industrial companies and other organizations have significantly expanded their practical use of the Minitab programme, a modern statistical programme with a very good selection of up-to-date procedures for solving tasks in quality management. This led to the introduction of the programme into the teaching of the subject Computer Aided Quality Management II. This teaching text was prepared to support those lessons and is based on the Minitab programme.



1. BASICS OF WORKING WITH MINITAB

Study time

3 hours

Objective

After studying this chapter you will know how:

to work with the Minitab programme

to use operators and Minitab functions

to organize and filter data.

Lecture

1.1 CHARACTERISTICS OF THE MINITAB

Minitab is a statistical programme used worldwide, developed by the American company Minitab, Inc. It offers a wide range of possibilities for statistical data processing, with special regard to quality management. The programme is currently available in seven language versions (English, French, Spanish, German, Japanese, Korean and Chinese). The appearance of the Minitab screen is shown in Figure 1.1.

Fig. 1.1 The appearance of the Minitab screen.



As in other Windows applications, the upper part of the Minitab screen consists of a pull-down menu, below which is a toolbar whose buttons activate selected functions. Under the toolbar is the results window (Session), where the text results of individual tasks are recorded.

The main part of the Minitab screen is usually occupied by a Worksheet, which represents the currently open data file. Several worksheets can be open at the same time. Columns in a worksheet represent individual variables, which can have a numerical, text or date/time format. The individual cells contain the individual values of the respective variables.

When Minitab is launched, a new, empty worksheet opens automatically. A new worksheet can also be opened at any time using File - New, which offers a new worksheet (Minitab Worksheet) or a new project (Minitab Project). A Minitab worksheet is a table of 4000 columns with a number of rows limited only by the computer's memory. A Minitab project contains an arbitrary number of worksheets, together with the results of all analyses carried out so far.

1.2 OPERATORS AND MINITAB FUNCTIONS

The source data gathered for analysis often do not in themselves provide sufficient information for effective quality management. It is frequently necessary to recalculate them into other variables or to derive various indicators from them. For these calculations Minitab allows various mathematical and logical operations to be carried out using a series of operators and functions. An overview of the most important operators and functions is given in the following text.

Mathematical operators

An overview of mathematical operators is given in Table 1.1.

Tab. 1.1 Minitab mathematical operators.

Operation Symbol

Addition +

Subtraction -

Multiplication *

Division /

To the power of **



Relation operators

An overview of relation operators is given in Table 1.2.

Tab. 1.2 Minitab relation operators.

Operation Symbol

Equal to =

Not equal to <>

Greater than >

Less than <

Greater than or equal to >=

Less than or equal to <=

When a relation expression is true, the result has the value 1; when it is not true, the result has the value 0.

Logic operators

An overview of logic operators is given in Table 1.3.

Tab. 1.3 Minitab logic operators.

Operation Text form Symbol

Conjunction And &

Disjunction Or |

Negation Not ~

Logic operations can be entered in Minitab either in text form or using symbols. In more complicated expressions logical operators are evaluated in the order And, Or, Not. The conjunction of two statements is true only if both statements are true; in all other cases it is not true. The disjunction of two statements is true when at least one of the statements (possibly both) is true. The negation of a statement is true when the statement is not true.
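The 1/0 convention for relation expressions and the behaviour of the three logic operators can be illustrated outside Minitab. The following Python sketch is only an illustration: the helper name `relation` and the sample column `c1` are assumptions made for this example and are not part of Minitab.

```python
def relation(flag: bool) -> int:
    """Return 1 for a true expression and 0 for a false one."""
    return 1 if flag else 0

# Hypothetical column of values
c1 = [4, 12, 20, 7]

# c1 >= 7 evaluated for every value in the column
ge_7 = [relation(x >= 7) for x in c1]

# (c1 >= 7) And (c1 < 20): true only when both statements are true
both = [relation(x >= 7 and x < 20) for x in c1]

# (c1 < 7) Or (c1 >= 20): true when at least one statement is true
either = [relation(x < 7 or x >= 20) for x in c1]

# Not (c1 >= 7): true when the statement is not true
negated = [relation(not (x >= 7)) for x in c1]
```

Each result is a column of ones and zeros of the same length as `c1`, which is also how such an expression behaves when it is stored in a new worksheet column.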

Selected functions of Minitab

a) arithmetic functions

| Function | Description | Example |
| --- | --- | --- |
| ABS(x) | Calculates the absolute value of a number. | ABS(-23,5) = 23,5 |
| FACTORIAL(x) | Calculates the factorial. | FACTORIAL(6) = 720 |
| ROUND(x;n) | Rounds the number x to n decimal places. | ROUND(2,136;2) = 2,14 |



b) functions of the variables (columns)


| Function | Description | Example |
| --- | --- | --- |
| RANK(variable) | States the rank of each value in the given column. | If column c1 contains 2, 5, 7, 9, 4, then RANK(c1) = 1, 3, 4, 5, 2 |
| SORT(variable) | Arranges the numerical values of the given column in increasing order. | If column c1 contains 1, 11, 7, 9, 4, then SORT(c1) = 1, 4, 7, 9, 11 |

c) date and time functions

| Function | Description | Example |
| --- | --- | --- |
| TODAY() | Returns the current date. | TODAY() is, for example, 8.10.2015 |
| NOW() | Returns the current date and time. The format of the result depends on whether it is stored in a worksheet column (date/time format) or in a constant (numerical format). | NOW() is, for example, 8.10.2015 12:35:54 PM |
| DATE("text"), DATE(number) | Converts date and time information in text or numerical format into a date in date format. As in Excel, days are counted from 1.1.1900. | DATE("8.10.11 12:55:35 PM") = 8.10.11; DATE(40824,5386) = 8.10.2011 |
| TIME("text"), TIME(number) | Converts date and time information in text or numerical format into a time in time format. | TIME("8.10.11 12:55:35 PM") = 12:55:35 PM; TIME(0.5386) = 12:55:35 PM |
| WHEN(number of days) | Converts date and time from numerical format into date/time or text information. | WHEN(40824,5386) = 8.10.11 12:55:35 PM |
| NETWORKDAYS(starting date; ending date; holidays) | States the number of working days between two dates. By default, Saturday and Sunday are considered non-working days. Holidays are given in the optional third argument. | If c1 contains 8.10.2011, c2 contains 6.12.2011 and c3 contains 28.10.2011 and 17.11.2011, then NETWORKDAYS(c1;c2;c3) = 40 days |
| WDAY(starting date; number of working days; holidays) | States the ending date for a given starting date and number of working days. | If c1 contains 8.10.2011 and c3 contains 28.10.2011 and 17.11.2011, then WDAY(c1;40;c3) = 6.12.2011 |
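The date examples above can be checked with a short Python sketch. It is an illustration under stated assumptions, not Minitab's implementation: serial day 0 is taken to be 30.12.1899 (the convention under which Excel-style serial numbers map to present-day dates), the end date of the interval is counted, and the function names `when`, `networkdays` and `wday` are hypothetical helpers named after the table rows.

```python
from datetime import date, datetime, timedelta

# Assumption: serial day 0 is 30.12.1899 (Excel-compatible for modern dates).
EPOCH = datetime(1899, 12, 30)

def when(serial):
    """Convert a serial day number to a datetime, rounded to whole seconds."""
    return EPOCH + timedelta(seconds=round(serial * 86400))

def is_working_day(day, holidays):
    """Monday to Friday and not listed as a holiday."""
    return day.weekday() < 5 and day not in holidays

def networkdays(start, end, holidays=()):
    """Count working days from start to end, both inclusive."""
    total = 0
    day = start
    while day <= end:
        if is_working_day(day, holidays):
            total += 1
        day += timedelta(days=1)
    return total

def wday(start, n, holidays=()):
    """Return the date of the n-th working day after start."""
    day = start
    while n > 0:
        day += timedelta(days=1)
        if is_working_day(day, holidays):
            n -= 1
    return day

holidays = {date(2011, 10, 28), date(2011, 11, 17)}
stamp = when(40824.5386)
days = networkdays(date(2011, 10, 8), date(2011, 12, 6), holidays)
end = wday(date(2011, 10, 8), 40, holidays)
```

With these assumptions `stamp` is 8.10.2011 12:55:35, `days` is 40 and `end` is 6.12.2011, in agreement with the examples in the table.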

d) logarithmic functions


| Function | Description | Example |
| --- | --- | --- |
| ANTILOG(x) | Calculates $10^x$. | ANTILOG(3) = 1000 |
| EXP(x) | Calculates $e^x$. | EXP(1) = 2.718281 |
| LOGTEN(x) | Calculates $\log_{10}(x)$. | LOGTEN(1000) = 3 |
| LN(x) | Calculates $\ln(x)$. | LN(2.718281) = 1 |



e) logical functions

| Function | Description | Example |
| --- | --- | --- |
| ANY(variable;value1;value2;value3) | Evaluates whether the individual values of the variable are equal to any of the given values. If so, it returns 1; if not, it returns 0. | ANY(c2,12,20) returns 1 for values of c2 equal to 12 or 20; for other values it returns 0. |
| IF(condition;yes;no) | Evaluates whether the individual values of the variable meet the given condition and returns the corresponding result (yes; no). The third argument is optional. | IF(c1<=10;"conformity";"nonconformity") returns conformity for values of c1 that meet the condition and nonconformity for values that do not. |
| IF(condition1;yes1;condition2;yes2;no) | Extended variant of the IF function: evaluates whether the individual values of the variable meet the given conditions and returns the corresponding result (yes1; yes2; no). | IF(c1<=2, "small", c1<=4, "medium", "high") returns "small" for c1<=2, "medium" for 2<c1<=4 and "high" for other values of c1. |
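The extended IF variant checks its conditions in order and returns the result of the first one that is met. A minimal Python sketch of the two examples above follows; the function name `classify` and the sample columns are assumptions made for illustration only.

```python
def classify(value):
    """Mirror IF(c1<=2,"small",c1<=4,"medium","high"): conditions are checked in order."""
    if value <= 2:
        return "small"
    if value <= 4:
        return "medium"
    return "high"

# Hypothetical column c1
c1 = [1, 2, 3, 4, 5]
labels = [classify(v) for v in c1]

# ANY(c2,12,20): 1 when the value equals one of the listed values, otherwise 0
c2 = [12, 15, 20]
any_flags = [1 if v in (12, 20) else 0 for v in c2]
```

Because the second condition is reached only when the first one fails, `c1<=4` effectively means 2<c1<=4, which is why the boundary value 2 is labelled "small" and 4 is labelled "medium".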

f) statistical functions


| Function | Description | Example |
| --- | --- | --- |
| MEAN(variable) | Calculates the arithmetic mean of the given variable. | If variable c1 contains values 6, 3, 15 then MEAN(c1) = 8 |
| MEDIAN(variable) | Calculates the median of the given variable. | If variable c1 contains values 6, 3, 15 then MEDIAN(c1) = 6 |
| MIN(variable) | Calculates the minimum value of the given variable. | If variable c1 contains values 6, 3, 15 then MIN(c1) = 3 |
| MAX(variable) | Calculates the maximum value of the given variable. | If variable c1 contains values 6, 3, 15 then MAX(c1) = 15 |
| COUNT(variable) | States the number of all observations of the given variable (including missing values). | If variable c1 contains values 6, *, 15 then COUNT(c1) = 3 |
| NMISS(variable) | States the number of missing values of the given variable. | If variable c1 contains values 6, *, 15 then NMISS(c1) = 1 |
| N(variable) | States the number of values of the given variable (without missing values). | If variable c1 contains values 6, *, 15 then N(c1) = 2 |
| PERCENTILE(variable;p) | Calculates the 100p % percentile of the values of the given variable. | If variable c1 contains values 2, 3, 5, 7, then PERCENTILE(c1;0,25) = 2,25 |
| RANGE(variable) | Calculates the range of the values of the given variable. | If variable c1 contains values 6, 3, 15 then RANGE(c1) = 12 |
| STDEV(variable) | Calculates the standard deviation of the given variable. | If variable c1 contains values 6, 3, 15 then STDEV(c1) = 6.245 |
| SUM(variable) | Calculates the sum of all the values of the given variable. | If variable c1 contains values 6, 3, 15 then SUM(c1) = 24 |
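Two of the examples above depend on the exact definition used. The following Python sketch reproduces them under two assumptions that are not stated in the text: the percentile is found by linear interpolation at position p(n + 1) in the sorted data, and the standard deviation is the sample standard deviation with divisor n - 1. The function names are hypothetical.

```python
import math

def percentile(values, p):
    """100p % percentile by linear interpolation at position p * (n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    position = p * (n + 1)
    lower = math.floor(position)
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    fraction = position - lower
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

def stdev(values):
    """Sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

q1 = percentile([2, 3, 5, 7], 0.25)
s = stdev([6, 3, 15])
```

For the values 2, 3, 5, 7 the position is 0,25 · 5 = 1,25, so the result lies a quarter of the way from 2 to 3, that is 2,25. For the values 6, 3, 15 the squared deviations from the mean 8 sum to 78, and the square root of 78/2 is approximately 6.245.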

Note: The corresponding sample characteristics can also be calculated across rows. The names of these functions are formed by prefixing the letter R (for example, RMEAN, RMEDIAN), and the variables (columns) to be included in the calculation are given as arguments.

g) text functions

| Function | Description | Example |
| --- | --- | --- |
| CONCATENATE(variable1;variable2) | Joins the values of two or more variables (columns) and stores them in a new variable. Numerical values are converted to text format. | If variable c1 contains John and variable c2 Smith, then CONCATENATE(c1,c2) returns JohnSmith |
| FIND("text";variable) | States the character position at which the given text string (letter, syllable, word, etc.) begins within the value. If the string is not found, the missing value (*) is returned. Upper and lower case letters are distinguished. | If variable c1 contains the text Product is without defect, then FIND("without defect",c1) returns the value 12. |
| ITEM(variable;n) | Selects the n-th word from the text string in the values of the given variable. | If variable c1 contains the text John Smith, Opava, then ITEM(c1,3) returns Opava |
| LEN(variable) | States the number of characters of the text strings in the values of the given variable. | If variable c1 contains the text "neshoda", then LEN(c1) returns the value 7 |
| MID(variable;n;m) | Returns m characters starting at position n of the text string in the values of the given variable. | If variable c1 contains the text "neshoda", then MID(c1;4;3) returns the value "hod" |
| REPLACE(variable;n;m;"text") | Replaces m characters starting at position n of the text strings in the values of the given variable with the given text. | If variable c1 contains the text Jan Novak, then REPLACE(c1,1,3,"Josef") returns the value Josef Novak |
| RIGHT(variable;n) | Returns the last n characters of the text strings in the values of the given variable. | If variable c1 contains the text "neshoda", then RIGHT(c1,3) returns the value "oda" |
| TEXT(variable) | Converts the values of a numerical variable into text format. | If variable c1 contains the number 1024, then TEXT(c1) returns the text 1024 |
| VALUE(variable) | Converts numerical, date or time information from text format into numerical format. | If variable c1 contains the text 1024, then VALUE(c1) returns the number 1024 |
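The text functions count character positions from 1. The following Python sketch mirrors four of them so that the examples can be verified; the lower-case function names are hypothetical helpers, and `None` merely stands in for Minitab's missing value (*).

```python
def find(text, value):
    """1-based position of text within value; None stands in for a missing value."""
    index = value.find(text)  # case-sensitive search, -1 when the text is absent
    return index + 1 if index >= 0 else None

def mid(value, n, m):
    """m characters starting at 1-based position n."""
    return value[n - 1:n - 1 + m]

def right(value, n):
    """Last n characters of value."""
    return value[-n:] if n > 0 else ""

def replace(value, n, m, text):
    """Replace m characters from 1-based position n with text."""
    return value[:n - 1] + text + value[n - 1 + m:]

position = find("without defect", "Product is without defect")
middle = mid("neshoda", 4, 3)
tail = right("neshoda", 3)
renamed = replace("Jan Novak", 1, 3, "Josef")
```

The only systematic difference from Python's own string handling is the shift by one: Python indexes from 0, so every position argument is reduced by 1 before slicing.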



1.3 CREATING NEW VARIABLES

To create a new variable calculated from existing variables, proceed as follows:

1. On the main menu we select the option Calc – Calculator.

2. In the field Store results in variable we enter either the name of the variable or the designation of the column in which the new variable should be stored.

3. In the field Expression we enter the relation according to which the new variable should be calculated. When entering the expression it is possible to use the on-screen keypad, which offers the most important arithmetic, relation and logical operators, or the list of Minitab functions.

4. If we want the values of the new variable to be recalculated whenever the variables contained in the expression change, we tick the field Assign as a formula.

Standardization of data

Selecting Calc - Standardize in the main menu makes it possible to standardize a selected variable or variables. The entry window also offers some specific methods of calculation. By default, however, the basic calculation is used, in which the difference between each value and the arithmetic mean is divided by the standard deviation (Subtract mean and divide by standard deviation).
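The default standardization can be written out in a few lines. The sketch below is an illustration only; it assumes that the sample standard deviation (divisor n - 1) is used, consistent with the STDEV example, and the function name `standardize` is hypothetical.

```python
import math

def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

z = standardize([6, 3, 15])
```

The standardized values always have a mean of 0 and a standard deviation of 1, which makes variables measured in different units directly comparable.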

New variables can be created not only by means of mathematical or other relations into which the values of existing variables are entered. Other Minitab options can also be used to create them.

Creating an arithmetic sequence of values

An arithmetic sequence of values can be created using Calc - Make patterned data - Simple Set of Numbers. We enter the initial value of the sequence (From first value), the final value of the sequence (To last value) and the difference between two consecutive values, the step (In step of). The entry window also makes it possible to repeat each value of the sequence several times (Number of times to list each value) and to repeat the whole sequence several times (Number of times to list the sequence). An example of the entry window for creating the sequence of odd numbers from 1 to 33 is given in Figure 1.2.



Fig. 1.2 Entry window for creating arithmetic sequence.
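The effect of the two repetition settings is easiest to see on a small example. The following Python sketch is a hypothetical stand-in for the dialog (the function and parameter names are assumptions), restricted to increasing sequences with whole-number steps.

```python
def simple_set(first, last, step=1, each=1, whole=1):
    """Arithmetic sequence; every value is listed `each` times, the sequence `whole` times."""
    sequence = []
    value = first
    while value <= last:
        sequence.extend([value] * each)
        value += step
    return sequence * whole

# Odd numbers from 1 to 33, as in Figure 1.2
odd = simple_set(1, 33, step=2)

# Each value twice, the whole sequence twice
pattern = simple_set(1, 3, each=2, whole=2)
```

Repeating each value is applied first and repeating the whole sequence second, so `pattern` contains 1, 1, 2, 2, 3, 3 followed by the same six values again.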

Creating variables containing arbitrary numbers

A set of arbitrary numbers can be created using Calc - Make patterned data - Arbitrary Set of Numbers. We enter a pattern defining the numbers that the variable should contain and their order (Fig. 1.3). Individual values are separated by a space. If the variable should also contain an arithmetic sequence, it is entered as the initial and final value separated by a colon, followed by the step after a slash (if the step is other than 1). An example of such an entry is 50:52/0,5, which creates the values 50; 50,5; 51; 51,5; 52. As with the arithmetic sequence, it is possible to repeat every value several times (Number of times to list each value) and to repeat the whole set several times (Number of times to list the sequence).

Fig. 1.3 Entry window for creating variables containing arbitrary numbers.
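The pattern notation can be expanded by a small parser. The sketch below is an assumption-laden illustration rather than Minitab's own parser: the function name `expand` is hypothetical, the values 10 and 20 are added only to show single numbers next to a range, and only increasing ranges are handled. Exact fractions are used so that a step such as 0,5 does not accumulate rounding error.

```python
from fractions import Fraction

def expand(pattern):
    """Expand a pattern such as '10 20 50:52/0,5' into a list of numbers."""
    numbers = []
    # Decimal commas are converted to points; values are separated by spaces.
    for token in pattern.replace(",", ".").split():
        if ":" not in token:
            numbers.append(float(token))
            continue
        bounds, _, step_text = token.partition("/")
        first_text, last_text = bounds.split(":")
        first, last = Fraction(first_text), Fraction(last_text)
        step = Fraction(step_text) if step_text else Fraction(1)
        value = first
        while value <= last:
            numbers.append(float(value))
            value += step
    return numbers

values = expand("10 20 50:52/0,5")
short = expand("1:4")
```

When the slash and step are omitted, as in `1:4`, the step defaults to 1.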



Creating text variables

Text variables can be created using Calc - Make patterned data - Text Values. We enter a pattern defining the text strings that the variable should contain and their order (Figure 1.4). As with arithmetic sequences, it is possible to repeat each text string several times (Number of times to list each value) and to repeat the whole sequence of text strings (Number of times to list the sequence).

Fig. 1.4 Entry window for creating text variables.

Creating sequences of date or time information

Variables containing date or time information can be created using Calc - Make patterned data - Simple Set of Date/Time Values. The initial date (or date and time, or time only), the final date, the step and its unit are entered in the entry window. The step unit of the date or time sequence can be days, working days, weeks, months, quarters, hours, minutes, seconds or tenths of seconds. Figure 1.5 shows an example of creating a sequence of dates of working days in the period from 1.1.2011 to 16.1.2011.
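The sequence of working days for this period can be reproduced with a short Python sketch. It assumes that working days are Monday to Friday and ignores public holidays; the function name `working_days` is hypothetical.

```python
from datetime import date, timedelta

def working_days(first, last):
    """Dates from first to last (inclusive) that fall on Monday to Friday."""
    days = []
    day = first
    while day <= last:
        if day.weekday() < 5:  # 0 = Monday, ..., 4 = Friday
            days.append(day)
        day += timedelta(days=1)
    return days

sequence = working_days(date(2011, 1, 1), date(2011, 1, 16))
```

Because 1.1.2011 was a Saturday and 16.1.2011 a Sunday, the sequence starts on 3.1.2011, ends on 14.1.2011 and contains ten dates.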

Creating variables containing arbitrary dates or time data

Variables containing arbitrary date or time information can be created using Calc - Make patterned data - Arbitrary Set of Date/Time Values. We enter a pattern defining the date or time values that the variable should contain and their order (Figure 1.6). Individual values are separated by a space. As in the previous cases, it is possible to repeat each date or time value several times (Number of times to list each value) and to repeat the whole sequence of date or time values (Number of times to list the sequence).



Fig. 1.5 Entry window for creating variables containing the dates of working days.

Fig. 1.6 Entry window for creating variables containing arbitrary date or time values.

1.4 DATA SORTING

In line with one of the principles of current quality management, "Decision-making on the basis of facts", decision-making processes in the field of quality management should be supported to a greater extent by the results of analysing measured data. The collected data are often unarranged and non-homogeneous. That is why attention must be paid to their arrangement and categorization.

Sorting arranges the gathered data in a clear way, so that their information value increases and their analysis becomes easier. The sorting criterion can be, for example, the order in which the data were acquired, the magnitude of chosen variables, or alphabetical order. When sorting data it is very important to be aware that the values of the individual variables in one row correspond to one specific case, so it is not possible to sort the values in one column independently of the values of the other variables. Sorting should always rearrange entire rows according to the values of the chosen variables.

The possibilities of sorting data in Minitab will be illustrated on data gathered at a company operating an Internet bookshop that has three shipping centres. These data are available in the Minitab sample file ShippingData.MTW. The individual variables are:

• Name of the shipping centre (Center)

• Date and time of the order (Order)

• Date and time of arrival (Arrival)

• Time taken to settle the order (Days)

• State of delivery (Status): "On time" means that the delivery arrived on time, "Back order" means that the book is not in stock and "Late" means that the shipment was delivered later than six days after ordering

• Distance of the shipping centre from the delivery destination (Distance).

If, for example, we want to sort these data during analysis according to individual centres and according to the time taken to settle orders, we can proceed as follows:

1. We select the option Data – Sort in the main menu.

2. In the entry window, in the item Sort column(s), we enter all columns that the sorting should apply to.

3. The entry window offers up to four sorting levels. In the items By column we enter, one after another, the columns according to which the selected columns should be sorted.

4. To sort by decreasing values of a selected column, it is necessary to tick Descending, because the default order is increasing (by magnitude or alphabetically).

5. In the item Store sorted data in, one of three possibilities for storing the sorted data can be selected:

• New Worksheet

• Original column(s)

• Column(s) of current worksheet.



The entry window for sorting data according to individual centres (alphabetically) and, on the second level, according to the time taken to settle orders is shown in Figure 1.7.

Fig. 1.7 Entry window for two-level data sorting.
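Two-level sorting of whole rows can be illustrated with a short Python sketch. The rows below are hypothetical values in the spirit of ShippingData.MTW and are not taken from the actual file; only the column meanings (Center, Days, Status) follow the text.

```python
# Hypothetical rows: (Center, Days, Status)
rows = [
    ("Western", 3.2, "On time"),
    ("Central", 4.1, "On time"),
    ("Eastern", 7.7, "Late"),
    ("Central", 2.9, "On time"),
    ("Eastern", 4.4, "On time"),
]

# Level 1: Center (alphabetical); level 2: Days (increasing).
# Whole rows are moved together, so each row still describes one shipment.
by_center_then_days = sorted(rows, key=lambda row: (row[0], row[1]))

# Descending order on the second level only: negate the numeric key.
by_center_then_days_desc = sorted(rows, key=lambda row: (row[0], -row[1]))
```

Sorting by a tuple key compares the first element and uses the second only to break ties, which corresponds to filling in two By column levels in the entry window.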

If we want to preserve the original arrangement of the data but want to know the order of the values according to size, we can proceed as follows:

1. We select the option Data – Rank in the main menu.

2. In the entry window, in the item Rank data in, we enter the column for which the ranks of the values should be determined.

3. In the item Store ranks in, we enter the column in which the resulting ranks should be stored.

Minitab uses a specific approach when some values of a variable are the same: the rank of these tied values is calculated as the average of their positions in the ordered data.
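The averaging of tied ranks can be sketched in Python as follows. The function name `average_ranks` and the second sample are assumptions made for illustration; the first sample is the RANK example from the function overview.

```python
def average_ranks(values):
    """Rank values in increasing order; tied values share the average of their positions."""
    ordered = sorted(values)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

ranks_unique = average_ranks([2, 5, 7, 9, 4])
ranks_tied = average_ranks([3, 5, 5, 8])
```

In the second sample the two values 5 occupy positions 2 and 3 in the ordered data, so both receive the rank 2,5, while the original order of the data is left unchanged.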

1.5 SORTING DATA (FILTRATION)

During data analysis it is very often necessary to select homogeneous data, that is, data gained under comparable conditions. Homogeneous data can be obtained by filtering the original data. Filtering the data into partial files corresponding to specific conditions significantly increases the possibilities of analysing the gathered data.



Minitab makes it possible to filter data using various methods. The original worksheet can be divided into partial worksheets (Split Worksheet), or it can be used as a base for creating partial files that meet given conditions (Subset Worksheet).

Split Worksheet

To divide a worksheet into partial worksheets, proceed as follows:

1. Select the option Data – Split Worksheet.

2. In the entry window, in the item By variables, we enter the variable according to which the worksheet should be divided.

3. The original worksheet is divided into partial worksheets (the original worksheet is retained). Their number corresponds to the number of distinct values of the variable according to which the worksheet is divided.
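Splitting a worksheet corresponds to grouping rows by the distinct values of one variable. The Python sketch below illustrates the idea on hypothetical rows (not the contents of ShippingData.MTW); the function name `split_worksheet` is an assumption.

```python
from collections import defaultdict

# Hypothetical rows; the real file values are not reproduced here.
rows = [
    {"Center": "Western", "Distance": 120},
    {"Center": "Eastern", "Distance": 210},
    {"Center": "Western", "Distance": 310},
    {"Center": "Central", "Distance": 95},
]

def split_worksheet(rows, by):
    """Group rows by the distinct values of one variable; the input list is left unchanged."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[by]].append(row)
    return dict(parts)

parts = split_worksheet(rows, "Center")
```

Three distinct centre names give three partial data sets, and the original list `rows` still contains all four rows.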

Creating subset worksheets

To create a subset worksheet, proceed as follows:

1. Select the option Data – Subset Worksheet in the main menu.

2. In the entry window, under Include or exclude, we select whether the rows to be specified are those that should be kept or those that should be excluded.

3. In the entry window, under Specify Which Rows to Include/Exclude, we select one of three possibilities:

• Rows that match condition

• Brushed rows

• Row numbers.

4. We complete the entry according to the chosen possibility. A new worksheet is created that contains only the selected data.

Suppose that, from the entire data file, we want to analyse in detail the delivery of books from the Western centre to destinations less than 150 km away. In the entry window we then choose the selection of rows that match a condition and record the condition corresponding to this requirement (Figures 1.8 and 1.9).



Fig. 1.8 Entry window for creating subset worksheet.

Fig. 1.9 Entry window for recording conditions.
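The condition used in this example combines two relation expressions with the And operator. A minimal Python sketch of the same selection follows; the rows are hypothetical, and the names `subset` and `western_near` are assumptions made for illustration.

```python
# Hypothetical rows; the real file values are not reproduced here.
rows = [
    {"Center": "Western", "Distance": 120},
    {"Center": "Eastern", "Distance": 90},
    {"Center": "Western", "Distance": 310},
    {"Center": "Central", "Distance": 95},
]

def subset(rows, condition, include=True):
    """Keep rows that match the condition (include=True) or drop them (include=False)."""
    return [row for row in rows if condition(row) == include]

def western_near(row):
    """Center = "Western" And Distance < 150."""
    return row["Center"] == "Western" and row["Distance"] < 150

kept = subset(rows, western_near)
dropped = subset(rows, western_near, include=False)
```

The `include` flag plays the role of the Include or exclude choice: the same condition yields either the matching rows or all remaining rows.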

Concept summary

Minitab Worksheet – a worksheet containing the individual variables in columns.

Minitab Project – a set of worksheets, including the results of the analyses carried out so far.

Session Window – the application window in which the text results of solved tasks are stored.

Data Sorting – arranging data in selected columns according to selected variables.



Questions

1. Using which function is it possible to find an arbitrary quantile of a certain variable?

2. What types of conditions are possible to use for creating partial worksheets?

3. What operations represent the standardization of data?

4. How is it possible to create a variable, which connects the text data of two original

variables?



2. EXPLORATORY DATA ANALYSIS

Study time

5 hours

Objective

After studying this chapter you will be able:

to use tools for exploratory data analysis

to identify the sample characteristics of the analysed variables

to test normality and to use goodness of fit tests

to interpret the analysis results.

Lecture

Each data file can be described in two ways: using graphical methods and using numerical characteristics. Both the graphical and the numerical analysis of source data can provide a great deal of valuable information about the monitored products, processes or services. Some graphical tools belong to the seven basic quality management tools, which have an irreplaceable place in planning, control and process improvement.

The basic objective of the graphical analysis of data is to provide clear basic information about the data: their sample characteristics, their distribution, the occurrence of outliers, dependencies between data, and so on. The conclusions drawn from the graphical analysis are then verified using suitable numerical methods. The basic tools of exploratory data analysis are:

• Box Plot

• Dot Plot

• Histogram

• Stem and Leaf Display

• Probability Plot

• Scatterplot.



2.1 BOXPLOT

A boxplot is used for evaluating the occurrence of outliers and the symmetry of a given distribution. Its construction is simple and consists of the following steps:

a) Calculating the quantiles x25, x50 and x75 of the given variable.

b) Calculating the length of the rectangle (box): Rk = x75 – x25 (quartile range).

c) Determining the borders for identifying outliers:

A = x25 – 1.5 Rk

B = x75 + 1.5 Rk

d) Displaying the boxplot.

e) Evaluating symmetry and identifying outliers. Values lying below A and values lying above B are considered outliers and are displayed as separate points.
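Steps a) to c) and the identification of outliers in step e) can be sketched in a few lines of Python. This is only an illustration outside Minitab: the delivery times are hypothetical, and the quartiles are taken at positions p(n + 1), which is an assumption about the quantile method and may differ from other software.

```python
import statistics

def boxplot_limits(values):
    """Return the quartiles, the quartile range and the outlier borders A and B."""
    # Assumption: quartiles at positions p*(n + 1), i.e. method="exclusive".
    x25, x50, x75 = statistics.quantiles(values, n=4, method="exclusive")
    rk = x75 - x25                      # quartile range (length of the box)
    a = x25 - 1.5 * rk                  # lower border A
    b = x75 + 1.5 * rk                  # upper border B
    outliers = [v for v in values if v < a or v > b]
    return {"x25": x25, "x50": x50, "x75": x75,
            "Rk": rk, "A": a, "B": b, "outliers": outliers}

# Hypothetical delivery times in days
days = [2.1, 2.8, 3.0, 3.2, 3.4, 3.5, 3.7, 3.9, 4.1, 4.4, 7.9]
result = boxplot_limits(days)
print(result)
```

For these values the only point outside the borders A and B is the delivery time of 7.9 days.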

To create a boxplot in Minitab, we proceed as follows:

1. In the main menu we select the option Graph - Boxplot.

2. In the entry window we select one of four boxplots: a simple (Simple) or grouped (With Groups) boxplot for one variable (One Y), or the same two possibilities for multiple variables (Multiple Y’s).

3. In the window Graph variables we enter the analyzed variable, and in the window Categorical Variables we can enter a variable according to which the analyzed variable will be divided into groups.

4. The option Data View offers display settings, for example marking the outliers (Outlier symbols) and marking the mean (Mean symbol).

A model example of boxplots of shipment delivery times for the individual shipping centres is shown in Figure 2.1. The boxplots show that the Eastern shipping centre has one outlier, which represents a delivery time of almost 8 days. This centre also has the longest delivery time on average.



Fig. 2.1 A comparison of delivery times for individual centres using boxplots.

2.2 DOTPLOT

A dotplot (diagram of individual values) is a simple graphic summary of data in which each observed value is represented by a dot placed along the x axis. The dotplot shows how often the individual values occur and points out unusual values or outliers. In Minitab, a dotplot is created in the following way:

1. In the main menu we select the option Graph - Dotplot.

2. In the entry window we select from three kinds of dotplots for a single variable (One Y) and four kinds of dotplots for multiple variables (Multiple Y’s).

3. In the window for selecting variables (Graph variables) we select the variable for which we want to create the dotplot.

4. If we want to use the dotplot to compare a variable divided according to some categorical variable, we open Multiple Graphs and enter the categorical variable into the item By variables with groups in separate panels. If we want to create several separate dotplots, we enter this variable into the item By variables with groups on separate graphs.

An example of a multiple dotplot comparing the shipment delivery times of the individual shipping centres is shown in Figure 2.2. The dotplot shows that the Western shipping centre achieved shorter delivery times than the other two centres.



Fig. 2.2 Dotplots.

2.3 HISTOGRAM

A histogram is a graphic display of the frequencies of data in selected intervals. In quality management it is one of the basic graphic tools for data analysis. For example, it enables us to evaluate the distribution of the monitored quality characteristic, to identify unusual causes influencing the process, or to evaluate process capability.

A histogram is a column chart with columns mostly of the same width, where the width of each column corresponds to the interval width h and the height of the column expresses the frequency of values in the given interval. Each interval is defined by its lower and upper boundary. To create a histogram in Minitab, we proceed as follows:

1. In the main menu we choose the option Graph - Histogram.

2. To make the distribution of the monitored quality characteristic easier to evaluate, we select in the entry window the histogram with the fitted probability density curve (With Fit).

3. In the window for the variable (Graph variables) we select the variable for which we want to create the histogram.
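The idea of interval frequencies behind the histogram can be illustrated with a short Python sketch. The data, the interval width h and the starting boundary are hypothetical, and each interval is assumed to include its lower boundary and exclude its upper one; Minitab chooses its own intervals.

```python
import math

def interval_frequencies(values, h, start):
    """Count the values falling into the intervals [start + k*h, start + (k+1)*h)."""
    counts = {}
    for v in values:
        k = math.floor((v - start) / h)     # index of the interval containing v
        lower = start + k * h               # lower boundary of that interval
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))     # empty intervals are not listed

# Hypothetical delivery times in days
days = [2.1, 2.8, 3.0, 3.2, 3.4, 3.5, 3.7, 3.9, 4.1, 4.4, 7.9]
h = 1.0
freq = interval_frequencies(days, h=h, start=2.0)
for lower, count in freq.items():
    print(f"{lower:.1f} to {lower + h:.1f}: {'*' * count}")
```

Each printed row of stars plays the role of one column of the histogram.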

An example of a histogram of shipment delivery times is given in Figure 2.3. The histogram suggests that the distribution of the variable Days could approximately correspond to the normal distribution.



Fig. 2.3 Histogram.

2.4 STEM AND LEAF DISPLAY

The stem and leaf display is a numerical illustration of the frequency distribution of the monitored quality characteristic that presents all the values. As in a histogram, the length of each row corresponds to the number of values falling into the given interval. While in a histogram all statistical units are marked in the same way (by a hatched area), in this numerical histogram each unit is represented by a symbol corresponding to its observed value. This is achieved by splitting each observed value into two components: the leading digit is called the "stem" and the following digit is called the "leaf". For example, the number 75 has stem 7 and leaf 5. If the values are three-digit numbers, the stem represents hundreds, the leaf represents tens, and the last digit is not considered. We proceed similarly if the values have even more digits.
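The splitting of values into stems and leaves can be sketched as follows. The function and its parameter leaf_unit are hypothetical names introduced for illustration, the values are made up, and the sketch assumes non-negative numbers.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Group values into stems and leaves; digits below leaf_unit are dropped."""
    rows = defaultdict(list)
    for v in sorted(values):
        scaled = int(v // leaf_unit)        # drop the digits that are not displayed
        stem, leaf = divmod(scaled, 10)     # leading part and last displayed digit
        rows[stem].append(leaf)
    return dict(rows)

# Two-digit values: 75 has stem 7 and leaf 5.
two_digit = stem_and_leaf([75, 62, 68, 71, 75, 83])
print(two_digit)

# Three-digit values with leaf_unit=10: 256 has stem 2 and leaf 5, the 6 is dropped.
three_digit = stem_and_leaf([256, 249, 312], leaf_unit=10)
print(three_digit)
```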

The procedure of designing a stem and leaf display is the following:

1. In the main menu we select the option Graph - Stem and Leaf.

2. In the window for variable selection (Graph variables) we select the variable for which we want to create the stem and leaf display.

3. After confirmation, the resulting stem and leaf display appears in the Session window. The first column of the display contains the cumulative frequency or, in one row, the absolute frequency. From the smallest values up to the interval containing the median, it is the increasing cumulative frequency. For the interval containing the median, the frequency is given in brackets and is the absolute frequency of that interval. For the remaining intervals it is again a cumulative frequency, this time counted from the maximal values down to the interval containing the median.
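The rule for the first column can be illustrated by a small Python sketch that takes the absolute frequencies of the rows (ordered from the lowest stem) and returns the entries of the first column. The row frequencies are hypothetical, and the handling of the case where the median falls between two rows (no bracketed row) is an assumption.

```python
def depth_column(row_counts):
    """Return the first-column entries for rows ordered from lowest to highest stem."""
    n = sum(row_counts)
    from_top, running = [], 0
    for c in row_counts:
        running += c
        from_top.append(running)            # cumulative frequency from the smallest values
    from_bottom, running = [], 0
    for c in reversed(row_counts):
        running += c
        from_bottom.append(running)         # cumulative frequency from the largest values
    from_bottom.reverse()

    column = []
    for i, c in enumerate(row_counts):
        below = from_top[i] - c             # number of values in the rows before this one
        above = from_bottom[i] - c          # number of values in the rows after this one
        if below < n / 2 and above < n / 2:
            column.append(f"({c})")         # row containing the median: absolute frequency
        elif from_top[i] <= n / 2:
            column.append(str(from_top[i]))
        else:
            column.append(str(from_bottom[i]))
    return column

depths = depth_column([2, 3, 5, 4, 1])      # hypothetical row frequencies, n = 15
print(depths)
```

For these frequencies the third row contains the median, so its entry is shown in brackets.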

An example of a stem and leaf display, illustrating information about distances to

which the shipment was delivered, is shown in Figure 2.4.

Fig. 2.4 Stem and Leaf Display.

2.5 PROBABILITY PLOT

The probability plot is used to compare the distribution of the analyzed variable with a theoretical probability distribution, that is, to assess whether the analyzed data can be regarded as a sample from a certain probability distribution. The plot shows the relation between the quantiles of the analyzed variable and the quantiles of the considered probability distribution. The normal distribution is used most often for this comparison. The probability plot is interpreted as follows: the closer the points lie to the plotted straight line, the more similar the two distributions are. A conclusion drawn from this graphic tool must be confirmed by a suitable goodness-of-fit test.
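The relation between the quantiles of the data and the quantiles of the normal distribution can be sketched in Python as follows. The data are hypothetical, and the plotting positions (i - 0.3)/(n + 0.4) are an assumption (median ranks); other software may use different positions. The correlation of the plotted points is used here only as a rough numerical measure of how straight the plot is.

```python
from statistics import NormalDist, mean

def normal_plot_points(values):
    """Pair each ordered value with the matching standard normal quantile."""
    data = sorted(values)
    n = len(data)
    points = []
    for i, x in enumerate(data, start=1):
        p = (i - 0.3) / (n + 0.4)           # assumed plotting position (median rank)
        z = NormalDist().inv_cdf(p)         # theoretical quantile for that probability
        points.append((z, x))
    return points

def straightness(points):
    """Correlation of the plotted points; values near 1 mean a nearly straight line."""
    zs = [z for z, _ in points]
    xs = [x for _, x in points]
    mz, mx = mean(zs), mean(xs)
    sxy = sum((z - mz) * (x - mx) for z, x in points)
    szz = sum((z - mz) ** 2 for z in zs)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / (szz * sxx) ** 0.5

distance = [212, 236, 251, 259, 268, 274, 281, 290, 303, 327]   # hypothetical values
r = straightness(normal_plot_points(distance))
print(round(r, 3))
```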

We design the probability plot according to the following procedure:

1. In the main menu we select the option Graph - Probability Plot.

2. In the entry window we select one of two types of probability plot: simple (Simple) or multiple (Multiple). In the window for variable selection (Graph variables) we select the variable for which the probability plot is to be created.

3. The probability distribution with which we want to compare the selected variable can be chosen in the option Distribution. By default, the variable is compared with the normal distribution.

An example of a probability plot for normal distribution for the distances the shipment

was delivered to, is shown in Figure 2.5.

Fig. 2.5 Probability plot for normal distribution.

The plot shows that the values of the variable Distance follow the straight line corresponding to the normal distribution, so the variable can be regarded as normally distributed. In the upper right part of the plot, besides the mean and the standard deviation, we also find the results of the Anderson-Darling normality test. We compare the p-value with the level of significance α = 0.05. In our case the p-value = 0.734, which is greater than 0.05, so the hypothesis that the variable Distance comes from a normal distribution is not rejected.

2.6 SCATTERPLOT

The scatterplot is a graphic tool for analyzing the dependence between two random variables. It provides the first information about the existence of a stochastic dependence, its shape and the strength of the correlation.

To make a scatterplot, we first select the independent variable X and the dependent variable Y. We then measure a sufficient number of pairs of values (Xi, Yi) of the independent and dependent variables. From the measured values we create the scatterplot by plotting each pair (Xi, Yi) in the rectangular coordinate system (X, Y).



By analyzing this diagram we gain the first information about the relationship between the given variables. To describe the dependence better, we can apply regression analysis to find a suitable regression function.
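As a numerical complement to the scatterplot, the strength of a linear dependence and a straight regression line can be computed as in the following sketch. The pairs (Xi, Yi) are hypothetical, and a straight line is only one possible regression function.

```python
from statistics import mean

def correlation_and_line(xs, ys):
    """Return the Pearson correlation r and the least-squares line y = b0 + b1*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    r = sxy / (sxx * syy) ** 0.5            # correlation coefficient
    b1 = sxy / sxx                          # slope of the regression line
    b0 = my - b1 * mx                       # intercept of the regression line
    return r, b0, b1

# Hypothetical pairs (Xi, Yi)
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
r, b0, b1 = correlation_and_line(x, y)
print(f"r = {r:.3f}, y = {b0:.2f} + {b1:.2f} x")
```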

In the case of designing a scatterplot diagram, we proceed in the following way:

1. In the menu we select the option Graph - Scatterplot.

2. In the entry window we select from the available possibilities a simple scatterplot

(Simple). It is further possible to select for example a scatterplot with a regression curve

(With Regression) or a scatterplot, whose points are connected (With Connect Line).

3. In the entry window for a single scatterplot we select dependent variable (Y variables)

and independent variable (X variables).

An example of a scatterplot analyzing the dependence of delivery time on distance is shown in Fig. 2.6.

The placement of the individual points indicates that the delivery time does not depend on the distance to which the shipment is delivered.

Fig. 2.6 Example of a scatterplot.



2.7 BASIC DESCRIPTIVE STATISTICS

Basic sample statistics provide valuable information about the monitored quality characteristics. Most of these statistics are also used in various graphic tools and in hypothesis testing.

In Minitab it is possible to evaluate the sample statistics given in Table 2.1.

Tab. 2.1 The evaluated sample statistics.

Statistic                  Description
Mean                       Arithmetic mean of the given variable.
SE of mean                 Standard error of the mean.
Standard deviation         Sample standard deviation of the given variable.
Variance                   Sample variance of the given variable.
Coefficient of variation   Standard deviation expressed as a percentage of the mean.
Trimmed mean               Mean of the middle 90% of the values.
Sum                        Sum of all values of the given variable.
Minimum                    Minimum value of the given variable.
Maximum                    Maximum value of the given variable.
Range                      Range of the given variable.
N nonmissing               Number of observations of the given variable (without missing values).
N missing                  Number of missing values of the given variable.
N total                    Total number of values of the given variable.
Cumulative N               Cumulative frequency.
Percent                    Relative frequency in %.
Cumulative percent         Cumulative relative frequency in %.
First quartile             Lower quartile of the given variable.
Median                     Median of the given variable.
Third quartile             Upper quartile of the given variable.
Interquartile range        Quartile range of the given variable.
Mode                       Mode of the given variable.
Sum of squares             Sum of squares of differences between individual values and average.
Skewness                   Skewness of the given variable.
Kurtosis                   Kurtosis of the given variable.
MSSD                       Mean of the squared successive differences.
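Several of the statistics from Table 2.1 can be reproduced with the Python standard library, as in the following sketch. The data are hypothetical, and the trimmed mean assumes that 5% of the values (rounded to a whole number) are removed from each end.

```python
import statistics

def describe(values):
    """Compute a few of the statistics listed in Table 2.1."""
    data = sorted(values)
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)             # sample standard deviation (divisor n - 1)
    k = round(0.05 * n)                     # values cut from each end (assumed rounding)
    return {
        "Mean": mean,
        "SE of mean": sd / n ** 0.5,
        "Standard deviation": sd,
        "Variance": statistics.variance(data),
        "Coefficient of variation": 100 * sd / mean,
        "Trimmed mean": statistics.mean(data[k:n - k]),
        "Range": data[-1] - data[0],
        "Median": statistics.median(data),
    }

values = list(range(1, 20)) + [100]         # hypothetical data with one extreme value
stats = describe(values)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

The extreme value 100 shifts the mean noticeably, while the trimmed mean and the median are not affected by it.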

In Minitab the basic sample statistics of given variables can be obtained in three ways. If we do not want to use the values for further calculations, we select the option Stat - Basic Statistics - Display Descriptive Statistics. The values of the sample statistics are then displayed in the Session window. If we want to work further with the values, we choose the option Stat - Basic Statistics - Store Descriptive Statistics, which stores the values of the statistics directly in the worksheet as new variables. The last option, Stat - Basic Statistics - Graphical Summary, displays some sample statistics together with a histogram, a boxplot and confidence intervals for the mean and the median. In this option the sample statistics cannot be chosen freely; a fixed set of them is given.

2.8 TESTING NORMALITY

The distribution of the monitored quality characteristic can be analyzed using various graphic tools and using tests.

The graphic tools include especially the histogram, the boxplot and the probability plot. These graphs, presented above, only enable an approximate evaluation, so the results must be supplemented with the results of numerical normality tests. Minitab offers three tests of normality:

a) Anderson-Darling test

b) Ryan-Joiner test

c) Kolmogorov-Smirnov test.

For all three tests the following hypotheses are formulated:

Null hypothesis H0: x1, x2, …, xn is a random sample from the distribution N(μ, σ²).

Alternative hypothesis H1: x1, x2, …, xn is not a random sample from the distribution N(μ, σ²).

In the standard procedure of hypothesis testing, a test statistic is calculated. If the null hypothesis is valid, the distribution of this test statistic corresponds to a certain theoretical distribution. The value of the test statistic is then compared with the critical value of this distribution for the chosen level of significance. The result of this comparison determines whether, at the chosen level of significance, the null hypothesis is not rejected, or is rejected in favour of the alternative hypothesis.

Like other statistical programmes, Minitab reports a p-value for deciding whether the null hypothesis is rejected. The p-value is the smallest level of significance at which the null hypothesis H0 would still be rejected for the given sample. If the p-value is greater than or equal to the chosen level of significance, we do not reject the null hypothesis. If the p-value is less than the chosen level of significance, we reject the null hypothesis and accept the alternative hypothesis. The level of significance is usually chosen as α = 0.05.
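The decision based on the p-value can be illustrated with a sketch of the Anderson-Darling normality test in Python. The data are hypothetical. The statistic follows the usual Anderson-Darling formula with the mean and standard deviation estimated from the sample; the p-value uses a commonly published approximation (attributed to D'Agostino and Stephens), which is an assumption here and may differ slightly from the values Minitab reports.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(values):
    """Return the AD statistic and an approximate p-value for H0: normal distribution."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))      # parameters estimated from the sample
    cdf = [fitted.cdf(x) for x in data]
    s = sum((2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1 - cdf[n - i]))
            for i in range(1, n + 1))
    ad = -n - s / n
    adj = ad * (1 + 0.75 / n + 2.25 / n ** 2)         # small-sample adjustment
    # Assumed p-value approximation; the coefficients are not taken from this text.
    if adj >= 0.6:
        p = math.exp(1.2937 - 5.709 * adj + 0.0186 * adj ** 2)
    elif adj > 0.34:
        p = math.exp(0.9177 - 4.279 * adj - 1.38 * adj ** 2)
    elif adj > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * adj - 59.938 * adj ** 2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * adj - 223.73 * adj ** 2)
    return ad, p

alpha = 0.05
distance = [212, 236, 251, 259, 268, 274, 281, 290, 303, 327]   # hypothetical values
ad, p_value = anderson_darling_normal(distance)
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"AD = {ad:.3f}, p-value = {p_value:.3f}: {decision}")
```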

We proceed in the following way in the case of normality testing:

1. In the main menu we select the option Stat - Basic Statistics - Normality Test.



2. In the Variable window we enter the variable whose normality we want to test. The Anderson-Darling test is selected by default.

3. Using the displayed probability plot we can assess whether the values of the selected variable follow the straight line of the normal distribution, and thus whether the variable can be regarded as normally distributed. Besides the mean and the standard deviation, the right corner of the plot also contains the results of the Anderson-Darling test: the value of the test statistic (AD) and the p-value (see Fig. 2.5).

2.9 FINDING A SUITABLE THEORETICAL MODEL OF PROBABILITY DISTRIBUTION

If the distribution of the data cannot be approximated by the normal distribution, we look for another suitable probability distribution. In Minitab we proceed in the following way:

1. In the main menu we select the option Stat - Quality Tools - Individual Distribution

Identification.

2. In the entry window, in the item Data are arranged as, we select Single column if all data of the given variable are in one column. To test all 14 distributions that the programme offers, we leave the option Use all distribution selected.

3. The numerical results displayed in the Session window are divided into three parts. The first part, Descriptive Statistics, contains the basic sample characteristics of the evaluated variable and the value of the lambda parameter of the Box-Cox transformation.

The second part contains the results of the Anderson-Darling goodness-of-fit tests for the 14 probability distributions (see Fig. 2.8). Column AD gives the value of the Anderson-Darling test statistic and column P gives the p-value, which we compare with the level of significance α. We test the hypotheses:

H0: the data correspond to the given distribution

H1: the data do not correspond to the given distribution.

If the p-value is greater than or equal to the selected level of significance, we do not reject hypothesis H0. The last column, LRT P, contains the results of the likelihood ratio test, which helps us to evaluate whether a distribution with more parameters describes the data better than the corresponding distribution with fewer parameters. If LRT P ≤ 0.05, the distribution with more parameters is more suitable.



Let´s suppose that our task is to find a suitable model of probability distribution for a

variable stored in the column Days. The entry window and analysis results are found in

Figures 2.7 to 2.9.

Fig. 2.7 Entry window for determining a suitable theoretical model of probability

distribution.

Fig. 2.8 The results of goodness-of-fit tests for various theoretical probability distributions.



Fig. 2.9 Probability plots for a selected distribution.

In our case the p-value is highest for the Weibull distribution. Even this value does not exceed 0.05, so at the level of significance of 0.05 we conclude that the data do not come from any of the tested distributions. The table of results is supplemented by probability plots for all tested distributions.

Concept Summary

Boxplot – a diagram enabling us to evaluate the symmetry of the distribution of the observed characteristic and to identify outliers.

Histogram – a diagram of the frequency distribution of the given characteristic in suitably chosen intervals.

Dotplot – a graphic illustration of the distribution of the individual values of the observed characteristic.

Stem and Leaf – a numerical diagram of the distribution of the individual values of the observed characteristic.

Probability Plot – a diagram enabling a graphic assessment of the agreement between the distribution of the observed characteristic and a selected theoretical probability distribution.



Questions

1. Through which method are outliers identified in a boxplot?

2. What values are given in the first column of stem and leaf display?

3. What is the basis for deciding whether to reject the null hypothesis in normality testing?

4. What is the variation coefficient and when is it used?



3. APPLICATION OF SELECTED QUALITY MANAGEMENT

TOOLS

Study time

6 hours

Objective

After studying this chapter you will know how to:

use selected quality management tools

interpret the achieved results.

Lecture

3.1 CAUSE AND EFFECT DIAGRAM (ISHIKAWA DIAGRAM, FISHBONE

DIAGRAM)

The cause and effect diagram is an important graphic tool for analyzing all possible causes of a given effect (a quality problem). It is also called the Ishikawa diagram, after the Japanese specialist Kaoru Ishikawa, who first used it, or the fishbone diagram, due to its shape. Its use represents a systematic approach to problem solving.

To create a cause and effect diagram in Minitab, we select the path Stat – Quality Tools – Cause-and-Effect. In the entry window (see Fig. 3.1) we record the problem being solved (Effect) and state the categories of its possible causes (Label). The possible causes of the problem are then analyzed and recorded under the corresponding category in the column Causes. Individual causes are separated by a space; a multi-word expression must be enclosed in quotation marks. Individual causes can be analyzed further into subcauses, which are filled in in a separate window after clicking the button Sub… An example of the resulting cause and effect diagram is shown in Figure 3.2.



Fig. 3.1 Entry window for a cause and effect diagram.

Fig. 3.2 An example of a cause and effect diagram.

3.2 PARETO ANALYSIS

Pareto analysis is an important tool of managerial decision-making, because it enables us to set priorities in solving quality problems so that the maximal effect is attained with a purposeful use of resources. Its basic tool is the Pareto diagram, which is also very suitable for a graphic presentation of the main causes of a problem.



Pareto analysis is based on the Pareto principle, which J. M. Juran formulated as follows: "Most quality problems (80 to 95%) are caused by only a small share (5 to 20%) of the causes that contribute to them".

The individual causes must be understood here in a wider sense. They stand for any partial "bearers of shortcomings": not only the individual causes of nonconformities, but also individual nonconformities, individual products, individual pieces of production equipment, individual workers and so on. Applying the Pareto principle, we can state, for example, that only a certain group of products from the whole production programme, only some of the occurring nonconformities, only some of the influencing causes, only some of the production equipment used, or only some of the workers who influence product quality have a decisive share in the problems that arise. Identifying these causes is very important for setting priorities in solving a problem.

This small group of causes is called the "vital few", while the remaining part is called the "useful many". A Pareto diagram makes it possible to identify the "vital few", which enables us to focus resources on eliminating the causes that contribute most to the analyzed problem.

To apply the Pareto analysis in the Minitab programme proceed in the following way:

1. In the main menu select the option Stat – Quality Tools – Pareto Chart.

2. In the entry window, in the field Defects or attribute data in, we enter a variable containing the source data (the individual occurrences of causes) or a variable containing the individual types of causes.

3. If in the preceding step we entered a variable containing the individual types of causes, then in the field Frequencies in we enter a variable containing the frequencies of the individual types of causes (or, if necessary, the costs connected with the occurrence of the individual types of causes, e.g. nonconformities).

4. If we want to divide the input data into groups, for example by shift, we enter in the field By variable in a variable indicating membership of a group (this can only be used in combination with source data).

5. By default, Minitab combines the causes (defects) that remain after the largest contributors have reached a cumulative share of 95% into one category (Combine remaining defects into one category after this percent – 95). This percentage can be changed, or the combining of the smallest items can be switched off (Do not combine).

6. We design a Pareto diagram and carry out the analysis.
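The calculation behind a Pareto diagram can be sketched in Python as follows. The defect names and counts are hypothetical, the rule for combining the remaining items after 95% is a simplified assumption about how the option works, and the 80% threshold used to pick the "vital few" is only one possible reading of the Pareto principle.

```python
def pareto_table(counts, combine_after=95.0):
    """Sort causes by frequency and add cumulative percentages."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    table, cumulative, other = [], 0, 0
    for cause, count in ordered:
        if table and 100 * cumulative / total >= combine_after:
            other += count                  # remaining small causes go into "Other"
            continue
        cumulative += count
        table.append((cause, count, 100 * cumulative / total))
    if other:
        table.append(("Other", other, 100.0))
    return table

# Hypothetical nonconformities and their frequencies
defects = {"Scratch": 50, "Missing screw": 30, "Loose part": 10,
           "Wrong label": 5, "Dent": 3, "Stain": 2}
table = pareto_table(defects)

vital_few = []
for cause, count, cumulative_percent in table:
    vital_few.append(cause)
    if cumulative_percent >= 80:            # assumed threshold for the "vital few"
        break

for row in table:
    print(row)
print("Vital few:", vital_few)
```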



We will illustrate the approach with an example evaluating the occurrence of nonconformities in the production of motorcycle speedometers. The types of nonconformities and the frequencies of their occurrence can be found in the variables Defects and Counts in the worksheet Exh_qc.MTW. The entry window is given in Figure 3.3. The Pareto diagram in Figure 3.4 shows that the two most frequent nonconformities account for almost 80 percent (78.7%) of the total occurrence of defects, so they can be ranked among the “vital few” causes of the problem with nonconformities. The figure also shows that the least frequent defects were combined into one item, Other.

A certain shortcoming of the Pareto diagram offered by Minitab is the incorrect position of the Lorenz curve: its points should lie not above the centres of the individual columns but at their right borders (the curve should begin at the upper right corner of the first column).

Fig. 3.3 Entry window for the Pareto analysis.

3.3 MEASUREMENT SYSTEM ANALYSIS

The main objective of measurement system analysis is to quantify the variability of the measurement system. If this variability is not known, the variability of the production process can be overestimated. Well-meant interventions into the production process based on such a mistaken conclusion may be bad decisions with considerable financial effects. Separating the variability of the measurement system from the variability of the production process itself is a basic condition for correct decision-making in process control.



Before describing the basic methods for evaluating a measurement system, it is necessary to present its important statistical properties. The most important properties are:

Bias

Stability

Linearity

Repeatability

Reproducibility.

Fig. 3.4 Pareto diagram.

3.3.1 Analysis of the linearity and bias of a measurement system

Bias is the difference between the average of repeated measurements and a reference value. Bias is a measure of the systematic error of a measurement system; it is the part of the total error made up of the combined effects of all sources of variability, known or unknown. If the bias is not zero, the measured results must be corrected for it.

The change in bias over the expected working range of the measurement system is called linearity. Linearity is determined by measuring samples whose values of the observed characteristic cover the expected working range of the measurement system. Linearity can thus be regarded as the change in bias with the size of the measured values.
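The two definitions can be illustrated with a short Python sketch that computes the average bias for each reference value and fits a straight line to the dependence of bias on the reference value. The measurements are hypothetical; a slope clearly different from zero would indicate a linearity problem, but the significance tests that Minitab performs are not reproduced here.

```python
from statistics import mean

def bias_by_reference(reference, measured):
    """Average bias (measured value minus reference value) for each reference value."""
    groups = {}
    for ref, value in zip(reference, measured):
        groups.setdefault(ref, []).append(value - ref)
    return {ref: mean(diffs) for ref, diffs in sorted(groups.items())}

def linearity_line(reference, measured):
    """Least-squares line bias = intercept + slope * reference value."""
    bias = [m - r for r, m in zip(reference, measured)]
    mr, mb = mean(reference), mean(bias)
    slope = (sum((r - mr) * (b - mb) for r, b in zip(reference, bias))
             / sum((r - mr) ** 2 for r in reference))
    return mb - slope * mr, slope

# Hypothetical reference values and repeated measurements
reference = [2, 2, 4, 4, 6, 6, 8, 8]
measured = [2.4, 2.6, 4.2, 4.4, 6.0, 6.2, 7.8, 8.0]

biases = bias_by_reference(reference, measured)
intercept, slope = linearity_line(reference, measured)
print(biases)
print(f"bias = {intercept:.2f} + ({slope:.2f}) * reference")
```

In this made-up example the bias falls from about 0.5 at the reference value 2 to about -0.1 at the reference value 8, so the bias changes over the working range.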



We evaluate the linearity and bias of the measuring system in the Minitab in the

following way:

1. In the main menu we select the option Stat - Quality Tools - Gage Study - Gage

Linearity and Bias Study.

2. In the field Part numbers we enter variable containing numbers of measured

samples. In the field Reference values we enter a variable containing the reference

values of the measured samples. Into the field Measurement we insert a variable

containing the measured values of samples.

An example

We will evaluate the bias and linearity of a measurement system using the data in the worksheet Gagelin.MTW (see Figures 3.5 and 3.6).

Fig. 3.5 Entry window for analyzing the linearity and bias of a measurement system.

The plot shows that the given measurement system has a problem with linearity. This conclusion is confirmed by the numerical results. The first table, Gage Linearity, contains the results of the t-tests of the constant term and the regression coefficient of the linear regression function. If at least one of the p-values is less than 0.05, we can state that the linearity of the measurement system is statistically significant at the level of significance of 0.05.

In the second table, Gage Bias, the statistical significance of bias is evaluated for the individual samples. The results show that the bias is statistically significant for the samples with reference values 2, 8 and 10.



Fig. 3.6 The results of analyzing the linearity and bias of a measurement system.

3.3.2 Analysis of the repeatability and reproducibility of a measurement

system

Repeatability and reproducibility are very important properties of a measurement system; they represent the two basic components of its variability. Repeatability is defined as the variability of repeated measurements of the same quality characteristic under constant conditions (Equipment Variation), while reproducibility is the variability of the mean values of sets of repeated measurements carried out under different conditions (Appraiser Variation). In practice, the Average and Range method is most often used for evaluating the repeatability and reproducibility of a measurement system. This procedure, which includes both numerical and graphic evaluation, determines the values of repeatability (EV) and reproducibility (AV). From these values the combined repeatability and reproducibility can be determined according to the relation:

$$GRR = \sqrt{EV^2 + AV^2} \qquad (3.1)$$

The acceptability criteria of the measurement system are the percentage share of GRR in the total variability and the value of ndc (number of distinct categories), which are calculated according to the relationships:



$$\%GRR = \frac{GRR}{TV} \cdot 100 \qquad (3.2)$$

$$ndc = 1.41 \cdot \frac{PV}{GRR} \qquad (3.3)$$

where:

PV – part variation

TV – total variability:

$$TV = \sqrt{GRR^2 + PV^2} \qquad (3.4)$$
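Relations (3.1) to (3.4) can be checked numerically with a short Python sketch. The values of EV, AV and PV are hypothetical, and truncating ndc to a whole number is an assumption about how the result is reported.

```python
import math

def grr_summary(ev, av, pv):
    """Combine repeatability (EV), reproducibility (AV) and part variation (PV)."""
    grr = math.sqrt(ev ** 2 + av ** 2)      # relation (3.1)
    tv = math.sqrt(grr ** 2 + pv ** 2)      # relation (3.4)
    percent_grr = 100 * grr / tv            # relation (3.2)
    ndc = int(1.41 * pv / grr)              # relation (3.3), truncated (assumption)
    return grr, tv, percent_grr, ndc

# Hypothetical values of EV, AV and PV
grr, tv, percent_grr, ndc = grr_summary(ev=0.3, av=0.4, pv=1.2)
print(f"GRR = {grr:.3f}, TV = {tv:.3f}, %GRR = {percent_grr:.1f}, ndc = {ndc}")
```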

We evaluate the repeatability and reproducibility of a measurement system in Minitab

in the following way:

1. In the main menu we select the option Stat - Quality Tools - Gage Study - Gage

R&R Study.

2. In the window Part numbers we enter a variable containing the numbers of

measured samples. In the window Operators we enter a variable containing the

indication of operators. In the window Measurement data we insert a variable

containing the measured value of samples.

3. In the option Method of Analysis we choose the method for evaluating repeatability and reproducibility. In practice the Average and Range method (Xbar and R) is used more often than the analysis of variance method (ANOVA).

Fig. 3.7 Entry window of Gage Repeatability and Reproducibility Analysis.



An example

Our task is to evaluate the acceptability of a measurement system in terms of its repeatability and reproducibility. Sample data from the worksheet Gageiag.MTW were used for the analysis (see Figures 3.7 to 3.9).

Fig. 3.8 Numerical results of Gage Repeatability and Reproducibility Analysis.

Fig. 3.9 Graphic results of the Gage Repeatability and Reproducibility Analysis.



The numerical results in the Session window (see Fig. 3.8) show that the measurement system is conditionally acceptable. The combined repeatability and reproducibility as a percentage of the total variability (Total Gage R&R) is 26.7% and the number of distinct categories is ndc = 5. By the commonly used criteria, a share between 10% and 30% means conditional acceptability, and ndc should be at least 5.

3.4 PROCESS CAPABILITY ANALYSIS

One of the important areas of quality management is the capability analysis of production processes, whether newly designed or already in use. Process capability can be characterised as the ability of a process to permanently provide products meeting the required quality criteria.

Correct evaluation of process capability requires a correct procedure, which includes verifying certain assumptions. Evaluating process capability on the basis of measurable quality characteristics should be carried out in these steps:

1. Choosing the quality characteristic

2. Measurement system analysis

3. Gathering data from the running process

4. Exploratory analysis of the gathered data

5. Evaluating the statistical stability of the process

6. Verifying the normality of the monitored quality characteristic

7. Calculating the capability indices and comparing them with the required values

8. Proposing and implementing actions to improve the process.

Process capability is assessed using process capability indices. The most often used indices are Cp and Cpk, which evaluate the potential and the real ability of a process to provide products within the tolerance limits. The indices Cpm, Cpm* and Cpmk, which evaluate the ability of a process to attain the target value of the monitored quality characteristic, are applied to a lesser extent.
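To make the difference between the indices concrete, the sketch below computes Cp, Cpk and Cpm from their usual textbook definitions. The function name and all input values are hypothetical; the mean and standard deviation are assumed to be already estimated from a stable, normally distributed process.

```python
import math


def capability_indices(mean, sigma, lsl, usl, target=None):
    """Return Cp, Cpk and (if a target is given) Cpm.

    Cp  = (USL - LSL) / (6 * sigma)
    Cpk = min(USL - mean, mean - LSL) / (3 * sigma)
    Cpm = (USL - LSL) / (6 * sqrt(sigma^2 + (mean - target)^2))
    """
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mean, mean - lsl) / (3 * sigma)
    result = {"Cp": cp, "Cpk": cpk}
    if target is not None:
        tau = math.sqrt(sigma ** 2 + (mean - target) ** 2)
        result["Cpm"] = (usl - lsl) / (6 * tau)
    return result


# Hypothetical process: tolerance 9.0 to 11.0, target 10.0, mean shifted to 10.5
indices = capability_indices(mean=10.5, sigma=0.25, lsl=9.0, usl=11.0, target=10.0)
for name, value in indices.items():
    print(f"{name} = {value:.3f}")
```

In this illustrative case Cp depends only on the spread, while Cpk and Cpm are both lowered by the shift of the mean away from the tolerance centre and the target.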

In Minitab we proceed with the process capability analysis as follows:

1. In the main menu we select the option Stat – Control Charts – Variables Charts for Subgroups (or Variables Charts for Individuals) and, using a suitable control chart, we verify the statistical stability of the process.

2. In the main menu we select the option Stat – Basic Statistics – Normality Test and we verify the normality of the monitored quality characteristic.



3. In the main menu we select the option Stat – Quality Tools – Capability Analysis –

Normal.

4. In the entry window we select how the data are arranged. If the data of the individual subgroups are stacked one after another in one column, we select Single column and enter the appropriate variable; in the field Subgroup size we then enter the subgroup size. If the data are in several columns so that each row forms one subgroup, we select Subgroups across rows of and enter the corresponding columns.

5. In the field Lower spec we enter the lower tolerance limit and in the field Upper spec the upper tolerance limit of the monitored characteristic. The Boundary box is checked when the given limit is a boundary that cannot be exceeded.

6. Optionally, the fields Historical mean and Historical standard deviation can be filled in.

7. If there is a problem with the normality of the data, we can select the option Transform and choose a method of data transformation. Minitab offers the Box-Cox power transformation or the Johnson transformation.

8. In the option Estimate it is possible to select various methods of estimating the standard

deviation.

9. In the option Options it is possible, among other things, to enter the target value of the quality characteristic or to request the calculation of confidence intervals for the process capability indices.

10. In the option Storage it is possible to enter which results of the process capability

analysis should be stored.

11. We carry out the calculation and interpret the results.
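The data arrangement from step 4 and the choice of estimator from step 8 can be sketched outside Minitab as follows. This is an illustration only: the function names and the data are hypothetical, the d2 values are the usual control-chart constants, and Minitab's own estimators may apply additional unbiasing constants that are omitted here.

```python
import math
import statistics

# d2 constants for the range method (standard control-chart tables)
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}


def split_into_subgroups(values, size):
    """Cut a single column of stacked measurements into equal subgroups."""
    if len(values) % size != 0:
        raise ValueError("number of values must be a multiple of the subgroup size")
    return [values[i:i + size] for i in range(0, len(values), size)]


def sigma_from_ranges(subgroups):
    """Within-subgroup sigma estimated as Rbar / d2."""
    n = len(subgroups[0])
    rbar = statistics.mean(max(g) - min(g) for g in subgroups)
    return rbar / D2[n]


def sigma_pooled(subgroups):
    """Within-subgroup sigma from the pooled variance (equal subgroup sizes)."""
    return math.sqrt(statistics.mean(statistics.variance(g) for g in subgroups))


# Hypothetical column of ten measurements, subgroup size 5
column = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 10.0]
groups = split_into_subgroups(column, 5)
print(round(sigma_from_ranges(groups), 4), round(sigma_pooled(groups), 4))
```

The two estimators generally give slightly different values, which is why the method selected in the Estimate option influences the resulting capability indices.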

The application of the above procedure can be illustrated by analysing the capability of a cable wire production process with respect to wire diameter. During the process, the diameters of five wires were measured at regular intervals. The measured values were stored in the variable Diameter in the file Cable.MTW (see Minitab sample data).

The entry window for the process capability analysis is shown in Figure 3.10 and the results of the analysis in Figure 3.11. The determined value of the Cpk index is much smaller than the usual minimum required value (1.33), so the given process is not capable. Comparing the Cpk index with the Cp index shows that better centring of the process within the tolerance would not ensure process capability, because this intervention can raise Cpk only to the level of Cp. It will therefore be necessary to take corrective actions that reduce the variability of the monitored quality characteristic.
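The reasoning that centring alone cannot help when Cp itself is too low can be checked numerically. The sketch below uses hypothetical tolerance limits and a hypothetical standard deviation, not the values from Cable.MTW; the function names are illustrative.

```python
def cpk_if_centred(lsl, usl, sigma):
    """Cpk obtained when the process mean sits at the tolerance centre."""
    centre = (lsl + usl) / 2
    return min(usl - centre, centre - lsl) / (3 * sigma)


def max_sigma_for(lsl, usl, required_cpk=1.33):
    """Largest sigma that still reaches required_cpk for a centred process."""
    return (usl - lsl) / (6 * required_cpk)


# Hypothetical process: tolerance 9.0 to 11.0, within-subgroup sigma 0.4
lsl, usl, sigma = 9.0, 11.0, 0.4
cp = (usl - lsl) / (6 * sigma)
print(f"Cp = {cp:.3f}, Cpk after centring = {cpk_if_centred(lsl, usl, sigma):.3f}")
print(f"sigma needed for Cpk 1.33: {max_sigma_for(lsl, usl):.3f}")
```

For these assumed values the centred Cpk equals Cp, which is still below 1.33, so only a reduction of the standard deviation to roughly the printed value would make the process capable.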



Fig. 3.10 Entry window for the process capability analysis.

Fig. 3.11 The results of process capability analysis.

Minitab also offers a process capability analysis procedure in which the statistical stability of the process and the normality of the data are assessed together. In the main menu we select the option Stat – Quality Tools – Capability Sixpack. A selected pair of control charts is used to verify the statistical stability of the process, and a normal probability plot is used to evaluate normality (see Fig. 3.12). The capability output itself provides somewhat less information than the above-mentioned procedure; its advantage is that the capability results are combined with the verification of the essential assumptions in a single output.



Fig. 3.12 The results of analyzing the process capability supplemented with

assumptions verification.

3.5 STATISTICAL PROCESS CONTROL

Statistical Process Control (SPC) is a preventive tool of quality management. By revealing significant deviations of the process from a predetermined level in good time, it makes it possible to intervene in the process with the aim of keeping it at an acceptable and stable level in the long term, or of improving it where needed. The principle of statistical process control is to help attain and maintain the production process at an acceptable and stable level, so that conformity of the product with the customer's specified requirements is ensured.

The basic SPC tool is the control chart. It is a graphic means of displaying the development of process variability over time, based on the principle of statistical hypothesis testing. An effectively used control chart should give a statistical signal when an assignable cause begins to act and should avoid false signals when no significant change in the process has occurred. The choice of a suitable type of control chart depends on the type of the monitored variable and on the subgroup size.

Decisions about the statistical stability of a process are based on the control limits:

UCL - Upper Control Limit,

LCL - Lower Control Limit.



The following basic rules are generally applied when interpreting a control chart:

a) If all values of the sample characteristic lie within the control limits, the process is considered statistically stable and no intervention in the process is required.

b) If any value of the sample characteristic lies outside the control limits, the process is considered statistically unstable. In this case the assignable causes must be identified and suitable corrective actions proposed with the objective of fully or at least partially eliminating them. Non-random patterns of points are also regarded as signals of assignable causes.
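The two kinds of signal described above can be sketched in code. This is a minimal illustration with hypothetical function names and data; the run length of 9 consecutive points on one side of the centre line is an assumption (other rule sets use 7 or 8), and only this one non-random pattern is checked.

```python
def points_outside_limits(points, lcl, ucl):
    """Indices of points lying below LCL or above UCL."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]


def runs_on_one_side(points, centre, run_length=9):
    """Indices at which a run of run_length points on one side is reached."""
    signals = []
    streak = 0
    last_side = 0
    for i, x in enumerate(points):
        side = (x > centre) - (x < centre)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == last_side:
            streak += 1
        else:
            streak = 1 if side != 0 else 0
        last_side = side
        if streak >= run_length:
            signals.append(i)
    return signals


# Hypothetical subgroup averages with centre line 10.0 and limits 9.5 / 10.5
averages = [10.1, 10.2, 9.9, 10.8, 10.1]
print(points_outside_limits(averages, lcl=9.5, ucl=10.5))
print(runs_on_one_side(averages, centre=10.0))
```

For the illustrative data only the fourth point (index 3) lies outside the limits, and no run signal occurs because the series is too short.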

A control chart is constructed in Minitab in the following way:

1. In the main menu we select the option Stat - Control Charts.

2. If the output of the process is a measurable variable and the subgroup size is n ≥ 2, we select Variables Charts for Subgroups in the menu and then choose a pair of control charts. If the output of the process is a measurable variable with the subgroup size n = 1, we select Variables Charts for Individuals. For attribute quality characteristics we select Attributes Charts and choose the control chart according to the character of the variable (Fig. 3.16).

3. In the entry window we then specify whether the measured values are in one column (All observations for a chart are in one column) or whether the individual subgroups are in individual rows (Observations for a subgroup are in one row of columns). If all values are stored in one column, it is necessary, besides entering the given column, to enter the subgroup size in the field Subgroup sizes.

An example

The construction of a control chart for a measurable quality characteristic can be demonstrated on the data in the worksheet Quality.MTW (variable Days). To verify the statistical stability of the process we can use a pair of control charts for sample averages and standard deviations (see Fig. 3.13 and 3.14).

If the resulting control charts contain no values outside the control limits and no non-random patterns of points, we can state that the process is statistically stable.
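For orientation, the control limits of a pair of charts for sample averages and standard deviations can be computed with the usual tabulated constants. The sketch below assumes subgroups of size 5 (constants A3, B3, B4 for n = 5), estimates the spread from the average subgroup standard deviation, and uses hypothetical data rather than the worksheet Quality.MTW; the estimator used by Minitab by default may differ.

```python
import statistics

# Tabulated control-chart constants for subgroup size n = 5
A3, B3, B4 = 1.427, 0.0, 2.089


def xbar_s_limits(subgroups):
    """Return (LCL, CL, UCL) for the Xbar chart and for the S chart."""
    xbars = [statistics.mean(g) for g in subgroups]
    sds = [statistics.stdev(g) for g in subgroups]
    xbarbar = statistics.mean(xbars)   # centre line of the Xbar chart
    sbar = statistics.mean(sds)        # centre line of the S chart
    return {
        "xbar": (xbarbar - A3 * sbar, xbarbar, xbarbar + A3 * sbar),
        "s": (B3 * sbar, sbar, B4 * sbar),
    }


# Two hypothetical subgroups of five measurements each
data = [
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [10.0, 10.1, 9.9, 10.0, 10.0],
]
limits = xbar_s_limits(data)
print(limits["xbar"])
print(limits["s"])
```

In practice the limits would be estimated from a larger number of subgroups (typically 20 to 25) so that the centre lines are reliable.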



Fig. 3.13 Entry window for control charts for variables.

Fig. 3.14 A pair of control charts for sample averages and standard deviations.

An example

The procedure for constructing a control chart for an attribute quality characteristic will be illustrated using the worksheet Docs.MTW, which contains the numbers of nonconforming units (Defect) in 25 subgroups of a constant size of 100 units. Using an np control chart we verify the statistical stability of the process (see Fig. 3.15 and 3.16).



Fig. 3.15 Entry window of np-control chart.

Fig. 3.16 Np-control chart.

In the resulting control chart one point lies above the upper control limit, and therefore the process is not statistically stable. To ensure the statistical stability of the process it will be necessary to identify the relevant assignable causes and to remove them.
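The limits of an np control chart follow from the binomial model and can be sketched as below. The counts are hypothetical (they are not the values from Docs.MTW) and were chosen so that one subgroup exceeds the upper control limit, as in the example; the function name is illustrative.

```python
import math


def np_chart_limits(counts, n):
    """Return (LCL, CL, UCL) of an np chart for constant subgroup size n."""
    p_bar = sum(counts) / (len(counts) * n)       # average fraction nonconforming
    centre = n * p_bar                            # centre line n * p_bar
    spread = 3 * math.sqrt(n * p_bar * (1 - p_bar))
    lcl = max(0.0, centre - spread)               # a count cannot be negative
    return lcl, centre, centre + spread


# Hypothetical numbers of nonconforming units in 25 subgroups of 100 units
defects = [4, 6, 5, 3, 7, 5, 4, 6, 5, 5, 4, 6, 5, 16, 5,
           4, 6, 5, 3, 7, 5, 4, 6, 5, 4]
lcl, centre, ucl = np_chart_limits(defects, n=100)
flagged = [i for i, d in enumerate(defects) if d < lcl or d > ucl]
print(f"LCL = {lcl:.2f}, CL = {centre:.2f}, UCL = {ucl:.2f}, out of control: {flagged}")
```

After the assignable cause behind a flagged subgroup has been identified and removed, the limits are usually recalculated without that subgroup.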



Concept Summary

Cause and Effect diagram (Ishikawa diagram, Fish Bone Diagram) - a graphic tool for

analyzing all possible causes of a certain problem with quality.

Pareto diagram - a tool for identifying the “vital few” causes of a quality problem and thus for setting priorities in solving it.

Process capability – the ability of a process to permanently provide products meeting the required quality criteria.

Statistical Process Control – a feedback system which, on the basis of the timely revelation of significant deviations of a process from a predetermined level, enables intervention in the process with the objective of maintaining it at an acceptable and stable level in the long term.

Control Chart – a basic graphic tool that makes it possible to distinguish the influence of assignable causes on process variability from the influence of random causes.

Questions

1. What principle is used in the Pareto diagram for merging causes contributing to the analyzed problem into one group?

2. What has to be fulfilled for a measurement system to be evaluated as appropriate from a

bias point of view?

3. What has to be fulfilled for a measurement system to be evaluated as appropriate from a

linearity point of view?

4. What is indicated as GRR?

5. What assumptions have to be verified in analyzing process capability?

6. What method is used to determine the control limits in a control chart?

7. How is it possible to identify the influence of assignable causes in a control chart?

References

[1] Montgomery, D. C.: Introduction to Statistical Quality Control, Sixth edition. New York: J. Wiley & Sons, 2009, 734 p.

[2] Minitab 16 Help.
