Manfred te Grotenhuis
Theo van der Weegen

Statistical Tools

An Overview of Common Applications in Social Sciences


All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of the Publisher.

NUR 916

ISBN 978 90 232 4532 2

Translated from: Statistiek als hulpmiddel, Assen: Koninklijke Van Gorcum (2008)
Translation: ... and Manfred te Grotenhuis

Preface

Statistical Tools

1 Statistical Data
1.1 Introduction
1.2 Four Levels of Measurement
1.3 Selecting Units of Analysis: Random Sampling
1.4 Collecting Statistical Data
1.5 Data Quality
1.6 From Collecting Data to Answering Research Questions

2 Descriptive Statistics
2.1 Introduction
2.2 Graphical Description of a Single Variable
  Bar Chart
  Pie Chart
  Histogram
  Stem-and-leaf Plot
2.3 Numerical Description of a Single Variable
  Frequency Table
2.3.1 Measures of Central Tendency
  Mode
  Median
  Mean
2.3.2 Measures of Variability
  Range
  Interquartile Range (IQR)
  Detecting Outliers with Box Plots
  Standard Deviation and Variance
2.3.3 Measures of Relative Standing
  Percentiles
  Z-scores
  Chebyshev's Rule and Empirical Rule
2.4 Statistical Relations between Two Variables
2.4.1 Graphical Description of a Bivariate Relation
  Box Plot
  Scatter Plot
  Line Graph
2.5 Summary

3 Inferential Statistics
3.1 Introduction to Statistical Inference
  Central Limit Theorem
  Confidence Intervals
  Testing Hypotheses
3.2 One-Sample Tests for Mean and Proportion
3.2.1 Test for a Mean
3.2.2 Test for a Proportion
3.3 Tests for Comparing ...
3.3.1 Paired Samples T-test (two dependent groups)
3.3.2 Two-Sample T-test (two independent groups)
3.3.3 Analysis of Variance (three or more independent groups)
3.4 Measures of Association for Nominal and Ordinal Variables
3.4.1 Association in Contingency Tables
3.4.2 Measures of Association for Nominal Variables
  Chi-square ... and Cramer's V
3.4.3 Measures of Association for Ordinal Variables
  Kendall's Rank Correlation: Tau b and Tau c
  Spearman's Rank Correlation
3.5 Measures of Association for Interval and Ratio Variables
3.5.1 Pearson's Correlation Coefficient
3.5.2 Linear Regression Analysis
3.5.3 Odds Ratio
3.6 Mul...
3.6.1 Different Causal Multivariate Models
  ...
  Partial Mediation / Partial Spuriousness
  Suppression
  Moderation / Interaction
3.6.2 Multiple Linear Regression Analysis
  Modeling Interval and Ratio Predictor Variables
  Modeling Ordinal and Nominal Predictor Variables
  Linear Regression Analysis: Assumptions
3.7 Summary

Concluding Remarks on Statistical Tools

Notes

Index

Preface

There are numerous books on statistics. Although many of them are introductory, they often cover a lot of statistical ground, resulting in massive volumes. This textbook has only 128 pages and does not have statistical theory as its main theme. Instead, Statistical Tools intends for students in the social sciences to become familiar with commonly used statistical applications.

Quantitative data analysis is common practice in the social sciences and knowledge about statistics is therefore essential. This, however, does not necessarily mean that students need advanced knowledge of mathematics. We think it is more important to have a thorough understanding of the practical applications and the interpretation of the statistical results. Consequently, no mathematical knowledge is required to understand the content of this textbook.

All statistical applications are exemplified using data from current research in the social sciences. Due to its popularity, we use the computer program SPSS to produce all statistical outcomes.

We would like to express our gratitude to Rob Eisinga, Bert Felling, Nan Dirk de Graaf, Ariana Need, and Peer Scheepers for providing all relevant statistical data collected in the Netherlands during 1979-2005. Rense Nieuwenhuis greatly helped in translating the original Dutch textbook Statistiek als hulpmiddel and helped build the supporting website: www.ru.nl/mt/statistics/home. Special thanks also to Matthew Bennett for correcting our initial manuscripts and for providing indispensable advice about (Oxford) English usage.

We extend special thanks to our students from Radboud University Nijmegen, who contributed to the improvement of our lecture materials over the last ten years that now find themselves bundled here.

Finally, we would like to thank Hans Schmeets and Peer Scheepers for their efforts in making Statistical Tools come alive.

Manfred te Grotenhuis Theo van der Weegen

Radboud University Nijmegen, The Netherlands

tool (tool) 1. A device, such as a saw, used to perform or facilitate manual or mechanical work.

2. A machine, such as a lathe, used to cut and shape. 3. Something regarded as necessary to the carrying out of one's occupation or profession.

Suut• , . hilt• 111111 1/11 ,,,.,.,/,, /lol/1/111'1'1111///un/


Statistical Tools

INTRODUCTION

Statistical science comes in all shapes and forms. Nonetheless, it is often associated with the more complex aspects, like probability theory. As a consequence, people often think of statistics as something quite difficult. For students in the field of the social sciences (e.g., anthropology and sociology), statistical knowledge is typically not an end in itself, but a practical means to help answer research questions. Therefore, it does not make much sense to teach these students how to derive various complex formulas or to teach the fundamentals of statistics at great length. There have been (and still are) courses in statistics that focus on the fundamentals. As a result, however, students may attain a deeper understanding of statistical theory but lack the ability to apply this knowledge in a practical research setting; this is analogous to receiving a driver's license for demonstrating competence in repairing a gear box.

Thus, Statistical Tools does not focus on complex statistical theory (although interested readers can find additional information in our endnotes), but on the practical applicability of statistics. Using data sets from recent research, we illustrate how statistics can be an indispensable tool in social science research. Our main goal is not to provide students with exhaustive statistical knowledge; rather, we hope that this book contributes to the proper use of a variety of statistical tools that help in answering questions arising from the research process.

STRUCTURE

Chapter one discusses quantitative data that are often collected using random samples. Since many data collections are available through the Internet, a short overview is given of where these data can be found.

Chapter two covers important topics on descriptive statistics, focusing on how large quantities of data can be summarized in a concise manner. These summaries can be done graphically using charts, such as a bar chart or histogram, and/or numerically using measures like the mean and the standard deviation. Using examples from social science research, various ways of providing a summary of data will be illustrated.


Chapter three deals with inferential statistics, providing answers on how to draw conclusions about a population when only information on a small part of that population (a sample) is available. Relatively simple tests on proportions and means are discussed alongside more complex tests based on regression analysis. Even so, the proper use of the statistical tests and the correct interpretation of the outcomes remain the focus of this chapter.

The various sets of data used throughout this book are available as SPSS files on a special web page: http://www.ru.nl/mt/statistics/home. To facilitate the use of these files, we italicize all variable names in the text. All exercises that relate to the statistical topics discussed throughout this book can also be found on our web page.

For readers interested in more detailed (and often more technical) background information, we provide various endnotes. More advanced statistical applications, which are relevant but not discussed in this book, and links to relevant literature for further reading can also be found on our web page.

Tables of probability distributions, often found in statistical textbooks, are not included here for reasons of space. Instead, procedures to calculate these probabilities using statistical software (SPSS) are provided on our web page.

SOFTWARE

Since the focus is on practical statistical applications, we cannot go without a proper toolbox, that is, a statistical computer program. We decided to use SPSS (originally: Statistical Package for the Social Sciences, see www.spss.com). This program is often used for teaching statistics due to its user-friendly interface. Since our goal is to write an affordable and accessible book, no explanation of the use of SPSS itself is given. For this, various books are already available (for Dutch readers we refer to two books published by Van Gorcum: Basiscursus SPSS and SPSS met Syntax).

STATISTICAL DATA

1.1 INTRODUCTION

To apply statistics, one needs data that fulfil certain requirements. One important requirement is that the data must be numerical, which means that all information is expressed in numbers. Of course, information is often expressed in words (often referred to as 'alphanumerical'), but it has to be transformed into numerical information before statistical procedures can be applied. Numerical data are often stored in a spreadsheet (see Figure 1.1). Generally, the rows of a spreadsheet represent the units of analysis. Conclusions drawn from statistical analyses refer to these units. In social science, the units of analysis are often people or 'respondents'. The columns of the spreadsheet contain the variables, which carry information about the units of analysis. Should these units represent people, characteristics such as sex, year of birth, education, income, and marital status are typically represented in the data and are commonly used variables for statistical analyses. The numerical values of the variables are recorded in the cells of the spreadsheet. Figure 1.1 shows an SPSS spreadsheet with information on six variables from three respondents.

[Figure 1.1: SPSS spreadsheet in Data View]


1.2 FOUR LEVELS OF MEASUREMENT

A variable measures a specific characteristic of the units and holds various values. For example, all respondents have a specific age and a specific level of education. Generally, there is a lot of variation among these characteristics; for instance, respondents' ages may fall between 18 and 70 years old. Most variables have a limited set of categories to classify the units of analysis. These categories are identified through unique numerical codes in the spreadsheet. For example, the variable marital state in Figure 1.1 has four categories that are coded 1, 2, 3, and 4. Typically, information regarding the meaning of codes can be found in the dataset's accompanying codebook, but it is also often found in the data file itself. To illustrate the latter, Figure 1.2 shows the codes for the variable marital state, which represent 'Not married' (code 1), 'Married' (code 2), 'Divorced' (code 3), and 'Widow/Widower' (code 4).

Figure 1.2 SPSS Variable View (upper panel) and Value Labels (lower panel)
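The link between numerical codes and their labels can be sketched in a few lines of Python. Python is used here purely for illustration (the book itself relies on SPSS); the three records and the names `respondents` and `label_of` are hypothetical, while the value labels follow the coding of marital state described above.

```python
# Value labels for the variable "marital state", as listed in the text.
MARITAL_STATE_LABELS = {
    1: "Not married",
    2: "Married",
    3: "Divorced",
    4: "Widow/Widower",
}

# Hypothetical data: each row is a unit of analysis (a respondent),
# each key is a variable, and each cell holds a numerical value.
respondents = [
    {"id": 1, "year_of_birth": 1935, "marital_state": 2},
    {"id": 2, "year_of_birth": 1967, "marital_state": 1},
    {"id": 3, "year_of_birth": 1950, "marital_state": 4},
]


def label_of(code):
    """Translate a numerical code into its codebook label."""
    return MARITAL_STATE_LABELS[code]


labels = [label_of(row["marital_state"]) for row in respondents]
print(labels)
```

Keeping the codes in the data and the labels in a separate lookup mirrors the split between a data file and its codebook.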

In statistics, variables can be categorized into one of the following levels of measurement:

• Nominal
• Ordinal
• Interval
• Ratio

Nominal variables represent the lowest measurement level. The categories of these variables are only distinguished by their names. The numerical codes representing the categories can therefore be chosen arbitrarily, as long as each code is associated with only one category. An example of a nominal variable is marital state: it has different categories without any logical order (unmarried people are not 'less' in any respect than married people). The lack of ordering means that the arbitrary coding in Figure 1.2 can be changed to 6 (Unmarried), 1 (Married), 4 (Divorced), and -6 (Widow/Widower) without changing the meaning of the categories.

Ordinal variables cannot be coded arbitrarily, because the categories are rank ordered. For example, the variable educational level is ordinal as it is assumed that the various levels can be ordered according to the level of knowledge that has been attained by the respondent. Generally, the lowest knowledge level has been attained by people with only elementary school education, a higher level is obtained by college students, and the highest level of knowledge is attained at university. To express this ranking, the codes of an ordinal variable must be in ascending or descending order. Educational level may be coded 'Elementary'=1, 'College'=3, and 'University'=4. However, 'Elementary'=0, 'College'=8, and 'University'=19 is equivalent because the rank order remains the same. This clearly shows that the spacing between subsequent values of an ordinal variable is arbitrary. This is not a problem for ordinal variables such as educational level, as the exact extent of the knowledge increase with each increase of educational level is unknown.
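The equivalence of the two codings of educational level can be checked with a short Python sketch. The five-person sample and the helper names are hypothetical; the two codings are the ones given in the text. The sketch suggests that statistics relying only on rank order (here the median category) are unaffected by the recoding, whereas a statistic that relies on spacing (the mean of the codes) is not.

```python
from statistics import mean, median

# Hypothetical sample of educational levels (odd size, so the median
# is always one of the observed codes).
sample = ["Elementary", "College", "College", "University", "Elementary"]

# The two codings mentioned in the text.
coding_a = {"Elementary": 1, "College": 3, "University": 4}
coding_b = {"Elementary": 0, "College": 8, "University": 19}


def rank_order(coding):
    """Return the categories sorted by their numerical code."""
    return sorted(coding, key=coding.get)


def median_category(values, coding):
    """Return the category whose code is the median of the coded sample."""
    code_to_category = {code: category for category, code in coding.items()}
    codes = [coding[value] for value in values]
    return code_to_category[median(codes)]


mean_a = mean([coding_a[value] for value in sample])
mean_b = mean([coding_b[value] for value in sample])

print(rank_order(coding_a) == rank_order(coding_b))
print(median_category(sample, coding_a), median_category(sample, coding_b))
print(mean_a, mean_b)
```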

In contrast, interval variables have exact known differences (or intervals) between subsequent categories. An example of an interval variable is year of birth. The categories of this variable are rank ordered: the more recent one's year of birth, the younger that person is. But crucially, the intervals between subsequent age categories all have the same distance (see Figure 1.3). In this example, the difference between two adjacent birth cohorts always represents exactly one year.

[Figure 1.3: timeline of the birth years 1967, 1968, 1969, 1970, 1971, and 1972]


Larger intervals can also be compared: people from 1900 were born 30 years earlier than people born in 1930, which also holds for people born in 1970 and 2000, or 1960 and 1990. Interval variables do not have an absolute zero value. The variable year of birth is a good example because the Western calendar uses the birth of Christ as its starting point. Since this zero point is arbitrary, it means that people born a hundred years BC have a year of birth of -100. As a consequence, the calculation of ratios is not meaningful. For instance, we cannot state that a person born in the year 1000 was born twice as early in history compared to a person born in the year 2000. This also holds for the variable temperature measured in degrees Celsius. An object with a temperature of 30 degrees Celsius is not twice as hot as an object that has a temperature of 15 degrees Celsius because 0 degrees Celsius is not the absolute zero value (in fact -273.15 degrees Celsius or 0 Kelvin is the [absolute] zero value).

Ratio variables have rank ordered categories, equal distances between categories, and an absolute non-arbitrary zero value. As a consequence, ratios can be calculated meaningfully. The temperature measured in degrees Kelvin is an example of a ratio variable. A temperature of 400 degrees Kelvin is exactly twice as high (hot) as a temperature of 200 degrees Kelvin. The same holds for the variable age: forty year olds are exactly four times as old as ten year olds. However, in many social science applications the difference between interval and ratio variables is irrelevant as few calculations rely on an absolute zero value.
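The temperature example can be worked through numerically. The sketch below is only an illustration in Python; the function name is hypothetical, and the constant is the absolute zero value quoted in the text.

```python
# Absolute zero expressed in degrees Celsius, as quoted in the text.
ABSOLUTE_ZERO_CELSIUS = -273.15


def celsius_to_kelvin(celsius):
    """Shift a Celsius temperature so that zero means absolute zero."""
    return celsius - ABSOLUTE_ZERO_CELSIUS


# On the interval scale (Celsius) the ratio looks like "twice as hot".
naive_ratio = 30 / 15

# On the ratio scale (Kelvin) the same two objects differ by about 5%.
true_ratio = celsius_to_kelvin(30) / celsius_to_kelvin(15)

print(naive_ratio)
print(round(true_ratio, 3))
```

The two ratios disagree because the Celsius zero point is arbitrary, which is exactly why ratios are only meaningful at the ratio level.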

Dichotomous variables are a special category of variables. These always have exactly two categories, like the variable sex, for example. These variables allow the researcher to rank observations in terms of presence/absence (or yes/no). For the variable sex, respondents are female or they are not (i.e., male). In addition, it becomes irrelevant whether or not there are equal intervals between all categories because there is only one interval. Therefore, mathematically, a dichotomous variable has the same characteristics as an interval variable. In later sections we will illustrate the consequences of the interval character of dichotomous (or 'dummy') variables.
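One familiar consequence of treating a dichotomous variable as an interval variable is that the mean of a 0/1 coded dummy equals the proportion of units coded 1. The Python sketch below illustrates this with hypothetical responses; it is not taken from the book's data.

```python
from statistics import mean

# Hypothetical responses for the variable sex.
sex = ["female", "male", "female", "female", "male"]

# Dummy coding: 1 = female, 0 = not female (i.e., male).
female = [1 if value == "female" else 0 for value in sex]

# The mean of the dummy is the share of respondents coded 1.
proportion_female = mean(female)
print(proportion_female)
```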

The levels of measurement are relevant in two ways. Firstly, each measurement level is associated with specific statistical techniques. So, once the measurement level is known, we also know which techniques are feasible and which are not. In the case of nominal variables the frequency of occurrence (i.e., the total number of units of observation) in each category can be determined and statistics are then restricted to analyzing data as counts and percentages. The categories of ordinal variables can be ranked, for example, from 'few' to 'many'. In later sections we will show how some statistical techniques make use of this ranking, such as the calculation of the median. Scores on interval variables can be added and subtracted, making it possible to calculate the mean score. As discussed before, the scores on a ratio variable can be divided to calculate ratios.

Secondly, the hierarchy in levels of measurement is decisive in choosing the appropriate statistical technique from the multitude of techniques. If a variable of interest does not possess the required level of measurement needed for a particular statistical technique, then this technique cannot be applied. Likewise, if a technique is applicable to a particular variable, it generally also applies to variables measured at higher levels. The hierarchical ranking of the level of measurement from low to high is: nominal, ordinal, interval, and ratio. Thus, techniques that are suited for nominal variables can also be used for ordinal, interval, and ratio variables.
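The hierarchy described above can be captured in a small lookup. The following Python sketch is an illustration under stated assumptions: the names `LEVELS`, `MINIMUM_LEVEL`, and `is_applicable` are hypothetical, and the four techniques listed are just the examples mentioned in the text, not an exhaustive catalogue.

```python
# Levels of measurement ordered from low to high.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Lowest level at which each summary becomes meaningful, following the text.
MINIMUM_LEVEL = {
    "frequency counts": "nominal",  # counting units per category
    "median": "ordinal",            # requires ranked categories
    "mean": "interval",             # requires adding and subtracting scores
    "ratios": "ratio",              # requires an absolute zero value
}


def is_applicable(technique, level):
    """A technique applies at its minimum level and at every higher level."""
    return LEVELS.index(level) >= LEVELS.index(MINIMUM_LEVEL[technique])


def applicable_techniques(level):
    """List the techniques that may be used at the given level."""
    return [t for t in MINIMUM_LEVEL if is_applicable(t, level)]


for level in LEVELS:
    print(level, applicable_techniques(level))
```

Because applicability is defined by position in the ordered list, anything allowed for a nominal variable is automatically allowed at every higher level.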

1.3 SELECTING UNITS OF ANALYSIS: RANDOM SAMPLING

The starting point in scientific research is always the research question. It explicates the subject of research and defines the units of analysis. Occasionally, the research question is highly specific and/or the target population is very small. In these (rather rare) cases all units within the population can be sampled. Consider, for example, a research project on city councils in a specific city, or on all firemen in some county. For this kind of research question, no practical problems arise regarding the selection of the units and conclusions based on these observations do not need to be generalized to a larger population. Rather, descriptive statistical analyses are sufficient in these scenarios. Typically, however, not all units can be included in the research, and a selection is required. In this case, the important question is to what extent the selected units are valid or representative of the entire population.

A random sample is required in order to generalize findings to a population based on a limited number of selected units. This sample comprises a relatively small part of the entire population. In the Western world, several organizations (e.g., The World Value Survey Network (www.worldvaluessurvey.org)) regularly interview a large sample of people on various topics, such as political voting behavior. These samples often comprise several thousands of individuals (often referred to as 'respondents') from which a wide spectrum of data is collected. As will be shown in chapter 3, a random sample allows researchers to make statements about the entire population. To be valid, these general statements require the sample to represent that population. It is often said that a sample should be (sufficiently) representative, which means that the sample should possess the same key characteristics as the target population. A random sample comprising a female-to-male ratio of 1:5 is not representative, as in most populations the ratio is closer to 1:1.

Simple random sampling is a commonly used strategy to obtain a representative sample of the population. In this sampling procedure, respondents are chosen randomly from the target population and all respondents have the same probability of being selected. Simple random sampling is like selecting name tags from a basket by a blindfolded person. To avoid non-random distortion ('bias'), the tags are mixed up thoroughly before each selection.
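The name-tag analogy translates directly into code. The sketch below uses Python's standard `random` module; the population of 1,000 labelled persons and the function name are hypothetical, and the `seed` parameter is included only so that a draw can be repeated.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every unit has the same selection probability."""
    rng = random.Random(seed)
    # Sampling without replacement, like drawing tags from a basket.
    return rng.sample(population, n)


# Hypothetical sampling frame of 1,000 people.
population = [f"person_{i}" for i in range(1, 1001)]

sample = simple_random_sample(population, 10, seed=42)
print(sample)
```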

Stratified random sampling is a strategy in which units are not directly selected at random but are first grouped into categories (called 'strata') from which independent random samples are drawn in a second stage. In a simple random sample there is still the possibility that the age distribution or the ratio between foreigners and natives in the sample differs substantially from that in the population. This can be problematic if the research question is about nationalism, as a biased sample can potentially endanger outcomes. To prevent this from happening, the population is typically grouped into different categories based on age and country of birth, called strata. Following this stratification, simple independent random samples are drawn from each stratum or a combination of strata (e.g., young foreigners). Usually, these samples are drawn in proportion to each stratum's share of the total population. So, if the population has a female-to-male ratio of 55-to-45, then approximately 55% of sampled respondents should be female. The stratified sample approximates perfect representation of the population and its characteristics, such as age, country of birth, sex, and marital status.
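Proportional stratified sampling can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical population of 1,000 people with the 55-to-45 female-to-male ratio used in the text; the function and parameter names are made up for the example.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n, seed=None):
    """Draw a proportional simple random sample from each stratum."""
    rng = random.Random(seed)

    # Stage 1: group the units into strata.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    # Stage 2: sample independently within each stratum.
    sample = []
    for members in strata.values():
        # Allocate in proportion to the stratum's population share.
        # With awkward shares, rounding may make the total differ from n.
        k = round(n * len(members) / len(units))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population with a female-to-male ratio of 55-to-45.
population = [("female", i) for i in range(550)] + [("male", i) for i in range(450)]

sample = stratified_sample(population, lambda unit: unit[0], n=100, seed=1)
n_female = sum(1 for sex, _ in sample if sex == "female")
print(n_female, len(sample) - n_female)
```

Unlike a simple random sample, the female share in this sample is fixed by design rather than left to chance.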

Multistage sampling is a procedure that utilizes one or more random pre-selections from which a simple random or a stratified sample is drawn later, at a second stage. This sampling technique is considerably more cost-effective than a simple random sample. This is because simple random sampling draws from the entire population, requiring the recruitment of interviewers from all over the country or the travelling of long distances to conduct interviews, both of which can prove expensive. Furthermore, simple random sampling requires a list of all people in the population (the 'sampling frame'), which is difficult to acquire in many countries due to privacy legislation. In multistage sampling it is more efficient, firstly, to sample among communities (which might be stratified according to degree of urbanization). Secondly, the selected communities are then requested to provide a sample of inhabitants from their community database (again, possibly stratified, for instance by age and sex).
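The two stages described above can be sketched in Python. The sampling frame of 20 communities with 200 inhabitants each is hypothetical, as are the function and parameter names; stratification within either stage is left out to keep the example short.

```python
import random


def multistage_sample(communities, n_communities, n_per_community, seed=None):
    """Stage 1: sample communities. Stage 2: sample inhabitants within each."""
    rng = random.Random(seed)
    # Stage 1: random pre-selection of communities.
    selected = rng.sample(sorted(communities), n_communities)
    # Stage 2: simple random sample of inhabitants per selected community.
    return {
        name: rng.sample(communities[name], n_per_community)
        for name in selected
    }


# Hypothetical frame: 20 communities with 200 inhabitants each.
communities = {
    f"community_{c}": [f"c{c}_person_{p}" for p in range(200)]
    for c in range(20)
}

sample = multistage_sample(communities, n_communities=4, n_per_community=25, seed=7)
for name, inhabitants in sample.items():
    print(name, len(inhabitants))
```

Only the four selected communities have to supply a list of inhabitants, which is the practical advantage over a nationwide sampling frame.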

Besides cost reduction, the motives underlying multistage sampling can be research driven. For example, if the research is about social networks, some of the respondents in the sample have to be related to other respondents. Likewise, suppose that a researcher wants to investigate the extent to which people choose their partners on the basis of social characteristics, such as educational attainment. This requires a random pre-selection at the household level, where both partners within a sampled household are subsequently interviewed. A disadvantage of multistage sampling is that respondents are not selected independently from each other. With respect to households, this means that interviewing the head of the household automatically requires that his/her partner is also interviewed. This, of course, is exactly what (among other things) is needed to determine whether people choose partners because they share the same educational background (known as educational homogamy in this research area). However, this also means that interdependency within households has to be taken into account (see section 3.3.1). Statistical models that account for interdependence between units are known as 'Mixed Models' and can be estimated using SPSS (www.spss.com), the popular MLwiN program (www.cmm.bristol.ac.uk), and the freeware package R (www.r-project.org).

1.4 COLLECTING STATISTICAL DATA

There are four commonly used methods to collect statistical data:

• Survey
• Experiment
• Observation
• Secondary data

In a survey, data are collected from a large number of (preferably) randomly selected respondents. For (PhD) students and researchers in general, it is close to impossible to carry out a survey independently, especially if a large sample is required. Therefore, generally only specialized research institutes, universities, and government agencies collect statistical data using large-scale surveys, in which several researchers contribute to the questionnaire. An example is the Dutch SOCON project (www.ru.nl/sociology/research/socon), in which researchers from the disciplines of psychology, sociology, and communication science at the Radboud University Nijmegen (Netherlands) interview 1,500 Dutch respondents every 5 years about a wide array of subjects including religion, media usage, attitudes towards (ethnic) minorities, and professions.


Experiments are the second way of collecting data, in which respondents are randomly assigned to groups, or preexisting groups are used. In classic experiments, two groups exist: the treatment group, who receive a 'stimulus', and a comparison group who do not (referred to as the control group). In a recent example, employees commuting by car were randomly assigned to a treatment or a control group. The employees within the treatment group were asked to commute by bicycle instead of by car (in this example the stimulus is the bicycle, resulting in more physical exercise). The employees in the control group continued commuting by car. After six months the physical condition of the employees was compared to their physical condition at the beginning of the experiment. The experimental results suggest that the physical condition of the bike commuters improved significantly and they also reported fewer bouts of illness compared to participants in the control group.
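Random assignment to a treatment and a control group, as in the commuting example, might look as follows in Python. The employee labels and the function `assign_groups` are hypothetical; the sketch only shows the principle that chance alone decides who receives the stimulus.

```python
import random

def assign_groups(participants, seed=0):
    """Shuffle the participants and split them into two equally sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical list of 40 car-commuting employees.
employees = [f"employee_{i}" for i in range(40)]
treatment, control = assign_groups(employees)

print(len(treatment), "employees commute by bicycle")
print(len(control), "employees keep commuting by car")
```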

Observation is a relatively labor intensive method for collecting data. This data collection method requires researchers to become part of the group under investigation (participant observation). Alternatively, researchers can refrain from full participation, thus minimizing their level of influence on those under investigation (unobtrusive observation). Both observational strategies utilize the natural environment of the participants being studied. For example, participant observation is used in cultural anthropology, where researchers study (sub-)cultures by means of actual participation, and unobtrusive observation is used in psychological studies that explore the interactions between school children.

Surveys, experiments and observations can be (partly) executed by the researcher, but come with considerable time and financial constraints. Alternatively, researchers can make use of the enormous amount of digitally stored statistical data that has already been collected. These secondary data are widely available on the Internet. They are routinely collected, often using high quality random samples, and can also capture entire populations (i.e., censuses). Here is just a short list of important websites that provide, or link to, secondary data:

• www.cbs.nl/statline (data from the Netherlands)
• www.dans.knaw.nl (also data from the Netherlands)
• http://ess.nsd.uib.no (European Social Surveys from 2002)
• http://epp.eurostat.ec.europa.eu (other data from Europe)
• http://factfinder.census.gov (census data in USA)
• https://international.ipums.org/international (census data)
• www.measuredhs.com (demographic and health surveys)
• http://ropercenter.uconn.edu/ (surveys in USA)
• http://sociosite.net/databases.php (worldwide library)
• h t tp ://Jst agcs .org/ '/ ·s2 . · g i (worldwide library)
• h t tp : / /Jst " • c s . org/ id :i l a/ ( sr · : 1 1 r · l r 1 ' 1 1 ) ', 1 1 1 1 ' )


1.5 DATA QUALITY

Although the primary focus of this book is on descriptive and inferential statistics, some attention will be paid to the quality of statistical data. On the one hand, it is often argued that results from statistical research are to be viewed with skepticism because the data were collected in an inappropriate manner. On the other hand, people often claim that statistical outcomes should not be challenged as they are based on 'representative' research samples. The truth, however, probably lies somewhere in between these two extremes. Statistical research cannot prove something to be true, but it can demonstrate that one option is more likely than another option, provided some fundamental conditions have been met. These conditions pertain to:

• Validity
• Reliability
• Representativity
• Missing Data

Measurement validity refers to whether a measurement actually measures what it intends to measure. For example, teachers are taught not to ask students whether they understand a lecture's content. Of course any professional teacher wants to know whether he or she succeeded, but answers to that particular question probably are the product of peer pressure or stigma: who is willing to confess not having understood something? Since few students will do so, the teacher appears to have succeeded. The question 'Have I been unclear about certain aspects?' is far superior because this time the teacher's performance is being evaluated instead of the students' ability to withstand peer pressure and forgo stigmatization. This example demonstrates that questions can measure something quite different from what was intended. Therefore, in research terminology a distinction is made between valid and invalid measurements. The validity of measurements is often discussed and defined with the help of experts and prior research. For example, didactical experts understand that peer pressure in a classroom should be taken into account and will recognize this faulty (invalid) measure of teacher's performance. In addition to expert evaluation, the measurement should relate to other measurements associated with the subject. For example, it might be expected that students who indicate that a course is too heavy also indicate that they do not understand parts of the course content. If that relationship does not exist, one might have good reason to question the validity of the question on understanding lecture content.

Reliability relates to the extent to which a measurement is free of random error: under similar circumstances, a repeated measurement should (roughly) result in a similar outcome. For example, the possibility that a measurement is unreliable increases when questions are used that can be interpreted in multiple ways. Suppose that respondents were asked to answer this question:

"Politics deals with the reduction of traffic jams, with minimizing crime levels, and with the strengthening of women's labor market participation. Doing all that, the government can make good and bad decisions. Please indicate below which answer best corresponds to your personal opinion:"

o About 100% of these decisions are good
o About 75% of these decisions are good, 25% are bad
o About 50% of these decisions are good, 50% are bad
o About 25% of these decisions are good, 75% are bad
o About 100% of these decisions are bad

This multi-barreled question is a biased measurement of the perceived quality of government decisions. The indicated topics are very diverse, ranging from governmental decisions on traffic jams to governmental decisions on female labor market participation. There is only a slim chance that this measurement accurately captures respondents' opinions about the same government decisions; separating the different types of government decisions into different questions will increase the reliability of this measurement. If a researcher is interested in governmental decision making in general, the answers to these separate questions could simply be summed to create a Likert scale. This scale's reliability will be higher than that of each separate question. Unreliability undermines validity as well: if one does not (roughly) measure the same concept each time for every respondent, one logically does not measure the concept itself. This of course does not mean that reliable measurements are also valid; reliability is a necessary condition for validity, not a sufficient one.

Hand-in-hand with reliable and valid data, representativity is a key characteristic in statistical sampling. Unfortunately, researchers often assume that the sample they use accurately represents the population, an assumption that often goes unchecked. If the principles of random sampling are strictly followed, a large sample will generally be sufficiently representative. For instance, it can be shown that the ratio between men and women in a random sample of a hundred individuals will be close to that in the entire population. However, by sheer chance (i.e., bad luck), deviations from the population can occur in the sample. Generally, this is not very problematic for the generalization of statistical findings because a margin of uncertainty is always taken into account (see chapter 3, Confidence Intervals). The situation is more critical, though, when a sample is heavily biased by nonresponse, which means that particular sets of respondents are absent from or underrepresented in the sample. This could occur if interviewers predominantly visit selected respondents during the afternoon, as people working full-time will not be reached. The resulting sample will not be representative of the labor market, and the male-to-female ratio may also be distorted since in many societies more men are in full-time employment than women.

Another source of nonresponse is when respondents refuse to answer parts of the questionnaire. Classic nonresponse generators are questions about political issues. Research suggests that people who are alienated from or averse to politics are less likely to participate in political research. Consequently, the level of political interest measured in the sample will be overestimated. Because nonresponse can turn even a well designed random sample into a non-representative collection of respondents, it is important to deal with this problem at an early stage. Possible strategies to prevent serious nonresponse include special instructions for the interviewers to deal with sensitive subjects and eventually rewarding respondents who initially refuse to participate. Furthermore, a slightly biased sample induced by modest amounts of nonresponse can be made more representative by weighting the sample. However, a weighting strategy is always based on variables with well-known population distributions. Unfortunately these are often not the variables causing the statistical problems. For example, the ratio between men and women in the population is often known exactly, but the distribution of educational level is not, let alone the distribution of political alienation!
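The idea of weighting a sample towards a known population distribution can be illustrated with a small Python sketch. The sample composition (60 men, 40 women), the population shares, and the function `poststratification_weights` are assumptions made for this example only; real weighting schemes usually combine several variables.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Weight per group = population share / share observed in the sample."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return {group: population_shares[group] / (counts[group] / n)
            for group in counts}

# Hypothetical sample in which women are underrepresented.
sample_sex = ["man"] * 60 + ["woman"] * 40
weights = poststratification_weights(sample_sex, {"man": 0.5, "woman": 0.5})

# After weighting, women make up half of the (weighted) sample again.
total_weight = sum(weights[s] for s in sample_sex)
weighted_share_women = sum(weights[s] for s in sample_sex if s == "woman") / total_weight
print(weights, weighted_share_women)
```

Such weights only repair the distribution of the weighting variable itself; as noted above, they do not help when the variable that causes the nonresponse has no known population distribution.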

In addition to weighting, it is possible to take into account underrepresentation of a population (e.g., highly educated people) using statistical controls (see chapter 3, Multivariate Analysis, page 101). However, statistical controls and weighting procedures are only effective when the highly educated respondents sampled are representative of all highly educated in the population.

Finally, missing data can negatively influence the quality of the collected data. If, for example, respondents in Europe are asked about their income, they may be reluctant to answer because earnings are considered a private matter. Consequently, it is not surprising that a lot of information remains missing when respondents are asked to report their exact income. If this reluctance to share information occurs randomly among respondents, not much statistical harm is done. The situation becomes more troubling when respondents from the upper classes systematically refuse to answer the question. Consequently, the average estimated income in the sample would be underestimated. A potential solution to this problem is not to ask the exact income, but to have respondents indicate their income by a number of fixed, broadly defined income categories. Generally, attempts should be made to limit the amount of missing data to the lowest possible levels. General strategies include proper instructions for interviewers when sensitive questions occur in questionnaires, or to have interviewers trained to react appropriately when respondents give evasive answers or simply refuse to answer the question. However, even when taking these precautions, samples may still suffer from missing data. Fortunately, statistical techniques like multiple data imputation can be used to replace missing data, provided that some specific conditions have been met.

1.6 FROM COLLECTING DATA TO ANSWERING RESEARCH QUESTIONS

The previous sections provided a brief introduction to the conditions that data must meet before they can be fruitfully used in statistical analyses. All research fields require high quality data, but this is especially true of scientific research. The method of data collection should closely correspond to the goal of the research project, and the researchers should provide a clear overview of the validity and reliability of the data, the sample's representativity, and the ways (serious) missing data problems have been dealt with. Furthermore, it is customary to check and correct the data for errors - a process referred to as data cleaning. This should be done well before presenting descriptive statistics (see chapter 2) or inferential statistics (see chapter 3). The next chapter outlines various descriptive statistical tools, including those used in the process of data cleaning.

DESCRIPTIVE STATISTICS

2.1 INTRODUCTION

When describing statistical data, it is not very useful to describe every unit separately - a strategy more closely fitting with qualitative techniques such as in-depth interviews. Because the number of observations in data sets is relatively large, adequate summaries of the data are more informative. These summaries can be represented by diagrams (graphical) or with statistical measures (numerical). This chapter will first introduce a number of graphical and numerical summaries of a single variable. Second, descriptions of the associations between two variables are introduced. Finally, this chapter ends with a schematic overview of the descriptive statistical tools that were introduced.

2.2 GRAPHICAL DESCRIPTION OF A SINGLE VARIABLE

Bar chart

Bar charts are often used for summarizing the scores on nominal and ordinal variables (see section 1.2). In bar charts, the variable's categories are placed on the horizontal axis (x-axis) of the chart. On the vertical y-axis the absolute frequency or the relative frequency (in percentages) of each category is shown. Every category is represented in the chart by a bar. The height of these bars is proportional to the frequency of occurrence. The bars have equal width, while there is some spacing in between bars. To ensure readability of the chart, the number of categories should not be too large (many software packages allow for the exclusion of one or more categories from a bar chart), allowing the bar chart to provide a clear picture of all counts. For example, Figure 2.1 shows that many respondents have Lower Vocational School (15%), Secondary Vocational School (24%), or College (20%) as their highest educational level, whereas O levels and A levels are clearly attained less (both approximately 5%). This is not surprising given that these educational levels include little vocational training.

Figure 2.1 Bar Chart for Highest Completed Educational Level (x-axis: educational level; y-axis: percentage, 0% to 25%)

Pie chart

Pie charts provide a useful alternative to bar charts. The diagram contains a circle, and each segment of the circle represents a category. Each segment covers an area that is proportional to the frequency of occurrence. Pie charts are frequently used to show results in the media (e.g., during political elections). In science, bar charts are generally preferred instead because they are clearer and people are less likely to misjudge the proportions of each area to the extent that they do when evaluating pie charts. If a pie chart is chosen, corresponding percentages should be included in each section to avoid misconception (see Figure 2.2). Pie charts are difficult to interpret when many categories are represented, especially when there are no categories with a high frequency of occurrence. In practice, the use of pie charts is limited to nominal (and to a lesser extent ordinal) variables with a small number of categories, while (preferably) only a few categories represent large portions of all units, as in Figure 2.2.

Figure 2.2 Pie Chart for Marital State (percentages included; among the segments: Married 54.2%, Not Married 29.4%)

Histogram

Since interval and ratio variables generally have a larger number of categories, a description of these variables using a bar chart is preferable to a pie chart. A bar chart, however, has spacing between adjacent categories (see Figure 2.1), which symbolizes the fact that the exact distance between all categories is unknown. As stated before, this is the case for both nominal and ordinal variables. However, the subsequent intervals between categories in interval and ratio variables are fixed. This characteristic is accounted for in histograms as the spacing between bars is absent. A histogram for the variable age (ratio scale) is shown in Figure 2.3. This figure provides good insight into the distribution of the variable, which is somewhat hill-shaped.

Figure 2.3 Histogram for Age (range: 18-69 years, one-year intervals; x-axis: age; y-axis: number of respondents)

Stem-and-leaf plot

A stem-and-leaf plot is an alternative way of graphically presenting variables measured at interval and ratio levels. Like a histogram, stem-and-leaf plots give information about the shape of a variable's distribution. In these plots, a distinction is made between the stem and the leaf. Figure 2.4 shows the distribution of the weekly working hours. The stem of the chart contains the first digit ('stem width = 10') and the leaves denote the second digit, where every leaf represents a single observation ('each leaf: 1 case'). The first row contains five respondents who work at least 10 hours per week (as indicated by the stem of 1). The leaves indicate how many hours each individual works. To illustrate: two respondents work 10 hours (10 + 0), the other three work 12 (10 + 2), 13 (10 + 3), and 14 (10 + 4) hours, respectively. The stem-and-leaf plot clearly shows that working forty hours a week is most frequent: 42 respondents have a 'nine-to-five' job. The interval/ratio character of stem-and-leaf plots is mirrored by the linear increase in digits of stems, even if there are no observations attached to the stem. The stem-and-leaf plot is especially suited to represent an interval or ratio variable with a limited number of observations. In large data sets, the rows very quickly become too long. To counter this, statistical software such as SPSS makes it possible for each leaf to represent more than a single observation. This, however, may result in a slightly less accurate plot, where the distribution is less readable. A more suitable graphical description of interval and ratio variables with many observations is the histogram.

Working Hours a Week (Stem Width: 10, Each Leaf: 1 Case)

Counts   Stem   Leaf
     5   1.     00234
    10   1.     5555668889
    13   2.     0000000123344
     9   2.     566778889
    19   3.     0000000001222222222
    27   3.     566666666666777888888888888
    42   4.     000000000000000000000000000000000000000000
     2   4.     55
     7   5.     0000000
     2   5.     55

Figure 2.4 Stem-and-Leaf Plot for Working Hours a Week
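To make the construction of such a plot concrete, here is a minimal Python sketch that splits each value into a stem and a leaf and uses two rows per stem (leaves 0-4 and 5-9), as in Figure 2.4. The function `stem_and_leaf` and the short list of working hours are hypothetical; the sketch assumes non-negative integer values and a stem width of 10.

```python
from collections import defaultdict

def stem_and_leaf(values, stem_width=10):
    """Return one text line per half-stem: count, stem, and sorted leaves."""
    half = stem_width // 2
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, stem_width)
        rows[(stem, leaf >= half)].append(str(leaf))
    first, last = min(rows), max(rows)
    lines = []
    # Walk through every stem in between, so empty stems are shown as well.
    for stem in range(first[0], last[0] + 1):
        for upper in (False, True):
            if first <= (stem, upper) <= last:
                leaves = rows.get((stem, upper), [])
                lines.append(f"{len(leaves):>3}  {stem}. {''.join(leaves)}")
    return lines

# Hypothetical weekly working hours of 14 respondents.
hours = [10, 10, 12, 13, 14, 15, 15, 16, 20, 21, 40, 40, 40, 45]
lines = stem_and_leaf(hours)
print("\n".join(lines))
```

Rows without observations are printed with a count of 0, which keeps the linear increase of the stems visible.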

2.3 NUMERICAL DESCRIPTION OF A SINGLE VARIABLE

The previous section showed how a multitude of data can be appropriately summarized using graphical tools. Nevertheless, presenting (the shape of) a distribution is often not the only objective. In statistics there are also various ways to numerically express specific characteristics of that distribution. These numerical descriptions generally relate to the center and the variability of a variable (see Figure 2.5). For example, it is instructive to present both center and variation of the age distribution not only graphically (see Figure 2.3) but also numerically.

Figure 2.5 Center and Variability of a Distribution


Frequency Table

A frequency table is a useful and popular way of numerically presenting a variable, irrespective of the level of measurement. It contains a list of all the variable's categories along with absolute counts, percentages, and if necessary, valid percentages and cumulative percentages. The number of categories should be limited as a frequency table with ten or more categories is often difficult to read. Table 2.6 is a frequency table of the highest completed educational level (the same variable that was used and graphically presented earlier in Figure 2.1).

Table 2.6 Frequency Table for Highest Completed Educational Level

Highest Completed             Counts   Percentage   Valid        Cumulative
Educational Level                                   Percentage   Percentage

1 Elementary school               90          6.5          6.7          6.7
2 Lower Vocational school        215         15.6         15.9         22.6
3 Lower Secondary school         178         12.9         13.2         35.8
4 Secondary Vocational           334         24.3         24.7         60.5
5 O levels                        62          4.5          4.6         65.1
6 A levels                        79          5.7          5.8         70.9
7 College                        281         20.4         20.8         91.7
8 University                     112          8.1          8.3        100.0
9 Other educational levels        24          1.7

Total                          1,375        100.0        100.0

Table 2.6 shows 'O levels' to be the least frequent of the classified levels: of all 1,375 respondents only 62 have completed this level of education, amounting to 4.5 percent (= (62 / 1,375) * 100). Note that the denominator includes respondents from all categories including 'Other Educational Levels'. To calculate percentages based on all respondents with a classified educational level only, this ninth category must be excluded (i.e., defined as a 'missing value'). Because the denominator now is 1,351 (24 less) the valid percentages are slightly higher. Based on cumulative percentages, 60.5% of all respondents with a classified level (= ((90 + 215 + 178 + 334) / 1,351) * 100) have secondary vocational school or less. Again, the 24 respondents in the 'other' category are excluded. A frequency table provides a lot of information and may be confusing to the reader, especially if large and/or many tables are presented. If this is the case, graphical representations are often more suitable, while it is also possible to present relevant characteristics of a distribution with a single value. These are introduced in the next section.
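The percentages, valid percentages, and cumulative percentages of Table 2.6 can be recomputed from the counts alone. The following Python sketch does so; the variable names are chosen for this illustration, and treating 'Other educational levels' as the missing category follows the text above.

```python
# Counts per category, taken from Table 2.6.
counts = {
    "Elementary school": 90,
    "Lower Vocational school": 215,
    "Lower Secondary school": 178,
    "Secondary Vocational": 334,
    "O levels": 62,
    "A levels": 79,
    "College": 281,
    "University": 112,
    "Other educational levels": 24,
}
missing = {"Other educational levels"}  # defined as a 'missing value'

total = sum(counts.values())
valid_total = sum(n for level, n in counts.items() if level not in missing)

rows = []
cumulative = 0
for level, n in counts.items():
    percentage = round(100 * n / total, 1)
    if level in missing:
        rows.append((level, n, percentage, None, None))
    else:
        cumulative += n
        rows.append((level, n, percentage,
                     round(100 * n / valid_total, 1),
                     round(100 * cumulative / valid_total, 1)))

for row in rows:
    print(row)
```

The cumulative percentages are computed from the cumulative counts rather than by adding up rounded valid percentages, which avoids accumulating rounding error.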


2.3.1 MEASURES OF CENTRAL TENDENCY

Mode

The least complicated way of describing the center of a distribution with a single value is to report the category that has the highest frequency of occurrence. This is called the mode. In Figure 2.1 and Table 2.6 the mode equals 4, which is 'Secondary Vocational School', while in Figure 2.2 the mode is 'Married' (code = 2).

The mode is often used when income distributions are described. It is highly instructive to know what income category most working people fall into (also known as the modal income class). By definition, the mode does not require any rank order of the categories nor does it require equal distances between categories. Hence, the mode can be applied to any variable, although it is typically applied to nominal variables. A disadvantage of the mode is that its value is sometimes difficult to determine and that it can be rather ambiguous. For example, the mode in the age distribution (see Figure 2.3) can be 32 and 34 as both categories occur equally frequently (62 observations each), while other categories that are almost as frequent (38 and 42 both occur 61 times) are not represented at all in the mode.
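The ambiguity of the mode is easy to demonstrate. In the Python sketch below, only the four frequencies mentioned in the text (62, 62, 61, and 61) are taken from the age example; the remaining observations are invented to complete a small data set.

```python
from collections import Counter
from statistics import multimode

# Ages 32 and 34 occur 62 times each, ages 38 and 42 occur 61 times each;
# the ten respondents aged 50 are a hypothetical addition.
ages = [32] * 62 + [34] * 62 + [38] * 61 + [42] * 61 + [50] * 10

modes = multimode(ages)                   # all values with the highest frequency
runners_up = Counter(ages).most_common(4)

print(modes)
print(runners_up)
```

Two values tie for the mode, and the nearly as frequent ages 38 and 42 are invisible if only the mode is reported.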

Median

The median describes another aspect of a distribution's center, namely the point at which half of the total number of observations is reached. To determine the median, the data must be rank ordered first. For example, the range of numbers:

10, 70, 20, 50, 20, 30, 40, 40, 10, 60, 70, 80, 90, 90, 90

is first ranked to: 10, 10, 20, 20, 30, 40, 40, 50, 60, 70, 70, 80, 90, 90, 90.

The median in this ranked row of numbers is situated at observation no. 8, because this is the most central observation (seven observations have lower numbers and seven observations have higher numbers). This means that the median equals 50. When the number of observations is even, the median lies exactly between the two most central observations. For instance, if the number 100 is added to the range of numbers in the example shown above, the median then becomes 55 (the number exactly in between numbers 50 and 60).

10, 10, 20, 20, 30, 40, 40, 50, 60, 70, 70, 80, 90, 90, 90, 100.
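Both cases, an odd and an even number of observations, can be checked with Python's standard library. The variable names below are chosen for this illustration only.

```python
from statistics import median

scores = [10, 70, 20, 50, 20, 30, 40, 40, 10, 60, 70, 80, 90, 90, 90]

# Odd number of observations (15): the median is the 8th ranked score.
ranked = sorted(scores)
median_odd = median(scores)

# Even number of observations (16): the median lies halfway between
# the two most central ranked scores, here 50 and 60.
median_even = median(scores + [100])

print(ranked[7], median_odd, median_even)
```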

Of course, the number of observations in data sets is typically far greater than in the example shown. In these instances, the median can be calculated from a frequency table. For example, in Table 2.6 the median is the fourth category (Secondary Vocational) because of all 1,351 valid observations the most central observation is no. 676 (calculation: (1,351 + 1) / 2), and this observation falls into the fourth category. The third category (Lower Secondary) cannot be the median because this level contains only observations up to no. 483 (90 + 215 + 178). Likewise, the fifth category (O Levels) is not the median because it only starts after observation no. 817 (483 + 334). This can also be easily inferred from the cumulative percentages in Table 2.6: for the third category this is 35.8, and for the fourth category this amounts to 60.5. The point at which 50% of all (ranked) observations are counted thus resides within the fourth category. Generally, however, the median need not be calculated manually in this way because its algorithm is included in all statistical software packages.
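The reasoning with cumulative counts can be written out as a short Python sketch. The function `median_category` is hypothetical; it takes the valid counts of Table 2.6 in rank order and returns the code of the category that contains the most central observation.

```python
def median_category(counts):
    """Return the 1-based code of the category holding observation (n + 1) / 2.

    Assumes the categories are listed in rank order. With an even n, the two
    central observations could fall into different categories; this sketch
    then returns the category of the lower of the two.
    """
    n = sum(counts)
    middle = (n + 1) / 2
    cumulative = 0
    for code, count in enumerate(counts, start=1):
        cumulative += count
        if cumulative >= middle:
            return code

# Valid counts from Table 2.6 (categories 1 through 8, 'Other' excluded).
valid_counts = [90, 215, 178, 334, 62, 79, 281, 112]
middle_observation = (sum(valid_counts) + 1) / 2
median_code = median_category(valid_counts)

print(middle_observation, median_code)
```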

Table 2.7 Median of Highest Completed Educational Level

Median   Number of Valid Observations
     4                          1,351

A well-known example in which the median plays an important role is in the determination of the poverty threshold. First, the median of all household incomes is determined, i.e., the income of the household that is reached after 50% of all the ranked households are counted. This is shown in Figure 2.8, where the median of the income distribution equals 1,300 euros.

Figure 2.8 Definition of the Poverty Threshold using the Median (income distribution with the median at €1,300, marking 50% of all households, and the poverty threshold at €780)

,) )

l ; rom t h�.: 1 1 1L:d i : 1 1 1 o l I , 100, : 1 j )L' I l' ' l l l ag�.: is taken to determ ine the poverty t h reshold . 1 1 1 I l l · I •: I I H I J ll ' : l l l l i 1 1 i u l l , t h i s percen tage is genera l l y set to 60%.

So, the threshold amounts to 780 euros (1,300 * (60 / 100)), which means that households with a net household income below 780 euros are considered to be below the poverty line. The median is used to determine the threshold because it is not sensitive to extremely high income values that are part of the overall income distribution in many parts of the world. Consider, for instance, a sample of 1,001 households in which the most central household after ranking is no. 501. Suppose that after ranking, households no. 451 through 551 turn out to have an income of 1,300 euros. If 10 households are added to the sample with an income of two million euros, the total number of households rises to 1,011. As a result the median shifts from household no. 501 to household no. 506. However, the total income of household no. 506 is 1,300 euros, so the median remains the same. More extremely, we could add up to 100 households with extremely high incomes (the exact income is irrelevant) to the original sample of 1,001 households without any change in the median (the median will still be 1,300 euros, because the median is at household number 551 in case 100 high incomes are added). In general, the median is said to be a robust measure, which means that it is rather insensitive to extreme scores (also called outliers). A necessary condition for using the median is that variables need to be at least ordinal, as the observations have to be ranked meaningfully first.
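The household example can be checked with a few lines of Python. This is only a sketch: the incomes outside the block of households no. 451 through 551 are made up for illustration, and the function name `poverty_threshold` is our own.

```python
from statistics import median

# Hypothetical ranked sample of 1,001 net household incomes (euros).
# Households no. 451 through 551 all earn 1,300, as in the example;
# the incomes below and above that block are invented for illustration.
incomes = (
    [500 + i for i in range(450)]       # no. 1-450
    + [1300] * 101                      # no. 451-551
    + [1400 + i for i in range(450)]    # no. 552-1001
)

def poverty_threshold(values, share=0.60):
    """Poverty threshold as a share of the median income (60% in the text)."""
    return round(share * median(values), 2)

median_before = median(incomes)                         # household no. 501
median_plus_10 = median(incomes + [2_000_000] * 10)     # household no. 506
median_plus_100 = median(incomes + [2_000_000] * 100)   # household no. 551
threshold = poverty_threshold(incomes)

print(median_before, median_plus_10, median_plus_100, threshold)
```

In all three cases the median stays at 1,300 euros, so the threshold of 780 euros does not move either.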

Mean

The mean (or more accurately, the arithmetic mean, symbol: x̄) is the most commonly used measure to indicate the center of a distribution. The principle of the mean is that there is a point in a variable's distribution at which equilibrium is found (see Figure 2.9). To calculate this point, the scores of all observations are summed and divided by the total number of observations.¹ For example, in the following range of numbers, the mean equals 44.

5, 8, 10, 25, 25, 50, 50, 70, 70, 81, 90 → 5 + 8 + 10 + 25 + 25 + 50 + 50 + 70 + 70 + 81 + 90 = 484 → 484 / 11 = 44.

All numbers can now be simultaneously replaced by 44 without changing the sum of all scores (11 * 44 = 484). So, on average, every observation has a score of 44. As said, the mean is the point on the distribution at which the scores perfectly balance each other. To illustrate this, we first subtract the mean from each value: 5 - 44, 8 - 44, ..., 90 - 44, which results in the following numbers: -39, -36, -34, -19, -19, 6, 6, 26, 26, 37, 46. The mean of these numbers consequently is 0. The sum of all negative numbers equals -147 (-39 + -36 + -34 + -19 + -19) and the sum of all positive numbers equals 147 (6 + 6 + 26 + 26 + 37 + 46). In absolute terms, both sums are equal, and thus balance each other out. In addition to this arithmetic exercise, we can also show graphically that the mean is the point at which the balance is in equilibrium:
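The balancing argument can be retraced in Python. The following is a minimal sketch using the eleven scores from the text; the variable names are our own.

```python
scores = [5, 8, 10, 25, 25, 50, 50, 70, 70, 81, 90]

mean = sum(scores) / len(scores)            # 484 / 11
deviations = [x - mean for x in scores]     # distance of each score to the mean

# Negative and positive deviations cancel each other out exactly.
negative_sum = sum(d for d in deviations if d < 0)
positive_sum = sum(d for d in deviations if d > 0)

print(mean, negative_sum, positive_sum)
```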

[Figure 2.9: balance carrying the scores 5, 8, 10, 25, 25, 50, 50, 70, 70, 81 and 90, in equilibrium at the mean of 44]

Figure 2.9 The Mean as the Center of a Balance in Equilibrium

An obvious disadvantage of the mean can be derived from this figure. If outliers (very high or very low values) are added to the balance, the point at which the balance is in equilibrium shifts profoundly. For example, suppose that a value of 188 is added to the balance. The mean then becomes (484 + 188) / 12 = 56! Note that adding 188 to the scores does not alter the median (= 50)! Generally, the mean is an adequate measure for a distribution's center as long as this distribution is not overly skewed to the left or the right due to extreme scores (outliers). Highly skewed distributions can easily be recognized because of their distinct shape (see Figure 2.10). As a rule, a distribution is skewed to the right if the mean is higher than the median, and vice versa for distributions skewed to the left (see Figure 2.10). Generally, in strongly skewed distributions (such as income distributions) the median is more appropriate than the mean.
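The effect of the added score of 188 on the mean and the median can be verified as follows. This is a small sketch based on the numbers in the text, using Python's standard `statistics` module.

```python
from statistics import mean, median

scores = [5, 8, 10, 25, 25, 50, 50, 70, 70, 81, 90]
with_outlier = scores + [188]       # one extreme score is added

mean_before, mean_after = mean(scores), mean(with_outlier)
median_before, median_after = median(scores), median(with_outlier)

# The mean shifts from 44 to 56, while the median stays at 50.
print(mean_before, mean_after, median_before, median_after)
```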

[Figure 2.10: skewed distributions, with the position of the mean marked in a distribution that is skewed to the right]

Individual characteristics, such as body height and body weight, tend to have a more or less symmetrical distribution, which means that an approximately equal number of observations can be found to the left and to the right of the mean (see Figures 2.3 and 2.23). The mean therefore is a very useful tool to indicate the center of these two distributions. Table 2.11 shows the means for the ratio variables body height and body weight.

Finally, we would like to note that the use of the mean is limited to interval and ratio variables, as calculations of the mean require summation of all values, which is only meaningful when the intervals between adjacent categories are known (or assumed to be known).

Table 2.11 Means of Body Height and Body Weight

          Height    Weight
Mean      173.83    76.24

2.3.2 MEASURES OF VARIABILITY

When describing a distribution numerically, it is often not enough to report the central tendency using the mode, median, and/or mean, because a distribution also has a certain degree of variability around its center. As shown in Figure 2.12, the variability of distributions can be quite different, even when mode, median, and mean are equal.

[Figure 2.12: distributions with different variability that share the same mode/median/mean]

Figure 2.12 Same Mode/Median/Mean but Different Variability

Range

The most basic way to say something about a distribution's variability is to calculate the difference between the maximum and minimum score. This measure, the range, can be determined for ordinal (in some cases), interval and ratio variables. In the sequence 10, 30, 50, 60, 90, the range equals 80 (90 - 10). However, the downside of using this measure is its high sensitivity to extreme scores; when just one score of 170 is added to the sequence above, the range is doubled. Another disadvantage is that the range is not informative about the exact shape of the distribution. To illustrate this, Figure 2.13 shows two quite differently shaped distributions that have the same range (80).
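The sensitivity of the range to a single extreme score can be illustrated with a short Python sketch. The helper name `value_range` is our own.

```python
def value_range(values):
    """Range: difference between the maximum and the minimum score."""
    return max(values) - min(values)

sequence = [10, 30, 50, 60, 90]
range_before = value_range(sequence)            # 90 - 10
range_after = value_range(sequence + [170])     # 170 - 10

print(range_before, range_after)
```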

[Figure 2.13: two differently shaped distributions, both running from 10 to 90]

Figure 2.13 Same Range but Different Shaped Distributions

Interquartile Range (IQR)

A more appropriate alternative is the interquartile range (IQR). This measure indicates the range of the middle 50% of all observations. To determine this, quartiles are used. Quartiles split the distribution into four equally sized parts, where each part contains 25% of all observations. Previously, the median was said to be the point at which half the number of observations has been counted (after ranking). In terms of quartiles, the median is the second quartile (indicated as Q2). The difference between the first and the third quartile then is the interquartile range (Q3 - Q1 = IQR), as shown in Figure 2.14.

[Figure 2.14: distribution divided at Q1, Q2 and Q3 into the first 25% of observations, the central 50% (the IQR) and the last 25%]

Figure 2.14 Meaning of Quartiles and the Interquartile Range (IQR)


As previously stated, the median (Q2) is robust, which means that this measure is relatively insensitive to extreme scores. This means that Q1, Q3, and consequently the IQR, share this robustness as well. The advantage of the IQR over the range is that the differences in the degree of variability are better represented. Figure 2.15 shows the distributions from Figure 2.13, but now with the addition of the interquartile range. The IQR of the first distribution is 40, while the IQR is only half of that (20) for the second. This is because these distributions have quite different shapes. As with the range, the IQR can be calculated with all types of variables except nominal ones. Table 2.16 shows the median (Q2), range, minimum and maximum, and the three quartiles for the variables body height and body weight. Notice that in SPSS the IQR is not presented and has to be calculated from Q1 and Q3 afterwards (IQR body height = Q3 - Q1 = 180 - 167 = 13 and IQR body weight = 85 - 66 = 19).
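Calculating the IQR afterwards is a matter of simple subtraction. The sketch below uses the minimum, maximum and quartiles reported for body height and body weight; the dictionary layout and the function name `iqr` are our own choices.

```python
# Minimum, maximum and quartiles as reported for the two variables.
summary = {
    "body height": {"min": 152, "q1": 167, "q2": 173, "q3": 180, "max": 204},
    "body weight": {"min": 44, "q1": 66, "q2": 75, "q3": 85, "max": 125},
}

def iqr(stats):
    """Interquartile range: third quartile minus first quartile."""
    return stats["q3"] - stats["q1"]

for name, stats in summary.items():
    full_range = stats["max"] - stats["min"]
    print(name, "range:", full_range, "IQR:", iqr(stats))
```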

Figure 2.15 Different IQR due to Different Shaped Distributions

Table 2.16 Numerical Measures of the Variability of Body Height and Body Weight

                            Height    Weight
Number of Observations      1,209     1,209
Median                      173       75
Range                       52        81
Minimum                     152       44
Maximum                     204       125
Quartiles    1st (Q1)       167       66
             2nd (Q2)       173       75
             3rd (Q3)       180       85


Detecting Outliers with Box Plots

Box plots were not illustrated while describing charts in section 2.2 because they contain statistical measures that had not yet been introduced at that point: the median, quartiles, and the interquartile range. Box plots are well suited to detect exceptionally low and high scores, to describe the overall distribution, and to compare distributions (the latter is described in section 2.4.1).

As mentioned, some measures like the mean are sensitive to exceptionally low and high scores (known as outliers). Outliers can originate from errors during data entry (for instance, someone erroneously enters a score of 100 into the database instead of the intended 10). Also, it is common practice to designate relatively high scores (99 or 999) to special categories such as the answer 'don't know' in questions about attitudes. When analyzing data, these scores need to be set to 'missing' during the data cleaning process, but occasionally mistakes occur. Finally, extreme scores can result from valid observations, such as the income earned by top senior managers. In box plots created by SPSS the extreme low/high scores are indicated with o and *. Observations indicated with o are located between Q1 - 1.5 IQR and Q1 - 3 IQR (low scores), and between Q3 + 1.5 IQR and Q3 + 3 IQR (high scores). Observations indicated with * are located below Q1 - 3 IQR (extremely low scores) or above Q3 + 3 IQR (extremely high scores). Very extreme low/high scores are potentially unwanted outliers that influence the results in an undesirable way.

To illustrate, Figure 2.17 shows a box plot for the variable weekly working hours. In this figure, Q1 equals 24 working hours per week and Q3 equals 40 hours per week (IQR thus equals 16). The extreme scores are located at the top of the distribution. Observations indicated with o are between Q3 + 1.5 IQR and Q3 + 3 IQR; that is, between 64 and 88 hours (40 + 1.5 * 16 = 64 and 40 + 3 * 16 = 88). The observed scores 65, 66, 67, 70, 72, 75 and 80 fall into that interval. Some observations (indicated with *) are located beyond the point Q3 + 3 * IQR = 88. Their exact scores are 90 and 99 hours. Note that the box plot indicates potential outliers but it does not show exactly how many observations have extreme scores. A frequency table is suited to provide information about the frequency of occurrence (see Table 2.18). Table 2.18 shows that 24 respondents work 65 hours or more. As mentioned earlier, the mean is sensitive to such scores. The scores 90 and 99 will exert the strongest influence, and the researcher may rightfully wonder whether these are valid observations at all. On closer inspection, the codebook shows that these are codes for the answers 'don't know' (90) and 'did not answer' (99). In SPSS these codes should be designated as missing values, which excludes them from any statistical analysis.
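The two sets of boundaries can be written down as a small classification function. This is a sketch of the rule as described in the text, not SPSS's own implementation; the function name `classify` and the treatment of scores that fall exactly on a boundary (counted as the less extreme category) are our assumptions.

```python
def classify(score, q1, q3):
    """Label a score the way the box plot marks it: '*', 'o' or '' (regular)."""
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr
    if score < outer_low or score > outer_high:
        return "*"      # beyond Q1 - 3 IQR or Q3 + 3 IQR
    if score < inner_low or score > inner_high:
        return "o"      # between 1.5 IQR and 3 IQR away from the box
    return ""           # regular score

# Weekly working hours: Q1 = 24 and Q3 = 40, so the limits are 64 and 88.
labels = {hours: classify(hours, 24, 40) for hours in (60, 65, 80, 90, 99)}
print(labels)
```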

[Figure 2.17: box plot of weekly working hours (vertical axis from 0 to 100) marking Q1 (= 24), Q3 (= 40), IQR (= 16), the lowest score (= 0) within Q1 and Q1 - 1.5 * IQR (= 0), the highest score (= 60) within Q3 and Q3 + 1.5 * IQR (= 64), and Q3 + 3 IQR (= 88); scores between 64 and 88 are marked with o, scores above 88 with *]

Figure 2.17 Box Plot for Weekly Working Hours

Table 2.18 Respondents Working More Than 64 Hours a Week

Hours    Frequency Counts    Percentages    Cumulative Percentages
65       1                   4.2            4.2
66       1                   4.2            8.3
67       1                   4.2            12.5
70       11                  45.8           58.3
72       1                   4.2            62.5
75       2                   8.3            70.8
80       5                   20.8           91.7
90       1                   4.2            95.8
99       1                   4.2            100
Total    24                  100
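The percentages and cumulative percentages in such a frequency table follow directly from the counts. The sketch below assumes a count of 1 for every score that makes up 4.2% of the 24 respondents; the variable names are our own.

```python
# Frequency counts per number of weekly working hours (24 respondents).
# Assumption: each score with a share of 4.2% occurs exactly once.
counts = {65: 1, 66: 1, 67: 1, 70: 11, 72: 1, 75: 2, 80: 5, 90: 1, 99: 1}

total = sum(counts.values())
rows = []
running = 0
for hours in sorted(counts):
    running += counts[hours]
    percentage = round(100 * counts[hours] / total, 1)
    cumulative = round(100 * running / total, 1)
    rows.append((hours, counts[hours], percentage, cumulative))

for row in rows:
    print(row)
```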


To determine the influence exerted by these extreme values, the mean and the quartiles are calculated for three scenarios: inclusion of all cases that scored 90 and 99, exclusion of these cases, and exclusion of all cases with 65 or more working hours. The results are shown in Table 2.19.

Table 2.19 Descriptive Statistics with and without Outliers

Weekly Working Hours        Full Sample    90 and 99 Excluded    >64 Excluded
Valid Observations          1,313          1,311                 1,289
Mean                        33.74          33.65                 32.99
First Quartile              24             24                    24
Median (Second Quartile)    38             38                    37
Third Quartile              40             40                    40

When the cases with scores 90 and 99 are excluded from the sample, the mean changes slightly from 33.74 to 33.65. There is only one case that scored 90 and only one case with 99, which explains the rather minor change. If there were a substantial proportion with these scores, the mean would have been seriously affected. Note that all quartiles remained exactly the same.

Exclusion of all extreme scores (more than 64 hours) has more serious consequences for the mean as it decreases by three quarters of an hour, whereas the median decreases by one hour. Because it is plausible that people work 80 hours per week (bearing in mind long working hours, for instance, in bars, restaurants, and finance), the second column in Table 2.19 seems to be best for describing working hours. That is, exclusion of the scores 90 and 99 (not representing observed hours) and inclusion of the respondents working between 64 and 80 hours per week.
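The shift in the mean can be retraced from the full-sample figures alone. The sketch below assumes the counts per extreme score that follow from the frequency table (one respondent each for 65, 66, 67, 72, 90 and 99); because the reported mean of 33.74 is itself rounded, the results only match the reported means approximately. The function name is our own.

```python
FULL_N, FULL_MEAN = 1313, 33.74     # full sample, as reported

# Assumed counts of the respondents working 65 hours or more.
extreme = {65: 1, 66: 1, 67: 1, 70: 11, 72: 1, 75: 2, 80: 5, 90: 1, 99: 1}

def mean_after_exclusion(n, mean, removed):
    """Mean of the remaining cases after removing the given score counts."""
    removed_n = sum(removed.values())
    removed_sum = sum(score * count for score, count in removed.items())
    return (n * mean - removed_sum) / (n - removed_n)

mean_without_codes = mean_after_exclusion(FULL_N, FULL_MEAN, {90: 1, 99: 1})
mean_without_extremes = mean_after_exclusion(FULL_N, FULL_MEAN, extreme)

print(round(mean_without_codes, 2), round(mean_without_extremes, 2))
```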

Standard Deviation and Variance

The standard deviation is the most commonly used measure for variability. This measure is related to the distance between the observations and the mean. For example, suppose we have the following range of numbers: 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100. The mean is 55 ((10 + 20 + 30 + ... + 100) / 10). How can the variability around the mean be best defined? Taking all distances from the mean together is inappropriate as this would result in the range: -45 (10 - 55), -35, -25, -15, -5, 5, 15, 25, 35 and 45. The sum of this range is always 0, which of course is not informative of the variability. It is more appropriate to turn all distances into absolute distances (through multiplying the negative numbers by -1). The sum then amounts to 250 (45 + 35 + 25 + 15 + 5 + 5 + 15 + 25 + 35 + 45). This sum, divided by the number of observations, yields the mean distance: 250 / 10 = 25. However, this absolute measure is not often used because it does not relate well to inferential statistics (see chapter 3).

Another strategy is to sum the squared distances (a negative score turns positive when squared). This results in a sum of 8250 (= (-45)² + (-35)² + (-25)² + (-15)² + (-5)² + 5² + 15² + 25² + 35² + 45² = 2025 + 1225 + 625 + 225 + 25 + 25 + 225 + 625 + 1225 + 2025). By dividing this sum by the number of observations (10), the average squared distance to the mean equals 825. In statistics, this number is known as the variance. The variance can be compared to the area of a square (see Figure 2.20).

[Figure 2.20: square with sides of 28.72 and an area of 28.72 * 28.72 = 825]

Figure 2.20 Variance Compared to the Area of a Square

In statistics, the measure of variability is preferably indicated as a distance instead of a squared distance (i.e., a square). The square root of 825 (= 28.72) is taken (this value equals the length of the sides in Figure 2.20) and the resulting measure is called the standard deviation.² Roughly, the standard deviation can be interpreted as the average distance from the mean, although mathematically this is not correct.³ However, for practical reasons this interpretation suffices. The standard deviation refers to a distinct and frequently used distribution: the normal distribution. The normal distribution is crucial for many statistical tests (see chapter 3). Unlike the variance, the standard deviation is expressed in the same units of measurement as the variable itself. For example, when the variable body weight is measured in terms of kilograms, the standard deviation indicates the 'average' distance in kilograms rather than squared kilograms, as is the case with variance (it is hard to imagine what squared kilograms would mean here). For illustrative purposes, Table 2.21 shows the mean, the standard deviation, and the variance of the variables body height and body weight.
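The three measures from this example can be recomputed with Python's standard `statistics` module. Note that this sketch divides by the number of observations, as the text does (`pvariance` and `pstdev`); many software packages divide by the number of observations minus one instead, which gives slightly larger values.

```python
from statistics import mean, pstdev, pvariance

numbers = list(range(10, 101, 10))      # 10, 20, ..., 100

center = mean(numbers)
# Mean absolute distance to the mean: 250 / 10.
mean_abs_distance = sum(abs(x - center) for x in numbers) / len(numbers)
# Mean squared distance to the mean: 8250 / 10.
variance = pvariance(numbers)
# Square root of the variance.
standard_deviation = pstdev(numbers)

print(center, mean_abs_distance, variance, round(standard_deviation, 2))
```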


Table 2.21 Mean, Standard Deviation and Variance of Height and Weight

                      Height    Weight
Mean                  173.83    76.24
Standard Deviation    9.48      13.41
Variance              89.90     179.70

Roughly speaking, respondents diverge on average 9.48 centimeters from the mean body height (173.83) and diverge on average approximately 13.41 kilograms from the mean body weight (76.24). The word 'diverge' is used because the distinction between long/short and light/heavy is no longer relevant. This is because all differences between observations and the mean were squared to calculate the standard deviation. Consequently, respondents weighing 3 kilograms below average and respondents weighing 3 kilograms above average both score 9 'square kilograms', so the information whether they are below or above average is lost.

Like the mean, the standard deviation can be compared to a balance in equilibrium. We can redistribute all respondents in such a way that half of them measure 164.35 centimeters (the mean minus the standard deviation → 173.83 - 9.48) and weigh 62.83 kilograms (76.24 - 13.41), while the others measure 183.31 centimeters (173.83 + 9.48) and weigh 89.65 kilograms (76.24 + 13.41). These newly constructed distributions of the variables height and weight have the same mean and the same standard deviation as the original variables. The only difference is that now all respondents are at a distance of 9.48 and 13.41 units from the mean respectively (see Figure 2.22, which shows this for the variable body weight).
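That such a redistributed variable keeps the original mean and standard deviation can be checked numerically. The sketch below builds a hypothetical two-point distribution for body weight; the function name `two_point_distribution` and the group size of 500 are arbitrary choices of ours.

```python
from statistics import mean, pstdev

def two_point_distribution(center, sd, n_half):
    """Half of the cases at center - sd, the other half at center + sd."""
    return [center - sd] * n_half + [center + sd] * n_half

weights = two_point_distribution(76.24, 13.41, 500)   # 62.83 kg and 89.65 kg

new_mean = mean(weights)
new_sd = pstdev(weights)    # divides by the number of observations

print(round(new_mean, 2), round(new_sd, 2))
```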

Because the mean is part of the calculation, the standard deviation is only suitable for interval and ratio variables. Also like the mean, the standard deviation is sensitive to outliers, and in the case of extremely right and left skewed distributions, the IQR is actually a better-suited measure than the standard deviation.

[Figure 2.22: balance around the mean body weight of 76.24, with 50% of all observations (weighing less than average) at 62.83 and 50% (weighing more than average) at 89.65, each at a distance of 13.41 from the mean]

Figure 2.22 Standard Deviation as Distance to the Mean (Body Weight taken as example)

2.3.3 MEASURES OF RELATIVE STANDING

A common problem in research is incomparability. Often, data on various variables are available, but the units of measurement are not identical. This is problematic when these variables need to be compared. Previous calculations demonstrated that the standard deviation for body height is 9.48 centimeters and 13.41 kilograms for body weight (see Table 2.21). From this we cannot infer that body height shows less variability than body weight; this would be like comparing apples and oranges. This is not to say that apples and oranges cannot be compared at all, one only has to take into account their similarities. For example, the amount of vitamins and/or calories in apples and oranges is perfectly comparable, and something similar is possible with body height and body weight. Consider, for instance, a person who measures 180 centimeters and weighs 90 kilograms. Based on the means (shown in Table 2.21), it can be concluded that this person is both taller and heavier than the average. In order to make a proper comparison though, the position in the distribution of height and weight must be compared. With regard to height, this person is located to the right of the mean. This is also true for this person's weight, but this observation lies far more to the right of the mean. However, the question remains: how much more to the right exactly?

[Figure 2.23: distributions of height (x̄ = 173.83, with the score 180 marked) and weight (x̄ = 76.24, with the score 90 marked)]

Figure 2.23 Location of a Person with Body Height 180 cm and Body Weight 90 kg


Percentiles

One answer to this question lies in percentages. In this case, we have to calculate the percentage of respondents with a height of 180 centimeters or less, and the percentage of respondents weighing 90 kilograms or less. These percentages are called percentiles. In fact, we already explained what percentiles mean, because the quartiles in Table 2.16 are equal to the 25th, 50th, and 75th percentile. In other words, a percentile indicates the percentage of (ranked) observations that is counted from observation no. 1 onwards. To determine the exact percentile, cumulative percentages are most useful. Table 2.24 shows (parts of) the frequency tables for body height and body weight. The cumulative percentages indicate that the (truncated) percentile scores are 76 (for 180 cm) and 86 (90 kg), respectively. Based on these scores, a fair comparison is possible. Compared to the person who measures 180 centimeters and weighs 90 kilograms, 24% of all respondents are taller, but 'only' 14% are heavier. So the person in this example is relatively heavier than he is tall. Percentiles are commonly used, including in education for all kinds of school performance tests. The percentile indicates the cumulative percentage of people performing equally well or worse compared to a pupil's test performance.

Table 2.24 Frequency Table for Body Height and Body Weight

Height    Frequency    Cumulative    Weight    Frequency    Cumulative
(cm)      Counts       Percentage    (kg)      Counts       Percentage
178       72           70.9          88        21           81.5
179       12           71.9          89        15           82.7
180       58           76.7          90        46           86.5
181       18           78.2          91        12           87.5
182       35           81.1          92        15           88.8
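Reading a (truncated) percentile from cumulative percentages can be sketched as follows. The dictionary holds the cumulative percentages of the frequency table for body height and body weight; the function name `truncated_percentile` is our own.

```python
import math

# Cumulative percentages per score, taken from the frequency table.
cumulative_pct = {
    "height": {178: 70.9, 179: 71.9, 180: 76.7, 181: 78.2, 182: 81.1},
    "weight": {88: 81.5, 89: 82.7, 90: 86.5, 91: 87.5, 92: 88.8},
}

def truncated_percentile(variable, score):
    """Percentile score: the cumulative percentage, truncated to a whole number."""
    return math.floor(cumulative_pct[variable][score])

height_pct = truncated_percentile("height", 180)
weight_pct = truncated_percentile("weight", 90)

# Share of respondents who are taller or heavier than this person.
print(height_pct, 100 - height_pct, weight_pct, 100 - weight_pct)
```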

Finally, it should be noted that percentiles can be computed for all variables except for nominal variables. In the case of interval and ratio variables, z-scores can be used in addition to percentiles to indicate the relative standing. Z-scores are discussed in the next section.

To determine relative standings, the absolute differences between observations and the mean are required. In our example where a person measures 180 centimeters and weighs 90 kilograms, these absolute differences amount to 6.17 centimeters (180 - 173.83) and 13.76 kilograms (90 - 76.24). It is incorrect to infer that this person differs approximately twice as much from the mean body weight compared to the mean body height. As stated before, the two variables are incomparable because different units of measurement are being used (i.e., centimeters vs. kilograms).

To obtain a common measure other than percentiles, the standard deviation is very useful as it indicates the average deviation from the mean. For example, in Figure 2.23 there are respondents who are exactly 1 standard deviation to the right of the mean. These respondents measure 183.31 centimeters (173.83 + 9.48) and weigh 89.65 kilograms (76.24 + 13.41). In absolute terms, they are located at 9.48 centimeters and 13.41 kilograms from their respective means. Relatively, however, these people are equally tall and heavy, for their relative position to the mean is the same: exactly 1 standard deviation! Two variables with different units of measurement can be correctly compared when absolute differences are replaced with relative differences. To achieve this we have to express the relative standing in terms of standard deviations.

We return to our example to illustrate this. The absolute differences between observations and mean amount to 6.17 centimeters and 13.76 kilograms (see previous calculations). The standard deviations are 9.48 and 13.41, respectively. It is clear that the height differs less than 1 standard deviation from the mean height. The weight differs approximately 1 standard deviation from the mean weight. The exact differences, labeled z-scores, are: .65 (calculation: 6.17 / 9.48) and 1.03 (13.76 / 13.41).⁴ The relative weight is thus about one and a half times as large as the relative height (1.03 / .65 = 1.58). Z-scores can be calculated in statistical software packages. In Table 2.25, the z-scores for the respondents who are 180 centimeters tall (58 observations) and for the 46 respondents weighing 90 kilograms are shown (calculations: SPSS).
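The z-score calculation itself is a one-line formula: the distance to the mean divided by the standard deviation. The following sketch applies it to the example person, using the rounded means and standard deviations from the text; the function name `z_score` is our own.

```python
def z_score(value, mean, sd):
    """Relative standing: distance to the mean in standard deviations."""
    return (value - mean) / sd

z_height = z_score(180, 173.83, 9.48)     # 6.17 / 9.48
z_weight = z_score(90, 76.24, 13.41)      # 13.76 / 13.41

# Ratio of the two rounded z-scores, as in the text.
ratio = round(z_weight, 2) / round(z_height, 2)

print(round(z_height, 2), round(z_weight, 2), round(ratio, 2))
```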

Table 2.25 Z-scores for Body Height = 180 cm and Body Weight = 90 kg

           Height = 180 cm    Weight = 90 kilo
Z-score    .65                1.026


Chebyshev's Rule and Empirical Rule

Besides comparing individual observations from different variables and measures, z-scores are used to compare the relative standing of multiple observations in one single distribution. When from any distribution the observations are taken that lie within z-scores -2 and 2, then this selection always comprises at least 3/4 (75%) of all observations. Between the z-scores -3 and 3 at least 8/9 (88.9%) of all observations are always found (see Figure 2.26). Generally, for any number of total observations, a proportion of at least 1 - 1 / z² is located between -z and z (where z is the number of standard deviations). This formula is known as Chebyshev's Rule, named after a nineteenth century Russian mathematician. Note that when the formula is applied to z = 1, at least zero observations (1 - 1 / 1² = 0) are located within z-scores -1 and 1. Clearly, this is not informative. Chebyshev's rule therefore is useful for any z > 1, but it is especially known for z = 2 (75%) and z = 3 (88.9%). Chebyshev's rule may be used for any distribution, regardless of its shape. The distribution shown in Figure 2.26 has multiple peaks and has a number of sudden rises and falls. Also, this distribution is skewed to the right. Nevertheless, Chebyshev's rule is valid!
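Chebyshev's rule can be tried out on any set of numbers. The sketch below computes the lower bound 1 - 1 / z² and compares it with the actual share of observations within z standard deviations in a deliberately skewed, made-up sample; the sample and the function names are our own.

```python
from statistics import mean, pstdev

def chebyshev_bound(z):
    """Minimum proportion of observations between z-scores -z and z."""
    return 1 - 1 / z ** 2

def share_within(values, z):
    """Actual proportion of observations less than z standard deviations from the mean."""
    center, sd = mean(values), pstdev(values)
    return sum(1 for x in values if abs(x - center) < z * sd) / len(values)

# Made-up sample that is strongly skewed to the right.
skewed = [1] * 50 + [2] * 30 + [5] * 15 + [50] * 5

for z in (2, 3):
    print(z, round(chebyshev_bound(z), 3), share_within(skewed, z))
```

For this sample the actual shares are well above the guaranteed minimum, which is what the rule promises: it gives a floor, not an estimate.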

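[Figure 2.26: irregular, right-skewed distribution with multiple peaks; the horizontal axis is marked at z = -3, -2, the mean, 2 and 3]

Figure 2.26 Chebyshev's Rule (applicable to any distribution)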

When a distribution is approximately symmetrical and hill-shaped (see Figure 2.14), the empirical rule is much more informative compared to Chebyshev's rule. It states that for every roughly symmetrical hill-shaped distribution, approximately 68% of all observations fall within the z-score range -1 and 1. Between z-scores -2 and 2, approximately 95% of all observations are located, and approximately all observations (99.7%) lie within -3 and 3 (see Figure 2.27). Note that the word 'approximately' is used in the empirical rule, because it is a derivative of the exact rule, which states that 68.27% of all observations are between z-scores -1 and 1, that 95.45% is located between -2 and 2, and that 99.73% is located between -3 and 3. The exact rule is only valid for the normal distribution, which is symmetrical and hill-shaped and can be described with a relatively simple formula (on our website an SPSS (syntax) file is available to calculate percentages for any z-score in a normal distribution). We will return to this distribution in chapter 3 because it plays a critical role in inferential statistics.
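The exact percentages for the normal distribution can be reproduced without special software. The sketch below relies on the fact that the share of a normal distribution between -z and z equals erf(z / √2), where `math.erf` is the error function from Python's standard library; the function name `normal_share_within` is our own.

```python
import math

def normal_share_within(z):
    """Proportion of a normal distribution between z-scores -z and z."""
    return math.erf(z / math.sqrt(2))

exact = {z: round(100 * normal_share_within(z), 2) for z in (1, 2, 3)}
print(exact)
```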

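[Figure 2.27: symmetrical, hill-shaped distribution with the horizontal axis marked at z = -3, -2, -1, the mean, 1, 2 and 3; approximately 99.7% of all observations lie between -3 and 3]

Figure 2.27 Empirical Rule (suitable for symmetric and hill-shaped distributions)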

A summary of both rules is shown below:

Within z-scores    Any Distribution                    Symmetric and Hill-Shaped Distribution
                   (Chebyshev's Rule)                  (Empirical Rule)
-1 and +1          -                                   approximately 68%
-2 and +2          at least 75% of all observations    approximately 95%
-3 and +3          at least 88.9%                      approximately 99.7%

2.4 STATISTICAL RELATIONS BETWEEN TWO VARIABLES

Up to this point, all our descriptive statistics relate to one single variable, and are thereby called univariate descriptions. With bivariate statistics, the statistical relationship between two variables is described. Instead of "relationship", other words like "association", "interdependence", or "correlation" are used to denote that two variables are statistically related. Two variables are positively related when low scores on a first variable coincide with low scores on a second variable and high values on the first go together with high scores on the second variable. When low values on one variable coincide with high values on the other variable and vice versa, the relationship is said to be statistically negative. A bivariate statistical relationship can be shown either graphically (using plots) or numerically. Numerical statistical relationships are not just used in descriptive statistics, but are also used for inferential statistics (i.e., generalizing to a larger population). Inferential statistical measures are described in chapter 3.

2.4.1 GRAPHICAL DESCRIPTION OF A BIVARIATE RELATION

Box plot

Box plots have already been described in section 2.3.2 to detect extremely low and extremely high scores. Box plots, however, can also be used to describe the distribution of a dependent variable (indicated as y variable and found on the y-axis of the plot) for each category of an independent variable (indicated as x variable and found on the x-axis). Figure 2.28 shows an example in which the distribution of educational attainment (see Table 2.6 for details) is compared between three cohorts: respondents born in 1935-1950, 1951-1970, and 1971-1980.

[Figure 2.28: box plots of the highest completed educational level (vertical axis, levels 1 to 8) for the cohorts 1935-1950, 1951-1970 and 1971-1980 (horizontal axis), with Q1, the median (Q2), Q3 and the IQR marked]

Figure 2.28 Box Plot for Highest Educational Level and Cohort

Figure 2.28 illustrates that the median for people from the oldest cohort equals 3, the median for the middle cohort equals 4, and the median for the youngest cohort equals 5. This means that half of the people belonging to the oldest cohort (i.e., people born between 1935 and 1950) completed education at the Lower Secondary School or lower. In the middle cohort, half of the people completed Secondary Vocational School level or lower, while in the youngest cohort, half of the respondents have O levels or lower as their highest level of educational attainment. Furthermore, Figure 2.28 shows that the first quartile (Q1) becomes increasingly higher, and that the third quartile for the oldest cohort is clearly lower than it is for both the other cohorts. The interquartile ranges (IQR) for the first two cohorts are equal but larger than the IQR in the youngest cohort. The shifting median and decreasing IQR characterize the process of educational expansion that took place in the modern Western world. In this process more and more (young) people tend to attain higher educational levels. As a consequence, the share of lower educated people in the population decreases while the shares of middle and higher educated people increase over time. The box plot in Figure 2.28 indicates that the educational expansion somewhat loses momentum, partly [...] the proportion of respondents that [...] strongly, as indicated by the stability of Q3. Note that the range (maximum - minimum) remains equal: in each cohort there are still respondents with elementary schooling only, and there are respondents who graduated from university.
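The quantities that a box plot displays can also be computed directly. The sketch below is only an illustration: the three lists of education scores (scale 1 to 8) are invented and are not the survey data behind Figure 2.28, and the helper name `box_summary` is hypothetical. The lists were chosen so that they show the same pattern as described above (rising median and Q1, a smaller IQR in the youngest cohort, an equal range).

```python
import statistics

def box_summary(scores):
    """Return the quantities a box plot displays for one group."""
    # method="inclusive" treats the list as the complete group of scores
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,                      # interquartile range
        "range": max(scores) - min(scores),  # maximum - minimum
    }

# Invented scores per cohort (hypothetical, for illustration only)
cohorts = {
    "1935-1950": [1, 1, 2, 3, 3, 3, 5, 6, 8],
    "1951-1970": [1, 2, 3, 4, 4, 5, 6, 7, 8],
    "1971-1980": [1, 3, 4, 5, 5, 5, 6, 7, 8],
}
summaries = {name: box_summary(scores) for name, scores in cohorts.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Note that different software packages use slightly different rules to interpolate quartiles, so Q1 and Q3 may differ a little between programs for the same data.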

Scatter plot

Scatter plots can be used for the bivariate description of the relationship between two variables that either are interval or ratio. They have a horizontal (x-)axis and a vertical (y-)axis. If a causal relationship is presumed between the two variables, then the dependent variable is positioned on the y-axis. For example, if one wants to describe the relationship between the variables body weight and body height, the causal relationship is undisputable; the taller the respondent, the more he/she weighs. In this example it is hard to assume reverse causality; a body weight increase does not increase body height. This means that body weight is the dependent variable (y variable) and body height is the independent variable (x variable). By convention, the dependent variable is placed on the y-axis and the independent x-variable is placed on the x-axis of the scatter plot, as shown in Figure 2.29.


[Figure 2.29: scatter plot of weight (y-axis, 40 to 125) against height (x-axis: Height (Measured in Centimeters), 150 to 205)]

Figure 2.29 Scatter Plot for the Relationship between Height and Weight

Line graph

The relationship between the variables body height and body weight is clear to see in the scatter plot: taller people are indeed heavier. However, this example is relatively clear-cut and the relationship is quite strong. Generally, such strong relationships are rare in the social sciences, rendering scatter plots difficult to read and interpret. For example, the relationship between the number of hours someone watches television and age is shown in Figure 2.30. In communication studies, it is hypothesized that older people tend to watch television more than younger people, but this is not obvious in the scatter plot. This is because the relationship between age and television watching is relatively weak and also because each observation is depicted separately in a scatter plot. When the average hours of watching television are shown for each age category, a much clearer picture arises. Such a plot is called a line graph, which is typically better suited for gauging statistical relationships than scatter plots are. For example, Figure 2.31 shows that older people indeed tend to watch more television than younger people do. Especially after age 55 the average time spent watching TV increases sharply. However, it is impossible to conclude this from Figure 2.30 even though both plots use the exact same sample of observations!
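The step from a scatter plot to a line graph amounts to averaging the y variable within each category of the x variable. A minimal sketch of that step follows; the observations are invented, and the function name, the starting age of 18, and the category width of 5 years are assumptions made for the illustration (ages below the starting age are not handled).

```python
from collections import defaultdict
from statistics import mean

def mean_hours_by_age_category(observations, start=18, width=5):
    """Average the hours within each age category of `width` years."""
    groups = defaultdict(list)
    for age, hours in observations:
        # Lower bound of the category this age falls into (18, 23, 28, ...)
        lower_bound = start + ((age - start) // width) * width
        groups[lower_bound].append(hours)
    return {lower: mean(values) for lower, values in sorted(groups.items())}

# Invented (age, hours of TV) pairs, for illustration only
observations = [(18, 2.0), (20, 4.0), (23, 1.0), (27, 3.0), (60, 5.0), (61, 6.0)]
line_points = mean_hours_by_age_category(observations)
print(line_points)
```

Each key in the result is the lower bound of an age category and each value is one point of the line graph; the individual observations that make the scatter plot hard to read are no longer visible.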

[Figure 2.30: scatter plot of hours spent watching TV (y-axis, 0 to 7) against age (x-axis: Age (Measured in Years))]

Figure 2.30 Scatter Plot for the Relationship between Age and Watching TV

[Figure 2.31: line graph of the average hours of watching TV per age category (y-axis) against age (x-axis: Age (Measured in Years), categories 18 to 68 in steps of 5)]

Figure 2.31 Line Graph for the Relationship between Age and Watching TV


2.5 Summary

We summarize this chapter's content schematically in Tables 2.32 and 2.33. For any given measurement level, one or more suitable graphical and numerical descriptive tools are presented. For bivariate relationships, Table 2.33 reports graphical descriptions only. Because numerical descriptions of bivariate relationships are often generalized to a population, we will discuss these in the next chapter on inferential statistics.

Table 2.32 Descriptive Statistics for a Single Variable (univariate)

Measurement level | Graphical description | Numerical description: Center | Numerical description: Variability
Nominal | Bar Chart; Pie Chart | Mode | Frequency Table (when number of categories is small)
Ordinal | Bar Chart; Box Plot | Mode*; Median | Frequency Table (when number of categories is small); Range; Interquartile Range
Interval/Ratio | Box Plot; Histogram; Stem-and-Leaf Plot (when observations are limited) | Mode*; Median*; Mean | Frequency Table* (when number of categories is small); Range*; IQR*; Variance; Standard Deviation

* This description is generally only used if other measures fall short, for instance due to extreme skewness or to (extreme) outliers.

Table 2.33 Descriptive Statistics (graphical) for Two Variables (bivariate)

Dependent variable (y) | Independent variable (x): Nominal | Ordinal | Interval/Ratio
Nominal | None | None | None
Ordinal | Box Plot (when categories are limited) | Box Plot
Interval/Ratio | Scatter Plot | Line Graph

INFERENTIAL STATISTICS

3.1 INTRODUCTION TO STATISTICAL INFERENCE

The previous chapter addressed descriptive statistics, that is, the graphical and numerical description of quantitative data. Inferential statistics goes one important step further: based on data from a random sample, generalizations are made about the population from which the sample is drawn (see Figure 3.1). For instance, from calculating the mean age of individuals in a random sample, generalizations can be made about the mean age in the population.

[Figure 3.1: diagram of a sample drawn from a population, with outcomes generalized from the sample back to the population]

Figure 3.1 Generalizing Outcomes from a Sample to a Population

A statement like 'More than half of all respondents are over 4[?] years old' is a descriptive statistic; a particular characteristic of the data set is simply described without further generalization. On the contrary, a statement like 'Based on a random sample from 2007, 25 to 30 percent of the Dutch people smoke' results from inferential statistics. To correctly use these generalizing statements, some theoretical knowledge of statistical inference is required. This theory will be exemplified below using data from a census (i.e., a procedure to collect data from the entire population).

Until 1971 in the Netherlands, it was customary to conduct censuses where each and every inhabitant had several personal variables about them collected, such as Sex, Age, and Marital Status. In the 1899 census, the mean age for all 5.1 million Dutch was 27.1 years and the standard deviation of the age distribution turned out to be 20.6 years. Such characteristics of the population, called parameters, are generally indicated using Greek letters. The mean in a population is indicated using μ (pronunciation: mu (as in music)) and the standard deviation is indicated using σ (sigma). Figure 3.2 shows the age distribution from 1899, and the population parameters μ and σ.

[Figure 3.2: frequency distribution of age; x-axis: Age (0 to 100); y-axis: Frequency (0 to 140,000); annotated with Mean Age (μ) = 27.1 years and Standard Deviation (σ) = 20.6 years]

Figure 3.2 Age Distribution (Age 0 - 101) in the Netherlands in 1899 (source: CBS, http://statline.cbs.nl/StatWeb/dome/?LA=EN, theme: population)

Central Limit Theorem

Nowadays, due to high costs and strict privacy legislation, it is almost impossible to hold a classical census in the Netherlands. However, it is still possible to gain knowledge about the entire population. In statistics it is not required to know, for example, the age of each and every individual in a population to tell the mean age for that population. Instead, a relatively small sample will provide a very good approximation of this population parameter.

To illustrate that a small random sample can indeed achieve this, a thought experiment is described below. Suppose that in 1899 a simple random sample of 1,000 respondents was drawn from the population of 5.1 million Dutch people. The question of interest is: what is the mean age for all people in that sample? Given the mean age in the population from 1899 (i.e., 27.1 years), it is highly improbable that this would have been below 10 years. Such an improbable sample would have consisted of predominantly young kids. This in turn would imply that in this sample of 1,000 respondents randomly drawn from the population of 5.1 million people, hardly any adults were selected. This is quite unlikely because there was about a fifty-fifty chance that a Dutch person younger than 21 was randomly selected from the 1899 population (in 1899, 2.35 million out of 5.1 million Dutch people were younger than 21 years). The probability that no person of at least 21 years of age was selected after five consecutive draws equals 2.35/5.1 * 2.35/5.1 * 2.35/5.1 * 2.35/5.1 * 2.35/5.1 = .02, which is a chance of only two percent! Given the age distribution in the population, it is most likely that quite a number of adults will be represented in the sample. Although the mean age in the sample will generally not be exactly equal to the mean age in the population, it is highly unlikely that the mean age will be much lower or higher. To determine which sample means (notation: x̄) may result from a random sample of 1,000 Dutch people, the thought experiment is extended further. This time, assuming time and money are infinite, we draw 100,000 random samples from the 1899 population, each consisting of 1,000 respondents. Next, the mean age in each of these samples is calculated, hence resulting in 100,000 means. Using statistical software such as SPSS, this thought experiment is easy to perform given Figure 3.2, and the results are presented in Figure 3.3 (see endnote 5).
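The five-draw probability in the thought experiment above is a simple product and can be checked in a few lines. The sketch assumes, as the calculation in the text does, that the draws are independent (the share of people under 21 hardly changes when a handful of people are taken out of 5.1 million); the function name is hypothetical.

```python
# Share of the 1899 population younger than 21: 2.35 of 5.1 million people
SHARE_UNDER_21 = 2.35 / 5.1

def prob_only_under_21(draws, share=SHARE_UNDER_21):
    """Probability that every one of `draws` independent draws is under 21."""
    return share ** draws

p_five = prob_only_under_21(5)
print(round(p_five, 2))  # about two percent
```

Every additional draw multiplies the probability by roughly .46, so for a sample of 1,000 respondents the probability of selecting no adults at all is practically zero.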

[Figure 3.3: histogram of the 100,000 sample means; x-axis: mean age in sample (25 to 30); y-axis: frequency (0 to 6,000); annotated with E(x̄) ≈ 27.1 and σ_x̄ ≈ .65]

Figure 3.3 Sampling Distribution for Mean Age (100,000 Samples, 1,000 Individuals per Sample)

Figure 3.3 shows the distribution of the 100,000 means for age, resulting from 100,000 random samples. This distribution is called a sampling distribution. Interestingly, the overall mean of all 100,000 sample means is almost identical to the real mean age in the population in 1899: 27.1 years! This is no coincidence; mathematically, the overall mean of all possible sample means (indicated with E(x̄)) equals the mean in the population (μ) exactly. Therefore, in statistics it is said that the sample mean is an unbiased estimator of the population mean. Furthermore, an interesting relationship exists between the original standard deviation (σ = 20.6, see Figure 3.2) and the standard deviation of the sampling distribution of means (σ_x̄). This standard deviation σ_x̄ appears to equal σ / √n. Thus, in our example the standard deviation of all sample means is .65 (calculation: 20.6 / √1,000) (see endnote 6).
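The thought experiment can be imitated on a small scale. The sketch below is not the SPSS procedure used for Figure 3.3 and does not use the 1899 census data: it builds an invented, right-skewed "age" population, draws 2,000 samples of 1,000 observations (with replacement, which is an assumption that hardly matters for a large population), and compares the spread of the sample means with σ / √n. All names are hypothetical.

```python
import math
import random
import statistics

def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]

# Invented, right-skewed "age" population (NOT the 1899 census data)
pop_rng = random.Random(1899)
population = [min(pop_rng.expovariate(1 / 27.0), 101.0) for _ in range(50_000)]
mu = statistics.fmean(population)       # population mean
sigma = statistics.pstdev(population)   # population standard deviation

sample_size = 1_000
means = simulate_sample_means(population, sample_size, n_samples=2_000)

mean_of_means = statistics.fmean(means)        # should be close to mu
observed_se = statistics.pstdev(means)         # spread of the sample means
expected_se = sigma / math.sqrt(sample_size)   # sigma / sqrt(n)

# The numbers from the text: 20.6 / sqrt(1,000)
book_se = 20.6 / math.sqrt(1_000)
print(round(mean_of_means, 2), round(mu, 2))
print(round(observed_se, 3), round(expected_se, 3), round(book_se, 2))
```

With more samples the two comparisons become tighter, which is the reason the text uses 100,000 samples in its own illustration.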

We would like to stress that the standard deviation (σ) of 20.6 years is roughly the average distance of an individual's age to the overall mean (27.1). The standard deviation (σ_x̄) of .65 informs us about the average distance of a random sample mean to the overall mean (again, 27.1). To avoid confusion, the standard deviation related to the sampling distribution (σ_x̄) is not called a standard deviation but a standard error.

Note that the standard error (σ_x̄) is much smaller compared to the standard deviation (σ). This is due to the replacement of all 5.1 million observations (see Figure 3.2) by the mean age of 1,000 people from a random sample (see Figure 3.3). An individual's age in 1899 varied between 0 and 101 years, and this variability resulted in a relatively large standard deviation of 20.6 years. Due to the design of a simple random sample of 1,000 individuals from a population of 5.1 million, the probability of extreme sample means like 0 and 101 is very small. As such, the possible sample means are located more closely to the population mean compared with the individual scores, which results in a relatively small standard error (σ_x̄) compared to the standard deviation (σ).

The most striking feature is the shape of the sampling distribution (see Figure 3.3). This closely resembles the symmetrical and hill-shaped distribution shown in Figure 2.27. More precisely, the sampling distribution resembles the normal distribution (see endnote 7)! This is quite remarkable because the age distribution itself was not normally distributed at all (see Figure 3.2). Generally, a sampling distribution tends to resemble the normal distribution irrespective of the shape of the original distribution from which the random samples are drawn. This is known as the central limit theorem, which generally applies to random samples consisting of 30 or more observations. The larger the number of observations in a sample, the more the sampling distribution resembles the normal distribution. With a sample containing 15 to 29 observations the sampling distribution is only approximately normally distributed if the distribution of the original variable is symmetrical (an almost equal number of observations to the left and to the right of the mean). With even smaller sample sizes (2 - 14), the original variable should resemble a normal distribution to generate a (close to) normal sampling distribution (see endnote 8).

Confidence Intervals

When the sampling distribution of the mean is approximately normally distributed (see Figure 3.3), the position of extremely high and low means can be easily calculated. For example, 95% of all possible sample means are located at a maximum distance of 2 (more precisely: 1.96) standard errors to the left and to the right of the population mean (μ). The value 2 is a z-score (see section 2.3.3), although in the case of sampling distributions, the term z-value is more frequently used. In the sampling distribution of the mean age in 1899, 95% of all sample means are located between 27.1 ± 2 * .65 = 25.8 and 28.4. Because a normal distribution is symmetrical, 2.5% of all sample means lie below 25.8 whereas 2.5% are located above 28.4 (see the grey areas in Figure 3.4). In other words, of each 1,000 samples, approximately 25 samples will have a mean age lower than 25.8 and approximately 25 samples will have a mean age higher than 28.4.
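The boundaries mentioned above can be reproduced with the normal distribution that is part of Python's standard library. This is a sketch under the assumptions of the text (a normal sampling distribution with μ = 27.1 and a standard error of .65); the variable names are ours.

```python
from statistics import NormalDist

MU, SE = 27.1, 0.65
sampling_dist = NormalDist(mu=MU, sigma=SE)

# Exact z-value that leaves 2.5% in each tail (about 1.96)
z_exact = NormalDist().inv_cdf(0.975)

# Boundaries with the rounded z-value of 2, as used in the text
rounded_bounds = (MU - 2 * SE, MU + 2 * SE)

# Boundaries with the exact z-value
exact_bounds = (sampling_dist.inv_cdf(0.025), sampling_dist.inv_cdf(0.975))

# Share of sample means below 25.8 (a little under 2.5% because z = 2, not 1.96)
share_below = sampling_dist.cdf(25.8)

print(round(z_exact, 2))
print([round(b, 1) for b in rounded_bounds])
print([round(b, 2) for b in exact_bounds])
print(round(share_below, 4))
```

The small gap between the rounded and the exact boundaries shows why 2 is a convenient rule of thumb for 1.96.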

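[Figure 3.4: normal sampling distribution with μ = 27.1 and σ_x̄ = .65; the areas below 25.8 (z-value = -2) and above 28.4 (z-value = +2) are shaded grey and each contain 2.5% of all sample means]

Figure 3.4 The Percentage of Sample Means outside -2 and outside +2 Standard Errors from μ in a Normal Distribution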

Sadly, our thought experiment is not realistic, as it would cost a fortune to draw 100,000 samples in order to find the exact population parameter for the mean age. Fortunately, one simple random sample suffices because scientists generally are not interested in the exact values of population parameters, but settle for (very) good approximations instead. Surprisingly, only one relatively small random sample suffices to achieve this!

Imagine that from the 100,000 samples shown in Figure 3.3, just one simple random sample is drawn from the population: what is the expected mean age in this sample? Means below 25.8 and above 28.4 are hardly to be expected; Figure 3.4 demonstrated that the chance of this is only 5%. This means that the chance of finding a mean (x̄) between 25.8 and 28.4 (27.1 ± 2 * .65) is very large: 95% (100 - 5).

Clearly, the distance of the population mean (μ) to a certain sample mean (x̄) is equal to the distance of that specific sample mean to the population mean. Therefore, it is also correct to state that there is a 95% chance that a sample will be drawn in which the population mean (μ) is located in the interval x̄ ± 2 * σ_x̄. In Figure 3.5, we calculated such intervals - called confidence intervals (or CI) - from three samples. In the first sample, the sample mean age is 25.8 years. The confidence interval then equals 25.8 ± 2 * 0.65 = (24.5; 27.1). The mean age in the population (27.1) lies just within this interval. The same can be said for the interval associated with the second sample - the mean age of which is 28.4 years: 28.4 ± 2 * 0.65 = (27.1; 29.7). This means that every sample where the sample mean age is between 25.8 and 28.4 (the grey area in Figure 3.5) has a confidence interval including the population mean of 27.1! Together, these samples constitute 95% of all possible samples. The remaining 5% will have a 95% confidence interval excluding the population mean of 27.1. For example, the third sample (x̄ = 29.1) belongs to these 5% as the confidence interval is 29.1 ± 2 * 0.65 = (27.8; 30.4).

The crucial conclusion from Figure 3.5 is that with almost 100% certainty (95% to be precise), we will draw a sample in which the population mean (μ) is located somewhere in the interval x̄ ± 2 * σ_x̄. This means that of every 100 samples, an average of 95 samples holds a 95% confidence interval that includes the population mean. In other words, there is a rather slim chance (i.e., 5%) that we will draw a sample that does not include the population mean in its 95% CI. Of course, one could choose a very large confidence interval. For example, we could easily state that we are 100% confident that the mean age in the population - which is normally unknown of course - in 1899 was somewhere between 0 and 101 years. However, although we are 100% confident that this is true, it does not provide useful information. We could have said exactly the same thing without drawing a sample, and we could have done so comfortably without any knowledge of statistics.
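The claim that about 95 of every 100 samples yield a 95% confidence interval containing μ can be checked by simulation. The sketch below assumes a normal sampling distribution with μ = 27.1 and a standard error of .65, as in the text, and draws sample means directly from that distribution instead of drawing full samples; the function names are hypothetical.

```python
import random

def confidence_interval(sample_mean, standard_error, z=2.0):
    """Interval of z standard errors on either side of the sample mean."""
    margin = z * standard_error
    return (sample_mean - margin, sample_mean + margin)

def coverage(mu, standard_error, z=1.96, n_samples=10_000, seed=0):
    """Share of simulated samples whose interval contains mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # One sample mean drawn from the (assumed normal) sampling distribution
        sample_mean = rng.gauss(mu, standard_error)
        low, high = confidence_interval(sample_mean, standard_error, z)
        hits += low <= mu <= high
    return hits / n_samples

# The third sample from the text: its interval excludes 27.1
third = confidence_interval(29.1, 0.65)
print([round(bound, 1) for bound in third])

share = coverage(27.1, 0.65)
print(share)  # close to 0.95
```

In practice only one sample is available, so it is never known whether that particular interval belongs to the 95% that contain μ or to the 5% that do not.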

[Figure 3.5: sampling distribution (σ_x̄ = 0.65, sample size = 1,000) with the 95% confidence intervals around the sample means 25.8, 28.4, and 29.1; legend: all samples (95% of total) where μ is within the 95% CI; a sample (from 5% of total) where μ is not within the 95% CI; sample means]

Figure 3.5 The 95% Confidence Intervals (CI) and the Population Parameter (Mean Age (μ) = 27.1)

However, it is also not desirable to have relatively low confidence levels instead. Imagine, for example, a random sample in which respondents on average are 27.7 years old. According to Figure 3.4, this sample mean is quite plausible. Statistical theory dictates that the borderlines (called confidence limits) of the 40% confidence interval are located at about .5 standard errors from the sample mean. This means that we are 40% confident that the population mean is located somewhere between 27.4 and 28.0 (calculation: 27.7 ± 0.5 * .65). This statistical statement is quite interesting for it narrows the interval, but this time it is rather questionable whether this narrow interval 'captures' the (unknown) population mean. Recall that with a 40% confidence interval, the population mean will be within the interval in approximately 40 out of 100 samples. Note that the sample with a mean age of 27.7 does not belong to these 40 samples (40%-CI = (27.4; 28.0), while μ = 27.1). In general, one wants to arrive at rather narrow confidence intervals without losing too much certainty. This results in frequently used levels of confidence of 90 to 99%. Finally, it is interesting to note that the media typically do not report confidence intervals when the results of samples are shown. A newspaper headline declaring that '[...]' is actually disputable when no confidence interval is reported as well (of course, we assume that election polls are based on random samples, otherwise the headline is far worse than 'disputable').
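The trade-off between confidence level and interval width can be made concrete. The sketch below computes the z-value belonging to a chosen confidence level from the standard normal distribution and applies it to the sample mean of 27.7 and the standard error of .65 used above; the function names are hypothetical and a normal sampling distribution is assumed.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """z-value that encloses `level` (e.g. 0.95) of a normal distribution."""
    return NormalDist().inv_cdf((1 + level) / 2)

def interval(sample_mean, standard_error, level):
    """Confidence interval around a sample mean for a given level."""
    margin = z_for_confidence(level) * standard_error
    return (sample_mean - margin, sample_mean + margin)

for level in (0.40, 0.90, 0.95, 0.99):
    low, high = interval(27.7, 0.65, level)
    print(f"{level:.0%}: z = {z_for_confidence(level):.2f}, "
          f"CI = ({low:.1f}; {high:.1f})")
```

The 40% interval is narrow but misses the population mean of 27.1, whereas the 90% to 99% intervals are wider and do include it.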

Testing Hypotheses

The previous section demonstrated that with just one simple random sample, highly confident statistical probability statements can be made about an unknown population parameter. Similarly, it is also possible to test assumed values of a certain population parameter. In social sciences, as a point of reference, it is frequently first assumed that parameters equal 0. This is called the null hypothesis (notation: H0), which always contradicts or opposes the researcher's theoretical expectation. For example, when the life expectancy in Europe is hypothesized to have risen over the years, H0 states that this demographical process did not take place (i.e., the rise equals 0). The null hypothesis does not necessarily have to be taken literally; one could hypothesize just as well that the life expectancy rose by more than 1 year. In this case, H0 states that the life expectancy did not rise by more than 1 year. The counterpart of the null hypothesis is called the research hypothesis or the alternative hypothesis (notation: Ha) and is frequently derived from new scientific insights or from new (or old) theories. Quite often Ha is directional, which means that a population parameter is said to be either larger or smaller than the value expressed in H0. In non-directional research hypotheses, the population parameter is said to deviate from some value. In science it is standard to rigidly test the alternative hypothesis, requiring very convincing evidence before Ha is accepted and H0 rejected. This means that the statistical results have to render the null hypothesis highly implausible before rejecting it. To determine how implausible H0 is, the confidence interval (CI) can be used. When the confidence interval does not include the population value stated in the null hypothesis, one can safely say that H0 is probably wrong.
To build a strong case against H0, a large confidence interval should be taken (typically between 90% and 99%).

Although using a CI to test hypotheses is entirely appropriate, it is not often used this way. The most popular test strategy uses the level of significance (notation: α). We would like to stress, however, that both methods of testing are identical and lead to the same conclusion. Common levels of significance used are 10%, 5%, and 1%, and constitute rejection areas - the null hypothesis is rejected when the sample result falls into the rejection area. The rationale is that it is then highly improbable that the population parameter is equal to the value hypothesized in the null hypothesis. This probability (also known as 'p-value' or just 'p') can be calculated with any statistical software package. This p-value can be either one-tailed or two-tailed. The one-tailed p-value is the probability that the sample result, or an even more extreme sample result, is found while H0 is assumed to be true. The one-tailed p-value is always used to test directional alternative hypotheses. The two-tailed p-value is twice the size of the one-tailed p-value and is used when Ha is non-directional. Once the p-value is calculated it can be compared to the level of significance:

• When the one-tailed p-value is less than or equal to the level of significance, the null hypothesis is rejected and the directional alternative hypothesis is accepted.

• When the two-tailed p-value is less than or equal to the level of significance, the null hypothesis is rejected and the non-directional alternative hypothesis is accepted.

This can also be summarized in symbols:

When p(one-tailed) ≤ α → H0 rejected, Ha (directional) accepted

When p(two-tailed) ≤ α → H0 rejected, Ha (non-directional) accepted
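The two decision rules can be written as one small function. This is a sketch in our own notation, not the output of any statistical package: the function name and its arguments are hypothetical, and the two-tailed p-value is taken to be twice the one-tailed p-value, as stated above.

```python
def decide(p_one_tailed, alpha=0.05, directional=True):
    """Apply the decision rule for a directional or non-directional Ha."""
    # For a non-directional Ha the two-tailed p-value is used (capped at 1)
    p = p_one_tailed if directional else min(2 * p_one_tailed, 1.0)
    return "reject H0" if p <= alpha else "do not reject H0"

print(decide(0.0228))                      # directional Ha, alpha = .05
print(decide(0.03, directional=False))     # two-tailed p = .06
print(decide(0.0228, alpha=0.01))          # stricter level of significance
```

The same sample result can thus lead to different decisions depending on the level of significance and on whether Ha is directional, which is why both have to be chosen before the test is carried out.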

Most statistical software packages, like SPSS, present p-values expressed as proportions (range 0-1) instead of percentages (0-100%). Therefore, the levels of significance will be presented proportionally for the remainder of this book; for example .10 instead of 10%.

To illustrate a hypothesis test using p-values, we return to the mean age in the year 1899. Between 1899 and 1930, health care and work conditions improved greatly. Thus, our directional alternative hypothesis (Ha) is that due to these improvements, life expectancy rose, as did the mean age in the Netherlands during this period. Conversely, our null hypothesis (H0) is that the mean age in the Netherlands did not rise between 1899 and 1930. In other words, according to the null hypothesis, the mean age in the Netherlands was still 27.1 years in 1930.

Suppose that in 1930 a random sample consisting of 1,000 individuals was drawn from the population. Because this sample is large enough, the central limit theorem applies and the resulting sampling distribution will thus be (approximately) normally distributed.

The situation described in H0 is assumed to be true until conclusively falsified. Therefore, the mean of the sampling distribution is assumed to equal 27.1 years (the population mean in 1899). Further, it is assumed that the standard error of this sampling distribution is .65, which is based on the standard deviation in the 1899 census and the size of the 1930 sample (calculation: 20.6/√1,000). In section 3.2.1, we will show how to test a hypothesis when the standard error of the population mean is also unknown. The null hypothesis assumes that any random sample (including ours) of 1,000 individuals is part of the sampling distribution with μ = 27.1 and σ_x̄ = .65. Now assume that in the sample the mean age is 28.4 years. This clearly exceeds 27.1, which favors our directional hypothesis that the mean age rose between 1899 and 1930. The question remains, however, whether the difference between 27.1 and 28.4 is large enough to reject H0, because it is possible that the sample mean of 28.4 is contained in the sampling distribution with μ = 27.1 and σ_x̄ = .65. In deciding whether or not to reject H0, we must know this probability (p).

First we have to calculate the number of standard errors that lie between 28.4 and 27.1. Assuming a normal sampling distribution, the relative share of all sample means that are equal to or exceed 28.4 can be calculated using a z-value. The z-value appears to be 2 (calculation: (28.4 - 27.1) / .65 (see endnote 4)). According to the empirical rule (see page 44), the one-tailed p-value is about 2.5% or .025. Using the formula for the normal distribution (see endnote 7), the exact p equals .0228. This result is compared to the level of significance (α) selected by the researcher before the hypothesis test. With α = .05, H0 is rejected and the directional alternative hypothesis (Ha) stating that the mean age in 1930 is higher than in 1899 is accepted because p is smaller than α (.0228 < .05), see Figure 3.6. Conventionally, the mean age found in the sample (28.4) is said to be significantly larger than 27.1.

[Figure 3.6: normally distributed sampling distribution centered at 27.1, with the sample mean of 28.4 lying 2 * σ_x̄ = 1.3 above it; shaded areas mark the rejection area (α = .05) and the one-tailed probability (p = .0228)]

Figure 3.6 Testing a Hypothesis [...]
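The calculation behind this test takes only a few lines. The sketch below assumes, as the text does, a normal sampling distribution with a known standard error of .65 and a directional Ha stating that the mean rose (so only the upper tail counts); the function name is hypothetical.

```python
from statistics import NormalDist

def one_tailed_z_test(sample_mean, null_mean, standard_error, alpha=0.05):
    """Upper-tailed z-test: Ha states that the mean is larger than null_mean."""
    z = (sample_mean - null_mean) / standard_error
    # Probability of this z-value or a larger one when H0 is true
    p = 1 - NormalDist().cdf(z)
    return z, p, p <= alpha

z, p, reject = one_tailed_z_test(28.4, 27.1, 0.65)
print(round(z, 2), round(p, 4), reject)
```

With α = .01 instead of .05 the same sample mean would not lead to rejecting H0, because .0228 is larger than .01.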

The level of significance (α) also indicates the probability that H0 is rejected given that H0 is true. This incorrect decision is called a type I error. In science it is generally agreed that this type of error should be kept to a minimum. Hence, the level of significance rarely exceeds .10. In our example we used α = .05, which is the conventional standard. This means that before the test is conducted, on average 5 out of 100 times we will incorrectly conclude that the mean age did increase while the mean age in the population actually did not increase. Alternatively, when the alternative hypothesis is true (i.e., the population's mean age increased) while we do NOT reject the null hypothesis, then this is called a type II error.
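That α equals the probability of a type I error can be illustrated by simulating many samples from a population in which H0 is true. The sketch below again draws sample means directly from an assumed normal sampling distribution (μ = 27.1, standard error .65) and counts how often the one-tailed test rejects H0; the function name and its defaults are hypothetical.

```python
import random
from statistics import NormalDist

def type_one_error_rate(null_mean=27.1, standard_error=0.65, alpha=0.05,
                        n_samples=20_000, seed=0):
    """Share of samples that reject H0 although H0 is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(n_samples):
        # H0 is true here: the sample mean comes from the null distribution
        sample_mean = rng.gauss(null_mean, standard_error)
        z = (sample_mean - null_mean) / standard_error
        p_one_tailed = 1 - std_normal.cdf(z)
        rejections += p_one_tailed <= alpha
    return rejections / n_samples

rate = type_one_error_rate()
print(rate)  # close to alpha = 0.05
```

Lowering α lowers this rate, but at the price of rejecting H0 less often when Ha is in fact true (a type II error).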

In our example, there is a unique opportunity to check whether a type I error was made. From the 1899 and 1930 censuses we know that the mean age increased from 27.1 to 28.6. So, in the target population the alternative hypothesis is true, while the null hypothesis was rejected using the outcome from a random sample. Therefore, no type I error was made, that is, H0 was correctly rejected. However, researchers should proceed with caution, as checks for these errors are typically not possible because the true population parameters are generally unknown (we did in this instance because the census contains the entire population from which we drew the sample). We elaborate upon both types of error in section [...].

Finally, four important points regarding hypothesis testing need to be considered. First, before testing directional hypotheses it must be checked whether the sample result (also called a sample estimate) indeed differs in the correct direction from the hypothesized population parameter in H0. If this is not the case, the sample result never lies within the (one-tailed) rejection area, which automatically results in not rejecting H0!

Second, in statistical software packages like SPSS, two-tailed p-values are typically presented. When testing non-directional hypotheses, this p-value can be compared directly to α. However, directional hypotheses are typically tested, in which case the two-tailed p-value must be divided by two.
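The first two points can be combined in one small helper. This is a sketch under the assumption of a symmetrical test statistic (z or t); the function name and its arguments are hypothetical and do not correspond to any SPSS output field.

```python
def one_tailed_p(p_two_tailed, estimate, h0_value, expect_higher):
    """Convert a two-tailed p-value into a one-tailed p-value.

    expect_higher=True means that Ha states the parameter exceeds h0_value;
    expect_higher=False means that Ha states it is lower.
    """
    if expect_higher:
        in_expected_direction = estimate > h0_value
    else:
        in_expected_direction = estimate < h0_value

    if in_expected_direction:
        # Sample result differs in the hypothesized direction: halve the p-value
        return p_two_tailed / 2
    # Sample result points the other way: the one-tailed p is at least .5,
    # so H0 can never be rejected
    return 1 - p_two_tailed / 2


# Mean age example: a two-tailed p of .0456 and a sample mean above 27.1
p_expected = one_tailed_p(0.0456, estimate=28.4, h0_value=27.1, expect_higher=True)
# Same two-tailed p, but a (hypothetical) sample mean on the wrong side of 27.1
p_wrong_way = one_tailed_p(0.0456, estimate=25.8, h0_value=27.1, expect_higher=True)
print(p_expected, p_wrong_way)
```

The second call shows why the direction check comes first: simply halving the p-value would wrongly suggest a significant result.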

Third, when the null hypothesis is not rejected (because p > α) this does not mean that H0 is accepted to be true, for it is very difficult to sufficiently prove that a population parameter is exactly 0 (or any other value). Among other things, this is related to the level of significance. Suppose a researcher decides to use a very small α (e.g., .0001); consequently, the test of H0 is so strict that H0 will not be rejected in most instances. This of course does not imply that a low α causes H0 to be true!

Finally, accepting Ha does not mean that Ha is true. Among other things, this is due to the fact that a sample is used instead of the entire population. However, it can be convincingly stated that Ha is much more likely than H0, bearing in mind that there is still a (small) risk of committing a type I error.


3.2 ONE-SAMPLE TESTS FOR MEAN AND PROPORTION

In social sciences, there are two commonly used statistical tests for parameters of a single variable: a test for a population mean and a test for a population proportion. The former was used in the previous section to test whether there is conclusive evidence to reject the null hypothesis regarding the assumed mean age in the population. The latter is used to test whether a sample proportion (or fraction) differs from a hypothesized population proportion, as stated in the null hypothesis.

3.2.1 Test for a mean

Section 3.1 discussed several important principles of inferential statistics. To avoid unnecessary complexity, it was assumed that the population standard deviation (σ) was known. Because a census was used in the example, this assumption was not problematic at all. However, in most situations σ will be unknown. Fortunately, it can be demonstrated that the sample standard deviation (s) closely resembles σ. Recall that dividing the standard deviation (σ) by the square root of the number of observations in the sample yields the standard error of the mean (formula: σx̄ = σ / √n). Since σ is generally unknown, s offers a good approximation and the formula becomes SEx̄ = s / √n, where SEx̄ denotes the standard error of the mean (SE is short for standard error). The interpretation of SEx̄, however, is equal to that of σx̄. Because σ is replaced by s, which is a sample estimate, additional statistical uncertainty is introduced. Statistician William Gosset (1876-1937), using the pseudonym 'Student', showed that this uncertainty results in a somewhat broader sampling distribution, which is known as Student's distribution, or t-distribution (see Figure 3.7).⁹

[Figure 3.7: a t-distribution and the normal (z-)distribution, both centered on μ]

Figure 3.7 A t-Distribution and the Normal (z-)Distribution

The interpretation of t-values from the t-distribution is equal to that of z-values: both indicate how many standard errors lie between the sample estimate and the hypothesized population mean (μ).

The standard error is a key element in statistical tests as it measures the relative distance of the sample estimate to the population parameter stated in the null hypothesis. Imagine, for instance, that one wants to test whether the average number of children of Dutch couples is lower than 2 - should this directional alternative hypothesis be confirmed, then it is probable that the Dutch population will decrease in the long run. The null hypothesis states that the average number of children equals 2 (= μ0). In the year 2000, a sample of 734 Dutch couples showed that the average number of children was 1.79 (= x̄) with a standard deviation (s) of 1.24. The question is whether this sample mean is a random deviation from the assumed population mean in the null hypothesis, which is 2. Statistically, we have to calculate the probability that a sample mean of 1.79 (or less) is drawn from a sampling distribution with a mean of 2 and a standard error derived from s = 1.24. In absolute terms, the sample mean is located .21 (children) to the left of the assumed population mean (1.79 - 2 = -.21). Given that s = 1.24 and n = 734, SEx̄ equals .0458 (= 1.24 / √734). Therefore, in relative terms, the difference between the sample estimate and the assumed population mean is -4.59 (= -.21 / .0458) standard errors. So, the associated t-value is -4.59 (see Figure 3.8).¹⁰

The final question is whether the relative distance of 4.59 is large enough to reject the null hypothesis and consequently accept the alternative hypothesis. Because the t-distribution is symmetrical and hill-shaped in cases of large samples (see Figure 3.7), the empirical rule applies. Recall that according to this rule approximately 99.7% of all sample means (x̄) lie within -3 and +3 standard errors of μ0. This means that approximately .3% of all sample means are located outside these limits. Because the distribution is symmetrical, approximately .15% of all sample means are to be found more than 3 standard errors to the left of μ0. The sample estimate of 1.79 is located in this area, for the associated t-value lies below -3. It is quite easy to calculate the exact cumulative probability associated with a t-value of -4.59. Given our first approximation using the empirical rule, it is not surprising to find this probability to be very small: .000003 (on our web page we offer easy-to-use SPSS programs to calculate probabilities for any t-value).
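The arithmetic of this t-test can be retraced as follows. This is a sketch using only Python's standard library, which offers no t-distribution; the p-value is therefore a normal approximation, an assumption that is reasonable here because n = 734 is large. The exact t-based value quoted in the text (.000003) is slightly larger than this approximation.

```python
from math import sqrt
from statistics import NormalDist

x_bar = 1.79   # sample mean (number of children)
mu_0 = 2.0     # population mean assumed under H0
s = 1.24       # sample standard deviation
n = 734        # number of couples in the sample

se = s / sqrt(n)                 # standard error of the mean
t_value = (x_bar - mu_0) / se    # relative distance in standard errors

# Lower-tail probability; normal approximation to the t-distribution (large n)
p_approx = NormalDist().cdf(t_value)

print(f"SE = {se:.4f}, t = {t_value:.2f}, approximate one-tailed p = {p_approx:.7f}")
```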

How is it that a sample mean of 1.79 is found while the chance of finding this outcome is extremely low according to H0? There are two possible answers to this question. Firstly, this is pure 'bad luck' - by sheer chance an extreme sample was drawn. Perhaps, due to chance, many families without children were sampled, thereby reducing the mean number of children per family. Secondly, the null hypothesis is incorrect; the true population mean is less than 2. Of course, the second answer is much more likely than the first one. Therefore, the null hypothesis is rejected (at the .01 level of significance (α)) and the alternative hypothesis is accepted (μ < 2).

For illustrative purposes we will take this test one step further. Due to the extreme p-value (.000003), it is almost without doubt that the population mean indeed is less than 2. So, one could also test whether it is less than 1.9, or even less than 1.8. Consequently, a point will be reached at which the null hypothesis will no longer be rejected. At the .05 significance level, this point is approximately located at the mean of 1.87. The t-value at this point is -1.7 (calculation: (1.79 - 1.87) / (1.24 / √734)). This t-value is associated with a one-tailed p-value of .045 and means that the null hypothesis can be rejected. A null hypothesis with an assumed population mean of 1.86 (or less), a sample estimate of 1.79, and a level of significance of .05 will no longer result in rejecting the null hypothesis.
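The search for the point at which H0 is no longer rejected can be sketched as a loop over candidate values of μ0. As before, the p-values are normal approximations (an assumption justified by the large sample), and the helper name lower_tail_test is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s, n, alpha = 1.79, 1.24, 734, 0.05
se = s / sqrt(n)
std_normal = NormalDist()


def lower_tail_test(mu_0):
    """Return (test value, one-tailed p, reject H0?) for Ha: mean < mu_0."""
    value = (x_bar - mu_0) / se
    p = std_normal.cdf(value)       # normal approximation, large n
    return value, p, p < alpha


for mu_0 in (2.0, 1.9, 1.87, 1.86):
    value, p, reject = lower_tail_test(mu_0)
    print(f"H0 mean {mu_0:.2f}: test value {value:6.2f}, p = {p:.4f}, reject H0: {reject}")

# H0 values above this boundary are rejected at the chosen alpha
boundary = x_bar - std_normal.inv_cdf(alpha) * se
print(f"boundary for H0 mean: {boundary:.3f}")
```

The boundary falls between 1.86 and 1.87, in line with the text: an assumed mean of 1.87 is still rejected, an assumed mean of 1.86 is not.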

[Figure 3.8: sampling distribution (t-distributed) around μ0 = 2, with x̄ = 1.79 marked; one-tailed probability (p) = .000003 (= black area)]

Figure 3.8 A t-Test for a Mean (μ0 = 2, x̄ = 1.79, s = 1.24, and n = 734)

Finally, we would like to emphasize that a mean test using the t-distribution is statistically correct only when the sampling distribution approximates a t-distribution. With relatively large random samples this is generally true. When using sample sizes between 15 and 30 observations, this is only the case when the test variable (i.e., the variable for which the mean is calculated) is approximately symmetrical (= as many observations are located to the left as to the right of the mean). With smaller numbers of observations, the test variable should be approximately normally distributed in the population. Inspecting the histogram of the test variable in the sample indirectly provides information about this. The test variable Number of Children is not symmetrical (at least in the Netherlands): there are relatively many (young) couples without children and relatively few couples with more than three children. In this case, the appropriate statistical test for the mean number of children in the Netherlands must be carried out with random samples consisting of at least 30 couples. In general, such analyses use far larger samples. The advantage is that the standard error is relatively small (calculated by dividing the standard deviation (s) by √n, where n is the sample size). A small standard error is desirable because it lowers the chance that we incorrectly do not reject the null hypothesis (type II error). In medical research this type of error (as well as a type I error) can be very important. For example, suppose a new type of medicine is indeed more effective than older types. Researchers of course do not know this population parameter, so as a precaution they want the chance of incorrectly not rejecting H0 to be as small as possible in a sample study. Imagine the consequences of not introducing a more effective drug to the market (a decision resulting from a type II error: the new drug was found not significantly more effective). Of course a type I error should be kept to a minimum also. Imagine taking an effective drug from the market while replacing it with a less effective one (that was erroneously found significantly more effective in a sample).

3.2.2 Test for a proportion

A proportion is the number of units (e.g., respondents) that have some particular characteristic of interest, divided by the total number of units. This characteristic is typically measured using a dichotomous variable coded 0 for those not having the characteristic and 1 for all that have the characteristic. For example, the dichotomous variable Overweight may be coded 0, indicating that a respondent is not overweight, and 1, indicating that a respondent is overweight. Thus, using this coding technique, the proportion overweight equals the sum of all people being overweight, divided by the total sample size (this equals calculating the mean for the variable Overweight). By definition, a proportion lies between 0 (not a single observation possesses the characteristic) and 1 (every observation carries the characteristic). In a 2005 research project, 558 out of 1,209 respondents happened to be overweight. The proportion (notation: p₁) is thus 558 / 1,209 = .46. This means that, on average, 46 of every 100 respondents were overweight. The proportion of respondents who were not overweight (notation: p₀) equals .54 (p₁ + p₀ = 1).
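That a proportion is simply the mean of a 0/1 variable can be checked directly. The sketch below rebuilds the Overweight variable from the counts in the text; the list itself is a reconstruction for illustration, not the original data file.

```python
from statistics import mean

# 558 respondents coded 1 (overweight) and 651 coded 0 (not overweight)
overweight = [1] * 558 + [0] * 651

p1 = mean(overweight)   # proportion overweight = mean of the 0/1 variable
p0 = 1 - p1             # proportion not overweight

print(f"n = {len(overweight)}, p1 = {p1:.2f}, p0 = {p0:.2f}")
```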

Generally, when testing proportions, the null hypothesis does not state that p₁ is equal to 0. This is because finding at least one single respondent in a sample that has the characteristic of interest (i.e., code 1) would result in the rejection of the null hypothesis.

In many cases, and almost certainly in research in sociology, a population proportion (notation: p₁₀) larger than 0 will be stated in the null hypothesis. For instance, suppose that there are reasons to believe that the proportion of overweight Dutch people equals .40 in 2005. However, other researchers argue that this proportion is more likely to mirror the proportions in the United States, which are well over .40. Thus, according to H0 the proportion of overweight people is .40, in Ha a proportion higher than .40 is hypothesized, and .46 (notation: p₁) was found in the 2005 sample.

To test whether p₁ is conclusively higher than p₁₀ (allowing us to reject H0), a normal sampling distribution is used (see Figure 3.9). However, in cases of small sample sizes and/or very small or large values for p₁₀, the sampling distribution is probably not normally distributed. Conventionally, to determine whether the data allow for a test of proportions using a normal sampling distribution, a generally applied rule of thumb states that the 99.7% interval should not include the values 0 and/or 1. Recall from section 2.3.3 that 99.7% of all sample estimates lie within the range of -3 and +3 standard errors in a normal distribution. So, the 99.7% interval equals p₁₀ ± 3 σp₁₀, where σp₁₀ is the standard error of the sampling distribution. This parameter is calculated by dividing the standard deviation of the test variable (σ) by the square root of the sample size (n). Note that we use σ here instead of s because the population standard error is a direct function of the assumed population parameter. According to the null hypothesis, the population proportion of overweight people is .40 and the associated population standard deviation (σ) of the variable Overweight equals .49 (= √(.40 * .60)).¹¹ Consequently, the standard error equals .014 (= .49 / √1,209). According to the rule of thumb, 0 and/or 1 should not be part of the 99.7% interval. This is indeed the case (p₁₀ ± 3 σp₁₀ → .4 ± 3 * .014 = .4 ± .042 = (.36; .44)).
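The rule of thumb lends itself to a small check. The two helpers below are hypothetical names introduced for illustration; they compute the 99.7% interval around the proportion assumed in H0 and report whether the normal distribution may be used.

```python
from math import sqrt


def interval_997(p_h0, n):
    """99.7% interval around the H0 proportion: p_h0 plus/minus 3 standard errors."""
    se = sqrt(p_h0 * (1 - p_h0) / n)   # sigma / sqrt(n), with sigma = sqrt(p(1 - p))
    return p_h0 - 3 * se, p_h0 + 3 * se


def normal_test_allowed(p_h0, n):
    """Rule of thumb: neither 0 nor 1 may lie inside the 99.7% interval."""
    low, high = interval_997(p_h0, n)
    return low > 0 and high < 1


large_interval = interval_997(0.40, 1209)
print(large_interval, normal_test_allowed(0.40, 1209))

# A very small sample of 9 respondents, for comparison
small_interval = interval_997(0.40, 9)
print(small_interval, normal_test_allowed(0.40, 9))
```

With n = 1,209 the interval is roughly (.36; .44), so the requirement is met. With n = 9 the lower limit drops below 0 and the normal distribution should not be used.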

Following H0, with p₁₀ = .40, the sample proportion that was actually found (.46) is just one of the sample proportions that belong to the sampling distribution with p₁₀ = .40 and σp₁₀ = .014 (see Figure 3.9). Recall that inferential statistics lets us calculate the probability of finding particular sample results given the assumed population parameter and the standard error. The absolute difference between the sample proportion and the assumed population proportion equals .06 (.46 - .40). Relatively, this difference amounts to 4.3 standard errors (.06 / .014). Because σ is used, 4.3 is not a t-value but a z-value. The one-tailed p-value for z = 4.3 is very small (.000009), suggesting that there is good reason to reject the null hypothesis and to accept the alternative hypothesis. In other words, the proportion of overweight people in the 2005 Dutch population is very likely to exceed .40 (see Figure 3.9).
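The z-test for the proportion can be sketched as follows. Note that the code works with unrounded intermediate values, so its z-value (about 4.37) and p-value come out slightly different from the rounded figures in the text (4.3 and .000009); the conclusion is the same.

```python
from math import sqrt
from statistics import NormalDist

count, n = 558, 1209   # overweight respondents and sample size
p_h0 = 0.40            # proportion assumed under H0

p_sample = count / n
se = sqrt(p_h0 * (1 - p_h0) / n)     # standard error under H0
z_value = (p_sample - p_h0) / se

# One-tailed (upper-tail) probability, because Ha states a proportion above .40
p_one_tailed = 1 - NormalDist().cdf(z_value)

print(f"p1 = {p_sample:.3f}, SE = {se:.4f}, z = {z_value:.2f}, p = {p_one_tailed:.7f}")
```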

[Figure 3.9: sampling distribution (normally distributed) around p₁₀ = .40, with p₁ = .46 marked; one-tailed probability (p) = .000009 (= black area)]

Figure 3.9 A Test of a Proportion Using the Normal Distribution (p₁₀ = .40, p₁ = .46, and n = 1,209)

Generally, a test for proportions using a normal distribution will be correct when the proportions 0 and 1 are not included in the 99.7% interval around p₁₀. If this requirement is not met, the sampling distribution is probably not (approximately) normal. In these cases, the binomial distribution should be used instead. This distribution is most suited for testing proportions but closely resembles the normal distribution if 0 and/or 1 fall outside the 99.7% interval.¹²

To conclude this section, we present the results of the overweight hypothesis test, where the null hypothesis is that the proportion of overweight Dutch people does NOT exceed .40 and the alternative hypothesis is that this proportion is higher than .40. This test was first performed using a sample of 1,209 respondents and the results were compared to a smaller randomly selected sample of 9 respondents, using both the normal and the binomial distributions (see Table 3.10).

Table 3.10 Test of a Proportion Using the Normal Distribution and Using the Binomial Distribution with n = 1,209 and n = 9

Overweight   Counts   Observed               Proportion   One-tailed p            One-tailed p
                      proportion (p₁)        in H0        (normal distribution)   (binomial)

Yes             558   .46 (= 558 / 1,209)    .40          .000009                 .000008
No              651                                       .000008 *
Total         1,209

Yes               1   .11 (= 1 / 9)          .40          .037                    .071
No                8                                       .076 *
Total             9

* with correction for continuity (see endnote 13)


In the small sample (n = 9), 0 falls into the 99.7% interval (.11 ± 3 * √(.11 * .89 / 9) = (-.20; .42)). At a significance level (α) of .05, H0 can clearly be rejected when the normal distribution is erroneously used, but this is not the case with the binomial distribution. Scientific honesty requires us to report that this difference in test results disappears when a correction for continuity is performed.¹³
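The figures in Table 3.10 can be approximated with exact binomial tail probabilities. The sketch below uses only the standard library; the function names are hypothetical, and the binomial probabilities are computed on a log scale (via lgamma) because the binomial coefficients for n = 1,209 are too large for ordinary floating-point numbers. The small-sample probabilities are lower-tail probabilities, which is what the values reported in the table correspond to.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes, computed on a log scale."""
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coef + k * log(p) + (n - k) * log(1 - p))


def binom_upper_tail(k, n, p):
    """P(X >= k) for a binomial variable."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def binom_lower_tail(k, n, p):
    """P(X <= k) for a binomial variable."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


def normal_lower_tail(k, n, p, continuity=False):
    """Normal approximation of P(X <= k), optionally with continuity correction."""
    shift = 0.5 if continuity else 0.0
    z = (k + shift - n * p) / sqrt(n * p * (1 - p))
    return NormalDist().cdf(z)


# Large sample: 558 or more overweight respondents out of 1,209 under p = .40
p_large_binom = binom_upper_tail(558, 1209, 0.40)

# Small sample: 1 or fewer overweight respondents out of 9 under p = .40
p_small_binom = binom_lower_tail(1, 9, 0.40)
p_small_normal = normal_lower_tail(1, 9, 0.40)
p_small_corrected = normal_lower_tail(1, 9, 0.40, continuity=True)

print(f"n = 1,209 binomial: {p_large_binom:.6f}")
print(f"n = 9 binomial: {p_small_binom:.3f}, normal: {p_small_normal:.3f}, "
      f"normal with continuity correction: {p_small_corrected:.3f}")
```

In the small sample the uncorrected normal approximation falls below .05 while the binomial probability does not; with the continuity correction the normal result moves close to the binomial one.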

3.3 TESTS FOR COMPARING TWO MEANS

In section 3.2.1, we tested whether a single sample mean differs significantly from a hypothesized population mean. It is also possible to test whether two or more means statistically differ from each other. In these tests, a general distinction is made between comparing means within two dependent groups and within two (or more) independent groups.

3.3.1 PAIRED SAMPLES T-TEST (TWO DEPENDENT GROUPS)

Two groups are said to be statistically dependent when each unit of analysis (often respondents) within the first group is somehow related to a unit in the second group. For obvious reasons these groups are often referred to as paired groups. Consider a random sample of adult women (group 1) and a second group consisting of their mothers. The goal of such a design could be to determine differences in occupational careers. Another example is a random sample of respondents interviewed at two moments in time; for example, during elections held in 2003 and in 2006. A third example is the comparison of two variables, such as the results on a language test (group 1) and a math test (group 2), while both groups contain the same respondents. A typical characteristic of these three examples is that there is interdependency between the (paired) observations. Obviously mothers and their daughters are related through family ties, but they are also statistically related as it is likely that their occupational careers are more similar than those of any randomly chosen pair from the sample of mothers and the sample of daughters. Respondents at time 0 are related to themselves at time 1; respondents taking a language test and a math test are the same respondents during both tests, which makes it highly probable that the outcomes of both tests are related. Thus, the unit of analysis is not a single unit but a pair of units with two scores that are to be compared (see Table 3.11). Variable 3 in Table 3.11 indicates the differences between variable 1 and variable 2 for each pair. Of course, this new variable has a mean (the mean difference) and a standard deviation (s).

Table 3.11 Data File With Two Dependent Groups

Pairs (i)   variable 1   variable 2   variable 3 (difference between 1 and 2)

1           2            6            -4
2           5            1             4
3           1            1             0
...
n           x            y            x - y

Again, the standard error of the differences can be calculated by dividing the standard deviation of the test variable by the square root of the sample size. Because differences between the two variables result in a single variable (see Table 3.11), the test is equal to the mean test in section 3.2.1. The null hypothesis in this test will often state that there is no difference (mean difference = 0). The alternative hypothesis is typically directional, which means that the researcher expects the mean difference to be either higher or lower (positive or negative).

In concluding this section, we will present three examples from actual research. The first example deals with inequality between men and women. It is expected that women, on average, obtained a lower educational level than their spouses (we measured educational level with total years of education to obtain an interval variable). The second example utilizes a panel study from 1985 and 1990. In both years, the same group of respondents (the panel) was asked about their church attendance (measured as the number of days they attended church a year). The alternative hypothesis is that in those five years the mean level of church attendance had decreased. The third example comes from research on occupational mobility. It is generally expected that the social prestige of one's first job (measured at interval level) is lower than that of the respondent's current job.

Table 3.12 Three Paired Sample t-Tests (Dependent Groups)

Example   Pairs (i)                        Difference in:                      Mean difference   p (one-tailed)

1         Female - Male                    Education (years)                   -.44              <.001
2         Individual in 1985 and in 1990   Church attendance (days a year)     -.83              .03
3         First job - current job          Occupation (prestige)               -3.80             <.001


Table 3.12 suggests that the null hypothesis (H0) can be rejected for all three examples at the .05 significance level. Note that all p-values are one-tailed as all alternative hypotheses are directional. Women on average have lower educational levels compared to their partners (sample mean difference = .44 years). The frequency of attending church did decrease as expected (on average by .83 times a year), and the prestige of the respondent's first job indeed is lower than the prestige of their current job (3.8 points on the prestige scale). Note that if these hypotheses were tested more rigorously using a .01 level of significance (α), the decline in church attendance would not have been significant.

The test for a difference in means with dependent groups is statistically correct if the random sample is sufficiently large (n ≥ 30). With smaller numbers (30 > n > 4), it is assumed that the test variable is approximately normally distributed in the population. A histogram of the test variable's distribution in the sample may provide the researcher with information about this. With samples smaller than 5, it is assumed that the variable is distributed normally.
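The paired procedure of this section reduces to a one-sample test on the difference scores, which can be sketched as follows. The three pairs are the illustrative rows of Table 3.11; with only three pairs the sample size requirements above are not met, so the sketch shows the arithmetic only, not a valid test.

```python
from math import sqrt
from statistics import mean, stdev

# The three illustrative pairs from Table 3.11
variable_1 = [2, 5, 1]
variable_2 = [6, 1, 1]

# Variable 3: the difference score for each pair
differences = [a - b for a, b in zip(variable_1, variable_2)]

mean_diff = mean(differences)                            # mean difference
se_diff = stdev(differences) / sqrt(len(differences))    # s / sqrt(n)
t_value = (mean_diff - 0) / se_diff                      # H0: mean difference = 0

print(f"differences = {differences}, mean = {mean_diff}, "
      f"SE = {se_diff:.3f}, t = {t_value:.2f}")
```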

3.3.2 TWO-SAMPLE T-TEST (TWO INDEPENDENT GROUPS)

The previous section discussed tests for mean differences between women and their spouses, between individuals at two time points, and between individuals' performance on two comparable tests. In all three cases the groups were related or dependent. Groups can also be independent, such as a random sample from a population of women and a random sample from a population of men. Technically this means that the random selection of women from the population does not determine in any way which men are selected from the population. In case the first group contains randomly selected women from a population while the second group consists of their spouses, the two groups are not independent, as illustrated in the previous section.

To compare means in two independent groups, individual scores cannot be subtracted like they are in Table 3.11. The mean difference is now calculated by subtracting the mean in group 2 from the mean in group 1. The standard error associated with this difference is more difficult to calculate compared to paired groups. Besides group sizes, this calculation depends upon the difference in variances within the two groups. In the population, these two variances can be equal to each other (which is called 'homoscedastic'), or different (which is called 'heteroscedastic'). This is shown in Figure 3.13. Levene's test may be used to test whether there is homoscedasticity or heteroscedasticity. This test assumes equal variances in the population (H0) and is tested against the opposite (i.e., unequal variances). Often, a small α (e.g., .05) is used in these tests, which can result in not rejecting the null hypothesis (population variances are equal) even though there are relevant differences found in the variances from the sample. When there are vast differences, and when groups have unequal sizes, it is advised to assume unequal variances irrespective of the outcomes of Levene's test.

The t-distribution is well suited for testing mean differences. Recall that a t-value indicates how many standard errors lie between the difference in means and the assumed mean difference in H0 (which often equals 0). Subsequently, using the t-value, the associated p-value can be calculated and compared to the level of significance (α). Again, for directional alternative hypotheses, the one-tailed p-value should be calculated and compared with α.

[Figure 3.13: distributions of group 1 and group 2]

Figure 3.13 Heteroscedasticity (large differences between variances)

When using SPSS, both variants of the test (i.e., assuming equal and unequal variances) are performed simultaneously, allowing the researcher to see whether relevant differences occur in the p-values. If one wants to test the null hypothesis as rigorously as possible, one should select the test with the largest p-value.

Testing for a difference in means requires that the independent groups are sufficiently large (n ≥ 30). With smaller groups (30 > n > 4), it is assumed that the test variable in the population for both groups is approximately symmetrical. Research shows, however, that the test is also applicable when both distributions are asymmetrical but bear a strong resemblance. Histograms of the variables in the sample provide insight as to the shape of the distributions in the population. If the groups have even smaller sizes (n < 5), the variable in both populations has to be approximately normally distributed.


We present two examples to illustrate this test. In the first example, a comparison of weekly working hours (paid employment) is made between men and women. We expect that women on average have lower levels of full-time employment than men. This seems to be the case in a sample of Dutch respondents: women work, on average, 15.14 hours less than men (27.02 - 42.16). The variances are 143.6 and 114.5, respectively. These differences probably reflect the fact that many Dutch women work part-time, thereby lowering the mean, albeit with a lot of variability. Men are more likely to work full-time, so the mean is about 40 hours while the variance is relatively low. Although the variances are unequal, the sample sizes are approximately equal (318 women and 371 men). Therefore it seems reasonable to assume equal variances to test whether the difference of -15.14 deviates significantly from 0. The standard error associated with this difference proves to be .86. As a result, the t-value is -17.6 (-15.14 / .86). Because this value is located to the far left in the t-distribution, the (one-tailed) p-value is very small (smaller than .001). So, we can confidently conclude that on average Dutch women work fewer hours than Dutch men. When we assume unequal variances the outcomes are virtually identical (standard error = .87, t-value = -17.4).

In the second example, two unequally sized heteroscedastic groups are compared. Body Mass Index (BMI) serves as the dependent variable.¹⁵ In the first group, 29 respondents are aged between 20 and 21 and in the second group, 820 respondents are aged 30 or older. The mean BMI in the first group is 23.9 with a variance of 30.6, whereas the mean BMI in the second group is 25.3 with a variance of 15.6. Apparently, BMI increases with age (as might be expected), but the variance decreases. The difference between the means of BMI amounts to -1.4 (23.9 - 25.3), which seems quite small. Assuming equal variances, the standard error is .76, which results in a t-value of -1.84 (-1.4 / .76). Assuming unequal variances, which reflects the data far better, the standard error is 1.04 and the t-value is -1.35 (-1.4 / 1.04). The associated p-values are .03 (t-value = -1.84) and .09 (t-value = -1.35), respectively. When testing at the .05 significance level, and assuming unequal variances, H0 is not rejected.¹⁶

Table 3 . 14 summarizes our two examples.

Table 3.14 Three Two-Sample t-Tests (Independent Groups)

Example   Independent groups (n)              Difference in    Observed difference   p (one-tailed)

1         Female (318) - Male (371)           Working hours    -15.14                <.001
2a        Age 20-21 (29) - Age ≥ 30 (820)     BMI              -1.40                 .03
2b        idem (unequal variances)            BMI              -1.40                 .09
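The standard errors in both examples can be retraced from the reported means, variances and group sizes. The sketch below uses the usual pooled-variance formula for the equal-variances case and the separate-variances (Welch) formula otherwise; this chapter does not spell these formulas out, so treat them, and the function names, as assumptions. Small deviations from the t-values in the text arise because the text divides by rounded standard errors.

```python
from math import sqrt


def pooled_se(var1, n1, var2, n2):
    """Standard error of the mean difference, assuming equal population variances."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    return sqrt(pooled_var * (1 / n1 + 1 / n2))


def unpooled_se(var1, n1, var2, n2):
    """Standard error of the mean difference, assuming unequal population variances."""
    return sqrt(var1 / n1 + var2 / n2)


def welch_df(var1, n1, var2, n2):
    """Approximate degrees of freedom for the unequal-variances test."""
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Example 1: weekly working hours, women (n = 318) versus men (n = 371)
diff_hours = 27.02 - 42.16
se_hours_equal = pooled_se(143.6, 318, 114.5, 371)
se_hours_unequal = unpooled_se(143.6, 318, 114.5, 371)
t_hours_equal = diff_hours / se_hours_equal

# Example 2: BMI, age 20-21 (n = 29) versus age 30 or older (n = 820)
diff_bmi = 23.9 - 25.3
se_bmi_equal = pooled_se(30.6, 29, 15.6, 820)
se_bmi_unequal = unpooled_se(30.6, 29, 15.6, 820)
t_bmi_equal = diff_bmi / se_bmi_equal
t_bmi_unequal = diff_bmi / se_bmi_unequal
df_bmi_unequal = welch_df(30.6, 29, 15.6, 820)

print(f"hours: SE equal {se_hours_equal:.2f}, SE unequal {se_hours_unequal:.2f}, "
      f"t equal {t_hours_equal:.1f}")
print(f"BMI: SE equal {se_bmi_equal:.2f} (t = {t_bmi_equal:.2f}), "
      f"SE unequal {se_bmi_unequal:.2f} (t = {t_bmi_unequal:.2f}, df = {df_bmi_unequal:.0f})")
```

The BMI example shows why the choice matters: with very unequal group sizes and variances, the unequal-variances standard error is considerably larger, and the test rests on far fewer degrees of freedom.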

3.3.3 ANALYSIS OF VARIANCE (THREE OR MORE INDEPENDENT GROUPS)

When there are more than two independent groups, the t-test from the previous section is not suitable and an F-test must be used instead. The null hypothesis in this test states that all population group means are equal and the alternative hypothesis states that not all population group means are equal. This is a non-directional hypothesis, for it is only hypothesized that the group means differ from each other. There are two important factors that determine whether the null hypothesis is rejected or not. Firstly, the spread or variance of the group means is considered - the more they differ, the more likely it is that the means are not equal in the population (rejecting H0). As a measure for this group mean variability, the variance of the group means around the overall mean ('grand mean') is used. This variance is often indicated as the between variance (or MSG, which is short for Mean Squares of Groups). The size of the between variance is calculated based on a particular sum of the differences between group means and grand mean.¹⁷ A graphical representation of this between variance is given in the left panel of Figure 3.15.

Figure 3.15 Between Variance (MSG) and Within Variance (MSE) [left panel: group means around the grand mean; right panel: individuals around their group mean; horizontal axis: Groups 1-3]

The higher the between variance, the further the group means are located away from the grand mean. Or, in other words, the higher the between variance, the further the group means differ from each other.

The second factor that influences the test result is the amount of variability within each group. If this variability is quite large, it is less likely that the group means are sufficiently far apart to reject the null hypothesis. This variability is referred to as the within variance (or MSE: Mean Squares of Error). This variance is a sum of the units' variability around the respective group means (see the right panel of Figure 3.15).¹⁸ The larger the within variance, the further the observations are apart from their respective group mean, and the more difficult it is to demonstrate that these group means differ from each other in the population.

To summarize, when the between variance is small (small variability of the group means) and the within variance is large (a lot of variability around the group means), there will be little indication for group mean differences in the population. The opposite is also true; large between variance, combined with small within variance, clearly points to different group means in the population. To express this relationship between both variances statistically, the between variance is divided by the within variance. The outcome is called an F-value. Relatively small between variance and large within variance results in small F-values (see the left panel of Figure 3.16). Conversely, a large F-value is associated with relatively large between variance compared to the within variance (Figure 3.16, right panel). Because variances are the key objects, this type of analysis is labeled ANOVA, which is short for ANalysis Of VAriance.
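The division of the between variance by the within variance can be sketched in a few lines. This is a minimal illustration, assuming the conventional definitions (the between sum of squares divided by the number of groups minus 1, and the within sum of squares divided by the number of observations minus the number of groups). The scores are hypothetical and are not taken from the research project discussed in this section.

```python
def f_value(groups):
    """Return (between variance, within variance, F) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    n_total, k = len(all_scores), len(groups)
    grand_mean = sum(all_scores) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between: spread of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within: spread of the individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    msg = ss_between / (k - 1)
    mse = ss_within / (n_total - k)
    return msg, mse, msg / mse

# Hypothetical scores for three groups (illustration only).
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
msg, mse, f = f_value(groups)
print(msg, mse, f)
```

In this made-up example the group means (2, 3 and 7) lie far apart while the scores within each group lie close together, so the F-value is large.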

Figure 3.16 Small F-value (left panel) and Large F-value (right panel) [legend: group means, grand mean, individual scores; horizontal axis: Groups 1-3]


To determine whether an F-value is large enough to reject the null hypothesis, a computer program (such as SPSS) can be used to calculate the p-value in the F-distribution. This sampling distribution is rightwardly skewed (see Figure 3.17). Only p-values to the right are of interest because extreme F-values are always found to the far right of 0.¹⁹

Figure 3.17 An F-distribution, Observed F-value, and p-value

To illustrate the analysis of variance, an example follows from a research project regarding the relationship between child raising attitudes and levels of education. Cognitive theories suggest that the educational level attained influences these attitudes. To measure attitudes in child raising, respondents were asked to respond to the following statement: "Boys can be raised more lenient than girls" by choosing one of five categories: 'completely agree' (code 1), 'agree' (2), 'neutral' (3), 'disagree' (4), 'completely disagree' (5). Strictly speaking, this is an ordinal variable, but it is common practice to assume equal distances between each category, rendering it an interval variable and allowing us to calculate the mean score. The results are shown in Figure 3.18.

Figure 3.18 The Relationship between Educational Level and Raising Boys vs. Girls. Means, and IQR (= middle 50%) [vertical axis: score 1-5; horizontal axis: educational level (low, average, high); shown: grand mean, group means, IQR]


Figure 3.18 suggests that people, on average, tend to disagree with the statement of raising boys and girls differently. In addition, people with average or higher levels of education tend to disagree more strongly compared to those with lower levels of education. Table 3.19 demonstrates that the between variance (48.8) is much larger than the within variance (.761). The ratio between these two variances (F-value) is 64.1 (48.8 / .761) and is statistically significant (at the .01 significance level). Therefore, the null hypothesis that all three means are equal can be rejected.

Table 3.19 Results from ANOVA: Educational Level → Raising Boys vs. Girls

Between variance   48.8
Within variance    .761
F-value            64.1
p                  < .001

A more informative, theory-driven hypothesis suggests that higher levels of education result in less traditional child-raising attitudes. In this case, the alternative hypothesis states that the higher someone's educational attainment, the stronger he or she will disagree with the statement that boys should be raised more leniently than girls. Indeed, this association is observable in Figure 3.18. The Bonferroni test is an appropriate tool to statistically test which group means differ from each other. This test requires three separate t-tests for differences in means for two independent groups (see section 3.3.2) because three separate groups are present in the data. However, the level of significance is adapted in such a way that the total type I error does not exceed α. Without this adaptation, the total type I error could be as large as 3 * α, which is generally considered too large. According to our Bonferroni test, all group means differ significantly from each other in the expected direction, confirming the alternative hypothesis (see Table 3.20).

Table 3.20 Means and Differences in Means in Child Raising Attitudes (the larger the mean, the less traditional)

Educational level   Low       Average   High
Low                 3.734 •
Average             .316 *    4.050 •
High                .571 *    .255 *    4.305 •

• group mean, * difference between means (p < .01)
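The adaptation of the significance level can be sketched as follows. This is a minimal illustration, assuming an overall α of .05 (our choice; the text does not fix a value) and the simplest form of the Bonferroni adaptation, in which α is divided by the number of pairwise tests. The group means are taken from Table 3.20.

```python
from itertools import combinations

alpha = 0.05  # overall significance level; an assumed value for illustration
means = {"Low": 3.734, "Average": 4.050, "High": 4.305}  # group means from Table 3.20

# Three groups give three pairwise comparisons.
pairs = list(combinations(means, 2))

# Bonferroni: each separate t-test is evaluated at alpha divided by the number of tests,
# so that the total type I error does not exceed alpha.
alpha_per_test = alpha / len(pairs)

# Differences between the group means, as reported in Table 3.20.
differences = {(a, b): round(means[b] - means[a], 3) for a, b in pairs}

print(len(pairs), round(alpha_per_test, 4))
print(differences)
```

Each of the three t-tests is thus judged against a stricter level than α itself.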

The F-test rests on two assumptions. Firstly, the individual scores in each group are assumed to be normally distributed in the population. However, research has shown that even with non-normal distributions, the F-test is appropriate for practical use when n per group > 5. Secondly, it is assumed that the population variances in all groups are equal (i.e., homoscedasticity). Violation of this assumption may render the test less useful when the ratio between the largest and smallest sample group variance is more than 2 and the ratio between the largest and smallest group size is more than 4. In this situation, the Bonferroni test is not appropriate and a test that accounts for unequal group variances should be used.
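The rule of thumb for the second assumption is easy to check from the sample variances and group sizes. The sketch below is only an illustration of that rule; the function name and the numbers are hypothetical.

```python
def equal_variance_concern(variances, sizes):
    """Flag the situation described above: variance ratio > 2 and group size ratio > 4."""
    variance_ratio = max(variances) / min(variances)
    size_ratio = max(sizes) / min(sizes)
    return variance_ratio > 2 and size_ratio > 4

# Hypothetical sample variances and group sizes for three groups.
flagged = equal_variance_concern([1.2, 0.9, 0.5], [200, 150, 40])      # ratios 2.4 and 5
not_flagged = equal_variance_concern([1.2, 0.9, 0.8], [200, 150, 40])  # variance ratio 1.5
print(flagged, not_flagged)
```

Only when both ratios exceed their thresholds does the rule suggest switching to a test that accounts for unequal group variances.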

3.4 MEASURES OF ASSOCIATION FOR NOMINAL AND ORDINAL VARIABLES

In this section, measures of association are presented that describe relationships between nominal and/or ordinal variables. Discussed examples include the relationship between religious affiliation and political party preference (both nominal variables), and the relationship between educational level and income (both ordinal variables).

3.4.1 ASSOCIATIONS IN CONTINGENCY TABLES

Contingency tables are commonly used to describe the association or relationship between variables with low numbers of categories (preferably < 10). Due to limitations imposed by the number of categories, contingency tables are generally used for nominal and ordinal variables only. The table consists of two or more columns and rows, depending on the number of categories. The inner cells contain the observations for each combination of columns and rows. The outer cells are called the marginals, in which the total number of observations for each column and each row are presented. The sum of the row marginals (or, equivalently, of the column marginals) is the total number of observations and is shown in the lower right of the contingency table (see Table 3.21).

Table 3.21 Basic Structure of Contingency Tables

            Column 1                    Column 2                    Marginals
Row 1       Inner Cell 1                Inner Cell 2                Row total (Cell 1 + 2)
Row 2       Inner Cell 3                Inner Cell 4                Row total (Cell 3 + 4)
Marginals   Column total (Cell 1 + 3)   Column total (Cell 2 + 4)   Grand total (Cell 1 + 2 + 3 + 4)


When there is good theoretical reason to assume a causal ordering, it is common practice to represent the independent variable (also referred to as the x variable) as the column variable and the dependent (y) variable as the row variable. For example, suppose that a researcher studies the relationship between educational level (ordinal) and income class (ordinal). Since most people complete their education before starting their first regular job, the causal order seems indisputable. This means that the educational level variable serves as the column variable, whereas income is the row variable. When these variables each have two categories, the resulting contingency table will have two rows and two columns (see Table 3.22). Table 3.22 demonstrates that of 567 respondents with Secondary Vocational School or less, 263 earn a monthly net income of over 2,000 euros. Of the 408 respondents with O levels or higher, 287 earn over 2,000 euros. Based on the absolute difference (263 - 287), it seems that educational level does not influence income much. However, this comparison is incorrect because of unequal column totals (567 versus 408). Generally, absolute differences in contingency tables are inappropriate for determining the relationship between variables.

Table 3.22 Contingency Table with Educational Level and Income (in cells: absolute counts)

                     Educational Level
Income               Secondary Vocational or less   O levels or higher   Total
€2,000 at maximum    304                            121                  425
more than €2,000     263                            287                  550
Total                567                            408                  975

Percentages

To answer the research question as to whether higher educational levels induce higher incomes, a comparison has to be made between the two educational levels shown in Table 3.22. However, a comparison of the absolute counts in the inner cells is problematic because the column totals vary (567 and 408). In a fair comparison, this variation has to be ruled out. Therefore, the column totals are set to be equal first. A common way to do this is to set the column totals from Table 3.22 to 100 percent.²⁰ This is done by dividing the column totals by 5.67 (567 / 5.67 = 100) and 4.08, respectively. As the column totals represent the sum of the inner cell counts, these counts have to be divided by 5.67 and 4.08 as well. Consequently, the inner cells no longer represent the absolute counts but the column percentages (see Table 3.23). For example, the percentage of respondents with Secondary Vocational School or less and earning an income of over 2,000 euros is 46.4 (263 / 5.67 or alternatively (263 / 567) * 100). The percentage of respondents with O level education or higher with the same income is substantially higher - 70.3%. In other words, roughly 46 of every 100 people with Secondary Vocational School as their highest level of educational attainment have a monthly net income of more than 2,000 euros. Likewise, approximately 70 of every 100 respondents with O levels or higher earn more than 2,000 euros. Thus, the difference in percentages (notation: d%) equals 23.9 (70.3 - 46.4). In this case, both variables have only two categories, meaning d% can be computed by comparing the top inner cells as well (29.7 - 53.6). Descriptively, the answer to the research question is that higher levels of education do appear to increase the chances of earning a higher level of income later in life.

Table 3.23 Contingency Table with Educational Level and Income (in cells: absolute counts and percentages)

                     Educational Level
Income               Secondary Vocational or less   O levels or higher   Total
€2,000 at maximum    304                            121                  425
                     53.6%                          29.7%
more than €2,000     263                            287                  550
                     46.4%                          70.3%
Total                567                            408                  975
                     100%                           100%
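The column percentages and d% can be reproduced directly from the observed counts. The sketch below is a minimal illustration; the variable names are ours.

```python
# Observed counts from Table 3.22; rows = income class, columns = educational level.
counts = [
    [304, 121],  # €2,000 at maximum
    [263, 287],  # more than €2,000
]

# Column totals (567 and 408) are set to 100 percent.
column_totals = [sum(col) for col in zip(*counts)]
column_pct = [
    [100 * cell / total for cell, total in zip(row, column_totals)]
    for row in counts
]

# Difference in percentages for the row 'more than €2,000'.
d_pct = column_pct[1][1] - column_pct[1][0]

print([[round(p, 1) for p in row] for row in column_pct])
print(round(d_pct, 2))
```

Note that the unrounded difference is about 23.96; the value of 23.9 in the text results from subtracting the already rounded percentages (70.3 - 46.4).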

Presenting percentages in contingency tables is common practice, and in cases with low numbers of rows and columns, it provides a clear overview of the relationship between two variables. However, the difference in percentages (d%) is not commonly used as a measure of association due to two important disadvantages. Firstly, d% depends on whether the row totals or the column totals are set to 100%, which is particularly problematic when a causal order cannot be established on theoretical grounds. Unfortunately, clear causal ordering is more difficult to establish in the social sciences than it is in the natural sciences. Secondly, in tables with two rows and two columns, only one d% exists. In cases with more columns and/or rows, more differences in percentages can be calculated, which can be troublesome, as it is much more instructive to present the relationship using one single number.


3.4.2 MEASURES OF ASSOCIATION FOR NOMINAL VARIABLES

Chi-square Test and Cramér's V

Cramér's V can be used to describe the relationship between two variables, where at least one is a nominal variable. This measure is derived from the chi-square (notation: χ², pronounced as 'Ki-square'). The size of this chi-square indicates the difference between the observed and the expected counts in the inner cells of a contingency table. The expected numbers are calculated from the hypothetical situation of NO statistical relationship between the variables. In our previous example, we found a relationship between educational level and income (see Table 3.23). Now suppose that no relationship exists between these variables, while the counts in the marginals (i.e., the outer cells) are exactly equal to those in Table 3.23. Since there is no relationship, the percentages in both columns are identical and d% equals zero. Since the percentages in the marginals are taken from Table 3.23, the percentages per column also equal the percentages in the row totals (see the grey shaded cells in Table 3.24). In this table it is completely irrelevant as to which of the two educational levels is considered, as the chances of earning a higher income are exactly equal! In other words, about 56 of every 100 respondents earn an income of over 2,000 euros - a number that holds for both educational levels. In statistical terms, this means that no statistical relationship exists between educational level and income. From the inner cell percentages in Table 3.24, we can easily calculate the expected counts for each of the inner cells. For inner cell 1, the expected count is 247 (.436 * 567) and for inner cell 2 this is 178 (.436 * 408). The expected counts for inner cells 3 and 4 are 320 (567 - 247) and 230 (408 - 178), respectively (see Table 3.24).

Table 3.24 Contingency Table with Educational Level and Income (expected counts and percentages, condition: no relationship)

                     Educational Level
Income               Secondary Vocational or less   O levels or higher   Total
€2,000 at maximum    247                            178                  425
                     43.6%                          43.6%                43.6%
more than €2,000     320                            230                  550
                     56.4%                          56.4%                56.4%
Total                567                            408                  975
                     100%                           100%


The exact chi-square is calculated by taking the difference between the observed and the expected counts in each cell. These differences are squared, then divided by the associated expected count, and finally summed.²¹ In Table 3.23, the chi-square equals 55.5 (calculation: (304 - 247)² / 247 + (121 - 178)² / 178 + (263 - 320)² / 320 + (287 - 230)² / 230 = 13.1 + 18.2 + 10.1 + 14.1 = 55.5). Thus the chi-square indicates the level of discrepancy between the observed table (see Table 3.23) and the table without any statistical relationship (see Table 3.24). High chi-square values indicate a high level of relationship and vice versa.
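The same steps can be carried out without rounding the expected counts. The sketch below is a minimal illustration using the observed counts from Table 3.23; it assumes the usual way of obtaining an expected count under 'no relationship', namely row total times column total divided by the grand total, which is equivalent to applying the row percentages to the column totals.

```python
# Observed counts from Table 3.23; rows = income class, columns = educational level.
observed = [
    [304, 121],
    [263, 287],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under 'no relationship' (unrounded).
expected = [[r * c / n for c in column_totals] for r in row_totals]

# Squared differences, divided by the expected count, and summed over all inner cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print([[round(e, 1) for e in row] for row in expected])
print(round(chi_square, 2))
```

With unrounded expected counts the chi-square comes out slightly lower than the hand calculation above, which uses rounded numbers.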

Typically, the next (inferential) research question is: does the observed relationship also exist in the population? This question can be answered using the chi-square test. If a sufficiently large number of observations are present in the inner cells, the sampling distribution associated with this test closely resembles the χ²-distribution (shown in Figure 3.25).²²

Figure 3.25 A χ²-distribution, Observed χ²-value and p [grey area to the right of the observed χ²-value: probability (p)]

Table 3.26 shows the calculated chi-square value and the associated probability. Note that the chi-square value in Table 3.26 differs slightly from our own calculations because statistical software uses exact expected counts while we used rounded numbers. The probability is smaller than .001, suggesting that we should reject the null hypothesis stating no relationship between the two variables. As the difference in percentages is as expected (see Table 3.23), the alternative hypothesis - that people with O levels or higher generally earn more income than people with Secondary Vocational School or less - is supported. We would like to note that in tables with 2 rows and 2 columns, directional hypotheses can be tested. In such a table, the reported probability should be divided by two.²³

Table 3.26 The Association between Educational Level and Income: Chi-Square Test

Chi-square value   55.389
p (two-tailed)     < .001
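For a table with 2 rows and 2 columns the probability can also be sketched without a statistical package. The following is a minimal illustration, assuming the standard result that such a table has 1 degree of freedom and that a χ²-value with 1 degree of freedom is the square of a standard normal Z-value. The function name is ours.

```python
from math import erfc, sqrt

def chi_square_p_df1(chi_square):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    # With 1 degree of freedom, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return erfc(sqrt(chi_square / 2))

p_two_tailed = chi_square_p_df1(55.389)

# For a directional hypothesis in a 2 x 2 table the probability is divided by two,
# provided the observed difference is in the expected direction.
p_one_tailed = p_two_tailed / 2

print(p_two_tailed < 0.001, p_one_tailed < 0.001)
```

This shortcut holds only for 1 degree of freedom; larger tables require the χ²-distribution with more degrees of freedom, which is best left to statistical software.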


Performing a chi-square test using a χ² distribution is statistically appropriate when the number of observations is sufficiently large. Generally, this is determined using Cochran's rule, which states that the expected number of observations in each inner cell of the hypothetical table indicating no relationship (e.g., Table 3.24) should be at least 1, while in 80% of all inner cells the expected number should be at least 5. In Table 3.24, this rule is satisfied, so using the χ² distribution is not problematic.
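Cochran's rule is straightforward to check once the expected counts are known. The sketch below is a minimal illustration; the function name and the second, small-sample table are hypothetical.

```python
def satisfies_cochran(expected):
    """Check Cochran's rule for a table of expected counts (a list of rows)."""
    cells = [e for row in expected for e in row]
    all_at_least_1 = all(e >= 1 for e in cells)
    share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
    return all_at_least_1 and share_at_least_5 >= 0.8

# Expected counts from Table 3.24.
ok_table_3_24 = satisfies_cochran([[247, 178], [320, 230]])

# Hypothetical small-sample table: only 3 of the 4 expected counts (75%) are at least 5.
ok_small_sample = satisfies_cochran([[3, 7], [6, 14]])

print(ok_table_3_24, ok_small_sample)
```

When the check fails, rows or columns can be combined, or an exact test can be used instead.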

When samples are small and/or when the table has many rows and/or columns, the likelihood that Cochran's rule is not satisfied increases. One possible way to solve this problem is to combine rows and/or columns to increase the observations in the resulting inner cells. However, if this is not feasible, an exact test is more appropriate. This test rests on the numbers in the marginal cells of the contingency table. Based on these, the correct sampling distribution is derived, again under the assumption that no relationship is present. This is a relatively laborious procedure, similar to repeatedly drawing new samples from a population (see section 3.1). However, this procedure is not as time consuming anymore thanks to modern computers. The exact test is part of every established statistical software package, including SPSS. Using the correct sampling distribution and the observed counts, the correct p-value can be calculated and compared with the level of significance (α) set by the researcher.

The chi-square cannot be directly used to indicate the strength of the relationship. This is due to the fact that there is no natural limit to its size. Larger counts in the inner cells and/or a larger number of inner cells automatically lead to larger values for chi-square. For example, this means that a chi-square value of 30 in small samples may indicate a strong relationship, whereas in large samples it would indicate a weak relationship. This incomparability problem was solved by Swedish statistician Harald Cramér (1893-1985). He calculated the maximum possible value for chi-square, given a certain sample size and given a certain number of rows/columns. He then divided the observed value for chi-square by this maximum value and took the square root. Without this square root, a difference between the observed and expected numbers that was twice as large would actually indicate a relationship that was four times stronger. This is due to the squaring of the differences between the observed and expected numbers when chi-square is calculated. In this case, Cramér's V (as the measure is nowadays called) equals √(55.389 / 975) = 0.238.²⁴ The merit of Cramér's V is that its values are always between 0 and 1. A value of 0 indicates no relationship (the observed numbers are then identical to the expected numbers, so chi-square = 0). The value 1, on the other hand, indicates a perfect relationship (see Table 3.27 for an example).

Table 3.27 Perfect Relationship between Educational Level and Income (Cramér's V = 1)

                     Educational Level
Income               Secondary Vocational or less   O levels or more   Total
€2,000 at maximum    567                            0                  567
more than €2,000     0                              408                408
Total                567                            408                975
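The step from chi-square to Cramér's V can be sketched as follows. This is a minimal illustration, assuming the standard expression for the maximum possible chi-square: the sample size multiplied by the smaller of (number of rows - 1) and (number of columns - 1), which equals 975 in a table with two rows and two columns. The function name is ours.

```python
from math import sqrt

def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramér's V: square root of chi-square divided by its maximum possible value."""
    max_chi_square = n * min(n_rows - 1, n_cols - 1)
    return sqrt(chi_square / max_chi_square)

v_observed = cramers_v(55.389, 975, 2, 2)        # observed table (Table 3.23)
v_perfect = cramers_v(975, 975, 2, 2)            # chi-square at its maximum, as in a perfect relationship
v_quadrupled = cramers_v(4 * 55.389, 975, 2, 2)  # a chi-square that is four times as large

print(round(v_observed, 3), v_perfect, round(v_quadrupled / v_observed, 3))
```

The last value illustrates the role of the square root: a chi-square that is four times as large corresponds to a Cramér's V that is only twice as large.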

Although Cramér's V is always limited between 0 and 1, it is not easy to indicate when a relationship is 'weak' or 'strong'. Contrary to research in the natural sciences, it is virtually impossible to find values of Cramér's V that exceed .8 in most social science research. Moreover, in common research applications a value of .6 is considered exceptionally high. For example, the relationship between education and income will never be perfect (Cramér's V = 1) because other factors also play a role, such as work experience, weekly number of hours of work, type of job, and sex.

We propose the following indicators for the strength of a relationship:
• > 0 - .10 very weak
• .10 - .25 weak
• .25 - .35 moderate
• .35 - .45 strong
• > .45 very strong
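These indicators can be turned into a small lookup. The sketch below is an illustration only: the list above does not say to which category a boundary value such as .25 belongs, so assigning it to the higher category is our assumption, as is the function name.

```python
def strength_label(value):
    """Map the absolute value of an association measure to the labels proposed above."""
    v = abs(value)
    if v == 0:
        return "no relationship"
    if v < 0.10:
        return "very weak"
    if v < 0.25:
        return "weak"
    if v < 0.35:
        return "moderate"
    if v <= 0.45:
        return "strong"
    return "very strong"

label_education_income = strength_label(0.238)  # Cramér's V for Table 3.23
print(label_education_income)
```

Taking the absolute value makes the same lookup usable for measures that can also be negative.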

As a measure of association, Cramér's V is commonly used when at least one of the variables is nominal and both variables do not have too many categories. Therefore, in many instances both variables will be nominal, or one may possibly be ordinal. Note that the variables educational level and income in our example are dichotomous and thereby have interval characteristics (see section 1.2). This means that other measures of association that presume a higher level of measurement apply to this example as well and will lead to the exact same absolute value as Cramér's V.²⁵

To test whether a calculated Cramér's V-value differs from 0, the chi-square test can be used when sufficient observations are present. If Cochran's rule is not satisfied, an exact test should be used instead. In other words, if the value for chi-square is significantly different from 0, Cramér's V is significantly different as well, for the latter is directly derived from the former.

We end this section with an example in which Cramér's V is the only correct measure of association. The research question is whether a relationship exists between religious affiliation (nominal) and political party preferences (nominal) in the Netherlands (see Table 3.28). The grey shaded cells show that non-members have a strong preference for left-wing parties. Dutch Catholics traditionally have difficulties in choosing between left wing (an economical interest) and Christian parties (a cultural interest), while Protestants predominantly prefer Christian parties. The relatively large differences regarding political party preferences are reflected in a strong relationship between religious affiliation and political party preferences (Cramér's V = .39, p-value < .001, and differs significantly from 0 with all common values of α). As a side comment, we would like to add that these data are from 2005 and they suggest that political party preferences are still related to religious affiliation, despite the well documented processes of secularization.

Table 3.28 Relationship between Religious Affiliation and Political Party Preferences (counts, percentages, Cramér's V, and p-value)

                                     Religious affiliation
Political party in the Netherlands   Catholic   Protestant   None     Total
Christian parties                    79         126          39       244
                                     38.7%      63.6%        6.4%     24.2%
Left wing parties                    84         47           409      540
                                     41.2%      23.7%        67.4%    53.5%
Right wing parties                   35         21           113      169
                                     17.2%      10.6%        18.6%    16.7%
Liberal party                        6          4            46       56
                                     2.9%       2.0%         7.6%     5.6%
Total                                204        198          607      1,009
                                     100%       100%         100%

Cramér's V = .39, p < .001
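The reported value of Cramér's V can be checked from the counts in Table 3.28. The sketch below is a minimal illustration, assuming the usual expected counts (row total times column total divided by the grand total) and the standard maximum for chi-square (sample size times the smaller of rows - 1 and columns - 1).

```python
from math import sqrt

# Observed counts from Table 3.28; columns = Catholic, Protestant, None.
observed = [
    [79, 126, 39],   # Christian parties
    [84, 47, 409],   # Left wing parties
    [35, 21, 113],   # Right wing parties
    [6, 4, 46],      # Liberal party
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square: squared differences between observed and expected counts,
# divided by the expected count, and summed over all inner cells.
chi_square = sum(
    (o - r * c / n) ** 2 / (r * c / n)
    for row, r in zip(observed, row_totals)
    for o, c in zip(row, column_totals)
)

# Four rows and three columns: the smaller of (rows - 1) and (columns - 1) is 2.
v = sqrt(chi_square / (n * (min(len(observed), len(observed[0])) - 1)))
print(round(v, 2))
```

Because this table has more than two rows and columns, the division by 2 in the last step matters; in a table with two rows and two columns this factor equals 1.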

3.4.3 MEASURES OF ASSOCIATIONS FOR ORDINAL VARIABLES

In the previous section we used chi-square and Cramér's V to determine the relationship between variables of which at least one was nominal. If both variables are ordinal, not only can the strength of the relationship between the two be determined, but so can its direction. For example, it is obvious to expect a positive relationship between educational level and income: the higher the level of education attained, the more income earned. Likewise, studies show a negative relationship between health care and child mortality: the more a government invests in health care, the lower child mortality will be. In both cases, Cramér's V is not appropriate, as it fails to detect the direction of significant relationships.


Kendall's Rank Correlation: tau b and tau c

Maurice Kendall (1907-1983) constructed a rank correlation to express not only the presence and strength of a relationship between two ordinal variables, but also the direction. Kendall's correlation reaches the maximum values of 1 or -1 in a contingency table with ordinal variables, in case all observations are located on the main diagonal (see Table 3.29).

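Table 3.29 Perfect Positive and Perfect Negative Relationship [two tables with the categories Low, Moderate and High for both variables; first table: all observations on one diagonal, Kendall's tau b = 1; second table: all observations on the opposite diagonal, Kendall's tau b = -1]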

Table 3.29 is highly hypothetical as such situations will rarely occur, if ever. In the social sciences it is more realistic to find the off-diagonal cells filled to some extent as well. In cases of positive relationships, units (e.g., respondents) scoring low on the first variable tend to score low on the second variable and units scoring high on the first variable tend to score high on the second variable as well. However, there will also be units where high scores on variable 1 relate to low scores on variable 2 and vice versa. These violations will cause the positive relationship to be less than 1. Kendall constructed his rank correlation tau (in notation the Greek letter τ is often used) by comparing the number of positively related observations with the number of negatively related observations. Table 3.30 is an example for educational level and income.

Table 3.30 Relationship between Education and Income, as Example 20 Respondents with Average Education and Average Income (+ indicates positive relation, - indicates negative relation, [ ] marks the grey shaded cell)

           Educational level
Income     Lowest   Low    Average   High   Highest
Lowest     + 20     + 5    …         - 3    - 4
Low        + 5      + 10   0         - 2    - 1
Average    …        …      [20]      …      …
High       - 4      - 1    …         + 15   + 10
Highest    - 2      - 2    …         + 5    + 25


In Table 3.30, the highlighted inner cell illustrates the calculation of Kendall's tau. This cell represents 20 respondents with average levels of education and income. Typically, education and income are positively related, thus respondents with a low or very low educational level are likely to have an income lower than the 20 people in the grey shaded cell. Likewise, people with a high or very high level of education will typically have a higher level of income compared with these 20 respondents. Indeed, this is the case for the respondents in the cells marked '+', amounting to 95 respondents (20 + 5 + 5 + 10 + 15 + 10 + 5 + 25). This means that for each of the 20 respondents in the grey shaded cell, 95 respondents behave according to the presumed positive relationship. Since there are 20 respondents in the grey shaded cell, there are 95 * 20 = 1900 combinations which are called concordant pairs. On the other hand, respondents in cells marked with a '-' indicate a negative association. In total there are 19 (3 + 4 + 2 + 1 + 4 + 1 + 2 + 2) respondents who do not behave according to the assumed positive relationship, resulting in 380 (20 * 19) discordant pairs. Notice that the cells in the same row and column as the grey shaded cell (called 'ties') are not used when calculating the number of concordant and discordant pairs. Following this strategy we can calculate the number of concordant and discordant pairs for every cell in Table 3.30. Kendall's tau is simply the difference (notation: S) between the total number of concordant and discordant pairs, divided by a certain number to keep tau in the range of -1 (perfect negative relationship) and +1 (perfect positive relationship).
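The counting for the grey shaded cell can be sketched as follows. This is a minimal illustration using the counts quoted in the calculation above; cells in the same row or column as the grey shaded cell are ties, are never read by the function, and are therefore left as None. The function name is ours.

```python
# Counts from Table 3.30; rows = income, columns = educational level (lowest to highest).
# None marks tied cells in the row and column of the focal cell, which are not used.
table = [
    [20, 5, None, 3, 4],
    [5, 10, None, 2, 1],
    [None, None, 20, None, None],
    [4, 1, None, 15, 10],
    [2, 2, None, 5, 25],
]

def pairs_for_cell(table, r, c):
    """Concordant and discordant pairs formed with the respondents in cell (r, c)."""
    concordant = discordant = 0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if i == r or j == c:
                continue  # same row or column: ties, not used
            if (i < r) == (j < c):
                concordant += count  # lower on both variables, or higher on both
            else:
                discordant += count  # lower on one variable, higher on the other
    n_cell = table[r][c]
    return n_cell * concordant, n_cell * discordant

concordant_pairs, discordant_pairs = pairs_for_cell(table, 2, 2)
print(concordant_pairs, discordant_pairs)  # 1900 380
```

Repeating this for every cell, while making sure each pair is counted only once, gives the totals from which S is obtained.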

Kendall used three different denominators for tau to ensure a range of (-1, +1), resulting in the existence of three different tau measures: tau a, tau b, and tau c.26 Tau b and tau c are especially important in social science research. Tau b can reach values -1 and +1 when a contingency table has an equal number of columns and rows ('square' tables). Tau c can reach values -1 and +1 in 'rectangular' tables (an unequal number of rows and columns). Thus, tau b is most suitable for square tables while tau c is most suitable for rectangular tables. However, the use of different denominators does not alter the interpretation: Kendall's tau is positive when the number of concordant pairs exceeds the number of discordant pairs and is negative when the number of discordant pairs exceeds the number of concordant pairs. The strength of the relationship expressed by Kendall's tau can be determined using the rules on page 83. Calculating the concordant and discordant pairs for Kendall's tau b and tau c by hand is highly time consuming, but it can easily be done in most statistical packages, including SPSS.

To statistically test whether Kendall's tau significantly differs from 0, the normal sampling distribution can be used when the sample size equals 30 or more (see Figure …). Generally, H0 assumes that the total numbers of concordant and discordant pairs are equal, so S = 0 and Kendall's tau takes the value 0. To test whether the observed S significantly differs from 0, a z-value and associated p-value are calculated.27 When the sample size is lower than 30, it is advisable to use an exact-test instead because the sampling distribution is probably no longer normally distributed.

To conclude this section, two examples from social science research will be given. The first example regards the relationship between the respondents' educational level and spouse's educational level to determine the extent of educational homogamy (see Table 3.31).

Table 3.31 Relationship between Educational Level and Spouse's Educational Level

Educational level     Educational level (respondent)
(spouse)              Lowest    Low       Average   High      Highest   Total

Lowest                21        18        3         2         1         45
                      39.6%     7.7%      1.2%      1.0%      1.2%      5.6%

Low                   19        126       67        26        5         243
                      35.8%     54.1%     27.8%     13.1%     6.2%      30.1%

Average               8         52        87        64        14        225
                      15.1%     22.3%     36.1%     32.2%     17.3%     27.9%

High                  4         33        67        81        28        213
                      7.5%      14.2%     27.8%     40.7%     34.6%     26.4%

Highest               1         4         17        26        33        81
                      1.9%      1.7%      7.1%      13.1%     40.7%     10.0%

Total                 53        233       241       199       81        807
                      100%      100%      100%      100%      100%      100%

Kendall's tau b = .45, p (one-tailed) < .001

The grey shaded cells in Table 3.31, which hold the highest percentage per column, suggest that there is a positive relationship between the ordinal variables: the higher the respondents' educational level, the higher the educational level attained by their spouse. This trend is also associated with a strong positive relationship (.45), and because the p-value is very small (p < .001), it is extremely unlikely that this positive relationship does not exist in the population. Interestingly, the data in Table 3.31 were collected in the Netherlands in 2000, suggesting that even nowadays educational level seems important when choosing a partner (a phenomenon called educational homogamy).

Our second example returns to a topic previously discussed: the relationship between respondents' educational level and income class (see Table 3.32).

Table 3.32 Relationship between Educational Level and Income

                      Educational level
Income                Lowest    Low       Average   High      Highest   Total

Less than €3,000      39        101       75        38        7         260
                      61.9%     38.3%     27.5%     15.7%     7.4%      27.7%

€3,000-€5,000         22        111       106       86        21        346
                      34.9%     42.0%     38.8%     35.5%     22.1%     36.9%

More than €5,000      2         52        92        118       67        331
                      3.2%      19.7%     33.7%     48.8%     70.5%     35.3%

Total                 63        264       273       242       95        937
                      100%      100%      100%      100%      100%      100%

Kendall's tau c = .36, p (one-tailed) < .001

Again, cells with the highest column percentages are highlighted in Table 3.32. There appears to be a positive relationship between educational level and income: the higher the educational level of a respondent, the higher his or her earnings. This tendency is also reflected in Kendall's tau c (used because the table is rectangular), which indicates a strong positive and significant relationship (.36, p (one-tailed) < .001).
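The two tau values reported under Tables 3.31 and 3.32 can be recomputed from the cell counts. The sketch below is not taken from the book: the function names are ours, each pair of respondents is counted once, and we assume the commonly used textbook denominators for tau b (pairs not tied on the row variable times pairs not tied on the column variable, under a square root) and tau c (based on the smaller of the number of rows and columns).

```python
from math import sqrt

def pair_counts(table):
    """Return (concordant, discordant) pair counts for an ordered r x c table."""
    rows, cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # rows below only: each pair counted once
                for l in range(cols):
                    if l > j:                 # higher on both variables
                        concordant += table[i][j] * table[k][l]
                    elif l < j:               # higher on one, lower on the other
                        discordant += table[i][j] * table[k][l]
    return concordant, discordant

def tau_b(table):
    c, d = pair_counts(table)
    n = sum(map(sum, table))
    all_pairs = n * (n - 1) / 2
    tied_rows = sum(t * (t - 1) / 2 for t in (sum(r) for r in table))
    tied_cols = sum(t * (t - 1) / 2 for t in (sum(col) for col in zip(*table)))
    return (c - d) / sqrt((all_pairs - tied_rows) * (all_pairs - tied_cols))

def tau_c(table):
    c, d = pair_counts(table)
    n = sum(map(sum, table))
    m = min(len(table), len(table[0]))        # smaller table dimension
    return 2 * m * (c - d) / (n ** 2 * (m - 1))

# Cell counts from Table 3.31 (rows: spouse, columns: respondent)
table_331 = [
    [21, 18, 3, 2, 1],
    [19, 126, 67, 26, 5],
    [8, 52, 87, 64, 14],
    [4, 33, 67, 81, 28],
    [1, 4, 17, 26, 33],
]
# Cell counts from Table 3.32 (rows: income class, columns: educational level)
table_332 = [
    [39, 101, 75, 38, 7],
    [22, 111, 106, 86, 21],
    [2, 52, 92, 118, 67],
]

print(round(tau_b(table_331), 2))  # 0.45
print(round(tau_c(table_332), 2))  # 0.36
```

Both results agree with the values printed under the tables, which suggests that the reported coefficients follow these standard definitions.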

Spearman's Rank Correlation

Because ordinal variables are rank ordered, the relationship between them can also be expressed as the difference in rank order, as argued by psychologist Charles Spearman (1863-1945). Suppose five respondents each have a different level of education. We can assign rank scores to these individuals that correspond to their respective ranking of education. The respondent who has the lowest educational level is assigned the score 1, the respondent with the second lowest educational level receives score 2, the middle category equals score 3, the second highest educational level equals score 4, and the respondent with the highest educational level is ranked 5. Next, income is ranked in the same way. Now suppose that the variables educational level and income are perfectly related. In this event the ranking of education perfectly matches the ranking of income, and Spearman's rank correlation (often indicated with rs) equals 1. When there is no relationship between education and income, neither is there a relationship between the rank order of education and income (rank correlation = 0). Finally, when a perfectly negative relationship exists between educational level and income, the rank orders of both variables perfectly oppose each other (rank correlation = -1). Table 3.33 contains details for Spearman's rank correlation.

Table 3.33 Spearman's Rank Correlation with Equal (1), Independent (0), and Opposite (-1) Rankings.

resp.   Education   r*     Income    r*     Income    r*     Income    r*

A       Lowest      1      Lowest    1      High      4      Highest   5
B       Low         2      Low       2      Lowest    1      High      4
C       Average     3      Average   3      Average   3      Average   3
D       High        4      High      4      Highest   5      Low       2
E       Highest     5      Highest   5      Low       2      Lowest    1

Rank correlation:          1                0                -1

* r = Ranking of Education and Income

Spearman's rank correlation is calculated using the rank scores of two ordinal variables. To prevent the correlation from falling outside of the -1 and +1 range, rank scores are first standardized into z-scores, eliminating the influence of variables measured in different units. For example, in Table 3.33, the variables educational level and income are difficult to compare because the ranking is measured in different units (levels vs. income classes). An easy solution for this incomparability is to transform them into z-scores (see section 2.3.3). Next, for each unit of analysis (often respondents), the two z-scores are multiplied and summed across all units to a total. This total sum of multiplied z-scores reaches a positive maximum if both rank orders match perfectly. Conversely, the total sum has a maximum negative value when both rank orders perfectly oppose each other. However, more units result in a higher total sum. Therefore, the total sum is divided by the total number of units (n), resulting in a value that always falls between -1 (maximum negative association) and +1 (maximum positive association), while 0 means no association at all.28
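The procedure described above (standardize the rank scores, multiply the z-scores per respondent, sum, and divide by n) can be sketched as follows, using the three income rankings from Table 3.33. The function names are ours, and we assume that the z-scores are computed with a standard deviation that divides by n, which is what makes dividing the total sum by n yield exactly 1 for identical rankings.

```python
from math import sqrt

def z_scores(values):
    """Standardize values; the standard deviation divides by n (assumption)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def rank_correlation(ranks_x, ranks_y):
    """Mean of the products of the z-scores of two sets of rank scores."""
    zx, zy = z_scores(ranks_x), z_scores(ranks_y)
    return sum(a * b for a, b in zip(zx, zy)) / len(ranks_x)

# Rank scores of respondents A to E, taken from Table 3.33
education = [1, 2, 3, 4, 5]
income_equal = [1, 2, 3, 4, 5]
income_independent = [4, 1, 3, 5, 2]
income_opposite = [5, 4, 3, 2, 1]

for income in (income_equal, income_independent, income_opposite):
    # Results are 1, 0 and -1, up to floating-point rounding
    print(f"{rank_correlation(education, income):.2f}")
```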

So, the rank scores are first transformed into z-scores (a process called 'standardization') and the standard deviation becomes the unit of measurement. Therefore, rs means that a standard deviation change of 1 in the ranking of variable x results in a change of rs standard deviations in the ranking of variable y. For example, when rs equals -.5, a rise of 1 standard deviation in the ranking of x goes along with a decline of .5 standard deviations in the ranking of y. Spearman's rank correlation thus indicates the extent to which the variables' rankings differ from each other. Highly similar rank orders result in high rank correlations (maximum 1), and opposing rankings result in negative rank correlations (minimum -1).

To test Spearman's rank correlation for samples with at least 30 observations, the t-distribution can be used (see Figure 3.8). Again, the t-value indicates the relative distance between the observed rank correlation and the correlation stated in the null hypothesis.29 Finally, the associated p-value is compared with the selected level of significance (α). For samples with fewer than 30 observations, it is preferable to use an exact-test, where the p-value is calculated based on the distributions of the x and y variables, assuming that there is a zero rank correlation in the population (= H0). Again, this probability (p) is compared to the level of significance (α) to test whether the observed rank correlation differs from 0 in the population (= Ha).

As with Kendall's tau, Spearman's rank correlation can be used for variables measured at the ordinal level. Whether Kendall's tau or Spearman's correlation should be used depends on the research question: to compare findings with those from previous studies, it makes sense to choose the same measure of association. Additionally, the advantage of Kendall's tau is its ease of interpretation, namely the difference between the number of concordant and discordant pairs.

The benefit of Spearman's rank correlation is that it is similar to Pearson's correlation coefficient (see section 3.5.1), which is an important measure of association for interval/ratio variables. However, use of the latter measure requires an approximately linear relationship, which is not required for Spearman's correlation (for an explanation of linear associations, see section 3.5.2). Therefore, rs provides a better alternative to describe a bivariate nonlinear relationship between interval variables.30

Figure 3.34 provides an example of a nonlinear association between age (measured in years) and the Body Mass Index. Between 18 and 33 years of age, the mean BMI rises strongly and does not rise much further after that. Thus, the BMI does not increase linearly between 18 and 70 years of age. Pearson's correlation coefficient for this relationship is .22, but this is a misrepresentation of the real nonlinear association. The strength of the relationship between age and BMI is expressed more appropriately by Spearman's rank correlation and is stronger: .29 (p < .001).

[Figure: line graph of mean BMI (vertical axis, 18 to 30) against age (horizontal axis, 18 to 68 years)]

Figure 3.34 Nonlinear Relationship between Age and BMI (rs = .29)

3.5 MEASURES OF ASSOCIATION FOR INTERVAL AND RATIO VARIABLES

This section discusses the statistical tools available for interval and ratio variables. Firstly, Pearson's correlation coefficient can be calculated. Secondly, regression analysis can be utilized to demonstrate, for instance, the average weight increase for every single unit increase in age. Finally, this analytical technique can be used to determine the influence exerted by multiple variables simultaneously on a dependent variable (y).

3.5.1 PEARSON'S CORRELATION COEFFICIENT

Karl Pearson (1857-1936) suggested that the relationship between two interval or ratio variables can be analyzed using a linear correlation coefficient now known as Pearson's correlation coefficient (notation: r). This coefficient equals the maximum value of 1 when a 1-unit increase of the variable x is associated with a 1-unit increase of variable y (see Figure 3.35). This relationship is called a perfect positive linear association. The linear association is perfectly negative (-1) if every unit increase of x results in a 1-unit decrease in y.

[Figure: scatter plot with y on the vertical axis and x on the horizontal axis]

Figure 3.35 Pearson's Correlation Coefficient

Table 3.36 Pearson's Correlation Coefficient for Height and Weight

              Height      p (one-tailed)

Weight        .52         p < .001

Recall that variables often have different units of measurement. For example, the variables body weight and age are measured in kilograms and years, respectively. In section 2.3.3, we demonstrated that this problem can be solved by transforming the measures into z-scores. Likewise, the original scores of both variables used to calculate Pearson's correlation coefficient are also transformed into z-scores. Now, per unit of analysis, these z-scores on x and y are multiplied and finally summed across all units. This total sum is positive when the linear association is also positive, and vice versa. Furthermore, just as in Spearman's rank correlation, the total sum tends to be higher when the total number of observed units (often respondents) is higher. Dividing by the total number of observations results in a correlation that falls within the -1 to 1 range.31 The correlation coefficient always lies between these two extremes and equals 0 when there is no linear association. This, however, may not mean that there is no association between the variables, as a nonlinear association may exist (see Figure 3.34).

As was mentioned before, the scores on the original variables were first transformed into z-scores with the standard deviation as their unit of measurement. Therefore Pearson's correlation coefficient indicates that when the score on one variable (x) increases by 1 standard deviation, the score on the associated variable (y) will increase by a number of standard deviations equal to r. In chapter 2, we graphically demonstrated a relationship between height and weight (see Figure 2.29). Numerically, this positive relationship can be described with a Pearson's correlation, which amounts to .52 (see Table 3.36). Therefore, for every standard deviation increase of height, weight increases on average by .52 standard deviations. An interpretation of the strength of Pearson's correlation coefficients can be found on page 83. As with Spearman's rank correlation, Pearson's correlation is statistically tested using a t-distribution when the sample has at least 30 observations.32 In smaller samples, the exact-test provides a more appropriate alternative.

The correlation coefficient is commonly used as a measure of association. However, this measure assumes a linear relationship, which can be checked graphically in a line graph (see Figure 3.34) and numerically by comparing r with rs, where rs > r indicates a nonlinear relationship. One disadvantage of Pearson's correlation is its sensitivity to extreme scores (outliers), especially when relatively few observations are present.
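The numerical check of comparing r with rs can be illustrated with a small sketch. The data below are hypothetical (they are not from the book) and were chosen so that y always rises with x but levels off, much like the BMI curve in Figure 3.34. The helper names are ours, and the simple ranking function assumes there are no tied values.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank scores from 1 (lowest) upward; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Hypothetical data: y rises steeply at first and then levels off
x = [1, 2, 3, 4, 5, 6]
y = [10, 20, 25, 27, 28, 28.5]

r = pearson(x, y)
rs = pearson(ranks(x), ranks(y))   # Spearman's rs is Pearson's r on the rank scores

print(f"r = {r:.2f}, rs = {rs:.2f}")  # rs equals 1 here, while r stays below 1
```

Because the relationship is monotone but not linear, the rank orders match perfectly while the raw scores do not lie on a straight line, so rs exceeds r.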

In our example, Pearson's correlation coefficient indicates the relative change in weight associated with changes in height. However, another interesting question remains unanswered: on average, how many kilograms are added to body weight when body height increases by a particular (absolute) value? The answer can be found in the next section.

3.5.2 LINEAR REGRESSION ANALYSIS

Linear regression analysis is connected to the work of Sir Francis Galton (1822-1911), who researched heredity (nowadays labeled 'genetics'). For example, he studied the relationship between successive generations of sweet peas. The size of the peas within a given generation provided a close prediction of the size of the next generation. More generally, Galton demonstrated that values on a dependent (y) variable can be predicted by scores on an independent (x) variable. This technique is called regression analysis.

In Figure 2.29 we demonstrated that taller respondents typically weigh more. Based on Pearson's correlation coefficient this association appears to be quite strong (see Table 3.36). In addition, it is also possible to indicate how many kilograms someone's weight will increase on average with a given unit increase of body height. Again, the relationship between body height and body weight is assumed to be approximately linear for the respondents in the sample (i.e., respondents who measure between 150 and 205 centimeters). This means that weight increases at a constant factor. This factor is represented by a regression line that can be found using data from a scatter plot. Three attempts to find this line were made in Figure 3.37. Lines 1 and 3 don't seem to represent the linear increase very well. According to line 1 the weight increases too fast (too many observations cluster below the line), and according to line 3 the weight rises too slowly (too many observations above the line). Line 2, however, does seem to provide a good approximation as it roughly runs through the middle of all observations in the scatter plot (line 2 actually is based on results from regression analysis). One of the goals in calculating this regression line is to have the sum of the (vertical) distances between observations and the regression line equal to zero.

[Figure: scatter plot of weight in kilograms (vertical axis, 40 to 105) against height in centimeters (horizontal axis, 150 to 205), with three candidate lines labeled [1], [2], and [3]]

Figure 3.37 The Relationship between Height (in centimeters) and Weight (in kilograms) and 3 Lines Representing the Linear Tendency

[Figure: regression line of weight in kilograms (vertical axis, 40 to 110) on height in centimeters (horizontal axis, 150 to 205); annotations: difference ('error') between observed weight and predicted weight, 15 kg; b coefficient = 15 / 20 = .75]

Figure 3.38 Regression Line Representing the Linear Relationship between Height and Weight

For illustrative purposes, we highlight in Figure 3.38 five respondents together with the regression line (line 2 from Figure 3.37). One respondent measuring 160 centimeters has an observed weight of 45 kilograms and a predicted weight of 60 kilograms. This (vertical) difference (called error, residual or e) is -15 kilograms (45 - 60). The difference for the other respondent measuring 160 centimeters but weighing 75 kilograms is 15 (75 - 60). For one respondent the observed weight equals the predicted weight (difference: 75 - 75 = 0), while for the remaining two respondents (measuring 200 centimeters) the error amounts to 15 and -15 (105 - 90 and 75 - 90). The sum for all five differences equals 0 (-15 + 15 + 0 + 15 + -15).

Having calculated the regression line we can easily determine the average linear weight increase when height increases by 1 centimeter. Mathematically, this factor is known as the gradient, but in regression analysis it is referred to as the b coefficient. This coefficient can be derived from Figures 3.37 and 3.38. In Figure 3.38 respondents measuring 160 and 180 centimeters are compared. The regression line predicts that they weigh 60 and 75 kilograms, respectively. A 20 centimeter increase (180 - 160) therefore results in a predicted increase of 15 kilograms (75 - 60). So, the b coefficient is .75 (15 / 20), meaning that an increase of 1 centimeter on average results in .75 kilogram more weight. Generally, the b coefficient is the change in units of y (change in kilograms in this example) associated with a 1-unit change in x (1 centimeter in this example).

The intercept or constant (a) is also an important parameter in addition to the b coefficient (b), as every straight line can be mathematically described as y = a + bx. In our example this is: body weight = a + b * body height. The intercept (a) is the value of y where the line crosses the y-axis. It can be found simply by extending the regression line to the y-axis. The y-axis originates at x = 0. When the regression line in Figure 3.38 is extended to 0 (height = 0), the weight drops from 60 kilograms (height 160 centimeters) to -60 kilograms (60 - 160 * .75 = -60), see Figure 3.39.
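The two parameters of the line y = a + bx can be derived from any two points on the regression line. The sketch below uses the two points mentioned in the text (160 cm, 60 kg and 180 cm, 75 kg); the variable names are ours.

```python
# Two points on the regression line in Figure 3.38: (height in cm, predicted weight in kg)
x1, y1 = 160, 60
x2, y2 = 180, 75

b = (y2 - y1) / (x2 - x1)   # b coefficient: change in y per 1-unit change in x
a = y1 - b * x1             # intercept: value of y when the line is extended to x = 0

print(b, a)  # 0.75 -60.0
```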

[Figure: the regression line extended to height = 0; annotations: 60 kg at 160 cm, gradient 0.75]

Figure 3.39 Constant (a) and b Coefficient (b)

Of course, no real meaning can be attached to the intercept (a) in this example because there are no respondents with zero height in the sample, highlighting the danger of extrapolation. This, however, does not mean that the results are invalid for the respondents in the sample. The regression equation is: body weight = -60 + .75 * body height, and applies to all respondents represented in this sample (they all measure between 150 and 205 centimeters). It is unknown whether this equation is valid for people taller or shorter than this, but we know for sure that the equation is not valid for newborns (height ± 50 centimeters).

In our example, we determined a and b roughly from Figures 3.37 and 3.38. In statistical packages the regression estimates for a and b are calculated using a technique called ordinary least squares (OLS).33 Based on this technique, the 'real' a and b equal -50.54 and .73, respectively (see Table 3.40).
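As a rough illustration of what ordinary least squares does, the sketch below applies the standard closed-form OLS formulas for one x variable to the five respondents highlighted in Figure 3.38. It is not the book's own computation (the estimates in Table 3.40 are based on the full sample), and the function and variable names are ours.

```python
def ols(x, y):
    """Closed-form OLS estimates (a, b) for the line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx               # slope that minimizes the sum of squared errors
    a = mean_y - b * mean_x     # the line passes through the point of means
    return a, b

# The five respondents highlighted in Figure 3.38
height = [160, 160, 180, 200, 200]   # centimeters
weight = [45, 75, 75, 105, 75]       # kilograms

a, b = ols(height, weight)
residuals = [w - (a + b * h) for h, w in zip(height, weight)]

print(a, b)            # -60.0 0.75
print(residuals)       # [-15.0, 15.0, 0.0, 15.0, -15.0]
print(sum(residuals))  # 0.0
```

For these five points OLS returns the same line that was read from the figure (a = -60, b = .75), and the errors sum to zero.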

Table 3.40 Linear Relationship between Height and Weight: Estimates for a (constant) and b (b coefficient)

Dependent variable (y): Weight (in kilograms)

                                         Estimates     Significance level (p)
                                                       (two-tailed)

Constant (a)                             -50.54        < .001
b coefficient for Height (in cm) (b)     .73           < .001

In Table 3.40, two-tailed p-values are reported for the constant (a) and the b coefficient (b) for height. The test for the intercept (H0: a = 0) is not relevant as it relates to non-existing respondents (height = 0). The p-value for the height b coefficient is used to test whether its value differs significantly from 0 in the population.34 Because p is much smaller than α we can safely reject the null hypothesis and accept the alternative hypothesis, for it is very likely that body height and body weight are positively and linearly related in the population. Note that the SPSS generated p-values in Table 3.40 are two-tailed and need to be divided by 2 because the alternative hypothesis is directional. However, in this case this does not make any difference to the hypothesis test since p is already very low.

Regression estimates a and b allow for the calculation of predicted (or estimated) weights for all heights between 150 and 205 centimeters. For example, a person measuring 177 centimeters has an estimated weight of 78.67 kilograms (-50.54 + 177 * .73). There will be few, if any, respondents who are 177 centimeters tall and weigh exactly 78.67 kilograms, because the linear regression line only reflects the overall tendency and observations will deviate from the line. Therefore, every regression equation takes the form: y = a + bx + e (where e stands for error or deviation).

In the social sciences, the goal is typically to show the overall linear tendency rather than providing exact predictions. However, if the explanatory power of a model is also important, the explained variance is used. In a previous section, we presented the variance of a variable as the surface of a square (see Figure 2.20). The important question is how much of this surface (the variance in y) can be explained using regression analysis. The explained variance of y is the surface of the square that is 'covered' (explained) by x, divided by the total surface of the square (see Figure 3.41). The outcome is always a number between 0 and 1. If the covered surface is 0, then the explained variance is 0. If the whole surface of y is covered (explained) by x, the result is 1 (or 100%). In this case, all observed scores of y are located exactly on the regression line and the linear relationship is perfect (and all e = 0), see Figure 3.41. In our example, body height explains .32 (32%) of the variance in body weight.
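Both ideas, prediction and explained variance, can be sketched in a few lines. The first part uses the estimates from Table 3.40. The second part is only an illustration on the five respondents highlighted in Figure 3.38 with the line y = -60 + .75x; it computes the explained variance as 1 minus the ratio of the squared errors to the total squared deviations of y, which is the usual definition and is assumed here. The result for these five points is not the .32 reported for the full sample.

```python
# Part 1: predicted weight for a height of 177 cm, using the estimates in Table 3.40
a, b = -50.54, 0.73
predicted = a + b * 177
print(round(predicted, 2))  # 78.67

# Part 2: explained variance for the five respondents highlighted in Figure 3.38
height = [160, 160, 180, 200, 200]   # centimeters
weight = [45, 75, 75, 105, 75]       # kilograms
errors = [w - (-60 + 0.75 * h) for h, w in zip(height, weight)]   # e = y - (a + bx)

mean_weight = sum(weight) / len(weight)
total_ss = sum((w - mean_weight) ** 2 for w in weight)   # total variation in y
error_ss = sum(e ** 2 for e in errors)                   # variation left unexplained

explained_variance = 1 - error_ss / total_ss
print(explained_variance)  # 0.5
```

In this small illustration half of the variation in weight is accounted for by height, which corresponds to the middle panel (50%) of Figure 3.41.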

Whether the explained variance is sufficiently high depends on the research question. As a general assessment of body weight based on body height, this regression model is not a bad instrument, but for medical purposes it would be inadequate. The explained variance can be increased by adding more x variables to the model (see section 3.6).

[Figure: three pairs of squares representing the variance in y and the variance in x; variance explained: 0%, 50%, and 100%]

Figure 3.41 Graphical Presentation of 0, 50, and 100% Variance Explained (dark shaded area = explained variance)

Certain assumptions about the data need to be made before performing regression analysis. We will address these in section 3.6.2, which discusses multiple regression analysis. Beforehand, we stress the importance of the assumed linear relationship between y and x in the population, and the importance of checking this in the data. This is easily checked in a line graph (see Figures 3.34 and …) and/or a histogram of the errors/residuals (the distribution should be approximately normal/bell-shaped, especially with n < 30). If the relationship is not (roughly) linear, Spearman's rank correlation or a regression analysis using modified variables is more appropriate to use (see our web page for further information). Furthermore, it should be noted that the results of regression analyses are sensitive to outliers, especially with sample sizes smaller than 200. With samples as small as this, an analysis of the errors (or residuals) is necessary (again, see our website for more information).

3.5.3 ODDS RATIO

Typically, measures of association have minimum and maximum values. For example, values for Cramer's V always fall between 0 and 1, while values for Kendall's tau, Spearman's rank correlation, and Pearson's correlation always fall between -1 and 1. As such, the strength of statistical relationships can be compared against a standard; namely, 0 (no relationship) and 1 or -1 (perfect relationship). The values taken by these measures are also determined by the distributions of the x and y variables (in a contingency table these are the counts in the row and column marginals). For example, when the y variable has a skewed distribution, many measures of association will show low values. Additionally, changes in these distributions will change the values of many measures of association. This is called marginal dependency and is often desirable. However, it can be problematic when the research question focuses on relative comparisons between categories of the x variable. For example, in social mobility research, the research questions often relate to inequality: 1) do women still lag behind men in schooling in the West? If yes, 2) has inequality declined over time? Likewise, we know that students go on to higher education (college) at lower rates compared with lower and intermediate levels of education. However, over time adolescents increasingly do choose to go on to higher education, a trend that we touched upon earlier and attributed to educational expansion. The (skewed) distribution of higher and lower/intermediate education and changes therein should not influence the answer to the two research questions above. The odds ratio is a measure of association that is insensitive to changes in distributions.

This measure is first illustrated using another example from social mobility research, dealing with the relationship between father and son job occupations (see Table 3.42).

Table 3.42 The Relationship between Father's Occupation and Son's Occupation in Percentages and Associated Odds Ratio.

                                  Occupation father
                                  Non-manager    Manager    Total

Occupation son    Non-manager     219            51         270
                                  65.2%          37.8%      57.3%

                  Manager         117            84         201
                                  34.8%          62.2%      42.7%

Total                             336            135        471
                                  100%           100%

Odds ratio = 3.07, p (one-tailed) < .001
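The odds ratio printed under Table 3.42 can be recomputed from the four cell counts. In the sketch below (variable names are ours), the odds for each group of sons are the number ending up in a non-managerial job divided by the number ending up in a managerial job, and the odds ratio is the ratio between the two odds. Working from the raw counts gives 3.08; dividing the two rounded odds gives 3.07, the value shown under the table.

```python
# Cell counts from Table 3.42
non_manager_father = {"non_manager_son": 219, "manager_son": 117}
manager_father = {"non_manager_son": 51, "manager_son": 84}

# Odds of a non-managerial job versus a managerial job, per group of sons
odds_non_manager_father = non_manager_father["non_manager_son"] / non_manager_father["manager_son"]
odds_manager_father = manager_father["non_manager_son"] / manager_father["manager_son"]

odds_ratio = odds_non_manager_father / odds_manager_father
odds_ratio_from_rounded = round(odds_non_manager_father, 2) / round(odds_manager_father, 2)

print(round(odds_non_manager_father, 2))  # 1.87
print(round(odds_manager_father, 2))      # 0.61
print(round(odds_ratio, 2))               # 3.08
print(round(odds_ratio_from_rounded, 2))  # 3.07
```

Note that only the four cell counts enter the calculation; the row and column totals play no role.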

Firstly, to determine the extent to which sons are nowadays (the data are from 2000) economically mobile compared to their fathers, odds are calculated.35 For sons whose fathers do not/did not occupy a managerial position, the chance of ending up in the same occupational category is 65.2% (calculation: 219 / 336 * 100). Conversely, those sons have a 34.8% chance of rising and getting a managerial job (see Table 3.42). The ratio between these chances is called the odds and equals 1.87 (65.2 / 34.8). This means that, for sons who have fathers who are non-managers, the chances of eventually getting a non-managerial job are 1.87 times higher than the chances of getting a managerial job. In contrast, the chance of attaining a managerial position for sons who have fathers who are managers is 62.2%, whereas the chance of ending up in a non-managerial position is 37.8%. The odds are .61 (37.8 / 62.2). Notice that the distributions of the variables (presented in the margins of the table) do not play any role when calculating the odds.

To determine the relative difference in occupational opportunities, the ratio between both odds is calculated. In this case, the outcome is 3.07 (1.87 / 0.61). This is called the odds ratio because it is the ratio between two odds.36 In our example the odds ratio indicates that the odds of sons whose fathers do not have managerial jobs (= 1.87) are about 3 times as high as the odds of sons who have fathers who work as managers (.61). The closer the odds ratio is to 1, the smaller the differences in occupational opportunities. In Western Europe, occupational mobility used to be much lower ('like father, like son') and the odds ratio was consequently higher. More generally, an odds ratio of 1 means that both odds are equal, which means that the categories of the x variable are not statistically related to the dependent variable (see Table 3.24). In occupational mobility, an odds ratio of 1 indicates that the father's occupation is not an important factor (anymore) in the relative occupational chances of his son. The occupational mobility of the son then is not limited by the father's occupation.
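The calculation above can be retraced in a few lines of Python. This is only a minimal sketch: the counts come from Table 3.42, while the dictionary layout and the function name `odds_non_manager` are illustrative choices, not something the book prescribes. Working from the raw counts gives an odds ratio of about 3.08; the 3.07 in the text results from dividing the already rounded odds (1.87 / 0.61).

```python
# Counts from Table 3.42, grouped by father's occupation (the x variable).
counts = {
    "father_non_manager": {"son_non_manager": 219, "son_manager": 117},
    "father_manager": {"son_non_manager": 51, "son_manager": 84},
}


def odds_non_manager(column):
    """Odds that the son is a non-manager rather than a manager."""
    return column["son_non_manager"] / column["son_manager"]


odds_a = odds_non_manager(counts["father_non_manager"])  # 219 / 117
odds_b = odds_non_manager(counts["father_manager"])      # 51 / 84
odds_ratio = odds_a / odds_b

print(f"odds (father non-manager): {odds_a:.2f}")
print(f"odds (father manager):     {odds_b:.2f}")
print(f"odds ratio:                {odds_ratio:.2f}")
```

Note that only the four inner cells enter the calculation; the row and column totals are never used.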

Now back to the two research questions on the educational opportunities of men and women and the longitudinal changes therein (see Table 3.43).

Table 3.43 Relationship between Sex and Educational Level in 1979 and 2000 (counts, column percentages and odds ratios)

Year of survey: 1979
                                     Sex
Educational level          Female      Male        Total
Less than college          451         422         873
                           92.4%       81.5%       86.8%
College or university      37          96          133
                           7.6%        18.5%       13.2%
Total                      488         518         1,006
                           100%        100%

Odds ratio = 2.77, p (one-tailed) < .001

Year of survey: 2000
                                     Sex
Educational level          Female      Male        Total
Less than college          378         336         714
                           74.7%       66.9%       70.8%
College or university      128         166         294
                           25.3%       33.1%       29.2%
Total                      506         502         1,008
                           100%        100%

Odds ratio = 1.46, p (one-tailed) = .003

The percentages in the grey shaded cells (the college or university row) show that inequality was present in 1979 and in 2000: relatively more men than women went to college or university. During this period the chances of attaining the highest educational levels increased: in 1979, 13.2% of all students were enrolled in higher education and in 2000 this rose to 29.2%. However, did enrollment increase more among women than it did for men? The answer to this question lies in the odds ratios. In 1979, the odds ratio equaled 2.77 but dropped to 1.46 in 2000. This decline is not expressed as an absolute difference but as a ratio, and is calculated as 1.46 / 2.77 = .52. This number indicates that the relative educational inequality between men and women roughly halved between 1979 and 2000. So, relative inequalities did decrease in that period, although a statistical test is required for a more definitive answer (see further below).
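As a rough check on these figures, the two odds ratios and their ratio can be recomputed from the counts in Table 3.43. The helper `odds_ratio` below is a hypothetical name used for illustration; it applies the cross-product form (a·d) / (b·c), which is algebraically the same as dividing the two odds. From the unrounded counts the ratio of the two odds ratios comes out at roughly 0.53, which agrees with the reported .52 up to rounding.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (a/c) / (b/d) for a 2x2 table laid out as [[a, b], [c, d]]:
    rows are the y categories, columns are the x categories."""
    return (a * d) / (b * c)


# Table 3.43: rows = less than college / college or university,
# columns = female / male.
or_1979 = odds_ratio(451, 422, 37, 96)
or_2000 = odds_ratio(378, 336, 128, 166)

# Relative change in inequality: a value below 1 means inequality declined.
change = or_2000 / or_1979

print(f"odds ratio 1979: {or_1979:.2f}")
print(f"odds ratio 2000: {or_2000:.2f}")
print(f"ratio 2000/1979: {change:.2f}")
```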

The advantage of the odds ratio is that relative differences are expressed independently of the marginals. A disadvantage is that no maximum value exists and that the scale is asymmetric: when the relationship is negative, the odds ratio lies between 0 and 1, and if the association is positive, the odds ratio lies between 1 and positive infinity. As a result, the odds ratio indicates the direction of the relationship, but not the strength of the relationship against a fixed standard. Furthermore, the odds ratio is always calculated based on the counts in four inner cells of a contingency table. In a table with two rows and two columns, only one odds ratio can be calculated. However, in larger tables more odds ratios can be calculated, which can be troublesome when no clear distinction can be made between more and less relevant odds ratios.

However, the odds ratio is one of the few measures of association that is insensitive to the marginal distributions, which makes it highly suitable to describe shifts in inequality, for example. Also, in the medical sciences the odds ratio is often used in epidemiological research involving highly skewed variables such as mortality rates.
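The insensitivity to the marginals can be made concrete with a small sketch. It takes the 2000 panel of Table 3.43 and compares it with a hypothetical scenario, invented here purely for illustration, in which the college row is ten times as large while the female/male odds within each row stay the same. The function names are assumptions of this sketch. The phi coefficient, which in a 2x2 table equals Cramér's V apart from its sign, changes with the marginals; the odds ratio does not.

```python
import math


def odds_ratio(table):
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def phi(table):
    """Phi coefficient of a 2x2 table (equals Cramer's V up to sign)."""
    (a, b), (c, d) = table
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


# Table 3.43, survey year 2000: rows = less than college / college,
# columns = female / male.
observed = [[378, 336], [128, 166]]

# HYPOTHETICAL scenario (not from the book): the college row multiplied
# by ten, which changes the row marginals but not the odds within rows.
expanded = [[378, 336], [1280, 1660]]

for name, table in (("observed", observed), ("expanded", expanded)):
    print(f"{name}: odds ratio = {odds_ratio(table):.2f}, phi = {phi(table):.3f}")
```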

To test the null hypothesis that the odds ratio equals 1 (in the population), a chi-square test can be used, or the exact test if conditions are not met (see section 3.4.2). The p-values shown in Tables 3.42 and 3.43 were determined using a chi-square test (one-tailed, because the alternative hypothesis is directional, stating the odds ratio to be larger than 1). Note that in all three cases, using α = .05, the null hypothesis is rejected.
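The sketch below shows one way such a test could be reproduced with only the Python standard library. It assumes a Pearson chi-square without continuity correction and obtains the one-tailed p-value by halving the two-tailed value, which is only appropriate when the observed odds ratio lies in the hypothesized direction; whether the book's software used exactly these settings is an assumption. With 1 degree of freedom the chi-square statistic is the square of a standard normal variable, so its upper tail can be written with `math.erfc`.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def one_tailed_p(chi2):
    """Upper-tail probability of a 1-df chi-square, halved for a directional test.

    P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2)) / 2


tables = {
    "Table 3.42": (219, 51, 117, 84),
    "Table 3.43 (1979)": (451, 422, 37, 96),
    "Table 3.43 (2000)": (378, 336, 128, 166),
}

p_values = {name: one_tailed_p(chi_square_2x2(*cells)) for name, cells in tables.items()}

for name, p in p_values.items():
    print(f"{name}: one-tailed p = {p:.4f}")
```

Under these assumptions the first two p-values fall below .001 and the third is close to .003, in line with the values printed under the tables.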

Odds ratios are often calculated and tested using logistic regression analysis. This technique can test whether the decline in relative educational opportunities from Table 3.43 (.52) differs significantly from 1 (1 = no change). However, it is beyond the scope of this book to discuss this type of analysis further. A practical explanation of logistic regression analysis and a selection of analyses can be found on our website.

3.6 MULTIVARIATE ANALYSIS

In the previous sections we discussed several bivariate relationships between variables. These relationships were mostly assumed to be causal. For example, it was assumed that a higher level of education (cause) results in a higher income (effect). This assumption seems realistic for three reasons. First, there is a clear chronological order; generally, a person finishes his or her education before starting to earn a regular income. Second, a relationship between education and income was indeed confirmed in statistical analysis (see the corresponding table earlier in this chapter). Third, it seems unlikely that the relationship between education and income is spurious (i.e., not real) in the sense that one or more other variables causally determine both educational level and income. Thus, stating that a higher education causes a higher income seems to be justified.

However, in the social sciences causality is not always that clear. Using standard cross-sectional survey data, it is often difficult to ascertain a clear chronological order between two variables.37 The next best thing is to ground the chronological order as firmly as possible in theoretical arguments, although an empirical test of these arguments generally is not possible. Fortunately, it is less problematic to statistically demonstrate whether or not an empirical relationship between two variables exists. Furthermore, multivariate analysis can be utilized to establish whether or not an initial (significant) bivariate relationship is the work of some other (confounding) variable(s).38

In multivariate analyses, the initial bivariate relationship can be tested to see if it still exists after control variables are taken into account. Controlling for other variables is a common and fruitful procedure in the social sciences. However, this technique is somewhat difficult to explain, which is probably why it is typically not reported in newspapers, let alone discussed on television. An interesting exception to this was recent criticism and skepticism towards research on criminality amongst asylum seekers in the Netherlands. The bivariate relationship seemed clear: asylum seekers in the Dutch province of Groningen committed five times the number of crimes compared to the local population. However, a critical follow-up study demonstrated that 'apples' had been compared to 'oranges': there were important demographic (e.g., age) and socioeconomic differences (especially income and education). From statistical and ethical points of view, it would have been better and fairer, respectively, to compare asylum seekers with people from Groningen that did not differ from asylum seekers in these important characteristics.

It is difficult, however, (if not impossible) to find a randomly selected group of native people from Groningen who are identical in important aspects to a group of asylum seekers. Therefore, with the exception of experimental research, this is not a useful strategy.

Fortunately, important differences can be ruled out by statistically controlling for relevant variables. This method will be illustrated using a study in which a respondent's income is related to their father's educational level (see Table 3.44). To avoid unnecessary complexity, only one single control variable will be used (in the literature, control variables are often indicated using the letter z (or sometimes t)).


Table 3.44 Relationship between Father's Educational Level and Respondent's Income

                              Educational level (father)
Income (respondent)      Low         High        Total
€2,000 maximum           139         46          185
                         46.6%       35.4%       43.2%
more than €2,000         159         84          243
                         53.4%       64.6%       56.8%
Total                    298         130         428
                         100%        100%

Kendall's tau b = .11, p (one-tailed) = .01

Table 3.44 shows a weak but positive relationship between the educational level of fathers and the income of their sons (the respondents). Of all respondents who have fathers with low levels of education, 53.4 percent earn over 2,000 euros. In contrast, 64.6 percent of all respondents with highly educated fathers earn over 2,000 euros (the difference in percentages (d%) is 11.2). The positive relationship is also reflected by the significant Kendall tau-b, amounting to .11. Again, this implies that respondents with highly educated fathers are more likely to earn an income over 2,000 euros compared to respondents with lower educated fathers.

However, it is debatable whether this relationship is causal: do people really make more money because their fathers are highly educated? Probably not! Income is primarily determined by one's own education, as employers will inquire about the educational level of the applicant, and not about that of his or her father. In Table 3.44, the observed relationship is probably due to the fact that father's educational level is positively associated with both educational level and income of his child(ren).

Now, let us assume that income is really determined by one's own educational achievements and not by his or her father's educational level. When this is actually true, there cannot be any relationship between the father's educational level and his son's income among sons who all share the same educational level. These respondents have performed similarly and are rewarded with a certain income, irrespective (or independent) of their fathers' achievements. This idea is tested in Table 3.45 for respondents who all have a low educational level (upper table) and respondents who all have a high educational level (lower table). The differences in percentages (d%) in both tables are almost 0 and no longer significant. This means that when taking into account respondent's educational level, there is insufficient evidence that a statistical relationship exists between father's educational level and the income of his offspring, thereby confirming the idea that the relationship between father's educational level and respondent's income is not causal.

Of course, this does not mean that the educational level of the father (or, more generally, the parents) does not play a role at all. Even today it is easier to obtain a higher educational level when your parents are also highly educated. Therefore, it is often said that only an indirect causal relationship exists between the educational level of the parents and the income of their child(ren).

Table 3.45 The Relationship between Father's Education and Respondent's Income, Controlling for Respondent's Education

Educational level (respondent): low
                              Educational level (father)
Income (respondent)      Low         High        Total
€2,000 maximum           122         28          150
                         55.2%       53.8%       54.9%
more than €2,000         99          24          123
                         44.8%       46.2%       45.1%
Total                    221         52          273
                         100%        100%

Kendall's tau b = .01, p (one-tailed) = .43

Educational level (respondent): high
                              Educational level (father)
Income (respondent)      Low         High        Total
€2,000 maximum           17          18          35
                         22.1%       23.1%       22.6%
more than €2,000         60          60          120
                         77.9%       76.9%       77.4%
Total                    77          78          155
                         100%        100%

Kendall's tau b = -.01, p (one-tailed) = .44
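The logic of controlling for a third variable can be sketched by computing the difference in percentages (d%) once for the full sample and once within each level of respondent's education. The counts are taken from Tables 3.44 and 3.45; the cell for highly educated fathers earning more than €2,000 in the upper panel is taken as 24, the value implied by the column total of 52 and the row total of 123. Function and variable names are illustrative assumptions.

```python
def pct_high_income(max_2000, over_2000):
    """Percentage earning more than 2,000 euros within one column."""
    return 100 * over_2000 / (max_2000 + over_2000)


def d_percent(table):
    """d%: high-income percentage for highly educated fathers minus
    the same percentage for low educated fathers.

    `table` maps father's education to (count <= 2,000, count > 2,000).
    """
    return pct_high_income(*table["high"]) - pct_high_income(*table["low"])


bivariate = {"low": (139, 159), "high": (46, 84)}       # Table 3.44
respondent_low = {"low": (122, 99), "high": (28, 24)}   # Table 3.45, upper panel
respondent_high = {"low": (17, 60), "high": (18, 60)}   # Table 3.45, lower panel

print(f"d% without control:           {d_percent(bivariate):.1f}")
print(f"d% respondent education low:  {d_percent(respondent_low):.1f}")
print(f"d% respondent education high: {d_percent(respondent_high):.1f}")
```

The bivariate d% of roughly 11 percentage points shrinks to about 1 point or less in absolute value within each panel, which is the pattern described in the text. Computed from the raw counts the bivariate figure is about 11.3; the 11.2 in the text is the difference between the rounded percentages.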

3.6.1 FIVE DIFFERENT CAUSAL MULTIVARIATE MODELS

In multivariate analysis, different causal models can be assumed and analyzed. Five important models will be discussed in this section, which are related to:
• Mediation
• Spuriousness
• Partial mediation / partial spuriousness
• Suppression
• Moderation or interaction

Two predictor (x) variables will be used to illustrate these models for ease of interpretation, but the same applies for models with three or more predictor variables.

Mediation

Generally, in a multivariate model one or more control variables (notation: z) are taken into account. In a mediation model it is assumed that a change in the x variable causes a change in the z variable (called the mediator), while z in turn causes a change in the y variable. Furthermore, it is assumed that the original (bivariate) relationship between x and y is reduced to a relationship (notation: xy.z) that does not significantly differ from zero. Therefore, a mediation model is also referred to as a chain model. In fact, in section 3.6 a mediation model was used when we discussed the relationship between a father's education and the income of his offspring. It turned out that the direct causal influence between these two variables was absent after taking into account respondents' own educational level (see Table 3.45). Furthermore, it is quite plausible that the educational level of the father (partly) determines the educational level of his children, and that the educational level of the children consequently (partly) determines their income. If this is true, we know how father's educational level positively influences his child's income: higher educated fathers on average have their children reach higher educational levels, and higher educational levels result in higher incomes. This implies that, for a respondent's income, it is irrelevant how educated his or her father (or parents) is. What does matter is the educational level of the respondent. Of course, despite various governmental interventions, obtaining a higher level of education is still easier with highly educated parents. Generally, a mediation model serves as an interpretation or explanation for a relationship between two variables which initially seem to be directly causally related.

x → y (relationship xy ≠ 0), x → z → y (relationship xy.z = ns)
(xy.z = relationship xy while taking into account z, ns = nonsignificant)

Example: Parent's education → Child's education → Child's income

Figure 3.46 Mediation: Theoretical Model and Empirical Example


Spuriousness

A statistical relationship between an x and a y variable may not be directly or indirectly causal at all. To check whether this is the case: a) it must be plausible that the z variable(s) is (are) the causal factor for both x and y (see Figure 3.47); and b) the relationship should become insignificant or should change direction when controlling for the z variable(s). For example, a bivariate positive relationship exists between the variables church attendance and body weight: on average, church attendees weigh more than non-attendees. However, it is hard to imagine that church attendance really makes people gain weight. It is more plausible that this relationship is spurious and a third overlooked or omitted variable determines both church attendance and weight. This 'lurking' or 'confounding' variable is age: older people attend church more frequently than younger people and older people have typically put on some weight during their life course. If this is correct, the original relationship between church attendance and weight will become non-significant after controlling for age. This is indeed the case and age is said to 'explain away' the puzzling relationship between church attendance and weight.

An example where a third variable reverses the original relationship (also known as Simpson's paradox) is found in medical sciences. In hospitals there is a positive relationship between the level of expertise and mortality rates. Fortunately, this alarming relationship is spurious as seriously ill patients typically receive highly professional help but also have lower chances of survival compared to those not seriously ill. The seriousness of a patient's illness must be taken into account (or controlled for) when fairly comparing hospitals. After controlling for the seriousness of the illness the original relationship is reversed: higher levels of expertise are associated with lower mortality rates. This makes sense: when patients receive higher standards of professional care, their chances of survival are higher compared to patients receiving lower levels of care. This is especially so when patients are seriously ill; an addition that relates to moderation or interaction and is discussed further under Moderation or Interaction.

x → y (relationship xy ≠ 0), x ← z → y (relationship xy.z = ns)
(xy.z = relationship xy while taking into account z, ns = nonsignificant)

Example: Church attendance ← Age → Weight

Figure 3.47 Spuriousness: Theoretical Model and Empirical Example
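The reversal in the hospital example can be illustrated with a toy calculation. The counts below are entirely hypothetical: they are invented for this sketch and do not come from the book or from any real study. They are chosen so that high-expertise care mostly treats seriously ill patients, which is the mechanism the text describes.

```python
# HYPOTHETICAL counts, invented purely for illustration:
# (number of patients, number of deaths) by severity and level of expertise.
patients = {
    "seriously ill":     {"high expertise": (800, 160), "low expertise": (200, 60)},
    "not seriously ill": {"high expertise": (200, 4),   "low expertise": (800, 24)},
}


def mortality(n, deaths):
    """Mortality rate in percent."""
    return 100 * deaths / n


def crude_mortality(level):
    """Mortality for one expertise level, ignoring severity (bivariate view)."""
    n = sum(group[level][0] for group in patients.values())
    deaths = sum(group[level][1] for group in patients.values())
    return mortality(n, deaths)


for level in ("high expertise", "low expertise"):
    print(f"all patients, {level}: {crude_mortality(level):.1f}%")

for severity, group in patients.items():
    for level, (n, deaths) in group.items():
        print(f"{severity}, {level}: {mortality(n, deaths):.1f}%")
```

In this made-up example the crude mortality rate is higher under high expertise, yet within each severity group it is lower, so the sign of the relationship flips once severity is controlled for.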

Partial Mediation / Partial Spuriousness

Rarely does full mediation or full spuriousness occur in the social sciences. After controlling for one or more z variables, the original relationship is often reduced but remains significant. Therefore, mediation or spuriousness is only partial. Depending on the assumed causal direction between x and z, the causal model is either partial mediation (when x is the causal factor for z (x → z)) or partial spuriousness (when z → x). For example, the relationship between a father's education and his child's education is a partial mediation (or chain) model; even when controlling for father's income, part of the original positive (+) relationship remains (see Figure 3.48, upper panel). The relationship between educational level and traditional attitudes (conservatism) is one of partial spuriousness. After controlling for birth cohort, a partial relationship between educational level and conservatism remains (see Figure 3.48, lower panel).

Partial Mediation

x → y (relationship xy ≠ 0), x → z → y with a remaining direct path x → y (relationship xy.z < xy)
(xy.z = relationship xy while taking into account z)

Example: Father's education → (+) Father's income → (+) Child's education, with a remaining direct path Father's education → (+) Child's education

Partial Spuriousness

x → y (relationship xy ≠ 0), x ← z → y with a remaining direct path x → y (relationship xy.z < xy)

Example: Educational level ← Cohort → Conservatism, with a remaining direct path Educational level → Conservatism

Figure 3.48 Partial Mediation / Partial Spuriousness: Theoretical Models and Examples

Suppression

Suppression occurs when the strength of the relationship between x and y increases after one or more z variables are included (see Figure 3.49). An instructive example concerns the relationship between the variables age (x) and body weight (y). On average, as people grow older they put on weight. Yet, this positive bivariate relationship between age and weight is surprisingly weak and seems to contradict everyday observations that people over fifty are not as slim as they were in their twenties. The explanation for this weak relationship lies in the assumption that all respondents are equal in all other important aspects. However, one overlooked aspect is the variable body height. In Western countries, younger people are generally taller than older people. This may be due to changes in diet, living standards, and medical care during childhood. Because these changes take place over time, different birth cohorts witnessed different circumstances (i.e., younger cohorts grew up during conditions that favored growth). This interesting fact is the subject of much social science research and is called a cohort effect. So, in a bivariate analysis of the relationship between age and body weight, people who are young and tall are erroneously compared to those who are older and shorter. Due to their height, taller people are heavier than shorter people. As a consequence, the fact that young people typically weigh less than older people (an aging effect) is obscured or suppressed to a certain degree by these height differences (cohort effect). It is therefore more appropriate to compare younger and older people who are equal in height. Statistically, this comparison is possible by taking into account the (suppressor) variable body height (see Figure 3.49). After controlling for body height, the positive relationship between age and body weight is stronger.39

x → y (relationship xy ≈ 0 or ns), x → z → y with a direct path x → y (relationship xy.z = - or +)
(xy.z = relationship xy while taking into account z, ns = nonsignificant)

Example: Age → Body Height → (+) Body Weight, with a direct path Age → (+) Body Weight

Figure 3.49 Suppression: Theoretical Model and Empirical Example

Moderation or Interaction

In the four causal models discussed, a causal relationship or causal effect is assumed to be equally strong for all units. For example, in the discussion of the mediation model, we assumed that the positive effect of one's education on income is equally strong for all respondents. However, studies suggest that this effect is stronger for men than for women. Also, the negative effect of expertise within a hospital on its mortality rates (see spuriousness) probably holds mainly for the seriously ill and matters less for those not seriously ill. If a relationship differs across specific groups or categories it is called moderation or interaction. Interactions are relatively weak when the strength of the relationship is not equal across groups, but remains in the same direction. In a stronger variant of interaction, the relationship is absent or non-significant for certain groups or categories. The strongest interaction occurs when the relationship is positive for some groups but negative for others. If the strength of a relationship goes to zero or even changes direction for some groups/categories then serious questions can be raised about the causal order between x and y. There may be a good theoretical explanation as to why a causal relationship exists between x and y for some groups/categories but not for others. It is, however, much more difficult to explain why the causal effect is positive for some groups and negative for others.

It is important not to confuse 'interaction effects' with 'controlling for a variable'. In the case of interactions, the assumed causal effect of x on y varies across values of z (where z indicates different groups, categories, or conditions). When a variable is controlled for it is assumed that the (remaining) causal effect of x on y is approximately the same for different values of the z variable(s). In other words, it is assumed that no moderation or interaction is present. Also, an interaction is presented differently in a graph, because the z variable now conditions the causal relationship between x and y (see Figure 3.50).

x → y (relationship xy = +, - or 0), with z acting on the arrow from x to y (the relationship xy differs across values of z)

Figure 3.50 Moderation/Interaction: Theoretical Model

3.6.2 MULTIPLE LINEAR REGRESSION ANALYSIS

In theory, the models discussed in the previous section could be tested using contingency tables (see Table 3.45), but in practice only three-way tables (i.e., a table with an x and y variable plus one z variable) can be analyzed this way. However, even three-way tables have practical limits that are quickly reached with interval/ratio variables because they normally contain a large number of categories. Therefore, multiple linear regression analysis is often used in social science research. This technique takes into account multiple independent (x) variables at interval and ratio levels. Nominal and ordinal variables can also be used after they have been transformed into dichotomous variables (interval variables by definition). Additionally, interaction models and non-linear relationships can also be analyzed.40 To illustrate the versatility of multiple regression analysis, an example follows in which an interval scale indicating religious beliefs is explained with a number of x variables.

Modeling Interval and Ratio Predictor Variables

To explain traditional religious beliefs (which is a sum of five variables each measuring an aspect of traditional religious beliefs), two ratio variables are relevant: education and age (both measured in years). The two alternative hypotheses state that the more years of education and the younger the respondent, the weaker religious beliefs will be. These hypotheses are confirmed after calculating Pearson's correlation coefficients between education and religious beliefs, and between age and religious beliefs (r equals -.15 and .22 respectively, see Table 3.51). However, this confirmation is at the bivariate level, so the question remains as to whether both indicators persist in a multivariate model. More precisely, we will test a model in which partial mediation is suspected (z = education):

Age → Education → Religious Beliefs

In addition, the effect of education on religious beliefs could be (partially) spurious, because age determines both education and religious beliefs. So, we simultaneously test a second model which is partially spurious (z = age):

Education ← Age → Religious Beliefs, with a direct path Education → Religious Beliefs

The outcomes are shown in Table 3.51:

Table 3.51 Results from Multiple Regression Analysis, y = Religious Beliefs, x1 = Education (in years) and x2 = Age (in years)

Religious beliefs (y)   Pearson's      b coefficient   standard   beta    p
                        coefficient                    error              (two-tailed)
Constant                               -.02            .16                .896
Education (x1)          -.15           -.06            .01        -.11    <.001
Age (x2)                .22            .02             .002       .20     <.001

The intercept in this model equals -.02. This, however, has no meaning, because it represents the average religious beliefs for people aged 0 and with 0 years of education: values that are not present in the data set! The b coefficient for education is -.06: an increase of 1 year is associated with a .06 decrease of religious beliefs. In addition, religious beliefs increase by .02 for every additional year of age. To determine the extent that this age effect is explained by education, the beta coefficient (β) is relevant. This coefficient can be interpreted the same way as Pearson's correlation (r). Here, β indicates the change in religious beliefs in standard deviations when the score on age/education shifts 1 standard deviation. The difference between r and β is that the latter expresses the relationship (or effect) after considering one or more control variables. A comparison of r with β shows that the effect of education decreased most (from -.15 to -.11).⁴¹ So, roughly one third of the original relationship between education and religious beliefs (= -.15, see Table 3.51) is explained by age. In other words, a partial spurious relationship exists between education and religious beliefs. This is because, on average, older respondents have fewer years of schooling compared to younger respondents (Pearson's correlation between age and education = -.19). This correlation is not the result of an aging process, but reflects cohort differences (older respondents are from older birth cohorts who witnessed fewer educational opportunities).
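The step from the bivariate correlations to the beta coefficients can be checked by hand. For a model with exactly two predictors, the standardized coefficients follow from the three pairwise correlations mentioned in the text. The sketch below is a minimal illustration, assuming the rounded values reported here (-.15, .22 and -.19) are exact; the function name is hypothetical and not taken from any statistical package.

```python
def betas_two_predictors(r_y1, r_y2, r_12):
    """Standardized regression coefficients for y on x1 and x2,
    computed from the three pairwise Pearson correlations."""
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Rounded correlations taken from the text (assumed exact here)
r_beliefs_education = -0.15
r_beliefs_age = 0.22
r_education_age = -0.19

beta_education, beta_age = betas_two_predictors(
    r_beliefs_education, r_beliefs_age, r_education_age
)
print(round(beta_education, 2), round(beta_age, 2))
```

After rounding, these values agree with the betas of -.11 and .20 in Table 3.51, which shows how the negative correlation between age and education pulls the education effect toward zero.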

The beta coefficient is measured in standard deviations because it is the result of a z-transformation (see section 2.3.3), meaning that beta can be used to measure the relative strength of the effects in the model. Because of this, it can be stated that the effect of age is about twice as strong as the effect of education (.20 / -.11). Note that the strength of the education effect, expressed as the b coefficient, is about 3 times larger than the effect of age (-.06 / .02). However, this is dependent on the measurement units of the variables education and age. When education is measured in months (instead of years), the b coefficient becomes -.005 (-.06 / 12).
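The arithmetic behind these comparisons is small enough to verify directly. The following is a minimal sketch using the coefficients from Table 3.51; the variable names are illustrative only.

```python
# Coefficients from Table 3.51
b_education_years = -0.06
b_age_years = 0.02
beta_education = -0.11
beta_age = 0.20

# Relative strength judged on standardized coefficients (unit-free)
ratio_beta = abs(beta_age / beta_education)

# Relative strength judged on b coefficients (depends on measurement units)
ratio_b = abs(b_education_years / b_age_years)

# Rescaling education from years to months divides its b coefficient by 12
b_education_months = b_education_years / 12

print(round(ratio_beta, 2), round(ratio_b, 2), round(b_education_months, 3))
```

The two ratios point in opposite directions, which is why only the beta coefficients are suitable for comparing the strength of predictors measured in different units.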

Furthermore, Table 3.51 shows that the resulting b and beta coefficients differ significantly from zero, at the .05 significance level. Although the hypotheses associated with both variables are directional, we can do without dividing the reported p-values by two, because these values are already very small. In other words, we can reject H0 with regard to education and age as it is highly unlikely that the b coefficients (and consequently the beta coefficients) equal 0 in the population.

Modeling Ordinal and Nominal Predictor Variables

All predictors (x variables) in a regression analysis must be measured at least at the interval level. However, it is possible to include nominal and ordinal variables. As an illustration, we discuss a bivariate regression analysis with religious beliefs as the dependent variable and sex as the independent variable. Sex is a dichotomous variable, which has interval characteristics (see section 1.2). Males are coded 0 and females have code 1. The average score for religious beliefs is -.08 for men and .10 for women. So, on average, women have slightly stronger religious beliefs. It can be demonstrated that the regression line runs between these two averages. As was mentioned earlier, the b coefficient (b) equals the change in y associated with a 1-unit change in x. Following this logic, b indicates the change in religious beliefs when a male respondent (0) is compared to a female respondent (1). Therefore, b equals (.10 - -.08) / 1 = .18. Thus, the b coefficient equals the mean difference between men and women. Recall that an intercept (a) is the mean predicted score when all x variables = 0. In this case, the intercept equals the mean for men (-.08) as they score 0 on sex. The meaning of a and b is summarized in Figure 3.52.

[Figure: regression line rising from the mean for males (x = 0, y = -.08 = a) to the mean for females (x = 1, y = .10); (.10 - -.08) / 1 = .18 (= b); coding: 0 = male, 1 = female]

Figure 3.52 Meaning of b coefficient in dichotomous (dummy) variables

Table 3.53 shows results from a regression analysis in which religious beliefs is the dependent variable and sex is the predictor (0 = males, mean score on y = -.08 and 1 = females, mean score on y = .10). The intercept (a) and the b coefficient (b) equal the values in Figure 3.52.

Table 3.53 Results from Simple Regression Analysis, y = Religious Beliefs, x = Sex (0 = Male, 1 = Female)

Religious beliefs (y)   b coefficient   standard error   beta   p (two-tailed)
Constant (a)            -.08            .03                     .022
Sex (b)                 .18             .05              .09    <.001
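The claim that the intercept equals the mean for men and that b equals the mean difference can be illustrated with a few lines of code. This is a minimal sketch on six hypothetical scores, invented so that the group means match the -.08 and .10 quoted in the text; `simple_ols` is an illustrative helper, not a library function.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical scores, chosen so the group means match the text
sex = [0, 0, 0, 1, 1, 1]                          # 0 = male, 1 = female
beliefs = [-0.30, 0.10, -0.04, 0.00, 0.20, 0.10]

a, b = simple_ols(sex, beliefs)
mean_men = sum(y for s, y in zip(sex, beliefs) if s == 0) / 3
mean_women = sum(y for s, y in zip(sex, beliefs) if s == 1) / 3

print(round(a, 2), round(b, 2))                   # intercept and slope
print(round(mean_men, 2), round(mean_women - mean_men, 2))
```

Both printed lines show the same pair of numbers: with a 0/1 predictor, least squares simply reproduces the mean of the group coded 0 and the difference between the two group means.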

When additional predictors are included in the regression model the interpretation of the intercept will change. It then refers to the mean predicted score when all predictors equal 0. The interpretation of the b coefficient for sex still indicates the mean difference between men and women, but this now takes one or more control variables into account. For instance, the variable education could be added to the model because it may explain why men and women differ in religious beliefs. On average, women obtain slightly lower educational levels than men (see Table 3.43) and a lower education is associated with stronger religious beliefs (see Table 3.51). This results in the following mediation model: Sex → Education → Religious Beliefs.

Table 3.54 Results from Multiple Regression, y = Religious Beliefs, x1 = Sex (0 = Male, 1 = Female) and x2 = Education (in years)

Religious beliefs (y)   b coefficient   standard error   beta    p (two-tailed)
Constant (a)            .66             .13                      <.001
Sex                     .15             .05              .08     .002
Education (in years)    -.07            .01              -.14    <.001

Table 3.54 shows that the b coefficient (and beta) for sex hardly decreases compared to those in Table 3.53. So, even after controlling for educational differences between men and women, women on average still have significantly stronger religious beliefs compared to men. Therefore, it is unlikely that women have stronger religious beliefs because they have lower levels of education than men on average. So, we still do not know why women have stronger religious beliefs compared to men. One could add other x variables to try to explain this remarkable difference.

A valid objection against the use of the variable education in years is that it measures something different than the highest completed level of education. For example, in the Netherlands, people with Secondary Vocational School and people with A levels typically went to school for the same number of years. Still, it is theoretically plausible that the latter have weaker religious beliefs compared to the former. Therefore, education measured in years may be a poor instrument. On the other hand, the variable educational level is ordinal. When we treat this variable as an interval variable, we must assume that the exact differences (or intervals) between subsequent categories (i.e., levels) are the same and known. In the data set each educational level is coded 1 point higher than the preceding (lower) level. This implies that educational level increases at a constant factor (1) in the order: Elementary School (1), Lower Vocational School (2), Lower Secondary School (3), Secondary Vocational School (4), O Levels (5) and A Levels and higher (6). The order is indisputable (every subsequent level is higher) but the constant factor (1) is not.


To add the variable educational level to the model without questionable assumptions, all six educational levels have to be treated as separate variables. These variables are dichotomous with scores 0 and 1 and are called dummy variables (dummy in the meaning of 'substitute'). The dummy variable elementary school includes all individuals with elementary school as their highest level of completed education (they are coded 1). Respondents who completed higher educational levels score 0. Dummy variables are created for all other (5) educational levels in the same way.

Intuitively, one might expect that all six dummy variables should be added to the regression model. However, for mathematical reasons only 5 dummy variables can be used. To understand why, we will return to the variable sex once more. Instead of using this dichotomous variable, we could use two dummy variables: male (0 = Female, 1 = Male) and female (0 = Male, 1 = Female). However, because they are exactly opposite to each other, Pearson's correlation coefficient is exactly -1. Without additional measures, it is not possible to add variables that correlate -1 (or +1) to a regression model. Fortunately, this is not necessary here because the b coefficient for sex indicates the mean difference between men and women (.18, see Table 3.53). This of course is also the difference between women and men: we just have to add a minus sign to the b coefficient. Hence, either the dummy variable Male or the variable Female can be included, but not both because they are perfectly correlated.
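The perfect negative correlation between the two dummy variables is easy to confirm. The sketch below is a minimal illustration on five hypothetical respondents; `pearson_r` is a hand-written helper, not a library call.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


female = [0, 1, 1, 0, 1]          # hypothetical respondents
male = [1 - f for f in female]    # exact opposite coding

r = pearson_r(male, female)
# For every respondent the two dummies add up to 1, i.e. to the constant
sums_to_one = all(m + f == 1 for m, f in zip(male, female))
print(round(r, 6), sums_to_one)
```

Because the two dummies always add up to the constant term, the second one carries no information that the model does not already have, which is why only one of them can be estimated.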

Now back to our six educational levels. Each dummy variable is perfectly correlated to the combination of the five other dummy variables.⁴² This poses no problem: five dummy variables are added and consequently the five resulting b coefficients represent mean differences from the sixth (excluded) dummy variable. This also holds when other predictor variables are added to the model; only the mean differences are now controlled for other variables. The omitted dummy variable is called the reference category. Generally, a reference category is chosen that corresponds to the direction in the alternative hypothesis. In this case, elementary school is a good reference because we theoretically expect religious beliefs to get weaker as educational levels rise.
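Dummy coding with a reference category can be sketched as follows. The level names follow the text, but the respondents and their scores are hypothetical, and the helper functions are illustrative rather than taken from any package. The sketch relies on the fact that, in a model containing only these dummies, each b coefficient equals the difference between a group mean and the mean of the reference group.

```python
LEVELS = [
    "elementary", "lower vocational", "lower secondary",
    "secondary vocational", "O levels", "A levels and higher",
]
REFERENCE = "elementary"


def dummy_code(level):
    """One 0/1 indicator per level, leaving out the reference category."""
    return {name: int(level == name) for name in LEVELS if name != REFERENCE}


# Hypothetical respondents: (highest completed level, religious beliefs score)
respondents = [
    ("elementary", 0.4), ("elementary", 0.2),
    ("lower vocational", 0.1), ("lower vocational", 0.3),
    ("O levels", -0.5), ("O levels", -0.3),
]


def group_mean(level):
    scores = [y for lvl, y in respondents if lvl == level]
    return sum(scores) / len(scores)


# Mean differences from the reference group; a dummies-only regression
# would return these values as its b coefficients.
differences = {
    lvl: group_mean(lvl) - group_mean(REFERENCE)
    for lvl in ("lower vocational", "O levels")
}
print(dummy_code("O levels"))
print({k: round(v, 2) for k, v in differences.items()})
```

A respondent in the reference category scores 0 on all five dummies, so the intercept of such a model is the mean of the reference group.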

Table 3.55 shows that when educational levels are included as dummy variables (instead of years of education), women are still more religious than men. The difference (.16) is comparable to the difference in Table 3.54 (.15). So, analyzing educational levels instead of years of education does not change our conclusion that men are less religious than women even when accounting for educational differences. Table 3.55 also shows that all educational levels differ significantly (α = .05) from respondents with elementary school as their highest level of completed education. This also holds for people with lower vocational education: because the alternative hypothesis is directional (the higher the educational level, the weaker religious beliefs), the associated p (.057) needs to be divided by 2. The intercept (.24) is the predicted mean on religious beliefs for males (score 0 on sex) who have elementary schooling (score 0 on all 5 dummy variables) as their highest level of education. This type of respondent is represented in the sample; therefore, the intercept and the associated test as to whether it differs from 0 can be interpreted meaningfully.

Table 3.55 Results from Multiple Regression Analysis, y = Religious Beliefs, x variables: Sex (0 = Male, 1 = Female) and 5 Educational Levels (Elementary School is Reference Category)

Religious beliefs (y)     b coefficient   standard error   beta          p (two-tailed)
Constant (a)              .24             .08                            [illegible]
Sex                       .16             .05              .08           .001
Lower vocational school   -.17            .09              [illegible]   .057
Lower secondary school    -.26            [illegible]      [illegible]   [illegible]
Secondary school          -.44            [illegible]      [illegible]   [illegible]
O levels                  -.75            [illegible]      [illegible]   [illegible]
A levels and more         -.48            [illegible]      [illegible]   [illegible]

Additionally, Table 3.55 suggests that respondents with lower secondary schooling have weaker religious beliefs than respondents with lower vocational schooling (mean difference: -.26 - -.17 = -.09). To test whether this difference is significant, the elementary school dummy variable is added to the regression model, while the lower vocational school dummy variable is removed and now serves as the reference category (see Table 3.56).

Table 3.56 Results from Multiple Regression Analysis, y = Religious Beliefs, x variables: Sex (0 = Male, 1 = Female) and 5 Educational Levels (Lower Vocational is Reference Category)

Religious beliefs (y)     b coefficient   standard error   beta    p (two-tailed)
Constant (a)              .07             .05                      [illegible]
Sex                       .16             .05              .08     .001
Elementary school         .17             .09              .05     .057
Lower secondary school    -.09            .06              -.04    .134
Secondary school          -.27            .13              -.05    .044
O levels                  -.58            .09              -.18    <.001
A levels and more         -.31            .07              -.12    .001
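Switching the reference category does not change the model, only the baseline against which the dummy coefficients are expressed. As a minimal sketch, assuming both tables were estimated on the same cases with the same predictors, the b coefficients of Table 3.56 can be derived from those of Table 3.55 by subtraction; the function name is hypothetical.

```python
# b coefficients from Table 3.55 (reference category: elementary school)
b_ref_elementary = {
    "constant": 0.24,
    "elementary school": 0.0,        # the reference scores 0 by definition
    "lower vocational school": -0.17,
    "lower secondary school": -0.26,
    "secondary school": -0.44,
    "O levels": -0.75,
    "A levels and more": -0.48,
}


def change_reference(coefs, new_reference):
    """Re-express dummy coefficients relative to another category."""
    shift = coefs[new_reference]
    shifted = {
        name: round(b - shift, 2)
        for name, b in coefs.items()
        if name not in ("constant", new_reference)
    }
    # The intercept becomes the predicted mean of the new reference group
    shifted["constant"] = round(coefs["constant"] + shift, 2)
    return shifted


b_ref_lower_vocational = change_reference(
    b_ref_elementary, "lower vocational school"
)
print(b_ref_lower_vocational)
```

The result matches the b column of Table 3.56. Standard errors and p-values cannot be obtained this way, which is why the model has to be re-estimated to test the difference between two non-reference categories.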


This additional analysis shows that the associated two-tailed p-value is .134 and the right one-tailed p is .067 for the mean difference between Lower Secondary School and Lower Vocational School. Thus, at an α of .05 there is not enough statistical evidence to accept the hypothesis that respondents with a lower vocational education have stronger religious beliefs than respondents with lower secondary education levels. Note that with a less strict but acceptable test at α = .10, the alternative hypothesis is confirmed because .067 is below .10.
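The conversion from a two-tailed to a one-tailed p-value, and the resulting decisions at both significance levels, can be written out as a short sketch. The function name and its parameter are illustrative; the rule for an effect in the unexpected direction assumes a symmetric test statistic such as t.

```python
def one_tailed_p(two_tailed_p, effect_in_expected_direction=True):
    """Convert a two-tailed p-value to a one-tailed one.

    Halving is only valid when the estimate points in the direction
    stated by the alternative hypothesis; otherwise the one-tailed
    p-value is 1 - p/2 (symmetric test statistic assumed).
    """
    half = two_tailed_p / 2
    return half if effect_in_expected_direction else 1 - half


p_one = one_tailed_p(0.134)
for alpha in (0.05, 0.10):
    # Print the level, the one-tailed p and whether H0 is rejected
    print(alpha, round(p_one, 3), p_one < alpha)
```

Here the estimate (-.09) points in the hypothesized direction, so halving applies: the null hypothesis is retained at .05 and rejected at .10.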

The beta coefficients associated with each educational level from Table 3.55 and Table 3.56 are not very informative because these are all relative to the reference category. When a different reference category is selected, the beta coefficients will change (compare Table 3.56 to Table 3.55). It is possible, however, to compute a combined beta coefficient for all dummy variables to measure the total standardized effect of educational level. This beta is known as the sheaf coefficient.

Furthermore, it is assumed that the b coefficient for sex is equal across every educational level. However, an interaction between sex and educational level may exist.

Finally, we would like to emphasize that it is generally preferable not to add large numbers of predictors to a regression model ('less is better') as it keeps the model parsimonious. It is possible to test whether the model with the five education dummy variables 'fits' the data more closely than the parsimonious or restrained model with education in years. We elaborate upon the sheaf coefficient, interaction models and tests for restrained models on our website (http://www.ru.nl/mt/statistics/home).

Linear Regression Analysis: Assumptions

There are four important assumptions associated with regression analysis, which will be discussed in the order of importance. The first assumption is that the mean of all errors (i.e., the mean of all the differences between observed and predicted y-values) is 0 for all (combinations of) x-scores in the population. This assumption is violated in the case of nonlinear relationships. This assumption can be indirectly checked by inspecting line graphs in which the error is plotted against the x variable(s). For example, the nonlinear relationship in Figure 3.34 has a mean error that is mostly positive for young people but turns negative for those between 55 and 70 years old. This nonlinearity can be turned into a linear relationship with a transformation of the variable age (see our website for details).
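The check described here, comparing the mean error across ranges of x, can be sketched without any plotting. The data below are hypothetical and deliberately U-shaped, so that a straight line fits poorly; the helper names are illustrative.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical curvilinear data: y is U-shaped in x
x = list(range(1, 10))
y = [(xi - 5) ** 2 for xi in x]

a, b = simple_ols(x, y)
errors = [yi - (a + b * xi) for xi, yi in zip(x, y)]


def mean_error(lo, hi):
    """Mean error for the cases with lo <= x <= hi."""
    selected = [e for xi, e in zip(x, errors) if lo <= xi <= hi]
    return sum(selected) / len(selected)


# Under linearity each of these means would be close to 0
for lo, hi in [(1, 3), (4, 6), (7, 9)]:
    print(f"x in {lo}-{hi}: mean error = {mean_error(lo, hi):.2f}")
```

The overall mean error is zero by construction, but the means within the low, middle and high ranges of x are clearly positive, negative and positive again, which is the pattern that signals a violation of the first assumption.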

Secondly, it is assumed that all errors are independent. This means that the value of any error does not depend upon the value of any other error. When the errors are dependent on each other it may indicate nonlinearity or the absence of one or more important predictors. For example, if the variable body height is not added to a model where age is the independent variable and the dependent variable is body weight (see Figure 3.49), the weight of younger people is systematically underestimated (overall positive error) because younger people are taller on average. Older people are systematically overestimated (overall negative error) without the control variable body height because they are shorter on average. This means that the values of the errors are positively correlated.

Thirdly, all errors are assumed to be normally distributed in the population for all (combinations of) x-scores. When the distribution of errors is strongly skewed in either direction, it may indicate nonlinearity or the absence of one or more important predictors. Histograms, in which the distribution of errors is displayed, are commonly used as an indirect check for this assumption.

The fourth and final assumption relates to homoscedasticity. This means that the variance of the error is assumed to be equal for every combination of x-scores. Violation of this assumption can lead to serious problems when the variances differ [...] and the number of cases used to calculate the variance differs [...] per category of variable x (see also the assumptions [...] for some rules of thumb). In the case of serious heteroscedasticity, WLS (Weighted Least Squares) regression is [...].

Finally, it should be noted that the results of a regression analysis can be severely influenced by one or a few observations, especially in small samples (n < 200). These observations have a relatively low or high score on the x-variable(s) [...]. Influential cases can be detected by analyzing [...] (see our website for some new developments in this field).

3.7 Summary

To conclude, we present the most important statistical tools [...] in the tables below. Based on the measurement level of the variables, one or more appropriate statistical tools are suggested.

Table 3.57 Univariate tests

Measurement level   Test
dichotomous         Test for proportion
nominal             Recode as dichotomous variables → test for proportion
ordinal             Recode as dichotomous variables → test for proportion, or assume interval level instead of ordinal → t-test for mean
interval/ratio      t-test for mean


Table 3.58 Bivariate tests

Dependent variable (y)   Tests, listed in the order of the independent variable (x) columns: nominal, ordinal, interval/ratio
nominal                  χ²-test, Cramér's V; (multinomial) logistic regression analysis *
ordinal                  χ²-test, Cramér's V; Kendall's tau-b and c, Spearman's rank correlation (rs); ordinal regression analysis *
interval and ratio       Paired samples t-test, two samples t-test, analysis of variance / Bonferroni test, linear regression analysis (predictor as dummy variables); odds ratio (for dichotomous variables), Spearman's rank correlation (rs); Pearson's correlation (r), linear regression analysis

Table 3.59 Multivariate tests

Dependent variable (y)   Tests, listed in the order of the independent variable (x) columns: nominal, ordinal, interval/ratio
nominal                  χ²-test, Cramér's V; (multinomial) logistic regression analysis *
ordinal                  χ²-test, Cramér's V; Kendall's tau-b and c, Spearman's rank correlation (rs); ordinal regression analysis *
interval and ratio       Multiple linear regression analysis (nominal and ordinal predictors as dummy variables)

* see http://www.ru.nl/mt/statistics/home

Concluding remarks on Statistical Tools

We ended this book with Tables 3.57 - 3.59. In these summary tables we present common statistical tests used in the social sciences. However, the discipline of statistics is very much alive and more advanced analyses are available depending on the research question. For example, mixed model techniques that are used for units at different levels of analysis have gained much attention recently. Likewise, innovations are constantly being made in statistical packages such as SPSS and free software such as R (http://www.r-project.org). Finally, during the last 15 years practical applications increasingly have become the focus of statistics courses. We hope that Statistical Tools contributes meaningfully to this development.
