Download pdf - WISC-III Factorial Invariance Across Normal and Clinical ... · WISC-III Factorial Invariance Across Normal and Clinical Samples Chen Hsin-Yi Professor, Dept. of話pecialEducation,

• 93 •

特蛛教育研究學刊

民阱， 33 卷 1 期， 93刪 107

WISC-III Factorial Invariance Across Normal and Clinical Samples

Chen Hsin-Yi Professor, Dept. of 話pecial Education,

Nationalτ'aiwan Normal University

Hung Li-Yu Professor, Dept. of Special Education,

National Taiwan Normal University

Zhu Jianjun Director, The Psychological COrpOI滋ion，

U.S.A.

Chang Cheng-Fen Professor, Dept. of Special Education, N袋tional Taiwan Normal University

Yang Tsung-Ren A銬。ciate Professor, Dept. of Special Educ鑫tion， National Taipei University of Education

ABSTRACT

Tak:ing as its sample a normal group of 1, 100 children and a clinical group of 1,150

children in Taiw紹， the pu叩ose of this study was to test fo1 invariance in the WIS心III

factorial structure across normal and clinical samples. ResuIts of our two-stage

multi心ample ∞nfirmatory factor analysis 到擇。如d partial factor invariance. Among all

examined parameters, only a few discrepancies involving unique error variance草 and sub

tly correlated residuals were found across groups. Evidence of partial measurement in

variance indic謹.ted that the hypothesized four-factor model generally fit the real data

fairly welI. It was found that while measurement accuracy for a few subtests di能rs be備

tween and among 草roup囂， the overall main structure and fi揖ctor loadings were generally

invariant across normal 削d clinical 草棚ples. Therefore we concluded 紛紛 WI蛇-111

scores can be interpreted as h糾穗 the same meanin草 across groups.

Keywords: Exceptíonal children; Factorial Invariance; Multi-sample CFA; WISC-III

*Our gratitude goes to the Academic Pap巷r Editing Clinic, NTNU.

. 94 . 特殊教育研究學刊

Introduction

γhe Wechsler Intelligence Scale for Chi1-

dren (WISC) is 000 of the most widely referred

individualized intel\igence 紐約 (Camar龜， Nathan,

& Puente, 2000). Up to now, it is estimated that,

including Taiwa錢， close to twenty countries h在.ve

adapted the standardized version of this instru

ment 防部rgas， Wei鉤， V認為 V俞斌& Saklo已

sk皂， 2003).

Compared to the traditional verbal - p軒"

formance Wechsler IQ construet, a new,

four-factor structure (Verbal Comprehension,

Perceptual Organization, Freedom from Distrac

tibili紗， and Proce鈴ing Speed) was proposed in

the third edition of this instrument 治 1991

(WISC-III; Wechsler, 1991). This four - fac

tor-based structure is more in line with contem編

pora句紛紛訂ch on inte \1ectual components and

has been recognized for i的， clinical uti1ity both in

Taiwan 制d int草mationally (Chen & Yang，鉤。0;

Hung, Chen, & Chen, 2003; Weiss, Saklofs棍，

Schwar侃， Prifitera, & Courvil1章， 2006). Besides,

this model h諮詢問 cross-validated extensively

in a variety of samples by traditional exploratory

or confirmatory factor 甜的那 (Do闊的&

Warschaus均" 1996; Keith & Witta, 1997; Kono泊，

Kush, & Canivez, 1997; Ro泊， Prifitera, & Weiss,

1993; Roid & Worrall, 1997; Tupa, Wríght, &

F泛stad， 1997) ，在nd was confirmed tobe a pre“

ferred model for Taiw給 normal children (Chen,

Z恤， & Chen, 2000; Georgas, V臨街 V草ver，

Weiss, & S雄lofs峙， 2003). Nonetheless, exami

nation of the faεtorial invariance between normal

and clinical popul滋ions b俗話d on multi柵group

s加cture equating modeling (SEM) is stil1 short.

Factorial invariance is a key property of any

measure (Dr絲露ow， 1984, 1987; Ro吭， Werts, &

Flaugher, 1978). Scores for individuals from

different groups cannot be given the same mean

ing if there is no evidence of such invariance

(Horn & McArd怡， 19亨氏 Vandenberg & Lance ,

2000). Furthermore, it is clearly stated in the

軍tandard 7.8 of “Standards for Ed凶ational and

Psychological Testing" (AERA, APA, & NCM缸，

1999) th斂，“Comparisons across groups are only

m給你19ful if scores have comparable meaning

acr。“ groups. The standard is intended as appli刪

cable to s惡ttings where scores are implìcitly or

explicitly presented as comparable in score

meaning across 在roups (p.83)'\Millsap and K wok

(2004) also pointed out that selection based on a

composite with underlying non-invariant factor

structures could be problematic.

In empirical practic皂， WISC-III is most fre

quently administered for the purpose of diagnos

ing and evaluating the cognitive function and

exceptionality of clinical populations (Kaufrnan,

1994; Sattler, 2001; Sattler & Dumo刻， 200余

Prifitera, Saklofs峙， & Weiss, 2005). Implicit in

this common practice is the assumption that

WISC-III 認btests and factors have 紡車 same

meaning for both normal and clinical children;

that 峙.， equivalence is assumed to hold for the

underlying theoretical 給ucture旱， factor pattems

(subtests loaded on the same factors across

groups) and the magnitudes of factor loadings.

1位前le literature, few studies ever ex器mine

the measurement equivalence of WISC-III acro給

l紡車:e normal and c1inical samples by SE說 tech“

nique which has the advantage of taking cov侃"

鵲ce matrix of both groups into consideration.

n紛紛fore， taking as its sample Taiwanese chil

dren with a 闊的ciently 1紋gesample size and

WISC-III Factorial Invarian凹 Across Normal and Clinical Samples ﹒的﹒

degree of variation, this study investigated the

issue ofinvariance in the WISC-III by means ofa

multi-sample structure equating modeling.

R直ethod

Participants

Data based on two samples were analyzed in

命諒你dy. The first one is a normal samp誨，制d

the second one is a clinical s蜘ple.

The normal sample The first s直mple is the WISC-III Taiwan

standardization sample, comprised of the re晴

spon終s of 1,100 normal children ranging in a伊

from 6 to 16 years old. This 位ationally represen

tative s霉的ple was divided into 11 groups accord

ing to age, wíth 50 males and 50 fem孟les ín each

group. This sample was selected carefully to

match the Taiwan census information on several

variables, inclu這ing region, gender, and parents'

educatíonal leve1. The me垂n age was 11 , wíth a

standard devíation of 3.16; the average

full-scaled IQ (FSIQ) was 100 (SD叫5). A de令偏

tailed description of t油hi詠s normal s翎ar紋t訣lpl抬e 誌s r終.穹單

po釘T舟$給ed i泊n the τ孟滋iwa納n version of the WI臨SCι-11111 1

ma船m削n沁u恤I泌直I(W缸齡ch站1怒sler，丸， 1997η).

The clinical sample The second sample ís a heterogeneous

clinical sample including a total of 1 ， 150τ'aiwan

children, who were forma l1y identified and diag

nosed by clinicíans or educational evaluators 鈴

with special needs. Among them, 37% wer重 di

agnosed as being m侃tal1y retarded, 32% as hav

ing learning disabilitie旱， 19% as being autist誨，

10% as having ADHD，個d 2% as havi均 emo

討onal and behavioral disorders. The data were

collected by authors vía multiple tracking meth

ods during 2002 to 2007. Some were from formal

academic or clinical evaluation 終∞r品， and oth

ers were from database i終 special education iden

tification and placement system. For this group of

children, the mean FSIQ w紛 78.59 (SD喘2 1.78).

The average age was similar to that ofthe normal

group (M路10.泣， SD踹2.69)， and the gender ratio

was roughly 7:3 (72% males vs. 28% fema1e哼，

which concurs with the known fact that there is a

much higher percentage of ma1es in the clinícal

population.

Instrumentation The Taiwan version of the WISC-II1

(Wechsler, 1997) c的tains 13 subtests: Informa

tion (INF), Similarity (SIM), Vocabulary (VOC),

Comprehensíon (COM), Picture Completion

(PIC), Pícture Arrangement (PA), Block Design

(BLD), Object Assembly (OA), Arithmetic (ARI),

Digit Span (DS), Coding (CD), Symbol Search

(SYS), and Mazes (MZ). .'\11 composites and

subtests demonstrated good reliabilities (ex可 the

íntemal reli~bility rangíng from .87 to .96 for

composites，翻d .68ω.90 for subtests). Cumula

tive research finding also provided good s街路的

of validity evidences of this instrument for Tai

W闊的e children (Chen, Ch組草， & Y:部草， 2004;

Chen, Lin, & Lìao, 2005; Chen & Y:車ng， 2000;

Hun車， Che泣，& Chen，卸的; Wechsler, 1997).

Ana今'Sis 01 the data Tests for the factoria1 invariance across

normal and clinical groups were based on the

訟alysis of covariance structure models using

LISREL 8.8 ( Jöreskog & Sörbom, 2006). Fig.

1 shows the structure of the hypothesized base總

line model, i琵 which 12 core WISC-II1 subtests

(with the exception of the optionaJ Mazes subte滋)

特聽教育研究學刊

Index, PSI. This 加seline model was first tested

separately so that e蠶ch group could examine its

approprlateness.

• 96 •

ar單位ivíded according to 泌dex-type: (1) Verbal

Comprehension Index, VCI; (2) Perceptual Or

ganiz銜。nal Index, POl; (3) Freedom from Dis

倡“ibility Index, FDI; and (4) Proce給ing Speed

Construct of the hypothesized WISC-III four-factor model

matrices. Maximum likelihood was the estima

tion method because of its robustness and sensi-

Figure 1 鱷

Ne吼 inv:鉗制能 analyses with nested mod

els were tωted ∞伽借 levels. 單ach level 鉛ts

tivity to incorrectly specified models (Hu &

Bentler, 1 型98).τhe scale of latent factors was

defined by fixing the first factor loading as one

仰r factor. Criteria were evaluated jointly 紛紛倫

sess over辜11 model fit (Bentler & Bone哎， 1980;

Mar弱， Balla, & McDonald, 1988). These in

cIuded weighted I細t squares "1:, '1.: to df ratio, goodness-of-fit index (GFI), adjusted good

ness-of-fit index (AGFI), normed fit index (NFI),

non-normed 知 index (NNFI), comparative fit

必贈啥叫15個ints than 削 imposed by 設le previ

ous ∞e (f\如edith， 1993). 甘le 貧富t and w磁k臼t

level tes給 for configural invarian間， which a公

sumes 也at the oVi仿冒11 factor patt，敘說 is the same

如k油椒油al 甜d clini開1 chil的民These切關

ASh-J

宮丘電話

deJj

暫且苟且可gc

毛

adiztad塌，首gam

index (CFI), and root-me妞-square-error of ap

proximation (RMSEA). In 直ccordance with ∞心

ventio訟， a value of 夕。 served as the rule-of

thumb lower limit of acceptable fit for all fit in

dices rangin草 from zero to 1, with 1 indic轟ting a

perfect fit (Hoyle & Panter, 1995; KIi帥， 2005).

level t，的“你 W帥k fact剝削 invariæ帥， or a1so

個lIed 波爾ic invari翻開;也is ∞郎trained m叫el

Z湖泊res the ma敘述“de ofthe 翁ctor 1咽dingsbe

徨姆總鳴叫'oss a11 grou濟(~ = ^c). The thi吋

level tests for uniqueness invariance (eN 囂。c);

ex器mining whether WISC-III

four-factor-structure explains the same amount of

vari車nces of s的tests in both groups. 許lat is, whether the subtest abilities could be measured

me似的this

with similar 在ccuracy across groups.

All models were tested using covari在nce

WISC-III Factorial lnvariance Across Normal and Clinical Samples ﹒悍，

Values close to 2.0 or 3.0 were considered fit for

the :e to df ratio (Bol\en, 1989). An RMSEA less

than .05 con明ponded to a “good" 給 and with .08

considered an “acceptable" 品 (McDonald & Ho, 2002).

During e在ch step of the analy認s， the chi

$限制 difference 抖的 was tested between

nested models, and suggestions regardin草 to par

tial measurement invariance (Byme, Shavelson,

& Muthen, 1989; Byrne & Watkins, 2003) w紛

carefully considered and fo l\owed. If inadequate

街 was detected, fit in the model was improved

by including additional para鈴eters identified by

the modification index (Ml) provided by LISREL.

Meanwhi泊， re中arameterization was examined

careful\y for meaningfulness.

Results

Descriptive statistics

The descriptive statistics for each subtest by

group are presented in Table 1, along with the

Shapiro削Wilk index (Shapiro & Wilk, 1965) for

normality testing.

According to the me轟n values in Table 1,

disabled children performed lower on all subtests

than their normal counter悶ts.τhe pa惚惚。fthe

data for both groups approximated 主 normal dis

甘ibution. Skewness ranged from -.3 4 to -.05 for

the normal, and from .07 to .55 for the 晶磊bled

group; kurtosis ranged from -.09 to .44 for the

former, and from -1.08 to -“22 for the 1搬r group.

Arr峙。rity of the Shapiro-Wilk indices were c10se

to 1. Muthén and Kap\an (1985) once suggested

that a Iikelihood estimate for variables with a

skewness and kurtosis around -} to 刊 is accept

ab\e. Kline (2005) a\so suggested that the z test

may not be very useful in large samples because

slight departures may end up as being statistically

significant. An alternative is to interpret the ab

solute values of standardized indices. When the

absolute va\ue of skewness is larger than 3, or the

在bsolute value of ku泌的is is larger th訟的， then

it is considered to be a non-normal problem. Our

T翁ble 1. Descriptive statistics for both groups

Subtest INF

SIM

VOC

COM

PIC

PA BLD

OA

ARI

DS CD

SYS

M 10.28

9.74

9.86

10.21

10.35

9.92

10.16

10.26

10.56

10.33

10.14

10.33

Nonnal SD Sk Ku 3.05 -0.05 0.02

3.吾5 -0.34 -0.0華

3.53 -0 .3 1 0.07

3.30 鵬0.32 0.17

3.05 -0.14 0.20

3.36 -0.24 -0.05

3.14 -0.17 0 .17

3.13 -0.1 9 -0.09

3.06 -0.05 0.22

3.16 -0.06 。.06

3.17 -0,07 0.32

3.25 -0.1 3 0.44

C1inical 皂roup (N 臨 1 ，150 )

S-W M SD Sk Ku S-W 0.99 6.25 4.05 。.44 -0.46 。.94

O.會喜吾.59 4.50 0.26 -1.0華 0.92

0.98 6.53 4.03 0.28 -0.67 0.95

0.98 6.54 4.07 0.28 -0.69 0.95

0.99 7.6盛 4.57 0.07 -0.97 。.95

0.99 6.95 4.32 0.27 -0.93 0.95

0.99 7.33 4 .57 0.17 -0.89 。.95

0.99 7.91 4.39 0.08 -0.77 0.97

0.99 5.87 3.87 0.51 -0.22 0.94

O.會9 6.72 3.81 0.33 -0.36 0.97

0,99 5.3尋 3.72 0 .55 峙.46 0.92

0.98 6.44 4.00 0.27 靜0.62 0.95

Note: M Mean; SD = Standard Deviation; Sk 就 Skewness; Ku = Kurtosis; S-W Shapiro-Wi1k test fornonnalíty

. 98 . 特臻教宵研究學辛苦

results revealed that no serious non-normality Baseline model checking

was identified in the current da絡， and thus the

maximum Iikelihood method was applied for As indícated by all goodness鷹。f-fit indices

model estimatíon. 認ported in Table 2, the initially hypothesized

Table2. Multi-sample CFA goodness-of-fit index:

Models 2/ df GFI AGFI NFI NNFI CFI RMSEA

Phase 1 : Baseline model fit for each group Normal group (N) 176.32 48 3.67 0.97 。.96 0.99 0.9會 0.99 0.0毒害

@7.8 free 144.96 47 3.08 O.空軍。.96 0.99 0.99 0.99 0.044 ClinicaI group (C) 449.86 48 9.37 。.94 0.90 。.98 0.98 0.99 0.085 @，月合自 360.18 47 7.66 0.95 0.92 0.99 O.空軍 0.99 。‘[email protected] 翁官E 317.23 是6 6.89 0.96 0.93 0.99 0.99 0.99 0.072 @79fr紛 26吾.65 45 5.93 0.96 0.94 0.99 0.99 0.99 0.065 @1，8 含章S 236.4 1 44 5.37 0.97 0.9.尋 0.99 0.99 0.99 0.062

。1月 free 21 1.35 43 4.92 0.97 0.95 0.99 。.99 0.99 0.058 喔~'.8 free } 89.04 42 4.50 0.97 0.95 。.99 0.99 1.00 0.055 @3.4 free 163 .41 41 3.99 0.98 。.96 。.99 0.99 1.00 0.051

Phase II : Factor invariance aα'Oss groups 1. Configur祖1 Inv吋削侃 30車.37 88 3.50 0.98 0.99 0.99 0.99 0.047 2. Factor loading InVi制給給 338.76 96 3.53 0.98 0.99 0.99 。‘99 0.047

L\2 vS.1 30.3會*. s 2a.@3,4 舍的 (N) 318.59 95 3.35 0.98 0.99 0.99 O.嘗嘗 0.046

L\2a vs. 1 10.22 ? 3. Error V紋ìance Inv蚓ance 465.33 107 4.35 0.97 0.99 0.99 0.99 0.055

L\3 VS. 2a 14吾.74** 12 3巷.@制為它e(C) 440.15 106 4.15 。.97 0.99 0.99 0.99 0.053

L\3a vs. 2a 121.56** 11 始[email protected] 合軍e(C) 405.31 105 3.在G 0.97 0.99 0.99 0.99 0.050

L\3b vs. 2a 86.72** 10 3c. 恥.7 台海哥拉} 383.77 104 3.69 0.97 0.99 0.99 0.99 0.049

L\3c vs. 2a 重5.1慧" 9 3d. @8,8 fi官e(C) 364.09 103 3.53 0.97 0.99 。.99 0.99 0.0尋?

L\3d v氣2a 45.50** $

3e. @4.7 合當(C) 3尋2.72 102 3.36 0.98 0.99 0.99 0.99 O.側重

L\3e vs. 2韓 24.13** ? 3f. @12.12 free (C) 325.93 101 3.22 0.98 0.99 。.99 0.99 。.045

L\3fvs‘2a 7.34 6 Note: ** p<.O 1; @',8 = error covariance be糊糊1 Block Desi伊您是 Object Assembly; @諦'.12 器研orω叫iance be.

抑的n Digit Span and Symbol Search; @7，9 棚 error ∞v研ance be伽een Block Desi伊 and Ari也鈴聲tic; @1，8臨

errorcovariancebetw棚1 Information and Obj學ctA蜘mbly; 副書，端 errorωvariance betw棚1 Information an是Arithmetic; 喔，直，8 = error covariance betw如1 Picture Completion and Object Assembly; @3.4 軍 error covari-ance between Vocabularγand Comprehension; 恥.7 串串rror covariance betw紛紛 Picture Arrangement 捌dBlock Desi伊;如.7 = error covari徊的 between Comprehension and Block Design; @6,6 =棚。r variance for Picture Arrangl開lent; @I.I 揖 error variance for Information; @ø，s盟 error variance for Object Assembly; @1:I.12 揖 error variance for Symbol Search.

WISC-III Factorial Invariance Across Normal and Clinical Samples . 99 .

four-factor model fits comparatively better for

the normal group than for the clinical 銘mple.

This was a reason給le finding since Wechsler

factor structures have traditionally been estab

lished based m在inly on the normal popul甜的，

and clinical population is known to have some

di反inct cognitive p揖ttern.

We further examined the model 說ness for

each popul轟tion individual詩. For the normal

group, all “恥" v辜!泣紹說 this initial model were

within ideal ranges. Result revealed that the

four-factor structure is 在n appropriate construct

for normal population. To identi命組y modi訂閱"

tion which may further improve this model, we

then proceeded in an exploratory f1紛hion to 10-

偽te possible mis-fit parameters. The highest MI

indicated an error covariance beh鴻章n Block

Design and Object Assembly subtests, which was

considered as a reasonable one. Once we relaxed

this error covariance for the normal group, a rela倆

tively small but significant value was estimated

(standardized estimatee7•8 = .13, t= 5.5 1, p<.Ol ).

This revised model fit was improved and thus

was treatedωthe starting model for normal

group in the following invari車nce checking.

The same inspecti∞ procedure was applied

to the cIinical sample. With an RMSEA 叫 .085

and the x2todfratio as 9.37 shown in the 泌itíal

model, we believed that some model modifica

tions 車re needed for better improvement. After

free estimation of seven correlated residuals be

ing aIlowed, a better-fitting model was estab

lished. Since these trivial correlated residuals

were consi是ered reasonable, this improved struc

ture was set as the starting model for the clinical

sample.

結ulti舟ample invariance analysis

Based on the defined startíng m。這e怒，

multi-sample 轟nal如es were conducted with con.

straints embedded in sequence. First, checking

for configural invariance across groups (Model 1)

reveale是 a good model-data fit. Normal and

clínical chiJdren basically share the same latent

four-factor structure, and corresponding subtests

employ the same factors. Second, factor loadings

were then constl滋ned to be equal ac紛紛 groups

(model 2). CFA results indicated a good model fi t.

However, the X2 difference between this model

and Model 1 was significa成 (år(8)寸0.3 9，

p<.Ol), an exploratory approach was then pro

ceeded to locate misfit parameters. The highest

MI indicated an error covariance for normal

group between the Vocabulary and Comprehen

sion 如bt悠悠 which was considered as with ap

propriate meaning. Once relaxing this parameter

(model 2紗，是 relatively smaJl but significant

value w的 estimated (standardized estimat草 eJ 再=

O.肘， t話 4.30, p<.01 ). The revis韋拉 model had

improved 是 and the X2 difference between I關el2a and model 1 was not significant (år(7)=1 0.22,

p>.05). Fi捕時， further ∞nstr紹給 on error vari輸

ance equivalence were imposed (model 3). The

model fit well from a practical perspective but

with a significant X2 di能rence (吋(12拘捕，抖，

p<力 1)， suggesting that unique variances are not

completely invariant under current model speci

fication. MI checking again helped to indicate

mis-fit parameters. With six more error variances

and covariances bein草 examined and set free for

estimation, the final model (model 3f)日t the data

fairly well and shown 轟 non-significant X2 differ

ence (åX2(6)組?此， p>.05) compared to model2a.

. 100 • 特殊教育研究學科

Standardized estimates based on model 3f

for each group are shown in Table 3. Again, the

scale of latent factors was defined by fixing the

initial first factor 10轟ding estimation 泌 one per

factor. According to Table 3, while with the

four-factor structure and all factor loadings re

main invariant, the 是ifferences between groups

mainly reside in trivial residual variance and

covariance terms. ln general, only 車 few discrep

anci的 were identifie是銘:。草s groups. The partial

factorial invariance in factor pattern, factor load

in斜，銷社 error variances between normal and

clinical groups thus was supported.

Table 3. Standardized parameter estim滋es for invariance model 3f

Normal gro誼p (Clinic叫 group)

Factor Uniq輯學綴章喜事

Residual covariances 0 loading單

。 INF 這SIM VOC COM PIC PA BLD OA ARI DS CD SYS A

INF .84 .23 (.84) (.35)

SIM .辜會 .22 O

(.89) (.22) (0)

VOC .86 .27 O O (.86) (.27) (0) (0)

CO其f.81 .34 Q O .06

(.81) (.34) (0) (0) (.04) .

PIC .79 .38 。 O O 。(.79) (.38) (0). (0) (0) (0)

PA .81 .45 O 。。。 O

(.81) (.25) (0) (0) (0) (0) (0)

BLD .81 .35 o O O o o O (.81) (.35) (0) (0) (0) (-.04) (0) (門。7) .

OA .72 .40 。 O O 。 O O .12 (.72) (.56) (咒。6) (0) (0) (0) 已。9) (0) (.1 5) -

ARI .85 .27 O O O o 。 O o 。(.85) (.27) (.07) (0) (0) (0) (0) (0) (.06) (0)

DS .70 .52 O 。 O O 。 O O 。 o (.70) (.52) (0) (0) (0) (0) (0) (0) (0) (0) (0)

CD .69 .53 O O 。 O O O 。 O O 。(.69) (.53) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0)

SYS .82 .24 O o O 。 O O 。 O O 。 O (.82) (.39) (0) (0) (0) (0) (0) (0) (0) (0) (0) (.1 1) (0)

To sum 吟， the normal and clinical groups residual between Block Design and Object As-

differ with regard to the following trivial but sembly subtests (e7•恥 .12 and .15 for each group

statistically significant parameters: (1) Two error respectively); the other was the correl器，ted resid-

covarÎances were found resided in both norm“ ual between Yocabulary and Comprehension

and clinical groups, the first was the correlated subtests (e3•4= .0吾 and .04 respectively); (2)

WISC-II1 Factorial lnvariance Across Normal and Clinical Samples . 101 .

seven residual covariances were identified for the

clinical group only. There w語re correlated errors

OOtween Digit Span and Symbol Search sub

tes絡(010. 12 = .1 1, t 話 6.10)， between Block De

sign and Arithmetic subtests (07•9 言 .06， t 5.3 1),

between Information and Object Assembly sub

tests (01，8踹弋06， t = -4.93), between Inform滋ion

and Arithmetic sub鈴sts (0 1 •9 品 .07， t = 5.31),

between Picture Completion and Object Assem

bly su悅耳st豆 (05 草草 .09， t 踹 5.93)， OOtween Pic

ture Arrangement and Block Design subtests

(0訂單弋07， t 紅 -5.2會)， also between Compreher卜

sion and Block Design subtests (。你認 -.04， t =

-3.87); and (3) four subtests shown variant error

variances across norm轟I and clinical groups.

They were Picture Arrangement (06品 .45

and .25 respectively); Inform以ion(0 1.1 .23

and .35 res與ctively); Object Assembly (0車.8

.40 and .56 respectively); and Symbol

Search(012,12 = .24 and .39 respectively).

Discussion

Findings of this study support the

four-factor structure, which is consistent with

many previous foreign analytic 紋udies of the

WISC-II1. Also confirmed was a generally in

variant factor structure 轟cross normal and clinical

groups. With some exceptions regarding error

variances and correlated resídua始， empirical

eviden時 generally supported that we were meas

uríng the same theoretical latent constructs for

both normal and clinical children. Besídes,

measurement accuracy of most WISC-II1 subtests

was shown invariant across groups. Our findings

supported partial factor invariance. In general, the WISC寸11 scores for both normal and clinical

children ∞怯怯 be interpreted equivalent旬， that 峙，

as having the same meaning.

Byrne et al. (1 9惡的 suggested that allowing

for correlated errors is often necessary in order to

obtain a well-fitting model. These correlated

errors usually represent nonrandom measurement

errors due to method effects such as item format.

τhey further specified that,“the equality of error

variances 組d covari在nces is probably the least

important hypoth的is to test ... it is widely ac

C海pted that to do so repr!的ents an overly restric開

tive test of the data" (Byrne, 1998, p.2吾 1 ).

In this study, we found correlated residuals

for Block Design and Object Assembly subtests,

also for Voc主bulary and Comprehension subte路

in both normal and c1 inic器I samples. Item formats

for these two pairs are known to 00 similar, for

example, BLD and OA both involving a

part-whole integration, VOC and COM both re

quirin惡 more verbal oral expression, which are

not shared by other Wechsler sub拇指 (S訓le几

2001), these residual covariances were consid

ered meaningful and thus could be incorporated

reasonably. Nonetheles丸 it should be noted that

the degree of group discrepancy on estimated

values were fl鑫irly small for either pair (.12

vs. .15 for the former one，部d .06 vs. .04 for the

later one), though this might indicate a differen

tial item format effect across group草， there

seemedωbe no real need for serious concern

about group discrepancy.

As shown in the results, seven other s泣btly

correlated residuals were detected in the clinical

group only. These estimated values range from

-.04 to .11 , which were indeed quite trivial. The

ones with the most signific轟nt cross-group dis

crepancies were the correlated residual OOt洞en

• 102' 特聽教育研究學科

Digit Span and Symbol Search, and the corre

Jated error between Picture Completion and Ob

ject Assembly. Further data examination revealed

th轟t correJation between subtests in each pair was

higher in the clinical s磁ple (r=.55 and .71 re

spectively, both p<.O 1) then it wωin the normal

惡roup (阱.34 and .45 respectively, both p<.OI).

This result showed that more subtest variances

are actually shared in the clinical population, and

it was in accordance with the known fact that

cognitive abi1ities are less correlated within the

hi拚er ability group (Detterm紹，啥~3; Detter翩

您認& Daniel, 1989; Legree, Pif址~ & Grafto刃，

待會6; Lyn袋， 1990; Lynn & C∞per，待好，待會尋;

Spearman, 1927). Sweetland, Reina, and Ta版

(2006) suggested that lower ability children may

have deficits in important central cognitive pro。

esses, and thus tend to operate on a more uniform

lower level.

Further more, it is clear from a review of the

literature that Digit Span and 句:諦。I Search

both require children to form visual i紡車喜悅 util輛

ize visual sc器nning (either physically or me心

tally), have a tolerance for s個紙 and be 詩le to

concentrate; meanwhile, Picture Completion and

Object Assembly both tap childr蚓、 abi1ity to

form wholistic ∞ncepts on pictures with mean

ingful contents (S揖ttler & Dumont, 2004). AlI

those correlated error terms in the clinical s獄時le

might represent some 紡車red compo泣ents which

講re mor告 congruent in the clinical population. It

might also be reasonable to assume th斜， for

cIinical children, once partitioning off the shared

varianc叫 between subtests which could be ex

plained by latent factors identified in this

four-fa佼佼 model， some trivial portion could

remain un-explained. However, given the sub-

t1ety of such differences across groups, we be

lieve that the overa)) main structure for WISC-III

can be fairly ∞nsidered as stable enough across

normal and clinical populations.

Finally, four subte級s were found with dis耐

crep部t e佼佼 V紹給C斜斜ross groups.τhe error

variances for Information, Object Assembly, and

Symbol Search 紹說你t8 were estimated slightly

higher in the clinical group, which revealed that

the corresponding tested intellectual abilities

could be explained more thoroughly by the

four-factor model in the normal children popula

tion. On the contrary, the error variances for Pic

給您 A汀敘1gement subtest was estimated slightJy

higher in the norrr叫車roup， thus suggested 紛紛

this kind of ability could be better explained by

the model in the clinical children popuJation.

Many factors, such as population unique varia

tion or ability level, could contribute to these

discrepant findings, or it could be the interaction

between test contents and population exception

ality. While we remind 紛紛在rchers to be careful

when expl轟inin草 the test result for children in

different population, the current findings surely

deserve further examinations.

Since our primary goal was to get a wholis

tic vi仇N on factorial invariance across groups,

some inevitable 1imitations of the present sωdy

deserve attention. Fi悶， even if the partial mea.糊

認認ment inv器riance approach he1ped us getti姆

拉le maximum information r宅garding the degree

of overall invarian嗨， we realized that the ex輛

ploratory post-hoc approach in identifying possi糊

ble variant parameters could bring the risk of

identification on chance leve l. In this study, be勵sides the statistical MI indicators, we did monitor

each st叩 and try to reason each modification

WISC-III Factorial Invariance Ac紛紛 Normal and Clinical Samples . 103 .

with sensible meaning through the whole model

缸設ing process. Noneth剎車ss， as Vande登berg and

Lance (2000) su草gested， researchers must care

fully consider and examine the thèoretical justi

fication when relaxing constraint器 in practicing

the p制ial invariance study. Second, it is good

that we used large 蠶nd heterogeneous s置mples for

ensuring the stability and accuracy for parameter

estimations in this current study. However, with

such det垂iled model-fitting modifications, we

f嘗aIize that it could be a problem for later

cross-vaIidation. Researchers are encourage哇 to

aware of this Iimitation.

In summary, given the relatively large dat磊

set and substantial number of variations with

which we wer軍 workin臣， the findings were of

substantial import臨ce for understanding the

factorial structure for this frequently used in

strument. Co削弱tent with contemporary r草$能rch

位ndings， the underlying WISC心1 factor struc

ture is 轟ppropriately represented by the proposed

four編factor m。這el comprising VCI, POI, FDI, and

PSI. Moreover, except for 鉛me trivial discrepan

cies on error variances and correlated residuals, majoritie器。f the model parameters were demon

strated invariant across groups. Evidence of

multi-Ievel invariance supported the partial factor

invariance of this instrument across normal and

clinical groups. The m轟in structure and the un

derlying meanin容。f each WISC-III factor are

generaIIy identical for both normal and cIinic蠶i

children populations in Taiwan.

References

American Educatíonal Research As草ociation，

American Psychological Association, Na-

tional Council 0汲 Measurement in Educa

tion. (1仰9). Standards for educationa/ and

psych%gica/ testing. Washington, DC:

AE設A.

Bentler, P. M吋& Bonett, D. G (1 縛。). Signi缸"

cance tests and goodne鉛翩。f-fit in the 組aly郁

的 of cov器riance structures. Psych%gical

Bu//etìn, 88, 588-606.

Bollen, K. A.(1989). Structural eq缸ations with

latent variables. New York:Wiley.

Byrn皂， B. M. (1998). Structure equation mode/

ing with LJSR缸， PREL/.在 and SIMP LlS:

Basic concepts, applicatÎons， ωui progr捌

ming. M軍hwah， NJ: Lawrence Erlbaum.

Byrn章， B. M., Shavelson, R. J., & Muthen, B.

(1 989). TI的ting for the equiv轟lence of factor

covariance and mean structures: The issue

of partial measurement invariance. Psycho

logical Bulletin, 105(3), 456-466.

Byr珊， B. 說叮& Watkins, D. (2003). The issue of

measurement invariance revisited. Journal of

cross-cult甜ra/ psych% gy, 34(2), 155-175.

Camar盔， W,., 1., Nath揖泣， J. 怠， & Puente, A. E.

(2000). Psychological test usage: Implica

tions in professional psychology. Profes鷹

sional Psych%gy: Research and Practice,

31, 141-154.

Chen，鼠， Chan車， C吋& Yang, T. (2004). Study of

intellec泌al patterns of children with Autism.

Bulletin ofS，戶cial Education, 26, 127-15 1.

Chen, H., Lin, K., & Liao, Y. (2005). Study of

WISC-IJI Intellectual Patterns of Mental

Retarded Children. Bulletin of Specia/ Eduω

cation, 28, 97-122.

Chen，日吋& Yan車， T. (2000). Base Rates of

WISιIII diagnostic subtest pa加股s in Tai

wan: Standardization , Learnin草 Disabled

. 104 . 特聽教育研究學刊

and ADHD s紋nples appIied. Psychological

跨'sting， 47(2), 91.110.

Chen, 丘， Zhu, J., & Chen, Y. (2000). The legiti.

macy and utility of the WISC.lIl fac

tor.based indexes: Taiwan standardization

sample applied. Paper presented at the an

nual meeting of the American Educational

Research As車。ciation， New Orleans, LA,

U.S.A.

加快erman， D. K. (1 993). Giftedness and inteIIi都

gence: One and the same? In Ciba Founda

tion 明d.)， The origins and de~lopment of

high ability. Wi的t Sussex, England: Wi1ey.

加快erma泊 D. K., & Daniel, M. H.

(19惡的.Correlations of mental tests with

each other and with cognitiv雷 vari揖bles are

highest for low IQ groups. lntellìgence, 13, 349-359.

。onders， J., & Warschaus峙， S. (1 996). A s如lC

tural equation analysis of the WISC-III in

children with traumatic head injury. Chi/d

Neuropsychology. 2, 185-192.

Drasgow, F. (1轉4). Scrutinizing psycholo草ical

tests: Measurement equivalence and

equivalent relations with extemal variables

ar單 central issues. Psychological Bulletin. 好，

134-135.

Drasgow, F. (1總7). Study of the measurement

bias of two st路也rdized psychological tests.

JournaJ of Applied PsychoJo智，刀， 19-29.

G疇。rgas， J., Wi糾紛， L. G., V.部 de Vijver, F. J. R.,

& Saklofs詣， D. 日，但也.) (2003). Culture

and chi/dren ~ inteJJigence: Cross編culturaJ

analysis of the WISC- lIl. S如 Diego多 CA:

Academic Press.

Georgas,J., Van de Vijver, F. J. 仗， Weis囂， L. o.，

& Saklofske, D. 成 (2003). A cross-cultural

analysis of the WISC-III. In J. Georgas, L.

0. We棍， F. J. 鼠. Van de V這ver， & D. H.

Saklofske (Eds.), Culture and children ~ În

teJJigence: Cross-cultural ana伊拉 of the

WISC-lIl (pp. 217-313). San Diego, CA:

Academic Press.

Hom, J. L., & McArdle. (1 992). A practical and

theoretical guì配給 me墨surement invariance

in agin在 research. Experimental Aging Re

search, 18(3), 117仆4ι

Hoyle, R. 鼠， & Panter, A.τ. (1995)龜 Writing

about structural equation mode怡. In R. H.

Hoyle (Ed.), Structural equation modeling:

concepts, issues, and applications. τhou

sand Oaks, CA: Sage.

Hu, L畫， & Bentler, P. M. (1 998). Fit indices ín

covariance structure modeling: 銬的Îtivity to

underpar轟meterized model misspecification.

P秒，chological Methods, 3(4), 424-453.

Hun章， L., Chen, S叮& Chen, H. (2003). The

Study of Intelligence of Secondary Students

with Leamíng Dis的iIities. Journal of Tai

wan Normal Universi妙: Education, 48(2),

215-238.

Jöresk嗯， K. 0., & Sörbom, D. (2006). LlSREL

8.8 statistical programs. Chic睹。: Scientific

Software.

Kaufm詣， A. S. (1 994). lntelligent testing with

the WISC-lll. NY: John Wiley & Sons.

Keith, T. 芯， & Witta，且心(1 997). Hierarchical

andc紛紛 age confirmatory factor 訟alysis

of the WISC-III: What does it measure?

School PsychoJogy Quarter秒，口，妙-107.

Klin章，又. B. (2005). Principles and practice of

structural equation modeling (2“ ed.). NY:

τhe Guilford Pres章建

Kono峙，了.氏， Kush, 1. C吋& Canivez, 0. L.

WISC-III Factoríal Invaríance Across Norrnal and Clínícal S益mples . 105 .

(1 997). Factor replication ofthe WISC-III ín

three independent s在mples of children re“

ceiving speciaI education. Journal of Psy

choeducationa/ Assessment, 15, 123-137.

Legree, P. J., Pifer，結.且，& Gra食on， F. C. (1 996).

Correlations among cognítive abilities are

10wer for higher abiIity groups. Inte l/igence,

23， 45-57。

Lynn, R. (1990). Does Spearma泣's g decJine at

high levels? Some evidence from Scol益nd.

The Journa/ of Genetic Psychology, 153, 229輛230.

Lynn, R., & Cooper, C. (1 993). A secuI直r decline

in Spearm轟n's g in France. Learning and

Individual Differences, 5，是3-48.

Lyn袋，狡.， & Cooper，仁(1 994). A secular decline

in Spearm喜泣's g in Japan. C缸rrent Psychol

ogy Developmental, Learning, Personali.紗，

Social, 13, 3-9.

Marsh, H. W., Balla, 1. 氏， & McDonald, R. P.

(1 988). Goodness-of-fit indexes in confir

matory factor analysís: The effect of sample

size. Psychological Bul/etin, 103， 3會 1-410.

McDon在峙，民跤， & Ho, M. R. (2002). Principles

車nd practice in reportíng structura1 equ轟ting

ana1yses. Psychological 品袋thods， 7(1),

64-82.

Meredith, W. (1 993). 但easurement invaríanc巴，

factor a泣器lysis and Ð轟ctorial invariance.

Psychometri旬， 58(4)， 525勵543.

MiIlsap, R. 且， & Kwok, 0.(2004). Eva1uating

the impact of partial factOI城 invariance on

selection in two populations. Psych%gical

Methods, 9(1), 93-115.

Muthén，紋， & Kaplan, D. (1 985). A compariso泣

of some methodologies for the factor analy總

sis of non-normal Iikert varíables. British

Journal of Mathematical and Statistìcal

Psychology, 38, 17卜 189.

Prifitera, A吋 Saklofske， D. 址， & Wei裕， L. G.

但也.) (2005). WISC-IV c/inical use and in

terpretation: Scientist-Practitioner perspec

tives. Burlíngton, MA: Academíc Press.

Ro仗， D. 丸， We口s， C. 況， & Flaugher, R. L.

(I978). The use of analysís of covariance

structures for comparíng the psychometric

properties of mu1tiple variables acro認

popu1ations. Mu /tivariate Behavioral Re恥

search，汀， 403-418.

Roid, G. 鼠， Prifitera, A勻& Weíss, L. G (1993).

Replication of the WISC-III f;在佼佼 structure

in an independent sample. Journal of Psy

choeducational Assessment Monograph 晨，

ries. Advances in Psychological Assessment:

F伶chsler Intelligence Scale for Chiι

dren-Third Editìon，重-2 1.

Roid, G. 此， & Worral, W. (1997). Rep1icatíon of

Wechsler Intelligence Scale for Chil

dren-Third edition four-factor model j琵 the

Canadian normative sampl缸• Psychological

Assessment, 9, 512“ 515.

S車從ler， 1. M. (2001). Assessment of children cog

nitive applications (4th ed). Le 說esa， CA:

Author.

S直設ler， J. M., & Dumo訟， R. (2004). Assessment

of children: WISC-IV and WPPSI-IIl sup

p/ement. Le Mesa, CA: Author.

Shapìro, S. 鼠， & Wilk, M. B. (1965). An anaJysis

。f v訂iance 車5t for normality (complete

時給ples). Biometri缸，泣， 591-61 1.

Spean棚， C. (1 927). The ability ofman. London:

Mac怯ilIan.

Sweetland, 1. D吋 Reina， J. M叮& Tatti, A. F.

(2006). WISC-III verballperforrnance dis.

. 106 . 特殊教育研究筆者j

crep紐cies among a sample of gifted chil

dren. Gi丹'edChild Quarter，紗\ 50(1), 7-10.

Tupa, D. 1., Wrig尬， M. 。門& Frist詞. M. A.

(1997). Confirmatory factor 部alysis of the

WISC-III with child psychiatric inpatients.

Psych%gica/ Assessme1仗" 9(3), 302-306.

Vandenberg, R. J., & Lance, C. E. (2000). A r，聶衛

view and synthesis of the measurement in

variance literature: Suggestions, practices, and recommendations for organiz測ional re

search. Organizationa/ Research Metho街，

3( 1)， 4-69. 叫

Wechsler, D. (1 991). Manua/ for the Wechs/er

Intel/igence Sca/e for Children - Third Edi

tion. San Antonio, TX: The Psychological

Corporation.

Wechsler, D. (1 997). M.鏘ua/for the 為iwan ver

sion of the Wechsler Inte /ligence Scale for

Children - Third Edition. Taipei, Taiw部:Chinese beh轟vioral science corporation也

W紛紛， L. (忌， Saklofs峙， D. 此， Schw紋t萃， D.M刊

Prifiter昌， A., & Courville, T. (200的. Ad欄

vanced clinical i治terpret揖tion of WISιIV

index scores. In. L.G. Weiss, D. H. Saklof

sk章， A. Prifiter丸& J. A. Holdnack. (Eds.),

WISC-IV advanced clinica/ ínterpretation

(pp139-179). Burlington, MA: Academic

Press.

收楠日期: 20肘.1 0.1 2

撥斃的期: 2008.03.11

. 107 .

HH G -t 到uw

nLV HM AU

n乞

旬，.

-HM

<U

L針，

za

ni--帥、句‘d

nyny p3

, fη

oi

、

-a

句、d

-HH弓‘Jv

aEa. ha i

]心似許許心利FM

讓灣一般兒車與臨床兒童之魏氏兒童智力量表第三版(WISC-III)因素恆等性研究

嗓心，拾

金灣師範大學特教是教授

朱建軍美鑽心理公司主任

洪儷瑜

發灣師範大學特教系教授

張正芬臺灣師範大學特教系教授

楊宗仁*北教育大學特教系副教授

本研究主要自鷗在樹驗魏氏兒童智力量表第三版(WISC-III)在一般兒黨與特殊

兒黨組群聞之區紫值等性。文中接據 1 ， 100 名 WISC-III 棋準化標本及 1150 名臨床

兒童標本黨腎，灑馬結構方程模式進行多樣本驗麓性囡黨分析.本研究以 WISC-III 12

劉東要分割驗與四個潛在留素架構為假設轟準模盤，分三階段完成橡蟻。階段一先

分韓三組進符單標本驗證性盟黨分軒， j;J.建立讓當之義準困難模式。階段二則聽立

睛段性主牽涉設限的巢賽棋盤，避…轅鞍因素架構、厲素負荷蠱、與殘差變異數之蹄

組對權等館設。研究發現除了少處都聽之殘競賽異或共變盤空投異持緝大多數之幢驗

蓮香數持真榜樣本罷等特性· WISC-I日在一控克黨與特殊兒童輯群間真有相間之自因

素架構與聞業負荷釁 e 整體面言，研究輯果支持鄧份因素盤等哇，台灣臨床兒童與

一般兒寞之 WISC-III 分數是具有輯同之解釋議議。

關攤詢:特殊兒童、因素盟等桂、多樣本驗蠶住國黨分析、 WISC-III