• 93 •
特蛛教育研究學刊
民阱, 33 卷 1 期, 93刪 107
WISC-III Factorial Invariance Across Normal and Clinical Samples
Chen Hsin-Yi Professor, Dept. of 話pecial Education,
Nationalτ'aiwan Normal University
Hung Li-Yu Professor, Dept. of Special Education,
National Taiwan Normal University
Zhu Jianjun Director, The Psychological COrpOI滋ion,
U.S.A.
Chang Cheng-Fen Professor, Dept. of Special Education, N袋tional Taiwan Normal University
Yang Tsung-Ren A銬。ciate Professor, Dept. of Special Educ鑫tion, National Taipei University of Education
ABSTRACT
Tak:ing as its sample a normal group of 1, 100 children and a clinical group of 1,150
children in Taiw紹, the pu叩ose of this study was to test fo1 invariance in the WIS心III
factorial structure across normal and clinical samples. ResuIts of our two-stage
multi心ample ∞nfirmatory factor analysis 到擇。如d partial factor invariance. Among all
examined parameters, only a few discrepancies involving unique error variance草 and sub
tly correlated residuals were found across groups. Evidence of partial measurement in
variance indic謹.ted that the hypothesized four-factor model generally fit the real data
fairly welI. It was found that while measurement accuracy for a few subtests di能rs be備
tween and among 草roup囂, the overall main structure and fi揖ctor loadings were generally
invariant across normal 削d clinical 草棚ples. Therefore we concluded 紛紛 WI蛇-111
scores can be interpreted as h糾穗 the same meanin草 across groups.
Keywords: Exceptíonal children; Factorial Invariance; Multi-sample CFA; WISC-III
*Our gratitude goes to the Academic Pap巷r Editing Clinic, NTNU.
. 94 . 特殊教育研究學刊
Introduction
γhe Wechsler Intelligence Scale for Chi1-
dren (WISC) is 000 of the most widely referred
individualized intel\igence 紐約 (Camar龜, Nathan,
& Puente, 2000). Up to now, it is estimated that,
including Taiwa錢, close to twenty countries h在.ve
adapted the standardized version of this instru
ment 防部rgas, Wei鉤, V認為 V俞斌& Saklo已
sk皂, 2003).
Compared to the traditional verbal - p軒"
formance Wechsler IQ construet, a new,
four-factor structure (Verbal Comprehension,
Perceptual Organization, Freedom from Distrac
tibili紗, and Proce鈴ing Speed) was proposed in
the third edition of this instrument 治 1991
(WISC-III; Wechsler, 1991). This four - fac
tor-based structure is more in line with contem編
pora句紛紛訂ch on inte \1ectual components and
has been recognized for i的, clinical uti1ity both in
Taiwan 制d int草mationally (Chen & Yang,鉤。0;
Hung, Chen, & Chen, 2003; Weiss, Saklofs棍,
Schwar侃, Prifitera, & Courvil1章, 2006). Besides,
this model h諮詢問 cross-validated extensively
in a variety of samples by traditional exploratory
or confirmatory factor 甜的那 (Do闊的&
Warschaus均" 1996; Keith & Witta, 1997; Kono泊,
Kush, & Canivez, 1997; Ro泊, Prifitera, & Weiss,
1993; Roid & Worrall, 1997; Tupa, Wríght, &
F泛stad, 1997) ,在nd was confirmed tobe a pre“
ferred model for Taiw給 normal children (Chen,
Z恤, & Chen, 2000; Georgas, V臨街 V草ver,
Weiss, & S雄lofs峙, 2003). Nonetheless, exami
nation of the faεtorial invariance between normal
and clinical popul滋ions b俗話d on multi柵group
s加cture equating modeling (SEM) is stil1 short.
Factorial invariance is a key property of any
measure (Dr絲露ow, 1984, 1987; Ro吭, Werts, &
Flaugher, 1978). Scores for individuals from
different groups cannot be given the same mean
ing if there is no evidence of such invariance
(Horn & McArd怡, 19亨氏 Vandenberg & Lance ,
2000). Furthermore, it is clearly stated in the
軍tandard 7.8 of “Standards for Ed凶ational and
Psychological Testing" (AERA, APA, & NCM缸,
1999) th斂,“Comparisons across groups are only
m給你19ful if scores have comparable meaning
acr。“ groups. The standard is intended as appli刪
cable to s惡ttings where scores are implìcitly or
explicitly presented as comparable in score
meaning across 在roups (p.83)'\Millsap and K wok
(2004) also pointed out that selection based on a
composite with underlying non-invariant factor
structures could be problematic.
In empirical practic皂, WISC-III is most fre
quently administered for the purpose of diagnos
ing and evaluating the cognitive function and
exceptionality of clinical populations (Kaufrnan,
1994; Sattler, 2001; Sattler & Dumo刻, 200余
Prifitera, Saklofs峙, & Weiss, 2005). Implicit in
this common practice is the assumption that
WISC-III 認btests and factors have 紡車 same
meaning for both normal and clinical children;
that 峙., equivalence is assumed to hold for the
underlying theoretical 給ucture旱, factor pattems
(subtests loaded on the same factors across
groups) and the magnitudes of factor loadings.
1位前le literature, few studies ever ex器mine
the measurement equivalence of WISC-III acro給
l紡車:e normal and c1inical samples by SE說 tech“
nique which has the advantage of taking cov侃"
鵲ce matrix of both groups into consideration.
n紛紛fore, taking as its sample Taiwanese chil
dren with a 闊的ciently 1紋gesample size and
WISC-III Factorial Invarian凹 Across Normal and Clinical Samples ﹒的﹒
degree of variation, this study investigated the
issue ofinvariance in the WISC-III by means ofa
multi-sample structure equating modeling.
R直ethod
Participants
Data based on two samples were analyzed in
命諒你dy. The first one is a normal samp誨,制d
the second one is a clinical s蜘ple.
The normal sample The first s直mple is the WISC-III Taiwan
standardization sample, comprised of the re晴
spon終s of 1,100 normal children ranging in a伊
from 6 to 16 years old. This 位ationally represen
tative s霉的ple was divided into 11 groups accord
ing to age, wíth 50 males and 50 fem孟les ín each
group. This sample was selected carefully to
match the Taiwan census information on several
variables, inclu這ing region, gender, and parents'
educatíonal leve1. The me垂n age was 11 , wíth a
standard devíation of 3.16; the average
full-scaled IQ (FSIQ) was 100 (SD叫5). A de令偏
tailed description of t油hi詠s normal s翎ar紋t訣lpl抬e 誌s r終.穹單
po釘T舟$給ed i泊n the τ孟滋iwa納n version of the WI臨SCι-11111 1
ma船m削n沁u恤I泌直I(W缸齡ch站1怒sler,丸, 1997η).
The clinical sample The second sample ís a heterogeneous
clinical sample including a total of 1 , 150τ'aiwan
children, who were forma l1y identified and diag
nosed by clinicíans or educational evaluators 鈴
with special needs. Among them, 37% wer重 di
agnosed as being m侃tal1y retarded, 32% as hav
ing learning disabilitie旱, 19% as being autist誨,
10% as having ADHD,個d 2% as havi均 emo
討onal and behavioral disorders. The data were
collected by authors vía multiple tracking meth
ods during 2002 to 2007. Some were from formal
academic or clinical evaluation 終∞r品, and oth
ers were from database i終 special education iden
tification and placement system. For this group of
children, the mean FSIQ w紛 78.59 (SD喘2 1.78).
The average age was similar to that ofthe normal
group (M路10.泣, SD踹2.69), and the gender ratio
was roughly 7:3 (72% males vs. 28% fema1e哼,
which concurs with the known fact that there is a
much higher percentage of ma1es in the clinícal
population.
Instrumentation The Taiwan version of the WISC-II1
(Wechsler, 1997) c的tains 13 subtests: Informa
tion (INF), Similarity (SIM), Vocabulary (VOC),
Comprehensíon (COM), Picture Completion
(PIC), Pícture Arrangement (PA), Block Design
(BLD), Object Assembly (OA), Arithmetic (ARI),
Digit Span (DS), Coding (CD), Symbol Search
(SYS), and Mazes (MZ). .'\11 composites and
subtests demonstrated good reliabilities (ex可 the
íntemal reli~bility rangíng from .87 to .96 for
composites,翻d .68ω.90 for subtests). Cumula
tive research finding also provided good s街路的
of validity evidences of this instrument for Tai
W闊的e children (Chen, Ch組草, & Y:部草, 2004;
Chen, Lin, & Lìao, 2005; Chen & Y:車ng, 2000;
Hun車, Che泣,& Chen,卸的; Wechsler, 1997).
Ana今'Sis 01 the data Tests for the factoria1 invariance across
normal and clinical groups were based on the
訟alysis of covariance structure models using
LISREL 8.8 ( Jöreskog & Sörbom, 2006). Fig.
1 shows the structure of the hypothesized base總
line model, i琵 which 12 core WISC-II1 subtests
(with the exception of the optionaJ Mazes subte滋)
特聽教育研究學刊
Index, PSI. This 加seline model was first tested
separately so that e蠶ch group could examine its
approprlateness.
• 96 •
ar單位ivíded according to 泌dex-type: (1) Verbal
Comprehension Index, VCI; (2) Perceptual Or
ganiz銜。nal Index, POl; (3) Freedom from Dis
倡“ibility Index, FDI; and (4) Proce給ing Speed
Construct of the hypothesized WISC-III four-factor model
matrices. Maximum likelihood was the estima
tion method because of its robustness and sensi-
Figure 1 鱷
Ne吼 inv:鉗制能 analyses with nested mod
els were tωted ∞伽借 levels. 單ach level 鉛ts
tivity to incorrectly specified models (Hu &
Bentler, 1 型98).τhe scale of latent factors was
defined by fixing the first factor loading as one
仰r factor. Criteria were evaluated jointly 紛紛倫
sess over辜11 model fit (Bentler & Bone哎, 1980;
Mar弱, Balla, & McDonald, 1988). These in
cIuded weighted I細t squares "1:, '1.: to df ratio, goodness-of-fit index (GFI), adjusted good
ness-of-fit index (AGFI), normed fit index (NFI),
non-normed 知 index (NNFI), comparative fit
必贈啥叫15個ints than 削 imposed by 設le previ
ous ∞e (f\如edith, 1993). 甘le 貧富t and w磁k臼t
level tes給 for configural invarian間, which a公
sumes 也at the oVi仿冒11 factor patt,敘說 is the same
如k油椒油al 甜d clini開1 chil的民These切關
ASh-J
宮丘電話
deJj
暫且苟且可gc
毛
adiztad塌,首gam
index (CFI), and root-me妞-square-error of ap
proximation (RMSEA). In 直ccordance with ∞心
ventio訟, a value of 夕。 served as the rule-of
thumb lower limit of acceptable fit for all fit in
dices rangin草 from zero to 1, with 1 indic轟ting a
perfect fit (Hoyle & Panter, 1995; KIi帥, 2005).
level t,的“你 W帥k fact剝削 invariæ帥, or a1so
個lIed 波爾ic invari翻開;也is ∞郎trained m叫el
Z湖泊res the ma敘述“de ofthe 翁ctor 1咽dingsbe
徨姆總鳴叫'oss a11 grou濟(~ = ^c). The thi吋
level tests for uniqueness invariance (eN 囂。c);
ex器mining whether WISC-III
four-factor-structure explains the same amount of
vari車nces of s的tests in both groups. 許lat is, whether the subtest abilities could be measured
me似的this
with similar 在ccuracy across groups.
All models were tested using covari在nce
WISC-III Factorial lnvariance Across Normal and Clinical Samples ﹒悍,
Values close to 2.0 or 3.0 were considered fit for
the :e to df ratio (Bol\en, 1989). An RMSEA less
than .05 con明ponded to a “good" 給 and with .08
considered an “acceptable" 品 (McDonald & Ho, 2002).
During e在ch step of the analy認s, the chi
$限制 difference 抖的 was tested between
nested models, and suggestions regardin草 to par
tial measurement invariance (Byme, Shavelson,
& Muthen, 1989; Byrne & Watkins, 2003) w紛
carefully considered and fo l\owed. If inadequate
街 was detected, fit in the model was improved
by including additional para鈴eters identified by
the modification index (Ml) provided by LISREL.
Meanwhi泊, re中arameterization was examined
careful\y for meaningfulness.
Results
Descriptive statistics
The descriptive statistics for each subtest by
group are presented in Table 1, along with the
Shapiro削Wilk index (Shapiro & Wilk, 1965) for
normality testing.
According to the me轟n values in Table 1,
disabled children performed lower on all subtests
than their normal counter悶ts.τhe pa惚惚 。fthe
data for both groups approximated 主 normal dis
甘ibution. Skewness ranged from -.3 4 to -.05 for
the normal, and from .07 to .55 for the 晶磊bled
group; kurtosis ranged from -.09 to .44 for the
former, and from -1.08 to -“22 for the 1搬r group.
Arr峙。rity of the Shapiro-Wilk indices were c10se
to 1. Muthén and Kap\an (1985) once suggested
that a Iikelihood estimate for variables with a
skewness and kurtosis around -} to 刊 is accept
ab\e. Kline (2005) a\so suggested that the z test
may not be very useful in large samples because
slight departures may end up as being statistically
significant. An alternative is to interpret the ab
solute values of standardized indices. When the
absolute va\ue of skewness is larger than 3, or the
在bsolute value of ku泌的is is larger th訟的, then
it is considered to be a non-normal problem. Our
T翁ble 1. Descriptive statistics for both groups
Subtest INF
SIM
VOC
COM
PIC
PA BLD
OA
ARI
DS CD
SYS
M 10.28
9.74
9.86
10.21
10.35
9.92
10.16
10.26
10.56
10.33
10.14
10.33
Nonnal SD Sk Ku 3.05 -0.05 0.02
3.吾5 -0.34 -0.0華
3.53 -0 .3 1 0.07
3.30 鵬0.32 0.17
3.05 -0.14 0.20
3.36 -0.24 -0.05
3.14 -0.17 0 .17
3.13 -0.1 9 -0.09
3.06 -0.05 0.22
3.16 -0.06 。.06
3.17 -0,07 0.32
3.25 -0.1 3 0.44
C1inical 皂roup (N 臨 1 ,150 )
S-W M SD Sk Ku S-W 0.99 6.25 4.05 。.44 -0.46 。.94
O.會喜 吾.59 4.50 0.26 -1.0華 0.92
0.98 6.53 4.03 0.28 -0.67 0.95
0.98 6.54 4.07 0.28 -0.69 0.95
0.99 7.6盛 4.57 0.07 -0.97 。.95
0.99 6.95 4.32 0.27 -0.93 0.95
0.99 7.33 4 .57 0.17 -0.89 。.95
0.99 7.91 4.39 0.08 -0.77 0.97
0.99 5.87 3.87 0.51 -0.22 0.94
O.會9 6.72 3.81 0.33 -0.36 0.97
0,99 5.3尋 3.72 0 .55 峙.46 0.92
0.98 6.44 4.00 0.27 靜0.62 0.95
Note: M Mean; SD = Standard Deviation; Sk 就 Skewness; Ku = Kurtosis; S-W Shapiro-Wi1k test fornonnalíty
. 98 . 特臻教宵研究學辛苦
results revealed that no serious non-normality Baseline model checking
was identified in the current da絡, and thus the
maximum Iikelihood method was applied for As indícated by all goodness鷹。f-fit indices
model estimatíon. 認ported in Table 2, the initially hypothesized
Table2. Multi-sample CFA goodness-of-fit index:
Models 2/ df GFI AGFI NFI NNFI CFI RMSEA
Phase 1 : Baseline model fit for each group Normal group (N) 176.32 48 3.67 0.97 。.96 0.99 0.9會 0.99 0.0毒害
@7.8 free 144.96 47 3.08 O.空軍 。.96 0.99 0.99 0.99 0.044 ClinicaI group (C) 449.86 48 9.37 。.94 0.90 。.98 0.98 0.99 0.085 @,月合自 360.18 47 7.66 0.95 0.92 0.99 O.空軍 0.99 。‘[email protected] 翁官E 317.23 是6 6.89 0.96 0.93 0.99 0.99 0.99 0.072 @79fr紛 26吾.65 45 5.93 0.96 0.94 0.99 0.99 0.99 0.065 @1,8 含章S 236.4 1 44 5.37 0.97 0.9.尋 0.99 0.99 0.99 0.062
。1月 free 21 1.35 43 4.92 0.97 0.95 0.99 。.99 0.99 0.058 喔~'.8 free } 89.04 42 4.50 0.97 0.95 。.99 0.99 1.00 0.055 @3.4 free 163 .41 41 3.99 0.98 。.96 。.99 0.99 1.00 0.051
Phase II : Factor invariance aα'Oss groups 1. Configur祖1 Inv吋削侃 30車.37 88 3.50 0.98 0.99 0.99 0.99 0.047 2. Factor loading InVi制給給 338.76 96 3.53 0.98 0.99 0.99 。‘99 0.047
L\2 vS.1 30.3會*. s 2a.@3,4 舍的 (N) 318.59 95 3.35 0.98 0.99 0.99 O.嘗嘗 0.046
L\2a vs. 1 10.22 ? 3. Error V紋ìance Inv蚓ance 465.33 107 4.35 0.97 0.99 0.99 0.99 0.055
L\3 VS. 2a 14吾.74** 12 3巷.@制為它e(C) 440.15 106 4.15 。.97 0.99 0.99 0.99 0.053
L\3a vs. 2a 121.56** 11 始[email protected] 合軍e(C) 405.31 105 3.在G 0.97 0.99 0.99 0.99 0.050
L\3b vs. 2a 86.72** 10 3c. 恥.7 台海哥拉} 383.77 104 3.69 0.97 0.99 0.99 0.99 0.049
L\3c vs. 2a 重5.1慧" 9 3d. @8,8 fi官e(C) 364.09 103 3.53 0.97 0.99 。.99 0.99 0.0尋?
L\3d v氣2a 45.50** $
3e. @4.7 合當(C) 3尋2.72 102 3.36 0.98 0.99 0.99 0.99 O.側重
L\3e vs. 2韓 24.13** ? 3f. @12.12 free (C) 325.93 101 3.22 0.98 0.99 。.99 0.99 。.045
L\3fvs‘2a 7.34 6 Note: ** p<.O 1; @',8 = error covariance be糊糊1 Block Desi伊您是 Object Assembly; @諦'.12 器研orω叫iance be.
抑的n Digit Span and Symbol Search; @7,9 棚 error ∞v研ance be伽een Block Desi伊 and Ari也鈴聲tic; @1,8臨
errorcovariancebetw棚1 Information and Obj學ctA蜘mbly; 副書,端 errorωvariance betw棚1 Information an是Arithmetic; 喔,直,8 = error covariance betw如1 Picture Completion and Object Assembly; @3.4 軍 error covari-ance between Vocabularγand Comprehension; 恥.7 串串rror covariance betw紛紛 Picture Arrangement 捌dBlock Desi伊;如.7 = error covari徊的 between Comprehension and Block Design; @6,6 =棚。r variance for Picture Arrangl開lent; @I.I 揖 error variance for Information; @ø,s盟 error variance for Object Assembly; @1:I.12 揖 error variance for Symbol Search.
WISC-III Factorial Invariance Across Normal and Clinical Samples . 99 .
four-factor model fits comparatively better for
the normal group than for the clinical 銘mple.
This was a reason給le finding since Wechsler
factor structures have traditionally been estab
lished based m在inly on the normal popul甜的,
and clinical population is known to have some
di反inct cognitive p揖ttern.
We further examined the model 說ness for
each popul轟tion individual詩. For the normal
group, all “恥" v辜!泣紹說 this initial model were
within ideal ranges. Result revealed that the
four-factor structure is 在n appropriate construct
for normal population. To identi命組y modi訂閱"
tion which may further improve this model, we
then proceeded in an exploratory f1紛hion to 10-
偽te possible mis-fit parameters. The highest MI
indicated an error covariance beh鴻章n Block
Design and Object Assembly subtests, which was
considered as a reasonable one. Once we relaxed
this error covariance for the normal group, a rela倆
tively small but significant value was estimated
(standardized estimatee7•8 = .13, t= 5.5 1, p<.Ol ).
This revised model fit was improved and thus
was treatedωthe starting model for normal
group in the following invari車nce checking.
The same inspecti∞ procedure was applied
to the cIinical sample. With an RMSEA 叫 .085
and the x2todfratio as 9.37 shown in the 泌itíal
model, we believed that some model modifica
tions 車re needed for better improvement. After
free estimation of seven correlated residuals be
ing aIlowed, a better-fitting model was estab
lished. Since these trivial correlated residuals
were consi是ered reasonable, this improved struc
ture was set as the starting model for the clinical
sample.
結ulti舟ample invariance analysis
Based on the defined startíng m。這e怒,
multi-sample 轟nal如es were conducted with con.
straints embedded in sequence. First, checking
for configural invariance across groups (Model 1)
reveale是 a good model-data fit. Normal and
clínical chiJdren basically share the same latent
four-factor structure, and corresponding subtests
employ the same factors. Second, factor loadings
were then constl滋ned to be equal ac紛紛 groups
(model 2). CFA results indicated a good model fi t.
However, the X2 difference between this model
and Model 1 was significa成 (år(8)寸0.3 9,
p<.Ol), an exploratory approach was then pro
ceeded to locate misfit parameters. The highest
MI indicated an error covariance for normal
group between the Vocabulary and Comprehen
sion 如bt悠悠 which was considered as with ap
propriate meaning. Once relaxing this parameter
(model 2紗,是 relatively smaJl but significant
value w的 estimated (standardized estimat草 eJ 再=
O.肘, t話 4.30, p<.01 ). The revis韋拉 model had
improved 是 and the X2 difference between I關el2a and model 1 was not significant (år(7)=1 0.22,
p>.05). Fi捕時, further ∞nstr紹給 on error vari輸
ance equivalence were imposed (model 3). The
model fit well from a practical perspective but
with a significant X2 di能rence (吋(12拘捕,抖,
p<力 1), suggesting that unique variances are not
completely invariant under current model speci
fication. MI checking again helped to indicate
mis-fit parameters. With six more error variances
and covariances bein草 examined and set free for
estimation, the final model (model 3f)日t the data
fairly well and shown 轟 non-significant X2 differ
ence (åX2(6)組?此, p>.05) compared to model2a.
. 100 • 特殊教育研究學科
Standardized estimates based on model 3f
for each group are shown in Table 3. Again, the
scale of latent factors was defined by fixing the
initial first factor 10轟ding estimation 泌 one per
factor. According to Table 3, while with the
four-factor structure and all factor loadings re
main invariant, the 是ifferences between groups
mainly reside in trivial residual variance and
covariance terms. ln general, only 車 few discrep
anci的 were identifie是銘:。草s groups. The partial
factorial invariance in factor pattern, factor load
in斜,銷社 error variances between normal and
clinical groups thus was supported.
Table 3. Standardized parameter estim滋es for invariance model 3f
Normal gro誼p (Clinic叫 group)
Factor Uniq輯學綴章喜事
Residual covariances 0 loading單
。 INF 這SIM VOC COM PIC PA BLD OA ARI DS CD SYS A
INF .84 .23 (.84) (.35)
SIM .辜會 .22 O
(.89) (.22) (0)
VOC .86 .27 O O (.86) (.27) (0) (0)
CO其f.81 .34 Q O .06
(.81) (.34) (0) (0) (.04) .
PIC .79 .38 。 O O 。(.79) (.38) (0). (0) (0) (0)
PA .81 .45 O 。 。 。 O
(.81) (.25) (0) (0) (0) (0) (0)
BLD .81 .35 o O O o o O (.81) (.35) (0) (0) (0) (-.04) (0) (門。7) .
OA .72 .40 。 O O 。 O O .12 (.72) (.56) (咒。6) (0) (0) (0) 已。9) (0) (.1 5) -
ARI .85 .27 O O O o 。 O o 。(.85) (.27) (.07) (0) (0) (0) (0) (0) (.06) (0)
DS .70 .52 O 。 O O 。 O O 。 o (.70) (.52) (0) (0) (0) (0) (0) (0) (0) (0) (0)
CD .69 .53 O O 。 O O O 。 O O 。(.69) (.53) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0)
SYS .82 .24 O o O 。 O O 。 O O 。 O (.82) (.39) (0) (0) (0) (0) (0) (0) (0) (0) (0) (.1 1) (0)
To sum 吟, the normal and clinical groups residual between Block Design and Object As-
differ with regard to the following trivial but sembly subtests (e7•恥 .12 and .15 for each group
statistically significant parameters: (1) Two error respectively); the other was the correl器,ted resid-
covarÎances were found resided in both norm“ ual between Yocabulary and Comprehension
and clinical groups, the first was the correlated subtests (e3•4= .0吾 and .04 respectively); (2)
WISC-II1 Factorial lnvariance Across Normal and Clinical Samples . 101 .
seven residual covariances were identified for the
clinical group only. There w語re correlated errors
OOtween Digit Span and Symbol Search sub
tes絡(010. 12 = .1 1, t 話 6.10), between Block De
sign and Arithmetic subtests (07•9 言 .06, t 5.3 1),
between Information and Object Assembly sub
tests (01,8踹弋06, t = -4.93), between Inform滋ion
and Arithmetic sub鈴sts (0 1 •9 品 .07, t = 5.31),
between Picture Completion and Object Assem
bly su悅耳st豆 (05 草草 .09, t 踹 5.93), OOtween Pic
ture Arrangement and Block Design subtests
(0訂單弋07, t 紅 -5.2會), also between Compreher卜
sion and Block Design subtests (。你認 -.04, t =
-3.87); and (3) four subtests shown variant error
variances across norm轟I and clinical groups.
They were Picture Arrangement (06品 .45
and .25 respectively); Inform以ion(0 1.1 .23
and .35 res與ctively); Object Assembly (0車.8
.40 and .56 respectively); and Symbol
Search(012,12 = .24 and .39 respectively).
Discussion
Findings of this study support the
four-factor structure, which is consistent with
many previous foreign analytic 紋udies of the
WISC-II1. Also confirmed was a generally in
variant factor structure 轟cross normal and clinical
groups. With some exceptions regarding error
variances and correlated resídua始, empirical
eviden時 generally supported that we were meas
uríng the same theoretical latent constructs for
both normal and clinical children. Besídes,
measurement accuracy of most WISC-II1 subtests
was shown invariant across groups. Our findings
supported partial factor invariance. In general, the WISC寸11 scores for both normal and clinical
children ∞怯怯 be interpreted equivalent旬, that 峙,
as having the same meaning.
Byrne et al. (1 9惡的 suggested that allowing
for correlated errors is often necessary in order to
obtain a well-fitting model. These correlated
errors usually represent nonrandom measurement
errors due to method effects such as item format.
τhey further specified that,“the equality of error
variances 組d covari在nces is probably the least
important hypoth的is to test ... it is widely ac
C海pted that to do so repr!的ents an overly restric開
tive test of the data" (Byrne, 1998, p.2吾 1 ).
In this study, we found correlated residuals
for Block Design and Object Assembly subtests,
also for Voc主bulary and Comprehension subte路
in both normal and c1 inic器I samples. Item formats
for these two pairs are known to 00 similar, for
example, BLD and OA both involving a
part-whole integration, VOC and COM both re
quirin惡 more verbal oral expression, which are
not shared by other Wechsler sub拇指 (S訓le几
2001), these residual covariances were consid
ered meaningful and thus could be incorporated
reasonably. Nonetheles丸 it should be noted that
the degree of group discrepancy on estimated
values were fl鑫irly small for either pair (.12
vs. .15 for the former one,部d .06 vs. .04 for the
later one), though this might indicate a differen
tial item format effect across group草, there
seemedωbe no real need for serious concern
about group discrepancy.
As shown in the results, seven other s泣btly
correlated residuals were detected in the clinical
group only. These estimated values range from
-.04 to .11 , which were indeed quite trivial. The
ones with the most signific轟nt cross-group dis
crepancies were the correlated residual OOt洞en
• 102' 特聽教育研究學科
Digit Span and Symbol Search, and the corre
Jated error between Picture Completion and Ob
ject Assembly. Further data examination revealed
th轟t correJation between subtests in each pair was
higher in the clinical s磁ple (r=.55 and .71 re
spectively, both p<.O 1) then it wωin the normal
惡roup (阱.34 and .45 respectively, both p<.OI).
This result showed that more subtest variances
are actually shared in the clinical population, and
it was in accordance with the known fact that
cognitive abi1ities are less correlated within the
hi拚er ability group (Detterm紹,啥~3; Detter翩
您認& Daniel, 1989; Legree, Pif址~ & Grafto刃,
待會6; Lyn袋, 1990; Lynn & C∞per,待好,待會尋;
Spearman, 1927). Sweetland, Reina, and Ta版
(2006) suggested that lower ability children may
have deficits in important central cognitive pro。
esses, and thus tend to operate on a more uniform
lower level.
Further more, it is clear from a review of the
literature that Digit Span and 句:諦。I Search
both require children to form visual i紡車喜悅 util輛
ize visual sc器nning (either physically or me心
tally), have a tolerance for s個紙 and be 詩le to
concentrate; meanwhile, Picture Completion and
Object Assembly both tap childr蚓、 abi1ity to
form wholistic ∞ncepts on pictures with mean
ingful contents (S揖ttler & Dumont, 2004). AlI
those correlated error terms in the clinical s獄時le
might represent some 紡車red compo泣ents which
講re mor告 congruent in the clinical population. It
might also be reasonable to assume th斜, for
cIinical children, once partitioning off the shared
varianc叫 between subtests which could be ex
plained by latent factors identified in this
four-fa佼佼 model, some trivial portion could
remain un-explained. However, given the sub-
t1ety of such differences across groups, we be
lieve that the overa)) main structure for WISC-III
can be fairly ∞nsidered as stable enough across
normal and clinical populations.
Finally, four subte級s were found with dis耐
crep部t e佼佼 V紹給C斜斜ross groups.τhe error
variances for Information, Object Assembly, and
Symbol Search 紹說你t8 were estimated slightly
higher in the clinical group, which revealed that
the corresponding tested intellectual abilities
could be explained more thoroughly by the
four-factor model in the normal children popula
tion. On the contrary, the error variances for Pic
給您 A汀敘1gement subtest was estimated slightJy
higher in the norrr叫車roup, thus suggested 紛紛
this kind of ability could be better explained by
the model in the clinical children popuJation.
Many factors, such as population unique varia
tion or ability level, could contribute to these
discrepant findings, or it could be the interaction
between test contents and population exception
ality. While we remind 紛紛在rchers to be careful
when expl轟inin草 the test result for children in
different population, the current findings surely
deserve further examinations.
Since our primary goal was to get a wholis
tic vi仇N on factorial invariance across groups,
some inevitable 1imitations of the present sωdy
deserve attention. Fi悶, even if the partial mea.糊
認認ment inv器riance approach he1ped us getti姆
拉le maximum information r宅garding the degree
of overall invarian嗨, we realized that the ex輛
ploratory post-hoc approach in identifying possi糊
ble variant parameters could bring the risk of
identification on chance leve l. In this study, be勵sides the statistical MI indicators, we did monitor
each st叩 and try to reason each modification
WISC-III Factorial Invariance Ac紛紛 Normal and Clinical Samples . 103 .
with sensible meaning through the whole model
缸設ing process. Noneth剎車ss, as Vande登berg and
Lance (2000) su草gested, researchers must care
fully consider and examine the thèoretical justi
fication when relaxing constraint器 in practicing
the p制ial invariance study. Second, it is good
that we used large 蠶nd heterogeneous s置mples for
ensuring the stability and accuracy for parameter
estimations in this current study. However, with
such det垂iled model-fitting modifications, we
f嘗aIize that it could be a problem for later
cross-vaIidation. Researchers are encourage哇 to
aware of this Iimitation.
In summary, given the relatively large dat磊
set and substantial number of variations with
which we wer軍 workin臣, the findings were of
substantial import臨ce for understanding the
factorial structure for this frequently used in
strument. Co削弱tent with contemporary r草$能rch
位ndings, the underlying WISC心1 factor struc
ture is 轟ppropriately represented by the proposed
four編factor m。這el comprising VCI, POI, FDI, and
PSI. Moreover, except for 鉛me trivial discrepan
cies on error variances and correlated residuals, majoritie器。f the model parameters were demon
strated invariant across groups. Evidence of
multi-Ievel invariance supported the partial factor
invariance of this instrument across normal and
clinical groups. The m轟in structure and the un
derlying meanin容。f each WISC-III factor are
generaIIy identical for both normal and cIinic蠶i
children populations in Taiwan.
References
American Educatíonal Research As草ociation,
American Psychological Association, Na-
tional Council 0汲 Measurement in Educa
tion. (1仰9). Standards for educationa/ and
psych%gica/ testing. Washington, DC:
AE設A.
Bentler, P. M吋& Bonett, D. G (1 縛。). Signi缸"
cance tests and goodne鉛翩。f-fit in the 組aly郁
的 of cov器riance structures. Psych%gical
Bu//etìn, 88, 588-606.
Bollen, K. A.(1989). Structural eq缸ations with
latent variables. New York:Wiley.
Byrn皂, B. M. (1998). Structure equation mode/
ing with LJSR缸, PREL/.在 and SIMP LlS:
Basic concepts, applicatÎons, ωui progr捌
ming. M軍hwah, NJ: Lawrence Erlbaum.
Byrn章, B. M., Shavelson, R. J., & Muthen, B.
(1 989). TI的ting for the equiv轟lence of factor
covariance and mean structures: The issue
of partial measurement invariance. Psycho
logical Bulletin, 105(3), 456-466.
Byr珊, B. 說叮& Watkins, D. (2003). The issue of
measurement invariance revisited. Journal of
cross-cult甜ra/ psych% gy, 34(2), 155-175.
Camar盔, W,., 1., Nath揖泣, J. 怠, & Puente, A. E.
(2000). Psychological test usage: Implica
tions in professional psychology. Profes鷹
sional Psych%gy: Research and Practice,
31, 141-154.
Chen,鼠, Chan車, C吋& Yang, T. (2004). Study of
intellec泌al patterns of children with Autism.
Bulletin ofS,戶cial Education, 26, 127-15 1.
Chen, H., Lin, K., & Liao, Y. (2005). Study of
WISC-IJI Intellectual Patterns of Mental
Retarded Children. Bulletin of Specia/ Eduω
cation, 28, 97-122.
Chen,日吋& Yan車, T. (2000). Base Rates of
WISιIII diagnostic subtest pa加股s in Tai
wan: Standardization , Learnin草 Disabled
. 104 . 特聽教育研究學刊
and ADHD s紋nples appIied. Psychological
跨'sting, 47(2), 91.110.
Chen, 丘, Zhu, J., & Chen, Y. (2000). The legiti.
macy and utility of the WISC.lIl fac
tor.based indexes: Taiwan standardization
sample applied. Paper presented at the an
nual meeting of the American Educational
Research As車。ciation, New Orleans, LA,
U.S.A.
加快erman, D. K. (1 993). Giftedness and inteIIi都
gence: One and the same? In Ciba Founda
tion 明d.), The origins and de~lopment of
high ability. Wi的t Sussex, England: Wi1ey.
加快erma泊 D. K., & Daniel, M. H.
(19惡的.Correlations of mental tests with
each other and with cognitiv雷 vari揖bles are
highest for low IQ groups. lntellìgence, 13, 349-359.
。onders, J., & Warschaus峙, S. (1 996). A s如lC
tural equation analysis of the WISC-III in
children with traumatic head injury. Chi/d
Neuropsychology. 2, 185-192.
Drasgow, F. (1轉4). Scrutinizing psycholo草ical
tests: Measurement equivalence and
equivalent relations with extemal variables
ar單 central issues. Psychological Bulletin. 好,
134-135.
Drasgow, F. (1總7). Study of the measurement
bias of two st路也rdized psychological tests.
JournaJ of Applied PsychoJo智,刀, 19-29.
G疇。rgas, J., Wi糾紛, L. G., V.部 de Vijver, F. J. R.,
& Saklofs詣, D. 日,但也.) (2003). Culture
and chi/dren ~ inteJJigence: Cross編culturaJ
analysis of the WISC- lIl. S如 Diego多 CA:
Academic Press.
Georgas,J., Van de Vijver, F. J. 仗, Weis囂, L. o.,
& Saklofske, D. 成 (2003). A cross-cultural
analysis of the WISC-III. In J. Georgas, L.
0. We棍, F. J. 鼠. Van de V這ver, & D. H.
Saklofske (Eds.), Culture and children ~ În
teJJigence: Cross-cultural ana伊拉 of the
WISC-lIl (pp. 217-313). San Diego, CA:
Academic Press.
Hom, J. L., & McArdle. (1 992). A practical and
theoretical guì配給 me墨surement invariance
in agin在 research. Experimental Aging Re
search, 18(3), 117仆4ι
Hoyle, R. 鼠, & Panter, A.τ. (1995)龜 Writing
about structural equation mode怡. In R. H.
Hoyle (Ed.), Structural equation modeling:
concepts, issues, and applications. τhou
sand Oaks, CA: Sage.
Hu, L畫, & Bentler, P. M. (1 998). Fit indices ín
covariance structure modeling: 銬的Îtivity to
underpar轟meterized model misspecification.
P秒,chological Methods, 3(4), 424-453.
Hun章, L., Chen, S叮& Chen, H. (2003). The
Study of Intelligence of Secondary Students
with Leamíng Dis的iIities. Journal of Tai
wan Normal Universi妙: Education, 48(2),
215-238.
Jöresk嗯, K. 0., & Sörbom, D. (2006). LlSREL
8.8 statistical programs. Chic睹。: Scientific
Software.
Kaufm詣, A. S. (1 994). lntelligent testing with
the WISC-lll. NY: John Wiley & Sons.
Keith, T. 芯, & Witta,且心(1 997). Hierarchical
andc紛紛 age confirmatory factor 訟alysis
of the WISC-III: What does it measure?
School PsychoJogy Quarter秒,口,妙-107.
Klin章,又. B. (2005). Principles and practice of
structural equation modeling (2“ ed.). NY:
τhe Guilford Pres章建
Kono峙,了.氏, Kush, 1. C吋& Canivez, 0. L.
WISC-III Factoríal Invaríance Across Norrnal and Clínícal S益mples . 105 .
(1 997). Factor replication ofthe WISC-III ín
three independent s在mples of children re“
ceiving speciaI education. Journal of Psy
choeducationa/ Assessment, 15, 123-137.
Legree, P. J., Pifer,結.且,& Gra食on, F. C. (1 996).
Correlations among cognítive abilities are
10wer for higher abiIity groups. Inte l/igence,
23, 45-57。
Lynn, R. (1990). Does Spearma泣's g decJine at
high levels? Some evidence from Scol益nd.
The Journa/ of Genetic Psychology, 153, 229輛230.
Lynn, R., & Cooper, C. (1 993). A secuI直r decline
in Spearm轟n's g in France. Learning and
Individual Differences, 5, 是3-48.
Lyn袋,狡., & Cooper,仁(1 994). A secular decline
in Spearm喜泣's g in Japan. C缸rrent Psychol
ogy Developmental, Learning, Personali.紗,
Social, 13, 3-9.
Marsh, H. W., Balla, 1. 氏, & McDonald, R. P.
(1 988). Goodness-of-fit indexes in confir
matory factor analysís: The effect of sample
size. Psychological Bul/etin, 103, 3會 1-410.
McDon在峙,民 跤, & Ho, M. R. (2002). Principles
車nd practice in reportíng structura1 equ轟ting
ana1yses. Psychological 品袋thods, 7(1),
64-82.
Meredith, W. (1 993). 但easurement invaríanc巴,
factor a泣器lysis and Ð轟ctorial invariance.
Psychometri旬, 58(4), 525勵543.
MiIlsap, R. 且, & Kwok, 0.(2004). Eva1uating
the impact of partial factOI城 invariance on
selection in two populations. Psych%gical
Methods, 9(1), 93-115.
Muthén,紋, & Kaplan, D. (1 985). A compariso泣
of some methodologies for the factor analy總
sis of non-normal Iikert varíables. British
Journal of Mathematical and Statistìcal
Psychology, 38, 17卜 189.
Prifitera, A吋 Saklofske, D. 址, & Wei裕, L. G.
但也.) (2005). WISC-IV c/inical use and in
terpretation: Scientist-Practitioner perspec
tives. Burlíngton, MA: Academíc Press.
Ro仗, D. 丸, We口s, C. 況, & Flaugher, R. L.
(I978). The use of analysís of covariance
structures for comparíng the psychometric
properties of mu1tiple variables acro認
popu1ations. Mu /tivariate Behavioral Re恥
search, 汀, 403-418.
Roid, G. 鼠, Prifitera, A勻& Weíss, L. G (1993).
Replication of the WISC-III f;在佼佼 structure
in an independent sample. Journal of Psy
choeducational Assessment Monograph 晨,
ries. Advances in Psychological Assessment:
F伶chsler Intelligence Scale for Chiι
dren-Third Editìon, 重-2 1.
Roid, G. 此, & Worral, W. (1997). Rep1icatíon of
Wechsler Intelligence Scale for Chil
dren-Third edition four-factor model j琵 the
Canadian normative sampl缸• Psychological
Assessment, 9, 512“ 515.
S車從ler, 1. M. (2001). Assessment of children cog
nitive applications (4th ed). Le 說esa, CA:
Author.
S直設ler, J. M., & Dumo訟, R. (2004). Assessment
of children: WISC-IV and WPPSI-IIl sup
p/ement. Le Mesa, CA: Author.
Shapìro, S. 鼠, & Wilk, M. B. (1965). An anaJysis
。f v訂iance 車5t for normality (complete
時給ples). Biometri缸,泣, 591-61 1.
Spean棚, C. (1 927). The ability ofman. London:
Mac怯ilIan.
Sweetland, 1. D吋 Reina, J. M叮& Tatti, A. F.
(2006). WISC-III verballperforrnance dis.
. 106 . 特殊教育研究筆者j
crep紐cies among a sample of gifted chil
dren. Gi丹'edChild Quarter,紗\ 50(1), 7-10.
Tupa, D. 1., Wrig尬, M. 。門& Frist詞. M. A.
(1997). Confirmatory factor 部alysis of the
WISC-III with child psychiatric inpatients.
Psych%gica/ Assessme1仗" 9(3), 302-306.
Vandenberg, R. J., & Lance, C. E. (2000). A r,聶衛
view and synthesis of the measurement in
variance literature: Suggestions, practices, and recommendations for organiz測ional re
search. Organizationa/ Research Metho街,
3( 1), 4-69. 叫
Wechsler, D. (1 991). Manua/ for the Wechs/er
Intel/igence Sca/e for Children - Third Edi
tion. San Antonio, TX: The Psychological
Corporation.
Wechsler, D. (1 997). M.鏘ua/for the 為iwan ver
sion of the Wechsler Inte /ligence Scale for
Children - Third Edition. Taipei, Taiw部:Chinese beh轟vioral science corporation也
W紛紛, L. (忌, Saklofs峙, D. 此, Schw紋t萃, D.M刊
Prifiter昌, A., & Courville, T. (200的. Ad欄
vanced clinical i治terpret揖tion of WISιIV
index scores. In. L.G. Weiss, D. H. Saklof
sk章, A. Prifiter丸& J. A. Holdnack. (Eds.),
WISC-IV advanced clinica/ ínterpretation
(pp139-179). Burlington, MA: Academic
Press.
收楠日期: 20肘.1 0.1 2
撥斃的期: 2008.03.11
. 107 .
HH G -t 到uw
nLV HM AU
n乞
旬,.
-HM
<U
L針,
za
ni--帥、句‘d
nyny p3
, fη
oi
、
-a
句、d
-HH弓‘Jv
aEa. ha i
]心似許許心利FM
讓灣一般兒車與臨床兒童之魏氏兒童智力量表第三版(WISC-III)因素恆等性研究
嗓心,拾
金灣師範大學特教是教授
朱建軍美鑽心理公司主任
洪儷瑜
發灣師範大學特教系教授
張正芬臺灣師範大學特教系教授
楊宗仁*北教育大學特教系副教授
本研究主要自鷗在樹驗魏氏兒童智力量表第三版(WISC-III)在一般兒黨與特殊
兒黨組群聞之區紫值等性。文中接據 1 , 100 名 WISC-III 棋準化標本及 1150 名臨床
兒童標本黨腎,灑馬結構方程模式進行多樣本驗麓性囡黨分析.本研究以 WISC-III 12
劉東要分割驗與四個潛在留素架構為假設轟準模盤,分三階段完成橡蟻。階段一先
分韓三組進符單標本驗證性盟黨分軒, j;J.建立讓當之義準困難模式。階段二則聽立
睛段性主牽涉設限的巢賽棋盤,避…轅鞍因素架構、厲素負荷蠱、與殘差變異數之蹄
組對權等館設。研究發現除了少處都聽之殘競賽異或共變盤空投異持緝大多數之幢驗
蓮香數持真榜樣本罷等特性· WISC-I日在一控克黨與特殊兒童輯群間真有相間之自因
素架構與聞業負荷釁 e 整體面言,研究輯果支持鄧份因素盤等哇,台灣臨床兒童與
一般兒寞之 WISC-III 分數是具有輯同之解釋議議。
關攤詢:特殊兒童、因素盟等桂、多樣本驗蠶住國黨分析、 WISC-III