學科能力測驗非選擇題閱卷一致性之探討
The Rating Consistency of General Scholastic Ability Test
Ming-Chiu Chang
College Entrance Examination Center
Abstract
The rating consistency of the General Scholastic Ability Test (GSAT) has long
been a matter of public concern. However, research on the rating consistency of
its non-multiple-choice items remains scarce, even though the GSAT is a
large-scale, high-stakes exam. This study examines the rating consistency of the
GSAT using two approaches: generalizability theory and rating consensus. The
results show that the GSAT, as a large-scale, high-stakes exam, has high rating
consistency. The paper thus provides evidence of the GSAT's good rating
consistency and offers a reference for monitoring and improving the rating
consistency of high-stakes non-multiple-choice items.
Keywords: General Scholastic Ability Test, Rating Consistency
Ming-Chiu Chang, Staff Member, College Entrance Examination Center
consistency stability
split-half
Kuder-Richardson Cronbach α
Pearson
Spearman
Kendall coefficient of concordance (Brown, Glasswell, & Harland, 2004)
The New Standards Project, CRESST Vermont
ETS
40%~60%
80%~100% 0.7~0.8
(Novak, Herman, & Gearhart, 1996)
Brown et al., 2004; Novak et al., 1996
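Generalizability theory (G theory; Brennan, 2001)
In classical test theory, an observed score X is the sum of a true score T and an undifferentiated random error E: $X = T + E$. G theory works instead with a universe score and uses ANOVA to separate the error into distinct sources (Brennan, 2001).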
facet
admissible condition of measurement
(Brennan, 2001; Shavelson & Webb, 1991)
Brennan, 2001
Brennan, 2001
1
In the p × i × r design, p denotes the objects of measurement, and i and r are the facets.
n m
p I R 2
(Brennan, 2001; Shavelson & Webb, 1991)
fixed facet random facet
G
Shavelson & Webb, 1991
25 25 25
Shavelson & Webb, 1991
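Generalizability coefficient:

$$E\hat{\rho}^2 = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_\delta}$$

where $\hat{\sigma}^2_p$ is the estimated universe-score variance and $\hat{\sigma}^2_\delta$ is the estimated relative error variance.

Index of dependability:

$$\hat{\Phi} = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_\Delta}$$

where $\hat{\sigma}^2_\Delta$ is the estimated absolute error variance.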
Responses are graded at three levels, A, B, and C, each divided into three sub-levels: A+, A, A-; B+, B, B-; C+, C, C-.
0
2010
2012
4 0.5
1
2 99 100
2
9 18 27 >2 >5 >8
8 20 -- >2 >5 --
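For the p × i × r design, the observed score $X_{pir}$ decomposes as

$$
\begin{aligned}
X_{pir} = {} & \mu \\
& + (\mu_p - \mu) \\
& + (\mu_i - \mu) \\
& + (\mu_r - \mu) \\
& + (\mu_{pi} - \mu_p - \mu_i + \mu) \\
& + (\mu_{pr} - \mu_p - \mu_r + \mu) \\
& + (\mu_{ir} - \mu_i - \mu_r + \mu) \\
& + (X_{pir} - \mu_{pi} - \mu_{pr} - \mu_{ir} + \mu_p + \mu_i + \mu_r - \mu).
\end{aligned}
$$

Every effect other than $\mu$ has an expected value of 0, so the variance of $X_{pir}$ is

$$\sigma^2(X_{pir}) = \sigma^2_p + \sigma^2_i + \sigma^2_r + \sigma^2_{pi} + \sigma^2_{pr} + \sigma^2_{ir} + \sigma^2_{pir,e},$$

where $\sigma^2_{pir,e}$ is the residual variance.

Table 3. p × i × r

Source      SS              df                              MS              EMS
p           $SS_p$          $n_p - 1$                       $MS_p$          $\sigma^2_{pir,e} + n_r\sigma^2_{pi} + n_i\sigma^2_{pr} + n_i n_r\sigma^2_p$
i           $SS_i$          $n_i - 1$                       $MS_i$          $\sigma^2_{pir,e} + n_r\sigma^2_{pi} + n_p\sigma^2_{ir} + n_p n_r\sigma^2_i$
r           $SS_r$          $n_r - 1$                       $MS_r$          $\sigma^2_{pir,e} + n_i\sigma^2_{pr} + n_p\sigma^2_{ir} + n_p n_i\sigma^2_r$
p × i       $SS_{pi}$       $(n_p - 1)(n_i - 1)$            $MS_{pi}$       $\sigma^2_{pir,e} + n_r\sigma^2_{pi}$
p × r       $SS_{pr}$       $(n_p - 1)(n_r - 1)$            $MS_{pr}$       $\sigma^2_{pir,e} + n_i\sigma^2_{pr}$
i × r       $SS_{ir}$       $(n_i - 1)(n_r - 1)$            $MS_{ir}$       $\sigma^2_{pir,e} + n_p\sigma^2_{ir}$
p × i × r   $SS_{pir,e}$    $(n_p - 1)(n_i - 1)(n_r - 1)$   $MS_{pir,e}$    $\sigma^2_{pir,e}$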
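The expected mean squares of the p × i × r design can be inverted to obtain the variance components. The sketch below is only an illustration of that algebra, not the paper's GENOVA procedure: the function names, the dictionary keys, and the demonstration values are hypothetical, and truncating negative estimates to zero is an assumed convention.

```python
def expected_mean_squares(var, n_p, n_i, n_r):
    """Expected mean squares of a fully crossed p x i x r random-effects design."""
    e = var["pir,e"]
    return {
        "p": e + n_r * var["pi"] + n_i * var["pr"] + n_i * n_r * var["p"],
        "i": e + n_r * var["pi"] + n_p * var["ir"] + n_p * n_r * var["i"],
        "r": e + n_i * var["pr"] + n_p * var["ir"] + n_p * n_i * var["r"],
        "pi": e + n_r * var["pi"],
        "pr": e + n_i * var["pr"],
        "ir": e + n_p * var["ir"],
        "pir,e": e,
    }


def variance_components(ms, n_p, n_i, n_r):
    """Solve the expected-mean-square equations for the seven variance components."""
    e = ms["pir,e"]
    var = {
        "p": (ms["p"] - ms["pi"] - ms["pr"] + e) / (n_i * n_r),
        "i": (ms["i"] - ms["pi"] - ms["ir"] + e) / (n_p * n_r),
        "r": (ms["r"] - ms["pr"] - ms["ir"] + e) / (n_p * n_i),
        "pi": (ms["pi"] - e) / n_r,
        "pr": (ms["pr"] - e) / n_i,
        "ir": (ms["ir"] - e) / n_p,
        "pir,e": e,
    }
    # Assumption: negative estimates are set to zero.
    return {effect: max(value, 0.0) for effect, value in var.items()}


# Hypothetical components, used only to check that the two functions are inverses.
demo_var = {"p": 0.05, "i": 0.01, "r": 0.002, "pi": 0.007,
            "pr": 0.002, "ir": 0.001, "pir,e": 0.006}
demo_ms = expected_mean_squares(demo_var, n_p=100, n_i=3, n_r=2)
recovered = variance_components(demo_ms, n_p=100, n_i=3, n_r=2)
```

With real data, the mean squares would come from the ANOVA sums of squares divided by their degrees of freedom.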
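For the p × I × R design, the generalizability coefficient $E\hat{\rho}^2$ (G coefficient) and the index of dependability $\hat{\Phi}$ (D coefficient) are obtained from the estimated variance components and the sample sizes $n_i$ and $n_r$.

$$\hat{\sigma}^2_\delta = \hat{\sigma}^2_{pI} + \hat{\sigma}^2_{pR} + \hat{\sigma}^2_{pIR} = \frac{\hat{\sigma}^2_{pi}}{n_i} + \frac{\hat{\sigma}^2_{pr}}{n_r} + \frac{\hat{\sigma}^2_{pir,e}}{n_i n_r},$$

$$E\hat{\rho}^2 = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_\delta} = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \dfrac{\hat{\sigma}^2_{pi}}{n_i} + \dfrac{\hat{\sigma}^2_{pr}}{n_r} + \dfrac{\hat{\sigma}^2_{pir,e}}{n_i n_r}},$$

$$\hat{\sigma}^2_\Delta = \hat{\sigma}^2_I + \hat{\sigma}^2_R + \hat{\sigma}^2_{pI} + \hat{\sigma}^2_{pR} + \hat{\sigma}^2_{IR} + \hat{\sigma}^2_{pIR} = \frac{\hat{\sigma}^2_i}{n_i} + \frac{\hat{\sigma}^2_r}{n_r} + \frac{\hat{\sigma}^2_{pi}}{n_i} + \frac{\hat{\sigma}^2_{pr}}{n_r} + \frac{\hat{\sigma}^2_{ir}}{n_i n_r} + \frac{\hat{\sigma}^2_{pir,e}}{n_i n_r},$$

$$\hat{\Phi} = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_\Delta} = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \dfrac{\hat{\sigma}^2_i}{n_i} + \dfrac{\hat{\sigma}^2_r}{n_r} + \dfrac{\hat{\sigma}^2_{pi}}{n_i} + \dfrac{\hat{\sigma}^2_{pr}}{n_r} + \dfrac{\hat{\sigma}^2_{ir}}{n_i n_r} + \dfrac{\hat{\sigma}^2_{pir,e}}{n_i n_r}}.$$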
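As a rough check of the D-study formulas, the sketch below evaluates both coefficients from a set of variance components. The function name and dictionary keys are hypothetical; the input values are the year-99 estimates that the paper reports in Table 8, with two items and two raters assumed from that table's degrees of freedom.

```python
def d_study(var, n_i, n_r):
    """Return (G coefficient, index of dependability) for a p x I x R D study."""
    # Relative error variance: interactions that involve persons.
    rel_error = var["pi"] / n_i + var["pr"] / n_r + var["pir,e"] / (n_i * n_r)
    # Absolute error variance adds the item, rater, and item-by-rater effects.
    abs_error = rel_error + var["i"] / n_i + var["r"] / n_r + var["ir"] / (n_i * n_r)
    g_coefficient = var["p"] / (var["p"] + rel_error)
    phi = var["p"] / (var["p"] + abs_error)
    return g_coefficient, phi


# Year-99 estimates as reported in Table 8 of the paper.
table8_y99 = {"p": 0.0492, "i": 0.0007, "r": 0.0000, "pi": 0.0070,
              "pr": 0.0022, "ir": 0.0000, "pir,e": 0.0063}

g, phi = d_study(table8_y99, n_i=2, n_r=2)
print(round(g, 2), round(phi, 2))  # prints 0.89 0.88
```

Because the index of dependability also counts the item and rater main effects as error, it can never exceed the generalizability coefficient for the same design.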
2010
10 12
10 12 4
14 16
14 16 10 10
2 4 2 25
GENOVA (Crick & Brennan, 1983)
4 99 100
Brown 2004 40% 60% Brown
9
0 28
Brown
4 99
50.3% 25.4%
8 20 100
99
90%
Table 4

99
26.6%   32.9%   33.6%   50.3%   25.4%
61.7%   58.7%   61.0%   48.4%   71.0%
11.7%    8.4%    5.4%    1.3%    3.5%

100
27.8%   19.3%   11.7%   46.7%   23.3%
63.4%   74.1%   85.0%   51.3%   72.8%
 8.8%    6.6%    3.4%    2.0%    4.0%
Pearson
r 5 100
r=0.689 0.7
Brown 2004 0.7 0.8
0.9
Table 5. Pearson's r

99     .746   .846   .778   .901   .847
100    .689   .809   .706   .916   .881
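The two consensus statistics reported in Tables 4 and 5, agreement percentages and Pearson's r between raters, can be computed as in the sketch below. Everything in it is illustrative: the score lists and the tolerance are hypothetical, and the three-way split into exact, within-tolerance, and beyond-tolerance pairs is an assumption suggested by the three rows of Table 4, not a definition taken from the paper.

```python
from math import sqrt


def agreement_rates(first, second, tolerance):
    """Shares of rating pairs that match exactly, differ within the tolerance, or exceed it."""
    gaps = [abs(a - b) for a, b in zip(first, second)]
    n = len(gaps)
    exact = sum(gap == 0 for gap in gaps) / n
    within = sum(0 < gap <= tolerance for gap in gaps) / n
    beyond = sum(gap > tolerance for gap in gaps) / n
    return exact, within, beyond


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores from two raters for the same eight responses.
rater_1 = [5, 6, 4, 7, 3, 8, 5, 6]
rater_2 = [5, 7, 4, 5, 3, 8, 9, 6]

rates = agreement_rates(rater_1, rater_2, tolerance=2)
r = pearson_r(rater_1, rater_2)
```

The two statistics answer different questions: agreement rates measure how often raters give the same or nearly the same score, while Pearson's r only measures whether they rank responses similarly.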
6
100
99
7 8
p i r
Table 6

99      100     99      100
5.18    5.23    1.48    1.48
 .58     .58     .16     .16
7.66    7.31    3.27    3.67
 .43     .41     .18     .20
11.66   12.28   4.37    4.12
 .43     .45     .16     .15
3.09    4.16    2.10    2.37
 .39     .52     .26     .30
6.89    8.03    4.55    4.66
 .34     .40     .23     .23
7 p i r
p
p i r 0
2
i r 0 2
p i 28%
8
70%
2 0
i r 0
Table 7. p × i × r

              99                             100
Source        Variance   df       %          Variance   df       %
p             .0158      19999    39.74%     .0140      19999    32.09%
i             .0051      2        12.85%     .0082      2        18.70%
r             .0000      1        0%         .0000      1        0%
p × i         .0112      39998    28.07%     .0129      39998    29.42%
p × r         .0015      19999    3.78%      .0018      19999    4.08%
i × r         .0000      2        0%         .0000      2        0%
p × i × r     .0062      39998    15.56%     .0069      39998    15.71%
Table 8. p × i × r

              99                             100
Source        Variance   df       %          Variance   df       %
p             .0492      19999    75.23%     .0594      19999    72.79%
i             .0007      1        1.08%      .0070      1        8.58%
r             .0000      1        0%         .0000      1        0%
p × i         .0070      19999    10.70%     .0089      19999    10.91%
p × r         .0022      19999    3.36%      .0012      19999    1.47%
i × r         .0000      1        0%         .0000      1        0%
p × i × r     .0063      19999    9.63%      .0051      19999    6.25%
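The percentage columns in Tables 7 and 8 express each variance component as a share of the total variance. A minimal sketch of that arithmetic follows, using the year-99 estimates from Table 7; the function name and dictionary keys are hypothetical, and small differences from the published percentages are to be expected because the published components are rounded to four decimals.

```python
def variance_shares(var):
    """Express each variance component as a percentage of the total variance."""
    total = sum(var.values())
    return {effect: 100 * value / total for effect, value in var.items()}


# Year-99 estimates as reported in Table 7 of the paper.
table7_y99 = {"p": 0.0158, "i": 0.0051, "r": 0.0000, "pi": 0.0112,
              "pr": 0.0015, "ir": 0.0000, "pir,e": 0.0062}

shares = variance_shares(table7_y99)
for effect, share in shares.items():
    print(f"{effect:6s} {share:6.2f}%")
```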
4 5
2 3 99 0.71
100 0.69
.023-.050
99 3
5 100 3 6
Shavelson Webb 1991 0.80
Figure 4. 99 p × I × R (series: G-2, G-3, D-2, D-3; horizontal axis: 3 to 6; vertical axis: 0.60 to 0.85)
Figure 5. 100 p × I × R (series: G-2, G-3, D-2, D-3; horizontal axis: 3 to 6; vertical axis: 0.60 to 0.85)
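Curves like those for year 100 can be approximated by sweeping the D-study sample sizes over the Table 7 estimates. The sketch below does this for the generalizability coefficient only. The function name, the dictionary keys, and the reading of the horizontal axis as the number of items (3 to 6) with 2 or 3 raters are assumptions; results may differ from the plotted values in the second decimal because the published components are rounded.

```python
def g_coefficient(var, n_i, n_r):
    """Generalizability coefficient of a p x I x R D study."""
    rel_error = var["pi"] / n_i + var["pr"] / n_r + var["pir,e"] / (n_i * n_r)
    return var["p"] / (var["p"] + rel_error)


# Year-100 estimates as reported in Table 7 of the paper.
table7_y100 = {"p": 0.0140, "i": 0.0082, "r": 0.0000, "pi": 0.0129,
               "pr": 0.0018, "ir": 0.0000, "pir,e": 0.0069}

# Evaluate every combination of 3 to 6 items and 2 or 3 raters.
grid = {(n_i, n_r): g_coefficient(table7_y100, n_i, n_r)
        for n_i in range(3, 7) for n_r in (2, 3)}

# Designs that reach the 0.80 level cited from Shavelson and Webb (1991).
reaching_target = sorted(design for design, g in grid.items() if g >= 0.80)

for (n_i, n_r), g in sorted(grid.items()):
    print(f"items={n_i} raters={n_r} G={g:.3f}")
```

Within this grid, adding items raises the coefficient faster than adding raters, because the person-by-item component is much larger than the person-by-rater component.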
6 7
99
0.89 100 0.90
2 1
Shavelson Webb 1991 .80
Figure 6. 99 p × I × R (series: G-1, G-2, D-1, D-2; horizontal axis: 1 to 3; vertical axis: 0.70 to 0.95)
Figure 7. 100 p × I × R (series: G-1, G-2, D-1, D-2; horizontal axis: 1 to 3; vertical axis: 0.70 to 0.95)
9
99
.540 -- .431 .519
100 .478 -- .418 .575
99 .839 --
100 .866 --
p i r
0
2 1 0.80
p i r
p i
3 6
0.80
p i 28% 10%
2008 9
2011
2012
2010
2010 NAER
-
2008
4 161-186
2004
337 368
Brennan, R. L. (2001). Generalizability theory. New York: Springer.
Brown, G. T. L., Glasswell, K., & Harland, D. (2004). Accuracy in the scoring of writing:
Studies of reliability and validity using a New Zealand writing assessment system.
Assessing Writing, 9, 105-121.
Crick, J. E., & Brennan, R. L. (1983). Manual for GENOVA: A generalized analysis of
variance system (American College Testing Technical Bulletin No. 43). Iowa City, IA:
ACT, Inc.
Novak, J. R., Herman, J. L., & Gearhart, M. (1996). Establishing validity for
performance-based assessments: An illustration for collections of student writing. The
Journal of Educational Research, 89(4), 220-233.
Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Newbury Park:
SAGE.
Scoring rubric
Levels: A (A+, A, A-), B (B+, B, B-), C (C+, C, C-)
Four criteria, weighted 40%, 20%, 20%, and 20%; each criterion lists five descriptors (1. to 5.) for each of the levels A, B, and C.
2012