Classroom-based learning diagnostic model for teacher-made tests: Program and applications


Tsai-Wei Huang (1), Pei-Chen Wu (2)

(1) National Chiayi University, Taiwan
(2) National Ping Tung University of Education, Taiwan

2013/01/15

Why is a teacher-made test necessary?

– Confirmation of instructional objectives
– Formative assessment for learning outcomes
– Integration between instruction and learning

What do students need from tests?

– Feedback on their progression
– Feedback on concepts mastered or misconceptions revealed
– Feedback on their detected cognitive styles

What do teachers need from tests?

– Feedback on item/test quality
– Feedback from students’ outcomes
– Promotion of item construction skills
– Promotion of teaching effectiveness
– Diagnostic information on students’ learning

Aberrance index

Detects students’ (or items’) aberrant response patterns based on a theoretical model, e.g., the Guttman ideal response model.

– Person-fit index
– Item-fit index

What can an aberrance index do?

– Reflect each student’s (or item’s) aberrant responses
– Provide rich diagnostic information about misconceptions and psychological response patterns during the learning process
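As a minimal sketch of what "aberrant relative to a Guttman ideal response pattern" means: under the Guttman model, a person with total score T answers the T easiest items correctly and the rest incorrectly. The helper below (the name `guttman_departures` is hypothetical, not part of any program described here) lists where an observed pattern departs from that ideal, using the responses of person #6 from the S-P chart illustration, with items already sorted from easiest to hardest.

```python
def guttman_departures(responses):
    """Compare a 0/1 response vector (items sorted easiest to hardest)
    with the Guttman ideal pattern for the same total score."""
    total = sum(responses)
    # Ideal pattern: the `total` easiest items right, all harder items wrong.
    ideal = [1] * total + [0] * (len(responses) - total)
    # 1-based positions (in sorted order) of unexpected errors and successes.
    within_wrong = [j + 1 for j, u in enumerate(responses) if j < total and u == 0]
    beyond_right = [j + 1 for j, u in enumerate(responses) if j >= total and u == 1]
    return ideal, within_wrong, beyond_right


# Person #6 from the S-P chart illustration (total score 12 on 15 items).
person6 = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1]
ideal, within_wrong, beyond_right = guttman_departures(person6)
print(within_wrong)  # [9, 10, 12]: easy items answered incorrectly
print(beyond_right)  # [13, 14, 15]: hard items answered correctly
```

The positions refer to the sorted column order of the chart, not to the original item numbers.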

S-P chart

p = proportion of correct responses per item, q = 1 - p, Q = Nq; T = total score, t = T/K, s = 1 - t.

Item #               14    2    3    9   15    6    7    1    5   13   11   12    8    4   10
p                   .80  .70  .70  .60  .60  .60  .50  .50  .40  .40  .30  .20  .20  .20  .10
q                   .20  .30  .30  .40  .40  .40  .50  .50  .60  .60  .70  .80  .80  .80  .90
Q = Nq                2    3    3    4    4    4    5    5    6    6    7    8    8    8    9

ID     T    t    s
#6    12  .80  .20    1    1    1    1    1    1    1    1    0    0    1    0    1    1    1
#3    11  .73  .27    1    1    1    1    1    1    1    1    1    1    1    0    0    0    0
#9    10  .67  .33    1    1    0    1    1    1    0    1    1    1    0    1    0    0    0
#4     9  .60  .40    1    1    0    1    1    1    1    1    0    0    0    0    1    0    0
#1     8  .53  .47    1    1    1    1    0    0    1    0    1    0    1    0    0    1    0
#5     7  .47  .53    0    0    1    0    1    1    1    0    1    1    0    1    0    0    0
#2     6  .40  .60    0    1    1    1    1    1    0    1    0    0    0    0    0    0    0
#8     4  .27  .73    1    1    1    0    0    0    0    0    0    1    0    0    0    0    0
#7     2  .13  .87    1    0    1    0    0    0    0    0    0    0    0    0    0    0    0
#10    1  .07  .93    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0

Illustration of PERSON/ITEM Aberrant Response Patterns (N=10 persons, K=15 Items).
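The layout of an S-P chart can be sketched in a few lines: persons are sorted from highest to lowest total score and items from easiest to hardest. The function name `sp_chart`, the returned keys, and the tiny score matrix below are illustrative assumptions, not output of the WBStar program.

```python
def sp_chart(u):
    """Sort a 0/1 score matrix (rows = persons, columns = items) into S-P order."""
    n_persons, n_items = len(u), len(u[0])
    totals = [sum(row) for row in u]                                  # T per person
    correct = [sum(row[j] for row in u) for j in range(n_items)]      # right answers per item
    # Persons: highest total first; items: most often correct first.
    # sorted() is stable, so ties keep their original order.
    person_order = sorted(range(n_persons), key=lambda i: -totals[i])
    item_order = sorted(range(n_items), key=lambda j: -correct[j])
    return {
        "person_order": person_order,
        "item_order": item_order,
        "chart": [[u[i][j] for j in item_order] for i in person_order],
        "T": [totals[i] for i in person_order],
        "t": [totals[i] / n_items for i in person_order],
        "p": [correct[j] / n_persons for j in item_order],
        "Q": [n_persons - correct[j] for j in item_order],            # Q = N * q
    }


# Hypothetical data: 4 persons x 3 items.
scores = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
result = sp_chart(scores)
for total, row in zip(result["T"], result["chart"]):
    print(total, row)
```

Indices in `person_order` and `item_order` are 0-based positions in the input matrix.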

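Rationale: BW family indices, person facet

– W (carelessness)
– B (guessing)
– C (capability)
– M (misconception)

Here u_{ij} is the scored response of person i on item j (1 = correct, 0 = incorrect), K is the number of items, T_i is the total score of person i, and q_j is the proportion of incorrect responses to item j, with items indexed from easiest to hardest as in the S-P chart.

\[ W_i^* = \frac{\sum_{j=1}^{T_i} (1 - u_{ij})\,(q_{T_i} - q_j)}{[(K+1)/2]} \]

\[ B_i^* = \frac{\sum_{j=T_i+1}^{K} u_{ij}\,(q_j - q_{T_i})}{[(K+1)/2]} \]

\[ M_i^* = \frac{\sum_{j=T_i+1}^{K} (1 - u_{ij})\,(q_j - q_{T_i})}{[(K+1)/2]} \]

\[ C_i^* = \frac{\sum_{j=1}^{T_i} u_{ij}\,(q_{T_i} - q_j)}{[(K+1)/2]} \]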
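A minimal sketch of how four weighted person-facet components of this kind (W, B, C, M) could be computed for one response vector. It rests on several assumptions that should not be read as the authors' definition or as the WBStar implementation: items are sorted from easiest to hardest; each response is weighted by the distance between q_j and the q of the item at position T_i; the normalizer [(K+1)/2] is taken as an integer floor; and a total score of zero uses a reference q of 0. The function name `person_components` is hypothetical.

```python
def person_components(u, q):
    """Weighted within-/beyond-ability components for one person.

    u: 0/1 responses, q: proportion incorrect per item,
    both ordered from easiest to hardest item.
    """
    K = len(u)
    T = sum(u)
    # Assumption: reference difficulty is q at position T (0.0 if T == 0).
    q_ref = q[T - 1] if T > 0 else 0.0
    # Assumption: [(K+1)/2] read as floor((K+1)/2).
    norm = (K + 1) // 2
    within, beyond = range(T), range(T, K)
    return {
        "W": sum((1 - u[j]) * (q_ref - q[j]) for j in within) / norm,  # easy items missed
        "C": sum(u[j] * (q_ref - q[j]) for j in within) / norm,        # easy items answered
        "B": sum(u[j] * (q[j] - q_ref) for j in beyond) / norm,        # hard items answered
        "M": sum((1 - u[j]) * (q[j] - q_ref) for j in beyond) / norm,  # hard items missed
    }


# Person #6 and the q row from the S-P chart illustration.
q = [.20, .30, .30, .40, .40, .40, .50, .50, .60, .60, .70, .80, .80, .80, .90]
person6 = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1]
components = person_components(person6, q)
print({name: round(value, 4) for name, value in components.items()})
```

Under these assumptions, person #6 shows nonzero W and B components, matching the unexpected errors on easy items and successes on hard items visible in the chart.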

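Rationale: BW family indices, item facet

– W (hint)
– B (disturbance)
– C (difficulty)
– M (misfit)

Here N is the number of persons, Q_j = N q_j is the number of incorrect responses to item j, and t_i is the proportion of correct responses of person i.

\[ w_j^* = \frac{\sum_{i=1}^{Q_j} u_{ij}\,(t_{Q_j} - t_i)}{[(N+1)/2]} \]

\[ b_j^* = \frac{\sum_{i=Q_j+1}^{N} (1 - u_{ij})\,(t_i - t_{Q_j})}{[(N+1)/2]} \]

\[ c_j^* = \frac{\sum_{i=1}^{Q_j} (1 - u_{ij})\,(t_{Q_j} - t_i)}{[(N+1)/2]} \]

\[ m_j^* = \frac{\sum_{i=Q_j+1}^{N} u_{ij}\,(t_i - t_{Q_j})}{[(N+1)/2]} \]

The misfit index is also shown multiplied by 100:

\[ M_j^* = \frac{\sum_{i=Q_j+1}^{N} u_{ij}\,(t_i - t_{Q_j})}{[(N+1)/2]} \times 100 \]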

BW learning diagnostic model

BW model

– Person overall ability power P_{i.}: defined from the person indices C_i, W_i, B_i, M_i
– Item overall difficulty power P_{.j}: defined from the item indices c_j, w_j, b_j, m_j
– Response probability P_{ij}: defined from P_{i.} and P_{.j}, with exponents u_{ij} and 1 - u_{ij}

[Equations for P_{ij}, P_{i.} and P_{.j}]

WBStar program

Design ideas

Functions: Person aspect

– Able to estimate the probabilities of
   – items correctly answered
   – concepts mastered
   – teaching objectives reached
– Able to classify students into appropriate classes of psychological cognition status for diagnosis under a given criterion (e.g., alpha = .05)

Functions: Item aspect

– Able to provide item information about
   – Difficulty
   – Discrimination
   – Option distraction
   – Hint
   – Disturbance
   – Misfit
– Able to examine an item’s functioning by classification under a given criterion (e.g., alpha = .05)

Program interface

[Screenshot: WBStar program]

S-P chart

[Screenshot: S-P chart in the program]

Example 1: Fraction and decimal

Data

– 32 students in a 4th-grade elementary mathematics class in Taiwan.
– Teacher-made test (22 items) covering five concepts of fractions and decimals:
   – transforming decimal into fraction (TDF)
   – equivalent fraction (EF)
   – comparing fraction with decimal (CFD)
   – transforming fraction into decimal (TFD)
   – unit transformation (UT)

Q matrix of items on concepts, judged by 2 teachers

Classification design

Item analysis: high w value (11.36)

Irrelevant hint: 160 4/5 > 1.605!

How about 160 2/5?

Item analysis: high b value (4.83)

The student with ID 8 gave a correct answer of 68 for q2.

Subsequently, he gave q3 an answer of 68/1000.

Concepts mastery

Aberrance patterns (H1, H3)

Aberrance patterns (A1, A1’, A2, A4)

Aberrance patterns (L1, L2, L3, L4)

Example 2: Two-tier acid-base test

– Examining the effects of classifying response patterns by the BWCM indices
– 67 sixth-grade students (41 boys, 61.19%; 26 girls, 38.81%) in Taiwan

Four cognitive styles of the two-tier items

2-tier item               REASON (know-how)
ANSWER (know-what)        Right            Wrong
Right                     True knowing     Guessing
Wrong                     Careless         Unknown
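The two-tier table can be expressed as a small lookup, shown here as an illustrative sketch. The function name `two_tier_style` and the boolean encoding of right/wrong are assumptions made for the example.

```python
# (answer tier right?, reason tier right?) -> cognitive style from the table
TWO_TIER_STYLES = {
    (True, True): "True knowing",
    (True, False): "Guessing",
    (False, True): "Careless",
    (False, False): "Unknown",
}


def two_tier_style(answer_right: bool, reason_right: bool) -> str:
    """Classify one two-tier item response into one of the four styles."""
    return TWO_TIER_STYLES[(answer_right, reason_right)]


# A right answer (know-what) with a wrong reason (know-how) counts as guessing.
print(two_tier_style(True, False))  # Guessing
```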

Questions

– Q1: How can the clusters classified by the BW indices be verified?
– Q2: What characteristics do the classified clusters reveal?
– Q3: What characteristics change for individuals between the know-what test and the know-how test?

Results: Classification

Hit rates          2 groups    3 groups
DA    K-mean       98.5%       91.5%
      Ward         97.1%       95.6%
MLR   K-mean       100%        97.0%
      Ward         100%        100%
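A hit rate of this kind is usually the percentage of cases whose cluster membership is reproduced by the verifying classifier. The sketch below assumes that definition; the labels are made up for illustration and are not the study's data.

```python
def hit_rate(cluster_labels, predicted_labels):
    """Percentage of cases where the predicted group equals the cluster label."""
    if len(cluster_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    hits = sum(c == p for c, p in zip(cluster_labels, predicted_labels))
    return 100.0 * hits / len(cluster_labels)


# Hypothetical example: 67 students, one of whom is assigned to the other group.
clusters = [1] * 40 + [2] * 27
predicted = list(clusters)
predicted[0] = 2
print(round(hit_rate(clusters, predicted), 1))  # 98.5
```

With 67 students, 66 agreements correspond to 98.5%, and full agreement to 100%.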

Properties of 2 clusters

[Figure: bar charts of the M, C, B and W indices for Groups 1 and 2, one panel for the K-mean solution and one for the Ward solution; axis from 0 to 30]

– Group of low misconception with aberrant responses
– Group of high competence with normal responses

3 clusters

[Figure: bar charts of the M, C, B and W indices for Groups 1, 2 and 3, one panel for the K-mean solution and one for the Ward solution; axis from 0 to 40]

– Middle competence group with complex aberrant responses
– High competence group with normal responses
– Low competence group with guessing aberrant responses

Change between know-what (ANSWER) and know-how (REASON)

[Figure: profiles of ZW, ZB, ZC and ZM on the ANSWER and REASON tiers for Case 50, Case 60 and Case 12]

Pattern labels: H1 H1, H1 A3, L1’ H1

Recommendations

– The BW model can provide sufficient and well-matched interpretive information about an original response data matrix in both the person and item facets, especially in small-sample situations such as students taking a teacher-made test in class.
– Teachers can use this information to improve their test development skills and to obtain immediate diagnostic feedback from their students’ response patterns.
– The BW model can indicate the extent to which students have mastered concepts and achieved objectives, so that teachers can go on to design remedial instruction.

Future outlook

– The BW model in this study is currently suitable only for dichotomously scored items.
– A polytomous scoring system is under development.

Related articles and resources

– Huang, T. W. (2008). A study of cutoffs for aberrant indices under different data structures. The Journal of Guidance and Counseling, 30, 1-16. (in Chinese)
– Huang, T. W. (2011). Establishing and examining the diagnostic space of two new developed person-fit indices: The W* and the B* indices. Psychological Testing, 58(1), 1-27. (in Chinese)
– Huang, T. W. (2011). Robustness of BW aberrance indices against test length. Knowledge Management & E-Learning: An International Journal, 3, 310-318.
– Huang, T. W. (2012). Aberrance detection powers of the BW and person-fit indices. Journal of Educational Technology & Society, 15(1), 28-37.
– Huang, T. W., & Wu, P. C. (in press). A classroom-based cognitive diagnostic model for teacher-made tests: An example with fractions and decimals. Journal of Educational Technology & Society.
– Huang, T. W. (2007, July). Establishing cutoffs for two new aberrance indices: The Within-Ability-Concern Index and the Beyond-Ability-Surprise Index. Paper presented at the annual meeting of the International Conference on the Teaching of Psychology, Vancouver, Canada.
– Huang, T. W. (2012, July). Verifying the effects of item-fit indices on gender-related differential item functioning. Paper presented at the XXX International Congress of Psychology, Cape Town, South Africa.
– Huang, T. W., & Tsai, S. W. (2012, August). Examining the effect of classifying response patterns on an acid-base test by a Guttman-based person-fit index set. Paper presented at the 1st International Conference on Education Measurement and Evaluation 2012 (ICEME2012), Manila, Philippines.

The WBStar program is now available in a Chinese version at http://140.130.46.9. The English version is not yet complete, but is coming soon.

Thanks for your attention!
