Research on Assessment: 21st Century Paradigm Shifts
◦ NCLB calls for teachers and teacher candidates to be accountable for student learning, citing research on the connection of student learning/achievement to teacher preparation
◦ Widening achievement gaps
◦ Older Models of Assessment (R.J. Dietel, J.L. Herman, and R.A. Knuth, 1991)
Based on a linear model of learning
Instruction took a "building blocks" approach
Basic skills learned by rote could be assembled into complex thinking and performance strategies
◦ Meaningful Learning
Cognitive psychology research: learning proceeds in many directions at once
Important to measure how and whether students organize, structure, and use information to solve problems
Research on Assessment
Assessment Trends (R.J. Dietel, J.L. Herman, and R.A. Knuth, 1991)
◦ Cognitive View of Assessment
View of Learner: Active, constructive knowledge environment
Scope of Assessment: Integrated and cross-disciplinary (alternative views exist, especially among experts in academic language)
Emphasis of Instruction: Attention to meta-cognition, formative assessments, effective materials, motivation, and self-determination
Characteristics of Assessments: Authentic, contextualized problems (rather than multiple-choice or short-answer items) that are relevant and meaningful, emphasize higher-level thinking, do not have a single correct answer, and have public standards known in advance
Who Is Assessed: Both individual and group process skills and performances
Frequency of Assessment: Samples over time (portfolios, work samples), which provide a basis for assessment by multiple stakeholders
Research on Assessment
Assessment Trends (R.J. Dietel, J.L. Herman, and R.A. Knuth, 1991)
◦ Cognitive View of Assessment
Use of Technology: High-tech applications such as computer-based administration and scoring, computer-adaptive testing, expert systems, and simulated environments
What Is Assessed: Multidimensional assessment that recognizes the variety of human abilities and talents, the malleability of student ability, and that IQ is not fixed
◦ Cognitive View of Assessment: Performance Assessments
Richard Stiggins notes three critical components of performance assessments:
(1) Specification of a performance to be evaluated;
(2) Development of exercises or tasks used to elicit that performance;
(3) Design of a scoring and recording scheme for results.
Research on Assessment
A combination of formative and summative evaluations works best (C. Mathers, M. Oliva, S. Laine, 2008)
◦ Results
◦ Issues of Performance-Based Tests/Tasks
"Questions of fairness arise not only in the selection of performance tasks but in the scoring of responses. As Stiggins has stated, it is critical that the scoring procedures are designed to assure that 'performance ratings reflect the examinee's true capabilities and are not a function of the perceptions and biases of the persons evaluating the performance.' The same could be said regarding the perceptions and biases of the persons creating the test. The training and calibrating of raters is critical in this regard."
Research on Assessment
The primary concerns with performance assessment are:
1. Time and Content: Performance-based assessments cannot test as much material as multiple-choice tests in the same amount of time.
◦ Performance-based assessments usually require additional time to administer, which takes away from instructional time.
◦ There is an inherent trade-off between time and content: the more content the performance-based assessment attempts to cover, the more time it will take to design, administer, and score.
Research on Assessment
2. Reliability: The main threat to reliability comes from the necessity of having 'experts' score the performance-based assessments. Even with a set rubric, there may be variation in scoring among different raters.
◦ In particular, the scoring of performance-based assessment is "susceptible to two general classes of measurement error: random and systematic" (Raymond & Viswesvaran, 1993). Random error shows up as disagreement among raters (i.e., low inter-rater reliability). Systematic error, also called leniency error, "is present when the mean of a rater summed over candidates differs from the mean of all raters" (Raymond & Viswesvaran, 1993).
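To make the quoted definition concrete (the notation is mine, not Raymond & Viswesvaran's): with ratings $x_{ri}$ from rater $r$ on candidates $i = 1, \dots, n$, leniency error is present when

  $\bar{x}_r = \frac{1}{n} \sum_{i=1}^{n} x_{ri} \;\neq\; \bar{x} = \frac{1}{R} \sum_{s=1}^{R} \bar{x}_s$

that is, when rater $r$'s mean over candidates departs from the mean of all $R$ raters. Random error, by contrast, is rater disagreement that averages out over candidates.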
Research on Assessment
3. Validity: Internal validity is the extent to which each question (or task) on the test measures the objective, skill, etc. that it intends to measure.
◦ External validity, also referred to as generalizability, is the extent to which a student's performance on a question (or an entire test) represents their overall ability, the extent to which their performance on one question can be generalized to the domain of knowledge the task represents, or even the extent to which various performance assessments within the same domain can be compared.
◦ Performance-based assessments are vulnerable to both internal and external validity threats because it is difficult to design performance-based assessments, and also because time and money constrain the sample one is able to test from the domain (Madaus & O'Dwyer, 1999).
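Generalizability theory offers the standard formalization of this idea (a sketch I am adding; it is not developed in the sources cited here). In a simple person-by-task design with $n_t$ tasks, the generalizability coefficient is

  $E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pt,e} / n_t}$

where $\sigma^2_p$ is variance due to true differences among students and $\sigma^2_{pt,e}$ is person-by-task variance confounded with error. The coefficient rises only as more tasks $n_t$ are sampled, which is why the time and money constraints just noted directly limit how far scores on a few performance tasks generalize to the whole domain.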
Research on Assessment
Large-scale, performance-based assessments are a favorable alternative to current dominant testing strategies in any subject because they have the potential to
…yield a more complete picture of students' abilities and weaknesses…support higher quality teaching…increase intellectual challenge in the classroom…[and] can overcome some of the validity challenges of assessing English Language Learners and students with disabilities. (Adamson & Darling-Hammond, 2010)
Performance Assessment of Teacher Candidates
Combination of standardized content knowledge exams and comprehensive performance assessments
◦ Generic and Specific Models
Use of Rubrics: multidimensional sets of scoring guidelines for specific areas/tasks/projects (see the sketch after this list)
Use of Work Samples: an applied performance approach that can be tailored to:
Learning goals
Teaching style
Group & individual student needs
The context of the classroom, school, & community
Increased connection between courses and field experiences
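As a toy illustration of "multidimensional sets of scoring guidelines," a rubric can be modeled as named dimensions with discrete score levels; the dimension names and the 4-point scale below are hypothetical, not drawn from any actual program (Python):

    # Toy multidimensional rubric: dimension -> allowed score levels.
    # Dimension names and the 4-point scale are hypothetical.
    RUBRIC_LEVELS = {
        "planning": (1, 2, 3, 4),
        "instruction": (1, 2, 3, 4),
        "assessment": (1, 2, 3, 4),
        "reflection": (1, 2, 3, 4),
    }

    def summarize(ratings):
        """Validate one candidate's ratings against the rubric and total them."""
        for dim, level in ratings.items():
            if level not in RUBRIC_LEVELS[dim]:
                raise ValueError(f"{level} is not a valid level for {dim!r}")
        return sum(ratings.values()), sum(ratings.values()) / len(ratings)

    total, mean = summarize({"planning": 3, "instruction": 2,
                             "assessment": 3, "reflection": 4})
    print(total, mean)  # 12 3.0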
California Teacher Performance Assessment
A performance assessment for teacher candidates, with new subject matter standards, new program standards, and new assessment standards; new candidates must complete it beginning in 2008
Alternate assessments are permitted but must meet California quality standards for reliability/validity (i.e., AERA/APA test standards).
Aligned with the California Teaching Performance Expectations (standards) and California Content Standards
California Teacher Performance Assessment
The PACT is a performance assessment in which teacher candidates plan a series of lessons based on a single standard or set of standards, implement the lessons and collect evidence of student learning, videotape one or more segments and reflect on performance, and reflect on the teaching event as a whole.
All the related documents are assembled into a portfolio.
The rationale for an embedded performance assessment is to assess competence in actual practice.
The logic is that if a teacher preparation program can closely document and then assess the candidate's performance of an actual teaching event, then the program has the greatest likelihood of truly measuring the candidate's ability to perform under actual teaching conditions.
California Teacher Performance Assessment
www.pacttpa.org
California Teacher Performance Assessment
Content validity
◦ Development teams, program directors, program faculty, & leadership team
◦ TPE alignment study
Concurrent validity
◦ Evaluation of score validity
◦ Decision consistency (see the sketch below)
◦ Holistic vs. analytic ratings
Bias and fairness review
Construct validity
◦ Factor analysis (2002-03 pilot year): Reflection & Assessment, Instruction, Planning
Predictive validity (Carnegie/CT study)
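Decision consistency can be checked empirically as the proportion of candidates for whom two independent scorings of the same Teaching Event lead to the same pass/fail decision; the cut score and score data below are hypothetical (Python):

    # Decision consistency: share of candidates receiving the same
    # pass/fail decision from two independent scorings (hypothetical data).
    CUT = 24  # hypothetical passing score

    def decision_consistency(scores_a, scores_b, cut=CUT):
        same = sum((a >= cut) == (b >= cut) for a, b in zip(scores_a, scores_b))
        return same / len(scores_a)

    print(decision_consistency([22, 30, 25, 19], [25, 29, 23, 18]))  # 0.5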
California Teacher Performance Assessment
Capstone Teaching Event
Planning
◦ Lesson plans
◦ Handouts, overheads, student work
◦ Lesson commentary
Instruction
◦ Video clip(s)
◦ Teaching commentary
Assessment
◦ Analysis of whole-class assessment
◦ Analysis of learning of 2 students
Reflection
◦ Daily reflections
◦ Reflective commentary
Evidence of Academic Language
California Teacher Performance Assessment
Capstone Teaching Events: Criteria
PLANNING
◦ Establishing a Balanced Instructional Focus
◦ Making Content Accessible
◦ Designing Assessments
INSTRUCTION
◦ Engaging Students in Learning
◦ Monitoring Student Learning During Instruction
ASSESSMENT
◦ Analyzing Student Work From an Assessment
◦ Using Assessment to Inform Teaching
REFLECTION
◦ Monitoring Student Progress
◦ Reflecting on Teaching
ACADEMIC LANGUAGE
◦ Understanding Language Demands
◦ Supporting Academic Language Development
California Teacher Performance Assessment
Embedded Signature Assessments/Assignments
◦ Developed by each teacher preparation program
Some partnerships with local districts
Within coursework tied to the California Teaching Performance Expectations
Case studies
Analysis of student learning
Analysis of teaching/curriculum implementation
California Teacher Performance Assessment
◦ Teaching Performance Expectations (TPEs)
TPE 1 · Specific Pedagogical Skills for Subject Matter Instruction
TPE 2 · Monitoring Student Learning During Instruction
TPE 3 · Interpretation and Use of Assessments
TPE 4 · Making Content Accessible
TPE 5 · Student Engagement
TPE 6 · Developmentally Appropriate Teaching Practices
TPE 7 · Teaching English Learners
TPE 8 · Learning about Students
TPE 9 · Instructional Planning
TPE 10 · Instructional Time
TPE 11 · Social Environment
TPE 12 · Professional, Legal, and Ethical Obligations
TPE 13 · Professional Growth
California Teacher Performance Assessment
Specifics of TPEs
◦ Focus of learning segment & aligned to California content standards
◦ Teaching/learning tasks on video clip(s)
◦ Additional prompts in some content areas (e.g., misconceptions in science, dispositions in mathematics, description of text in English/language arts)
◦ Multiple and single subject specific rubrics
CATs – Content Area Tasks
◦ Benchmarks within subject areas
California Teacher Performance Assessment
Scoring of Teaching Events
◦ Trained and calibrated subject-specific assessors (see the calibration sketch below)
◦ Campus-based with central audits & regional scoring
◦ Rubric-based scoring in real time (web-based platforms)
◦ Organized around dimensions of teaching and guiding questions
◦ Sequentially scored
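One common calibration check (my illustration, not a documented PACT procedure) is exact and adjacent agreement between a trainee scorer and a benchmark scorer on the same set of Teaching Events (Python):

    # Exact and adjacent (within one point) agreement between two raters'
    # rubric scores; the 4-point ratings below are hypothetical.
    def agreement(rater_a, rater_b):
        n = len(rater_a)
        exact = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        adjacent = sum(abs(a - b) <= 1 for a, b in zip(rater_a, rater_b)) / n
        return exact, adjacent

    exact, adjacent = agreement([3, 2, 4, 3, 1], [3, 3, 4, 2, 2])
    print(f"exact = {exact:.0%}, adjacent = {adjacent:.0%}")  # exact = 40%, adjacent = 100%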
California Teacher Performance Assessment
In addition to on-site supervision, teacher candidates must submit a portfolio that comprises:
◦ The candidate's work planning curriculum, assessment, and instruction.
Further, the portfolio includes evidence of work around implementing instruction, assessing student learning, and analysis of the candidate's teaching and learning outcome objectives.
All of the work in the portfolio is based on a teaching event of the candidate's own design.
◦ The portfolio consists of planning documents in the form of pre-designed forms calling for classroom demographic and contextual information, lesson plans, video segments of the teaching event, reflective commentary around the various components of the teaching event, and student work samples.
Future Trends in Performance Assessments
Large-Scale Performance Assessments
◦ To make performance-based assessments affordable, some promising options are to:
◦ …take advantage of the economies of scale that will accompany states banding together in consortia, tapping the efficiencies of technology in administering tests and supporting scoring, and using teachers strategically in the scoring of performance items. (Adamson & Darling-Hammond, 2010)
Future Trends in Performance Assessments
In a 2010 speech to state leaders at Achieve's American Diploma Project Leadership Team Meeting, U.S. Secretary of Education Arne Duncan remarked, "Today is the day that marks the beginning of the development of a new and much-improved generation of assessments for America's schoolchildren. Today marks the start of Assessments 2.0" (Duncan, 2010, para. 1).
Future Trends in Performance Assessments
New assessments are being developed by two large state consortia, the Partnership for Assessment of Readiness for College and Careers (PARCC) and the SMARTER Balanced Assessment Consortium
◦ Funded through the Race to the Top competition (Duncan, 2010)
These two consortia will work over the next few years "designing and implementing comprehensive assessment systems in math and English language arts" so that by 2014 the assessments will be ready for use in any state (Duncan, 2010).
Tied to the Common Core Standards
Research on Assessment of Special Education Teacher Preparation Programs (L. Goe, 2006)
Teacher Preparation → Teacher Practices → Student Outcomes relationship in special education
◦ Issues
Inclusion models of delivery: co-teaching; push-in; pull-out; resource rooms
Who is ultimately responsible for the student gains?
Research on Assessment of Special Education Teacher Preparation Programs (L. Goe, 2006)
Experimenting with value-added models (a hedged model sketch follows this list)
◦ Use of student test data compared to what is "expected" to occur
Imprecise in its estimates of effects on student learning
Requires a minimum of two to three years of data
Use of student standard growth models
◦ Holds constant student variables such as poverty, race, language fluency, and prior test scores
◦ Not every state has the data or systems to perform the calculations
Schools are better equipped to collect and analyze school-level data
◦ Less useful for gauging the effectiveness of specific teachers
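A minimal sketch of the kind of covariate-adjusted value-added model described above (a generic textbook form, not a model specified by Goe):

  $y_{it} = \beta_0 + \beta_1 y_{i,t-1} + \mathbf{x}_i^{\top}\boldsymbol{\gamma} + \theta_{j(i)} + \varepsilon_{it}$

where $y_{it}$ is student $i$'s test score in year $t$, $y_{i,t-1}$ the prior-year score, $\mathbf{x}_i$ the held-constant student variables (poverty, race, language fluency), $\theta_{j(i)}$ the effect of the teacher assigned to student $i$, and $\varepsilon_{it}$ error. The teacher's estimated "value added" $\hat{\theta}_j$ is the gap between observed and expected achievement; with fewer than two to three years of data it is estimated very noisily, which is the imprecision noted above.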
Research on Assessment of Special Education Teacher Preparation Programs (L. Goe, 2006)
No study adequately connects teacher preparation to classroom practice to student learning
◦ Pieces of the Puzzle – Five Factors (Carlson, Lee, and Schroll, 2004)
Experience, credentials, self-efficacy, professional activities, and selected classroom practices (EBP)
Sample of 1,475 special education teachers
All contributed to an aggregate measure of teacher quality
Research on Assessment of Special Education Teacher Preparation Programs (L. Goe, 2006)
Grisham-Brown, Collins, and Baird (2000)
◦ Researched pre-service special education teachers' explicit connections between what they learned in a course and actual teaching practices
Miller (1991) identified as being closest to looking at all three components
◦ Integration of special education and ELA teacher preparation programs
◦ A unit of study
◦ Actual pre/post-test measures of student learning
Research on Assessment of Special Education Teacher Preparation Programs (L. Goe, 2006)
Future Directions Needed in Research
◦ Encourage and support research that links teacher preparation → classroom practices → student learning/outcomes
◦ Extend research to include induction and early professional development
◦ Consider curriculum-based measures of achievement for students with special needs
◦ Include special education instruction in all teacher preparation programs; and encourage collaboration between special and general education teachers
◦ Design research that measures collective contributions to student outcomes
◦ Develop a comprehensive longitudinal teacher database
Monarch Center 2011