SJT Presentation

Situational Judgment Tests: A Testing Format That Is Almost Too Good To Be True!

PTC/SC May 26, 2010

Michael A. Willihnganz

Presentation Agenda

1. Define “situational judgment test (SJT)”

2. Trace the history of SJTs

3. Discuss advantages of written SJTs

4. Summarize the validity evidence

5. Explain the various scoring approaches

6. Present a 5-step developmental process

What Exactly is a Situational Judgment Test?

▪ Simulations based on the assumption that one can predict how well an individual may perform on a job based on how the individual performs on a simulation of the job (McDaniel & Nguyen, 2000).

▪ A hybrid selection procedure that takes on some of the characteristics of job knowledge tests as well as some of the characteristics of work sample tests (Heneman & Judge, 2006).

What Exactly is a Situational Judgment Test?

▪ Any paper-and-pencil test designed to measure judgment in work settings (McDaniel, Morgeson, Finnegan, Campion, & Braverman, 2000).

▪ A testing format that presents applicants with hypothetical job-related scenarios and asks them to identify an appropriate response from a list of alternatives (Peeters & Lievens, 2005).

What Exactly is a Situational Judgment Test?

▪ A measurement method typically comprised of a series of job-related situations or scenarios that describe a dilemma or problem requiring the application of relevant knowledge, skills, and abilities (KSAs) to solve (Christian, 2008).

▪ SJTs are also referred to as low-fidelity simulations (Motowidlo, Dunnette, & Carter, 1990).

2010 IPAC Conference

July 18th through the 21st, Hyatt Regency Newport Beach

Visit IPACweb.org for program and registration information

Fidelity of Simulations

▪ Simulations vary in the fidelity with which they present a task stimulus and elicit a response.

▪ High-fidelity simulations show a very close correspondence with job tasks.

▪ High-fidelity simulations are excellent predictors of job performance.

Decrease in Fidelity

▪ Fidelity decreases as stimulus materials and responses become less and less exact approximations of actual job stimuli and responses.

▪ At the lower end of the fidelity continuum are simulations that simply present a candidate with a hypothetical situation to which the candidate indicates how he/she would respond, rather than carrying out the intended action.

▪ SJTs are low-fidelity simulations.

Low-Fidelity Simulations

▪ Responses can follow either an open-ended format, in which the candidate describes in his/her own words how he/she would handle the problem situation, or a multiple-choice format.

▪ As a written low-fidelity simulation, an SJT simply presents a written description of a hypothetical work situation and asks the candidate to indicate how he/she would handle the situation.

Examples of High-Fidelity Simulations

▪ In-Basket Exercise

▪ Role Play Exercise

▪ Data Entry Test

Low-Fidelity Task Stimulus

▪ Interview (i.e., situational interview questions)

▪ DVD or Video

▪ Written (paper-and-pencil or CBT)

History of SJTs

▪ The use of SJTs dates back to the 1920s.

▪ One of the first SJTs, the George Washington Social Intelligence Test, measured judgment in social situations.

▪ During World War II, army psychologists assessed the judgment of soldiers using the SJT model.

▪ Starting in the 1940s, a number of SJTs were developed to measure supervisory potential (e.g., Practical Judgment Test, How Supervise?, and Supervisory Practices Test).

History of SJTs

▪ In the late 1950s and early 1960s, SJTs were used by large organizations as part of selection test batteries designed to predict managerial success.

▪ Standard Oil Company of New Jersey developed and used an SJT called the Management Judgment Test.

▪ Renewed interest in SJTs followed the publication of an article by Motowidlo et al. (1990) entitled An Alternative Selection Procedure: The Low-Fidelity Simulation.

Sample Item

You supervise an employee who is a chronic complainer. This employee frequently expresses his displeasure with work procedures. The employee’s complaints have become disruptive to the work unit you supervise. You would . . .

A. Counsel the employee on seeking a position in the organization that is more to his liking.

B. Assign work to the employee that he will find enjoyable and rewarding.

C. Talk to the employee about the issues causing his displeasure and your expectations for acceptable behavior.

D. Be tolerant of the employee’s complaints since it is not possible to change someone’s personality.

Sample Item

You are processing paperwork for a customer who is applying for a license. While you are completing the paperwork, the customer becomes impatient and tells you that you are working too slowly. You would . . .

A. Apologize to the customer and work as quickly as possible to complete the paperwork.

B. Suggest to the customer that she come back when she has more time.

C. Ask the customer to please refrain from making rude comments to you.

D. Inquire as to whether the customer would like to speak to your supervisor for quicker service.

Sample Item

You are at the hors d’oeuvre table placing hot chicken wings onto your cocktail plate when you accidentally drop a chicken wing into the martini of another conference attendee in the hors d’oeuvre line. The attendee does not notice that there is now a chicken wing floating next to the green olive in her martini. You would . . .

A. Walk away from the hors d’oeuvre table before the attendee notices the floating chicken wing.

B. Go directly to the bar and purchase another martini for the attendee before she notices the chicken wing.

C. Apologize for your accident and offer to buy the attendee another martini.

D. Intentionally spill the attendee’s martini before she notices the chicken wing and then offer to buy her another drink.

Sample Item

You approach a suspect at his residence to serve an arrest warrant. As the suspect sees you coming toward him, he becomes increasingly agitated and verbally abusive. Which one of the following actions should you take FIRST to effect the arrest?

A. Draw your firearm and aim it at the suspect.

B. Place a control hold on the suspect.

C. Spray pepper spray in the suspect’s face.

D. Command the suspect to raise his hands above his head.

Popularity of SJTs

SJTs are becoming increasingly popular for several reasons:

▪ Large-scale studies have shown that SJTs have significant criterion-related validity (McDaniel et al., 2001).

▪ SJTs possess incremental validity over and above cognitive ability and personality tests (e.g., Chan & Schmitt, 2002).

Popularity of SJTs

▪ Applicants respond enthusiastically to SJTs because they perceive SJTs to be related to the target jobs for which they are applying (e.g., Ployhart & Ryan, 1998).

▪ SJTs show less adverse impact on minorities than traditional cognitive ability tests (Clevenger et al., 2001).

Advantages of Written SJTs

▪ Inexpensive to develop

▪ Transportable

▪ Ease of administration

▪ High candidate acceptance

▪ Relatively strong validity

Validity Evidence

▪ A meta-analytic review of SJT validity studies found the mean validity of SJTs to be .34 (McDaniel et al., 2000).

▪ Hypothetical work behavior can predict performance without the expense of props, role players, and equipment typically needed by high-fidelity simulation tests (i.e., work sample tests), or the high-tech gadgetry of other SJT formats (e.g., DVD-based tests).

Advantages of SJTs

Unlike higher-fidelity simulations (e.g., work samples, assessment centers), SJTs offer the convenience of mass administration, increased objectivity, reliability, standardization, face validity, and lower cost of administration (e.g., Motowidlo et al., 1990; Weekley & Jones, 1999).

Validity Evidence

▪ Motowidlo et al. (1990) reported validity coefficients ranging from .28 to .37, with supervisory ratings serving as the criterion measure.

▪ Motowidlo, Hanson, and Crafts (1997, p. 248) have stated that researchers generally report significant relationships between written SJTs and ratings of job performance with correlations ranging from .20 to about .50.

SJT Assumptions

▪ Intentions are related to actual behavior.

▪ A person’s behavior in certain kinds of situations in the past can predict how he/she is likely to respond to similar situations in the future.

Faking and SJTs

▪ Faking on a selection measure can be defined as an individual’s conscious distortion of responses to score favorably (e.g., McFarland & Ryan, 2000).

▪ Haas and McDaniel (1999) found that fakers improved their SJT scores by one-half standard deviation, whereas Juraska and Drasgow (2001) concluded that SJTs were not fakable.

What Can SJTs Measure?

▪ The commonly used SJT format lends itself especially well to measuring various forms of job knowledge.

▪ SJTs may also be used to measure specific personality or ability variables (Motowidlo et al., 1997, p. 246).

Scoring Approaches

Dichotomous Scoring with Most Likely & Least Likely

▪ Candidate responds twice to each presented scenario.

▪ The alternative chosen as the Most Likely response is scored as 1 if it is the best response, and scored as 0 if it is the worst response or one of the other alternatives.

▪ The alternative chosen as the Least Likely response is scored as 1 if it is the worst response, and scored as 0 if it is the best response or one of the other alternatives.
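
The dichotomous Most Likely / Least Likely rule above can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, the answer key in the usage lines (best = C, worst = A) is invented for the example, and adding the two responses into a single item score of 0 to 2 is an assumption, since the slides state only how each response is scored.

```python
def score_dichotomous_ml_ll(most_likely, least_likely, best, worst):
    """Score one SJT item under dichotomous Most Likely / Least Likely scoring.

    most_likely, least_likely: the alternatives the candidate picked.
    best, worst: the keyed best and worst alternatives for the item.
    Returns the sum of the two response scores (assumed: 0, 1, or 2).
    """
    # Most Likely earns 1 point only when it matches the keyed best response.
    most_points = 1 if most_likely == best else 0
    # Least Likely earns 1 point only when it matches the keyed worst response.
    least_points = 1 if least_likely == worst else 0
    return most_points + least_points


# Hypothetical key for illustration only: best = "C", worst = "A".
full_credit = score_dichotomous_ml_ll("C", "A", best="C", worst="A")
partial_credit = score_dichotomous_ml_ll("B", "A", best="C", worst="A")
no_credit = score_dichotomous_ml_ll("A", "C", best="C", worst="A")
```

Under this sketch a candidate who reverses the key (picks the worst alternative as Most Likely and the best as Least Likely) simply earns 0; nothing is deducted.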

Faking and SJTs

▪ In a recent study that examined the fakability of an SJT of college students’ performance, Peeters and Lievens (2005) found that faking negatively affected the criterion-related validity of the SJT.

▪ These results suggest that faking might be a possible threat to the use of SJTs in high-stakes testing programs.

Scoring Approaches

Dichotomous Scoring

▪ The alternative chosen as the Best response or the Most Likely response is scored as 1 if it is the best response, and scored as 0 if it is one of the other incorrect alternatives.

▪ This is the traditional dichotomous scoring approach used with most multiple-choice exams.

Sample Item

You are at the hors d’oeuvre table placing hot chicken wings onto your cocktail plate when you accidentally drop a chicken wing into the martini of another conference attendee in the hors d’oeuvre line. The attendee does not notice that there is now a chicken wing floating next to the green olive in her martini. You would . . .

A. Walk away from the hors d’oeuvre table before the attendee notices the floating chicken wing.

B. Go directly to the bar and purchase another martini for the attendee before she notices the chicken wing.

C. Apologize for your accident and offer to buy the attendee another martini.

D. Intentionally spill the attendee’s martini before she notices the chicken wing and then offer to buy her another drink.

1. Most Likely _____

2. Least Likely _____

Scoring Approaches

Weighted-Response With Most Likely & Least Likely

▪ The alternative chosen as the Most Likely response is scored as 1 if it is the best response, -1 if it is the worst response, and 0 if it is one of the other alternatives.

▪ The alternative chosen as the Least Likely response is scored as 1 if it is the worst response, -1 if it is the best response, and 0 if it is one of the other alternatives.

Developmental Process

▪ Conduct a job analysis

▪ Step I -- Develop critical incidents

▪ Step II -- Edit/refine the problem situations

▪ Step III -- Develop response alternatives

▪ Step IV -- Develop scoring key

▪ Step V -- Establish a pass point

Domain to be Assessed: Supervision

❑ Attendance

❑ Discipline

❑ Performance

❑ Time management

❑ Implementation of new policy or procedure

❑ Training and development

❑ Workload prioritization

❑ Organizational change

❑ Other ______________

Scoring Approaches

Weighted-Response Scoring Example

▪ A score of -2 means that the candidate would Most Likely respond to the situation by choosing the worst alternative, and would Least Likely choose the best alternative.

▪ A score of +2 means that a candidate would Most Likely respond to the situation by choosing the best alternative, and Least Likely choose the worst alternative.
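
As a rough illustration, weighted-response scoring of a single item can be written in Python as below. Each response earns +1, 0, or -1, and the two are summed, which gives the -2 to +2 range in the example above. The function names and the answer key in the usage lines (best = C, worst = A) are hypothetical and not part of the presentation.

```python
def response_points(choice, rewarded, penalized):
    """Return +1 if choice matches the rewarded alternative, -1 if it matches
    the penalized alternative, and 0 for any other alternative."""
    if choice == rewarded:
        return 1
    if choice == penalized:
        return -1
    return 0


def score_weighted(most_likely, least_likely, best, worst):
    """Score one SJT item under weighted-response scoring (range -2 to +2)."""
    # Most Likely: +1 for the keyed best response, -1 for the keyed worst.
    most_points = response_points(most_likely, rewarded=best, penalized=worst)
    # Least Likely: +1 for the keyed worst response, -1 for the keyed best.
    least_points = response_points(least_likely, rewarded=worst, penalized=best)
    return most_points + least_points


# Hypothetical key for illustration only: best = "C", worst = "A".
top_score = score_weighted("C", "A", best="C", worst="A")
bottom_score = score_weighted("A", "C", best="C", worst="A")
neutral_score = score_weighted("B", "D", best="C", worst="A")
```

Compared with dichotomous scoring, this rule separates a candidate who reverses the key (-2) from one who merely picks the middle alternatives (0).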

Step I -- Develop Critical Incidents

▪ Start with the job analysis to identify appropriate content to assess.

▪ Through the observation of incumbents, the collection of work samples, and/or direct input from SMEs, develop the critical incidents within the domain being assessed.

▪ The critical incidents or problem situations should represent events that actually occur on the job.

Step I -- Develop Critical Incidents

▪ The critical incidents should represent problems or issues that incumbents must handle effectively or their job performance will suffer.

▪ The critical incidents should be complex enough to allow for meaningful differences in how they can be handled.

▪ The critical incidents should be described in enough detail to provide the cues necessary to distinguish more effective from less effective approaches to dealing with them.

Step II -- Edit/Refine the Problem Situations

▪ From the critical incidents developed by the SMEs, select a representative sample which will adequately assess the domain.

▪ Ensure that the final inventory of problem situations covers important problems likely to be encountered on the job.

▪ Edit/refine each problem situation into a standard format that describes the problem clearly and concisely in just a few sentences.

Step III -- Develop Response Alternatives

Each response alternative should meet the following criteria:

▪ Focus on a single action or response

▪ Be stated in a straightforward, understandable manner

▪ Be a plausible and reasonable response to the problem situation

▪ Discriminate between the better qualified and less qualified candidates

Step I -- Develop Critical Incidents

Assemble a group of SMEs (i.e., incumbents and supervisors) to develop the critical incidents. Each critical incident should include the following elements:

▪ Background information and details of a situation or problem encountered by a job incumbent.

▪ A description of effective action to address the situation or problem.

▪ A description of ineffective or inappropriate action to address the situation or problem.

Step III -- Develop Response Alternatives

▪ Response alternatives can be developed from the descriptions of effective and ineffective actions that were identified when the critical incidents were prepared.

▪ Response alternatives should represent differing plausible strategies for handling the problem situation.

▪ The more correct alternatives should be more attractive to candidates with the best potential for job success.

Step IV -- Develop Scoring Key

▪ A scoring key is developed by collecting judgments from SMEs about the effectiveness of the alternative response options for handling each problem situation.

▪ Weighted-response scoring – SMEs identify the best response and the worst response.

▪ Dichotomous scoring – SMEs identify the best or most appropriate response.
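
One possible way to turn SME judgments into a key is sketched below in Python. The slides do not say how the judgments are combined, so everything here is an assumption made for illustration: that each SME rates every alternative on a 1 to 5 effectiveness scale, that the ratings are averaged, and that the highest and lowest means define the best and worst responses. The function name and the ratings are hypothetical.

```python
from statistics import mean


def build_item_key(sme_ratings):
    """Derive a scoring key for one item from SME effectiveness ratings.

    sme_ratings: dict mapping each alternative label to a list of ratings,
    one rating per SME (assumed scale: 1 = ineffective, 5 = very effective).
    Returns the keyed best and worst alternatives plus the mean ratings.
    """
    means = {alt: mean(ratings) for alt, ratings in sme_ratings.items()}
    best = max(means, key=means.get)   # highest mean effectiveness
    worst = min(means, key=means.get)  # lowest mean effectiveness
    return {"best": best, "worst": worst, "means": means}


# Hypothetical ratings from three SMEs for a four-alternative item.
ratings = {
    "A": [1, 2, 1],
    "B": [3, 2, 3],
    "C": [5, 5, 4],
    "D": [2, 1, 2],
}
item_key = build_item_key(ratings)
```

For weighted-response scoring both the "best" and "worst" entries would be used; for dichotomous scoring only "best" is needed. Ties between alternatives are not handled in this sketch and would need SME discussion.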

Step V -- Establish a Pass Point

▪ SMEs review and evaluate each problem situation using a modified Angoff approach to establish the minimal acceptable competency (MAC) level for the test.

▪ The MAC level becomes the starting point for establishing the pass point for the SJT.
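
The arithmetic behind an Angoff-style MAC level can be sketched in Python as follows. The slides do not describe the specific modification used, so this shows only the common textbook form and should be read as an assumption: each SME estimates, for every item, the probability that a minimally competent candidate would answer it correctly; the estimates are averaged per item and summed across items. It also assumes dichotomously scored items worth 1 point each. The function name and the estimates are hypothetical.

```python
from statistics import mean


def mac_level(sme_estimates):
    """Compute an Angoff-style minimal acceptable competency (MAC) level.

    sme_estimates: one list per SME, each holding that SME's per-item
    probability (0 to 1) that a minimally competent candidate answers
    the item correctly. All lists must cover the same items in order.
    Returns the expected raw score of a minimally competent candidate.
    """
    # zip(*...) regroups the estimates by item across SMEs.
    item_means = [mean(item) for item in zip(*sme_estimates)]
    return sum(item_means)


# Hypothetical estimates from three SMEs for a four-item test.
estimates = [
    [0.5, 0.75, 0.5, 1.0],   # SME 1
    [0.5, 0.5, 0.75, 1.0],   # SME 2
    [0.5, 0.25, 0.25, 1.0],  # SME 3
]
mac = mac_level(estimates)
```

With these made-up estimates the MAC level is 2.5 out of 4 points. As the slide notes, this is a starting point; the final pass point may be adjusted from it.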

Questions??
