
Page 1

Experiences with Remote Usability Testing?

Jan Stage
Professor, PhD
Research Leader in Information Systems (IS)/Human-Computer Interaction (HCI)
Aalborg University, Department of Computer Science, HCI-Lab
[email protected]

Page 2

Overview

• Study 1
• Study 2

Page 3

Overview

• Study 1: synchronous or asynchronous
  • Method
  • Results
  • Conclusion
• Study 2

Page 4

Empirical Study 1

Four methods: LAB (conventional lab test) – RS (remote synchronous) – AE (remote asynchronous, expert-reported) – AU (remote asynchronous, user-reported)

Test subjects: 6 in each condition (18 users and 6 with usability expertise), all students at Aalborg University

System: email client (Mozilla Thunderbird 1.5)

9 defined tasks (typical email functions)

Setting, procedure and data collection in accordance with each method

Data analysis: the 24 outputs were analysed by three persons, each in a random and different order

They generated their individual lists of usability problems with their own categorisations (also for the AE and AU conditions)

These were merged into an overall problem list through negotiation
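The merging itself was done through negotiation between the evaluators, a manual process. Purely as a hedged illustration of the bookkeeping involved, the sketch below groups near-duplicate problem descriptions automatically; the function name, the use of difflib, and the similarity threshold are assumptions for illustration, not the study's procedure.

```python
# Illustrative sketch only: the study merged the evaluators' individual
# problem lists through negotiation. This naive automated analogue groups
# near-duplicate descriptions with difflib; the 0.75 threshold is an
# assumed value, not taken from the study.
from difflib import SequenceMatcher

def merge_problem_lists(lists, threshold=0.75):
    """Merge several evaluators' problem lists into one overall list.

    `lists` holds one list of problem-description strings per evaluator.
    Descriptions whose similarity ratio reaches `threshold` are treated
    as the same underlying usability problem.
    """
    merged = []  # one representative description per distinct problem
    for problems in lists:
        for description in problems:
            is_duplicate = any(
                SequenceMatcher(None, description.lower(), seen.lower()).ratio() >= threshold
                for seen in merged
            )
            if not is_duplicate:
                merged.append(description)
    return merged

evaluator_a = ["No feedback after sending mail", "Attachment icon hard to find"]
evaluator_b = ["no feedback after sending a mail", "Filter dialog is confusing"]
print(merge_problem_lists([evaluator_a, evaluator_b]))
# -> 3 distinct problems: the two near-identical reports are merged
```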

Page 5

Results: Task Completion

No significant difference in task completion rates

Significant difference in task completion time

The users in the two asynchronous conditions spent considerably more time

We do not know the reason
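The slides do not say which statistical test produced the significant time difference. As a hedged sketch, the four conditions' completion times could be compared like this with SciPy; the numbers below are invented placeholders, not the study's data.

```python
# Minimal sketch of comparing task completion times across the four
# conditions. All values are invented placeholders for illustration.
from scipy import stats

lab = [22.1, 25.3, 24.8, 23.0, 26.4, 24.0]   # minutes, 6 users per condition
rs  = [23.5, 21.9, 25.0, 24.2, 22.8, 26.1]
ae  = [35.0, 41.2, 38.7, 44.1, 36.5, 40.3]
au  = [39.8, 37.2, 42.5, 36.9, 43.0, 38.4]

# One-way ANOVA across the four conditions
f_stat, p_value = stats.f_oneway(lab, rs, ae, au)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# A non-parametric alternative if normality is doubtful
h_stat, p_kw = stats.kruskal(lab, rs, ae, au)
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_kw:.4f}")
```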

Page 6

Results: Usability Problems Identified

A total of 46 usability problems

No significant difference between LAB and RS

AE/AU identified significantly fewer problems, including critical problems

No significant difference between AE and AU in terms of problems identified
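The slides likewise do not name the test behind these problem-count comparisons. One common way to compare two conditions on whether each known problem was detected is Fisher's exact test on found/missed counts; in the sketch below, the per-condition counts are assumptions, since the slide only gives the overall total of 46 problems.

```python
# Hedged sketch: comparing how many of the 46 known problems two
# conditions detected. The per-condition counts are assumed values
# for illustration; the slide reports only the overall total.
from scipy.stats import fisher_exact

TOTAL_PROBLEMS = 46
found_lab = 40   # assumed for illustration
found_ae = 25    # assumed for illustration

table = [
    [found_lab, TOTAL_PROBLEMS - found_lab],  # LAB: found / missed
    [found_ae, TOTAL_PROBLEMS - found_ae],    # AE:  found / missed
]
odds_ratio, p_value = fisher_exact(table)
print(f"Fisher's exact test: OR = {odds_ratio:.2f}, p = {p_value:.4f}")
```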

Page 7

Conclusion

RS is the most widely described and used remote method. Its performance is virtually equivalent to LAB (or slightly better)

AE and AU perform surprisingly well

Experts do not perform significantly better than users

Video analysis (LAB and RS) required considerably more evaluator effort than the user-based reporting (AU and AE)

Users can actually contribute to usability evaluation – not with the same quality, but reasonably well, and there are plenty of them

Page 8

Overview

• Study 1
• Study 2: which asynchronous method
  • Method
  • Results
  • Conclusion

Page 9

Empirical Study 2

Purpose: examine and compare remote asynchronous methods

Focus on usability problems identified

Comparable with the previous study

Selection of asynchronous methods based on literature survey

Page 10

The 3 Remote Asynchronous Methods

User-reported critical incident (UCI)
• Well-defined method (Castillo et al., CHI 1998)

Forum-based online reporting and discussion (Forum)
• Assumption: through collaboration, participants may give input which increases data quality and richness (Thompson, 1999)
• A source for collecting qualitative data in a study of auto logging (Millen, 1999): the participants turned out to report detailed usability feedback

Diary-based longitudinal user reporting (Diary)
• Used on a longitudinal basis for participants in a study of auto logging to provide qualitative information (Steves et al., CSCW 2001)
• First day: same tasks as the other conditions (first part of diary delivered)
• Four more days: new tasks (same type) sent daily (complete diary delivered)

Conventional user-based laboratory test (Lab)
• Included as benchmark

Page 11

Empirical Study (1)

Participants:
• 40 test subjects, 10 for each condition
• Students, age 20 to 30
• Distributed evenly: gender and tech/non-tech education

Setting:
• LAB: in our usability lab
• Remote asynchronous: in the participants' homes

Participants in the remote asynchronous conditions received the software and installed it on their computer

Training material for the remote asynchronous conditions:
• Identification and categorisation of usability problems
• A minimalist approach that was strictly remote and asynchronous (via email)

Page 12

Empirical Study (2)

Tasks:
• Nine fixed tasks
• The same across the four conditions, to ensure that all participants used the same parts of the system
• Typical email tasks (same as previous study)

Data collection in accordance with the method:
• LAB: video recordings
• UCI: web-based system for generating problem descriptions while solving tasks
• Forum: after solving tasks, one week for posting and discussing problems
• Diary: a diary with no imposed structure; first part delivered after the first day
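As a hedged sketch of the kind of record the UCI condition's web-based system might collect per incident: all field names below are assumptions, since the slides only say that participants generated problem descriptions while solving the tasks.

```python
# Hypothetical data model for one user-reported critical incident (UCI).
# Field names and types are assumptions for illustration; the slides do
# not describe the actual reporting form.
from dataclasses import dataclass

@dataclass
class CriticalIncidentReport:
    participant_id: str
    task_number: int     # which of the nine fixed tasks (1-9)
    description: str     # the user's own account of the problem
    severity: str        # e.g. "critical", "serious" or "cosmetic"
    timestamp: str       # when the report was submitted

report = CriticalIncidentReport(
    participant_id="P07",
    task_number=3,
    description="Could not find where to attach a file to the message",
    severity="serious",
    timestamp="2008-03-14T10:42:00",
)
print(report)
```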

Page 13

Data Analysis

All data collected before the data analysis started

3 evaluators did the whole data analysis

The 40 data sets were analysed by the 3 evaluators:
• In random order (determined by a draw)
• In a different order for each evaluator

The user input from the three remote conditions was transformed into usability problem descriptions

Each evaluator generated his/her own individual lists of usability problems with their own severity ratings:
• A problem list for each condition
• A complete problem list (joined)

These were merged into an overall problem list through negotiation

Page 14

Results: Task Completion Time

Considerable variation in task completion times

Participants in the remote conditions worked in their homes at a time they selected

For each task there was a hint that allowed them to check if they had solved the task correctly

As we have no data on the task solving process in the remote conditions, we cannot explain this variation

Page 15

Results: Usability Problems Identified

LAB: significantly better than the 3 remote conditions

UCI vs. Forum: no significant difference

UCI vs. Diary: significant difference overall, in favour of the Diary; also significant for cosmetic problems

Forum vs. Diary: significant difference overall, in favour of the Diary; not significant at any individual severity level

                          Lab (N=10)    UCI (N=10)      Forum (N=10)   Diary (N=10)
Task completion time      24.24 (6.3)   34.45 (14.33)   15.45 (5.83)   32.57 (28.34), tasks 1-9
in minutes: average (SD)

Usability problems         #    %        #    %          #    %         #    %
Critical (21)             20   95       10   48          9   43        11   52
Serious (17)              14   82        2   12          1    6         6   35
Cosmetic (24)             12   50        1    4          5   21        12   50
Total (62)                46   74       13   21         15   24        29   47
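The percentages in the table follow directly from the counts: each cell divides the number of problems found by the number of known problems at that severity level (shown in parentheses in the left column). A small Python check, with the values copied from the table:

```python
# Arithmetic check of the percentages in the problems table above.
known = {"critical": 21, "serious": 17, "cosmetic": 24, "total": 62}
found = {
    "Lab":   {"critical": 20, "serious": 14, "cosmetic": 12, "total": 46},
    "UCI":   {"critical": 10, "serious": 2,  "cosmetic": 1,  "total": 13},
    "Forum": {"critical": 9,  "serious": 1,  "cosmetic": 5,  "total": 15},
    "Diary": {"critical": 11, "serious": 6,  "cosmetic": 12, "total": 29},
}
for condition, counts in found.items():
    shares = {level: round(100 * n / known[level]) for level, n in counts.items()}
    print(condition, shares)
# e.g. Lab -> critical 95%, serious 82%, cosmetic 50%, total 74%
```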

Page 16

Results: Evaluator Effort

Times are the sum for all evaluators involved in each activity

Time for finding test subjects is not included (8h, common for all conditions)

Task specifications were reused from an earlier study; preparation in the remote conditions consisted of working out written instructions

Considerable differences between the remote conditions for analysis and merging of problem lists

                        Lab (46)   UCI (13)   Forum (15)   Diary (29)
Preparation                 6:00       2:40         2:40         2:40
Conducting test            10:00       1:00         1:00         1:30
Analysis                   33:18       2:52         3:56         9:38
Merging problem lists      11:45       1:41         1:42         4:58
Total time spent           61:03       8:13         9:18        18:46
Avg. time per problem       1:20       0:38         0:37         0:39

(Times are hours:minutes; the column headers show the number of problems identified in each condition.)
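The "avg. time per problem" row and the effort ratios quoted in the conclusion follow directly from the totals. A small Python check, with the values copied from the table:

```python
# Arithmetic check of the effort table above: total evaluator time
# divided by the number of problems found gives the average time per
# problem, and the totals give each remote method's cost relative to
# the lab test (UCI ~13%, Forum ~15%, Diary ~31%).
def to_minutes(hhmm: str) -> int:
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

totals = {"Lab": ("61:03", 46), "UCI": ("8:13", 13),
          "Forum": ("9:18", 15), "Diary": ("18:46", 29)}

lab_minutes = to_minutes(totals["Lab"][0])
for condition, (total, problems) in totals.items():
    minutes = to_minutes(total)
    per_problem = minutes / problems
    share = 100 * minutes / lab_minutes
    print(f"{condition}: {per_problem:.0f} min/problem, {share:.0f}% of Lab time")
# Lab: 80 min/problem (i.e. 1:20), UCI: 38, Forum: 37, Diary: 39
```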

Page 17

Conclusion

The three remote methods performed significantly below the classical lab test in terms of the number of usability problems identified

The Diary was the best remote method: it identified 29 of the 62 known problems, almost half, against 46 in the Lab condition

UCI and Forum matched the Diary on critical problems but performed markedly worse on serious problems

UCI and Forum required about 13-15% of the lab test's evaluator effort; the Diary required about 30%

The productivity of the remote methods (problems identified per hour of evaluator effort) was thus considerably higher


Page 19

Interaction Design and Usability Evaluation

Master i IT (Master in IT)

Continuing education under IT-Vest

The subject package in Interaction Design and Usability Evaluation starts 1 February 2012

Admits holders of a bachelor's degree, but there is also an entry route for datamatikere (AP graduates in computer science)

Information: http://www.master-it-vest.dk/