by Gary J. Dickelman
The usability evaluation methodology described herein, although not unique, is an integral part of the Aetna Life Insurance Company's information technology (IT) organization. Usability PS documents the methods and fine work of Aetna's Human Factors Engineering group. Many thanks to Christine Neligon, Nancy Conlan, Donna Way, Tom Furey, Mike Berstene, and Margarita Torres for their commitment to defending human attributes in the age of the computer - and for the privilege of working with and learning from them during my four years with Aetna.
Usability PS evolved from the Aetna experience - including several attempts at publishing self-help guidelines for usability testing - and from personal activities around documenting a methodology for performance-centered system development. Usability engineering helps us answer a number of compelling questions: How and where do we start developing performance support systems? How do we analyze business problems and represent them for performance? How do we account for the human factors? Exactly what constitutes minimal information to solve a problem? What does it mean to have few variables to manipulate? How do we keep reliance on memory to a minimum?
I suppose we could all benefit from demonstrations of Microsoft Publisher, WillMaker, and Quicken insofar as we might recognize performance support when we see it. But creating it is quite another matter. Following the experience of designing and developing twenty or so performance support "things" with varying degrees of success, I am convinced that engaging human beings in usability evaluation is a huge part of the how. I hope that Usability PS provides you with some guidance as you struggle through the development lifecycle.
Jonathan Grudin once proposed the following: When those who benefit are not those who do the work, then the technology is likely to fail or, at least, be subverted. Donald Norman explains that technology has deficits and human beings suffer for them. It is our responsibility as designers to ensure that those who must use the technology are not the ones who suffer. Usability engineering provides a means to mitigate the pain. It is my hope that your development methods will improve and your "end-users" will benefit from Usability PS.
Gary Dickelman
October, 1996
In today's business environment IT organizations are faced with ever increasing demands to create and modify applications at a higher speed, with better results, and at a lower cost. Human Factors studies show how early usability testing helps to meet these demands and ultimately reduces total application development and maintenance costs. Training costs are also reduced and in many cases revenue increases because the systems more closely align with business goals.
Usability PS provides an introduction to a usability evaluation process and supporting materials.
Usability PS contains:
Usability PS enables you to plan, develop, and execute a usability test. It guides you through each activity with a set of descriptions, examples, and tools. Usability PS is most effective when you supplement it with coaching from a usability consultant and you observe or assist with at least one usability evaluation. Coaching and an unbiased critique of your first usability test by a usability expert are strongly recommended. You can then assume the role of a usability consultant for your next evaluation!
For additional information about usability testing please contact Gary Dickelman at (202)452-4567 or by e-mail: gershom@cais.com or gdickelman@bna.com.
Overview: How To Conduct a Usability Evaluation
Usability evaluation measures an application's user interface with respect to attributes such as ease of learning, ease of use, affordance, and satisfaction. The user interface is the part of a system with which the user interacts and typically includes the windows, dialogues, menus, icons, buttons, hypertext/hypermedia, or even voice response systems. Usability goals are the indicators of how designers intended the user interface to meet the needs of its users. Usability evaluation can help you improve the user interface, evaluate competing designs, and increase user acceptance of a new system. When the notion of usability is applied to performance-centered design, our systems are better able to meet business goals through human performance.
Usability evaluations result in:
Why Conduct Usability Evaluations?
There are many reasons, including:
When And How Is Usability Evaluation Conducted?
Evaluation can be done at any point in the development or implementation of a system, and in a variety of settings. Most important are the four primary rules of test design and execution:
Where Is Usability Evaluation Performed?
Tests can be performed in a real work setting or in a usability lab (a studio equipped with video cameras to record users' interactions with the system). A usability lab isolates the system by filtering out all other work activities, so users can focus on performing tasks that the system was designed to support. On-site tests, which may also be videotaped, can be done when it is not possible or practical to use a lab, or when it is desirable to test the system under actual working conditions. A number of developers can observe if testing takes place in the lab since the evaluators and observers are in different rooms. A smaller team may be more appropriate for on-site testing. Recording test sessions on videotape is helpful for refreshing memories and demonstrating to management why changes to the interface are necessary.
If performed after a system has been implemented, usability evaluation can help identify areas of the user interface where additional support may be needed. Testing in this case can be conducted on site, shortly after the system is implemented. The general rule, however, is that usability evaluations should be planned and executed early in the development cycle and applied iteratively as a process for continuous improvement.
What Happens After Usability Evaluation?
Development teams must apply results to the user interface design to achieve all the benefits of usability testing. Applying results is the purpose and payoff of usability testing. A round of usability testing is estimated to increase product usability by 30% if changes are applied. Such an increase can save maintenance costs and improve productivity enough to save the entire development cost over the useful life of the product. It can also avert disasters, such as delivering a system that is rejected by users even though it met functional specifications.
To prepare a test plan for the usability evaluation:
On the day of the test, be sure to tell the users (evaluators) that their expert input is valued and will be taken seriously. Let them know that they are not being evaluated; rather, the system is being evaluated against a set of usability goals. Do not reveal the goals.
Although adhering to the test plan is important to obtain meaningful results, often the experiences of the first evaluator(s) will show a need to modify the test plan to achieve overall test goals, so be flexible. The dry run will mitigate - but not necessarily eliminate - the need to modify the test plan once the evaluation has begun.
All members of the observation team should take notes and discuss test results at the end of each day's sessions. Look for trends. Are most of the evaluators having the same problems with the user interface? Suggestions for improvement are often discussed with the evaluator after the test.
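One way to look for trends at the end of the day is to tally how many evaluators ran into each problem. The following is a minimal sketch, not part of the original method: the function name `problem_trends`, the shape of the notes, and the "more than half of the evaluators" threshold are all assumptions made for illustration.

```python
from collections import Counter


def problem_trends(notes_by_evaluator, min_share=0.5):
    """Return (problem, evaluator_count) pairs for problems seen by
    more than min_share of the evaluators, most common first.

    notes_by_evaluator maps a (hypothetical) evaluator label to the
    list of problems the observation team noted for that evaluator.
    """
    total = len(notes_by_evaluator)
    if total == 0:
        return []
    counts = Counter()
    for problems in notes_by_evaluator.values():
        # Count each problem once per evaluator, even if noted repeatedly.
        counts.update(set(problems))
    return [(p, n) for p, n in counts.most_common() if n / total > min_share]


# Illustrative notes only; these are not data from a real evaluation.
notes = {
    "evaluator_1": ["missed Save prompt", "could not find Help"],
    "evaluator_2": ["missed Save prompt"],
    "evaluator_3": ["missed Save prompt", "could not find Help",
                    "mistyped account number"],
    "evaluator_4": ["mistyped account number"],
}
trends = problem_trends(notes)
```

Here only the problem noted for three of the four evaluators passes the threshold; the team can lower `min_share` to surface less common problems.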
Debriefing, which is conducted after the evaluator has completed the test, often provides valuable feedback on the usability of the user interface that was not observed during the test.
Review usability goals, evaluation notes, and the evaluation videotapes. Record recommendations for improving the user interface. The usability consultant produces a report of all recommendations, but it is a good idea to discuss results and recommended changes while they are fresh in everyone's mind. In practice, most suggestions for improvement are identified during the evaluation and from observing as few as three evaluators.
A Quick View of Steps, Deliverables, and Tips
Step 1 - Quick View: Determine Test Site
Deliverable: Commitment for a suitable location to conduct the performance support test
Step 2 - Quick View: Select Evaluators (aka Performers)
Deliverable: Commitment by suitable evaluators to participate in the test
Step 3 - Quick View: Document Performance and Usability Goals
Deliverable: Set of measurable goals that reflect business performance and usability
Step 4 - Quick View: Develop Scenarios
Deliverable: Set of business scenarios suitable to test the performance and usability goals
Consider processes, tasks, and activities which:
Step 5 - Quick View: Develop Evaluator Briefing
Deliverable: Briefing document and plan for delivering the briefing
The briefing is designed to orient evaluators to:
Step 6 - Quick View: Develop Evaluator Debriefing
Deliverable: Debriefing document and a plan for delivering the debriefing
The debriefing is designed to capture information critical to determining whether or not the performance and usability goals are met. It should capture attributes like:
Step 7 - Quick View: Schedule Evaluators
Deliverables: Evaluator schedule, memo to evaluators, and commitments from evaluators
Step 8 - Quick View: Set Up Lab or Evaluation Area
Deliverables: Evaluation area, completely tested and ready for the evaluation
Consider the following:
Step 9 - Quick View: Select Observation Team
Deliverables: Commitments from an appropriate number and type of people to serve as usability test observers
Step 10 - Quick View: Conduct Dry Run
Deliverables: Final versions of Briefing, Scenarios, Debriefing, and Roles / Responsibilities
The Dry Run must:
The Dry Run is a usability test of this test!
Step 11 - Quick View: Conduct Usability Evaluation
Deliverables: Video tapes, logger sheets, and observer comments
Step 12 - Quick View: Document Test Findings
Deliverables: Documentation of test results and recommendations for improvement
Step 1: Determine Test Site
Accomplished by: Project team representative; Usability consultant
Estimated time to complete: 2 hours per resource
Deliverable: Commitment on a suitable location to conduct the usability test
Select a usability lab or identify a location suitable for a usability test.
Consider connectivity, set up and testing.
Consider evaluator travel and parking.
Get commitments for space and equipment for the times you intend to conduct the lab plus some time before and after.
Usability evaluations can be conducted in a formal usability lab or at the customer's site. Selecting the best site depends on the workspace configuration, hardware and software requirements, the availability of evaluators and evaluator accommodations (such as parking if traveling from another location), time, and budget.
Formal usability labs include video and audio equipment to capture all evaluation details. Soundproofed control rooms and evaluation rooms allow for project team members to observe and comment on the evaluations without disrupting the evaluators. Similarly, the evaluator is able to verbalize reactions without feeling self-conscious.
If you are inclined to use a formal lab, consider:
If the requirements for your application cannot be met by the usability lab equipment or if the lab cannot be scheduled at a convenient time, then consider conducting the test where equipment is appropriate and where you can set up the necessary minimum requirements (camera on tripod filming over the evaluator's shoulder, and the observation team viewing on a monitor somewhere isolated from the evaluator).
Formal lab
  Advantages:
    Use of multiple cameras to capture screen, work-area, facial expressions, and body language simultaneously
    Controlled setting for the evaluation
  Disadvantages:
    Evaluators and team may have to travel
    Costs

Customer site
  Advantages:
    Portability of equipment to a location convenient for project team and/or evaluators
    Evaluators may feel more at ease in their own work environment
  Disadvantages:
    Limited to image on the monitor only
    Setting is not as controlled as in a lab
    May be disruptive to other employees
Step 2: Select Evaluators
Accomplished by: Project team representative; Usability consultant
Estimated time to complete: 2 hours per resource
Deliverable: Commitment by suitable evaluators to participate in the usability test
Determine an appropriate number of evaluators.
Determine the levels of evaluator necessary for the test.
Consider evaluator travel, parking, and other needs.
Evaluators are the individuals who will test the usability of the product. They must be the "real" users of the product - performers - not members of the project team. Try to get a mix of expertise among the evaluators: high and low level of business knowledge and high and low level of comfort with technology. Recent studies have shown that a large number of evaluators does not necessarily strengthen the test results. The majority of problems that have a major impact on the application will be known within a short period of time (sometimes called the 80/20 rule).
When selecting evaluators, consider:
The number of evaluators you need depends on:
The latest human factors studies tell us that you will usually discover 80% of the usability problems from 20% of the evaluators and that "less is best." Usually 4 to 8 evaluators are sufficient. The larger the target user community, the more evaluators should be scheduled.
Consider the following combination as a general rule:
                                      Technology knowledge and skill
                                      Low/Moderate         High
Business knowledge    High            1 or 2 evaluators    1 evaluator
and skill             Low/Moderate    1 evaluator          1 or 2 evaluators
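A proposed panel can be checked against this mix before evaluators are scheduled. The sketch below assumes the matrix reads as one or two evaluators with high business and low/moderate technology skill, one with high business and high technology skill, one with low/moderate skill on both axes, and one or two with low/moderate business and high technology skill. The names `SUGGESTED_MIX` and `check_panel` and the level labels are hypothetical; the 4 to 8 total comes from the guideline above.

```python
from collections import Counter

# (business level, technology level) -> (minimum, maximum) evaluators.
# Cell values follow the matrix as read here; treat them as an assumption.
SUGGESTED_MIX = {
    ("high", "low_moderate"): (1, 2),
    ("high", "high"): (1, 1),
    ("low_moderate", "low_moderate"): (1, 1),
    ("low_moderate", "high"): (1, 2),
}


def check_panel(evaluators, min_total=4, max_total=8):
    """Return a list of mismatches between a panel and the suggested mix.

    evaluators is a list of (name, business_level, technology_level).
    An empty list means the panel fits the general rule.
    """
    counts = Counter((business, technology) for _, business, technology in evaluators)
    issues = []
    for cell, (low, high) in SUGGESTED_MIX.items():
        n = counts.get(cell, 0)
        if not low <= n <= high:
            issues.append(
                f"{cell[0]} business / {cell[1]} technology: "
                f"have {n}, suggested {low} to {high}"
            )
    if not min_total <= len(evaluators) <= max_total:
        issues.append(
            f"panel size {len(evaluators)} is outside {min_total} to {max_total}"
        )
    return issues


# Hypothetical panel with one evaluator in each cell of the matrix.
panel = [
    ("A", "high", "low_moderate"),
    ("B", "high", "high"),
    ("C", "low_moderate", "low_moderate"),
    ("D", "low_moderate", "high"),
]
panel_issues = check_panel(panel)
```

The matrix is a general rule, so a reported mismatch is a prompt to reconsider the panel, not a reason to reject it outright.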
Step 3: Document Usability Goals
Accomplished by: Project team business representative; Usability consultant
Estimated time to complete: 2 hours per resource
Deliverable: Set of measurable usability goals
Determine measurement criteria to evaluate usability
Determine attributes of concern, such as:
ease of learning;
ease of use;
usefulness;
ease of navigation;
affordance; and
satisfaction.
Determine the measurements of concern.
Consider things you can measure, such as:
the time evaluators take to do something;
the number of tasks that can be completed within a given time frame;
the number of successful interactions vs. errors;
time spent recovering from errors;
the number of contiguous errors;
the number of commands / features used to complete a task;
the frequency of help calls;
the frequency of reference to manuals;
the number of positive and negative comments;
the number of times the evaluator becomes frustrated;
the number of times the evaluator appears delighted;
the number of times the evaluator gets sidetracked;
the number of similar problems encountered; and
the number of key items the evaluators failed to see (messages, labels, prompts).
Document goals.
Adapted from Jakob Nielsen, Usability Engineering, AP Professional, 1993
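Several of the measures listed above can be counted directly from a logger sheet. The following sketch assumes the logger records each observation as a time stamp (seconds from the start of the scenario) and one of three hypothetical kinds, "ok", "error", or "help"; neither the event format nor the name `summarize_session` comes from the original method.

```python
from itertools import groupby


def summarize_session(events):
    """Summarize one evaluator's session.

    events is a chronological list of (seconds_from_start, kind),
    where kind is "ok", "error", or "help" (assumed labels).
    """
    kinds = [kind for _, kind in events]
    # Help calls are left out here so that they do not break up a run
    # of errors; only successful and failed interactions are compared.
    interactions = [k for k in kinds if k in ("ok", "error")]
    longest_error_run = max(
        (len(list(group)) for kind, group in groupby(interactions) if kind == "error"),
        default=0,
    )
    return {
        "elapsed_seconds": events[-1][0] - events[0][0] if events else 0,
        "successes": kinds.count("ok"),
        "errors": kinds.count("error"),
        "help_calls": kinds.count("help"),
        "longest_error_run": longest_error_run,
    }


# Illustrative log only; not data from a real evaluation.
log = [
    (0, "ok"), (40, "error"), (55, "error"), (70, "help"),
    (90, "error"), (130, "ok"), (180, "ok"),
]
summary = summarize_session(log)
```

Measures that depend on observer judgment, such as frustration or delight, would still be tallied by hand from the observers' notes.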
These goals define the expected usability of a product and are part of the design criteria. In the case of a computer-human interface, usability goals should be defined during the analysis phase of a project. Caveats: (1) If usability goals are not established, then usability cannot be evaluated. (2) If usability goals are first defined following analysis and design (e.g., in order to "validate" an existing design), then usability testing has no meaning. The goals are reviewed and documented at this time as you prepare to test usability. Usability goals must be measurable, focusing on attributes and measurement of the attributes. The most important thing to understand about usability testing is that if the established usability goals are not met by a usability test, then the product is not usable! Usability testing has no value unless you are willing to accept the results and make changes.
Attributes, Measures, and Example Goals
Attribute: Ease of Learning
  Measures:
    Time to perform a task the first time
    Number of errors made performing a task the first time
    Number of help requests made while performing a task the first time
    Time to reach a certain performance level
    Percentage of users judging the product "easy to learn"
  Example goals:
    The evaluator will be able to complete task x without any training within 5 minutes, with 2 or less errors, with 3 or less help requests.
    The novice evaluator will navigate with zero errors and perform all tasks within 6 minutes.
    The novice evaluator will have no more errors than an expert in the last 10 minutes of working through the scenarios.
    The average rating for ease of learning will be a 4 or higher (1 being very difficult, 5 being very easy).

Attribute: Ease of Use (Productivity)
  Measures:
    Time to perform a task
    Number of errors made performing a task
    Number of help requests made while performing a task
    Percentage of user time spent resolving errors
    Frequency of reference to documentation
    Percentage of users judging the product will make them more productive
  Example goals:
    The evaluator will be able to complete task x in 3 minutes, with 2 or less errors, with 3 or less help requests.
    The experienced evaluator will be able to complete task x with # or less references to paper documentation/manuals.

Attribute: Usefulness (match to job function or workflow)
  Measures:
    Number of available commands not invoked
    Percentage of time product is used during normal user activities
    Percentage of users judging a product meets their needs
  Example goals:
    The evaluator will use the ABC application # minutes during x tasks.

Attribute: Ease of navigation
  Measures:
    Number of attempts to locate an item
    Average time to locate a menu item
  Example goals:
    The evaluator will locate the correct item from the menu within 3 clicks of the mouse.
    The evaluator will select items from the menu with 100% accuracy.

Attribute: Affordance*
  Measures:
    Time to infer what something is used for and use it correctly
    Number of incorrect usage attempts
  Example goals:
    The evaluator will drag-and-drop the file icon within five seconds of its appearance.
    The evaluator will drag the paper icon to the file icon correctly by the second attempt.
  *the degree to which the appearance of something suggests its use

Attribute: Satisfaction
  Measures:
    Evaluator rates the application for satisfaction at the conclusion of the lab.
  Example goals:
    The average rating for satisfaction will be a 4 or higher (1 being not satisfied, 5 being very satisfied).
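Because each example goal is a set of thresholds, checking observed results against a goal is a simple comparison. The sketch below uses the first ease-of-learning goal (5 minutes, 2 errors, 3 help requests) and the rating goal (average of 4 or higher on a 1 to 5 scale) from the table above. The class `TaskGoal`, the function `rating_goal_met`, and all observed values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskGoal:
    """Upper limits for one task, taken from a documented usability goal."""
    max_minutes: float
    max_errors: int
    max_help_requests: int

    def met_by(self, minutes, errors, help_requests):
        """True if one evaluator's observed results stay within every limit."""
        return (
            minutes <= self.max_minutes
            and errors <= self.max_errors
            and help_requests <= self.max_help_requests
        )


def rating_goal_met(ratings, minimum_average=4.0):
    """True if the average 1-to-5 rating reaches the goal."""
    return mean(ratings) >= minimum_average


# "Complete task x without any training within 5 minutes,
# with 2 or less errors, with 3 or less help requests."
goal = TaskGoal(max_minutes=5, max_errors=2, max_help_requests=3)

# Illustrative observations only: (minutes, errors, help requests).
observed = {
    "evaluator_1": (4.5, 1, 0),
    "evaluator_2": (6.0, 2, 1),
}
results = {name: goal.met_by(*values) for name, values in observed.items()}
ratings_ok = rating_goal_met([4, 5, 3, 4])
```

In this made-up data the second evaluator exceeds the time limit, so the goal is not met for that evaluator even though the error and help counts are within range.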
Project:_____________________________________ Date:__________________

Check the measures that are relevant to your evaluation. Add measures if needed.

Attribute: Ease of Learning
  Measures:
    Time to perform a task the first time
    Number of errors made performing a task the first time
    Number of help requests made while performing a task the first time
    Time to reach a certain performance level
    Percentage of users judging the product "easy to learn"
  Usability Goals:

Attribute: Ease of Use (Productivity)
  Measures:
    Time to perform a task
    Number of errors made performing a task
    Number of help requests made while performing a task
    Percentage of user time spent resolving errors
    Frequency of reference to documentation
    Percentage of users judging the product will make them more productive
  Usability Goals:

Attribute: Usefulness (match to job function or workflow)
  Measures:
    Number of available commands not invoked
    Percentage of time product is used during normal user activities
    Percentage of users judging a product meets their needs
  Usability Goals:

Attribute: Ease of navigation
  Measures:
    Number of attempts to locate an item
    Average time to locate a menu item
  Usability Goals:

Attribute: Affordance*
  Measures:
    Time to infer what something is used for and use it correctly
    Number of incorrect usage attempts
  Usability Goals:
  *the degree to which the appearance of something suggests its use

Attribute: Satisfaction
  Measures:
    Evaluator rates the application for satisfaction at the conclusion of the lab.
  Usability Goals:
Step 4: Develop Scenarios
Accomplished by: Project team business representative; Usability consultant
Estimated time to complete: Approximately 15 hours per project team member; 5 hours for the usability consultant
Deliverable: Test scenarios
Consider business processes, tasks, and activities which:
enable you to measure the usability goals;
accurately reflect the performer's environment (e.g., system and telephone used simultaneously);
are realistic with respect to frequency;
are appropriate for the diversity of performers (e.g., different skill levels);
will be performed when the system is in production;
have representative degrees of complexity;
realistically impact business performance (dollars) if not performed correctly;
have representative business criticality;
accurately represent volumes (of calls, of processed documents, etc.); and
consider performers outside your organization if appropriate.
Scenarios are business processes, tasks or activities that are performed by the business community using the application being tested. They are structured so that the team can measure usability by observing the evaluator attempt to complete them. Scenarios consist of real work that will be performed. Each evaluator will complete the same set of scenarios so that the observers can identify trends in the usability of the system. Scenarios are the means by which we measure the usability goals.
In order to write scenarios, you must know:
This sample scenario measures usability goals for:
This is a good example because it provides just enough information for the evaluator to complete the task.
Shareholder Sam Jones calls for the latest information in his portfolio. At the same time, he also would like to update personal information and to make additional changes.
1. Update Sam's personal information so that it will be effective immediately:
Account number: 123-44-5678 Password: Toledo Address: 121 Granger Street City: Albany State: New York Zip Code: 12209 Telephone: 616 243-5009
2. Mr. Jones requests the following changes be made to his portfolio effective 5/1/96:
3. Mr. Jones also requests the following allocations:
The following is a bad scenario because it is too directive with regard to system navigation. It does not allow you to measure usability!
Click on the accounts button, key in the account number, click on OK, click on the Portfolio icon, click on the dollar button, click on the dollar amount...
The following sample scenarios are from an actual usability evaluation of a web site that contains learning materials for employees. The content for this evaluation was a World Wide Web page containing an instructional outline.
--------------------------------------------------------------------------------------------
The Business of XYZ
Learning Materials for Information Systems Professionals
Usability Evaluation - February 20, 1996
Welcome, and thank you for participating in the usability test of the on-line learning materials. Please follow the scenarios listed below as you use the materials to identify learning activities. Please write your answers as indicated, and verbalize your thoughts as much as possible. You are NOT being tested; rather, we are testing to see whether or not these materials are easy to use. If you have difficulty answering any of the questions, we will make changes to the materials.
Begin at the home page of The Business of XYZ. Use The Guide to answer questions 1 - 5 about the Program:
1. How will you measure your success with this program?
2. List the types of resources contained in the program.
3. When will you know that you have completed the program?
4. How many objectives are there for this program?
5. What is Objective 3?
Return to the home page for The Business of XYZ. Use The Program section to answer questions 6 - 8.
6. List two resources that are available to assist you with sketching the organizational hierarchy of Subsidiary ABC. Be specific.
7. Which checklist items are provided to help you complete Objective 7?
8. Assume that you are studying XYZ's profitability. Information on this subject is contained in the XYZ Annual Report and The Business of XYZ handout. Where can you find copies of these resources in order to read them?
Return to the home page of The Business of XYZ.
9. Locate the link that allows you to comment on the program and record your comments on line.
10. In your own words state the purpose of this program.
STOP!!!
Step 5: Develop Evaluator Briefing
Accomplished by: Project team business representative; Usability consultant
Estimated time to complete: 1 - 2 hours for each resource
Deliverable: Briefing document and plan for delivering the briefing
The briefing is designed to orient evaluators to:
what they will do;
what will be done with the results;
the sequence of events;
the fact that the product is being tested, not them;
the fact that the test is being filmed;
advance training;
how to get help;
how to proceed through the scenarios;
the pencil and paper available for making notes;
the need to verbalize their thoughts; and
taking breaks as they would on the job.
The purpose of the briefing is to prepare each evaluator for the usability lab just prior to conducting the test. At the conclusion of the briefing, the evaluator will know the purpose of the test, his or her role in the test, what is being tested, how to proceed with the test, and how to ask for help during the test, and will be at ease about the test.
Do not tell the evaluators: the usability goals; that the application is being measured according to usability goals; how many people are observing the test; or who is observing.