WE MEASURE USABILITY
by one stringent test and three loose tests.
The stringent test is called muscle event counting. A strategy
for discovering the difference of ease between two products is
to count the muscle group events needed by each product for a
common set of operations or macros. An operation requiring one
eye movement, a forearm movement, a finger movement and two voice
phrases would count as five muscle events. Why is this significant?
Each change of muscle group takes time, approximately one tenth
of a second, and requires a shift of attention from the task at hand.
The fewer muscle events there are, the less attention shifting
the user does, and thus the greater the perceived ease.
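The count described above is simple enough to sketch in a few lines. What follows is only an illustration, not the tooling actually used: the function names and the muscle-group labels are invented for the example, and the 0.1 second figure is the approximate cost per event quoted in the text.

```python
# Approximate cost of one muscle event, from the text (about one tenth of a second).
SECONDS_PER_EVENT = 0.1


def muscle_event_count(operation):
    """Count muscle events in one operation.

    `operation` is a list of muscle-group labels, one entry per event
    (the labels themselves are hypothetical).
    """
    return len(operation)


def estimated_overhead_seconds(operation):
    """Rough time spent on muscle events alone for one operation."""
    return muscle_event_count(operation) * SECONDS_PER_EVENT


def easier_product(macro_a, macro_b):
    """Compare two products on the same macro (a list of operations).

    Returns "A", "B" or "tie" depending on which needs fewer muscle events.
    """
    total_a = sum(muscle_event_count(op) for op in macro_a)
    total_b = sum(muscle_event_count(op) for op in macro_b)
    if total_a == total_b:
        return "tie"
    return "A" if total_a < total_b else "B"


# The operation from the text: one eye movement, a forearm movement,
# a finger movement and two voice phrases.
example_operation = ["eye", "forearm", "finger", "voice", "voice"]
print(muscle_event_count(example_operation))  # 5
```

Comparing two products then amounts to recording the same macro on each and calling `easier_product` on the two recordings.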
The first loose test is to throw away the instruction manual
and see how far one can get without reference to the manual. Every
situation in which the user does not have adequate information
to carry out the task from whatever is otherwise presented by
the product is considered, by definition, an instance of poor
usability. This test normally brings howls of protest from vendor
developer teams until we remind them that users do not bring manuals
when traveling with their laptops.
The second loose test is a series of criteria. They are:
- visibility of system status - especially with regard to what
the recognizer is currently doing
- degree of match between user concepts and product
conventions - e.g. no computerese
- presence and depth of emergency exits - e.g. undo, redo, don't
do, please do because I need it very badly at this particular
crucial moment even if the product doesn't think so
- the ease with which various kinds of consecutive misrecognition
errors can be diagnosed and corrected. The previously mentioned "undo"
criterion concerns user mistakes. This criterion concerns recognizer
mistakes. Recognizer errors need characteristically different
tools than user mistakes.
- consistency - e.g. window shapes, colors, positioning, wording,
button location, grouping of similar concepts
- minimal need for user recall between one part of a dialog
and another
- provision for experienced users as well as beginners - e.g.
shortcut key definition capability, diagnostic options
- messages which indicate problems precisely in plain talk and
recommend an implementable (as opposed to perfunctory) solution
- elegance of design - a small number of objects do a large
number of tasks (which is not the same as a small number of
objects overloaded with optional, unrelated or temporally dependent
functions)
- error preventive design - e.g. each instance in which an "undo"
might serve well is carefully engineered out so that the need
to undo is reduced
- "hint count" of manuals - e.g. hints, tips and sidebars
in an instruction manual are red flags of either poor manual
writing or poor product design. A product shouldn't need tips
to work better; it should just work better all by itself if
you follow normal instructions. Well-designed instructions and
product functions don't need cautions, hints, caveats, secondary
explanations and tips. The presence of hints often indicates failure
to attend to the "error preventive design" criterion just mentioned.
- time duration of operations - e.g. "easy" or frequently
needed operations should take less time than "hard" or
infrequently needed ones
- minimization of advertising, eye candy or other desiderata
which compete for user attention or screen real estate during
- on-line help which really helps
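The "hint count" criterion in the list above lends itself to a rough mechanical check. The sketch below assumes, purely for illustration, that a manual is available as plain text and that hints are set off by a leading marker word followed by a colon or dash. Real manuals mark hints in many other ways, such as sidebars, icons and boxes, so the result is at best a lower bound.

```python
import re

# Marker words taken from the criterion's own wording; the "word at the start
# of a line, followed by ':' or '-'" convention is an assumption.
HINT_PATTERN = re.compile(
    r"^\s*(hint|tip|caution|caveat)\b\s*[:\-]",
    re.IGNORECASE | re.MULTILINE,
)


def hint_count(manual_text):
    """Count lines in a plain-text manual that open with a hint-style marker."""
    return len(HINT_PATTERN.findall(manual_text))


# Hypothetical manual excerpt, invented for the example.
sample_manual = (
    "Step 1: Say 'open file'.\n"
    "Tip: Speak slowly for long file names.\n"
    "CAUTION - do not cough while dictating.\n"
    "Step 2: Say 'save'.\n"
)
print(hint_count(sample_manual))  # 2
```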
The third loose test is whether a given
product behavior is an "eyebrow raiser", i.e.
whether it caused a tester to jump back slightly in the
chair and wonder, with eyebrows raised, what on earth the designer
could have been thinking in order to make that particular design
choice. Often such a product behavior turns out to be an outright
bug on some level. Sometimes teams justify this sort of behavior
by pointing out that it simplifies programming. We have zero sympathy
for that perspective. Sometimes it is justified by saying "the
user will get used to that behavior." We have tolerance for
that only if the weird product behavior speeds up learning the
product or actual productivity and no alternative is available.
We then examine these usability weaknesses according to whether
they are frequent or one-time occurrences and whether they
require Herculean user efforts to overcome. That leads to a final
rating of each as one of three possibilities: usability catastrophe,
annoyance or merely cosmetic issue.
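The text names the two inputs to this rating (frequency and effort to overcome) and the three possible outcomes, but not the rule that connects them. The sketch below fills that gap with one plausible mapping, which is an assumption made for illustration only: a weakness that is both frequent and Herculean to overcome is a catastrophe, one that is either frequent or Herculean is an annoyance, and the rest are cosmetic.

```python
def rate_weakness(frequent, herculean_effort):
    """Rate one usability weakness.

    `frequent` is True for a recurring problem, False for a one-time one.
    `herculean_effort` is True if overcoming it takes Herculean user effort.
    The mapping below is assumed; the source text does not spell it out.
    """
    if frequent and herculean_effort:
        return "catastrophe"
    if frequent or herculean_effort:
        return "annoyance"
    return "cosmetic"


print(rate_weakness(frequent=True, herculean_effort=True))    # catastrophe
print(rate_weakness(frequent=False, herculean_effort=False))  # cosmetic
```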
If you have speech applications under development and wish to
have an analysis performed along these lines please contact
Usability Heuristics by Jakob Nielsen