THE CURRENT CROP of
desktop speech recognition applications shows some improvement
in accuracy and scope over 1998 offerings, but very little improvement
in the human factors aspect. Click here
for the current list of reviewed desktop products.
Vendors do not seem to understand that the competition is no
longer about recognition accuracy - the products are all generally
accurate enough to be productive. Instead, the competition is
about ease of use.
It's hard to understand how some of these usability failures
got past the quality assurance teams working in these companies
- some of the problems are so obvious. Has the passion for excellence
been derailed by other business priorities? Perhaps.
What's Going Wrong With Ease of Use
We suspect that the vendors have been lulled into believing
that traditional beta testing provided by potential customers,
who are rewarded for their efforts by free copies of the product,
will find most problems. But classical software testing strategies
can't work for this type of product, we think, for several reasons:
- The monetary value of one of these speech recognition products
is perhaps in the neighborhood of $150 to $200. It's difficult
to know how beta testers value their time, but $200 isn't
going to buy much time these days from anybody with significant
intelligence.
- Internal company quality assurance teams are almost invariably
computer literate and therefore tend to accept as normal certain
human factors gaffes that a naive user would trip over.
- Internal teams normally test to the "Product Specification"
and, in the entire course of recorded information technology
history, product specifications never specified human factors
issues along with detailed usability criteria - concentrating
instead on merely specifying product functions. Well, almost
never.
- The range of things to be tested is very much larger than
is obvious when compared to a non-speech-oriented product. Adding
speech to the traditional personal computer interface of mouse,
screen and keyboard is not a mere linear extension of interface
complexity. It is, in fact, of combinatorial complexity
because you are adding the multiple factors of vocalization
variation, English syntax variation, word processing habits,
text origination and assembling habits, and real-time interaction
with the current desktop state - along with generally random
computer misrecognition errors of both text and commands,
coupled with people's heretofore unobserved and idiosyncratic
responses to those kinds of errors.
- The quality expectation levels of users are raised very much
higher when a computer "understands you." The shoddy usability
characteristics of the previous slop which yesterday passed
for acceptable software are today made irritating and obvious
by these new expectations. People wake up out of their accustomed
usability numbness when they're in the presence of a speech
recognition system.
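To put rough numbers on the combinatorial point made in the list above, here is a minimal sketch of the arithmetic. The factor names come from that list, but the count of variations assigned to each factor is purely hypothetical, chosen only for illustration; we have not measured any of them. The sketch compares what a tester would face if the factors simply added up against what they face when every factor can combine with every other.

```python
from math import prod

# Hypothetical number of distinct variations per factor.
# These counts are illustrative assumptions, not measurements.
HYPOTHETICAL_LEVELS = {
    "vocalization variation": 4,
    "English syntax variation": 5,
    "word processing habits": 3,
    "text origination and assembling habits": 3,
    "current desktop state": 6,
    "misrecognition type (text or command)": 2,
    "user response to the error": 4,
}


def additive_cases(levels):
    """Test cases needed if each factor could be tested in isolation."""
    return sum(levels.values())


def combinatorial_cases(levels):
    """Test cases needed if every factor interacts with every other."""
    return prod(levels.values())


if __name__ == "__main__":
    print("factors tested in isolation:", additive_cases(HYPOTHETICAL_LEVELS))
    print("factors tested in combination:", combinatorial_cases(HYPOTHETICAL_LEVELS))
```

Even with these modest made-up counts, the combined figure is several hundred times the isolated one, which is the gap a specification-driven test plan tends to miss.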
Where Desktop Speech Recognition is Now
OK, those things all make the problem thorny. And they force
us to be realistic about the current state of these products.
Here's the bottom line.
- If you are physically handicapped, these products are
a godsend.
- If you are a fumble-fingered typist, these products are
a godsend, but you'll curse the gods a lot.
- If you are a beginning computer user, these products
are a demi-godsend and you'll hope the demigod will go to
school and get some more god lessons as soon as possible.
- If you are an executive who is secretly afraid of computers
but would like to show your competitors how avant-garde you
are, these products are a godsend but keep your demos short
and simple.
- If you are very computer literate, you will bitch and
moan and scream and holler and swear at the gods, but you
will still use the products because sometimes the products
are actually magic.
What You Can Find in These Reviews
The reviews that follow are detailed, some would say even picky.
The detail makes them long, some would say even excessive. We
believe the detail is important for those people who may need
to support these products in a corporate environment and would
like some advance notice about the kind of support necessary.
We also have not seen this kind of detail reported elsewhere.
All we have seen is glowing and gushing - which these reviews
decidedly are not.
For an overview of the general usability criteria that we have
used in arriving at this detail, please check out the section of
this web site titled "How We Test". You may also be interested
in the equipment and tactics we used for testing this kind of
product. Because of the length, each review is preceded by a
summary for those of you who might find the detail unrelenting.
Since the marketplace changes constantly, not all the detail for
all the speech products of interest is currently in place.
For products which do not yet have full reviews but which we intend
to cover, you will find preliminary information here. The detail
for these products will be updated from time to time in keeping
with an article entitled "Not Yet Speaking Freely" which appeared
in Software Development Magazine, the November 1999 issue.
And now on to the desktop continuous
speech recognition product reviews...
IBM ViaVoice Millennium. Click here
for full review.
Lernout & Hauspie VoiceXpress 4.0. Click here
for full review.
Philips FreeSpeech 2000. Click here
for full review.
Dragon Systems NaturallySpeaking 4.0. Click here
for full review.