IBM ViaVoice has made some progress in scope and accuracy compared
to last year's IBM ViaVoice 98. But it still has the human-factors
rough edges that characterize a product released perhaps too
soon.
Synopsis:
Pros: Best integration with chat software, slick and professional
mobile device, good recognition accuracy which really does improve
over time, Linux SDK offerings, worked well with the Telex M-60
Desktop Microphone.
Cons: Counterintuitive and occasionally incorrect punctuation
handling, frail keystroke and command integration with some other
desktop applications, weak macro facility.
Installation and Initial Enrollment
Testers did not have a uniform experience in installation. On some
machines it went well. On others, a message reported that a system
file was open, without indicating which file might be the
trouble. It asked us to abort or retry several times; we retried,
which worked. Another message said that we should update Internet
Explorer because it had system files that IBM needed. Fine, if
you like IE on your system. We wondered why IBM didn't include
the files in its install procedure for people who use other browsers.
Other installations produced an error message saying that
"scrobj.dll" did not "register properly". At another point during
installation, a message indicated that the slow machine
"might not" be powerful enough to be accurate, and again the
message gave no indication as to which resource might be
the limiting factor. (The slow machine's accuracy turned out to
be just fine despite the message.)
Our first human factors critique would be that most of the messages
contained jargon unintelligible to the beginning computer user.
But, more importantly, most messages gave no indication of
how the problem ought to be righted.
The installation procedure presumes that a person will always
install the product in its entirety on the "C:" Drive and calculates
space requirements on that assumption. In many corporate environments,
applications are installed on a second partition to keep as much
as possible off the Windows drive in case corruption occurs.
When we did this, the space requirement calculations were overstated
and misleading. The installation process also left several hundred
kilobytes of temp files in the Windows temp directory.
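To illustrate the kind of check we would have expected, here is a
minimal sketch in Python. It measures free space on whichever drive
holds the chosen install directory rather than assuming "C:". We have
no knowledge of how the IBM installer is actually written; the function
names and the idea of passing in a required byte count are our own
assumptions for illustration.

```python
import os
import shutil


def nearest_existing_dir(path):
    """Walk up from path until an existing directory is found."""
    current = os.path.abspath(path)
    while not os.path.isdir(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def free_bytes_for(target_dir):
    """Free space on the volume that would hold target_dir."""
    return shutil.disk_usage(nearest_existing_dir(target_dir)).free


def has_room(target_dir, required_bytes):
    """True if the volume holding target_dir can fit required_bytes."""
    return free_bytes_for(target_dir) >= required_bytes
```

An installer built along these lines would run the check once per
destination, so components placed on a second partition are not
counted against the Windows drive.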
Presentation of Initial Setup Cluttered
Pictures accompanying the installation dialog boxes only occasionally
bore a useful relationship to the text of the dialog boxes, thus
wasting valuable screen real estate on puzzling information.
Our users were particularly puzzled by a picture which shows a
man standing near what might be either a white board or a pull-down
projection screen. Misrecognition errors during training
were announced by an unfamiliar and unexplained "burp" sound, with
scant visual indication that a misrecognition had occurred.
Some testers, recognizing that they had grievously misspoken
a training sentence and desiring to give their very best performance
to the program, hit the "back" button in order to retrain the
misspoken phrases. This action is possible in the other products.
They were instead thrown back to the very beginning of the training
session where training text choice is made. They had a difficult
time figuring out how to return to the training track as the dialog
explanations do not account for this reasonable, though unanticipated,
user sequence.
Bold captions in some training texts were difficult to read because
the character spacing between words was too small on an 800 by
600 monitor. The typefaces used to distinguish between recognized
and as yet unprocessed words were too similar for some testers,
leading them to wait unnecessarily for the machine to catch up.
Some users found the black text on grey presentation difficult
to read on low-gamma monitors. The buttons at the bottom of dialog
boxes for completing options settings were sometimes completely
obscured by the Windows taskbar, thus requiring some complex manual
screen real estate management in order to access the buttons.
Strange Timing Estimates
The installation program displayed a message estimating 15 minutes
of computer time required to process the training text. But on
the fast machine that time was 6 minutes; on the slow machines
it was 45 minutes. So it would appear that "15" is a fixed alphanumeric
string rather than any real estimate derived by determining the
computer's actual processing power. This does little harm when
the actual processing time is shorter than 15 minutes. But when
it is longer, the behavior confused our testers.
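A derived estimate need not be elaborate. The sketch below, in Python,
times a small calibration loop and scales a reference figure by the
result. It is only an illustration under our own assumptions: the
calibration loop, the function names, and the notion of a "reference
machine" on which the 15-minute figure was measured are all
hypothetical, and a CPU-bound loop may not reflect work that is limited
by disk or memory.

```python
import time


def calibration_seconds(iterations=200_000):
    """Time a fixed, arbitrary CPU-bound loop on this machine."""
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i * i
    return time.perf_counter() - start


def estimate_minutes(reference_minutes, reference_seconds, local_seconds):
    """Scale a reference estimate by this machine's relative speed.

    reference_seconds is the calibration time on the (hypothetical)
    machine where reference_minutes was measured; local_seconds is the
    calibration time on the machine being installed on.
    """
    return reference_minutes * (local_seconds / reference_seconds)
```

With these definitions, a machine that runs the calibration loop three
times slower than the reference would be quoted 45 minutes instead of
15, which is closer to what our slow machines actually needed.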
The misestimate was especially confusing because after about 10 minutes on
the slow machine, the installation program displayed congratulatory
messages for finishing the training process and invited one to
begin as a new user. When the tester clicked the OK button, a
new message came up that said processing was not finished so the
new user could not be initiated. When that OK button was clicked,
all dialog boxes disappeared leaving the desktop empty but the
disk drives still whirring.
It required a check with the Windows Task Manager to verify
that voice model processing was still continuing, and further checks
were required to determine when processing had really finished.
We would not expect the naive user to have navigated this sequence
well. This misstatement of completion time and the false completion
messages continued in other training sessions.
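The ordering we expected is simple: announce completion only after the
background work has actually ended. The Python sketch below shows that
ordering with a worker thread. It is a stand-in, not a description of
the product's internals; process_voice_model and train_and_announce are
hypothetical names, and the sleep merely simulates the processing.

```python
import threading
import time


def process_voice_model(seconds, log):
    """Stand-in for the real voice model processing."""
    time.sleep(seconds)
    log.append("processing finished")


def train_and_announce(seconds=0.2):
    """Run the processing, then announce completion, in that order."""
    log = []
    worker = threading.Thread(target=process_voice_model, args=(seconds, log))
    worker.start()
    # Keep waiting (a real dialog would show progress here) until the
    # worker has really exited.
    while worker.is_alive():
        worker.join(timeout=0.05)
    log.append("training complete")  # only reached after the worker ends
    return log
```

Because the congratulatory message is appended only after the loop
exits, it cannot appear while the processing is still running.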