Integration with Other Applications
Cumbersome Linkage to Other Applications
ViaVoice requires you to set the voice control status bar each
time you "dictate directly" into another application, while the
other products reviewed assume that you are dictating into the
window that has desktop focus. Once you've connected ViaVoice
to the application, your cursor slows down. The cursor sluggishness,
compounded by the punctuation problem mentioned earlier, causes
errors that need manual correction and is troublesome when you
are carrying on multiple chat conversations.
However, under Instant Messenger, all corrections are properly
inserted. In fact, ViaVoice behaves very well in this context. As a
special feature, the program also accepts some "chat jargon,"
translating spoken phrases such as "rolling on the floor laughing"
into the chat room shorthand of "rofl." But it won't properly
retranslate that shorthand during text-to-speech read-back from
that chat window.
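To illustrate the asymmetry described above, here is a minimal sketch of a two-way jargon table. It is not IBM's implementation; the table, the function names, and the matching rules are our own assumptions, and only the "rofl" pair comes from our tests.

```python
import re

# Hypothetical jargon table; only the "rofl" entry is taken from the review.
JARGON = {
    "rolling on the floor laughing": "rofl",
}
# Reverse table, needed for read-back (the step ViaVoice appears to skip).
EXPANSIONS = {short: phrase for phrase, short in JARGON.items()}


def to_shorthand(text):
    """Replace spoken phrases with chat shorthand (the dictation direction)."""
    for phrase, short in JARGON.items():
        text = re.sub(re.escape(phrase), short, text, flags=re.IGNORECASE)
    return text


def to_speech(text):
    """Expand shorthand back into full phrases before text-to-speech."""
    def expand(match):
        word = match.group(0)
        return EXPANSIONS.get(word.lower(), word)
    return re.sub(r"[A-Za-z]+", expand, text)
```

With a reverse table like `EXPANSIONS`, read-back could speak the full phrase instead of trying to pronounce the shorthand.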
Dictating into Dreamweaver was a very different story. Words
collided frequently, extra characters were deleted during the
punctuation processing at the end of a sentence, and audio playback
was erratic for long phrases. Highlighted areas of text under
correction sometimes did not correspond to what was in the correction
window. The "select" command would often bounce the highlighted
area around before finally settling in the wrong place. The cursor
slowdown mentioned above was especially noticeable here, extending
outside the text area to the thumbnails of Dreamweaver windows.
Putting database queries to the English Wizard program works
fairly reliably, apart from the occasional bouncing of highlighted
correction areas that we also saw in Dreamweaver.
To be fair, we note that IBM does not necessarily certify that
ViaVoice will work identically, or even reliably, with every program
that exists - nor do any of the other vendors. To its credit,
IBM provides a mode of dictating into its SpeakPad that links
the SpeakPad with an arbitrary application and allows quick transfer
of the SpeakPad text to that application. That reduces many of
the problems noted above at the price of a few manual actions
to initiate the transfer.
Nevertheless, we believe that equal convenience across all contexts
is a goal into which IBM (and the other vendors) should put more
effort. We believe that the average consumer will not tolerate
anything less over the longer term. Nor will that consumer excuse
weak integration just because achieving it demands a horrendous
internal software design effort from the vendor. Consumers don't
care how hard it is for the programmers - they just want the
convenience.
Macros Weak
With no way to compensate for the variable loading time of the
Netscape mail application, send the proper control-key sequences
to Netscape, or deal with the address list, ViaVoice had difficulty
with the e-mail macro. It does handle corrections and insertions
properly when dictating directly into the manually started mail
application. The audio playback of recorded voice from within
Netscape was trouble free. Telling ViaVoice to "go to sleep"
before you send the e-mail doesn't leave it in sleeping mode after
the Send button is pressed; it wakes up spontaneously.
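The loading-time problem is the kind of thing a procedural macro facility could handle by polling until the target application is ready before sending any keys. The sketch below shows that general technique only; it is not a ViaVoice feature, and `is_ready`, `send_keys`, and the key names are hypothetical stand-ins supplied by the caller.

```python
import time


def wait_until(is_ready, timeout=30.0, interval=0.25):
    """Poll is_ready() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def run_mail_macro(is_ready, send_keys, keys, timeout=30.0, interval=0.25):
    """Send the key sequence only after the mail window reports ready.

    is_ready and send_keys are hypothetical callbacks: one checks whether
    the mail application has finished loading, the other delivers a single
    keystroke to it.
    """
    if not wait_until(is_ready, timeout=timeout, interval=interval):
        return False  # application never became ready; send nothing
    for key in keys:
        send_keys(key)
    return True
```

The point of the design is that the macro adapts to a slow or fast load instead of relying on a fixed delay.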
There is no way to copy and paste parts of one macro into another.
Reusing one macro inside another, or capturing keystrokes into
a macro, is possible only in certain types of macros. The macro
facility is adequate for simple insertions into text and
simple navigation, but it is not the generalized procedural control
facility that we would prefer. There is no way to edit the internal
representation of a macro, something that can be done in the Dragon
Systems product and is useful for developers.
Transcription Usable
The round-trip time to change between direct dictation
and input from the mobile recorder is about 20 seconds on the
slow machine. This is the fastest of all the vendors' products.
However, we found that taking input from the mobile recorder
directly into the SpeakPad applet, our preferred method, was
troublesome: the Recognition Wizard kept popping up for various
reasons, and there is no way for the recorder to interact with it.
So the best way to use the recorder is with the standalone
transcription software.
Although that standalone package works, we find the extra steps
involved to be an annoyance. Our ideal would be a design where
it was possible to switch back and forth at will between the two
sources within the same document.
Accuracy drops by about four percentage points, from about 94% via
microphone to 90% via recorder (minimum training) in our tests -
usable but not great.
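A few points of accuracy matter more than they sound, because what the user feels is the number of corrections. A rough sketch of the arithmetic, assuming the accuracy figures are per word and using the approximate values from our tests:

```python
def error_rate(accuracy_percent):
    """Errors per 100 words, assuming accuracy is measured per word."""
    return 100.0 - accuracy_percent


mic_errors = error_rate(94.0)       # about 6 errors per 100 words
recorder_errors = error_rate(90.0)  # about 10 errors per 100 words

# Relative growth in the number of corrections the user has to make.
relative_increase = (recorder_errors - mic_errors) / mic_errors

print(f"{mic_errors:.0f} vs {recorder_errors:.0f} errors per 100 words "
      f"({relative_increase:.0%} more corrections)")
```

On these figures the recorder produces roughly two-thirds more errors to correct than the microphone does.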
We liked the mobile recorder, an Olympus D150. Its sleek, lightweight
form factor tucks easily into clothing. The button design is intelligent
and its usage well thought out. We found it a nice tool to have by
the side of the bed for catching those crucial answers to business
questions that seem only to emerge from the subconscious in the
middle of the night.
Very Full SDK
The ViaVoice SDK comes on four CD-ROMs, and supports Microsoft's
SAPI 4.0 in U.S. English, U.K. English, French, German, Italian,
Spanish, and Japanese. It also supports text-to-speech in all
languages except Japanese. The ViaVoice SDK has a unique facility
called "speaker recognition," which lets an application verify
the speaker's identity by voiceprint. Other unique ActiveX features
are the animated text-to-speech icons, a high-level text-editing
control, and a speech control for HTML applications.
IBM has an elaborate support web site for developers at http://www.software.ibm.com/speech/.
There are also optional SDKs supporting telephony and Java. IBM
makes an SDK for Linux. A beta version is available for free download
at http://www.software.ibm.com/speech/dev/sdk_linux.html.
Our tests revealed that, in its current beta form, it requires
you to do enrollment and training in Windows and then import the
speaker files to Linux. The sample programs are quite informative.