VoiceWizard: the speech resource  
for executives and other adventurers exploring voice technology miracles    

IBM ViaVoice Millennium cont'd:

Integration with Other Applications

Cumbersome Linkage to Other Applications

ViaVoice requires you to set the voice control status bar each time you "dictate directly" into another application, whereas the other products reviewed assume that you are dictating into the window that has desktop focus. Once you've connected ViaVoice to the application, your cursor slows down. This sluggishness, compounded by the punctuation problem mentioned earlier, causes errors that need manual correction, and it is troublesome when you are carrying on several chat conversations at once.

However, under Instant Messenger, all corrections are properly inserted. In fact, ViaVoice behaves very well in this context. As a special feature, the program also accepts some "chat jargon," translating spoken phrases such as "rolling on the floor laughing" into the chat room shorthand of "rofl." But it won't properly translate that shorthand back during text-to-speech read-back from that chat window.
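To make the jargon point concrete, here is a minimal sketch of a two-way phrase table: one direction turns a dictated phrase into shorthand, the other expands shorthand again before read-back. This is our own illustration, not IBM's implementation. The table and the function names `to_shorthand` and `to_spoken` are hypothetical; only the "rofl" pair comes from our tests.

```python
import re

# Hypothetical jargon table. Only this one pair appears in the review.
JARGON = {"rolling on the floor laughing": "rofl"}


def _swap(text, source, target):
    # Replace whole-word, case-insensitive occurrences of source with target.
    pattern = r"\b" + re.escape(source) + r"\b"
    return re.sub(pattern, target, text, flags=re.IGNORECASE)


def to_shorthand(text, table=JARGON):
    """Dictation direction: spoken phrase -> chat shorthand."""
    for phrase, short in table.items():
        text = _swap(text, phrase, short)
    return text


def to_spoken(text, table=JARGON):
    """Read-back direction: chat shorthand -> spoken phrase."""
    for phrase, short in table.items():
        text = _swap(text, short, phrase)
    return text
```

A read-back routine that passed chat text through something like `to_spoken` before handing it to the speech synthesizer would pronounce the phrase rather than the abbreviation.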

Dictating into Dreamweaver was a very different story. Words collided frequently, extra characters were deleted during the punctuation processing at the end of a sentence, and audio playback was erratic for long phrases. Highlighted areas of text under correction sometimes did not correspond to what was in the correction window. The "select" command would often bounce the highlighted area around and then finally settle in the wrong place. The cursor slowdown mentioned above was especially noticeable, and it extended outside the text area to the thumbnails of Dreamweaver windows.

Putting database queries to the English Wizard program works fairly reliably, apart from the occasional bouncing of highlighted correction areas that we also noted in Dreamweaver.

To be fair, we note that IBM does not necessarily certify that ViaVoice will work identically, or even reliably, with every program that exists - nor do any of the other vendors. To its credit, IBM provides a mode in which you dictate into its SpeakPad, which can be linked to an arbitrary application so that the SpeakPad text transfers quickly to that application. That reduces many of the problems noted above at the price of a few muscle events to initiate the transfer.

Nevertheless, we believe that equal convenience across all contexts is a goal into which IBM (and the other vendors) should put more effort. We believe that the average consumer will not tolerate anything less over the longer term. Nor will that consumer excuse weak integration on the grounds that it demands a horrendous internal software design effort from the vendor. Consumers don't care how hard it is for the programmers - they just want the convenience.

Macros Weak

ViaVoice had difficulty with the e-mail macro because it offers no way to compensate for the variable loading time of the Netscape mail application, send the proper control-key sequences to Netscape, or deal with the address list. It does handle corrections and insertions properly when dictating directly into the manually started mail application. The audio playback of recorded voice from within Netscape was trouble free. Telling ViaVoice to "go to sleep" before you send the e-mail doesn't leave it in sleeping mode after the Send button is pressed; it wakes up spontaneously.
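The loading-time problem is the kind of thing a procedural macro facility usually handles by polling until the target application is ready, then sending keystrokes. The sketch below shows that pattern in general terms. It is not a ViaVoice feature; `wait_until` and the `is_ready` callback are hypothetical names of our own, and how a macro would actually detect that the mail window is up is left to the caller.

```python
import time


def wait_until(is_ready, timeout=30.0, interval=0.25,
               clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or the timeout expires.

    Returns True if the condition was met, False if time ran out.
    clock and sleep are injectable so the loop can be tested
    without real delays.
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A macro built on this idea would call `wait_until` with a check for the mail window and only then send its key sequence, instead of firing keystrokes at an application that may still be loading.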

There is no way to copy and paste parts of one macro into another. Reusing one macro inside another and capturing keystrokes into a macro are possible only in certain types of macros. The macro facility would be adequate for simple insertions into text and simple navigation, but it is not the generalized procedural control facility that we prefer. There is no way to edit the internal representation of a macro, something that can be done in Dragon Systems' product and is useful for developers.

Transcription Usable

The round-trip time to change between direct dictation and input from the mobile recorder is about 20 seconds on the slow machine. This is the fastest of all the vendors.

However, we found that our preferred method, taking input from the mobile recorder directly into the SpeakPad applet, was troublesome: the Recognition Wizard kept popping up for various reasons, and there is no way for the recorder to interact with it. So the best way to use the recorder is with the standalone transcription software.

Although that standalone package works, we find the extra steps involved to be an annoyance. Our ideal would be a design where it was possible to switch back and forth at will between the two sources within the same document.

Accuracy drops by about four percentage points, from about 94% via microphone to 90% via recorder (with minimum training) in our tests - usable but not great.
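A few points of accuracy sound small, but the correction workload grows faster than the headline figures suggest. The short calculation below works this out from our two test numbers. It assumes the percentages are word-level accuracy, so that every missed word costs one correction; the function names are ours.

```python
def errors_per_hundred(accuracy_percent):
    """Words needing correction per 100 dictated, given accuracy in percent."""
    return 100 - accuracy_percent


def relative_error_increase(before_percent, after_percent):
    """Fractional growth in the error count when accuracy falls."""
    e_before = errors_per_hundred(before_percent)
    e_after = errors_per_hundred(after_percent)
    return (e_after - e_before) / e_before


mic = errors_per_hundred(94)        # 6 errors per 100 words
recorder = errors_per_hundred(90)   # 10 errors per 100 words
growth = relative_error_increase(94, 90)  # about 0.67
```

Under that assumption, the recorder produces roughly two-thirds more errors than the microphone: 10 corrections per 100 words instead of 6.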

We liked the mobile recorder, an Olympus D150. Its sleek, lightweight form factor tucks easily into clothing. The button design is intelligent and its usage is well thought out. We found it a nice tool to have by the side of the bed for catching those crucial answers to business questions that seem only to emerge from the subconscious in the middle of the night.

Very Full SDK

The ViaVoice SDK comes on four CD-ROMs and supports Microsoft's SAPI 4.0 in U.S. English, U.K. English, French, German, Italian, Spanish, and Japanese. It also supports text-to-speech in all of these languages except Japanese. The ViaVoice SDK has a unique facility called "speaker recognition," which lets an application verify the speaker's identity by voiceprint. Other unique ActiveX features are the animated text-to-speech icons, a high-level text-editing control, and a speech control for HTML applications.

IBM has an elaborate support web site for developers at http://www.software.ibm.com/speech/. There are also optional SDKs supporting telephony and Java. IBM also makes an SDK for Linux; a beta version is available for free download at http://www.software.ibm.com/speech/dev/sdk_linux.html. Our tests revealed that, in its current beta form, you must do enrollment and training in Windows and then import the speaker files to Linux. The sample programs are quite informative.

Copyright © 1999-2002 VoiceWizard
Comments? Questions? to voicemaster@voicewizard.com

Page Last Updated: 12/05/99