Speech recognition gains from less competition

I know this sounds odd, but the problem with third-party speech products is that they tend to have different features and strengths. The problem exists because I want them all. And right now I can't simultaneously run two or more of my favourite speech products.

Lernout & Hauspie Speech Products makes my favourite speech software. The company really knows how to do two things extremely well - mix dictation with natural-language commands and translate text.

For example, I can use Voice Xpress to dictate the text in this paragraph and then - without doing anything to change from dictation to command mode - tell it to select the paragraph and make it bold. The trick is in where you pause. If you don't pause anywhere in the phrase "Tell it to select the paragraph and make it bold", it takes dictation. If you say, "Select the paragraph" and "Make it bold" surrounded by short pauses, it understands the phrases as commands. (You can also combine the commands, as in "Make this paragraph bold".)

Text translation is a more recent development from Lernout & Hauspie. The Voice Xpress Ultimate Suite product includes a number of translation utilities that translate from English to Spanish, French, and German, or from Spanish, French, and German to English. One of the utilities lets you translate text from within almost any application.

I find this very cool because I love the French language. Did you ever notice that you can say almost anything in French and it will sound romantic? For example, ask "Qui a coupé le fromage?" Sounds wonderful, doesn't it? Next time you're getting cosy with that someone special, lean over and whisper, "Est-ce que vous allez être dans la salle de bain tout le jour? Donnez une chance à quelqu'un autrement!"

(I apologise if there are any grammatical errors in the above French. I don't really remember enough of what I learned in my French classes to be sure the Lernout & Hauspie utility is doing a good job.)

Here's where I begin to run into problems. When it comes to word processors, Lernout & Hauspie provides great natural-language commands only with Microsoft Word. I prefer WordPerfect myself, so I either have to forfeit the natural language or use another product, such as Dragon NaturallySpeaking. Dragon includes a decent module specifically designed to add natural-language commands to WordPerfect.

The natural-language approach Dragon uses isn't quite as strong as the Voice Xpress features for use with Microsoft Word. And Dragon doesn't provide natural-language commands for the Windows desktop. It does, however, have the merit of being more accurate at taking dictation than Voice Xpress. So the ideal solution is to use Voice Xpress for most applications and for the Windows desktop, and Dragon for WordPerfect. But the two can't run side by side, so I have to shut down Voice Xpress whenever I want to run Dragon and WordPerfect.

Then there is Conversa Web 3.0. This is a whiz-bang, killer Web navigation tool. Conversa Web lets you control your browser entirely by voice. If you see a hyperlink with the text "Today's top news stories", you simply say some or all of the text in the hyperlink and bam, it takes you there. If there is any reason why it can't surf hyperlinks based on text (for example, if the link is tied to an image), Conversa Web simply marks those links with numbers. So instead of speaking the text of the link, you say "No 3" and off you go.
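To make the link-selection idea concrete, here is a minimal sketch in Python of how a voice browser might resolve a spoken phrase to a link. It is not Conversa's actual implementation, which I haven't seen. The `Link` class, the `resolve` function, and the rule that a spoken fragment matches any link whose text contains it are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Link:
    """Hypothetical page link; text is empty for image-only links."""
    url: str
    text: str = ""


def number_untitled_links(links: List[Link]) -> Dict[int, Link]:
    """Give each link without speakable text a 1-based number."""
    numbered: Dict[int, Link] = {}
    for link in links:
        if not link.text.strip():
            numbered[len(numbered) + 1] = link
    return numbered


def resolve(utterance: str, links: List[Link]) -> Optional[Link]:
    """Map a spoken phrase to a link, or return None if nothing matches."""
    spoken = utterance.lower().strip()
    if not spoken:
        return None

    # "No 3" or "Number 3" selects the third numbered (textless) link.
    words = spoken.split()
    if len(words) == 2 and words[0] in ("no", "number") and words[1].isdigit():
        return number_untitled_links(links).get(int(words[1]))

    # Otherwise match on some or all of the link text (assumed rule:
    # the first link whose text contains the spoken fragment wins).
    for link in links:
        if link.text and spoken in link.text.lower():
            return link
    return None
```

With a page holding a "Today's top news stories" link and two image links, saying "top news" would pick the news link and "No 2" would pick the second image link. A real product would also have to cope with misrecognised words and with several links sharing the same text, which this sketch ignores.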

All of the other navigational aids are available by speech, including things like "Go back", "Go back three pages", "Scroll down", and "Refresh page". It's one of those products you have to try to believe.

But here's the bad news. It works only with Internet Explorer. For reasons of security and preference, I don't use IE. So I don't use Conversa Web very often.

I hate to see anyone usurp the innovative technology of good companies such as Lernout & Hauspie, Dragon Systems, and Conversa. But the only solution I can imagine is for someone to integrate a standard speech interface with natural-language features into the operating system. Then the application vendors, and not the voice technology vendors, could exploit the speech capabilities without stepping on one another.
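Here is a rough sketch of what I mean, written in Python. Everything in it is hypothetical: the `SpeechService` class, its method names, and the rule that anything not registered as a command falls through to dictation are my own inventions, not any real operating system's API. The point is only that one shared recogniser could route phrases to whichever application has registered them.

```python
from typing import Callable, Dict, Optional


class SpeechService:
    """Hypothetical OS-level service: one recogniser shared by all applications."""

    def __init__(self) -> None:
        # Maps application name -> {spoken phrase -> handler}.
        self._commands: Dict[str, Dict[str, Callable[[], str]]] = {}
        self.focused_app: Optional[str] = None

    def register(self, app: str, phrase: str, handler: Callable[[], str]) -> None:
        """An application vendor registers its own natural-language command."""
        self._commands.setdefault(app, {})[phrase.lower()] = handler

    def focus(self, app: str) -> None:
        """Record which application currently has the user's attention."""
        self.focused_app = app

    def hear(self, phrase: str) -> str:
        """Run the focused application's command, or fall back to dictation."""
        handlers = self._commands.get(self.focused_app or "", {})
        handler = handlers.get(phrase.lower().strip())
        if handler is not None:
            return handler()
        return f"dictate: {phrase}"
```

Under this arrangement a word processor could register "make it bold" and a browser could register "go back", and neither would need to ship its own recogniser or shut the other one down.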

What do you think should be done?
