AT LARGE: I know that voice

Thwarting the HAL 9000 at its own game, ARN's Matthew JC Powell opens the pod bay doors and begins to write . . .

Ever since people started to imagine a future with high-performance computers, the plan has been for people to interact with those computers by speaking to them. In those early days, of course, real-life interaction with computers involved punch cards and the like, so anything even slightly more natural would be an improvement. Science fiction films set in the not-too-distant future have almost always featured hyper-intelligent machines that never seem to think much faster than the human characters, or to solve problems or resolve crises any more efficiently. They do, however, speak and understand natural language, and that is intelligent enough.

As technology has progressed, the way in which we interact with our machines has become increasingly natural. We gesticulate with a mouse or a trackpad. We manipulate icons and menus instead of struggling to remember arbitrary commands.

Somewhere along the line, someone decided that handwriting would somehow be preferable to typing, as if scraping an instrument along a surface was closer to "language" than typing. I suppose the practice of writing predates typing, but I don't really think it's that much of an improvement. Fans of PalmPilots and Newtons may disagree, but I'm now so used to typing everything that writing just makes my wrist hurt.

Anyway, where was I going with this? Oh yeah. It's getting more natural and easy to interact with computers, and accordingly, speech recognition has begun to make its first real inroads onto desktops. Right now, I can tell you with some authority that pretty hot speech technology exists on high-performance computers at IBM and elsewhere. Right here in Sydney, buried in a lab at Macquarie University, there exists a computer which, I swear to you, can understand all the lyrics to James Reyne songs on first listening. OK, I might be lying slightly. No-one can understand James Reyne. But the day is drawing nearer.

"Consumer" speech recognition products have so far left something to be desired. My Mac has speech recognition software which (mostly) gets commands right if they're not too long, but periodically decides that I'd like to switch my computer off in the middle of using it. Stopping its headlong rush into shutdown involves grabbing the mouse, thus defeating the purpose somewhat. On the other side of the platform fence, this office received a copy of IBM's ViaVoice software late last year that gave us hours of hilarious family fun with its attempts at understanding.

But all of this stuff is on the improve, and application vendors know it. Lotus has already implemented support for ViaVoice in the latest version of its word processor, and Microsoft is said to be investing heavily in speech recognition in the lead-up to Office 99. The day will come, and soon, when you will need to be able to differentiate between speech recognition products for the benefit of your customers.

Is it what you want?

Or will you? Is the industry's race for recognition merely a need to satisfy science fiction ideals of what constitutes an "intelligent" computer, or do people really want this stuff? Imagine for a moment a busy office environment. You walk in and you are surrounded by the sound of thousands upon thousands of key clicks per minute. Even in this office with eight full-time editorial staff, the sound often resembles the last moments of a domino world record attempt -- like you used to see on That's Incredible.

Imagine for a moment if, instead of clicking, you heard eight people mumbling to their computers at once. Is this going to be a more productive office environment?

A final note on vested interest: I have a bet (a very lucrative $10) with another technology writer on this very subject. He thinks speech recognition will be pervasive by the year 2001. I disagree, and so far I think I'm winning. I'd be interested in a vote on the subject though. Drop me an e-mail with the subject header "Speech by 2001 -- YES!" or "Speech by 2001 -- NO!" before Wednesday, March 12 and I'll print the results in the following issue of ARN.

