A speaking Palm handheld, webpages you can talk to, a courteous robot, and a first draft of the Star Trek universal translator. Science fiction? Not in IBM's labs.
The year of speech technology is almost here, according to WS "Ozzie" Osborne, general manager of IBM Voice Systems. He offers prototypes as proof, previewing the futuristic systems at a recent Speech Fair at IBM's Santa Teresa Laboratory in San Jose. Some 2500 research scientists are exploring voice technologies throughout IBM.
IBM is focusing about 80 per cent of its investment on speech-enabling the enterprise. But those higher-level technologies will benefit consumers as well.
"Voice will be integral to computing as devices change from PCs to handhelds; the interface will have to change," Osborne says. "Carrying around a keyboard will be too hard." After all, he points out, devices keep getting smaller, but our fingers don't.
The web and voice technology are already being wed in wireless phones. With the help of a pending standard called VoiceXML, you may be able to access webpage content by phone, or surf by asking questions like, "What's the latest list of the New York Times bestsellers?" Already, 64 developers support VoiceXML, which uses Enterprise Java Beans.
"The cell phone is the ultimate thin client," observes Osborne. "Human interface is what we're really working on."
Watch out for Robby
Focus on the interface is not limited to voice. Interpersonal communications are also driven by interactive visual cues, so IBM is developing robots that can react to humans.
An early prototype is a table-top robot consisting of a lollipop-shaped head of transparent orange plastic, Muppet-like bug-eyes, and a tiny video camera hidden in its nose. The camera senses movement, so the robot can make eye contact with its audience as it moves within a 12-degree range, says robot-handler Dr David Nahamoo, director of worldwide research for IBM Voice Systems.
The colourful contraption could also respond intelligently to conversation; for example, if told, "You're stupid," the robot could frown and then reply in a synthesised voice, "You're rude."
But don't expect to have your own personal C-3PO protocol droid anytime soon. "Ten to 15 years is a reasonable guess," Nahamoo says, pointedly noncommittal. "The future is really going to be about being able to interact with computers the same way we interact with people."
More than mechanical issues must be overcome before a fully interactive robot can become a reality. "The user interface aspect needs to be worked out, as does the application integration," Nahamoo says. "Visual recognition is much more difficult than just speech recognition, since there are two dimensions (involved)."
From hand to voice
Coming soon, however, is a snap-on speech recognition base for Palm devices. A prototype contains a speaker, earphone jack, microphone, and -- most importantly -- a coprocessor that provides the necessary computing power to support voice technologies such as speech recognition and text-to-speech.
Using IBM's Personal Speech Assistant application, you can navigate through a to-do list, execute several hundred commands, and access your address book. For example, you can say, "Find Bill Smith", and the contact record for Bill Smith opens on-screen.
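As a rough illustration of the "Find Bill Smith" example, the step after speech recognition can be thought of as matching the recognised text against a command pattern and looking up the result. The sketch below is not IBM's Personal Speech Assistant code; the address book, the `handle_utterance` function and the phone number are all hypothetical, and it assumes the recogniser has already turned the audio into text.

```python
import re

# Hypothetical address book; the entry and its fields are illustrative only.
ADDRESS_BOOK = {
    "bill smith": {"name": "Bill Smith", "phone": "555-0100"},
}

def handle_utterance(text, contacts):
    """Map recognised text such as 'Find Bill Smith' to a contact record.

    Returns the matching record, or None if the text is not a 'find'
    command or the name is not in the address book.
    """
    match = re.fullmatch(r"find\s+(.+)", text.strip(), flags=re.IGNORECASE)
    if match is None:
        return None  # not a 'find' command
    # Normalise case and spacing so "find  bill smith" matches the stored key.
    key = " ".join(match.group(1).lower().split())
    return contacts.get(key)

record = handle_utterance("Find Bill Smith", ADDRESS_BOOK)
print(record)
```

A real product would need to cope with misrecognised names and several hundred other commands; this only shows the shape of a single lookup.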
The integrated microphone offers a limited degree of noise cancellation; however, IBM's software is designed to compensate. Dictating a memo is as simple as holding down the record button and speaking into the unit's microphone. The prototype stores audio files in the base's 4MB of flash memory, which, with IBM's compression scheme, can hold 30 minutes of audio. The base can also be designed to accommodate removable media such as Compact Flash cards or even a 340MB IBM Microdrive.
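Those storage figures imply a fairly low bit rate. The back-of-the-envelope calculation below is a sketch, not an IBM specification: it assumes 1MB means 1,048,576 bytes, that all of the flash memory is available for audio, and that a Microdrive would be filled at the same average rate, none of which the figures above confirm.

```python
# Assumptions (not from IBM): 1MB = 1,048,576 bytes, all storage used for audio.
FLASH_BYTES = 4 * 1024 * 1024        # 4MB of flash in the prototype base
MICRODRIVE_BYTES = 340 * 1024 * 1024  # 340MB IBM Microdrive
AUDIO_SECONDS = 30 * 60               # 30 minutes of dictation

def average_bitrate_kbps(capacity_bytes, seconds):
    """Average bit rate (kilobits per second) that fills the capacity in `seconds`."""
    return capacity_bytes * 8 / seconds / 1000

def minutes_of_audio(capacity_bytes, kbps):
    """Minutes of audio that fit in the capacity at the given bit rate."""
    return capacity_bytes * 8 / (kbps * 1000) / 60

rate = average_bitrate_kbps(FLASH_BYTES, AUDIO_SECONDS)
print(f"{rate:.1f} kbit/s")  # prints "18.6 kbit/s"

hours = minutes_of_audio(MICRODRIVE_BYTES, rate) / 60
print(f"{hours:.1f} hours on a 340MB Microdrive")  # prints "42.5 hours on a 340MB Microdrive"
```

Under these assumptions the audio averages under 19 kilobits per second, and the same rate would put roughly 42 hours of dictation on a Microdrive.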
When you sync the handheld with your desktop PC, IBM's ViaVoice engine on your desktop automatically transcribes the audio clip and sends the transcript back to the handheld. As demonstrated on an IBM WorkPad unit (running the Palm OS), the prototype base adds slightly to the handheld's weight and length but is not unwieldy.