The future of speech recognition: Gaston Bastiaens
- 09 December, 1998 13:20
In 1999, natural language understanding, the underlying technology that takes voice recognition beyond speech to text, will begin targeting major corporations with a variety of business applications and tools, ranging from computer-driven translation programs to tools that access corporate data. IDG's Ephraim Schwartz spoke with Gaston Bastiaens, president and CEO of Lernout & Hauspie, one of the market leaders in speech technology.
IDG: Speech technology has a certain gee-whiz quality, but why do you believe it has a place in the enterprise?
Bastiaens: There are a number of reasons. First, because of the globalisation of the markets and the Internet. If companies want to do business in China and in other countries, or if they have corporations over there, the capability to do real-time translation is of utmost importance. [L&H recently introduced a program called iTranslator for this market.] Second, the human voice is the most natural user interface with any device on earth and with any software program.
We have speech recognition systems that can receive Chinese, Japanese, Arabic, French, Portuguese, Italian, and so on - that's much more universal than a keyboard. We're going to ship Mandarin Chinese and Cantonese recognition - speech recognition that works seamlessly, integrated with a keyboard and with a graphics tablet optimised for professional users.
Ease of use is often cited as a reason why speech technology will catch fire. Is there more to it than that?
Yes. It also has to do with the cost of operations. If I can reduce the cost of my operations substantially by using voice as one of the key elements, then I definitely am going to do that.
Can you give me an example?
If you look at the medical market, there is a yearly multibillion-dollar cost for hospitals and specialists for transcription. By applying speech, costs can be substantially reduced. It also increases the quality and the throughput time, which is very important. You can also approach all kinds of businesses that deal with forms-based solutions.
Does speech recognition go beyond filling out forms?
Of course. The second part of the equation is artificial intelligence or natural language understanding: the ability for me to decipher tons of information and even bring that back with a short summary.
That's where I see the most important developments. Clearly this is beyond where we are today. It is the next level of natural language understanding.
If you look at any corporation, you will see that the amount of unstructured information is growing rapidly. It is one of the key tasks for Oracle, for example, to contain that information in a way that can be accessed completely and efficiently.
All of the key breakthroughs we see in our company for the next generation are with artificial intelligence and based on natural language understanding and its ability to access non-structured information, which is the majority of information which is out there.
If you go to an insurance company, the structured information will be very high. But the more you go away from there into a business operation, the more unstructured information is there because there is so much information that comes to us every day. According to studies from the Gartner Group, before you can make a decision, you have to go through massive amounts of information, often unstructured.
Well, the technology of natural language understanding will allow you to go through this information very quickly and, with the help of intelligent agents, read those documents and bring back the relevant information to you. That is what I see as a major contribution to any corporation using natural language understanding and artificial intelligence.
What are the next big markets?
You will see small handheld devices with enormous power, where the human voice is the only really good input/output for accessing data and information. The computerised translation market, by applying natural language technologies - that is a huge market opportunity.
Also, our new approach in speech synthesis is where you can let a computer or a device talk with a voice equal to a human voice. That's a research breakthrough where we are well ahead of the major linguistics labs in the world. That's what we launched at Comdex. We are going to introduce that in the market early next year.
Every large corporation trying to sell products across the globe has to take into account the local culture and the local languages. That's where we have an enormous expertise. And that's where we see a big opportunity for our company. The two growth areas we see for natural language understanding are in the user interface on one hand and language technologies on the other.