We have had a lot of requests over the years for products that are always on and listening for a key “trigger” word. The challenge of this approach is making a “trigger” that doesn’t accidentally fire when the word is not spoken (a false accept), but also doesn’t accidentally fail to fire when it IS spoken (a false reject). The trade-off between these two types of errors is not simple: improving one usually makes the other worse, and background noise, especially talking, typically makes voice interfaces perform poorly. And this doesn’t even take into account the constant energy drain from devices that are always on and listening.
Nevertheless, we have gotten the same question over and over. “What’s the point of having speech recognition if I need to press a button to activate it?”
Some of our earliest customers, like VOS Systems, used a hands-free trigger to control a light switch. This was a particularly useful application, because it could be plugged into a wall without battery drain.
The “Phrase Spotting” technology has advanced over the years, and recently we introduced a new spin on it that we call “Truly Hands-Free” for Bluetooth car kits. This technology is being extremely well received, and we are consistently hearing high praise about its performance in noise. It really hits the RIGHT combination of minimizing false accepts AND false rejects, all with minimal power drain considering it is always listening for a trigger word.
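To make the false accept versus false reject trade-off concrete, here is a minimal sketch in Python. It is not Sensory's algorithm: it assumes a detector that outputs a confidence score between 0 and 1 and fires when the score reaches a threshold. The score lists and the `error_rates` function are hypothetical, invented purely for illustration, and are not measured data.

```python
# Hypothetical detector confidence scores (0..1). Illustrative only, not measured data.
TRIGGER_SCORES = [0.95, 0.90, 0.85, 0.70, 0.55, 0.40]     # the trigger word WAS spoken
BACKGROUND_SCORES = [0.10, 0.20, 0.30, 0.45, 0.60, 0.75]  # other talking or noise


def error_rates(threshold, trigger_scores, background_scores):
    """Return (false_reject_rate, false_accept_rate) for one threshold.

    Assumption: the detector fires whenever score >= threshold.
    """
    # A false reject: the trigger was spoken but scored below the threshold.
    false_rejects = sum(1 for s in trigger_scores if s < threshold)
    # A false accept: background audio scored at or above the threshold.
    false_accepts = sum(1 for s in background_scores if s >= threshold)
    return (false_rejects / len(trigger_scores),
            false_accepts / len(background_scores))


if __name__ == "__main__":
    for threshold in (0.3, 0.5, 0.7, 0.9):
        fr, fa = error_rates(threshold, TRIGGER_SCORES, BACKGROUND_SCORES)
        print(f"threshold={threshold:.1f}  "
              f"false rejects={fr:.0%}  false accepts={fa:.0%}")
```

With these made-up scores, raising the threshold steadily cuts false accepts while false rejects climb. Moving the threshold only trades one error for the other. Reducing both at once takes a detector whose scores for the trigger word and for background audio overlap less.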
Now we’re starting to apply this technology to some new and interesting applications:
- Answer/Ignore for Bluetooth headsets and car kits. One of the most desired features of Sensory’s BlueGenie Voice Interface is that it allows answering a phone without having to touch it, for example in a Bluetooth headset or hands-free car kit. The challenge has been getting this to work well in the presence of really loud ring tones and background noises like a car radio or wind noise. The solution: we’ve implemented a Phrase Spotting version of Answer/Ignore that is completely robust to noise and ALWAYS does the right thing.
- Interactive Books. Imagine a book that offers an interactive experience for parents and children while they are reading at night. For example, I say “Jack and Jill went up the hill” and Jack grunts and says “This is hard work!”, and then I say “to fetch a pail of water”, and I hear a water pouring sound, etc. Pretty fun! In the past this would have been difficult because the surrounding talking would have messed up the recognition, but with Phrase Spotting the phrase can be picked out even in the middle of a sentence!
- Remote-less Home Controls. If you are my age, you might remember the days of having to walk up to a TV set and manually crank the channel and volume knobs. That’s unheard of today, and nobody would ever buy a TV like that…but we do buy thermostats, microwaves, clocks, fans, heaters, lights, radios, and virtually everything else around the house with a manual interface. Why not use voice triggers? Sensory is currently working with many different consumer electronics manufacturers to build this revolutionary recognition technology into a new generation of voice-controlled devices.
Lots of exciting stuff in development here!! Next time, maybe I’ll write about our voice morphing TTS!