I made it to a couple of interesting tradeshows over the last month. The Voice Search Conference was held at the Marriott hotel in San Diego, which provided a really nice setting; the show was very well organized and well attended. Voice Search is all about bringing the power (and revenues) of internet search engines to the cell phone market. Google, Microsoft, Yahoo, and others are getting into this in a big way.
On the first morning I accidentally went into the wrong room for breakfast and sat down with a bunch of people from another industry. They were really negative about phone-based speech recognition, and offered these opinions:
- “Oh, like when you call somewhere and the phone says ‘Press or Say One’”
- “Yeah I tried it but it never works with my voice”
- “I hate that stuff, I just want to talk to a live operator”
I pointed out that it’s gotten a lot better than this! Directory services like 1800Goog411 (Google), 1800Call411 (Microsoft), and 1800Free411 (Jingle) actually work quite well and do save time. Most of them are based on Nuance engines, which are very powerful server-based technologies. Nuance is the 800-pound gorilla in the speech space, because they’ve acquired pretty much every player in speech recognition (well, other than Sensory of course, but they certainly cleared away all our competitors in the embedded area). Microsoft, Google, and Yahoo have pretty large speech R&D teams, but I’d guess they all use Nuance IP in some fashion, probably to expand their language coverage, if not more.
I found it humorous when someone quoted a woman from Nuance who said “My boss told me never to give live demos at shows because they never work.” Novauris gave some of the best demos at the show, but sure enough they pushed the envelope until some of them stopped working. I do commend them for being willing to demonstrate technologically challenging concepts in front of a live audience. It can be something of a crapshoot showing off cutting-edge technologies.
I spoke on a panel at the Voice Search Conference, and one of the other speakers was from IBM India. He gave a presentation about a telecom web that they are deploying so that users can use their phones to find and hear about service providers in India, basically through short audio messages like “Hello, I’m Pradeep the plumber. I have 12 years of experience doing all types of plumbing.” This is similar to searching the web and reading the short blurbs about different businesses, except that you hear the entries over the telephone instead.
At CTIA Wireless 2008, the big cell phone show in Las Vegas, I had a chance to try the Vlingo voice search engine. Yahoo has licensed it already, and it is simply AMAZING! It is the closest thing to “natural language” and “context independent” speech recognition that I have ever seen. Vlingo provides a speech-to-text service that uses a thin-client-to-server model to provide recognition in cell phone apps.
Bluetooth headsets were prominently on display at the show. Plantronics introduced a comfortable and cool-looking headset that included a case which provides a 5-hour recharge. Great concept!
The high point of the CTIA conference for me was BlueAnt Wireless winning an award for Best of Show in the peripherals category for their V1 Bluetooth Headset. BlueAnt is a smart and aggressive company that is making rapid inroads and finding a lot of success in the Bluetooth headset and carkit markets. The V1 is billed as the first voice-controlled headset, and it is based on Sensory’s BlueGenie Voice Interface, which gives the user the ability to control common functions like answering or rejecting calls and pairing devices vocally. It even has 1800Goog411 as a built-in command, meaning you’ll never have to press buttons to place a call to any business across the US. Now that’s what I call useful!