Posts Tagged ‘speech technology’
June 16, 2011
I’ve been in the speech technology field since the beginning and I have to say, there has never been a more exciting time for this space. Recently some of the biggest names in technology have announced the integration of voice capabilities into their products. At this year’s E3 conference, Microsoft stated that the next version of it’s Xbox Live will include voice commands. Also, it appears Apple will integrate speech-to-text input in the iOS 5. Android 2.1 already has speech-to-text built in to its mobile platform. And just this week, Google announced that voice search capability is coming to the Google.com search box (how cool?!)
All of these developments will be exposing more and more mainstream users to the benefits of the voice user interface on a daily basis. Consumers demand so much from personal devices and if they expect to control them via voice, they’ll want to do so from beginning to end (no button pressing, ever). This is where Sensory comes in. Our Truly Hands-Free technology is better than anything out there and lets manufacturers add a hands-free trigger to the interface so the user can give the device a call to action without ever lifting a finger. No need to take eyes off the road to make a call from a hands-free car kit, no need to dirty up your tablet or computer by using messy (cooking) hands to call up a recipe, no need to disturb your comfortable state of rest to set an alarm clock, etc.
I can say from where I sit, many manufacturers see the value of a voice user interface that includes a hands-free trigger phrase. Expect to see the makers of automotive products, smartphones, home entertainment products and more using Sensory’s technologies in the coming year. And be sure to stay tuned for exciting enhancements and innovations in store for our Truly Hands-Free technology, as well.
April 21, 2011
I had an interesting email conversation with a blog reader last month, and I thought I’d share some of the dialog. He is an equity analyst (who wishes to remain anonymous) that follows some companies in the speech industry. He emailed me saying:
“I came across your blog some time ago and have been reading it since with great interest. A topic of particular interest to me has been your periodic comments about how Apple has lagged the investments made by Google in speech recognition technology, opting instead to lean on Nuance. I was also struck by your observation that big companies, such as Google, have a history of licensing Nuance technologies before eventually taking those capabilities in-house.”
This makes me feel the need to clarify something…Nuance has great technologies, period. When companies feel the need to bring the technology “in-house”, it’s not driven by a failing of Nuance, but simply the fact that the USER EXPERIENCE IS SO CRITICAL to the success of consumer products. It’s difficult for big companies like Apple, Google, Microsoft, HP and others that depend heavily upon positive consumer experiences to farm out the technology for such a critical component.
The conversation turned to Apple, and the equity manager asked about the all too common question of whether Apple might acquire Nuance. Here’s, roughly, how the conversation went:
Analyst: What is your current view on Apple’s efforts in this space? As a company they seem to take great pride in controlling the user experience and that extends to how they think about key technologies (witness the Flash vs. HTML 5 spat, for example). It makes me wonder if Apple would be satisfied relying on Nuance for such a visible and important capability or whether they’d feel the need to also bring it in-house.
Todd: Apple can definitely afford Nuance. In fact, Apple probably makes enough profit in a good quarter to buy Nuance outright. Nevertheless, it would be a BIG price tag, and not in line with Apple’s traditional acquisition strategy. I wouldn’t rule it out, but I wouldn’t say they “need” Nuance, either, but they do need to do something, and they know it. Apple has been posting job requisitions this year in the area of speech recognition, so they definitely want to bring more of the technology in-house. My guess is they’ll do some M&A in the speech technology area as well. Google and Microsoft have combined aggressive hiring with M&A, so it seems likely that Apple will go beyond the SIRI acquisition (which added an AI layer on top of Nuance) and acquire more core speech technology expertise.
Analyst: I agree with you that Apple makes/has enough cash to acquire Nuance, but that it would be out of character for Apple to do so. Where I’m most interested is whether there are meaningful technical/architectural reasons why Apple must partner with Nuance for SR, or if the gap between Nuance and these smaller players is narrow enough that Apple would acquire or partner more closely with one of the small guys in order to maintain more control over the technology. Many people seem to think that an SR acquisition would have to be of Nuance, but I’ve been told that there are many quality SR start-ups. If you had to bet, do you think that Apple needs the 800-pound gorilla Nuance in order to do a good job in SR, or would one of these smaller companies give Apple a sufficient base upon which to build out a solution?
Todd: I’m confident Apple will eventually own it. I’d say the odds of them buying Nuance though are quite low (10-30% as a wild guess). There’s no technical reason why they can’t use another technology, but the 3 best reasons they’d acquire Nuance are:
Apple’s in-house teams are quite familiar with the Nuance engines as they have already implemented them in some products. Apple is engaged in a lot of patent fights, and Nuance has the best portfolio of speech patents in the world – That’s a really valuable asset that the Google’s and Microsoft’s would probably fight over! Of course, for the cost of Nuance, someone could probably buy all of the other TTS and SR tech companies in the world! ;-)
Analyst: Apple really has a phobia about adding third-party software to their products. No Mosaic core in their browser, no audio compression codecs from Dolby or DTS, no Flash from Adobe…. They acquired two microprocessor design companies to create a proprietary stack on ARM chips rather than using broadly available chipsets from Qualcomm or Broadcom. Now comes the question of what to do with SR technology….
Todd: It will be interesting to see how this all unfolds. I suspect a lot of other large companies will want to get into the game as well. It could be that the cloud-based solutions for TTS and SR become generic and replaceable enough that there isn’t a need to bring them “in-house”. Of course, Sensory is hoping and betting on the need for the Client/Server approaches, where an embedded solution (like our Truly Handsfree Triggers) nicely complement the cloud-based offerings.