Archive for the ‘voice search’ Category
October 14, 2016
I watched Sundar and Rick and the team at Google announce all the great new products from Google. I’ve read a few reviews and comparisons with Alexa/Assistant and Echo/Home, but it struck me that there’s quite an overlap in the reports I’m reading and some of the more interesting things aren’t being discussed. Here are a few of them, roughly in increasing order of importance:
June 9, 2014
I still subscribe to the San Jose Mercury News, as they do a good job of tech business reporting. One of my favorite Mercury News writers is a true critic in the literary sense of the term, Troy Wolverton. Troy rarely raves and is typically critical, but in a smart, logical, and unemotional way.
A few days back he started writing about Microsoft’s Cortana and said “Watch out Siri, someone wants your job.”
I was eager to read his review of Cortana this morning and in particular his comparison with Siri. He ended up giving it a 7/10, and concluding Siri was still ahead. What I thought was most interesting though was that in his final summary, he compared three products and three assistants based on the ease of calling up each of those assistants:
Motorola is Sensory’s customer, and I am happy to read that Troy gets it and considers this front end activation an important metric in comparing personal assistants!
April 25, 2014
It’s not often that I rave about articles I read, but Ian Mansfield of Cellular News hit the nail on the head with this article.
Not only is it a well-written and concise article, but it’s chock-full of recent data (primarily from JD Power research), and most importantly it’s data that tells a very interesting story that nicely aligns with Sensory’s strategy in mobile. So, thanks Ian, for getting me off my butt to start blogging again!
A few key points from the article:
Now, let me dive one step deeper into the problem, and explore whether customer satisfaction can be achieved with minimal impact on cost:
Seamless voice control is here and soon every phone will have it, and it doesn’t add any hardware cost. Sensory introduced it with our TrulyHandsfree technology, which allows users to just start talking, and our “trigger to search” technology has been nicely deployed by companies like Motorola that pioneered this “seamless voice control” in many of their recent releases. The seamless voice control really doesn’t add much cost, and with excellent engines from Google and Apple and Microsoft sitting in the clouds, it can and will be nicely implemented without affecting handset pricing.
Sensors are a different story. By their nature they will be embedded into the phones and will increase cost. Some “sensors” in the broadest sense of the term are no-brainers and necessities: for example, microphones and cameras are a must-have, and the six-axis sensors combining gyroscopes and accelerometers are arguably must-haves as well. Magnetometers and barometers are getting increasingly common, and to differentiate further, leading manufacturers are embedding things like heartbeat monitors; stereo 3D cameras are just around the corner. To address the desire for biometric security, Samsung and Apple have embedded fingerprint sensors in the two bestselling phones in the world!
The problem is that all these sensors add cost, and in particular those fingerprint sensors are the most expensive and can add $5-$15 to the cost of goods. It’s kind of ironic that after spending all that money on biometric security, Apple doesn’t even allow them as a security measure for iTunes purchases. And both Samsung and Apple have been chastised for fingerprint sensors that can be cracked with gummy bears or glue!
A much more accurate and cost-effective solution can be achieved for biometrics by using the EXISTING sensors on the phones and not adding special-purpose biometric sensors. In particular, the “must-have sensors” like microphones, cameras, and 6-axis sensors can create a more secure environment that is just as seamless but much more difficult to crack. I’ll talk more about that in my next blog.
September 27, 2013
I think everybody in the speech industry must know about Motorola’s touchless control feature. Their ad campaign using comedian/actor TJ Miller has been a smashing success. Although their ads started off a bit racy (“touch each other not phones”), the switch to Miller introduced the “lazy phone guy” (which appears to be a knock on Apple) and better showcases key features and advantages of Moto X. The big advantage is in the low power speech activation technology that calls up Google Now without touching the phone!
The lazy phone campaign has ads for each of the device’s key features – Touchless Control, Quick Capture, Active Notifications, and the “Design It Yourself” concept. They are all entertaining, but it’s the touchless control that brings the most laughs. The first video went viral with over 15 million views, making it one of the most popular mobile phone ads ever.
The new touchless control ad is pretty funny with hundreds of thousands of views and growing!
September 24, 2013
Motorola, who just happens to be a Sensory customer, launched a suite of new phones including Moto X and three Droids – Maxx, Ultra, and Mini – all with this awesome feature called “touchless control.” The “touchless control” uses a technology to wake up the phone by voice from a low power state, so the phone is always on and listening. Sorta like TrulyHandsfree! It links into Google Now so you can control pretty much anything and access information without touching the phone.
August 16, 2013
I saw a post recently in the Android Central forum that talked about Sensory’s technology as used by Samsung:
What makes it different from any other voice app is its part of the OS. e.g. get a call, you can say ‘Answer’ or ‘Ignore’. alarm rings, just say ‘Snooze’. You don’t have to launch an app or press buttons to do this, the phone is always active and listening. No one else does this!
It’s an astute comment but not 100% accurate. When people talk about “always listening” what they really mean is that it appears to be “always listening”. At Sensory we call it TrulyHandsfree, and the idea is that there can be certain “modes” or “windows” where it listens for specific words. Like when the alarm goes off, it listens for “snooze” etc. If you say “snooze” when the alarm isn’t going off you find it’s not really “always listening”.
Glass has a similar usage model. It’s “always listening” but for different things at different times and only for short periods of time. I put my Glass on and timed it. The OK Glass trigger window seems to last 3-4 seconds, then the next set of commands (like Get Directions to) stays on 10-11 seconds.
What’s really cool about Glass is that during those listening windows you can say other things and it doesn’t “false fire” on them. I let my wife try out my Glass, and she said “You mean I just say OK Glass and then I can say any of these things like get directions to Chef Chu’s and…woah it works!” It ignores everything it’s not listening for and picks out the things it is listening for. The technology is known as “keyword spotting” for this reason.
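The listening-window idea can be sketched in a few lines of Python. This is only an illustration, not how Glass or TrulyHandsfree is actually built: the `ListeningWindow` class and `spot` function are hypothetical names, plain strings stand in for audio, and the 4-second trigger window is just my rough stopwatch timing from above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ListeningWindow:
    """A stretch of time during which only certain phrases are accepted."""
    phrases: FrozenSet[str]
    opens_at: float   # seconds
    duration: float   # seconds

    def is_open(self, now: float) -> bool:
        return self.opens_at <= now < self.opens_at + self.duration


def spot(window: Optional[ListeningWindow], heard: str, now: float) -> Optional[str]:
    """Return the matched phrase, or None when nothing should fire.

    Assumption: `heard` is text standing in for what a real keyword
    spotter would pull out of the audio stream.
    """
    if window is None or not window.is_open(now):
        return None  # outside the window the device is not listening
    phrase = heard.strip().lower()
    return phrase if phrase in window.phrases else None


# Trigger window of about 4 seconds, per my rough timing of Glass.
trigger = ListeningWindow(frozenset({"ok glass"}), opens_at=0.0, duration=4.0)
in_window = spot(trigger, "OK Glass", now=2.0)            # matches
ignored = spot(trigger, "what's for dinner", now=2.0)     # no false fire
too_late = spot(trigger, "OK Glass", now=6.0)             # window has closed
```

The same trigger phrase fires inside the window and is ignored once the window closes, which is why “always listening” is really “listening for specific things at specific times.”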
To save power, Hallmark’s use of Sensory’s technology kicks into gear when the product is turned on. If it doesn’t hear one of the words it’s listening for spoken within a certain time frame, it will automatically power down, and stop “always listening” until it’s turned back on with a button press.
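Here is a rough Python sketch of that power-down behavior. It is not Hallmark’s or Sensory’s code: `TimedListener` is a made-up class, the timeout value is whatever the product designer picks, and restarting the timer each time a keyword is heard is my own assumption.

```python
class TimedListener:
    """Listens for keywords after a button press and powers down on timeout."""

    def __init__(self, keywords, timeout_s):
        self.keywords = {k.lower() for k in keywords}
        self.timeout_s = timeout_s
        self.powered = False
        self.deadline = 0.0

    def press_button(self, now):
        """A button press turns listening on and starts the timer."""
        self.powered = True
        self.deadline = now + self.timeout_s

    def hear(self, word, now):
        """Return the keyword if one was recognized, otherwise None."""
        if self.powered and now >= self.deadline:
            self.powered = False  # nothing heard in time: power down
        if not self.powered:
            return None
        word = word.strip().lower()
        if word in self.keywords:
            # Assumption: hearing a keyword restarts the timer.
            self.deadline = now + self.timeout_s
            return word
        return None
```

Once the deadline passes, even the right word gets no response until `press_button` is called again.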
Sensory recently introduced a low power sound detection technology that further cuts power consumption by having the device “always listening” in a low power mode, where it doesn’t perform speech recognition. When it hears something, it quickly powers up the recognizer for further analysis. By “always listening” but not always recognizing, this can cut power consumption down to 1 mA or so.
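A two-stage setup like this can be sketched as a cheap check that gates an expensive one. The Python below is a toy under stated assumptions: a simple energy threshold stands in for the sound detector (I am not describing Sensory’s actual detector here), and `frame_energy`, `gated_recognition`, and the threshold value are all hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one audio frame (the cheap first stage)."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def gated_recognition(
    frames: Iterable[Sequence[float]],
    recognizer: Callable[[Sequence[float]], Optional[str]],
    energy_threshold: float = 0.01,  # hypothetical value, tuned per device
) -> Tuple[List[str], int]:
    """Run the costly recognizer only on frames that contain sound.

    Returns the recognized words and how many times the recognizer ran.
    """
    words: List[str] = []
    recognizer_runs = 0
    for frame in frames:
        if frame_energy(frame) < energy_threshold:
            continue  # quiet: stay in the low power mode
        recognizer_runs += 1  # sound detected: wake the recognizer
        word = recognizer(frame)
        if word is not None:
            words.append(word)
    return words, recognizer_runs
```

With mostly silent input the recognizer rarely runs, and that is where the savings come from.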
August 5, 2013
I often get the question, “If Android and Qualcomm offer voice activation for free, why would anyone license from Sensory?” While I’m not sure about Android and Qualcomm’s business models, I do know that decisions are based on accuracy, total added cost (royalties plus hardware requirements to run), power consumption, support, and other variables. Sensory seems to be consistently winning the shootouts it enters for embedded voice control. Some approaches that appear lower cost require a lot more memory or MIPS, driving up total cost and power consumption.
It’s interesting to note that companies like Nuance have a similar challenge on the server side where Google and Microsoft “give it away”. Because Google’s engine is so good it creates a high hurdle for Nuance. I’d guess Google’s rapid progress helps Nuance with their licensing to Apple, but may have made it more challenging to license to Samsung. Samsung actually licensed Vlingo AND Nuance AND Sensory, then Nuance bought Vlingo.
Why doesn’t Samsung use Google recognition if it’s free? On the server it’s not power consumption affecting decisions, but cost, quality, and in this case CONTROL. On the cost side it could be that Samsung MAKES more money by using Nuance in some sort of ad revenue kickbacks, which I’d guess Google doesn’t allow. This is of course just hypothesizing. I don’t really know, and if I did know I couldn’t say. The control issue is big too, as companies like Sensory and Nuance will sell to everyone and in that sense offer platform independence and more control. Working with a Microsoft or Google engine forces an investment in a specific platform implementation, and therefore less flexibility to have a uniform cross-platform solution.
August 2, 2013
What about Microsoft and Amazon? Both have good cloud-based recognition engines in house but neither seems particularly relevant in Mobile…YET!
Kudos to Microsoft for its always listening feature in Xbox! It’s actually the best implementation I’ve seen that doesn’t use Sensory technology. I’ll blog more about how they do it and why they can’t do a low power implementation in the weeks ahead.
August 1, 2013
One of the leakiest announcements in recent memory, Motorola’s new Moto X is expected to be officially announced today. Rather than trying to one-up Apple and Samsung with the highest resolution screen and fastest processor, the Moto X competes on its ability to be customized and its intelligent use of low power sensors. With my background, it’s no surprise that I’m excited to see the “always listening” technology enabling the wake-up command “OK Google Now”. With this feature, speech recognition is enabled but in an ultra low power state, so it can be on and responsive without draining the battery. From other “press leaks”, I’m looking forward to a line of Droid phones with similar “always listening” functionality.
Motorola isn’t the only one rolling out interesting new “always listening” kinds of functions. Samsung did this first in the mobile phone, but implemented it in a “driving mode” so that it was sometimes always listening. The new Moto phones have been compared with Google’s Glass and the “OK Glass” function which some hackers have noted can be put in an “always listening” mode. Qualcomm has even implemented a speech technology on their chips and Android has released a function like this in their OS. Motorola’s use of the “always listening” trigger is especially cool because it calls up Google Now for a seamless flow from client to server speech recognition.
Here’s a demo of Sensory’s use of a very similar approach that we call “trigger to search” from a video we posted around a year ago:
So what’s Sensory’s involvement in these “always on” features from Android, Glass, Motorola, Nuance, Qualcomm, Samsung, etc.? I can’t say much except we have licensed our technology to Google/Motorola, Samsung and many others. We have not licensed it to Android or Qualcomm, but Qualcomm has commented on its interest in a partnership with Sensory for more involved applications.
With a mass market device like the Moto X, I’m excited to see more people experiencing the convenience of voice recognition that is always listening for your OK. Tomorrow I’m going to discuss leading voice recognition apps on the top mobile environments and then over the next few days and weeks, I’ll cover more topics around voice triggering technology such as pricing models (it’s free, right?), power drain, privacy concerns with an “always listening” product, security and personalization. This is an exciting time for TrulyHandsfree™ voice control and I’d welcome your thoughts.
May 1, 2013