HEAR ME -
Speech Blog

Posts Tagged ‘Voice Control’

Sensory CEO Interview with Pymnts

January 11, 2019

Karen Webster is one of the best writers and interviewers in tech/fintech.

Read the full interview here.

Google Assistant vs. Amazon’s Alexa

June 15, 2016

“Credit to the team at Amazon for creating a lot of excitement in this space,” said Google CEO Sundar Pichai. He made this comment during his Google I/O speech last week when introducing Google’s new voice-controlled home speaker, Google Home, whose description sounds a lot like that of Amazon’s Echo. Many interpreted this as a “thanks for getting it started, now we’ll take over” kind of comment.

Google has always been somewhat marketing-challenged when it comes to naming its voice assistant. Everyone knows Apple has Siri, Microsoft has Cortana, and Amazon has Alexa. But what is Google’s voice assistant called?

Read more at Embedded Computing…

Way to go Moto Voice

September 5, 2014

I was very excited to hear Motorola’s announcements today about the new Moto X, Moto G, Moto Hint and Moto 360.

What particularly caught my ear was the statement that they were changing the name from Touchless Control to Moto Voice.  They made this decision because so many people thought the technology came from Google in the form of Android, and Moto wanted everyone to know it DIDN’T come from Google.

Actually…It came from Sensory.  At least we were an important part of it!!! We have been working on the cool new user-defined triggers and are excited that Moto has adopted them for the flagship Moto X (Write-up).

This feature was announced in our TrulyHandsfree 3.0

The new Moto Hint headset is really cool too. It’s a bit like Intel’s Jarvis headset that was announced by Intel CEO Brian Krzanich at CES (and of course uses Sensory!).

Of course the Moto 360 is AWESOME, and has some pretty cool voice control features. Yes, Sensory has done an “OK Google” trigger…we even benchmarked our trigger against Google’s…I might share the results in an upcoming blog if there is interest.

Hey Siri what’s really in iOS8?

June 4, 2014

It was about 4 years ago that Sensory partnered with Vlingo to create a voice assistant with a special “in car” mode that would allow the user to just say “Hey Vlingo” then ask any question. This was one of the first “TrulyHandsfree” voice experiences on a mobile phone, and it was this feature that was often cited for giving Vlingo the lead in the mobile assistant wars (and helped lead to their acquisition by Nuance).

About 2 years ago Sensory introduced a few new concepts including “trigger to search” and our “deeply embedded” ultra-low-power always-listening technology (now down to under 2mW, including the audio subsystem!). Motorola took advantage of these excellent approaches from Sensory and created what I most biasedly think is the best voice experience on a mobile phone. Samsung too has taken the Sensory technology and used it in a number of very innovative ways, going beyond mere triggers and using the same noise-robust technology for what I call “sometimes always listening”. For example, when the camera is open it is always listening for “shoot”, “photo”, “cheese” and a few other words.

So I’m curious about what Google, Microsoft, and Apple will do to push the boundaries of voice control further. Clearly all 3 like this “sometimes always on” approach, as they don’t appear to be offering the low power options that Motorola has enabled. At Apple’s WWDC there wasn’t much talk about Siri, but what they did say seemed quite similar to what Sensory and Vlingo did together 4 years ago…enable an in car mode that can be triggered by “Hey Siri” when the phone is plugged in and charging.

I don’t think that will be all…I’m looking forward to seeing what’s really in store for Siri. They have hired a lot of smart people, and I know something good is coming that will make me go back to the iPhone, but for now it’s Moto and Samsung for me!

Mobile phones – It doesn’t have to be Cost OR Quality!

April 25, 2014

It’s not often that I rave about articles I read, but Ian Mansfield of Cellular News hit the nail on the head with this article.

Not only is it a well-written and concise article, but it’s chock-full of recent data (primarily from JD Power research), and most importantly it’s data that tells a very interesting story that nicely aligns with Sensory’s strategy in mobile. So, thanks Ian, for getting me off my butt to start blogging again!

A few key points from the article:

  1. Price is becoming increasingly important in the choice of mobile phones, and simultaneously the prices of mobile phones are increasing.
  2. Although price might be the most important factor in choice, the overall customer satisfaction is driven by features.
  3. The features customers want are seamless voice control (36%); built-in sensors that can gauge temperature, lighting, noise and moods to customize settings to the environment (35%); and facial recognition and biometric security (28%).
  4. As everyone knows, Samsung and Apple have the overwhelming market share in mobile phones, but interesting to me was that they also both lead in customer satisfaction.

Now, let me dive one step deeper into the problem, and explore whether customer satisfaction can be achieved with minimal impact on cost:

Seamless voice control is here and soon every phone will have it, and it doesn’t add any hardware cost. Sensory introduced it with our TrulyHandsfree technology, which allows users to just start talking, and our “trigger to search” technology has been nicely deployed by companies like Motorola, which pioneered this “seamless voice control” in many of its recent releases. Seamless voice control really doesn’t add much cost, and with excellent engines from Google, Apple, and Microsoft sitting in the cloud, it can and will be nicely implemented without affecting handset pricing.

Sensors are a different story. By their nature they will be embedded into the phones and will increase cost. Some “sensors” in the broadest sense of the term are no-brainers and necessities: microphones and cameras are must-haves, and the six-axis sensors combining accelerometers and gyroscopes are arguably must-haves as well. Magnetometers and barometers are becoming increasingly common, and to differentiate further, leading manufacturers are embedding things like heartbeat monitors; stereo 3D cameras are just around the corner. To address the desire for biometric security, Samsung and Apple have embedded fingerprint sensors in the 2 bestselling phones in the world!

The problem is that all these sensors add cost, and those fingerprint sensors in particular are the most expensive and can add $5-$15 to the cost of goods. It’s kind of ironic that after spending all that money on biometric security, Apple doesn’t even allow the sensor to be used as a security measure for iTunes purchases. And both Samsung and Apple have been chastised for fingerprint sensors that can be cracked with gummy bears or glue!

A much more accurate and cost-effective solution can be achieved for biometrics by using the EXISTING sensors on the phones and not adding special-purpose biometric sensors. In particular, the “must-have sensors” like microphones, cameras, and 6-axis sensors can create a more secure environment that is just as seamless but much more difficult to crack. I’ll talk more about that in my next blog.

The price of free phone features

August 5, 2013

I often get the question, “If Android and Qualcomm offer voice activation for free, why would anyone license from Sensory?” While I’m not sure about Android and Qualcomm’s business models, I do know that decisions are based on accuracy, total added cost (royalties plus hardware requirements to run), power consumption, support, and other variables. Sensory seems to be consistently winning the shootouts it enters for embedded voice control. Some approaches that appear lower cost require a lot more memory or MIPS, driving up total cost and power consumption.

It’s interesting to note that companies like Nuance have a similar challenge on the server side, where Google and Microsoft “give it away”. Because Google’s engine is so good, it creates a high hurdle for Nuance. I’d guess Google’s rapid progress helps Nuance with their licensing to Apple, but may have made it more challenging to license to Samsung. Samsung actually licensed Vlingo AND Nuance AND Sensory, then Nuance bought Vlingo.

Why doesn’t Samsung use Google recognition if it’s free? On the server it’s not power consumption affecting decisions, but cost, quality, and in this case CONTROL. On the cost side it could be that Samsung MAKES more money by using Nuance through some sort of ad revenue kickback, which I’d guess Google doesn’t allow. This is of course just hypothesizing. I don’t really know, and if I did know I couldn’t say. The control issue is big too, as companies like Sensory and Nuance will sell to everyone and in that sense offer platform independence and more control. Working with a Microsoft or Google engine forces an investment in a specific platform implementation, and therefore less flexibility to have a uniform cross-platform solution.

Out Today: Moto X is “Always Listening”

August 1, 2013

One of the leakiest announcements in recent memory, Motorola’s new Moto X is expected to be officially announced today. Rather than trying to one-up Apple and Samsung with the highest-resolution screen and fastest processor, the Moto X competes on its ability to be customized and its intelligent use of low-power sensors. With my background, it’s no surprise that I’m excited to see the “always listening” technology enabling the wake-up command “OK Google Now”. With this feature, speech recognition is enabled but in an ultra-low-power state, so it can be on and responsive without draining the battery. From other “press leaks”, I’m looking forward to a line of Droid phones with similar “always listening” functionality.

Motorola isn’t the only one rolling out interesting new “always listening” kinds of functions. Samsung did this first in the mobile phone, but implemented it in a “driving mode” so that it was sometimes always listening. The new Moto phones have been compared with Google’s Glass and the “OK Glass” function which some hackers have noted can be put in an “always listening” mode. Qualcomm has even implemented a speech technology on their chips and Android has released a function like this in their OS. Motorola’s use of the “always listening” trigger is especially cool because it calls up Google Now for a seamless flow from client to server speech recognition.

Here’s a demo of Sensory’s use of a very similar approach that we call “trigger to search” from a video we posted around a year ago:

https://www.youtube.com/watch?v=cEvpq7Xe-8o

So what’s Sensory’s involvement in these “always on” features from Android, Glass, Motorola, Nuance, Qualcomm, Samsung, etc.? I can’t say much except we have licensed our technology to Google/Motorola, Samsung and many others. We have not licensed our technology to Android or Qualcomm, but Qualcomm has commented on its interest in a partnership with Sensory for more involved applications.

With a mass market device like the Moto X, I’m excited to see more people experiencing the convenience of voice recognition that is always listening for your OK. Tomorrow I’m going to discuss leading voice recognition apps on the top mobile environments and then over the next few days and weeks, I’ll cover more topics around voice triggering technology such as pricing models (it’s free right?), power drain, privacy concerns with an “always listening” product, security and personalization. This is an exciting time for TrulyHandsfree™ voice control and I’d welcome your thoughts.

Random Thoughts and Miscellaneous Videos

August 29, 2012

  • Android JellyBean Speech Recognition. It’s REALLY REALLY awesome. I thought all those video comparisons with Siri must be staged, but I’ve been using it and it’s very fast and very accurate and reasonably intelligent. My only criticism is in their marketing. First of all where’s the Mike LeBeau video? And what’s it called? Google Now? Google Voice? Google Voice Actions? JellyBean Speech Recognition? None of this marketing stuff really matters…it’s a big step forward in the handset based speech wars, and by my count puts Android in the lead on speech technology. Can’t wait to see Apple’s next release!! I bet it will be great…and Microsoft? You spent a billion dollars on Tellme, you have had the biggest speech team for the longest time, what are you doing???
  • One of Sensory’s technology apps guys did a really nice demo placing the Sensory trigger to call up the Android JellyBean speech engine. Look how nicely the Sensory technology interacts to make the whole experience not only handsfree but ripping fast!
  • ChinaMobile invested over $200M in iFlytek…WOAH!!! Really? Over $1.2B valuation. Holy Smokes.
  • OK, I’m a speech geek…there’s something I really like about attractive women using speech recognition on QVC (yeah this is a Sensory chip based product, that works AMAZINGLY well in a live shoot)
  • I’m a huge fan of Hallmark’s Interactive Storybuddies…There’s a ton of other fans who have posted videos showing how nice these products are. Sensory’s TrulyHandsfree technology on a NLP chip is embedded in a plush character that responds while you read a book. Now everyone in the speech industry knows that speech recognition works better with men than women, and that accents destroy recognition accuracy, and that you need to speak loudly into the mic or else the S/N will be too poor for recognition to perform. Well, watch this video of a soft-speaking, British-accented female using a Hallmark Storybuddy to see how AMAZINGLY well the Sensory engine does.

Todd
sensoryblog@sensoryinc.com

Thank you SIRI!

January 27, 2012

Lots of thoughts…no time to share them…So I’ll be brief in a few different areas:

  1. Thank you SIRI! Now every CE Company must have speech technology. How the world has changed, and after 18 years of Sensory being one of the only speech companies focused on consumer electronics, now everyone is doing it!
  2. What’s really weird is the number of chip companies and investment bankers that have been popping up on our doorsteps since SIRI shipped. Companies do move in herds!
  3. Nuance buys Vlingo. Full disclosure…Vlingo is Sensory’s partner (we’ll see what happens after the deal closes.) How much was paid? (Rumor I keep hearing is the highway that runs near my house…) Why did they pay so much? (because they can, to end the personal lawsuit, to end the other lawsuits, to prevent market share from eroding, NOT to grow their technology base!)
  4. Speaking of Vlingo, I really like that their newsletter and videos imply they are better than SIRI because they have “more hands-free functionality”…that’s TrulyHandsfree by Sensory!
  5. And what about the Justice Department’s investigation of Nuance (Don’t they have better things to do with our taxes these days?)…The Nuance/Vlingo position seems to be all about fighting Microsoft, Google, etc…which has some merit, but if you don’t have Android or Windows Phone, who ya gonna call? Nuance will always be on the list.
  6. Sensory news…
    • Yeah! Our TrulyHandsfree is in Samsung’s Galaxy Note, introduced at CES!
    • Monster Cable showed a cool product at CES with TrulyHandsfree™ inside…they were kind enough to invite the Sensory crew to see Chicago. GREAT CONCERT! I think there were another 20-30 or so products on the CES floor with Sensory inside!
    • We also just got nominated for a Global Mobile Award at the Mobile World Congress.
    • And who says there’s a recession still going on? Our chip-based product sales are going through the roof! The success of our IC product line is also based on TrulyHandsfree because it enables a quasi-natural language interface.
    • Where in the world is Majel???? Sensory did a voice-controlled light switch a few years back with a company called VOS Systems. They licensed the Star Trek brand, used “Computer” as the voice trigger to control the lights, and even licensed Majel Roddenberry’s voice…pretty cool!

Todd
sensoryblog@sensoryinc.com

Sensory at the Intel Developer Forum

September 17, 2011

I decided to pop up to San Francisco this week to hit the Intel Developer Forum. It’s open to the public, but it’s really more of a show and tell to Intel employees than from them.

One of the sessions was entitled “Enhanced Experiences with Low Power Speech Recognition,” and this was my main reason for being there. Intel’s Devon Worrell gave a very nice presentation, focusing on the importance of a closed computer being not just a brick, but still having functionality in a low power state. He put up a lot of compelling slides about using speech recognition in this mode, and emphasized the need for low-power command and control with an always-on always listening device that responds to commands…hmmmm…sounds like a page right out of the Sensory bible!

Realtek appears to have been selected by Intel as a chip provider for the low-power speech recognition, and they presented at the session and even gave a demo of their in-house speech recognition technology. I wasn’t very impressed; the idea was for it to work in music with the user not speaking directly into the microphone. For the demo, however, the music was so quiet the audience could barely tell it was on, and the speaker spoke only a few inches from the mic. I had a hard time understanding if it was working or not (well, that’s giving it the benefit of the doubt.)

Jean-Marc Jot from DTS also spoke and gave an impressive presentation and demo. Of course, I’m very biased….The DTS speech recognition demo used Sensory’s TrulyHandsfree™ Voice Control. I was a bit nervous because of Jean-Marc’s French accent and the fact that DTS had created their own TrulyHandsfree trigger phrase, “Hello Jennifer”, without any assistance from Sensory. (As a side note, Sensory’s TrulyHandsfree 2.0 SUBSTANTIALLY improves performance, but there are a number of complex variables in our algorithm that are not accessible through our SDKs, and therefore our customers cannot yet use the latest technology to its fullest extent unless Sensory fine-tunes the vocabularies in-house.) So…Jean-Marc was demoing our earliest incarnation of TrulyHandsfree Voice Control, with a French accent in a noisy room and with a command set that Sensory has never reviewed.

The demo was AWESOME. Jean-Marc spoke about 3 feet from the mic, and said commands like “Hey Jennifer…play Lady Gaga.” The music was cranked up really loud, and Jean-Marc spoke commands like “fast forward” and other music controls as well as calling up songs by name. I have a habit of counting speech recognition errors… On the trigger there were no false positives (accidental firing), and only 2 false negatives (where Jean-Marc needed to repeat the trigger phrase). That was 2 out of about 30 or 40 uses, indicating roughly 93% to 95% acceptance accuracy in high noise, and the phrases following the trigger had about the same high accuracy.
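For anyone who wants to check the back-of-envelope math, here is a minimal sketch of the acceptance calculation. The function name is made up for illustration, and the counts are just my rough tallies from the audience (2 repeats out of somewhere between 30 and 40 trigger attempts), not measured data.

```python
def acceptance_rate(attempts: int, false_negatives: int) -> float:
    """Fraction of trigger attempts that were accepted on the first try."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    if not 0 <= false_negatives <= attempts:
        raise ValueError("false_negatives must be between 0 and attempts")
    return (attempts - false_negatives) / attempts


# Rough tallies from the demo: 2 repeated triggers out of about 30 or 40 uses.
for attempts in (30, 40):
    rate = acceptance_rate(attempts, false_negatives=2)
    print(f"{attempts} attempts, 2 repeats: {rate:.1%} accepted")
```

With only a few dozen attempts, one extra miss would move the number by two or three points, so treat this as a ballpark rather than a benchmark.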

Sweet Demo of how speech recognition can work in a low-power mode and be always on and listening for commands even in high noise situations!

Todd
sensoryblog@sensoryinc.com
