March 23, 2015
This month had three very different announcements about face recognition from Alibaba, Google, and Microsoft. Nice to see that Sensory is in good company!!!
Alibaba’s CEO Jack Ma discussed and demoed the possibility of using face verification for the very popular Alipay.
A couple of interesting things about this announcement…First, I have to say, with a name like Alibaba, I am a little let down that they’re not using “Open Sesame” as a voice password to go with or instead of the face authentication… All joking aside, I do think relying on facial recognition as the sole means of user authentication is risky, and think they would be better served using a solution that integrates both face and voice recognition (something like our own TrulySecure), to ensure the utmost security of their customers’ linked bank accounts.
Face is considered one of the more “convenient” methods of biometrics because you just hold your phone out and it works! Well, at least it should… A couple of things I noticed in the Alibaba announcement: Look at the picture…Jack Ma is using both hands to carefully center his photo, and looking at the image of the phone screen tells us why. He needs to get his face very carefully centered on this outline to make it work. Why? Well, it’s a technique used to improve accuracy, but this improved accuracy trades off the key advantage of face recognition, convenience, to make the solution more robust. Also, the article notes that it’s a cloud-based solution. To me, cloud-based means slower, dependent on a connection, and putting personal privacy more at risk. At Sensory, we believe in keeping data secure, especially when it comes to something like mobile payments, which is why we design our technologies to be “embedded” on the device – meaning no biometric data has to be sent to the cloud, and our solutions don’t require an internet connection to function. Additionally, with TrulySecure, we combine face and voice recognition, making authentication quick and simple, not to mention more secure and less spoofable than face-only solutions. By utilizing a multi-biometric authentication solution like TrulySecure, the biometric is far less environmentally sensitive and even more convenient!
Mobile pay solutions are on the rise, and as more hit the market, differentiators like authentication approach, solution accuracy, convenience, and most of all data security will continue to be looked at more closely. We believe that the embedded multi-biometric approach to user authentication is best for mobile pay solutions.
Also, Google announced that its deep learning FaceNet is nearly 100% accurate.
Everybody (even Sensory) is using deep learning neural net techniques for things like face and speech recognition. Google’s announcement seems to have almost no bearing on their Android-based face authentication, which came in the middle of the pack of the five different face authentication systems we recently tested. So, why does Google announce this? Two reasons: 1) as a reaction to Baidu’s recent announcement that their deep learning speech recognition is the best in the world; 2) to counter Facebook’s announcement last year that their DeepFace is the best face recognition in the world. My take – it’s really hard to tell whose solution is best on these kinds of things, and the numbers and percentages can be deceiving. However, Google is clearly doing research experiments on high-accuracy face matching and NOT real-world implementation, and Facebook is using face recognition in a real-world setting to tag photos of you. Real-world facial recognition is WAY harder to perfect, so my praise goes out to Facebook for their skill in tagging everyone’s picture to reveal to our friends and family things they might not have otherwise seen us doing!
Lastly, Microsoft announced Windows Hello.
This is an approach to getting into your Windows device with a biometric (face, iris, or fingerprint). Microsoft has done a very nice job with this. They joined the FIDO alliance and are using an on-device biometric. This approach is what made sense to us at Sensory, because you can’t just hack into it remotely, you must have the device AND the biometric! They also addressed privacy by storing a representation of the biometric. I think their approach of using a 3D IR camera for Face ID is a good approach for the future. This extra definition and data should yield much better accuracy than what is possible with today’s standard 2D cameras and should HELP with convenience because it could be better at angles and can work in the dark. Microsoft claims 1 in 100,000 false accepts (letting the wrong person in). I always think it’s silly when companies make false accept claims without stating the false reject numbers (when the right person doesn’t get in). There’s always a tradeoff. For example, I could say my coffee mug uses a biometric authenticator to let the right user telepathically levitate it and it has less than 1 in a billion false accepts (it happens to also have a 100% false reject since even the right biometric can’t telepathically levitate it!). Nevertheless, with a 3D camera I think Microsoft’s face authentication can be more accurate than Sensory’s 2D face authentication. BUT, it’s unlikely that the face recognition on its own will ever be more accurate than our TrulySecure, which still offers a lower False Accept rate than Microsoft – and less than 10% False Reject rate to boot!
Nevertheless, I like the announcement of 3D cameras for face recognition and am excited to see how their system performs.
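The false accept versus false reject tradeoff mentioned above can be made concrete with a small sketch. This is only an illustration, not how Sensory or Microsoft measure their systems: the `error_rates` function, the match scores, and the thresholds are all made up for the example. It assumes an authenticator produces a score and accepts anyone at or above a threshold.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_accept_rate, false_reject_rate) for one threshold.

    Assumption for this sketch: a score >= threshold means "accept".
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Hypothetical match scores in [0, 1]; not measurements from any real system.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95, 0.72, 0.88, 0.59]
impostor = [0.12, 0.35, 0.41, 0.08, 0.62, 0.27, 0.19, 0.70]

# Raising the threshold lowers false accepts but raises false rejects.
# The last threshold is the "coffee mug": nobody gets in, so there are
# no false accepts at all, and every genuine user is rejected.
for threshold in (0.3, 0.6, 0.8, 1.01):
    far, frr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold:.2f}  FAR={far:.3f}  FRR={frr:.3f}")
```

A false accept figure quoted on its own says nothing about where on this curve the system is operating, which is why both numbers need to be stated together.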
March 3, 2015
It feels like I had a whole week’s worth of the trade show wrapped into one day! By the time midweek hits, I’ll surely be ready to head home! Here are some of the highlights from the first day of Mobile World Congress 2015:
February 11, 2015
The advent of “always on” speech processing has raised concerns about organizations spying on us from the cloud.
In this Money/CNN article, Samsung is quoted as saying, “Samsung does not retain voice data or sell it to third parties.” But does this also mean that your voice data isn’t being saved at all? Not necessarily. In a separate article, the speech recognition system in Samsung’s TVs is shown to be an always-learning cloud-based solution from Nuance. I would guess that there is voice data being saved, and that Nuance is doing it.
This doesn’t mean Nuance is doing anything evil; this is just the way that machine learning works. There has been this big movement towards “deep” learning, and what “deep” really means is more sophisticated learning algorithms that require more data to work. In the case of speech recognition, the data needed is speech data, or speech features data that can be used to train and adapt the deep nets.
But the fact that there is a necessary use for capturing voice data and invading privacy doesn’t mean that companies should do it. This isn’t just a cloud-based voice recognition software issue; it’s an issue with everyone doing cloud-based deep learning. We all know that Google’s goal in life is to collect data on everything so Google can better assist you in spending money on the right things. We in fact sign away our privacy to get these free services!
I admit guilt too. When Sensory first achieved usable results for always-on voice triggers, the basis of our TrulyHandsfree technology, I applied for a patent on a “background recognition system” that listens to what you are talking about in private and puts together different things spoken at different times to figure out what you want…. without you directly asking for it.
Can speech recognition be done without having to send all this private data to the cloud? Sure it can! There are two parts in today’s recognition systems: 1) The wake up phrase; 2) The cloud-based deep net recognizer – AND NOW THEY CAN BOTH BE DONE ON DEVICE!
Sensory pioneered the low-power wake up phrase on device (item 1), and now we have a big team working on making an EMBEDDED deep learning speech recognition system so that no personal data needs to be sent to the cloud. We call this approach TrulyNatural, and it’s going to hit the market very soon! We have benchmarked TrulyNatural against state-of-the-art cloud-based deep learning systems and have matched and in some cases bested the performance!
January 21, 2015
I know it’s been months since Sensory has blogged and I thank you for pinging me to ask what’s going on…Well, a lot is going on at Sensory. There are really 3 areas that we are putting a strategic focus on, and I’ll briefly mention each:
Of course, there’s a lot more going on than just this…we recently announced partnerships with Intel and Nok Nok Labs, and we have further lowered power consumption in touchless control and always-on voice systems with the addition of our hardware block for low power sound detection.
October 15, 2014
A couple of news headlines have appeared recently asserting that voice activation is unsafe. I thought it was time for Sensory to weigh in on a few aspects of this since we are the pioneers in voice activation:
September 5, 2014
I was very excited to hear Motorola’s announcements today about the new Moto X, Moto G, Moto Hint and Moto 360.
What particularly caught my ear was the statement that they were changing the name from Touchless Control to Moto Voice. They made this decision because so many people thought the technology came from Google in the form of Android, and Moto wanted everyone to know it DIDN’T come from Google.
Actually…It came from Sensory. At least we were an important part of it!!! We have been working on the cool new user-defined triggers and are excited that Moto has adopted them for the flagship Moto X (Write-up).
This feature was announced in our TrulyHandsfree 3.0.
The new Moto Hint headset is really cool too. It’s a bit like Intel’s Jarvis headset that was announced by Intel CEO Brian Krzanich at CES (and of course uses Sensory!). Here’s a release we did a while back that tells about the technology deployed to connect a Bluetooth headset to a handset through a seamless voice command sequence, Press Release.
Of course the Moto 360 is AWESOME, and has some pretty cool voice control features. Yes, Sensory has done an “OK Google” trigger…we even benchmarked our trigger against Google’s…I might share the results in an upcoming blog if there is interest.
July 31, 2014
Yeah, I grew up in an era of watching robots on TV and in the movies, and reading about them in books and comic strips. They were and still are a part of our media culture. My goal in life has been to live in a Jetsons-like world! Well, not really, but I do have a film slide of Rosie the Maid up on my wall, and the mod, Googie, mid-century future style from the Jetsons is definitely my style.
It’s been fun at Sensory to be part of a robot revolution in toys. We have put speech technologies into over 50 robotic creatures from dolls to strange new alien things like Furby. When Aibo first shipped, we had half a dozen companies come to us with awesome designs for new low cost robotic dogs that could respond to their masters’ voices!
Here’s a fun realistic looking robot dog – Scamps. Sensory was in this a few years ago, and it seems to be enjoying a huge comeback in 2014.
More recently we were in Intel’s “Jarvis” headset…When we first created the Jarvis trigger, I didn’t get the name. Then I saw the movie Iron Man! :-)
Sensory has designed a lot of robotic technologies beyond speech recognition and synthesis. We have platforms such as sound sourcing, where a robot with two mics can locate the speaker through triangulation. We have sonic networking as a low cost wireless protocol so robots can take commands from TV commercials or YouTube videos or even other robots. We even have made lip synchronization approaches and pitch detection technologies so robots can mimic their owners in a fun and playful way.
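As a rough illustration of how two microphones can point toward a speaker, here is a minimal sketch based on the time difference of arrival between the mics. It is not Sensory’s implementation; the function names, the brute-force cross-correlation, the sample rate, and the mic spacing are all assumptions chosen for the example. Two mics alone give a bearing rather than a full position, and they cannot tell front from back.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def best_lag(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive lag means the sound reached the right mic later.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += sample * right[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def bearing_degrees(left, right, sample_rate, mic_spacing):
    """Estimate the source angle away from straight ahead (0 degrees).

    Positive angles point toward the left mic, negative toward the right.
    """
    # The physical delay can never exceed spacing / speed of sound.
    max_lag = int(mic_spacing / SPEED_OF_SOUND * sample_rate) + 1
    delay = best_lag(left, right, max_lag) / sample_rate
    ratio = max(-1.0, min(1.0, delay * SPEED_OF_SOUND / mic_spacing))
    return math.degrees(math.asin(ratio))


# Hypothetical test signal: the same short burst arrives 3 samples later
# at the right mic, so the source sits toward the left.
burst = [1.0, 2.0, 3.0, 2.0, 1.0]
left = [0.0] * 5 + burst + [0.0] * 10
right = [0.0] * 8 + burst + [0.0] * 7
print(round(bearing_degrees(left, right, 16000, 0.10), 1))
```

With a 16 kHz sample rate and mics 10 cm apart, a 3-sample delay works out to roughly 40 degrees off center. A real robot would need to cope with noise, echoes, and much longer signals, which this sketch ignores.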
The rise of robotic vacuum and window cleaners and non-toy robotic applications is really Neato (yeah, that’s a pun!). Of course there have been a lot of beer delivery home robots over the years too, but none of them are making it into the mainstream.
The magic however has not yet really hit, because I want the fun, playfulness, and interactivity of the toys but with utility added in, so it really is more like the Jetsons or Lost in Space.
Jibo is a new robot that might fit this bill, and it seems that I’m not the only one that likes the concept, as it is getting pretty close to being one of the Top 10 funded Indiegogo campaigns of all time!
July 25, 2014
I see a bit of irony that a great Saturday Night Live alumnus is launching a campaign to decrease spoofing. I’m talking about Senator Al Franken, who has been looking into the problem of stolen fingerprints, see article.
Senator Franken challenges Samsung and Apple with some fair concerns about the problem of stolen or spoofed biometrics. The issue is that most biometrics that could be stolen can’t be easily replaced. We only have one face, two eyes, and 10 fingers, so not a lot of chances to replace or change them if they are stolen.
The mobile phone companies, challenged on the fingerprint issue, had two responses:
I think Franken is right to question the utility of biometric fingerprints, because a product like Sensory’s TrulySecure (combining voice and vision authentication) offers a large number of advantages:
Here’s a more canned demo on Sensory’s home page that better showcases some of the anti-spoofing features.
June 30, 2014
June 9, 2014
I still subscribe to the San Jose Mercury News, as they do a good job of tech business reporting. One of my favorite Mercury News writers is a true critic in the literary sense of the term, Troy Wolverton. Troy rarely raves and is typically critical, but in a smart, logical, and unemotional way.
A few days back he started writing about Microsoft’s Cortana and said “Watch out Siri, someone wants your job.”
I was eager to read his review of Cortana this morning and in particular his comparison with Siri. He ended up giving it a 7/10, and concluding Siri was still ahead. What I thought was most interesting though was that in his final summary, he compared three products and three assistants based on the ease of calling up each of those assistants:
Motorola is Sensory’s customer, and I am happy to read that Troy gets it and considers this front end activation an important metric in comparing personal assistants!