Archive for the ‘voice authentication’ Category
April 22, 2015
How do Big Data and Privacy fit into the whole Deep Learning Puzzle?
Privacy and Big Data have become big concerns in the world of Deep Learning. However, there is an interesting relationship between the Privacy of personal data and information, Big Data, and Deep Learning. That’s because a lot of the Big Data is personal information used as the data source for Deep Learning. That’s right, to make vision, speech and other systems better, many companies invade users’ personal information and the acquired data is used to train their neural networks. So basically, Deep Learning is neural nets learning from your personal data, stats, and usage information. This is why when you sign a EULA (end user license agreement) you typically give up the rights to your data, whether it’s usage data, voice data, image data, personal demographic info, or other data supplied through the “free” software or service.
Recently, it was brought to consumers’ attention that some TVs and even children’s toys were listening in on consumers, and/or sharing and storing that information to the cloud. A few editors called me to get my input and I explained that there are a few possible reasons for devices to do this kind of “spying,” none of which is the least bit nefarious. The two most common reasons are: 1) the speech recognition technology being used needs the voice data to train better models, so it gets sent to the cloud to be stored and used for Deep Learning, and/or 2) the speech recognition needs to process the voice data in the cloud because it is unable to do so on the device. (Sensory will change this second point with our upcoming TrulyNatural release!)
The first reason is exactly what I’ve been blogging about when we say Deep Learning. More data is better! The more data that gets collected, the better the Deep Learning can be. The benefits can be applied across all users, and as long as the data is well protected and not released, then it only has beneficial consequences.
Therein lies the challenge: “as long as the data is well protected and not released…” If banks, billion dollar companies and governments can’t protect personal data in the cloud, then who can, and why should people ever assume their data is safe, especially from systems where there is no EULA in place and data is being collected without consent (which happens all the time BTW)?
Having devices listen in on people and share their voice data with the cloud for Deep Learning or speech recognition processing is an invasion of privacy. If we could just keep all of the deep neural net and recognition processing on device, then there would be no need to risk the security of people’s personal data by sharing and storing it on the cloud… and it’s with this philosophy that Sensory pioneered an entirely different, “embedded” approach to deep neural net based speech recognition which we will soon be bringing to market. Sensory actually uses Deep Learning approaches to train our nets with data collected from EULA consenting and often paid subjects. We then take the recognizer built from that research and run it on our OEM customers’ devices, and because of that, we never have to collect personal data. So the consumers who buy products from Sensory’s OEM customers can rest assured that Sensory is never putting their personal data at risk!
In my next blog, I’ll address the question about how accurate Sensory can be using deep nets on device without continuing data collection in the cloud. There are actually a lot of advantages for running on device beyond privacy, and it can include not only response time but accuracy as well!
April 15, 2015
Deep Neural Nets, Deep Belief Nets, Deep Learning, DeepMind, DeepFace, DeepSpeech, DeepImage… Deep is all the rage! In my next few blogs I will try to address some of the questions and issues surrounding all of these “deep” thoughts including:
Part 1: What is Deep Learning and is Sensory Just Jumping on the Bandwagon?
Artificial Neural Network approaches have been around for a long time, and have gone in and out of favor. Neural Nets are an approach within the field of Machine Learning and today they are all the rage. Sensory has been working with Neural Net technology since our founding more than 20 years ago, so the approach is certainly not new for us. We are not just jumping on the bandwagon… we are one of the leading carts! ;-)
Neural Networks are very loosely modeled after how our brains work – nonlinear, parallel processing, and learning from exposure to data rather than being programmed. Unlike common computer architectures that separate memory from processing, our brains have billions of neurons that communicate and process all in parallel and with huge quantities of connections. This architecture based on how our brains work turns out to be much better than traditional computer programs at dealing with ambiguous and “sensory” information like vision and speech – a little Trivia: that’s how we came up with the name Sensory!
In the early days of Sensory, we were often asked by engineers, “What kind of neural networks are you running?” They were looking for a simple answer, something like a “Kohonen Net.” I once asked my brother, Mike Mozer, a pioneer in the field of neural nets, a Sensory co-founder, and a professor of computer science at U. Colorado Boulder, for a few one-liners to satisfy curious engineers without giving anything away. We had two lines: the first being, “a feed-forward multi-layer net,” which satisfied 90% of those asking, and the other response, for those who asked for more, was, “it’s actually a nonlinear and multivariate function.” That quieted pretty much everyone down.
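For the curious, both one-liners describe the same thing. Here is a minimal sketch in Python of a feed-forward multi-layer net evaluated as a plain function of several inputs. The weights, the layer sizes, and the tanh activation are hypothetical choices made up for illustration; this is not Sensory's network.

```python
import math


def feed_forward(inputs, layers):
    """Evaluate a feed-forward multi-layer net on a list of input values.

    `layers` is a list of (weights, biases) pairs. weights[j] holds the
    incoming weights of unit j, and biases[j] is that unit's bias.
    """
    activations = inputs
    for weights, biases in layers:
        # Each unit takes a weighted sum of the previous layer's outputs
        # and squashes it with tanh, which is what makes the net nonlinear.
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Hypothetical hand-picked weights: 2 inputs -> 2 hidden units -> 1 output.
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]

print(feed_forward([1.0, 0.0], LAYERS))
print(feed_forward([0.0, 1.0], LAYERS))
print(feed_forward([1.0, 1.0], LAYERS))
```

The output for the input (1, 1) is not the sum of the outputs for (1, 0) and (0, 1), which is all that "nonlinear and multivariate" means here. In a real system the weights are learned from data rather than typed in by hand.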
In the last five years Neural Networks have proven to be the best-known approaches for various recognition and ambiguous data challenges like vision and speech. The breakthrough and improvement in performance came from these various terms that use the word “deep.” The “deep” approaches entailed more complex architectures that receive more data. The architecture relates to the ways that information is shared and processed (like all those connections in our brain), and the increased data allows the system to adapt and improve through continuous learning, hence the terms, “Deep Learning” and “Deep Learning Net.” Performance has improved dramatically in the past five years and Deep Learning approaches have far exceeded traditional “expert-based” techniques for programming complex feature extraction and analysis.
March 23, 2015
This month had three very different announcements about face recognition from Alibaba, Google, and Microsoft. Nice to see that Sensory is in good company!!!
Alibaba’s CEO Jack Ma discussed and demoed the possibility of using face verification for the very popular Alipay.
A couple of interesting things about this announcement… First, I have to say, with a name like Alibaba, I am a little let down that they’re not using “Open Sesame” as a voice password to go with or instead of the face authentication… All joking aside, I do think relying on facial recognition as the sole means of user authentication is risky, and think they would be better served using a solution that integrates both face and voice recognition (something like our own TrulySecure), to ensure the utmost security of their customers’ linked bank accounts.
Face is considered one of the more “convenient” methods of biometrics because you just hold your phone out and it works! Well, at least it should… A couple of things I noticed in the Alibaba announcement: Look at the picture… Jack Ma is using both hands to carefully center his photo, and looking at the image of the phone screen tells us why. He needs to get his face very carefully centered on this outline to make it work. Why? Well, it’s a technique used to improve accuracy, but the improved accuracy comes at the expense of face recognition’s key advantage, convenience. Also, the article notes that it’s a cloud-based solution. To me, cloud-based means slower, dependent on a connection, and putting personal privacy more at risk. At Sensory, we believe in keeping data secure, especially when it comes to something like mobile payments, which is why we design our technologies to be “embedded” on the device – meaning no biometric data has to be sent to the cloud, and our solutions don’t require an internet connection to function. Additionally, with TrulySecure, we combine face and voice recognition, making authentication quick and simple, not to mention more secure and less spoofable than face-only solutions. By utilizing a multi-biometric authentication solution like TrulySecure, the biometric is far less environmentally sensitive and even more convenient!
Mobile pay solutions are on the rise, and as more hit the market, differentiators like authentication approach, solution accuracy, convenience, and most of all data security will continue to be looked at more closely. We believe that the embedded multi-biometric approach to user authentication is best for mobile pay solutions.
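To make the multi-biometric idea concrete, here is a rough sketch in Python of one common way two match scores can be combined: a weighted average with a per-modality floor. This is an illustration only, not a description of how TrulySecure works; the function names, weights, threshold, and floor are all assumptions invented for the example.

```python
def fused_score(face_score, voice_score, face_weight=0.5):
    """Weighted average of two match scores, each assumed to lie in [0, 1]."""
    return face_weight * face_score + (1.0 - face_weight) * voice_score


def authenticate(face_score, voice_score, threshold=0.7, floor=0.2):
    """Accept only if neither score is a clear mismatch and the fused
    score clears the threshold. All numbers here are hypothetical."""
    # A very low score on either modality (say, a photo held up with
    # no matching voice) rejects the attempt outright.
    if min(face_score, voice_score) < floor:
        return False
    return fused_score(face_score, voice_score) >= threshold


# Dim lighting weakens the face score, but a strong voice match compensates.
print(authenticate(face_score=0.5, voice_score=0.95))
# A strong face score alone is not enough when the voice clearly fails.
print(authenticate(face_score=0.95, voice_score=0.1))
```

The first call shows why a combined biometric can be less sensitive to the environment, and the second shows why it can be harder to spoof than a face-only check.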
Also, Google announced that its deep learning FaceNet is nearly 100% accurate.
Everybody (even Sensory) is using deep learning neural net techniques for things like face and speech recognition. Google’s announcement seems to have almost no bearing on their Android based face authentication, which came in the middle of the pack of the five different face authentication systems we recently tested. So, why does Google announce this? Two reasons: 1) as a reaction to Baidu’s recent announcement that their deep learning speech recognition is the best in the world, and 2) to counter Facebook’s announcement last year that their DeepFace is the best face recognition in the world. My take – it’s really hard to tell whose solution is best on these kinds of things, and the numbers and percentages can be deceiving. However, Google is clearly doing research experiments on high-accuracy face matching and NOT real world implementation, and Facebook is using face recognition in a real world setting to tag photos of you. Real-world facial recognition is WAY harder to perfect, so my praise goes out to Facebook for their skill in tagging everyone’s picture to reveal to our friends and family things they might not have otherwise seen us doing!
Lastly, Microsoft announced Windows Hello.
This is an approach to getting into your Windows device with a biometric (face, iris, or fingerprint). Microsoft has done a very nice job with this. They joined the FIDO alliance and are using an on-device biometric. This approach is what made sense to us at Sensory, because you can’t just hack into it remotely, you must have the device AND the biometric! They also addressed privacy by storing a representation of the biometric. I think their approach of using a 3D IR camera for Face ID is a good approach for the future. This extra definition and data should yield much better accuracy than what is possible with today’s standard 2D cameras and should HELP with convenience because it could be better at angles and can work in the dark. Microsoft claims 1 in 100,000 false accepts (letting the wrong person in). I always think it’s silly when companies make false accept claims without stating the false reject numbers (when the right person doesn’t get in). There’s always a tradeoff. For example, I could say my coffee mug uses a biometric authenticator to let the right user telepathically levitate it and it has less than a 1 in a billion false accepts (it happens to also have a 100% false reject since even the right biometric can’t telepathically levitate it!). Nevertheless, with a 3D camera I think Microsoft’s face authentication can be more accurate than Sensory’s 2D face authentication. BUT, it’s unlikely that the face recognition on its own will ever be more accurate than our TrulySecure, which still offers a lower False Accept rate than Microsoft – and less than 10% False Reject rate to boot!
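The false accept versus false reject tradeoff is easy to see with a few made-up numbers. The Python sketch below uses invented match scores (not measurements from any real system, Microsoft's or Sensory's) and counts both kinds of error at several acceptance thresholds.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_accept_rate, false_reject_rate) at a given threshold.

    An attempt is accepted when its match score is >= threshold.
    """
    false_accepts = sum(score >= threshold for score in impostor_scores)
    false_rejects = sum(score < threshold for score in genuine_scores)
    return (
        false_accepts / len(impostor_scores),
        false_rejects / len(genuine_scores),
    )


# Invented match scores, for illustration only.
GENUINE = [0.91, 0.85, 0.78, 0.66, 0.95]   # the right person trying to get in
IMPOSTOR = [0.10, 0.35, 0.52, 0.70, 0.22]  # the wrong person trying to get in

for threshold in (0.3, 0.6, 0.9, 1.1):
    far, frr = error_rates(GENUINE, IMPOSTOR, threshold)
    print(f"threshold={threshold}: false accept={far:.0%}, false reject={frr:.0%}")
```

Raising the threshold pushes false accepts down and false rejects up. The last threshold, 1.1, is the coffee mug: no score can reach it, so false accepts are zero and every genuine user is rejected. That is why a false accept number means little without the false reject number measured at the same setting.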
Nevertheless, I like the announcement of 3D cameras for face recognition and am excited to see how their system performs.
January 21, 2015
I know it’s been months since Sensory has blogged and I thank you for pinging me to ask what’s going on… Well, a lot is going on at Sensory. There are really 3 areas that we are putting a strategic focus on, and I’ll briefly mention each:
Of course, there’s a lot more going on than just this…we recently announced partnerships with Intel and Nok Nok Labs, and we have further lowered power consumption in touchless control and always-on voice systems with the addition of our hardware block for low power sound detection.
October 15, 2014
A couple of news headlines have appeared recently asserting that voice activation is unsafe. I thought it was time for Sensory to weigh in on a few aspects of this since we are the pioneers in voice activation:
July 25, 2014
I see a bit of irony that a great Saturday Night Live alumnus is launching a campaign to decrease spoofing. I’m talking about Senator Al Franken, who has been looking into the problem of stolen fingerprints, see article.
Senator Franken challenges Samsung and Apple with some fair concerns about the problem of stolen or spoofed biometrics. The issue is that most biometrics that could be stolen can’t be easily replaced. We only have one face, two eyes, and 10 fingers, so not a lot of chances to replace or change them if they are stolen.
The mobile phone companies, challenged on the fingerprint issue, had two responses:
I think Franken is right to question the utility of biometric fingerprints, because a product like Sensory’s TrulySecure (combining voice and vision authentication) offers a large number of advantages:
Here’s a more canned demo on Sensory’s home page that better showcases some of the anti-spoofing features.