Everyone understands that with biometrics it’s all about the combination of convenience and accuracy. Biometrics can be much more convenient than a password because there’s nothing to remember and safety is enhanced because biometrics aren’t easily predictable and can’t be stolen.
But not all biometrics are the same in convenience and accuracy. Here’s an internal comparison Sensory did that helped place our focus on face and voice for their combination of convenience and accuracy:
Although people inherently understand the importance of convenience and accuracy, most people get confused by accuracy. What really matters isn’t so much a pure accuracy defined by false accepts and false rejects of random users. What really is important is preventing people from intentionally breaking the system. That goes beyond accuracy and could be called liveness or spoofability.
When Apple first introduced a fingerprint biometric and claimed only 1 in 50,000 people could get in, they were talking about “false accepts,” or the ability of a random fingerprint to pass as the right one. Videos quickly came out showing how Elmer’s glue or gummy bears could be used to spoof the iPhone. Then, when Apple introduced Face ID with 3D cameras, they made a big point of showing that a realistic 3D mask wouldn’t break the system. That was a demonstration of how difficult the new approach was to spoof!
As it turns out, hitting that 1/50,000 false accept rate from a random user is quite easy with 2D face authentication or even voice authentication. Sensory achieved those metrics very quickly in our research efforts. What was harder was Stopping the Spoof.
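To make the distinction concrete, here is a minimal sketch of how the two kinds of numbers differ. It assumes you have logged accept/reject decisions for three separate sets of attempts: the right user, random other users, and deliberate spoofs. The function names and the sample data are hypothetical illustrations, not Sensory's test methodology.

```python
def accept_fraction(decisions):
    """Fraction of attempts the system accepted (True means accepted)."""
    return sum(decisions) / len(decisions)


def reject_fraction(decisions):
    """Fraction of attempts the system rejected (False means rejected)."""
    return sum(not d for d in decisions) / len(decisions)


def summarize(genuine, impostor, spoof):
    """Summarize three separate test sets.

    genuine:  attempts by the enrolled user (should be accepted)
    impostor: attempts by random other users (should be rejected)
    spoof:    deliberate attacks such as photos or recordings (should be rejected)
    """
    return {
        "false_reject_rate": reject_fraction(genuine),
        "false_accept_rate": accept_fraction(impostor),
        "spoof_accept_rate": accept_fraction(spoof),
    }


# Made-up example data: one random user in 50,000 gets in,
# but 3 of 10 deliberate spoof attempts succeed.
report = summarize(
    genuine=[True] * 19 + [False],
    impostor=[False] * 49_999 + [True],
    spoof=[True] * 3 + [False] * 7,
)
for name, value in report.items():
    print(f"{name}: {value:.5f}")
```

In this made-up data the system hits the 1/50,000 false accept rate and is still easy to break on purpose, which is why the spoof accept rate has to be measured on its own.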
Sensory has now come up with some novel approaches that use a 2D camera to stop spoofing with photos, 3D masks, face cutouts, videos and more. The result is the most spoof-robust system that a 2D “passive” solution has ever achieved!
Stopping a face spoof by “active” means has always been the easy route, but asking people to do random things “actively” with their face (wink, look right, etc.) hurts convenience. So, with Sensory’s face authentication technology, we created a deep-learned passive model that doesn’t affect convenience. We wanted Passive Face ID to avoid awkward movements and inconvenience.
You can see how it works in this video:
We made 3D masks to try spoofing in all different angles and lighting:
Sensory spent a lot of time trying to accomplish passive liveness from voice. We tried a lot of interesting techniques, but when it came down to it, it was hard to get ultra-low spoof rates from recordings while still allowing the right user to get in under varying conditions (background noise, echoes, etc.).
We decided that for voice we could accept active liveness to improve accuracy. We did this by combining text-independent speaker verification (which identifies a user by their voice rather than by a specific password) with Sensory’s TrulyNatural speech recognition (which can recognize any random phrase). This way we can prompt the user to say a random phrase that an attacker couldn’t possibly have a recording of. We use TrulyNatural to confirm that the RIGHT phrase is spoken, and TrulySecure Voice to perform the text-independent speaker ID.
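The prompt-and-verify flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the word list, the function names, and the 0.8 threshold are hypothetical, and the `transcript` and `speaker_score` inputs stand in for whatever a speech recognizer and a text-independent speaker verifier would return. It does not use or represent the actual TrulyNatural or TrulySecure APIs.

```python
import secrets

# Hypothetical vocabulary; a real system would draw from a far larger one.
WORDS = ["amber", "river", "seven", "candle", "orbit", "maple", "tiger", "violet"]


def make_challenge(num_words=4):
    """Pick an unpredictable phrase for the user to read aloud."""
    return " ".join(secrets.choice(WORDS) for _ in range(num_words))


def normalize(text):
    """Lowercase and collapse whitespace so the comparison ignores formatting."""
    return " ".join(text.lower().split())


def verify_attempt(challenge, transcript, speaker_score, threshold=0.8):
    """Accept only if the right phrase was spoken AND the voice matches.

    transcript:    what the speech recognizer heard (assumed input)
    speaker_score: similarity to the enrolled voice, 0.0 to 1.0 (assumed input)
    """
    phrase_ok = normalize(transcript) == normalize(challenge)
    speaker_ok = speaker_score >= threshold
    return phrase_ok and speaker_ok


challenge = make_challenge()
print("Please say:", challenge)
# A replayed recording of some other phrase fails even with a perfect voice match.
print(verify_attempt(challenge, "open sesame", speaker_score=0.99))
```

The challenge is drawn with `secrets` rather than `random` so the phrase cannot be predicted ahead of time. A pre-recorded clip fails the phrase check, and a different speaker reading the prompt fails the voice check.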
It works quite well, and users are not finding it too intrusive:
Depending on your desired level of security, either passive face liveness or active voice liveness may meet your needs. However, when these two technologies are combined it gets really exciting. The result is the most robust anti-spoofing solution available, at very low cost, requiring only a 2D camera and a microphone. Finally, there is a solution with best-in-class accuracy, maximum anti-spoofing performance, low-cost components and high convenience.