HEAR ME -
Speech Blog

Posts Tagged ‘speaker verification’

Face and Voice Biometrics Quickly Gaining Popularity

February 22, 2016

Peter O’Neill at FindBiometrics recently interviewed our CEO, Todd Mozer, about Sensory’s announcement of TrulySecure 2.0. Check out the interview here: FindBiometrics

Summary: The industry is embracing biometrics faster than ever, and many CE companies and app developers are adopting face and voice biometrics to improve user experience and bolster security. Face and voice offer significant advantages over other biometric modalities, notably in convenience and, particularly in the case of our TrulySecure technology, in accuracy and security.

Sensory’s TrulySecure technology has evolved dramatically since its release, and we recently announced TrulySecure 2.0, which uses real-world usage data collected from our “AppLock by Sensory” app on the Google Play store. By applying what we learned with AppLock, we were able to adapt a deep learning approach using convolutional neural networks to improve the accuracy of our face authentication. Additionally, we significantly improved the performance of our speaker verification in real-world conditions by training better neural nets on the collected data.

Overall, we have been able to update TrulySecure’s already excellent performance to be even better! The solution is now faster, smarter and more secure, and is the most accurate face and voice biometrics solution available.

Sensory’s CEO, Todd Mozer, interviewed on FutureTalk

October 1, 2015

Todd Mozer’s interview with Martin Wasserman on FutureTalk

TrulySecure From Sensory Becomes First Face and Voice Biometrics Technology to be FIDO UAF Certified

August 20, 2015

Santa Clara, Calif., – August 20, 2015 – TrulySecure Multimodal Biometric Authentication from Sensory, Inc. Has Been Fully Tested and Certified for Compliance with the FIDO Universal Authentication Framework Specifications V1.0

Sensory Inc., a Silicon Valley based company focused on improving the user experience and security of consumer electronics through state-of-the-art embedded voice and vision technologies, today announced that its TrulySecure™ is the first multimodal face and voice biometric authentication software to be FIDO Certified™. The FIDO (Fast IDentity Online) Alliance tested TrulySecure for compliance with the FIDO UAF (Universal Authentication Framework) 1.0 specifications. Certification confirms that implementations of the FIDO specifications are uniform across products and that those products are interoperable with other products and services that support the FIDO 1.0 specifications.

“We recognize Sensory for building TrulySecure to be fully compliant with the FIDO Universal Authentication Framework specifications and are excited to add their innovative multimodal biometric authentication solution to the FIDO Alliance’s prestigious roster of FIDO UAF Certified authenticators,” said Brett McDowell, FIDO Alliance executive director. “As more enterprises, application developers and mobile device makers shift away from password authentication, solutions like Sensory’s TrulySecure multimodal biometric authentication software will continue to prove valuable as an essential, secure means of authenticating users and keeping their data safeguarded.”

Working with the FIDO Alliance to certify compliance with FIDO standards and interoperability of TrulySecure demonstrates Sensory’s commitment to advancing the current state of user authentication, by ensuring that the industry’s most secure multimodal face and voice authentication software can be easily integrated within authentication solutions from FIDO Certified™ providers. Sensory joined the FIDO Alliance in early 2015 to work alongside other companies eager to create more secure user authentication protocols. Sensory has been a strong supporter of the FIDO Alliance since its inception and has worked with companies like Nok Nok Labs to ensure the biometric authenticator portion of their authentication solution, powered by TrulySecure from Sensory, was fully compliant with FIDO UAF 1.0 specs.

“Sensory’s TrulySecure is a great example of what can be delivered with multimodal biometrics and we are happy to support the solution within our own FIDO Certified S3 Authentication Suite,” said Ramesh Kesanupalli, founder of Nok Nok Labs and FIDO visionary. “Enterprises are looking for turnkey user solutions that offer a mix of authentication methods. Working with Sensory allows Nok Nok Labs to provide its customers with a greater variety of solutions that offer superior security compared to vulnerable passwords.”

TrulySecure leverages Sensory’s deep strengths in speech processing, computer vision, and machine learning. The combination of face recognition and speaker verification to authenticate a specific individual allows users to rest assured that their device is secure, without the hassle of fumbling around with a fingerprint reader or entering a password or PIN every time they want to access it or authenticate to sites and services. Consistent with FIDO standards, TrulySecure is an on-device biometric not requiring a cloud connection. Embedded authentication is a preferred approach for consumers and businesses that don’t want their biometric information stored outside of their personal devices. Embedded biometric solutions are also preferred for their higher security and reliability compared to cloud based systems, which have proven to be vulnerable to hackers and break-ins, and undependable in low-signal/no Internet environments. In addition to the security and dependability benefits of being embedded, TrulySecure further safeguards devices and data by requiring two forms of biometrics, making it at least twice as secure as even the best fingerprint readers found on mobile devices.

The advantages of TrulySecure when compared to other biometric authentication methods include:

  • Simple touch-free authentication – users do not need to tap a screen or swipe a finger, just look at the screen and say the passphrase
  • It’s fast! – embedded within software or on device, TrulySecure does not require an Internet or cloud connection to work and responds instantly
  • Robust to environmental changes – works in real-world situations, including low light or high noise environments
  • Nearly impenetrable by imposters – unlike fingerprint readers, which are vulnerable to spoofing, TrulySecure features anti-spoofing techniques to eliminate the chances of an intruder getting access
  • No additional hardware costs – TrulySecure only requires a device have a microphone and a camera; nearly all smartphones, tablets and PCs already feature these components
  • No hardware to wear out – biometric sensors, such as fingerprint readers, wear out with use; since TrulySecure does not rely on special sensors, there is no risk of authentication hardware failure

“We at Sensory are huge supporters of the work the FIDO Alliance has done to create an exciting consortium focused on streamlining user transactions with on-device biometrics,” said Todd Mozer, chairman and CEO of Sensory, Inc. “Promoting biometrics for more than two decades, we are pleased that our TrulySecure technology has become the first multimodal face and voice biometrics technology to be awarded the status of FIDO Certified. By working with companies across the entire authentication ecosystem to certify the interoperability of their FIDO Certified technologies with TrulySecure, we have made it even easier for companies to integrate the industry’s easiest to use and most secure biometric authentication technology within their products.”

For more information about this announcement, Sensory or its technologies, please contact sales@sensory.com; Press inquiries: press@sensory.com

# # #

About The FIDO Alliance
The FIDO (Fast IDentity Online) Alliance was formed in July 2012 to address the lack of interoperability among strong authentication technologies, and remedy the problems users face with creating and remembering multiple usernames and passwords. The FIDO Alliance is changing the nature of authentication with standards for simpler, stronger authentication that define an open, scalable, interoperable set of mechanisms that reduce reliance on passwords. FIDO authentication is stronger, private, and easier to use when authenticating to online services with FIDO Certified™ products and services.

About Nok Nok Labs
Nok Nok Labs provides organizations with the ability to bring strong, FIDO-based authentication infrastructure to their mobile and web applications. The Nok Nok Labs S3 Authentication Suite enables organizations to accelerate revenues, reduce fraud, and strengthen security. Nok Nok Labs is a founding member of the FIDO Alliance with customers and partners that include NTT DoCoMo, PayPal, Alipay, Samsung and Lenovo. For more information, visit www.noknok.com.

KitKat’s Listening!

November 15, 2013

Android introduced the new KitKat OS for the Nexus 5, and Sensory has gotten lots of questions about the new “always listening” feature that allows a user to say “OK Google” followed by a Google Now search. Here are some of the common questions:

  1. Is it Sensory’s? Did it come from LG (like the hardware)? Is it Google’s in-house technology? I believe it was developed within the speech team at Android. LG does use Sensory’s technology in the G2, but this does not appear to be an implementation of Sensory. Google has one of the smartest, most capable, and largest speech recognition groups in the industry, and they certainly have the chops to build a keyword spotting technology. Actually, developing a voice activated trigger is not very hard. There are several dozen companies that can do this today (including Qualcomm!). However, making it usable in an “always on” mode, where accuracy is really important, is very difficult.
  2. The KitKat trigger is just like the one on Moto X, right? Ugh, definitely not. Moto X really has “always on” capabilities. This requires low power operation. The Android approach consumes too much power to be left “always on”. Also, the Moto X approach combines speaker verification so the “wrong” users can’t just take over the phone with their voice. Motorola is a Sensory licensee; Android isn’t.
  3. How is Sensory’s trigger word technology different than others?
    • First of all, Sensory’s approach is ultra low power. We have IC partners like Cirrus Logic, DSPG, Realtek, and Wolfson that are measuring current consumption in the 1.5-2mA range. My guess is that the KitKat implementation consumes 10-100 times more power than this. There are two reasons for this: 1) we have implemented a “deeply embedded” approach on these tiny DSPs, and 2) Sensory’s approach requires as little as 5 MIPS, whereas most other recognizers need 10 to 100 times more processing power and must run on the power hungry Android processor!
    • Second, Sensory’s approach requires minimal memory. The small DSPs that run at ultra low power have less RAM and more limited memory access. The traditional approach to speech recognition is to collect tons of data and build huge models that take a lot of memory, which makes it very difficult to move onto low power silicon.
    • Third, being left always on really pushes accuracy, and Sensory stands out in the accuracy of its triggers. Accuracy is usually measured by looking at the two types of errors: “false accepts”, when the trigger fires unintentionally, and “false rejects”, when it doesn’t let a person in when they say the right phrase. When there’s a short listening window, “false accepts” aren’t too much of an issue, and the KitKat implementation has very intentionally allowed a “loose” setting, which I suspect would produce too many false accepts if it were left “always on”. For example, I found this YouTube video that shows “OK Google” works great, but so do “OK Barry” and “OK Jarvis”.
    • Finally, Sensory has layered other technologies on top of the trigger, like speaker verification and speaker identification. Sensory has also implemented a “user defined trigger” capability that allows the end customer to define their own trigger, so the phone can respond accurately and at ultra low power to the user’s personalized commands!
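The trade-off between “false accepts” and “false rejects” described above can be illustrated with a minimal Python sketch. The scores and thresholds below are made-up numbers chosen for illustration, not measurements from Sensory’s or Google’s recognizers, and `error_rates` is a hypothetical helper.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Return (false_reject_rate, false_accept_rate) for a trigger that
    fires whenever a score is at or above `threshold`."""
    false_rejects = sum(1 for s in target_scores if s < threshold)
    false_accepts = sum(1 for s in nontarget_scores if s >= threshold)
    return (false_rejects / len(target_scores),
            false_accepts / len(nontarget_scores))

# Hypothetical recognizer scores (higher means "sounds more like the trigger").
target_scores = [0.91, 0.85, 0.78, 0.66, 0.95]     # user really said the phrase
nontarget_scores = [0.10, 0.35, 0.55, 0.70, 0.20]  # background speech, similar phrases

loose = error_rates(target_scores, nontarget_scores, threshold=0.5)
tight = error_rates(target_scores, nontarget_scores, threshold=0.8)
print("loose setting (false rejects, false accepts):", loose)
print("tight setting (false rejects, false accepts):", tight)
```

With these invented scores, the loose setting never rejects the real phrase but fires on two of the five non-trigger utterances, while the tight setting does the opposite. A short listening window can tolerate the loose setting; an always-on trigger sees far more non-trigger audio, so the same false accept rate produces many more unintended firings.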

Can end users create their own “trigger phrases”?

August 14, 2013

The technology does exist for end users to create their own unique wake up words and/or speaker verification pass-phrases. If the phrase is known and prepared for in advance, we can typically achieve a higher accuracy. Some care needs to be put into training new or unexpected words to ensure the phrases have sufficient differentiated content that doesn’t frequently occur in real world conversations. Also, there needs to be excellent application design to ensure the templates recorded are of good quality. A bad training recording can really mess things up, but adaptive averaging approaches and good application designs can prevent this. We usually recommend training in a quiet place; the trained phrase can then be used anywhere.
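One way to picture “adaptive averaging” is a running average of enrollment recordings that skips any recording that looks like an outlier. The sketch below is only an assumption about how such a scheme could work, not Sensory’s method: the feature vectors, the cosine similarity check, and the `min_similarity` value of 0.8 are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def update_template(template, count, recording, min_similarity=0.8):
    """Fold `recording` into a running-average template.

    The first recording becomes the template. Later recordings are
    averaged in only if they resemble the current template; otherwise
    they are treated as bad recordings and ignored.
    Returns (template, count).
    """
    if count == 0:
        return list(recording), 1
    if cosine_similarity(template, recording) < min_similarity:
        return template, count  # likely a bad recording: keep the old template
    averaged = [(t * count + r) / (count + 1)
                for t, r in zip(template, recording)]
    return averaged, count + 1

# Hypothetical 2-dimensional features; the third recording is an outlier.
template, count = [], 0
for recording in ([1.0, 0.0], [1.0, 0.2], [0.0, 1.0]):
    template, count = update_template(template, count, recording)
print(template, count)
```

In this toy run the first two recordings are averaged and the third is discarded, so one noisy or mis-spoken enrollment does not drag the template away from the user’s real phrase.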

Let’s talk security

August 12, 2013

Here’s another question I hear: If the device is listening for a specific wake up phrase, how do I stop others from using it?

Some users and analysts have noted the amazing sensitivity of Glass. In my own experiments I’ve noticed that it’s even responsive to whispers in a quiet room or speakers from across the room, so it is possible that someone not wearing it can activate it in quiet conditions.

Speaker verification could be added to wake up words without hurting the power consumption. The settings can be very light to reduce false firing and keep out some percentage of unintended users, or they can be tighter for more security. A “tighter”, higher-security setting means a higher likelihood that the right user won’t always get in; that’s why we use a “light” setting, so wrong users are USUALLY kept out and right users virtually always get in. The speaker verification requires training, but this could happen in an “adaptive” fashion with use, so that the training is invisible to the user. The longer the training word or phrase, the better the accuracy!
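The “light” versus “tight” settings above can be sketched as a second check that runs only after the wake up phrase has been detected. This is an illustrative assumption about the decision logic, not Sensory’s implementation; the threshold values and the `should_wake` function are hypothetical.

```python
# Hypothetical thresholds on a speaker-match score between 0 and 1.
LIGHT_THRESHOLD = 0.4  # right user virtually always gets in
TIGHT_THRESHOLD = 0.8  # more secure, but the right user is rejected more often

def should_wake(wake_word_detected, speaker_score, threshold=LIGHT_THRESHOLD):
    """Wake the device only if the phrase was heard AND the voice
    matches the enrolled user well enough for the chosen setting."""
    return wake_word_detected and speaker_score >= threshold

# The enrolled user on a noisy day (score 0.6) and a stranger (score 0.2).
print(should_wake(True, 0.6))                   # light setting
print(should_wake(True, 0.6, TIGHT_THRESHOLD))  # tight setting
print(should_wake(True, 0.2))                   # stranger, light setting
```

With these invented scores, the light setting lets the enrolled user in and keeps the stranger out, while the tight setting would also turn away the enrolled user on that noisy day.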

Follow the Leader in Mobile

October 2, 2012

I really enjoyed reading this article interviewing Vlad Sejnoha, Nuance’s CTO. Most people would consider Nuance the leader in speech recognition today, and Vlad is certainly a very smart, thoughtful, and articulate man.

I enjoyed it for a few different reasons. The first and main reason I liked the article is that it helps to push the idea Sensory has been championing for the past several years: devices don’t have to be touched to enable voice commands, and you should be able to just start talking to things like we talk to each other. That’s what Sensory calls TrulyHandsfree, and it’s the technology that showed up in the first Bluetooth carkit that requires no touching (by BlueAnt) AND the first mobile phones that responded to voice without touch (Samsung’s Galaxy SII and SIII and Note). Even hit toys like Mattel’s award-winning Fijit Friends and Hallmark’s Interactive Books use this unique technology that just works when you talk to it. In fact, it really was the TrulyHandsfree feature that made Vlingo so popular, as this Vlingo video nicely states in its comparison between Vlingo and Siri. (Nuance bought Vlingo earlier this year, but the Sensory TrulyHandsfree didn’t come with it!)

The article says “Sejnoha believes that within a year or two you’ll be able to talk to your smartphone even as it lies idle on a desk, asking it questions such as, “When’s my next appointment?” The phone will be able to detect that you are speaking, wake itself up, and accomplish the task at hand.” Check out this Sensory video…this is definitely what Vlad is talking about! Yeah, we can do it today, and it’s REALLY FAST and really accurate.

But is it low power? Well that’s ABSOLUTELY KEY. That’s why Sensory partnered with Tensilica. Tensilica is a leader in low power audio DSPs for mobile phones. Sensory already has its TrulyHandsfree running on chips that run under 5 mW for a COMPLETE audio system. And that’s without having to wake up to understand the task at hand. We can drop by another 1-2mW by not being always on, but turning the recognizer off doesn’t do much. That’s because even if the full recognizer is shut down, you still need to run a mic and preamp, which drives a lot of the current consumption when you have a low power recognizer like TrulyHandsfree (it can run on as little as 7 MIPS!). This means it’s REALLY critical to have a low power recognizer as well, and that’s Sensory’s forte. We are expecting that by next year we will have systems running at 1-3mW!
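To put these milliwatt figures in perspective, here is a rough back-of-the-envelope calculation in Python. The 2000 mAh battery and the 3.7 V nominal voltage are assumptions for illustration, not figures from the article, and the estimate ignores conversion losses and everything else the phone is doing.

```python
def average_current_ma(power_mw, supply_voltage_v):
    """Current in mA drawn by a load of `power_mw` at `supply_voltage_v` (I = P / V)."""
    return power_mw / supply_voltage_v

def listening_hours(battery_mah, power_mw, supply_voltage_v=3.7):
    """Hours a battery could feed the always-listening audio system alone."""
    return battery_mah / average_current_ma(power_mw, supply_voltage_v)

BATTERY_MAH = 2000  # assumed phone battery capacity

for power_mw in (5.0, 3.0, 1.0):
    hours = listening_hours(BATTERY_MAH, power_mw)
    print(f"{power_mw} mW -> about {hours:.0f} hours ({hours / 24:.0f} days)")
```

Under these assumptions, a 5 mW audio system on its own would take roughly two months to drain the battery, which is why a listening budget of a few milliwatts is small next to the rest of a phone’s power draw.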

The article mentions “persistent” listening, but even though I’ve always preached this “always on” concept, I think what will really explode is “intelligent automatic listening”. That is, the device figures out when it needs to listen, and for what, and turns on to listen for it. So it doesn’t always have to be on…it will just seem that way because the devices are so intelligent. For example, a certain traveling speed could make a phone listen for car commands or car wake up words. An incoming call could cause the recognizer to wake up and listen for Answer/Ignore. For these to work, the device needs to run not only at very low power but also with VERY high accuracy. You don’t want to have a background conversation triggering the phone call to hang up! Accuracy is another Sensory forte! The combination of accuracy with low power consumption is a difficult mix to conquer! Sensory’s accuracy holds up not only in noise but also from a distance. When a recognizer works well with a poor S/N ratio, the signal can be lower (as it is from a distance) and/or the noise can be higher.
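The “intelligent automatic listening” idea can be sketched as a small function that picks what the recognizer should listen for from the device’s current context. Everything here is a hypothetical illustration: the function name, the phrase lists, and the 20 MPH driving threshold are assumptions, not part of any shipping product.

```python
def active_vocabulary(speed_mph=0.0, incoming_call=False, driving_speed_mph=20.0):
    """Return the phrases the recognizer should listen for right now.

    An empty list means the recognizer can stay off and save power.
    """
    if incoming_call:
        return ["answer", "ignore"]
    if speed_mph > driving_speed_mph:
        return ["car wake up word", "car commands"]
    return []

print(active_vocabulary())                    # idle on a desk: nothing to listen for
print(active_vocabulary(speed_mph=45.0))      # driving: car phrases
print(active_vocabulary(incoming_call=True))  # ringing: Answer/Ignore only
```

Keeping the active phrase list short in each context also helps accuracy, since a background conversation has fewer phrases it could accidentally match.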

So it’s really cool that Nuance is getting on the bandwagon behind Sensory’s innovations like TrulyHandsfree at low power. In fact, after Samsung’s release of the Galaxy SII with Sensory, Nuance did come out with an “always on and listening” mobile device; for fun we quickly ported our technology onto the same phone to compare…check out this video.

Something interesting we noticed was that after Sensory announced its speaker verification and speaker ID for mobile devices at CTIA this year, Nuance came out with its own announcement shortly thereafter, but there were no demos available, so we couldn’t do a comparison video.

Todd
sensoryblog@sensoryinc.com

Mobile Users Get it!

May 30, 2012

Sensory’s had a lot of press lately. We made 3 big announcements all pretty much together:

1) Announcing speaker verification

2) Announcing speaker identification

3) Saying Sensory is in the Samsung Galaxy S3

Sensory announced these just before CTIA in New Orleans. We had a small booth at the show, and gave demos at several events (on the CTIA stage and floor, at the Mobility Awards dinner, and at the excellent Pepcom Mobile Focus event).

We got a lot of nice press from this. I was thrilled that the Speech Technology email newsletter put our verification release as the featured and lead story. One of the articles I like best, though, just came out last week by Pete Pachal at Mashable http://mashable.com/2012/05/29/sensory-galaxy-s-iii/

This article is great for several key reasons. One is that Pete gets it. He didn’t just reprint our press release, but he added his commentary and wrapped it up in a nice story that hits some of the key issues.

However, what’s best is what the readers wrote in. I LOVE their insights and comments. Here are a few of the dialogs with my commentary attached:

JB: Seriously??? You still need to push a button to use Siri? I’ve had the “wake with voice” option on my crusty old HTC Incredible, via VLingo inCar, for about 2 years now. Hard to believe Apple is that far behind.

My response: EXACTLY JB! In fact that crusty old HTC using Vlingo, also uses Sensory’s TrulyHandsfree approach! Vlingo was our first licensee in the mobile space.

Scott: But this is talking about OS integration instead of app integration. And as I’m sure you’ve seen on your phone, and as the article noted, wake with voice options currently use a lot of power, which means I can’t see a lot of people willing to use it.

My response: Precisely, Scott! This is why we are implementing the “deeply embedded” approach that will take power consumption down by a factor of 10! Nevertheless, users LOVE it even if it consumes power:

JB – I use it all the time and since my phone plugs into the car’s adapter, I don’t really worry at all about power usage. It’s never been a problem.

My response – Yes, Vlingo and Samsung did a very nice implementation by having an “always listening” mode, particularly useful while driving. Other approaches we expect to see in the future are intelligent sensor based approaches so the phone knows when to listen and when not to (e.g. why not have it turn on and listen whenever you start traveling past 20 MPH, etc.)

Is there anything to prevent me from messing with another person’s phone?

Fillfill – Ha ha, imagine being in an auditorium and yelling “Hi Galaxy! … Erase Address Book! … Confirm!”

My comment – Funny! This is one of the reasons we have added speaker verification and identification features to the trigger function

DhanB – Siri doesn’t require a button. It can be activated by lifting the phone up to your face.

Great reader responses:

Darkreaper – …..while driving? (Right! That’s illegal in California and other states!)

Tone – Yes, but with the Samsung Galaxy II, I don’t have to touch it at all. As the article states, this is crucial when you’re in a situation, such as driving. I’ve dropped the phone on the floor while driving and I was still able to send a text message, an email and place a call with it sliding around the back seat. (Bluetooth) iPhone can’t compete, sorry. :-/

…and of course the old “butt dialing” problem:

Jason – This makes me think of the old “butt dialing” problem when you sat down on your phone cause I’d much prefer a manual trigger to prevent accidental usage.

My comment: Once again, I agree with the readers. Sensory isn’t pushing to force “always listening” modes on users, we just want to allow them the choice. We strongly recommend that products have multiple options for anything that can be done by voice or touch. We believe the users should have the right and the ability to access the power of mobile devices without being forced to touch them. And if they want to turn off this ability, that is certainly their choice! We turn off our ringers (at least we should) when we enter a meeting or go to the movies. Likewise, we can turn off hands free voice control when it’s not appropriate…and with the growing presence and power of intelligent sensors, it will get easier and easier (albeit with some mishaps along the way!) for the phones to know when they should listen!

A lot of people commented about Siri. Apple isn’t stupid. They get it that hitting buttons isn’t always the most convenient way to access voice control. That’s why there’s a sensor in place for when you lift the phone to your face (of course still requiring touch). It’s also why Siri can speak back. Apple pushed the Voice User Interface forward with Siri…Samsung pushed it further with TrulyHandsfree wake up. There will be a lot of back and forth over the coming years, and voice features will continue as a major battleground.

As devices get increasing utility WITHOUT touching the phones (e.g. remote control functions, accessing and receiving data by voice, etc.), the need for a TrulyHandsfree approach will grow stronger and stronger, and Sensory will continue to have the BEST solution – More Accurate, Lower Power, Faster Response Times, and NOW with built in speaker verification or speaker ID!

Todd
sensoryblog@sensoryinc.com