Archive for the ‘speaker identification’ Category
January 11, 2019
Karen Webster is one of the best writers and interviewers in tech/fintech.
Read the full interview here.
August 30, 2017
A few days ago I wrote a blog post about assistants and wake words, in which I said:
“We’ll start seeing products that combine multiple assistants into one product. This could create some strange and interesting bedfellows.”
Interesting that this was just announced:
Here’s another prediction for you…
All assistants will start knowing who is talking to them. They will hear your voice and look at your face and know who you are. They will bring you the things you want (e.g. play my favorite songs), and only allow you to conduct transactions you are qualified for (e.g. order more black licorice). Today there is some training required, but in the near future they will just learn who is who, much like a newborn quickly learns the family members without any formal training.
October 16, 2015
I saw a LinkedIn message to one of the biometrics groups in which I’m a member linking to a new video on biometrics:
I was quite surprised to see that I am actually in it!
It’s a great topic: banks turning to biometrics. The video doesn’t say much about what’s really happening and why, so I’ll blog about a few salient points worth understanding:
1) Passwords are on their deathbed. This is old news and everyone gets it, but it is worth repeating. Passwords are too easy to crack and/or too hard to remember.
2) Mobile is everything, and mobile biometrics will be the entry point. Our mobile phones will be the tools to control and open a variety of things. Our phones will know who we are and keep track of the probability of that changing as we use them. Mobile banking apps will be accessed through biometrics and that will allow us to not only check balances, but pay or send money or speed ATM transactions.
3) EMV credit cards are here…Biometric credit confirmation is next! Did you get a smart card from your bank? Europay, MasterCard, and Visa (the “EMV” in the name) decided to reduce fraud by shifting fraud risk based on the security implemented. Smart cards are here now; biometrics will be added to aid fraud prevention.
4) It’s all about convenience & security. So much focus has been on security that convenience was often overlooked. There was a perception that you can’t have both! With biometrics you actually can have an extremely fast and convenient solution that is also highly accurate.
5) Layered biometrics will rule. Any one biometric or authentication approach in isolation will fail. The key is to layer a variety of authentication techniques that enhance the system’s security but don’t hurt convenience. Voice and face authentication can be used together, passwords can be added on top if the biometric confirmation is unsure, and tokens, fingerprints, or iris scans can also be deployed if the security isn’t high enough. The key is knowing the accuracy of the match and raising security to the desired level in a stepped function so as to maximize user convenience.
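To make the stepped idea in point 5 concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any shipping system: the `Factor` class, the `authenticate` function, the confidence numbers, and the rule for combining confidences (treating each factor as independent) are all hypothetical. Factors are tried in order of convenience, and the loop stops as soon as the combined confidence reaches the level the action requires.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Factor:
    """One authentication step (hypothetical structure for illustration)."""
    name: str
    check: Callable[[], float]  # returns a match confidence in [0, 1]


def authenticate(factors: List[Factor], required: float) -> Tuple[bool, List[str]]:
    """Run factors in order, most convenient first, and stop as soon as
    the combined confidence reaches the required level."""
    miss = 1.0  # chance that every factor so far was wrong (assumes independence)
    used: List[str] = []
    for factor in factors:
        miss *= 1.0 - factor.check()
        used.append(factor.name)
        if 1.0 - miss >= required:
            return True, used
    return False, used


# Made-up confidences standing in for real matchers.
factors = [
    Factor("voice", lambda: 0.90),
    Factor("face", lambda: 0.80),
    Factor("passcode", lambda: 0.99),
]

print(authenticate(factors, required=0.85))  # low-risk action
print(authenticate(factors, required=0.95))  # higher-risk action
```

With these toy numbers, a low-risk action is satisfied by voice alone, while a higher-risk action also asks for a face match. The user only pays the extra inconvenience when the required security level calls for it.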
October 1, 2015
Todd Mozer’s interview with Martin Wasserman on FutureTalk
August 6, 2015
We first came out with TrulyHandsfree about five years ago. I remember talking to speech tech executives at MobileVoice as well as other industry tradeshows, and when talking about always-on hands-free voice control, everybody said it couldn’t be done. Many had attempted it, but their offerings suffered from too many false fires, or not working in noise, or consuming too much power to be always listening. It seems that everyone thought a button was necessary for a product to be usable!
In fact, I remember the irony of being on an automotive panel, and giving a presentation about how we’ve eliminated the need for a trigger button, while the guy from Microsoft presented on the same panel the importance of where to put the trigger button in the car.
Now, five years later, voice activation is the norm… we see it all over the place with OK Google, Hey Siri, Hey Cortana, Alexa, Hey Jibo, and of course if you’ve been watching Sensory’s demos over the years, Hello BlueGenie!
Sensory pioneered the button-free, touch-free, always-on voice trigger approach with TrulyHandsfree 1.0 using a unique, patented keyword spotting technology we developed in-house, and from its inception it was highly robust to noise and ultra-low power. Over the years we have ported it to dozens of platforms, including DSP/MCU IP cores from ARM, Cadence, CEVA, NXP CoolFlux, Synopsys and Verisilicon, as well as integrated circuits from Audience, Avnera, Cirrus Logic, Conexant, DSPG, Fortemedia, Intel, Invensense, NXP, Qualcomm, QuickLogic, Realtek, STMicroelectronics, TI and Yamaha.
This vast platform compatibility has allowed us to work with numerous OEMs to ship TrulyHandsfree in over a billion products!
Sensory didn’t just innovate a novel keyword spotting approach; we’ve continually improved it by adding features like speaker verification and user-defined triggers. Working with partners, we lowered the draw on the battery to less than 1 mA, and Sensory introduced hardware and software IP to enable ultra-low-power voice wakeup of TrulyHandsfree. All the while, our accuracy has remained the best in the industry for voice wakeup.
We believe the bigger, more capable companies trying to make voice triggers have been forced to use deep learning speech techniques to try to catch up with Sensory in the accuracy department. They have yet to catch up. Through deep learning they have grown their products to a very usable accuracy level, but they have lost much of the advantage of small footprint and low power in the process.
Sensory has been architecting solutions for neural nets in consumer electronics since we opened the doors more than 20 years ago. With TrulyHandsfree 4.0 we are applying deep learning to improve accuracy even further, pushing the technology even more ahead of all other approaches, yet enabling an architecture that has the ability to remain small and ultra-low power. We are enabling new feature extraction approaches, as well as improved training in reverb and echo. The end result is a 60-80% boost in what was already considered industry-leading accuracy.
I can’t wait for TrulyHandsfree 5.0…we have been working on it in parallel with 4.0, and although it’s still a long way off, I am confident we will make the same massive improvements in speaker verification with 5.0 that we are making for speech recognition in 4.0! That will once again advance the state of the art in embedded speech technologies!
June 30, 2014
May 7, 2014
If you read through the biometrics literature, you will see a general security-based ranking of biometric techniques, starting with retinal scans as the most secure, followed by iris, hand geometry and fingerprint, voice, face recognition, and then a variety of behavioral characteristics.
The problem is that these studies have more to do with “in theory” than “in practice” on a mobile phone, but they nevertheless mislead many companies into thinking that a single biometric can provide the results required. This is really not the case in practice. Most companies will require that False Accepts (errors caused by the wrong person or thing getting in) and False Rejects (errors caused by the right person not getting in) be so low that the rate where these two are equal (the equal error rate, or EER) would be well under 1% across all conditions. Here’s why the studies don’t reflect the real world of a mobile phone user:
A great case in point is the fingerprint readers now deployed by Apple and Samsung. These are extremely expensive devices, and the literature would make one think that they are highly accurate, but Apple doesn’t have the confidence to allow them to be used in the iTunes store for ID, and San Jose Mercury News columnist Troy Wolverton says:
“I’ve not been terribly happy with the fingerprint reader on my iPhone, but it puts the one on the S5 to shame. Samsung’s fingerprint sensor failed repeatedly. At best, I would get it to recognize my print on the second try. But quite often, it would fail so many times in a row that I’d be prompted to enter my password instead. I ended up turning it off because it was so unreliable (full article).”
There is a solution to this problem: use sensors already on the phone to minimize cost, and deploy a biometric chain combining face verification, voice verification, or other techniques that can be easily implemented in a user-friendly manner. The combined usage creates a very low equal error rate and becomes “immune” to conditions and compliance issues by having a series of biometric and other secure backup systems.
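For readers who want to see how False Accepts, False Rejects, and the equal error rate fit together, here is a small Python sketch. Everything in it is an assumption made for illustration: the function names, the simple score averaging used to combine voice and face, and above all the scores themselves, which are invented toy numbers rather than measurements. It sweeps a decision threshold over the scores and reports the point where the false accept rate and false reject rate are closest.

```python
from typing import List, Tuple


def error_rates(genuine: List[float], impostor: List[float],
                threshold: float) -> Tuple[float, float]:
    """Return (false accept rate, false reject rate) at a given threshold.
    A score at or above the threshold counts as an accept."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine: List[float],
                     impostor: List[float]) -> Tuple[float, float]:
    """Sweep every observed score as a threshold and return
    (threshold, approximate EER) where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]


def fuse(a: List[float], b: List[float]) -> List[float]:
    """Combine two modalities by averaging scores (an assumed, simple rule)."""
    return [(x + y) / 2 for x, y in zip(a, b)]


# Invented toy scores: one entry per attempt, same attempts for both modalities.
voice_genuine, voice_impostor = [0.9, 0.8, 0.4, 0.7], [0.2, 0.5, 0.3, 0.1]
face_genuine, face_impostor = [0.6, 0.3, 0.9, 0.8], [0.4, 0.2, 0.1, 0.5]

print("voice:", equal_error_rate(voice_genuine, voice_impostor))
print("face: ", equal_error_rate(face_genuine, face_impostor))
print("fused:", equal_error_rate(fuse(voice_genuine, face_genuine),
                                 fuse(voice_impostor, face_impostor)))
```

In this toy data each modality has one bad attempt (a noisy voice sample, a poorly lit face), and the bad attempts do not coincide, so the averaged score separates genuine users from impostors better than either modality alone. Real data will not be this clean; the sketch only shows the mechanics.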
Sensory has an approach we call SMART (Sensory Methodology for Adaptive Recognition Thresholding). It looks at environmental and usage conditions and intelligently deploys thresholds across a multitude of biometric technologies to yield a highly accurate solution that is easy to use and fast to respond, yet robust to environmental and usage models, AND uses existing hardware to keep costs low.
November 15, 2013
Android introduced the new KitKat OS for the Nexus 5, and Sensory has gotten lots of questions about the new “always listening” feature that allows a user to say “OK Google” followed by a Google Now search. Here are some of the common questions: