November 12, 2015
A really smart guy told me years ago that neural networks would prove to be the second best solution to many problems. While he was right about lots of stuff, he missed that one! Out of favor for years, neural networks have enjoyed a resurgence fueled by advances in deep machine learning techniques and the processing power to implement them. Neural networks are now seen to be the leading solution to a host of challenges around mimicking how the brain recognizes patterns.
Google’s Monday announcement that it was releasing its TensorFlow machine learning system on an open-source basis underscores the significance of these advances, and further validates Sensory’s 22-year commitment to machine learning and neural networks. TensorFlow is intended to be used broadly by researchers and students “wherever researchers are trying to make sense of very complex data — everything from protein folding to crunching astronomy data”. The initial release of TensorFlow runs on a single machine, and a version that runs across many computers will follow in the months ahead, Google said.
Microsoft also had cloud-based machine learning news on Monday, announcing an upgrade to Project Oxford’s facial recognition API launched in May specifically for the Movember Foundation’s no-shave November fundraising effort: a facial hair recognition API that can recognize moustache and beard growth and assign it a rating (as well as adding a moustache “sticker” to the faces of facial hair posers).
Project Oxford’s cloud-based services are based on the same technology used in Microsoft’s Cortana personal assistant and the Skype Translator service, and also offer emotion recognition, spell check, video processing for facial and movement detection, speaker recognition and custom speech recognition services.
While Google and Microsoft have announced some impressive machine-learning capabilities in the cloud, Sensory uniquely combines voice and face for authentication and improved intent interpretation on device, complementing what the big boys are doing.
From small footprint neural networks for noise robust voice triggers and phrase-spotted commands, to large vocabulary recognition leveraging a unique neural network with deep learning that achieves acoustic models an order of magnitude smaller than the present state-of-the-art, to convolutional neural networks deployed in the biometric fusion of face and voice modalities for authentication, all on device and not requiring any cloud component, Sensory continues to be the leader in utilizing state-of-the-art machine learning technology for embedded solutions.
Not bad company to keep!
October 16, 2015
I saw a LinkedIn message to one of the biometrics groups in which I’m a member linking to a new video on biometrics:
I was quite surprised to see that I am actually in it!
It’s a great topic…Banks turning to biometrics. The video doesn’t talk too much about what’s really happening and why, so I’ll blog about a few salient points worth understanding:
1) Passwords are on their deathbed. This is old news and everyone gets it, but it is worth repeating. They are too easy to crack and/or too hard to remember.
2) Mobile is everything, and mobile biometrics will be the entry point. Our mobile phones will be the tools to control and open a variety of things. Our phones will know who we are and keep track of the probability of that changing as we use them. Mobile banking apps will be accessed through biometrics and that will allow us to not only check balances, but pay or send money or speed ATM transactions.
3) EMV credit cards are here…Biometric credit confirmation is next! Did you get a smart card from your bank? Europay, MasterCard, and Visa decided to reduce fraud by shifting fraud risk based on the security implemented. Smart cards are here now; biometrics will be added to aid fraud prevention.
4) It’s all about convenience & security. So much focus has been on security that convenience was often overlooked. There was a perception that you can’t have both! With Biometrics you actually can have an extremely fast and convenient solution that is highly accurate.
5) Layered biometrics will rule. Any one biometric or authentication approach in isolation will fail. The key is to layer a variety of authentication techniques that enhance the system’s security but don’t hurt convenience. Voice and face authentication can be used together, passwords can be thrown on top if the biometric confirmation is unsure, and tokens or fingerprint or iris scans can also be deployed if the security isn’t high enough. The key is knowing the accuracy of the match and raising the security to the desired level in a stepped function so as to maximize user convenience.
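The stepped approach in point 5 can be sketched in a few lines of Python. This is only an illustration, not Sensory's implementation: the `Factor` class, the `combine` rule (which assumes the factors are independent), and the example scores are all hypothetical. The idea is to start with the most convenient check and add another layer only while the combined confidence is still below the level the transaction requires.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Factor:
    """One authentication layer (hypothetical structure for illustration)."""
    name: str
    check: Callable[[], float]  # returns a match confidence in [0, 1]


def combine(confidence_so_far: float, new_score: float) -> float:
    # Assumption: factors are independent, so the chance that an impostor
    # slips past every layer is the product of the individual miss chances.
    return 1.0 - (1.0 - confidence_so_far) * (1.0 - new_score)


def authenticate(factors: List[Factor], required: float) -> Tuple[bool, float, List[str]]:
    """Apply factors in order of convenience, stopping once confident enough."""
    confidence = 0.0
    used: List[str] = []
    for factor in factors:
        if confidence >= required:
            break  # secure enough already; do not bother the user further
        confidence = combine(confidence, factor.check())
        used.append(factor.name)
    return confidence >= required, confidence, used


# Hypothetical scores: most convenient factors first, password as a fallback.
layers = [
    Factor("voice", lambda: 0.90),
    Factor("face", lambda: 0.90),
    Factor("password", lambda: 0.99),
]

ok, confidence, used = authenticate(layers, required=0.95)
print(ok, used)
```

With these made-up scores, a low-risk action such as checking a balance could pass on voice alone, while a higher threshold pulls in face and then the password, so the user only pays the convenience cost when the security level demands it.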
October 1, 2015
Todd Mozer’s interview with Martin Wasserman on FutureTalk
August 26, 2015
Guest post: Sensory’s Marketing Team
The editors of Inc. identified Sensory as one of America’s fastest growing companies. The annual ranking of the 5,000 fastest-growing private companies in the United States put Sensory at 3,301 on the list with over 100% growth over three years and 30 new jobs added.
Sensory has a breadth of software products on the market contributing to its growth including TrulyHandsfree, TrulySecure and TrulyNatural, and can be found in over a billion consumer electronics devices around the world.
Congratulations to the Sensory team for making the Inc 5000 list this year!
Sensory Wins Coveted 2015 Speech Technology Magazine’s Industry Star Performer Award for TrulyNatural
August 11, 2015
Guest post by: Sensory’s Marketing Department:
For the second year in a row, Sensory earns Speech Technology Magazine’s Industry Star Performer Award! Having won the award in 2014 for TrulySecure Speaker Verification and for TrulyHandsfree 3.0, Speech Technology Magazine awarded Sensory the 2015 Speech Industry Star Performer Award for its recently released TrulyNatural technology.
TrulyNatural is a major leap forward for client-based speech recognition and is the first embedded large-vocabulary deep neural net speech recognition platform capable of supporting natural language. TrulyNatural is a scalable solution that can be implemented on highly constrained devices, supporting hundreds of phrases with a footprint of under a megabyte, or as a natural language engine on devices with more available memory, like mobile devices, cars, and more.
For more information about TrulyNatural, please visit the technology page.
See official article announcing the award at: http://www.speechtechmag.com
August 6, 2015
We first came out with TrulyHandsfree about five years ago. I remember talking to speech tech executives at MobileVoice as well as other industry tradeshows, and when talking about always-on hands-free voice control, everybody said it couldn’t be done. Many had attempted it, but their offerings suffered from too many false fires, or not working in noise, or consuming too much power to be always listening. Seems that everyone thought a button was necessary to be usable!
In fact, I remember the irony of being on an automotive panel, and giving a presentation about how we’ve eliminated the need for a trigger button, while the guy from Microsoft presented on the same panel the importance of where to put the trigger button in the car.
Now, five years later, voice activation is the norm… we see it all over the place with OK Google, Hey Siri, Hey Cortana, Alexa, Hey Jibo, and of course if you’ve been watching Sensory’s demos over the years, Hello BlueGenie!
Sensory pioneered the button-free, touch-free, always-on voice trigger approach with TrulyHandsfree 1.0 using a unique, patented keyword spotting technology we developed in-house, and from its inception it was highly robust to noise and ultra-low power. Over the years we have ported it to dozens of platforms, including DSP/MCU IP cores from ARM, Cadence, CEVA, NXP CoolFlux, Synopsys and Verisilicon, as well as integrated circuits from Audience, Avnera, Cirrus Logic, Conexant, DSPG, Fortemedia, Intel, Invensense, NXP, Qualcomm, QuickLogic, Realtek, STMicroelectronics, TI and Yamaha.
This vast platform compatibility has allowed us to work with numerous OEMs to ship TrulyHandsfree in over a billion products!
Sensory didn’t just innovate a novel keyword spotting approach, we’ve continually improved it by adding features like speaker verification and user defined triggers. Working with partners, we lowered the draw on the battery to less than 1mA, and Sensory introduced hardware and software IP to enable ultra-low-power voice wakeup of TrulyHandsfree. All the while, our accuracy has remained the best in the industry for voice wakeup.
We believe the bigger, more capable companies trying to make voice triggers have been forced to use deep learning speech techniques to try to catch up with Sensory in the accuracy department. They have yet to catch up. Through deep learning they have brought their products to a very usable accuracy level, but they have lost many of the advantages of small footprint and low power in the process.
Sensory has been architecting solutions for neural nets in consumer electronics since we opened the doors more than 20 years ago. With TrulyHandsfree 4.0 we are applying deep learning to improve accuracy even further, pushing the technology even more ahead of all other approaches, yet enabling an architecture that has the ability to remain small and ultra-low power. We are enabling new feature extraction approaches, as well as improved training in reverb and echo. The end result is a 60-80% boost in what was already considered industry-leading accuracy.
I can’t wait for TrulyHandsfree 5.0…we have been working on it in parallel with 4.0, and although it’s still a long ways off, I am confident we will make the same massive improvements in speaker verification with 5.0 that we are doing for speech recognition in 4.0! Once again further advancing the state of the art in embedded speech technologies!
June 11, 2015
Guest post by: Michael Farino
Sensory’s CEO, Todd Mozer, joined Alan Taylor, host of Popular Science Radio, in a fun discussion about artificial intelligence and Sensory’s involvement with the Jibo robot development team, and gave the show’s listeners a look into the past 20 years of speech recognition. Todd and Alan also discussed some of the latest advancements in speech technology, and Todd provided an update on Sensory’s most recent achievements in the field of speech recognition as well as a brief look into what the future holds.
Listen to the full radio show at the link below:
Big Bang Theory, Science, and Robots | FULL EPISODE | Popular Science Radio #269
June 3, 2015
When I started Sensory over 20 years ago, I knew how difficult it would be to sell software to cost sensitive consumer electronic OEMs that would know my cost of goods. A chip based method of packaging up the technology made a lot of sense as a turnkey solution that could maintain a floor price by adding the features of a microcontroller or DSP with the added benefit of providing speech I/O. The idea was “buy Sensory’s micro or DSP and get speech I/O thrown in for free”.
After about 10 years it was becoming clear that Sensory’s value add in the market was really in technology development, and particularly in developing technologies that could run on low cost chips and with smaller footprints, less power, and superior accuracy than other solutions. Our strategy of using trailing IC technologies to get the best price point was becoming useless because we lacked the scale to negotiate the best pricing, and more cutting edge technologies were becoming further out of reach; even getting the supply commitments we needed was difficult in a world of continuing flux between over and under capacity.
So Sensory began porting our speech technologies onto other people’s chips. Last year about 10% of our sales came from our internal ICs! Sensory’s DSP, IP, and platform partners have turned into the most strategic of our partnerships.
Today in the semiconductor industry there is a consolidation occurring that somewhat mirrors Sensory’s thinking over the past 10 years, albeit at a much larger scale. Avago pays $37 billion for Broadcom, Intel pays $16.7B for Altera, and NXP pays $12B for Freescale, and the list goes on, dwarfing acquisitions of earlier time periods.
It used to be that the multibillion-dollar chip companies gobbled up the smaller fabless companies, but now even the multibillion-dollar chip companies are being gobbled up. There are a lot of reasons for this, but economies of scale are probably #1. As chips get smaller and smaller, there are increasing costs for design tools, tape outs, and prototyping, and although the actual variable per-chip cost drops, the fixed costs are skyrocketing, making consolidation and scale more attractive.
That sort of consolidation strategy is very much a hardware centered philosophy. I think the real value will come to these chip giants through in house technology differentiation. It’s that differentiation that will add value to their chips, enabling better margins and/or more sales.
I expect that over time the chip giants will realize what Sensory concluded 10 years ago…that machine learning, algorithmic differentiation, and software skills are where the majority of the value-added equation on “smart” chips needs to come from, and that improving the user experience on devices can be a pot of gold! In fact, we have already seen Intel, Qualcomm and many other chip giants investing in speech recognition, biometrics, and other user experience technologies, so the change is underway!
May 4, 2015
I was at the Mobile Voice Conference last week and was on a keynote panel with Adam Cheyer (Siri, Viv, etc.) and Phil Gray (Interactions) with Bill Meisel moderating. One of Bill’s questions was about the best speech products, and of course there was a lot of banter about Siri, Cortana, and Voice Actions (or Google Now as it’s often referred to). When it was my turn to chime in I spoke about Amazon’s Echo, and heaped lots of praise on it. I had done a bit of testing on it before the conference but I didn’t own one. I decided to buy one from eBay since Amazon didn’t seem to ever get around to selling me one. It arrived yesterday.
Here are some miscellaneous thoughts:
OK, Amazon… here’s my free advice (admittedly self-serving but nevertheless accurate):
May 1, 2015
Winning on Accuracy & Speed… How can a tiny player like Sensory compete in deep learning technology with giants like Microsoft, Google, Facebook, Baidu and others?
There are a number of ways, and let me address them specifically:
These 3 items together have provided Sensory with the highest quality embedded speech engines in the world. It’s worth reiterating why embedded is needed, even if speech recognition can all be done in the cloud: