HEAR ME - Speech Blog

Archive for the ‘trulysecure’ Category

Sensory Makes Inc. 5000 2015 List

August 26, 2015

Guest post: Sensory’s Marketing Team

The editors of Inc. identified Sensory as one of America’s fastest growing companies. The annual ranking of the 5,000 fastest-growing private companies in the United States put Sensory at 3,301 on the list with over 100% growth over three years and 30 new jobs added.

Sensory’s growth is driven by a breadth of software products on the market, including TrulyHandsfree, TrulySecure and TrulyNatural, and its technology can be found in over a billion consumer electronics devices around the world.

Congratulations to the Sensory team for making the Inc. 5000 list this year!

Going Deep Series – Part 2 of 3

April 22, 2015

How do Big Data and Privacy fit into the whole Deep Learning Puzzle?

Privacy and Big Data have become big concerns in the world of Deep Learning, and the three are closely related: a lot of the Big Data used as the data source for Deep Learning is personal information. That’s right, to make vision, speech and other systems better, many companies invade users’ personal information and use the acquired data to train their neural networks. So basically, Deep Learning is neural nets learning from your personal data, stats, and usage information. This is why when you sign a EULA (end user license agreement) you typically give up the rights to your data, whether it’s usage data, voice data, image data, personal demographic info, or other data supplied through the “free” software or service.

Recently, it was brought to consumers’ attention that some TVs and even children’s toys were listening in on consumers, and/or sharing and storing that information to the cloud. A few editors called me to get my input, and I explained that there are a few possible reasons for devices to do this kind of “spying,” none of which is the least bit nefarious. The two most common reasons are: 1) the speech recognition technology being used needs the voice data to train better models, so it gets sent to the cloud to be stored and used for Deep Learning, and/or 2) the speech recognition needs to process the voice data in the cloud because it is unable to do so on the device. (Sensory will change this second point with our upcoming TrulyNatural release!)

The first reason is exactly what I’ve been blogging about when we say Deep Learning. More data is better! The more data that gets collected, the better the Deep Learning can be. The benefits can be applied across all users, and as long as the data is well protected and not released, then it only has beneficial consequences.

Therein lies the challenge: “as long as the data is well protected and not released…” If banks, billion dollar companies and governments can’t protect personal data in the cloud, then who can? And why should people ever assume their data is safe, especially from systems where there is no EULA in place and data is being collected without consent (which happens all the time BTW)?

Having devices listen in on people and share their voice data with the cloud for Deep Learning or speech recognition processing is an invasion of privacy. If we could just keep all of the deep neural net and recognition processing on device, then there would be no need to risk the security of people’s personal data by sharing and storing it in the cloud… and it’s with this philosophy that Sensory pioneered an entirely different, “embedded” approach to deep neural net based speech recognition, which we will soon be bringing to market. Sensory actually uses Deep Learning approaches to train our nets with data collected from EULA-consenting and often paid subjects. We then take the recognizer built from that research and run it on our OEM customers’ devices, and because of that, we never have to collect personal data. So the consumers who buy products from Sensory’s OEM customers can rest assured that Sensory is never putting their personal data at risk!

In my next blog, I’ll address the question of how accurate Sensory can be using deep nets on device without continuing data collection in the cloud. There are actually a lot of advantages to running on device beyond privacy, and they can include not only response time but accuracy as well!

Going Deep Series – Part 1 of 3

April 15, 2015

Deep Neural Nets, Deep Belief Nets, Deep Learning, DeepMind, DeepFace, DeepSpeech, DeepImage… Deep is all the rage! In my next few blogs I will try to address some of the questions and issues surrounding all of these “deep” thoughts including:

  • What is Deep Learning and why has it gotten so popular as of late?
  • Is Sensory just jumping on the bandwagon with Deep Nets for voice and vision?
  • How do Big Data and Privacy fit into the whole Deep Learning arena?
  • How can a tiny player like Sensory compete in this “deep” technology with giants like Microsoft, Google, Facebook, Baidu and others investing so heavily?

Part 1: What is Deep Learning and is Sensory Just Jumping on the Bandwagon?

Artificial Neural Network approaches have been around for a long time, and have gone in and out of favor. Neural Nets are an approach within the field of Machine Learning and today they are all the rage. Sensory has been working with Neural Net technology since our founding more than 20 years ago, so the approach is certainly not new for us. We are not just jumping on the bandwagon… we are one of the leading carts! ;-)

Neural Networks are very loosely modeled after how our brains work – nonlinear, parallel processing, and learning from exposure to data rather than being programmed. Unlike common computer architectures that separate memory from processing, our brains have billions of neurons that communicate and process all in parallel and with huge quantities of connections. This architecture based on how our brains work turns out to be much better than traditional computer programs at dealing with ambiguous and “sensory” information like vision and speech – a little Trivia: that’s how we came up with the name Sensory!

In the early days of Sensory, we were often asked by engineers, “What kind of neural networks are you running?” They were looking for a simple answer, something like a “Kohonen Net.” I once asked my brother, Mike Mozer, a pioneer in the field of neural nets, a Sensory co-founder, and a professor of computer science at U. Colorado Boulder, for a few one-liners to satisfy curious engineers without giving anything away. We had two lines: the first was “a feed forward multi-layer net,” which satisfied 90% of those asking, and the other, for those who asked for more, was “it’s actually a nonlinear and multivariate function.” That quieted pretty much everyone down.
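
For the curious, both one-liners describe the same thing. Here is a minimal Python sketch, not Sensory’s actual network: a feed-forward multi-layer net is just layers of weighted sums passed through a nonlinearity, which makes the whole thing a nonlinear function of several input variables. The `feed_forward` name, the tanh activation, and the toy weights are all assumptions for illustration.

```python
import math


def feed_forward(inputs, layers):
    """Evaluate a feed-forward multi-layer net on a list of input values.

    `layers` is a list of (weights, biases) pairs. `weights` holds one row
    per output unit, and each row has one weight per incoming activation.
    """
    activations = list(inputs)
    for weights, biases in layers:
        # Each unit: weighted sum of the previous layer, then a tanh nonlinearity.
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Hypothetical toy weights: 2 inputs -> 2 hidden units -> 1 output.
toy_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]

single = feed_forward([1.0, 0.0], toy_layers)[0]
doubled = feed_forward([2.0, 0.0], toy_layers)[0]
# Doubling the input does not double the output, so the function is nonlinear.
print(single, doubled)
```

Because the output depends on several inputs at once and does not scale linearly with them, “nonlinear and multivariate function” is a fair, if unhelpful, description.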

In the last five years Neural Networks have proven to be the best-known approaches for various recognition and ambiguous data challenges like vision and speech. The breakthrough and improvement in performance came from these various terms that use the word “deep.” The “deep” approaches entailed more complex architectures that receive more data. The architecture relates to the ways that information is shared and processed (like all those connections in our brain), and the increased data allows the system to adapt and improve through continuous learning, hence the terms, “Deep Learning” and “Deep Learning Net.” Performance has improved dramatically in the past five years and Deep Learning approaches have far exceeded traditional “expert-based” techniques for programming complex feature extraction and analysis.

An Inside look at Sensory’s Strategic Endeavors

January 21, 2015

I know it’s been months since Sensory has blogged, and I thank you for pinging me to ask what’s going on… Well, a lot is going on at Sensory. There are really three areas that we are putting a strategic focus on, and I’ll briefly mention each:

  1. Applications. We have put our first applications into the Google Play store, and it is our goal over the coming year to put increased focus on making applications and in particular making good user experiences through Sensory technologies in these applications.
    Download AppLock or VoiceDial
    These are both free products and more intended as a means to help tune our models and get real user feedback to refine the applications so they delight end users! We will offer the applications with the technology to our mobile, tablet, and PC customers so they can build them directly into their customers’ user experience.
  2. Authentication. Sensory has been a leader in embedded voice authentication for years. Over the past year, though, we have placed increased focus on this area, and we have some EXCELLENT voice authentication technologies that we will be rolling out into our SDKs in the months ahead.
    Of course, we aren’t just investing in voice! We have a vision program in place and our vision focus is also on authentication. We call this fusion of voice and vision TrulySecure™, and we think it offers the best security with the most convenience. Try out AppLock in the above link and I hope you will agree that it’s great.
  3. TrulyNatural™. For many years now, Sensory has been a leader in on-device speech recognition. We have seen our customers going to cloud-based solutions for the more complex and large vocabulary tasks. In the near future this will no longer be necessary! We have built from the ground up an embedded deep neural net implementation with FST, bag of words, robust semantic parsing and all the goodies you might expect from a state-of-the-art large vocabulary speech recognition solution! We recently benchmarked a 500,000 word vocabulary and we are measuring about a 10% word error rate (WER). On smaller 5K vocabulary tasks the WER is down to the 7-8% range. This is as good as or better than today’s published state-of-the-art cloud-based solutions!
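
For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between the recognizer output and a reference transcript, divided by the number of reference words. The sketch below shows that standard calculation in Python. The function name and the sample sentences are made up for illustration, and this is not Sensory’s benchmarking code.

```python
def word_error_rate(reference, hypothesis):
    """Return word-level edit distance divided by the reference word count."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # previous_row[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row.append(min(
                previous_row[j] + 1,                      # deletion
                current_row[j - 1] + 1,                   # insertion
                previous_row[j - 1] + substitution_cost,  # match or substitution
            ))
        previous_row = current_row
    return previous_row[-1] / len(ref_words)


# Hypothetical example: one wrong word out of ten reference words.
reference = "call my brother on his mobile phone right now please"
hypothesis = "call my mother on his mobile phone right now please"
print(word_error_rate(reference, hypothesis))  # 0.1
```

A 10% WER therefore means roughly one word-level error for every ten words spoken.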

Of course, there’s a lot more going on than just this…we recently announced partnerships with Intel and Nok Nok Labs, and we have further lowered power consumption in touchless control and always-on voice systems with the addition of our hardware block for low power sound detection.

Happy 2015!

Spoofing Biometrics

July 25, 2014

I see a bit of irony that a great Saturday Night Live alumnus is launching a campaign to decrease spoofing. I’m talking about Senator Al Franken, who has been looking into the problem of stolen fingerprints, see article.

Senator Franken challenges Samsung and Apple with some fair concerns about the problem of stolen or spoofed biometrics. The issue is that most biometrics that could be stolen can’t be easily replaced. We only have one face, two eyes, and 10 fingers, so not a lot of chances to replace or change them if they are stolen.

The mobile phone companies, challenged on the fingerprint issue, had two responses:

  1. The biometric data is ON DEVICE. This is very important because when it’s stored in the cloud it becomes much more accessible to a hacker AND much more desirable because the payoff is a whole lot of user information. Cloud security is often hacked into, such as the recent break-in of the European Central Bank. In fact many banks I have spoken to insist that passwords can’t be stored in the cloud because they are just too easy to hack that way.
  2. The fingerprint biometric is not stored as a fingerprint image, but as some sort of mathematical representation. I’m not sure I understand this argument because if the digital representation can be copied and replicated, then the system is cracked whether or not it looks like a fingerprint.

I think Franken is right to question the utility of biometric fingerprints, because a product like Sensory’s TrulySecure (combining voice and vision authentication) offers a large number of advantages:

  1. The TrulySecure biometric is not easy to copy or find. Unlike a fingerprint which gets left everywhere, a voice print with a video image of a person saying a particular phrase is NOT easy to find, and even if well recorded, would fall apart with Sensory’s anti-spoofing technology that requires a live image.
  2. The TrulySecure biometric is readily changeable. Unlike the nine chances that a user has to replace a fingerprint, there are a virtually unlimited number of TrulySecure password phrases that can be used. If by some nearly impossible chance a TrulySecure biometric phrase is copied, it can be changed in a matter of seconds and a virtually unlimited number of times.
  3. TrulySecure works across conditions. Every biometric seems to have a failure mode. Fingerprint sensors seem to require a highly directionalized swipe of a very clean finger. If I cut my finger or have a little peanut butter on it, it just doesn’t work. Likewise a voiceprint by itself might fail in high noise, and a faceprint might fail in low lighting, but that magical dual biometric fusion in TrulySecure seems immune to conditions.
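
To illustrate why fusing two biometrics can hold up when one of them degrades, here is a toy Python sketch of quality-weighted score fusion. It is a generic technique chosen for illustration, not a description of how TrulySecure works. The function names, the 0-to-1 score and quality ranges, and the threshold are all hypothetical.

```python
def fuse_scores(voice_score, face_score, voice_quality, face_quality):
    """Blend two match scores (0 to 1), weighting each by its signal quality (0 to 1)."""
    total_quality = voice_quality + face_quality
    if total_quality <= 0:
        raise ValueError("at least one modality must have usable signal quality")
    return (voice_quality * voice_score + face_quality * face_score) / total_quality


def is_accepted(voice_score, face_score, voice_quality, face_quality, threshold=0.6):
    """Accept the user when the fused score clears a (hypothetical) threshold."""
    fused = fuse_scores(voice_score, face_score, voice_quality, face_quality)
    return fused >= threshold


# Noisy lobby: the voice match is poor, but its low quality weight limits the damage.
print(is_accepted(voice_score=0.3, face_score=0.9, voice_quality=0.2, face_quality=0.8))

# Dark room: the face match is poor, and the voice carries the decision instead.
print(is_accepted(voice_score=0.9, face_score=0.3, voice_quality=0.8, face_quality=0.2))
```

In both cases the weaker modality is down-weighted rather than trusted equally, which is the intuition behind a dual biometric surviving noise or low light.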

Here’s a demo I gave to UberGizmo in a somewhat dark and very noisy hotel lobby. I like this demo because it shows a real world situation and how FAST TrulySecure works.

Here’s a more canned demo on Sensory’s home page that better showcases some of the anti-spoofing features.
