HEAR ME -
Speech Blog
HEAR ME - Speech Blog  |  Read more June 11, 2019 - Revisiting Wake Word Accuracy and Privacy
HEAR ME - Speech Blog

Archives

Categories

Posts Tagged ‘TrulyNatural’

Sensory Winning Awards

October 6, 2016

It’s always nice when Sensory wins an award. 2016 has been a special year for Sensory because we won more awards than any other year in our 23 year history!!

Check it out:

Sensory Earns Multiple Coveted Awards in 2016
Pioneering embedded speech and machine vision tech company receiving industry accolades

Sensory Inc., a Silicon Valley company that pioneered the hands-free voice wakeup word approach, today, announced it has won over half a dozen awards in 2016 across its product-line, including awards for products, technologies, and people, covering deep learning, biometric authentication and voice recognition.

The awards presented to Sensory include the following:
AIconics are the world’s only independently judged awards celebrating the drive, innovation and hard work in the international artificial intelligence community. Sensory was initially a finalist along with six other companies in the category of Best Innovation in Deep Learning, and judges determined Sensory to be the overall WINNER at an awards ceremony held in September 2016. The judging panel was comprised of 12 independent professionals spanning leaders in artificial intelligence R&D, academia, investments, journalists and analysts.

CTIA Super Mobility 2016™, the largest wireless event in America, announced more than 70 finalists for its 10th annual CTIA Emerging Technology (E-Tech) Awards. Sensory was nominated in the category of Mobile Security and Privacy for its TrulySecure™ technology, along with Nokia, Samsung, SAP, and others. Sensory was presented with the First Place award for the category in a ceremony on September 2016 at the CTIA Las Vegas event.

Speech Technology magazine, the leading provider of speech technology news and analysis, had its 10th annual Speech Industry Awards to recognize the creativity and notable achievements of key influencers (Luminaries), major innovators (Star Performers), and impressive deployments (Implementation Awards). The editors of Speech Technology magazine selected 2016 award winners based on their industry contributions during the past 12 months. Sensory’s CEO, Todd Mozer, was awarded with a Luminary Award, making it his second time winning the prestigious award. Sensory as a company was awarded the Star Performer award along with IBM, Amazon and others.

Two well-known industry analyst firms issued reports highlighting Sensory’s industry contributions for its TrulyHandsfree product and customer leadership, offering awards for innovations, customer deployment, and strategic leadership.

“Sensory has an incredibly talented team of speech recognition and biometrics experts dedicated to advancing the state-of-the-art of each respective field. We are pleased that our TrulyHandsfree, TrulySecure and TrulyNatural product lines are being recognized in so many categories, across the various industries in which we do business,” said Todd Mozer, CEO of Sensory. “I am also thrilled that Sensory’s research and innovations in the deep learning space has been noticed, generating our company prestigious accolades and management recognition.”

For more information about this announcement, Sensory or its technologies, please contact sales@sensory.com; Press inquiries: press@sensory.com

Sensory Earns Two Coveted 2016 Speech Tech Magazine Awards

August 22, 2016

Sensory is proud to announce that it has been awarded with two 2016 Speech Tech Magazine Awards. With some stiff competition in the speech industry, Sensory continues to excel in offering the industry’s most advanced embedded speech recognition and speech-based security solutions for today’s voice-enabled consumer electronics movement.

The 2016 Speech Technology Awards include:

sla2016

Speech Luminary Award – Awarded to Sensory’s CEO, Todd Mozer

“What really impresses me about Todd is his long commitment to speech technology, and specifically, his focus on embedded and small-footprint speech recognition,” says Deborah Dahl, principal at Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interactions Working Group. “He focuses on what he does best and excels at that.”

spa2016

Star Performers Award – Awarded to Sensory for its contributions in enabling voice-enabled IoT products via embedded technologies

“Sensory has always been in the forefront of embedded speech recognition, with its TrulyHandsfree product, a fast, accurate, and small-footprint speech recognition system. Its newer product, TrulyNatural, is ground- breaking because it supports large vocabulary speech recognition and natural language understanding on embedded devices, removing the dependence on the cloud,” said Deborah Dahl, principal at Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interactions Working Group. “While cloud-based recognition is the right solution for many applications, if the application must work regardless of connectivity, embedded technology is required. The availability of TrulyNatural embedded natural language understanding should make many new types of applications possible.”

– Guest Blog by Michael Farino

 

IoT Roadshow with Open Systems Media

May 6, 2016

Rich Nass and Barbara Quinlan from Open Systems Media visited Sensory on their “IoT Roadshow”.

IoT is a very interesting area. About 10 years ago we saw voice controlled IoT on the way, and we started calling the market SCIDs – Speech Controlled Internet Devices. I like IoT better, it’s certainly a more popular name for the segment! ;-)

I started our meeting off by talking about Sensory’s three products – TrulyHandsfree Voice Control, TrulySecure Authentication, and TrulyNatural large vocabulary embedded speech recognition.

Although TrulyHandsfree is best known for its “always on” capabilities, ideal for listening for key phrases (like OK Google, Hey Cortana, and Alexa), it can be used a ton of other ways. One of them is for hands-free photo taking, so no selfie stick is required. To demonstrate, I put my camera on the table and took pictures of Barbara and Rich.  (Normally I might have joined the pictures, but their healthy hair, naturally good looks, and formal attire was too outclassing for my participation).

 

IoT pic 1IoT pic 2

 

 

 

 

 

 

 

 

There’s a lot of hype about IoT and Wearables and I’m a big believer in both. That said, I think Amazon’s Echo is the perfect example of a revolutionary product that showcases the use of speech recognition in the IoT space and am looking forward to some innovative uses of speech in Wearables!

Here’s the article they wrote on their visit to Sensory and an impromptu video showing TrulyNatural performing on-device navigation, as well as a demo of TrulySecure via our AppLock Face/Voice Recognition app.

IoT Roadshow, Santa Clara – Sensory: Look ma, no hands!

Rich Nass, Embedded Computing Brand Director

If you’re an IoT device that requires hands-free operation, check out Sensory, just like I did while I was OpenSystems Media’s IoT Roadshow. Sensory’s technology worked flawlessly running through the demo, as you can see in the video. We ran through two different products, one for input and one for security.

Sensory’s CEO, Todd Mozer, interviewed on FutureTalk

October 1, 2015

Todd Mozer’s interview with Martin Wasserman on FutureTalk

Sensory Wins Coveted 2015 Speech Technology Magazine’s Industry Star Performer Award for TrulyNatural

August 11, 2015

Guest post by: Sensory’s Marketing Department:

SpeechTeCoverFor the second year in a row, Sensory earns Speech Technology Magazine’s Industry Star Performer Award! Having won the award in 2014 for TrulySecure Speaker Verification and for TrulyHandsfree 3.0, Speech Technology Magazine awarded Sensory the 2015 Speech Industry Star Performer Award for its recently released TrulyNatural technology.

TrulyNatural is a major leap forward for client-based speech recognition and is the first embedded large-vocabulary deep neural nets speech recognition platform capable of supporting natural language. TrulyNatural is a scalable solution that can be implemented on highly constricted devices, supporting hundreds of phrases, with a footprint of under a megabyte, or as a natural language engine on devices with more available memory, like mobile devices, cars, and more.

For more information about TrulyNatural, please visit the technology page.

See official article announcing the award at: http://www.speechtechmag.com

STM15AWARD_starperformbig

Sensory Talks AI and Speech Recognition With Popular Science Radio Host Alan Taylor

June 11, 2015

Guest post by: Michael Farino

Pop Science Radio

 

 

 

 

 

 

 

Sensory’s CEO, Todd Mozer joined Alan Taylor, host of Popular Science Radio, in a fun discussion about artificial intelligence, Sensory’s involvement with the Jibo robot development team, and also gave the show’s listeners a look into the past 20 years of speech recognition. Todd and Alan additionally discussed some of the latest advancements in speech technology, and Todd provided an update on Sensory’s most recent achievements in the field of speech recognition as well as a brief look into what the future holds.

Listen to the full radio show at the link below:

Big Bang Theory, Science, and Robots | FULL EPISODE | Popular Science Radio #269
Ever wondered how accurate the science of the Big Bang Theory TV series is? Curious about how well speech recognition technology and robots are advancing? We interview two great minds to probe for these answers

OK, Amazon!

May 4, 2015

I was at the Mobile Voice Conference last week and was on a keynote panel with Adam Cheyer (Siri, Viv, etc.) and Phil Gray (Interactions) with Bill Meisel moderating. One of Bills questions was about the best speech products, and of course there was a lot of banter about Siri, Cortana, and Voice Actions (or GoogleNow as it’s often referred to). When it was my turn to chime in I spoke about Amazon’s Echo, and heaped lots of praise on it. I had done a bit of testing on it before the conference but I didn’t own one. I decided to buy one from Ebay since Amazon didn’t seem to ever get around to selling me one. It arrived yesterday.

Here are some miscellaneous thoughts:

  • Echo is a fantastic product! Not so much because of what it is today but for the platform it’s creating for tomorrow. I see it as every bit as revolutionary as Siri.
  • The naming is really confusing. You call it Alexa but the product is Echo. I suspect this isn’t the blunder that Google made (VoiceActions, GoogleNow, GoogleVoice, etc.), but more an indication that they are thinking of Echo as the product and Alexa as the personality, and that new products will ship with the same personality over time. This makes sense!
  • Setup was really nice and easy, the music content integration/access is awesome, the music quality could be a bit better but is useable; there’s lots of other stuff that normal reviewers will talk about…But I’m not a “normal” reviewer because I have been working with speech recognition consumer electronics for over 20 years, and my kids have grown up using voice products, so I’ll focus on speech…
  • My 11 year old son, Sam, is pretty used to me bringing home voice products, and is often enthusiastic (he insisted on taking my Vocca voice controlled light to keep in his room earlier this year). Sam watched me unpack it and immediately got the hang of it and used it to get stats on sports figures and play songs he likes. Sam wants one for his birthday! Amazon must have included some kids voice modeling in their data because it worked pretty well with his voice (unlike the Xbox when it first shipped, which I found particularly ironic since Xbox was targeting kids).
  • The Alexa trigger works VERY well. They have implemented beamforming and echo cancellation in a very state of the art implementation. The biggest issue is that it’s a very bandwidth intensive approach and is not low power. Green is in! That could be why its plug-in/AC only and not battery powered. Noise near the speaker definitely hurts performance as does distance, but it absolutely represents a new dimension in voice usability from a distance and unlike with the Xbox, you can move anywhere around it, and aren’t forced to be in a stationary position (thanks to their 7 mics, which surely must be overkill!)
  • The voice recognition in generally is good, but like all of the better engines today (Google, Siri, Cortana, and even Sensory’s TrulyNatural) it needs to get better. We did have a number of problems where Alexa got confused. Also, Alexa doesn’t appear to have memory of past events, which I expect will improve with upgrades. I tried playing the band Cake (a short word, making it more difficult) and it took about 4 attempts until it said “Would you like me to play Cake?” Then I made the mistake of trying “uh-huh” instead of “yes” and I had to start all over again!
  • My FAVORITE thing about the recognizer is that it does ignore things very nicely. It’s very hard to know when to respond and when not to. The Voice Assistants (Google, Siri, Cortana) seem to always defer to web searches and say things like “It’s too noisy” no matter what I do, and I thought Echo was good at deciding not to respond sometimes.

OK, Amazon… here’s my free advice (admittedly self-serving but nevertheless accurate):

  • You need to know who is talking and build models of their voices and remember who they are and what their preferences are. Sensory has the BEST embedded speaker identification/verification engine in the world, and it’s embedded so you don’t need to send a bunch of personal data into the cloud. Check out TrulySecure!
  • In fact, if you added a camera to Alexa, it too could be used for many vision features, including face authentication.
  • Make it battery powered and portable! To do this, you’d need an equally good embedded trigger technology that runs at low power – Check out TrulyHandsfree!
  • If it’s going to be portable, then it needs to work if even when not connected to the Internet. For this, you’d need an amazing large vocabulary embedded speech engine. Did I tell you about TrulyNatural?
  • Of course, the hope is that the product-line will quickly expand and as a result, you will then add various sensors, microphones, cameras, wheels, etc.; and at the same time, you will also want to develop lower cost versions that don’t have all the mics and expensive processing. You are first to market and that’s a big edge. A lot of companies are trying to follow you. You need to expand the product-line quickly, learning from Alexa. Too many big companies have NIH syndrome… don’t be like them! Look for partnering opportunities with 3rd parties who can help your products succeed – Like Sensory! ;-)

Mobile World Congress Day 1

March 3, 2015

It feels like I had a whole week’s worth of the trade show wrapped into one day! By the time mid week hits, I’ll surely be ready to head home! Here are some of the highlights from the first day of Mobile World Congress 2015:

  • First a word about Catalonia. That’s where Barcelona is…in the heart of Catalonia, a province of Spain. Don’t expect delayed meetings, inefficiencies, relaxed long lunches or anything like that. The Catalonians have the precision of Germans (to continue my gross stereotyping!), and my experience with one of the largest trade shows on the planet is that it’s going off without a hitch! I picked up my badge at the airport in a five-minute line that was well staffed and moved rapidly. I could just about walk into the show yesterday morning. The subways and trains though crowded and overheated ran extremely smoothly. Kudos to the show management for pulling off such a difficult feat!
  • I’d be remiss without mentioning the Galaxy S6. Samsung invited us to the launch and of course they continue to use Sensory in a relationship that has grown quite strong over the years.  Samsung continues to innovate with the Edge, and other products that everyone is talking about. It’s amazing how far Apple took the mantle in the first iPhone and how companies like Samsung and the Android system seem to now be leading the charge on innovation!
  • My favorite product that doesn’t feature Sensory technology that I bumped into was an electronic jump rope. They put sensors in the handles and a visual display shows across the field of the rope, kind of like those clocks that rapidly flash LED’s as the pendulum quickly moves back and forth in order to display the time. I talked with Alex Woo from Tangram and he said they were going to launch a crowdfunding campaign. I gave Alex a demo of our TrulyHandsfree with jump ropers jumping and all the show noise and of course it worked flawlessly. It would be really cool to be able to ask things like “How much time,” “How many jumps,” “What’s my heart rate,” or “How many calories burned” and so on, and the display would make voice control so much more functional!
  • We had a couple of partnership announcements here at the show, supporting both Qualcomm and Synopsys – both great partners to add to our support mix, and always nice when its customers driving our platform directions. The Qualcomm platform is interesting because it’s not their standard platform for 3rd parties to support. As far as I know they opened it up to Sensory and ONLY Sensory, and already we are seeing much interest!
  • Last night ZTE had a press party to indoctrinate Sensory and NXP into its Smart Voice Alliance. ZTE is really putting some forward thinking into the user experience and their research shows how much people want a voice interface but how dissatisfying the current state of the art actually is. Sensory’s hoping to change that! We’ll make one of our biggest announcements in history over the next month… and I’ll let you in on the secret (it’s on our website already!) We call it TrulyNatural, and it will be the highest accuracy large vocabulary embedded speech engine that the world has ever seen!

Hasta Luego!!!

Deep Listening in the Cloud

February 11, 2015

The advent of “always on” speech processing has raised concerns about organizations spying on us from the cloud.

4081596290_5ccb708d7d_mIn this Money/CNN article, Samsung is quoted as saying, “Samsung does not retain voice data or sell it to third parties.” But, does this also mean that your voice data isn’t being saved at all? Not necessarily. In a separate article, the speech recognition system in Samsung’s TVs is shown to be an always-learning cloud-based system solution from Nuance. I would guess that there is voice data being saved, and that Nuance is doing it.

This doesn’t mean Nuance is doing anything evil; this is just the way that machine learning works. There has been this big movement towards “deep” learning, and what “deep” really means is more sophisticated learning algorithms that require more data to work. In the case of speech recognition, the data needed is speech data, or speech features data that can be used to train and adapt the deep nets.

But just because there is a necessary use for capturing voice data and invading privacy, doesn’t mean that companies should do it. This isn’t just a cloud-based voice recognition software issue; it’s an issue with everyone doing cloud based deep learning. We all know that Google’s goal in life is to collect data on everything so Google can better assist you in spending money on the right things. We in fact sign away our privacy to get these free services!

I admit guilt too. When Sensory first achieved usable results for always-on voice triggers, the basis of our TrulyHandsfree technology, I applied for a patent on a “background recognition system” that listens to what you are talking about in private and puts together different things spoken at different times to figure out what you want…. without you directly asking for it.

Can speech recognition be done without having to send all this private data to the cloud? Sure it can! There’s two parts in today’s recognition systems: 1) The wake up phrase; 2) The cloud based deep net recognizer – AND NOW THEY CAN BOTH BE DONE ON DEVICE!

Sensory pioneered the low-power wake up phrase on device (item 1), now we have a big team working on making an EMBEDDED deep learning speech recognition system so that no personal data needs to be sent to the cloud. We call this approach TrulyNatural, and it’s going to hit the market very soon! We have benchmarked TrulyNatural against state-of-the-art cloud-based deep learning systems and have matched and in some cases bested the performance!

An Inside look at Sensory’s Strategic Endeavors

January 21, 2015

I know it’s been months since Sensory has blogged and I thank you for pinging me to ask what’s going on…Well, lot’s going on at Sensory. There are  really 3 areas that we are putting a strategic focus on, and I’ll briefly mention each:

  1. Applications. We have put our first applications into the Google Play store, and it is our goal over the coming year to put increased focus on making applications and in particular making good user experiences through Sensory technologies in these applications.
    Download AppLock or VoiceDial
    These are both free products and more intended as a means to help tune our models and get real user feedback to refine the applications so they delight end users! We will offer the applications with the technology to our mobile, tablet, and PC customers so they can build them directly into their customers’ user experience.
  2. Authentication. Sensory has been a leader in embedded voice authentication for years. Over the past year, though, we have placed increase focus in this area, and we have some EXCELLENT voice authentication technologies that we will be rolling out into our SDK’s in the months ahead.
    Of course, we aren’t just investing in voice! We have a vision program in place and our vision focus is also on authentication. We call this fusion of voice and vision TrulySecure™, and we think it offers the best security with the most convenience. Try out AppLock in the above link and I hope you will agree that it’s great.
  3. TrulyNatural™. For many years now, Sensory has been a leader in on device speech recognition. We have seen our customers going to cloud-based solutions for the more complex and large vocabulary tasks. In the near future this will no longer be necessary! We have built from the ground up an embedded deep neural net implementation with FST, bag of words, robust semantic parsing and all the goodies you might expect from a state of the art large vocabulary speech recognition solution! We recently benchmarked a 500,000 word vocabulary and we are measuring about a 10% word error rate (WER). On smaller 5K vocabulary tasks the WER is down to the 7-8% range. This is as good as or better than today’s published state-of-the-art cloud based solutions!

Of course, there’s a lot more going on than just this…we recently announced partnerships with Intel and Nok Nok Labs, and we have further lowered power consumption in touchless control and always-on voice systems with the addition of our hardware block for low power sound detection.

Happy 2015!