HEAR ME - Speech Blog

Archive for the ‘Uncategorized’ Category

Identifying Sounds as Accurately as Wake Words

August 21, 2019

At a recent meeting, Sensory was credited with “inventing the wake word”. I explained that Sensory certainly helped to evangelize and popularize it, but we didn’t “invent” it. What we really did was substantially improve upon the state of the art so that it became usable. And it was a VERY hard challenge, since we did it in an era before deep learning allowed us to further improve the performance.

Today Sensory is taking on the challenge of sound and scene identification. There are dozens of companies working on this challenge…and it’s another HUGE challenge. There are some similarities with wake words and dealing with speech, but a lot of differences too! I’m writing this to provide an update on our progress, to share some of our techniques, to compare a bit with wake words and speech, and to bring clearer metrics to the table for looking at accuracy!

Sensory announced our initial SoundID solution at CES 2019.

Since then we have been working on accuracy improvements and adding gunshot identification to the mix of sounds we identify (CO2 and smoke alarms, glass break, baby cry, snoring, door knock/bell, scream/yell, etc.).

  1.  General Approach. Sensory is using its TrulySecure Speaker Verification platform for Sound ID. This approach, which uses proprietary statistical and shallow-learning techniques, runs small models on device. It also uses a wider-bandwidth filtering approach, since it is intended to differentiate speech and sounds rather than simply recognize words.
    1. A 2nd-stage approach can be applied to improve accuracy. This second stage uses a deep neural network and can run on device or in the cloud. It is more MIPS- and memory-intensive, but by using the first stage, power consumption is easily managed; the first stage can be more accepting while the 2nd stage eliminates false alarms.
      1. The second stage (deep neural network) eliminates 95% of false alarms from the first stage, while passing 97% of the real events.
      2. This enables tuning to the desired operating point (1 FA/day, 0.5 FAs/day, etc.).
      3. The FR rate stays extremely low (despite the FA reduction) thanks to a very accurate deep neural network and a “loose” first stage that is less discriminative.
    2. The second-stage classifier (deep neural network) is trained on many target sound examples. To separate target events from similar-sounding non-target events, we apply proprietary algorithmic and model-building approaches to remove false alarms.
    3. The combined model (1st and 2nd stage) is smaller than 5 MB.
    4.  Does a 3rd stage make sense? Sensory uses its TrulyHandsfree (THF) technology to perform keyword spotting for its wake words, and often transfers to TrulySecure for higher-performance speaker verification. This allows wake words to be listened for at the lowest possible power consumption. Sensory is now exploring using THF as an initial stage for Sound ID, enabling a 3-stage approach with the best accuracy and the best power consumption. This way, average power consumption can be less than 2 milliamps.
  2. Testing Results. Here are a few important findings that affect our test results:
    1. The difference between a quiet and a noisy environment is quite pronounced. It’s easy to perform well in quiet and very difficult to perform great in noise, and it’s a different challenge than we faced with speech recognition, since the sounds we are looking to identify can cover a much wider range of frequencies that more closely match background noises. There’s a very good reason that when Alexa listens for glass-break sounds, she does it in an “away” mode…that is, when the home is quiet!! (Kudos to Amazon for the clever approach!) The results we report all use noise-based testing. Spoiler alert…Sensory kicks ass! In our Alexa test, simple drum beats and music caused glass breaks to be detected. Sensory’s goal is avoiding this!
    2. Recorded sound effects are quite different than they sound live. The playback medium (mobile phone vs. PC vs. high-end speaker) can have a very big impact on the frequency spectrum and the ability to identify a sound. Once again, this is quite different than human speech, which falls into a relatively narrow frequency band and isn’t as affected by the playback mechanism. For testing, Sensory is using only high-quality sound playback.
    3. Some sounds are repeated; others aren’t. This can have a huge effect on false rejects, where the sound isn’t properly identified: a repeat can be a “free” second chance to get it right. But this varies from sound to sound. For example, a glass break probably happens just once and it is absolutely critical to catch it, whereas a dog bark or baby cry that happens once and doesn’t repeat may be unimportant and OK to ignore. We will show the effect of repeated sounds in our accuracy tests.
    4. Our 2-stage approach works great. All the results shown reflect the two stages working together.
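For readers who like to see the shape of such a pipeline, here is a minimal sketch of a two-stage cascade in Python. The scoring functions, thresholds, and function names are purely illustrative stand-ins, not Sensory's actual models or API; the rate arithmetic simply applies the stage-2 figures quoted above under an independence assumption.

```python
# Illustrative two-stage sound-ID cascade: a cheap, permissive first stage
# proposes candidate events; a costlier second stage vets them.
# stage1_score/stage2_score are stand-in callables, not Sensory's API.

def cascade_detect(frames, stage1_score, stage2_score, t1=0.3, t2=0.9):
    """Yield indices of frames accepted by both stages.

    t1 is low (permissive) so real events are rarely missed;
    t2 is high so stage 2 removes most stage-1 false alarms."""
    for i, frame in enumerate(frames):
        if stage1_score(frame) >= t1:       # cheap check, always running
            if stage2_score(frame) >= t2:   # expensive check, runs rarely
                yield i

def cascade_rates(stage1_recall, stage1_fa_per_day,
                  stage2_pass=0.97, stage2_fa_removal=0.95):
    """End-to-end detection rate and FA/day, assuming the two stages'
    errors are independent (an idealization)."""
    recall = stage1_recall * stage2_pass
    fa_per_day = stage1_fa_per_day * (1.0 - stage2_fa_removal)
    return recall, fa_per_day
```

Under these assumptions, a loose first stage with 99% recall and 20 false alarms per day ends up around 96% recall and 1 FA/day after the second stage, which is consistent with the operating points discussed above.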


  • This is at 1 FA in 24 hours on a balanced mix of noise data. We tend to work on the sounds until we exceed 90% accuracy at 1 FA/day, so it’s no surprise that they hover in the same percentage region…some of these took more work than others. ;-)


  • Once again, at 1 FA in 24 hours on a balanced mix of data. You can see how detection accuracy drops as noise levels grow. Of course, we could trade off FA and FR so that performance doesn’t drop as rapidly, and as the chart below shows, we can also improve performance by requiring multiple events.

  • Assuming 1 FA in 24 hours on a balanced mix of data. The general effect of multiple instances holds true across sound-ID categories, so for things like repeated dog barks or baby cries the solution can be very accurate. As a dog owner, I really wouldn’t want to be notified if my dog barked once or twice in a minute, but if it barked 10 times within a minute, that might be indicative of an issue I want to know about. Devices with Sensory technology allow parametric control of the number of instances required to trigger a notification.
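That parametric “notify after N instances in a window” behavior can be sketched with a simple sliding-window counter. This is an illustrative sketch of the idea, not Sensory's implementation; the class name and parameters are made up.

```python
from collections import deque

class RepeatedEventNotifier:
    """Notify only when min_events detections of the same sound land
    within a window_s-second window (illustrative sketch)."""

    def __init__(self, min_events=10, window_s=60.0):
        self.min_events = min_events
        self.window_s = window_s
        self.times = deque()  # timestamps of recent detections

    def on_detection(self, t):
        """Record a detection at time t (seconds); return True to notify."""
        self.times.append(t)
        # Drop detections that have aged out of the window.
        while self.times and t - self.times[0] > self.window_s:
            self.times.popleft()
        return len(self.times) >= self.min_events
```

With `min_events=10` and `window_s=60`, one or two barks are ignored while a ten-bark minute raises an alert, matching the dog-owner example above.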


Sensory is very proud of our progress in sound identification. We welcome and encourage others to share their accuracy reporting…I couldn’t find much online to determine “state of the art”.

Now we will begin work on scene analysis…and I expect Sensory to lead in this development as well!

Revisiting Wake Word Accuracy and Privacy

June 11, 2019

I used to blog a lot about wake words and voice triggers. Sensory pioneered this technology for voice assistants, and we evangelized the importance of not hitting buttons to speak to a voice recognizer. Then everybody caught on and the technology went into mainstream use (think Alexa, OK Google, Hey Siri, etc.), and I stopped blogging about it. But I want to reopen the conversation…partly to talk about how important a GREAT wake word is to the consumer experience, and partly to congratulate my team on a recent comparison test that shows how Sensory continues to have the most accurate embedded wake word solutions.

Competitive Test Results. The comparison test was done by Vocalize.ai. Vocalize is an independent test house for voice enabled products. For a while, Sensory would contract out to them for independent testing of our latest technology updates. We have always tested in-house but found that our in-house simulations didn’t always sync up with our customers’ experience. Working with Vocalize allowed us to move from our in-house simulations to more real-world product testing. We liked Vocalize so much that we acquired them. So, now we “contract in” to them but keep their data and testing methodology and reporting uninfluenced by Sensory.

Vocalize compared two Sensory TrulyHandsfree wake word models (1MB size and 250KB size) with two external wake words (Amazon’s and Kitt.ai’s Snowboy), all using “Alexa” as the trigger. The results are replicable and show that Sensory’s TrulyHandsfree remains the superior solution on the market. TrulyHandsfree was better/lower on BOTH false accepts AND false rejects. And in many cases our technology was better by a longshot! If you would like to see the full report and more details on the evaluation methods, please send an email request to either Vocalize (dev@vocalize.ai) or Sensory (sales@sensory.com).


It’s Not Easy. There are over 20 companies today that offer on-device wake words. Probably half of these have no experience in a commercially shipping product and never will; there are a lot of companies that just won’t be taken seriously. The other half can talk a good talk, and in the right environment they can even give a working demo. But this technology is complex, and it is really easy to do badly and really hard to do great. Some demos are carefully planned, with the right noise in the right environment and the right person talking. Sensory has been focused on low-power embedded speech for 25 years; we have 65 of the brightest minds working on the toughest challenges in embedded AI. There’s a reason that companies like Amazon, Google, Microsoft and Samsung have turned to Sensory for our TrulyHandsfree technology. Our stuff works, and they understand how difficult it is to make this kind of technology work on-device! We are happy to provide APKs so you can do your own testing and judge for yourself! OK, enough of the sales pitch…some interesting stuff lies ahead…

It’s Really Important. Getting a wake word to work well is more important than most people realize. It’s like the front door to your house: it might be a small part of your house, but if it isn’t letting the homeowners in, that’s horrible, and if it’s letting strangers in by accident, that’s even worse. The name a company gives its wake word is usually the company brand name; imagine the sentiment created when I say a brand name and it doesn’t work. Recently I was at a tradeshow that had a Mercedes booth. There were big signs that said “Hey Mercedes”…I walked up to the demo area and said “Hey Mercedes,” but nothing happened…the woman working there informed me that they couldn’t demo it on the show floor because it was really too noisy. I quickly pulled out my mobile phone and showed her that I could use dozens of wake words and command sets without an error in that same environment. Mercedes has spent over 100 years building up one of the best quality brand reputations in the car industry. I wonder what will happen to that reputation if their wake word doesn’t respond in noise? Even worse is when devices accidentally go off. If you have family members that listen to music above volume 7, then you already know the shock that a false alarm causes!

It’s about Privacy. Amazon, like Google and a few others, seems to have a pretty good wake word, but if you go into your Alexa settings you can see all of the voice data that’s been collected, and a lot of it was collected when you weren’t intentionally talking to Alexa! You can see this performance issue in the Vocalize test report. Sensory substantially outperformed Amazon on false rejects. This is when a person tries to speak to Alexa and she doesn’t respond. The difference is most apparent in babble noise, where Sensory falsely rejected 3% and Amazon falsely rejected 10% on comparably sized models (250KB). However, the False Accept difference is nothing short of AMAZING. Amazon false accepted 13 times in 24 hours of random noise. In this same time period, Sensory false accepted ZERO times (on comparably sized 250KB models). How is this possible, you may be wondering? Amazon “fixes” its mistakes in the cloud. Even though the device falsely accepts quite frequently, their (larger and more sophisticated) models in the cloud collect the error. Was that a Freudian slip? They correct the error…AND they COLLECT the error. In effect, they are disregarding privacy to save device cost and collect more data.

As the voice revolution continues to grow, you can bet that privacy will continue to be a hot topic. What you now understand is that wake word quality has a direct impact on both the user experience and PRIVACY! While most developers and product engineers in the CE industry are aware of wake words and the difficulty of making them work well on-device, they don’t often consider that competing wake word technologies aren’t created equal – the test results from Vocalize prove it! Sensory is more accurate AND allows more privacy!

Sensory Releases Embedded AI That Allows Voice Assistants and IoT Devices to Identify a Wide Variety of Home Sounds

January 7, 2019

TrulySecure Sound ID identifies sounds within the home and can send messages to homeowners

Santa Clara, Calif., January 7, 2019 – Sensory, a Silicon Valley-based company focused on improving the user experience and security of consumer electronics through state-of-the-art embedded AI technologies, today announces TrulySecure™ Sound ID, a major breakthrough for cloud-free always listening AI that gives devices the ability to identify a variety of common and critical sounds within the home, and intelligently interpret if action needs to be taken, without the security risk of sending audio recordings to the cloud for processing.

Backed by Sensory’s 25 years of experience developing practical applications for neural networks and deep learning, TrulySecure Sound ID is capable of recognizing a variety of environmental sounds, including glass breaking, babies crying, dogs barking, home security alarms, smoke/CO alarms and low battery warnings, doorbells, knocking, snoring and more. Consumer devices could instantly warn users when these sounds occur and send the owner sound clips to better understand the situation.

“With so many voice-controlled products entering our homes, we saw an excellent opportunity to enable the microphones in these devices to do more than just listen for wake words and recognize speech,” said Todd Mozer, CEO of Sensory. “With TrulySecure Sound ID, we’re making it possible for companies to create smart always-listening products that can alert us of things happening within our homes that we may not hear, and safeguard our home and family from potentially dangerous situations.”

Sharing the same distinguished qualities that make Sensory’s ‘AI at the Edge’ technologies industry-leading, TrulySecure Sound ID delivers both performance and privacy by performing all analysis on device. By training each specific sound profile on thousands of real-world samples, similar to how Sensory trains biometric recognition, TrulySecure Sound ID takes several discriminant factors into account when processing sounds. Sound ID learns each sound type using an approach that combines deep learning model training with proprietary discriminant learning techniques; the two approaches combined produce a superior solution to either approach in isolation.

Sound ID is available today as a component of Sensory’s TrulySecure Speaker Verification (TSSV) suite of AI solutions for smart speakers and IoT devices. Ideal for enabling smart assistants, TSSV combines:

  • Text dependent and text independent voice biometric recognition for user ID and authentication
  • Always listening wake word recognition
  • Seamless voice enrollment
  • Sound ID

All of the technologies in TSSV pose no privacy or security concerns because they run completely on device and never touch the cloud. TSSV supports all major operating systems and is hardware agnostic, offering nearly limitless implementation flexibility. Additionally, Sensory can customize TSSV to match the exact needs of its customers, enabling only the sound profiles required for specific use cases.

For more information about this announcement, Sensory or its technologies, please contact
sales@sensory.com; Press inquiries: press@sensory.com.

About Sensory
Sensory Inc. creates a safer and superior UX through vision and voice technologies. Sensory’s technologies are widely deployed in consumer electronics applications including mobile phones, automotive, wearables, toys, IoT and various home electronics. Sensory’s product line includes TrulyHandsfree voice control, TrulySecure biometric authentication, and TrulyNatural large vocabulary natural language embedded speech recognition. Sensory’s technologies have shipped in over a billion units of leading consumer products. 

TrulySecure is a trademark of Sensory Inc.

Sensory Brings Low-Power Wake Words to Mobile Apps

April 3, 2018

Santa Clara, Calif., April 3, 2018 – Sensory’s TrulyHandsfree speech recognition has been re-engineered to run ultra-low-power on Android and iOS mobile applications without special hardware

Sensory, a Silicon Valley-based company focused on improving the user experience and security of consumer electronics through state-of-the-art embedded AI technologies, today announced that it has made a significant breakthrough in running its TrulyHandsfree™ wake word and speech recognition AI engine directly on Android and iOS smartphone applications at low power. As a software component, TrulyHandsfree can be adapted to any app without requiring special-purpose hardware or DSPs to achieve efficient computation.

Introduced in 2009, TrulyHandsfree paved the way for the hands-free operation we have come to expect from today’s always-listening personal assistant solutions. When released, it revolutionized voice user interfaces by offering the first commercially successful always-listening low-power wake word. With each succeeding generation, TrulyHandsfree has continually raised the benchmark for always-listening speech recognition performance by increasing accuracy, lowering power consumption, and running on an increasing number of hardware platforms at ultra-low power.

TrulyHandsfree has seen large commercial success by running on special purpose hardware for low-power operation. Companies like Avnera, Cirrus Logic, Conexant/Synaptics, CSR/Qualcomm, DSP Group, Knowles, QuickLogic, Realtek, XMOS and many others have penetrated the market for voice assistants using Sensory TrulyHandsfree technology. This specialized hardware approach has worked well for Sensory’s customers like Samsung, Huawei, LG, Motorola and other Android mobile providers who design their own phones and wearables with their choice of hardware.

Until now, always-listening wake word solutions for apps required too much power to be practical, especially for apps that remain open and active in the background. Additionally, having to maintain the same user experience across operating systems, and across all different devices added an extra layer of complexity. However, this isn’t the case anymore. TrulyHandsfree streamlines the implementation and coding process, allowing developers to quickly and easily deploy apps with power-efficient always-listening wake word and command set capabilities across all popular mobile and PC operating systems.

In 2017, Sensory began investigating Qualcomm and ARM platforms as more standard cross-platform solutions, to figure out how to lower power consumption for wake words used across mobile platforms. Sensory came up with a series of independent actions that, when combined, could lower the power consumption of a mobile app using a wake word by more than 80%, a reduction of approximately 200mAh in a 12-hour day. That enables a mobile app wake word to consume approximately one percent of the smartphone battery in 12 hours. To achieve this outstanding reduction in power consumption, Sensory utilized an approach known as “little-big,” which uses a very small model to identify an interesting event and then revalidates the event with a large model (both run on the application processor). This method provides the user experience of the big model only when needed, while maintaining the power consumption of the little model most of the time. Frame-stacking approaches further cut the MIPS of certain wake word model processing functions in half with negligible accuracy impact. Additionally, multithreading has been deployed to allow more efficient processing of speech recognition, significantly improving the speed of execution for larger wake word models.
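As a sanity check on the figures quoted above, the battery arithmetic works out roughly like this. The 250mAh/12h baseline and the 4000mAh battery capacity are assumed typical values for illustration, not numbers from the announcement:

```python
def little_big_savings(baseline_mah_12h=250.0, reduction=0.8,
                       battery_mah=4000.0):
    """Back-of-the-envelope battery math for the little-big approach:
    returns (mAh saved, mAh remaining, percent of battery used) per 12 h.
    Baseline draw and battery capacity are assumed typical values."""
    saved = baseline_mah_12h * reduction        # the ">80%" reduction quoted
    remaining = baseline_mah_12h - saved
    return saved, remaining, remaining / battery_mah * 100.0
```

An 80% cut from a ~250mAh/12h baseline saves ~200mAh and leaves ~50mAh, i.e. roughly one percent of a typical ~4000mAh smartphone battery, consistent with the figures above.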

“Hands-free operation for voice control has become the norm, and application developers are now looking to create hands-free wake words for their own apps,” said Todd Mozer, CEO of Sensory. “For example, we recently helped Google’s Waze accept hands-free voice commands by supplying them with Sensory’s ‘OK Waze’ wake word that runs when the app is open. With previous versions of TrulyHandsfree, having our always-on wake word engine listening for the OK Waze wake word during a short trip would have had minimal effect on a smartphone’s battery, but for longer trips a more efficient system was desired – so we created it. Sensory is excited to now offer TrulyHandsfree with excellent low-power performance to all app developers!”

TrulyHandsfree is the most widely deployed embedded speech recognition engine in the world, having enabled a hands-free voice user experience on more than two billion devices from leading brands worldwide. TrulyHandsfree offers support for every voice UI application with several types of wake word options, such as independent fixed wake words, user enrolled fixed wake words, and user defined wake words. Sensory offers off-the-shelf wake word models for all major Assistant services, including Alexa, Hey Siri, OK Google, Hey Cortana, as well as wake word models for third-party devices that support cloud AI systems from Baidu, Alibaba and Tencent. Sensory can also combine multiple wake words into one solution and is the only supplier to have deployed numerous cross-assistant wake word solutions to the market.

Sensory’s TrulyHandsfree currently supports US English, UK English, Australian English, Indian English, Arabic, Dutch, French (EU and Canadian), German, Italian, Japanese, Korean, Mandarin, Portuguese (EU and Brazil), Russian, Spanish (EU, Latin America and US), Swedish and Turkish. An SDK for TrulyHandsfree is available for Android, iOS, Linux, Mac OS, QNX and Windows. Sensory provides developer support for cloud service interfaces on Android, iOS, Linux, Mac OS, Windows as well as support for dozens of proprietary DSPs, microcontrollers, smart microphones and other low-power embedded devices. SDK updates taking advantage of lower power TrulyHandsfree are now being rolled out for Android and iOS in Q2 2018.

For more information about this announcement, Sensory or its technologies, please contact sales@sensory.com; Press inquiries: press@sensory.com.

About Sensory
Sensory Inc. creates a safer and superior UX through vision and voice technologies. Sensory’s technologies are widely deployed in consumer electronics applications including mobile phones, automotive, wearables, toys, IoT and various home electronics. Sensory’s product line includes TrulyHandsfree voice control, TrulySecure biometric authentication, and TrulyNatural large vocabulary natural language embedded speech recognition. Sensory’s technologies have shipped in over a billion units of leading consumer products. 

TrulyHandsfree is a trademark of Sensory Inc.

I Nailed It!

August 30, 2017

A few days ago I wrote a blog that talked about assistants and wake words and I said:

“We’ll start seeing products that combine multiple assistants into one product. This could create some strange and interesting bedfellows.”

Interesting that this was just announced:

http://fortune.com/2017/08/30/amazon-alexa-microsoft-cortana-siri/

Here’s another prediction for you…

All assistants will start knowing who is talking to them. They will hear your voice and look at your face and know who you are. They will bring you the things you want (e.g. play my favorite songs), and only allow you to conduct transactions you are qualified for (e.g. order more black licorice). Today some training is required, but in the near future they will just learn who is who, much like a newborn quickly learns the family members without any formal training.

Embedded AI is here

February 10, 2017

The wonders of deep learning are well utilized in the area of artificial intelligence, aka AI. Massive amounts of training data can be processed on very powerful platforms to create wonderful generalized models, which can be extremely accurate. But this in and of itself is not yet optimal, and there’s a movement afoot to move the intelligence and part of the learning onto embedded platforms.

Certainly, the cloud offers the most power and data storage, allowing the most immense and powerful of systems. However, when it comes to agility, responsiveness, privacy, and personalization, the cloud looks less attractive. This is where edge computing and shallow learning through adaptation can become extremely effective. “Little” data can have a big impact on a particular individual. Think how accurately, and with how little data, a child learns to recognize its mother.

A good example of specialized learning is when it comes to accents or speech impediments. Generalized acoustic models often don’t handle this well, resulting in customized models for different markets and accents. However, this customization is difficult to manage, can add to the cost of goods, and may negatively impact the user experience. Yet, this still results in a model generalized for a specific class of people or accents. An alternative approach could begin with a general model built with cloud resources, with the ability to adapt on the device to the distinct voices of the people that use it.

The challenge with embedded deep learning lies in its limited resources and the need to deal with on-device data collection, which by its nature will be less plentiful and unlabeled, yet more targeted. New approaches are being implemented, such as teacher/student models, where smaller models can be built from a wider body of data, essentially turning big powerful models into small powerful models that imitate the bigger ones while achieving similar performance.
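The teacher/student idea can be sketched in a few lines of Python: the small “student” model is trained to match the big “teacher” model's softened output distribution rather than hard labels. This is a generic knowledge-distillation sketch under common assumptions, not Sensory's method; the function names are made up.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T softens the distribution,
    exposing more of the teacher's 'dark knowledge' about similar classes."""
    z = [x / T for x in logits]
    m = max(z)  # subtract max for numerical stability
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between the teacher's softened output distribution and
    the student's: minimizing it makes the small model imitate the big one."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return -sum(t * math.log(s + 1e-12) for t, s in zip(p_t, p_s))
```

Minimizing this loss over training data pulls the student's outputs toward the teacher's, which is how a small model comes to imitate a big one while keeping similar performance.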

Generative data without supervision can also be deployed for on-the-fly learning and adaptation. Along with improvements in software and technology, the chip industry is going through somewhat of a deep learning revolution, adding more parallel processing and specialized vector math functions. For example, GPU vendor NVIDIA has some exciting products that take advantage of deep learning. Smaller embedded deep learning IP companies like Nervana, Movidius, and Apical are getting snapped up in highly valued acquisitions by larger companies like Intel and ARM.

Embedded deep learning and embedded AI is here.

TrulySecure 2.0 Wins First Place in 2016 CTIA E-Tech Awards

September 9, 2016


We are pleased to announce that Sensory’s TrulySecure technology has earned first place in this year’s CTIA E-Tech Awards. We believe that this recognition serves as a testament to Sensory’s devotion to developing the best embedded speech recognition and biometric security technologies available.

For those of you unfamiliar with TrulySecure – TrulySecure is the result of more than 20 years of Sensory’s industry-leading and award-winning experience in the biometric space. The TrulySecure SDK allows application developers concerned about both security and convenience to quickly and easily deploy a multimodal voice and vision authentication solution for mobile phones, tablets, and PCs. TrulySecure is highly secure, robust to the environment, and user friendly – offering better protection and greater convenience than passwords, PINs, fingerprint readers and other biometric scanners. TrulySecure offers the industry’s best accuracy at recognizing the right user, while keeping unauthorized users out. Sensory’s advanced deep learning neural networks are fine-tuned to provide verified users with instant access to protected apps and services, without the all-too-common false rejections of the right user associated with other biometric authentication methods. TrulySecure features a quick and easy enrollment process – capturing voice and face simultaneously in a few seconds. Authentication is on-device and almost instantaneous.

TrulySecure provides maximum security against attempts by mobile identity thieves to break into a protected mobile device, while ensuring the most accurate verification rates for the actual user. According to published data by Apple, the iPhone’s thumbprint reader offers about a 1:50K chance of a false accept of the wrong user, and the probability of the wrong user getting into the device gets higher when the user enrolls more than one finger. With TrulySecure, face and voice biometrics individually offer a baseline 1:50K false accept rate, but each can be made more secure depending on the security needs of the developer. When both face and voice biometrics are required for user authentication, TrulySecure is virtually impenetrable by anybody but the actual user. As a baseline, TrulySecure’s face+voice authentication offers a 1:100K False Accept Rate, but can be dialed in to offer as much as a 1:1M False Accept Rate depending on security needs.
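To see why requiring both modalities tightens security, consider the idealized case where the two checks fail independently: the false accept rates then multiply. This is a simplification for illustration only; real modalities are partially correlated and thresholds are tuned for usability, which is one reason the fused baseline above is quoted as 1:100K rather than the independent-case product.

```python
def fused_far(far_face, far_voice):
    """Idealized combined false-accept rate when an impostor must defeat
    BOTH modalities and their errors are independent: the rates multiply."""
    return far_face * far_voice
```

Two independent 1:50K modalities would give 1:2.5 billion; in practice, correlation between errors and deliberately relaxed per-modality thresholds (to keep false rejects low) land the fused system in the 1:100K-to-1:1M range described above.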

TrulySecure is robust to environmental challenges such as low light or high noise – it works in real-life situations that render lesser offerings useless. The proprietary speaker verification, face recognition, and biometric fusion algorithms leverage Sensory’s deep strength in speech processing, computer vision, and machine learning to continually make the user experience faster, more accurate, and more secure. The more the user uses TrulySecure, the more secure it gets.

TrulySecure offers ease-of-mind specifications: no special hardware is required – the solution uses standard microphones and cameras universally installed on today’s phones, tablets and PCs. All processing and encryption is done on-device, so personal data remains secure – no personally identifiable data is sent to the cloud. TrulySecure was also the first biometric fusion technology to be FIDO UAF Certified.

While we are truly honored to be the recipient of this prestigious award, we won’t rest on our laurels. Our engineers are already working on the next generation of TrulySecure, further improving accuracy and security, as well as refining the already excellent user experience.

Guest blog by Michael Farino

Sensory Earns Two Coveted 2016 Speech Tech Magazine Awards

August 22, 2016

Sensory is proud to announce that it has been awarded with two 2016 Speech Tech Magazine Awards. With some stiff competition in the speech industry, Sensory continues to excel in offering the industry’s most advanced embedded speech recognition and speech-based security solutions for today’s voice-enabled consumer electronics movement.

The 2016 Speech Technology Awards include:


Speech Luminary Award – Awarded to Sensory’s CEO, Todd Mozer

“What really impresses me about Todd is his long commitment to speech technology, and specifically, his focus on embedded and small-footprint speech recognition,” says Deborah Dahl, principal at Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interactions Working Group. “He focuses on what he does best and excels at that.”


Star Performers Award – Awarded to Sensory for its contributions in enabling voice-enabled IoT products via embedded technologies

“Sensory has always been in the forefront of embedded speech recognition, with its TrulyHandsfree product, a fast, accurate, and small-footprint speech recognition system. Its newer product, TrulyNatural, is groundbreaking because it supports large vocabulary speech recognition and natural language understanding on embedded devices, removing the dependence on the cloud,” said Deborah Dahl, principal at Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interactions Working Group. “While cloud-based recognition is the right solution for many applications, if the application must work regardless of connectivity, embedded technology is required. The availability of TrulyNatural embedded natural language understanding should make many new types of applications possible.”

– Guest Blog by Michael Farino


IoT Roadshow with Open Systems Media

May 6, 2016

Rich Nass and Barbara Quinlan from Open Systems Media visited Sensory on their “IoT Roadshow”.

IoT is a very interesting area. About 10 years ago we saw voice-controlled IoT on the way, and we started calling the market SCIDs – Speech Controlled Internet Devices. I like IoT better; it’s certainly a more popular name for the segment! ;-)

I started our meeting off by talking about Sensory’s three products – TrulyHandsfree Voice Control, TrulySecure Authentication, and TrulyNatural large vocabulary embedded speech recognition.

Although TrulyHandsfree is best known for its “always on” capabilities, ideal for listening for key phrases (like OK Google, Hey Cortana, and Alexa), it can be used in a ton of other ways. One of them is hands-free photo taking, so no selfie stick is required. To demonstrate, I put my camera on the table and took pictures of Barbara and Rich. (Normally I might have joined the pictures, but their healthy hair, naturally good looks, and formal attire were too outclassing for my participation.)


[Photos: IoT pic 1, IoT pic 2]


There’s a lot of hype about IoT and Wearables and I’m a big believer in both. That said, I think Amazon’s Echo is the perfect example of a revolutionary product that showcases the use of speech recognition in the IoT space and am looking forward to some innovative uses of speech in Wearables!

Here’s the article they wrote on their visit to Sensory and an impromptu video showing TrulyNatural performing on-device navigation, as well as a demo of TrulySecure via our AppLock Face/Voice Recognition app.

IoT Roadshow, Santa Clara – Sensory: Look ma, no hands!

Rich Nass, Embedded Computing Brand Director

If you’re building an IoT device that requires hands-free operation, check out Sensory, just like I did while on OpenSystems Media’s IoT Roadshow. Sensory’s technology worked flawlessly running through the demo, as you can see in the video. We ran through two different products, one for input and one for security.

Sensory Talks AI and Speech Recognition With Popular Science Radio Host Alan Taylor

June 11, 2015

Guest post by: Michael Farino

[Image: Popular Science Radio]


Sensory’s CEO, Todd Mozer, joined Alan Taylor, host of Popular Science Radio, for a fun discussion about artificial intelligence and Sensory’s involvement with the Jibo robot development team, and gave the show’s listeners a look into the past 20 years of speech recognition. Todd and Alan also discussed some of the latest advancements in speech technology, and Todd provided an update on Sensory’s most recent achievements in the field of speech recognition as well as a brief look into what the future holds.

Listen to the full radio show at the link below:

Big Bang Theory, Science, and Robots | FULL EPISODE | Popular Science Radio #269
Ever wondered how accurate the science of the Big Bang Theory TV series is? Curious about how well speech recognition technology and robots are advancing? We interview two great minds to probe for these answers.
