Archive for the ‘Voice Control’ Category
August 13, 2018
It’s not easy to be a retailer today when more and more people are turning to Amazon for shopping. And why not shop online? Ordering is convenient with features such as ratings. Delivery is fast and cheap, and returns are easy and free – if you are Prime member! In April 2018 Bezos reported there are more than 100 million Prime members in the world, and the majority of US households are Prime members. Walmart and Google have partnered in an ecommerce play to compete with Amazon, but Walmart is just dancing with the devil. Google will use the partnership to gather data and invest more in their internal ecommerce and shopping experiences. Walmart isn’t relaxing, and is aggressively pursuing ecommerce and AI initiatives through acquisitions, and its Store #8 that acts as an incubator for AI companies and internal initiatives. Question: why does Facebook have a Building 8 and Walmart have a Store 8 for skunkworks projects?
Read more at Embedded Computing.
August 6, 2018
Apple introduced Siri in 2011 and my world changed. I was running Sensory back then as I am today and suddenly every company wanted speech recognition. Sensory was there to sell it! Steve Jobs, a notorious nay-sayer on speech recognition, had finally given speech recognition the thumbs up. Every consumer electronics company noticed and decided the time had come. Sensory’s sales shot up for a few years driven by this sudden confidence in speech recognition as a user interface for consumer electronics…
Read more on Voicebot.ai
August 6, 2018
Here’s the basic motivation that I see in creating Voice Assistants…Build a cross platform user experience that makes it easy for consumers to interact, control and request things through their assistant. This will ease adoption and bring more power to consumers who will use the products more and in doing so create more data for the cloud providers. This “data” will include all sorts of preferences, requests, searches, purchases, and will allow the assistants to learn more and more about the users. The more the assistant knows about any given user, the BETTER the assistant can help the user in providing services such as entertainment and assisting with purchases (e.g. offering special deals on things the consumer might want). Let’s look at each of these in a little more detail:….
Read more at Embedded Computing
July 25, 2018
This is part one in a three part series. Catch part two next week.
I have spoken on a lot of “voice” oriented shows over the years, and it has been disappointing that there hasn’t been more discussion about the competition in the industry and what is driving the huge investments we see today. Because companies like Amazon and Google participate in and sponsor these shows, there is a tendency to avoid the more controversial aspects of the industry. I wrote this blog to share some of my thoughts on what is driving the competition, why the voice assistant space is so strategically important to companies, and some of the challenges resulting from the voice assistant battles…
Read more at Embedded Computing
September 28, 2017
Finovate is one of those shows where you get up on stage and give a short intro and live demo. They are selective in who they allow to present and many applicants are rejected. Sensory demonstrated some really cutting-, perhaps bleeding-, edge stuff by combining animated talking avatars, with text-to-speech, lip movement synchronization, natural language speech recognition and face and voice biometrics. I don’t know of any company ever combining so many AI technologies into a single product or demo!
Speech recognition has a long history of failing on stage, and one of the ways Sensory has always differentiated itself, is that our demos always work! And all our AI technologies worked here too! Even with bright backlighting, our TrulySecure face recognition was so fast and accurate some missed it. With the microphones and echo’s in the large room, our TrulyNatural speech recognition was perfect! That said, we did have a user-error… before Jeff and I got on stage he put his demo phone in DND mode, which cut our audio output – but quickly recovered from that mishap.
August 30, 2017
A few days ago I wrote a blog that talked about assistants and wake words and I said:
“We’ll start seeing products that combine multiple assistants into one product. This could create some strange and interesting bedfellows.”
Interesting that this was just announced:
Here’s another prediction for you…
All assistants will start knowing who is talking to them. They will hear your voice and look at your face and know who you are. They will bring you the things you want (e.g. play my favorite songs), and only allow you to conduct transaction you are qualified for (e.g. order more black licorice). Today there is some training required but in the near future they will just learn who is who much like a new born quickly learns the family members without any formal training.
August 28, 2017
Ten years ago, I tried to explain to friends and family that my company Sensory was working on a solution that would allow IoT devices to always be “on” and listening for a key wake up word without “false firing” and doing it at ultra-low power and with very little processing power. Generally, the response was “Huh?”
Today, I say, “Just like Hey Siri, OK Google, Alexa, Hey Cortana, and so on.” Now, everybody gets it and the technology is mainstream. In fact, next year, Sensory will have technology that’s embedded in IoT devices that listens to all those things (and more). But that’s not good enough.
Read more at Embedded Computing
January 5, 2017
Virtual handsfree assistants that you can talk to and that talk back have rapidly gained popularity. First, they arrived in mobile phones with Motorola’s MotoX that had an ‘always listening’ Moto Voice powered by Sensory’s TrulyHandsfree technology. The approach quickly spread across mobile phones and PCs to include Hey Siri, OK Google, and Hey Cortana.
Then Amazon took things to a whole new level with the Echo using Alexa. A true voice interface emerged, initially for music but quickly expanding domain coverage to include weather, Q&A, recipes, and the most common queries. On top of that, Amazon took a unique approach by enabling 3rd parties to develop “skills” that now number over 6000! These skills allow Amazon’s Echo line (with Tap, Dot) and 3rd Party Alexa equipped products (like Nucleus and Triby) to be used to control various functions, from reading heartrates on Fitbits to ordering Pizzas and controlling lights.
Until recently, handsfree assistants required a certain minimum power capability to really be always on and listening. Additionally, the hearable market segment including fitness headsets, hearing aids, stereo headsets and other Bluetooth devices needed to use touch control because of their power limitations. Also, Amazons Alexa had required WIFI communications so you could sit on your couch talking to your Echo and query Fitbit information, but you couldn’t go out on a run and ask Alexa what your heartrate was.
All this is changing now with Sensory’s VoiceGenie!
The VoiceGenie runs an embedded recognizer in a low power mode. Initially this is on a Qualcomm/CSR Bluetooth chip, but could be expanded to other platforms. Sensory has taken an SBC music decoder and intertwined a speech recognition system, so that the Bluetooth device can recognize speech while music is playing.
The VoiceGenie is on and listening for 2 keywords:
For example, a Bluetooth headset’s volume, pairing, battery strength, or connection status can only be controlled by the device itself, so VoiceGenie handles those controls without touching required. VoiceGenie can also read incoming callers’ names and ask the user if they want to answer or ignore. VoiceGenie can call up the phone’s assistant, like Google Assistant or Siri or Cortana, to ask by voice for a call to be made or a song to be played.
Some of the important facts behind the new VoiceGenie include:
This third point is perhaps the least understood, yet the most important. People want a personalized assistant that knows them, keeps their secrets safe, and helps them in their daily lives. This help can be accessing information or controlling your environment. It’s very difficult to accomplish this for privacy and power reasons in a cloud powered environment. There needs to be embedded intelligence. It needs to be low power. VoiceGenie is that low powered voice assistant.
October 14, 2016
I watched Sundar and Rick and the team at Google announce all the great new products from Google. I’ve read a few reviews and comparisons with Alexa/Assistant and Echo/Home, but it struck me that there’s quite an overlap in the reports I’m reading and some of the more interesting things aren’t being discussed.
Read the rest at Embedded Computing…
August 22, 2016
Sensory is proud to announce that it has been awarded with two 2016 Speech Tech Magazine Awards. With some stiff competition in the speech industry, Sensory continues to excel in offering the industry’s most advanced embedded speech recognition and speech-based security solutions for today’s voice-enabled consumer electronics movement.
The 2016 Speech Technology Awards include:
Speech Luminary Award – Awarded to Sensory’s CEO, Todd Mozer
“What really impresses me about Todd is his long commitment to speech technology, and specifically, his focus on embedded and small-footprint speech recognition,” says Deborah Dahl, principal at Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interactions Working Group. “He focuses on what he does best and excels at that.”
Star Performers Award – Awarded to Sensory for its contributions in enabling voice-enabled IoT products via embedded technologies
“Sensory has always been in the forefront of embedded speech recognition, with its TrulyHandsfree product, a fast, accurate, and small-footprint speech recognition system. Its newer product, TrulyNatural, is ground- breaking because it supports large vocabulary speech recognition and natural language understanding on embedded devices, removing the dependence on the cloud,” said Deborah Dahl, principal at Conversational Technologies and chair of the World Wide Web Consortium’s Multimodal Interactions Working Group. “While cloud-based recognition is the right solution for many applications, if the application must work regardless of connectivity, embedded technology is required. The availability of TrulyNatural embedded natural language understanding should make many new types of applications possible.”
– Guest Blog by Michael Farino