Just about every quarter some big chip company announces an initiative or partnership in speech recognition. I saw one a couple of months back, and the announcement was so vague that it wasn’t even clear what technology would be available on which platforms, with what memory requirements, in what languages, or at what cost. It was also unclear how to start developing and build a prototype, or how to get a support or license agreement in place. It was a pretty unintelligible announcement, likely intended as a “fishing” trip to see if anyone would respond.
Sensory is making an announcement with our partner ST Micro, and we are intentionally NOT being vague. This partnership is designed to make it easy to develop or prototype, and we have created a very clear path for interested clients. It combines a few key elements that make it VERY special and make it easy to develop speech-based or voice-controlled products:
- STM32Cube and XCube-LocalVUI. STM32Cube, together with XCube-LocalVUI, is an original STMicroelectronics initiative to significantly improve designer productivity by reducing software development effort, time, and cost. STM32Cube covers the whole STM32 portfolio and includes user-friendly software development tools that take a project from conception to realization, including graphical software, C compilers, and a debug suite. Available as a free download at ST.com.
- STM32H747I-DISCO Discovery kit. This is a hardware board equipped with everything needed for prototyping, including microphones and the necessary I/O ports. It’s available from Digikey and other vendors for under $100, and it interfaces nicely with the STM32Cube software!
- VoiceHub. VoiceHub is Sensory’s easy-to-use set of web tools for prototype vocabulary development. You can type in your custom TrulyHandsfree wake word (or wake words) in any of the more than a dozen languages supported. Select from many different model size options and select the STM32 as your download platform. Large vocabularies can be added with TrulyNatural, including natural language understanding for defining more complex interactions using intents and slots. The TrulyNatural output can be tested on your PC, Android, or iOS device, then sent to the STM32H747I-DISCO Discovery kit. Licensing is simple and fast with click-to-sign agreements.
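For readers new to the "intents and slots" idea mentioned in the VoiceHub item, here is a minimal sketch in Python. It is only an illustration of the concept: the `PATTERNS` table, the `parse` function, and the intent and slot names are all hypothetical, and none of this is the VoiceHub grammar format or Sensory's implementation. An intent is what the user wants done, and slots are the variable details filled in from the phrase.

```python
import re
from typing import Optional

# Hypothetical intent patterns for a microwave; named groups act as slots.
# This is NOT the VoiceHub grammar format, only an illustration of the idea.
PATTERNS = {
    "cook": re.compile(
        r"^(?:cook|heat|microwave) (?:the |my )?(?P<food>\w+) "
        r"for (?P<amount>\d+) (?P<unit>seconds?|minutes?)$"
    ),
    "stop": re.compile(r"^(?:stop|cancel)$"),
}


def parse(utterance: str) -> Optional[dict]:
    """Return the first matching intent and its slots, or None if nothing matches."""
    text = utterance.lower().strip()
    for intent, pattern in PATTERNS.items():
        match = pattern.match(text)
        if match:
            # groupdict() maps each named group (slot) to the text it captured.
            return {"intent": intent, "slots": match.groupdict()}
    return None


if __name__ == "__main__":
    print(parse("Heat my popcorn for 2 minutes"))
    print(parse("Cancel"))
```

A real assistant covering tens of thousands of phrases would rely on a trained recognizer and NLU model rather than hand-written patterns like these; the sketch only shows the shape of the result an application would act on.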
To help illustrate what is possible with this new partnership, Sensory ran its microwave assistant (which includes tens of thousands of phrases that could be spoken to a microwave) on the ST H7 chip. Here’s what we found with two different memory configurations (both using a 32-bit ARM Cortex-M7 with 1 MB RAM and 2 MB Flash):
[Table: STM32H747 (32-bit ARM Cortex-M7, Flash) running the “Microwave” assistant. *NLU for reducing TCR in SDK at OS level only.]
For those interested in the details of the memory breakdown, we can share a breakdown that shows the small amount of static code and RAM Sensory needs for a complete integration: technology code, application code, language and acoustic models, wake words, commands, and NLU.
[Figure: RAM requirements for code, application code, language and acoustic models, wake words, commands, and NLU.]