Engineering POV: How Snips & NXP offline voice control solutions enable simple, natural interaction with everyday devices.
Snips and NXP already provide full natural-language voice understanding on MPUs, and now they are working together to bring voice to every device. Controlling devices through voice interactions is more natural and straightforward than fumbling through complex user interfaces, especially on smaller, lower-cost devices that don’t usually have a touch screen.
To help manufacturers easily add voice capabilities to their products, Snips has combined their expertise in on-device voice interface solutions with NXP’s i.MX RT crossover processors. This new solution works with an application-specific model. For example, in a washing machine, a user may initiate a wash cycle through spoken commands. The washing machine will then ask appropriate questions to set water temperature, spin cycle, and any other appropriate parameters.
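The washing machine exchange above can be pictured as a small slot-filling dialog. The following is only a minimal sketch under stated assumptions: the class name `WashDialog`, the slot names, the prompts and the allowed values are all hypothetical and are not taken from the Snips or NXP software.

```python
from dataclasses import dataclass, field

# Hypothetical slots, prompts and allowed values (illustration only).
PROMPTS = {
    "temperature": "What water temperature?",
    "spin": "What spin speed?",
}
ALLOWED = {
    "temperature": {"cold", "warm", "hot"},
    "spin": {"low", "medium", "high"},
}


@dataclass
class WashDialog:
    """Toy dialog: asks for each missing setting until all are filled."""
    settings: dict = field(default_factory=dict)
    active: bool = False

    def _next_slot(self):
        # First slot that has not been answered yet, or None when done.
        for slot in PROMPTS:
            if slot not in self.settings:
                return slot
        return None

    def handle(self, command: str) -> str:
        """Take one recognized command and return the machine's reply."""
        command = command.strip().lower()
        if not self.active:
            if command == "start wash":
                self.active = True
                self.settings = {}
                return PROMPTS[self._next_slot()]
            return "Say 'start wash' to begin."
        slot = self._next_slot()
        if command not in ALLOWED[slot]:
            return PROMPTS[slot]  # not a valid answer: ask again
        self.settings[slot] = command
        following = self._next_slot()
        if following is None:
            self.active = False
            return (f"Starting wash: {self.settings['temperature']} water, "
                    f"{self.settings['spin']} spin.")
        return PROMPTS[following]


dialog = WashDialog()
replies = [dialog.handle(c) for c in ("start wash", "warm", "high")]
```

Because the set of commands is small and fixed per product, a dialog like this only needs the recognizer to distinguish a handful of phrases, which is what makes an application-specific model practical on a small processor.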
The offline implementation eliminates cloud-connectivity cost adders such as a Wi-Fi module, and it runs on NXP’s low-cost i.MX RT crossover processor platform. Together, these enable breakthrough system cost savings, making the solution suitable for a broader range of applications such as switches, dimmers, small appliances and thermostats.
Another key benefit is privacy by design: none of the audio is transmitted to the cloud, and all processing is done locally on the device itself. This voice solution incorporates many cutting-edge technologies that are typically found in high-end hardware and co-processor DSPs. Leveraging the performance of i.MX RT processors, it can deliver most, and in many cases all, of the capabilities typically offered in MPU+DSP designs.
The audio processing front end and the Snips local control library are the unique enabling technologies. The local control library package is easy to use and features both hotword and command detection. These two features can be used together or separately to customize the user experience.
From the software perspective, the library is efficient and easy to integrate into any application. It uses less than 100 KB of RAM for typical models, leaving plenty of RAM for the rest of the application.
After setting up and initializing the library, the application simply feeds an input audio stream into the Snips library. When the library detects the hotword or a command, it invokes callbacks so that the user application can handle the event.
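The feed-and-callback pattern described above can be sketched as follows. This is not the Snips API: `ToyVoiceLibrary`, its constructor arguments and the `fake_classify` stand-in are hypothetical names invented for illustration, and the "detection" is a lookup on made-up frames rather than real audio analysis. The `use_hotword` flag mirrors the idea that hotword and command detection can be used together or separately.

```python
from typing import Callable, Optional, Sequence

Frame = Sequence[int]  # one chunk of audio samples
Classifier = Callable[[Frame], Optional[str]]  # returns a label or None


class ToyVoiceLibrary:
    """Hypothetical stand-in for a local voice control library."""

    def __init__(self, classify: Classifier,
                 on_hotword: Callable[[], None],
                 on_command: Callable[[str], None],
                 use_hotword: bool = True) -> None:
        self._classify = classify
        self._on_hotword = on_hotword
        self._on_command = on_command
        self._use_hotword = use_hotword
        # Without a hotword, commands are accepted at any time.
        self._awake = not use_hotword

    def feed(self, frame: Frame) -> None:
        """Process one audio frame and fire a callback on detection."""
        label = self._classify(frame)
        if label is None:
            return
        if label == "hotword":
            if self._use_hotword:
                self._awake = True
                self._on_hotword()
            return
        if self._awake:
            self._on_command(label)
            # With a hotword, go back to sleep after one command.
            self._awake = not self._use_hotword


def fake_classify(frame: Frame) -> Optional[str]:
    # Made-up rule so the sketch runs without real audio.
    return {1: "hotword", 2: "lights_on"}.get(frame[0])


events = []
lib = ToyVoiceLibrary(fake_classify,
                      on_hotword=lambda: events.append("hotword"),
                      on_command=events.append)
for frame in ([0, 0], [2, 0], [1, 0], [2, 0], [2, 0]):
    lib.feed(frame)
# events now holds one hotword followed by one command; the commands
# before the hotword and after the first accepted command were ignored.
```

The application code stays small because it only registers handlers and forwards audio; all detection state lives inside the library object.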
Feeding the library is the audio processing front end. This component listens to multiple microphones (up to three in this voice solution) and cleans up the audio by applying processing such as beamforming and echo cancellation. The front end then chooses the best beam and sends its audio to the library.
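To make the beam-selection step concrete, here is a minimal sketch of delay-and-sum beamforming followed by picking one beam. It rests on several assumptions that the article does not confirm: the steering delays are whole, non-negative sample counts, "best" means highest mean-square energy, and echo cancellation is left out entirely. The function names and the three beam labels are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def delay_and_sum(mics: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Advance each microphone by its delay (in samples) and average."""
    n = min(len(m) - d for m, d in zip(mics, delays))
    return [sum(m[i + d] for m, d in zip(mics, delays)) / len(mics)
            for i in range(n)]


def energy(signal: Sequence[float]) -> float:
    """Mean-square energy of a signal."""
    return sum(s * s for s in signal) / len(signal)


def best_beam(mics: Sequence[Sequence[float]],
              steering: Dict[str, Sequence[int]]) -> Tuple[str, List[float]]:
    """Form one beam per steering direction and return the strongest."""
    beams = {name: delay_and_sum(mics, d) for name, d in steering.items()}
    name = max(beams, key=lambda k: energy(beams[k]))
    return name, beams[name]


# Synthetic example: the same source reaches mic 0 first, then mic 1
# one sample later, then mic 2 two samples later.
src = [0, 1, -1, 1, -1, 0, 0, 0]
mics = [src, [0] + src[:-1], [0, 0] + src[:-2]]
steering = {"left": (0, 1, 2), "center": (0, 0, 0), "right": (2, 1, 0)}

name, beam = best_beam(mics, steering)
```

In this synthetic case the "left" delays line the three copies up exactly, so that beam reproduces the source while the other two partially cancel. A production front end would use a more robust selection criterion and adaptive filtering, which this sketch does not attempt.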
Together, NXP and Snips are providing a complete, fully tested implementation of local voice control that can be rapidly integrated into any application.
On the horizon, NXP’s scalable IoT solutions architecture based on i.MX processors and Snips voice technology can easily be combined with other leading AI/ML capabilities such as facial recognition, object detection and anomaly detection to enable a variety of exciting new applications.
Snips and NXP are currently working with select partners on this solution.