Turnkey, low-cost NXP i.MX RT-based voice solution speeds time to market.
In January 2019, Dave Limp, Amazon’s Senior Vice President of Devices and Services, stated in an interview with The Verge that more than 100 million Alexa Voice Service (AVS) devices had been sold worldwide. While this number pales in comparison to the number of smartphones that come pre-installed with Siri or Google Voice Assistant (GVA), people who buy Alexa-compatible devices are often making an active choice to use Alexa, unlike smartphone users who may never use their pre-installed assistants. Having voice assistants built into phones is a nice feature, but it doesn’t give users the power and convenience of hands-free operation that they experience with devices like Amazon’s Echo smart speakers. According to Amazon, in addition to its own Echo speakers, these 100 million Alexa devices also include a wide range of third-party devices made by over 4,500 different manufacturers, consisting of over 150 products with Alexa built-in and over 28,000 different products that work with Alexa.
By reducing cost and flattening the learning curve, a recently announced solution from NXP® Semiconductors is designed to facilitate the rapid growth in the number of products with Alexa built-in. For the first time, this solution enables a low-cost microcontroller unit (MCU) to be used to build Alexa into a much broader variety of devices, eliminating the need for an expensive applications processor or microprocessor unit (MPU) and making it much easier for many manufacturers to add Alexa to their designs. It does this by leveraging the AVS Integration for AWS IoT Core, which Amazon launched on November 25th and which enables Alexa to be built into MCU-powered IoT devices. In this post, we review the differences between products that work with Alexa and those that have Alexa built-in. We also explore the differences between MCUs and MPUs, as they are key to understanding why using the AVS Integration for AWS IoT Core to deliver Alexa built-in will enable the proliferation of Alexa into smart home and smart appliance products.
Works with Alexa vs. Alexa Built-in
There are around two hundred times as many different “Works with Alexa” products available today as there are products with Alexa built-in. This is because it is relatively straightforward for a manufacturer to add “Works with Alexa” compatibility to an app-controlled smart device. Furthermore, “Works with Alexa” capability can be added to smart devices long after they have been sold and installed in the field. “Works with Alexa” means the device has been certified by Amazon to verify that it can be controlled by the Alexa Voice Service, using voice commands spoken into devices like Amazon’s Echo speakers. Original equipment manufacturers (OEMs) achieve this by creating their own Alexa Skills, or by leveraging existing Alexa Skills such as those within the Smart Home Skill API, enabling users to control their devices with Alexa.
Alexa built-in products have microphones to listen for the Alexa wake word and relay commands to the cloud, and a speaker to play back Alexa’s subsequent responses. For true hands-free operation, most Alexa built-in devices need to have far-field capability, which means they can understand voice commands from across the room, typically at distances of up to 5 m (about 16 feet). In order to extract intelligible speech from a noisy background, a good far-field voice implementation typically has an audio front end (AFE) processing capability to suppress background noise, eliminate echo, allow barge-in (commands can be recognized during audio playback) and perform beamforming from a multi-microphone array.
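To make the beamforming step concrete, here is a minimal sketch of delay-and-sum beamforming, the simplest textbook form of the technique. It is not NXP’s AFE, which is supplied as a closed binary and described later as machine learning based; the function names, the two-microphone setup and the integer sample delays are all assumptions chosen for illustration.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   how many samples later the target sound reaches each microphone
              relative to the earliest one (non-negative integers).
    """
    length = len(channels[0]) - max(delays)
    output = []
    for n in range(length):
        # Read each channel at the moment the target sound arrived there.
        total = sum(ch[n + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output


def impulse(length, position):
    """A silent signal with a single unit-amplitude click at `position`."""
    return [1.0 if i == position else 0.0 for i in range(length)]


# Hypothetical scene: the click reaches mic 0 at sample 10 and mic 1 at sample 12.
mics = [impulse(32, 10), impulse(32, 12)]

steered = delay_and_sum(mics, [0, 2])    # delays match the source direction
unsteered = delay_and_sum(mics, [0, 0])  # delays match a different direction
```

When the delays match the direction of the talker, the two copies of the click line up and add coherently, so the steered output keeps the full amplitude. With mismatched delays the copies land on different samples and each is halved by the averaging, which is how sound from other directions is attenuated.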
MPUs and MCUs
Application processors, also referred to as microprocessors (MPUs), run a complex operating system (OS) like Windows, macOS, iOS, Android or Linux, requiring large memory footprints consisting of gigabytes (GB) of NAND Flash storage and SDRAM memory. All these operating systems manage virtual memory spaces that are mapped to the processor’s physical memory by a memory management unit (MMU). Today’s MPUs typically have two, four, eight or more processor cores and are found in powerful devices that include laptops, smartphones, tablets and smart screens, video game consoles, routers and gateways. MPUs are typically based on Arm® Cortex®-A central processing units (CPUs) and include devices such as the Apple A13 Bionic powering the current iPhone models, or the Broadcom BCM2837B0 SoC found in the Raspberry Pi 3 Model B+ board.
Microcontrollers (MCUs), used in embedded designs, primarily use a real-time operating system (RTOS) such as Amazon FreeRTOS. These RTOS implementations usually require very little memory, typically a megabyte (MB) or less of Flash and RAM, both of which are often integrated on-chip. MCUs are almost always single-CPU devices and are used for embedded control of products like appliances, power tools, toys, automotive subsystems (engines, brakes, steering and suspension) and many smart home products, including light switches, smart plugs, thermostats and smoke detectors. Today, MCUs are often based on Arm Cortex-M CPUs, such as the NXP Kinetis® MCUs that power the Nest Protect smoke detectors.
Difficulties Adding Alexa Built-in to a Product
Prior to the introduction of NXP’s i.MX RT106A-based solution for Alexa Voice Service, Alexa built-in required OEMs to use a powerful MPU running the Linux operating system and capable of delivering at least 750 DMIPS (Dhrystone million instructions per second) with over 50 MB of memory. Additional CPU resources and memory are required to implement the AFE processing necessary for far-field voice, which is often implemented on a separate, dedicated DSP. Many products that OEMs would like to build Alexa into do not use MPUs. They are powered by MCUs, so adding an MPU, or even replacing the MCU with an MPU, would significantly drive up the cost. Furthermore, many engineers at these OEMs have spent their entire careers writing embedded code for MCUs using an RTOS and are quite unfamiliar with MPUs and operating systems like Linux.
NXP’s MCU-Based Solution Makes Alexa Built-in Easy!
In February 2019, at the Embedded World Show in Nuremberg, Germany, NXP announced the world’s first MCU-based implementation of an Alexa client, based on a new member of NXP’s popular i.MX RT crossover MCU family of devices, the i.MX RT106A. This new solution, for the first time, enabled OEMs to build Alexa into products using a low-cost, low-power MCU, a device that is typically already required in any connected smart home product. As a result, OEMs can now add voice to their products at very low incremental cost, not much more than the cost of the microphones and a speaker. Running Amazon FreeRTOS, NXP’s new MCU-based AVS solution leverages the power of Amazon Web Services’ AVS Integration for AWS IoT Core to minimize the processing resources needed to build Alexa into a product.
To enable Alexa to be built into MCU-based products, Amazon does not run the full AVS Device SDK on the physical device. Instead, it runs a virtual Alexa client in the cloud as a containerized service instance on AWS IoT Core. With this implementation, all the HTTP/2 communication traffic is cloud-to-cloud and, as a result, the Alexa client can be implemented with a low-cost MCU, using MQTT messages to make updates to the service instance in the cloud. In addition to lowering device hardware costs, having the AVS device client running in the cloud also offers OEMs a significant reduction in their lifetime device management costs by reducing the frequency and size of over-the-air software updates to devices in the field. Instead of OEMs having to push frequent AVS client SDK updates of 50 MB or more to every one of their devices operating in the field, Amazon can now update all the AVS virtual device service instances running in the cloud on AWS IoT Core, at zero cost to the OEMs. Similarly, because the software image for the MCU-based devices is more than two orders of magnitude smaller than it would be on an MPU, the cost for OEMs to update their own software on the physical device is also significantly reduced.
Compared to traditional MPU implementations running Linux, which require gigabytes of RAM and Flash, NXP’s MCU solution needs only a few hundred kilobytes of on-chip RAM and a few megabytes of Flash, significantly reducing the cost and size of an Alexa built-in design. While MCUs are typically priced lower than MPUs, it is the huge reduction in memory requirements that enables most of the bill of materials (BOM) cost savings achieved with this MCU-based implementation.
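As a rough illustration of what sending audio as MQTT messages involves on the device side, the sketch below splits a captured audio buffer into small, sequence-numbered payloads and reassembles them at the other end. Everything in it is an assumption made for illustration: the topic string, the 1024-byte chunk size and the 4-byte sequence header are hypothetical and do not reflect the actual message format of the AVS Integration for AWS IoT Core. No network connection is made.

```python
import struct

HYPOTHETICAL_TOPIC = "example/device-123/audio"  # placeholder, not a real AVS topic
CHUNK_BYTES = 1024                               # assumed audio bytes per message


def audio_to_messages(pcm, chunk_bytes=CHUNK_BYTES):
    """Yield (topic, payload) pairs for a buffer of captured audio.

    Each payload is a 4-byte big-endian sequence number followed by a
    slice of the audio data.
    """
    for seq, start in enumerate(range(0, len(pcm), chunk_bytes)):
        header = struct.pack(">I", seq)
        yield HYPOTHETICAL_TOPIC, header + pcm[start:start + chunk_bytes]


def messages_to_audio(messages):
    """Receiver-side counterpart: order by sequence number and strip headers."""
    ordered = sorted(messages, key=lambda m: struct.unpack(">I", m[1][:4])[0])
    return b"".join(payload[4:] for _, payload in ordered)


pcm = bytes(range(256)) * 10  # 2560 bytes of stand-in audio data
messages = list(audio_to_messages(pcm))
```

The point of the sketch is the size of the job: the device only has to frame and publish small messages, while the heavier protocol work described above stays in the cloud.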
The i.MX RT106A MCU at the heart of NXP’s AVS solution is powered by a 600 MHz Arm Cortex-M7 processor, with 1 MB of on-chip SRAM and a wide variety of communications and other peripherals. It comes with a license to use NXP’s turnkey AVS-qualified software, including a machine learning (ML) implementation of the AFE needed to meet Amazon’s far-field voice requirements. This software runs on a production-ready hardware platform, enabling OEMs to quickly and easily add Alexa to their product designs.
NXP’s i.MX RT106A MCU-based AVS solution (SLN-ALEXA-IOT) is available from NXP and authorized distributors as a complete kit for evaluation, development and prototyping, with a suggested resale price of $149.00 (US).
The hardware consists of two small, 30 mm x 40 mm (1.2” x 1.6”) boards. The MCU system-on-module (SoM) carries the i.MX RT106A processor, HyperFlash memory and a Wi-Fi/Bluetooth module. The audio board uses two or three low-cost, high-performance MEMS microphones and connects to a speaker driven by a smart audio amplifier.
The kit ships with software that includes everything necessary for a developer to connect to the Alexa Voice Service out-of-the-box and immediately start prototyping. This one-stop-shop software package includes far-field voice AFE processing (noise suppression, beamforming, echo cancellation and barge-in), the Amazon Wake Word Engine (WWE) and models, an AVS client application, API and all necessary drivers. Everything is provided in source code, with the exception of the AFE and WWE, making it easy for developers to port their device software onto the i.MX RT106A from their current MCU. To enable this, NXP’s AVS solution leaves 300 kB of on-chip RAM and at least 120 MHz of CPU capacity available for developers to run their software, more than enough resources for the majority of embedded applications.
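The headroom figures above can be turned into a quick budget check. The sketch below uses only numbers quoted in this article (600 MHz and 1 MB in total, 120 MHz and 300 kB left for the application); it assumes 1 MB means 1024 kB, and the `fits` helper and the example application figures are hypothetical.

```python
CPU_TOTAL_MHZ = 600    # i.MX RT106A core clock
SRAM_TOTAL_KB = 1024   # 1 MB of on-chip SRAM, assuming 1 MB = 1024 kB
CPU_FREE_MHZ = 120     # CPU capacity left for the developer's software
RAM_FREE_KB = 300      # on-chip RAM left for the developer's software

# Share of the chip that remains for the application.
cpu_share = CPU_FREE_MHZ / CPU_TOTAL_MHZ
ram_share = RAM_FREE_KB / SRAM_TOTAL_KB


def fits(app_mhz, app_ram_kb):
    """Return True if an application's needs fit within the stated headroom."""
    return app_mhz <= CPU_FREE_MHZ and app_ram_kb <= RAM_FREE_KB


# Hypothetical applications being ported from an existing MCU.
small_app_fits = fits(app_mhz=100, app_ram_kb=256)
large_app_fits = fits(app_mhz=150, app_ram_kb=100)
```

Under these assumptions, about a fifth of the CPU and a little under a third of the on-chip RAM remain for application code, which is comparable to the full resources of many MCUs used in the appliances and smart home products mentioned earlier.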
By leveraging the power of Amazon FreeRTOS and AWS IoT Core, NXP’s unique MCU-based solution for Alexa Voice Service delivers the benefits of shorter time to market, with lower BOM and lifetime costs, all on a microcontroller platform familiar to embedded developers.
More details on NXP’s MCU-based AVS solution, the i.MX RT106A MCU and the SLN-ALEXA-IOT development kit can be found at www.nxp.com/mcu-avs.