Although known for their networking prowess, Layerscape processors are gaining traction in artificial-intelligence applications. These applications include security and surveillance, home and building automation, factory safety, and machine inspection. The reason is that Layerscape’s connectivity and general-purpose processing enable these processors to address applications where wired and wireless communication is a key requirement and where powerful multicore CPUs can tackle multiple computationally intensive tasks.
For those surprised that the networking-centric Layerscape family is considered for AI designs, I’ve got news. Layerscape executes AI algorithms quite well, and it’s a good fit for a lot of designs. On the hardware side, Layerscape combines either the efficient Cortex-A53 or the powerful Cortex-A72 CPUs from Arm with sizeable caches and DRAM bandwidth.
Figure 1 shows how key functions in a design using Layerscape for AI-based image processing can map to a Layerscape LS1043A or LS1046A processor. Cameras and radar sensors connect via USB or Ethernet. Ethernet can also connect to a WAN uplink and to the LAN (also available via PCIe-connected Wi-Fi) if this system is an edge gateway. The four CPUs handle application logic, networking functions, capture of camera and radar data, and AI-based classification of this data.
Figure 1: Mapping AI-Enabled Application to Layerscape
The software side is at least as important. Frameworks (software libraries for AI-related numerical computation) optimized for mobile and embedded devices instead of servers are coming to market, enabling performance increases. These include open-source frameworks, such as Google’s TensorFlow Lite and Tencent’s NCNN, and commercial engines like DeepView from Au-Zone. By optimizing models through judicious pruning (eliminating less-useful neural-network parameters) and quantization (e.g., mapping floating-point values to eight-bit integers), these frameworks reduce the memory and computation required to crunch models. In video analysis, the improvement shows up as 5-10x gains in frames per second.
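To make pruning and quantization concrete, here is a minimal Python sketch. It is a simplified illustration and not the scheme used by TensorFlow Lite, NCNN, or DeepView; the function names, the sample weights, and the pruning threshold are all made up for this example. It zeroes out small weights, then maps the rest to signed eight-bit integers using a single shared scale factor.

```python
def prune_by_magnitude(values, threshold):
    """Zero out weights whose magnitude falls below the threshold."""
    return [0.0 if abs(v) < threshold else v for v in values]


def quantize_int8(values):
    """Map floats to signed 8-bit integers using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    # Round to the nearest step and clamp to the symmetric int8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Illustrative weights only; not taken from any real model.
weights = [0.82, -0.03, 0.40, -1.27, 0.01, 0.65]
pruned = prune_by_magnitude(weights, threshold=0.05)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
print(q)  # [82, 0, 40, -127, 0, 65]
```

Each weight now occupies one byte instead of four, and the two pruned weights become zeros that a runtime can skip. The price is a rounding error of at most half a quantization step per weight.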
Another software approach is to bypass implementing models with generic frameworks and instead take a bespoke approach, developing models optimized for a specific hardware target. Optimizations beyond pruning and quantization (e.g., relying on the similarity among adjacent frames in a video stream to quickly find previously detected objects) can extract further performance. Companies like Pilot.AI and Invision.AI have ported their object-detection models to Layerscape, achieving movie-quality frame rates.
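The frame-similarity idea can be sketched in a few lines of Python. This is my own simplified illustration, not how Pilot.AI or Invision.AI implement it: the full detector runs only when a frame differs enough from the last frame that was analyzed, and the cached result is reused otherwise. The `toy_detect` function, the threshold, and the four-pixel "frames" are hypothetical stand-ins.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average per-pixel difference between two equally sized grayscale frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def process_stream(frames, detect, threshold=4.0):
    """Run the costly detector only when the scene has changed noticeably."""
    results = []
    full_runs = 0
    reference = None  # last frame that went through the full detector
    cached = None     # detections produced for that frame
    for frame in frames:
        if reference is None or mean_abs_diff(frame, reference) > threshold:
            cached = detect(frame)
            reference = frame
            full_runs += 1
        results.append(cached)
    return results, full_runs


def toy_detect(frame):
    """Hypothetical stand-in for a real object-detection model."""
    return ["object"] if max(frame) > 200 else []


# Four tiny made-up frames: two of an empty scene, two with a bright object.
frames = [
    [10, 10, 10, 10],
    [11, 10, 9, 10],
    [10, 250, 250, 10],
    [10, 251, 249, 11],
]
results, full_runs = process_stream(frames, toy_detect)
print(full_runs)  # 2
```

Here the detector runs on two of the four frames. A production system would more likely track previously detected objects from frame to frame than reuse results wholesale, but the saving comes from the same place.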
Invision.AI, Au-Zone, and a stealth startup with AI software optimized for edge computing and IoT endpoints recently presented their software at a webinar hosted by NXP. I urge you to view the archived webinar at http://bit.ly/DN2018AIML2 (and my related webinar at http://bit.ly/DN2018AIML1). These companies made interesting points about the cost, risk, and time-to-market advantages of performing AI on Layerscape. Companies already fielding Layerscape-based designs can add AI capability without redesigning their hardware, provided the design has CPU headroom. We’ve seen this with companies looking to add video surveillance to their enterprise access points or home automation to their residential gateways.
A system-level approach can also rationalize limited hardware resources available for retrofitting AI and streamline upgrading systems already in the field. For example, a first level of AI classification can be added to an IP camera, smart door lock, or other device, taking advantage of any available processing headroom and memory. This level can extract features or do other preliminary classification, cascading the results downstream to the associated Layerscape-powered camera headend or home-automation hub to complete the analysis process. If a deployment has insufficient resources, it need not be ripped out and replaced but instead supplemented with an adjunct Layerscape system or module for the AI functions.
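As a rough illustration of this cascade, the Python sketch below splits the work into two stages. Everything in it is hypothetical (the `Candidate` record, the motion score, the feature values, the weights, and both thresholds); it only shows the shape of the idea. A cheap first stage on the device discards uninteresting input, and the hub classifies whatever gets through.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """What an edge device forwards upstream: an ID plus extracted features."""
    device_id: str
    motion_score: float
    features: list


def edge_stage(device_id, motion_score, features, min_score=0.3):
    """First level, on the camera or lock: drop readings with little activity."""
    if motion_score < min_score:
        return None
    return Candidate(device_id, motion_score, features)


def hub_stage(candidate, weights, decision_threshold=0.5):
    """Second level, on the hub: finish classification from the features."""
    score = sum(w * f for w, f in zip(weights, candidate.features))
    return "person" if score >= decision_threshold else "no person"


# Made-up readings: (device, motion score, feature vector).
readings = [
    ("cam-1", 0.1, [0.2, 0.1]),
    ("cam-2", 0.8, [0.9, 0.7]),
    ("lock-1", 0.6, [0.1, 0.2]),
]
weights = [0.5, 0.5]

verdicts = {}
for device_id, motion_score, features in readings:
    candidate = edge_stage(device_id, motion_score, features)
    if candidate is not None:  # only forwarded candidates reach the hub
        verdicts[candidate.device_id] = hub_stage(candidate, weights)
print(verdicts)  # {'cam-2': 'person', 'lock-1': 'no person'}
```

The hub never sees the reading from cam-1, and it receives short feature vectors rather than raw video, which eases both its compute load and the network traffic.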
Figure 2 shows this approach in the context of a roadside unit (RSU). These are systems deployed throughout a smart city to help implement an intelligent transportation system (ITS). They monitor roads and intersections with various sensors and communicate with vehicles and adjacent RSUs. NXP has shown RSU demos in the past; see https://www.nxp.com/intelligentRSU. In the Figure 2 example, the vehicles, cameras, and radars preclassify the data they capture, communicating their findings to the RSU. The RSU tracks and plots vehicles and pedestrians, analyzes their motion and queuing, controls traffic signals, and communicates with other systems. That is a big load, and it would be even bigger if a first level of processing hadn’t been done near the various sensors.
Figure 2: Cascaded AI Can Play a Role in the Smart City
Regardless of the software approach taken, it’s important to keep it up to date. This is a key function of NXP’s EdgeScale suite, which takes advantage of Layerscape’s platform trust hardware to securely update AI models and firmware from a cloud-based console. See my earlier blog post on EdgeScale at https://blog.nxp.com/networking/deploying-layerscape-based-edge-computing-nodes-just-became-easier-thanks-to-nxp-edgescale and see https://www.nxp.com/edgescale for more information on how EdgeScale helps manage devices throughout their lifecycle.
The Layerscape recipe of combining processing and I/O is well suited to supporting AI. We find the Arm Cortex-A72 CPU—the workhorse used in many Layerscape processors with one to 16 cores—performs about as well as a single thread of a server-grade processor or a single core of a PC-grade processor. We’ve seen this result on benchmarks in the SPEC suite, in networking tasks, and in video compression.
The Arm Cortex-A53 CPU—the lower-cost stablemate of the Cortex-A72—works well for applications when paired with optimized software and in less-demanding situations. For example, a video surveillance system operating at only 8fps can compress this video in the H.264 format using only a single Cortex-A53 CPU, with cycles remaining for other tasks. An adjacent Cortex-A53 CPU running commercial AI software can identify bodies at this frame rate or faster.
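The arithmetic behind "cycles remaining" is simple: at 8fps, each frame has a budget of 1000 / 8 = 125 ms. The short Python sketch below works through it. Note that the 90 ms per-frame encode time is a placeholder I picked for illustration, not a measurement of a Cortex-A53.

```python
def frame_budget_ms(fps):
    """Time available to process one frame at the given frame rate."""
    return 1000.0 / fps


def cpu_share(per_frame_ms, fps):
    """Fraction of one CPU consumed by a task that takes per_frame_ms per frame."""
    return per_frame_ms / frame_budget_ms(fps)


budget = frame_budget_ms(8)        # 125.0 ms per frame at 8fps
encode_ms = 90.0                   # placeholder value, NOT a measured result
share = cpu_share(encode_ms, 8)    # portion of one core used for encoding
headroom_ms = budget - encode_ms   # time left per frame for other tasks
print(budget, headroom_ms)         # 125.0 35.0
```

Substituting a profiled per-frame time for the placeholder shows how much of a core is left over, or whether the task needs a second CPU.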
Layerscape’s abundant USB, Ethernet, and PCIe ports can connect to cameras, radar modules, and other sensors generating input to be analyzed. These I/O ports are also essential for LAN and WAN connections. It’s hard to imagine a system using AI that doesn’t also communicate. Competing processors may have useful multimedia engines but cannot match Layerscape’s interfacing options and networking performance.
In conclusion, Layerscape can support AI functions. Developers need not rely on an expensive coprocessor add-on or think their only option is a competing chip with hardware acceleration but without Layerscape’s networking and I/O or cost efficiency. Nor must one wait to implement AI. Get started today!