NXP collaborating to heat up big data in a “flash”


It's scary how smart and fast the Internet has become. Massive data centers can serve up – nearly instantaneously – the answers to virtually any question I have and provide a never-ending stream of posts, pictures and videos of my friends. I easily find out the shirt I was looking at the other day is now on sale, and my favorite band is coming to town with a "click here to buy tickets now" reminder.

Behind the scenes of this instant gratification are new approaches to processing data. Among these is the use of flash storage technologies (also called non-volatile memory). This approach allows sub-microsecond access to terabytes of data, enabling data scientists to view and process much more data across distributed systems at the same time – a "have your cake and eat it too" moment.

New storage technologies and new approaches

As we consider how to best leverage flash storage technology, one thing becomes obvious: the advent of flash technology changes the established balance of processing and storage. Flash storage can offer data at rates that are over a thousand times faster than hard drives. While CPUs overall have gotten faster, they haven’t gotten a thousand times faster over the last 10 years. So, to take advantage of the faster storage, we need to rethink how we process data.
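To make that imbalance concrete, here is a back-of-the-envelope comparison. The latency figures are illustrative assumptions (roughly 10 ms per random read on a spinning disk versus roughly 10 µs on modern flash), not measurements from the platform described below:

```python
# Ballpark latency assumptions -- illustrative, not vendor specs.
HDD_READ_S = 10e-3     # ~10 ms per random read on a hard drive
FLASH_READ_S = 10e-6   # ~10 us per random read on NAND flash

READS = 1_000_000      # one million random reads, done serially

hdd_time = READS * HDD_READ_S      # seconds spent waiting on the disk
flash_time = READS * FLASH_READ_S  # seconds spent waiting on flash
speedup = hdd_time / flash_time

print(f"HDD:   {hdd_time:,.0f} s")    # hours of seeking
print(f"Flash: {flash_time:,.0f} s")  # seconds
print(f"Speedup: {speedup:.0f}x")
```

A thousandfold drop in storage latency, against CPUs that have improved far less over the same decade, is exactly why the processing side of the system has to be rethought.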

One approach that NXP and other industry leaders are exploring is the placement of multicore CPUs tightly coupled to, and distributed with, the flash storage itself. This development has led to the creation of an Intelligent Flash Storage platform that can be used as a testbed for a number of new emerging storage technologies. These include various forms of non-volatile memory, the new NVM Express protocol running over PCI Express or Ethernet Fabrics and the use of ARM®-based processors directly attached to the flash storage. These technologies introduce powerful new paradigms that provide new ways of solving big data processing problems.
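One way to see why co-locating CPUs with the flash pays off is to count the bytes that cross the interconnect when a filter runs on the host versus next to the storage. The sketch below is purely illustrative – the record layout, sizes and predicate are invented for the example, and a real system would run the filter on the ARM cores attached to the flash, not in host Python:

```python
# Sketch: host-side filtering vs. near-storage filtering.
RECORD_SIZE = 128  # assumed bytes per record on flash

records = [{"id": i, "temp": i % 100} for i in range(10_000)]

def host_side(records, predicate):
    """Ship every record to the host, then filter: moves ALL the data."""
    bytes_moved = len(records) * RECORD_SIZE
    return [r for r in records if predicate(r)], bytes_moved

def near_storage(records, predicate):
    """Filter beside the flash; only matches cross the interconnect."""
    matches = [r for r in records if predicate(r)]
    bytes_moved = len(matches) * RECORD_SIZE
    return matches, bytes_moved

hot = lambda r: r["temp"] > 90  # selective predicate (~9% of records)
host_result, moved_host = host_side(records, hot)
near_result, moved_near = near_storage(records, hot)
assert host_result == near_result  # same answer, far less data moved

print(f"host-side filter moved   {moved_host:,} bytes")
print(f"near-storage filter moved {moved_near:,} bytes")
```

The more selective the workload, the more the distributed, storage-attached CPUs reduce traffic to the host – which is the core bet behind the Intelligent Flash Storage platform.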

The platform itself is a PCI Express form-factor board that integrates the following: eight high-performance 64-bit ARM CPU cores; 40 Gbps of Ethernet network connectivity with protocol acceleration; Gen 3 PCI Express connectivity to a host processor over the standard PCI Express slot interface; and an FPGA-based flash subsystem that supports malleable connectivity to a variety of flash memory modules.

The intersection of IO and processing is especially fruitful for research

We are still in the early years of exploring how to process big data and there is much more innovation and experimentation to come. The ability to contribute to this exciting area is something to be proud of.

By bringing together teams from many of the top universities, tier-1 research labs and some of the largest data centers in the world, we are not only creating a platform, but collectively pushing the limits of intelligent storage systems.

The ability to explore new heterogeneous memory architectures will guide the industry in how storage tiering will evolve. Being able to recreate what was previously a complex system within a small solid-state drive will increase performance by more than an order of magnitude.

The multi-disciplinary industry team is meeting in Boston this week to spend a few days collaborating, presenting ideas and challenges, and sharing findings from use of the platform. The goal over the next few months is to use this industry team for real-world research on big data and distributed storage problems.

How do you think storage tiering will evolve? Where do you see the future of intelligent storage systems?


Matthew Short
In his current role as a senior marketing manager focused on storage for NXP’s Digital Networking business, Matt Short is working with key players in the industry to develop innovative new solutions. Matt has over 15 years of experience in applications engineering, systems engineering, systems architecture and product and segment marketing. He holds BSEE and MSEE degrees from the University of Texas.


  1. ademarr1 says:

    For years, using RAM for high-velocity information processing has been the default choice for organizations, in part because it was cheap and could handle the workload. But as data grows around the world, RAM has struggled to keep up with demand. Now many are beginning to turn to flash so that data can be processed and used more effectively.


  2. Manan says:

    Interesting article!
    Is a comparison against HDDs fair at a time when the case is being made for emerging storage technologies approaching memory speeds? Shouldn't the case for such "in-storage processing" systems be made by benchmarking them against DRAM speeds rather than HDDs? SSDs have already taken the leap from HDDs.

  3. NXP says:

    Thanks for your comment. One thing we are seeing is that the nature of processing is changing. Information (or, more correctly, data) is growing like crazy, and processing has finally broken through the multiprocessing barrier: it is not difficult now to apply hundreds or thousands of processors to a problem. But as the problems and systems get larger and larger, we owe it to ourselves to look for ways to more intelligently segment the workloads to reduce the overall size and power dissipation of systems. Small processors closely coupled to data stored in flash is an architecture that makes sense to study.

  4. NXP says:

    Good question. Typically, flash storage drives are compared to HDDs because they offer the same block storage interface. Newer storage-class memories are typically compared against DRAM because they offer the same memory-mapped interface as memory. What we are really trying to understand is how small, low-power processors placed very near the storage can be used to offload big data-center processors. Some of that offload may take the form of transforming block-based storage into file, object or in-memory data structures. Some may take the form of decompression or decryption. Some may take the form of sorting, parsing, reformatting or searching data. It is time to re-examine the architecture of storage in light of new technologies and workloads. The Intelligent SSD board we have built is meant to be a platform for exploring these offloads.
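    As a toy illustration of the offloads listed above (decompression plus search, behind a higher-level command interface instead of raw block reads), one could model an intelligent SSD like this. The class and method names are invented for the sketch and do not correspond to the real platform's API:

```python
import zlib

class IntelligentSSD:
    """Toy model: blocks hold compressed records, and the drive-side
    processor runs offloaded commands so the host never sees raw blocks."""

    def __init__(self):
        self.blocks = []  # each "block" stores one compressed record

    def write(self, record: bytes):
        # Drive-side compression on the write path.
        self.blocks.append(zlib.compress(record))

    def search(self, needle: bytes):
        """Offloaded search: decompress and scan next to the flash,
        returning only matching records to the host."""
        return [zlib.decompress(b) for b in self.blocks
                if needle in zlib.decompress(b)]

ssd = IntelligentSSD()
ssd.write(b"alpha,42")
ssd.write(b"beta,7")
ssd.write(b"alpha,99")
print(ssd.search(b"alpha"))  # only the two matching records cross the bus
```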

  5. AHK says:

    What role will the FPGA play in your envisioned architecture? Just glue logic between the ARMs and the flash storage? Or have you also considered the use of the FPGA as a very high-bandwidth, low-latency compute accelerator for near-storage processing itself? This is something we are currently working on (including HLL/DSL-based programmability).
