
The latest generation of NXP QorIQ processors, like older QorIQ families, comes with dedicated high-performance coprocessors that accelerate network processing by offloading the main CPU cores. Devices such as the LS1088 and LS2088 (with many others to come) are equipped with the AIOP (Advanced I/O Processor), a cluster of RISC engines optimised for flexible networking and I/O operations. While AIOP programming is closed to the general public, Borea offers custom AIOP microcode development.

The uses of the AIOP are virtually limitless. In addition to using NXP-provided microcodes to accelerate network protocol processing, we can develop entirely custom microcodes that can be downloaded into AIOP-enabled NXP chips at run time. We have been developing custom RISC microcodes for various NXP platforms since 2007.

The AIOP is a cluster of highly optimised RISC engines tailored to manipulate network packets more efficiently than GPP (general purpose processor) cores, for example ARM, using the same amount of energy and space. Depending on the chip, the cluster can contain up to 16 cores, creating a huge computational resource. This gives users of the NXP QorIQ platform a unique advantage over other solutions in at least two ways:

  • higher network processing performance
  • greater flexibility in implementing protocols and interfaces

The motivation to design a custom microcode can therefore come from either. If a user requires even higher performance than the standard NXP AIOP microcodes provide, we can develop tailored solutions for specific networking tasks. By removing the various unnecessary options designed into the standard microcodes (to support a wide user base and its requirements), a custom microcode implements more streamlined processing and does not lose performance on irrelevant operations.

Perhaps even more interesting is the second aspect of custom-designed microcodes: flexibility. The unique architecture of DPAA2, and the strategic placement of a high-performance programmable device like the AIOP between the GPP cores and the hardware interfaces, allows a higher level of flexibility than any other hardware-based solution. A QorIQ device can be programmed to route part or all of its network traffic through the AIOP. The AIOP can therefore either assist the GPP with processing or process all or most of the network traffic autonomously, without any host intervention.
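The split between traffic handled entirely inside the AIOP and traffic passed on to the GPP cores can be pictured with a small conceptual model. The sketch below is plain Python used only for illustration; it is not AIOP microcode and uses no NXP API. The `Path` enum, the `classify` function and the policy set `AUTONOMOUS_ETHERTYPES` are hypothetical names, and treating EtherType 0x88B5 (the IEEE local experimental EtherType) as "handled autonomously" is an assumption made for the example.

```python
import struct
from enum import Enum


class Path(Enum):
    AIOP_ONLY = "aiop_only"  # processed by the microcode, host never sees it
    TO_GPP = "to_gpp"        # handed over to the general purpose cores


# Hypothetical policy: EtherTypes the microcode handles without host help.
# 0x88B5 is the IEEE local experimental EtherType, used here as a stand-in
# for a custom protocol.
AUTONOMOUS_ETHERTYPES = frozenset({0x88B5})

ETH_HEADER_LEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)


def ethertype(frame: bytes) -> int:
    """Return the EtherType of an untagged Ethernet II frame."""
    if len(frame) < ETH_HEADER_LEN:
        raise ValueError("frame shorter than an Ethernet header")
    return struct.unpack_from("!H", frame, 12)[0]


def classify(frame: bytes) -> Path:
    """Decide which processing path a frame takes under the assumed policy."""
    if ethertype(frame) in AUTONOMOUS_ETHERTYPES:
        return Path.AIOP_ONLY
    return Path.TO_GPP


def make_frame(eth_type: int, payload: bytes = b"") -> bytes:
    """Build a minimal test frame with zeroed MAC addresses."""
    return bytes(12) + struct.pack("!H", eth_type) + payload
```

In this model, enlarging the policy set corresponds to routing a larger share of the traffic through the autonomous path, up to the point where the host sees none of it.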

Processing inside the AIOP is highly deterministic, with low latency and low jitter, so it can be used to implement custom networking protocols, for example various kinds of redundant industrial protocols. It can also be used to connect specific external chips to the QorIQ without any glue logic or FPGA. Many modern chips use Ethernet or some other framed protocol over SerDes, and custom microcode can be very efficient at providing the glue interface between the QorIQ and such chips. A good example is connecting DSL modem chips that speak the G.999.1 protocol, which the QorIQ does not support natively. In many cases the interface conversion logic can be removed from the board and implemented inside the AIOP.
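One way to picture such a glue interface is as a small header that is added to each frame on the way to the external chip and removed on the way back. As a rough illustration only, the Python sketch below wraps a frame in a made-up two-byte tag (port number and flags) and unwraps it again. The tag layout, `TAG_FORMAT`, `add_tag` and `strip_tag` are all hypothetical; this is not the G.999.1 frame format and not AIOP microcode.

```python
import struct
from typing import Tuple

# Hypothetical tag: one byte of port number followed by one byte of flags.
TAG_FORMAT = "!BB"
TAG_LEN = struct.calcsize(TAG_FORMAT)


def add_tag(frame: bytes, port: int, flags: int = 0) -> bytes:
    """Prepend the illustrative tag to a frame sent towards the external chip."""
    if not 0 <= port <= 0xFF or not 0 <= flags <= 0xFF:
        raise ValueError("port and flags must each fit in one byte")
    return struct.pack(TAG_FORMAT, port, flags) + frame


def strip_tag(tagged: bytes) -> Tuple[int, int, bytes]:
    """Split a tagged frame into (port, flags, original frame)."""
    if len(tagged) < TAG_LEN:
        raise ValueError("tagged frame shorter than the tag itself")
    port, flags = struct.unpack_from(TAG_FORMAT, tagged)
    return port, flags, tagged[TAG_LEN:]
```

A round trip such as `strip_tag(add_tag(frame, port=3))` returns the original frame together with the port and flags, which is the property any real conversion of this kind would need to preserve.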

In addition to being connected to the WRIOP (wire rate I/O processor, which provides hardware Ethernet connectivity), the AIOP is also connected to the rest of the QorIQ chip. It therefore has direct access to on-chip resources such as the security engine, pattern matching and DDR memory. It can autonomously place data in DDR or fetch data from it while communicating with a high-speed external peripheral. Because it can also access the PCIe bus, it can transfer data from any peripheral to PCIe and process the data at the same time.

Even if no connectivity is required, the AIOP can be beneficial. With its 16 cores on the LS2088A, it can offload computational tasks from the main ARM GPP.

Custom microcode development usually consists of the following steps:

  • requirement analysis / specification
  • hardware modification recommendations to support the microcode
  • microcode design / development
  • development of an API for the microcode
  • integration into the OS (by providing standard-compliant drivers)
  • microcode testing
  • microcode validation / certification, performance testing

We advise on which parts of the system could be implemented more efficiently on the dedicated processors (AIOP) than on the host CPU (ARM GPP) or external chips. We treat the AIOP both as a dedicated network processing element and as a cluster of generic processing elements, where the application must be correctly distributed among all cores (ARM and AIOP RISC processors) to get the best performance out of the whole system. A microcode-based application approach offers superior multi-core performance compared to simply running an SMP Linux kernel on all cores or similar software techniques.

Areas we cover:

  • system architecture development (putting the right things together)
  • development of customised microcodes
  • testing of microcodes
  • integration into other software (operating system)

Design

The main design target for our microcodes is performance. The architecture of the microcode (together with the architecture of the whole system) is designed to be optimal for the given application:

  • adaptation of the microcode to the host CPU software: we can adjust the microcode to be compatible with existing CPU software, or suggest how to change the CPU software so that it runs faster together with the microcode
  • adaptation of the microcode to external chips: we can adjust the microcode to be compatible with external chips (for example xDSL line transceivers, …), or suggest a different architecture for FPGAs hooked to the AIOP, …

Thanks to customised microcode design, we typically achieve a performance improvement of more than 100% compared to standard microcodes developed for the generic market, because we target the solution at the application. We can design the microcode from scratch and so have no dependency on NXP’s code.

Integration

Because of the complexity of operation and integration, a developed microcode is never left “alone” with the user: we provide integration support. The microcode is integrated into the user's environment through an API and a demo OS package. We design the microcode as one component of the system, a holistic approach that eases integration, simplifies conformance testing and reduces time to market. On-site or remote (via the Internet) debugging of the microcode is also provided. This helps us resolve issues in the microcode itself, or even helps you locate a problem in your hardware or the rest of the system. For example, when an FPGA is connected to our microcode, we can tweak the microcode for testing purposes to help your FPGA designer understand a problem in the FPGA.