Heterogeneous computing -- parallel use of specialized processors alongside a CPU -- will be the basis for many future embedded systems. What can you do now to maximize the chance that your software will port to these platforms?
Ten years ago, for most IT people, heterogeneous computing implied something very close to distributed computing, namely, linking many ordinary computers to create a high-performance computing platform. But there was also a community of explorers who were speeding up floating-point-intensive calculations by finding ways to offload work from the CPU. At that time, their preferred target was hardware designed for a different purpose -- a graphics processing unit (GPU). For this group, heterogeneous computing had a much broader definition: information systems that integrate many different types of computing resources. Their CPU-GPU experiments were just the first step.
Today, the mainstream is somewhere between these two definitions but moving toward the second, more general version. The technology opening up heterogeneous computing is the smartphone because, as in other aspects of embedded systems, smartphone market volumes and investment are helping expand the frontiers of what can be achieved at affordable cost. Dismantle a smartphone (or tablet), and it’s no surprise to find GPU and DSP hardware alongside a CPU. In many cases this hardware already delivers heterogeneous computing, and it is available for use in other embedded systems.
Will heterogeneous computing capability be significant outside of the specific environment of the smartphone and tablet business? The answer, in time, will be yes. But there are a number of barriers to overcome, not the least of which is the software.
Heterogeneous computing is an example of “horses for courses.” This expression is rooted in the sport of horse racing. The horse that wins on a flat, straight course may come last on a course with bends and hills. As every engineer (or horse-race enthusiast) knows, the winning solution (or horse) depends on the problem (or course).
This principle applies to heterogeneous computing environments: Each processor type should execute software that plays to that processor's strengths. Thus, for example, a CPU will be strong when flow control in the software is complex, because a CPU's machinery -- caches, pipelining, branch prediction, and instruction pre-fetch -- means an unexpected jump in flow control is not a problem. This differs from a GPU, where peak performance arises when the GPU is executing identical instructions across a large dataset. A jump in flow control will knock a GPU off its peak-performance podium. A DSP sits somewhere between the two.
This creates a challenging problem for the software specialists on a product development team, especially in the early design stages. While the team is developing the system architecture, the software specialists have to evaluate the costs and benefits of alternative ways to structure the software. They’ll probably want to consider a traditional microcontroller CPU-centric solution alongside a possible GPU or DSP. If the functional requirements are like those of a smartphone -- that is, the product must offer some image processing, audio handling, and so on -- then a design study based on the assumption of a CPU/GPU/DSP combination would be easy to justify. But there will be many specifications where the anchor role for each processor type is much less obvious.