Nikkei Electronics Asia -- April 2007
Column--MicroTech Watch
Multicore Design Issues

Mar 28, 2007 14:15 Nikkei Electronics Asia
At a recent Linley Tech seminar in San Jose, a panel of IP experts discussed trends in system-on-chip (SoC) design using third-party CPU cores. The seminar highlighted some key trends, particularly in the use of multiple CPU cores. Speakers from Freescale, IBM, MIPS, Tensilica, and ARM participated.

More, Faster CPUs
The trend toward multicore designs was evident in the results of a survey of the seminar attendees, who were mainly designers of networking and communications SoCs. About 60% of the attendees were designing a chip with more than one CPU core, and nearly half of those were using four or more cores. Multicore designs are well suited to packet processing, particularly in the data plane. And once software is ported to run on multiple cores, the step from two CPUs to four is relatively easy.

There was some debate among the panelists about the pace of this trend. IBM's Harry Linzer reported that only a small percentage of his customers are involved in multicore designs. But Tensilica's Sumit Gupta sees many customers doing multicore designs, particularly in the data plane. This difference may reflect the two companies' products: IBM's cores are more powerful, but Tensilica's are small enough that a chip can easily include several of them.

The speed of licensed CPUs continues to increase as well. Whereas most designers were implementing 200-300MHz CPUs a couple of years ago, 77% of the attendees surveyed said their current designs use CPUs at 300MHz or above. In fact, some expect their licensed CPU to exceed 600MHz, a mark that few designs achieve today. But the newest CPUs, such as ARM's Cortex, combined with increasing usage of 90nm and even 65nm technology, are driving clock speeds to new heights.

Multithreading vs Multicore
Darren Jones of MIPS discussed the advantages of multithreading. The MIPS 34K is the only commercially available multithreaded CPU core, although other vendors are using this technology internally. Jones noted that multithreading adds only a small amount of die area to the 34K CPU while improving performance by 60% on certain EEMBC benchmarks.

Gupta pointed out that Tensilica's Diamond 570T CPU core, at 0.5mm², is so small that four of these CPUs can fit in the same space as a single 34K. The 570T runs at about half the clock speed of the MIPS 34K, but on highly parallel applications, a four-core configuration could deliver better performance than a single 34K.

Most applications don't scale linearly with multiple CPUs, however, so a four-core design won't deliver four times the single-core performance. Jones also pointed out that a multithreaded design, such as the 34K, generates better single-thread performance than a multicore design.
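As a rough illustration of the scaling point, the sketch below applies Amdahl's law to the comparison the panelists debated: four cores at about half the clock speed against one core at full clock. It is a simplified model and not anything the panelists presented. The function names are hypothetical, the parallel fractions are made-up sample values, and the model assumes each core does equal work per clock cycle and ignores multithreading, memory, and interconnect effects.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only parallel_fraction of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(parallel_fraction: float, cores: int,
                        clock_ratio: float) -> float:
    """Throughput relative to a single core running at full clock (= 1.0).

    clock_ratio is the small core's clock divided by the big core's clock.
    Assumption: equal work per clock cycle on both cores.
    """
    return clock_ratio * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Sample parallel fractions; these are illustrative, not measured.
    for p in (0.5, 0.9, 1.0):
        result = relative_throughput(p, cores=4, clock_ratio=0.5)
        print(f"parallel fraction {p:.1f}: "
              f"4 half-speed cores = {result:.2f}x one full-speed core")
```

Under these assumptions, the four-core configuration pulls ahead only when more than about two-thirds of the work can run in parallel, which is consistent with the "highly parallel applications" qualifier above.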

According to Jones, the 34K supports up to nine thread contexts to allow programmers to "park" critical routines in the CPU, avoiding the need to fetch thread state.

Dealing with Complexity
Moving from a single-core to a multicore model requires major software work. The hardware impact is debatable. Toby Foster of Freescale noted that simply combining two CPUs that have already been validated together is not difficult, particularly since modern SoCs have extensive custom logic outside the CPU that is usually the design bottleneck. In fact, Gupta argued that a multicore design is simpler than a design containing several special-purpose hardware engines; these custom engines can be replaced with off-the-shelf CPUs plus software.

Another complexity with multicore designs is the need for a high-bandwidth interconnect for the CPUs. Even an SoC with only one CPU may have high-speed memory and I/O controllers, or fixed-function accelerators, that must be efficiently interconnected.

Support from the CPU vendor is critical in a multicore design. Vendors such as ARM provide MP-validated CPUs with system-level simulation tools that can identify inter-CPU problems before fabrication. Freescale will instead design and validate a semicustom chip to meet a customer's specifications, relieving the customer of the validation work.

by Linley Gwennap