
As SoC design and
fabrication have moved to smaller lithographic geometries and higher
integration complexities, the pre-silicon phase of design
implementation has seen marked improvements in efficiency and
throughput. However, the cost- and time-saving benefits of design
automation are rapidly eroded in the post-silicon phase (Fig 1). Even
if the design is perfect, silicon validation is still required to prove
that the chip works correctly at-speed and in-system under different
operating conditions. Debugging is also necessary to resolve
hardware-software integration problems, or whenever the design turns out
to be less than flawless.
First-silicon validation and debug require a labor-intensive engineering effort of several months, and have become the least predictable and most time-consuming part (35% on average) of the development cycle of a new chip at 90nm (Fig 2). This cost, and its variability, is expected to increase at 65nm and below, both because the complexity of silicon validation grows faster than design complexity and because existing ad hoc validation methodologies cannot cope with unprecedented levels of SoC complexity.
Even the most
sophisticated SoC design methodology cannot fully account for all the
parameters that impact silicon behavior, or for all logic corner cases
that occur in the real life of a chip working at-speed and in-system.
Moreover, on designs with embedded processors, it is nearly impossible
to ensure the software and hardware will operate cohesively when merged
into first silicon.
Software integration and test issues aside, there are real hardware issues to contend with during bring-up and early integration. Consider that the simultaneous occurrence of two unlikely events may never be simulated or analyzed pre-silicon, yet may cause unexpected effects when encountered in-system. Pre-silicon verification methods (simulation, emulation, FPGA prototyping, timing analysis, and formal verification) are oblivious to many deep-submicron problems that occur in the actual device. Many other integration problems, configuration problems, and unexpected behaviors resulting from signal integrity, power, noise, or process-related issues may be similarly difficult to find pre-silicon.
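The point about two unlikely events coinciding can be made concrete with a little arithmetic. The sketch below is only an illustration: the per-cycle event probabilities, the simulation throughput, and the silicon clock rate are all assumed values chosen for the example, not figures from any measured design. It treats the two events as independent, so the wait for a coincidence is geometric with mean 1 / (p_a * p_b).

```python
def expected_cycles_to_coincidence(p_a: float, p_b: float) -> float:
    """Mean number of cycles until two independent per-cycle events coincide."""
    # Independent events coincide in a given cycle with probability p_a * p_b,
    # so the waiting time is geometric with mean 1 / (p_a * p_b).
    return 1.0 / (p_a * p_b)


def seconds_to_cover(cycles: float, cycles_per_second: float) -> float:
    """Wall-clock time needed to execute the given number of cycles."""
    return cycles / cycles_per_second


# All numbers below are illustrative assumptions, not measurements.
P_EVENT_A = 1e-6          # event A fires once per million cycles on average
P_EVENT_B = 1e-6          # event B likewise, independently of A
SIM_CYCLES_PER_SEC = 1e3  # assumed throughput of full-chip simulation
SILICON_HZ = 500e6        # assumed at-speed clock of the fabricated chip

cycles = expected_cycles_to_coincidence(P_EVENT_A, P_EVENT_B)
sim_years = seconds_to_cover(cycles, SIM_CYCLES_PER_SEC) / (365 * 24 * 3600)
silicon_minutes = seconds_to_cover(cycles, SILICON_HZ) / 60

print(f"expected cycles until coincidence: {cycles:.1e}")
print(f"simulation: about {sim_years:.0f} years")
print(f"silicon:    about {silicon_minutes:.0f} minutes")
```

Under these assumed numbers, the coincidence would take decades of simulation time to appear, but well under an hour on a chip running at-speed.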
Since perfect logic and timing verification and perfect embedded software are practically impossible to achieve pre-silicon for a complex SoC at 90nm or below, post-silicon validation has become an essential step in the design implementation methodology. The most important phase of this process is in-system, at-speed validation, which is the first opportunity for a newly manufactured chip to work in its intended environment and to interact with other chips on a system board while operating at its target frequency. In-system, at-speed validation introduces many new functional patterns and explores states not encountered during pre-silicon verification or manufacturing test. Only real-life usage under stress conditions exposes the functional and timing errors that escape pre-silicon verification. However, locating such problems in a chip working at-speed and in-system is much more difficult and far more expensive than in pre-silicon verification or on a tester.
In simulation, all SoC signals are accessible. Likewise, on a tester, the I/O pins of the chip are accessible at-speed. In a system, however, there is no longer direct at-speed access to the chip. This limitation represents a significant reduction in both observability and controllability, and makes assessing the internal dynamic behavior of the chip extremely difficult. Unlike tester-based experiments, which are fully repeatable, in-system operation is not completely deterministic because of unpredictable interactions among independent events such as external interrupts, irregular network traffic, bus arbitration, or transactions involving asynchronous clock domains. This non-determinism makes many problems appear intermittent and severely complicates silicon validation and debug.
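A toy model may help show why such problems look intermittent in-system yet repeatable on a tester. The sketch below is hypothetical throughout: the defect (a failure when an interrupt and a bus request land in the same cycle), the event rates, and the use of a random seed to stand in for event timing are all assumptions made for illustration, not a model of any real chip.

```python
import random


def run_once(rng: random.Random, cycles: int = 7_000) -> bool:
    """Return True if the hypothetical bug is triggered during one run."""
    for _ in range(cycles):
        # Two independent, asynchronous stimuli; the 1% per-cycle rates
        # are assumed values.
        interrupt = rng.random() < 0.01
        bus_request = rng.random() < 0.01
        # Hypothetical defect: the chip misbehaves only when both arrive
        # in the same cycle.
        if interrupt and bus_request:
            return True
    return False


# In-system: each run sees different event timing (a different seed).
in_system = [run_once(random.Random(seed)) for seed in range(20)]

# Tester: the same stimulus is replayed exactly (the same seed every time).
on_tester = [run_once(random.Random(7)) for _ in range(20)]

print("in-system failures:", sum(in_system), "of", len(in_system))
print("tester runs identical:", len(set(on_tester)) == 1)
```

With these rates, each in-system run has roughly an even chance of hitting the defect, so some runs fail and others pass with no change to the design. The replayed tester runs always agree with one another.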