Like bridges and huts, chips are usually designed with plenty of margin. Margin is added everywhere, from the models used to describe transistor behavior all the way up to transaction-level interactions between blocks. This margin does not come for free, however: it adds silicon area, consumes power, and reduces the achievable performance of every chip made. As processes shrink, meeting design goals is becoming increasingly challenging, and all that margin looks like fertile ground for finding the last performance boost needed to hit the target frequency. Enter SSTA (Statistical Static Timing Analysis), a technique that promises to reduce margins without compromising quality by approaching the margin problem scientifically.
Conceptually, SSTA is simple. Manufactured transistors and wires vary in performance, both across the set of all chips made and within each individual chip. Quantifying this variation allows margining to be done realistically, rather than assuming that the worst case of everything will happen simultaneously. SSTA also makes intuitive sense. How likely is it that the logic feeding a flip-flop will be made of the slowest possible transistors, while its clock is built from the fastest possible devices? Or that metal capacitance and resistance will both be high simultaneously? Especially since the latter scenario seems to require the metal to be both extra thick and extra thin at the same time. Even accounting for Murphy’s law, this sounds pretty unlikely, so SSTA has some room to work.
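To put a number on “pretty unlikely”, here is a toy Monte Carlo sketch in Python. It assumes independent Gaussian variation per stage (a simplification; real variation also has correlated components) and asks how often a ten-stage data path comes out uniformly slow while its clock buffer is simultaneously fast:

```python
import random

# Toy Monte Carlo: how often does every stage of a 10-stage data
# path land beyond +2 sigma (slow) while the clock buffer lands
# below -2 sigma (fast)? Independent Gaussian variation per stage
# is an illustrative assumption, not a real device model.
TRIALS = 1_000_000
STAGES = 10
hits = 0
for _ in range(TRIALS):
    data_all_slow = all(random.gauss(0, 1) > 2 for _ in range(STAGES))
    clock_fast = random.gauss(0, 1) < -2
    if data_all_slow and clock_fast:
        hits += 1

print(f"joint worst case seen in {hits} of {TRIALS:,} trials")
# Analytically, P(> 2 sigma) is about 0.0228 per stage, so the joint
# probability is roughly 0.0228**11 -- effectively zero. Corner-based
# timing nevertheless margins for exactly this scenario.
```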
At the moment, commercial SSTA approaches tend to focus on transistors rather than metal, so let’s concentrate on transistors. Successfully predicting the statistical behavior of delay in silicon requires Monte Carlo SPICE models, a library characterization method, and a statistical timing engine that can make sense of them. It also requires a methodology: will we use SSTA to simplify hold fixes, or to cut setup time and boost performance? How much speedup will we get? The numbers people talk about are on the order of 10%. Significant, but not astounding. Characterization costs are much higher, file sizes are much bigger, and margin is still needed – the “unknown unknowns” still lurk out there, and failing to heed them could put you in the position of the designers of the Tacoma Narrows Bridge, which collapsed four months after opening in 1940 because of aeroelastic flutter, something they had not margined for.
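The arithmetic behind that 10% is worth seeing. For purely random variation, per-stage delay sigmas add in quadrature along a path, while a worst-case corner stack adds them linearly. Here is a minimal sketch with invented numbers; it ignores the correlated component of variation, which real corner models capture by shifting all stages together, so real gains are smaller than this toy suggests:

```python
import math

# Minimal sketch: statistical vs. corner-stacked path delay for a
# chain of independent stages. All numbers here are invented.
stage_mean_ps = 50.0    # assumed nominal delay per stage
stage_sigma_ps = 5.0    # assumed per-stage delay sigma
stages = 20

mean = stages * stage_mean_ps
corner = mean + stages * 3 * stage_sigma_ps                  # 3 sigma stacked per stage
statistical = mean + 3 * math.sqrt(stages) * stage_sigma_ps  # 3 sigma on the whole path

print(f"nominal path delay : {mean:.0f} ps")
print(f"corner-stacked     : {corner:.0f} ps")
print(f"statistical 3-sigma: {statistical:.0f} ps")
print(f"margin recovered   : {100 * (corner - statistical) / corner:.1f}%")
```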
When discussing SSTA, it’s often tempting to use statistics to trade off yield against performance: if 98% of our chips will meet timing, maybe we can just keep those and ignore the other 2%. Two hidden assumptions make this problematic, however. First, we need a test methodology that can discriminate between the 98% and the 2%. The details would easily fill another post, but suffice it to say that it is harder than it sounds. Second, the mapping between timing models and yield is not nearly as pretty as we would like. It’s common in SSTA to assume a process distribution with its mean at typical silicon and +/- 3 sigma at the fast and slow corners. This makes for convenient math, but it is not how fabs work. The fab guarantees that the process will stay bounded by the various corners, but it does not guarantee how many chips will fall in any particular region, nor how much time the process will spend in any given part of process space. This means the actual number of failing chips could be significantly higher than the expected 2%. It also means that the failing chips might come all at once – like the week the chip is supposed to ramp its volume for Christmas.
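To see how fragile the yield math is, here is a small sketch using Python’s statistics.NormalDist. Under the convenient centered-Gaussian assumption, a 98% yield target maps to a fixed sigma cutoff; shift the process mean by one sigma (an invented scenario, but still comfortably within the guaranteed corners) and the same cutoff fails far more parts:

```python
from statistics import NormalDist

# Yield under the convenient assumption: process centered at typical
# silicon, unit sigma. Keep the fastest 98% of parts.
centered = NormalDist(mu=0.0, sigma=1.0)
cutoff = centered.inv_cdf(0.98)   # about +2.05 sigma
print(f"98% yield cutoff          : {cutoff:+.2f} sigma")
print(f"failing fraction, centered: {1 - centered.cdf(cutoff):.1%}")

# Same cutoff, but the fab spends a month running 1 sigma slow --
# still within the guaranteed corners, just not centered.
shifted = NormalDist(mu=1.0, sigma=1.0)
print(f"failing fraction, shifted : {1 - shifted.cdf(cutoff):.1%}")
```

The expected 2% becomes roughly 15%, without the fab violating any guarantee.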
If you’re thinking that the performance gain of SSTA is worth the effort and complexity, the ecosystem to use it is out there, but it will require some effort on your part to obtain the libraries and validate the tools. A common approach is to quantify the potential benefit on a design that did not use SSTA for sign-off, then use those results to guide the design of a future chip. If the whole thing sounds like more trouble than you’re willing to endure, you might want to try the “advanced OCV (On-Chip Variation)” approach, which derates paths differently depending on their depth and physical location (longer paths see less on-chip variation than short ones, because random fluctuations partially cancel each other out). Constructing the derating tables can be challenging, depending on how much margin you want to shave, but the results should be close to what full-fledged SSTA can achieve.
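For flavor, here is a hypothetical construction of a depth-based derate table. The random component of a path’s variation grows with the square root of its stage count rather than linearly, so the per-stage derate can shrink for deeper paths. The 10% base derate and the depths are invented; a real table would be built from characterized silicon or SPICE data, and would also fold in location-based terms:

```python
import math

# Hypothetical depth-based OCV derate table: random variation over
# N stages partially cancels (path sigma ~ sqrt(N) x per-stage sigma),
# so deeper paths can use a smaller per-stage derate. The 10% base
# derate is invented for illustration.
single_stage_derate = 0.10

print("depth  per-stage derate")
for depth in (1, 2, 4, 8, 16, 32):
    derate = single_stage_derate / math.sqrt(depth)
    print(f"{depth:5d}  {derate:9.1%}")
```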
In the end, better margins are the key to better design, and margins need to be quantified before they can be reduced. Both SSTA and improved OCV can bridge the gap between knowing there is statistical variation in timing and accounting for it correctly, but neither is a totally safe crossing just yet. If you believe otherwise, I have a bridge for sale in Brooklyn that might interest you…
Rob Aitken, ARM Fellow, spends his days in the technology trenches with nanometer scale devices and picosecond timing, looking at the circuits that eventually get put together to make smart phones or mildly clever toasters. He is a fan of all aspects of chip design, from transistors on up, and also of the various tools and methods that enable efficient, productive and successful design and manufacturing.