The need to move cache management to hardware
In the old days, cache coherency management was largely done in software, invalidating large parts of the cache to ensure no stale data could get accessed, and forcing the cache to gradually be reloaded from main memory. There are several reasons why this is no longer appropriate. Caches have got very large and the penalty for off-chip access back to main memory is enormous. A large amount of data flowing through a slow interface is bad news for system performance, and off chip memory accesses have a big negative impact on power.
Caches now require more intelligence. Instead of invalidating lots of data that might turn out to be stale, the cache controllers need to invalidate on a line-by-line basis in order to ensure that anybody reading an address gets the latest value written. This even needs to be extended to devices that don't even have caches since a DMA device cannot simply go to the main memory due to delayed write-back.
Verifying the AMBA4 protocol with Jasper’s ActiveModel
Obviously these protocols are pretty complicated, so how do you verify them? It’s not just about verifying that a given RTL implementation of the protocol is good, that is the normal verification problem that formal and simulation techniques have been used on for years. It’s about verifying that the protocol itself is correct. In particular, that the caches are coherent (any reader correctly gets the last write) and deadlock-free (all operations will eventually complete, or a weaker condition, at least one operation can always make progress).
ARM discussed how they addressed this issue using Jasper’s ActiveModel table-driven formal technology. This creates a circuit that is a surrogate for the protocol. Yes, it has registers and gates but these are not something implementable, they capture the fundamental atomic operations in the protocol. Then, using standard proof techniques, this circuit can be analyzed to make sure it has the good properties that it needed.
It was a worthwhile exercise. It turned out that the original spec contained some errors. Of course they were weird corner cases, but that is what formal verification is so good at. Simulation hits the stuff you already thought of. There was one circumstance under which the whole cache infrastructure could deadlock, and another in which it was possible to get access to stale data.
There is a white paper for download, a video for viewing and a blog that discusses this very topic.
Guest Partner Blogger:
ARM welcomes its wealth of Partners in the ARM Connected Community (CC) to submit guest blogs to be published on our multiple community blogs. If interested in participating please submit email inquiries to Tell.Us@arm.com.
The ARM Connected Community (CC) is an extensive ecosystem covering all aspects of ARM processor-based design, from chip implementation through to system and device design. The CC provides a platform for collaborative innovation, with multiple types of forums for members to work with one another, and with customers, to solve industry challenges, all with the purpose of enabling designers to focus on differentiating features and an accelerated time-to-market for ARM powered solutions.
0 Comments On This Entry
Please log in above to add a comment or register for an account
Kickin’ It Up in Austin at DAC’s 50th with ARM and its Partners
on May 22 2013 08:34 PM
ARM Cortex-A57 Test Chip on TSMC 16nm FinFET Process Optimizes Tools & Flows
on May 21 2013 08:48 AM
Seven tips for ARM Accredited Engineer exam success
on May 20 2013 09:22 AM
The Server in Your Hand - and the three new interfaces inside it
on May 09 2013 08:54 PM
A DATE with Computing Destiny
on Mar 18 2013 06:57 PM