High performance and power efficiency are critical to the latest mobile devices, and AMBA® 4 ACE™ is a fundamental technology supporting ARM’s big.LITTLE processing. In case you missed the announcements, big.LITTLE technology offers an innovative way to run the ‘always on’ tasks on the highly efficient Cortex™-A7 processor, while the high-performance, responsive applications are predominantly executed on the Cortex™-A15 processor. So what does this have to do with AMBA 4? Well, AMBA 4 ACE and the CoreLink™ CCI-400 Cache Coherent Interconnect provide the critical glue that joins these processors together into a big.LITTLE multi-processing (MP) system. Let me explain…
Earlier this year ARM announced the public release of the AMBA 4 phase 2 specification, which includes ACE, the AXI Coherency Extensions. This new specification, supported by Cortex-A15, Cortex-A7 and Mali™-T604, allows hardware-managed coherency and cache sharing between these processor cores. Sharing a workload across multiple cores offers greater performance and power efficiency. Without hardware coherency, software is responsible for cache maintenance, including cleaning, flushing and invalidating caches. This takes significant processing cycles and energy, because data is cleaned out of the caches to external memory. All system architects know that external memory accesses nearly always have higher latency and consume more power than on-chip memory accesses; that’s why we have caches. The hardware coherency introduced with AMBA 4 ACE allows the different processing engines to view each other’s caches, and removes or reduces the need for cache maintenance operations.
big.LITTLE MP allows a single chip to contain two very differently sized, but fully code-compatible, processors that share the processing workload. In this case the Cortex-A7 can run the less demanding OS and background activities, and the Cortex-A15 is called in to service more demanding work such as rendering a web page. AMBA 4 ACE allows these processors to see the same view of memory, including any shared workloads. Further, a process running on the small core can migrate quickly to the large core as demand requires. The hardware coherency ensures that any cached data in the small core can be passed seamlessly to the large core without having to access external memory. Without this hardware coherency the system would need to stop, clean the small core’s caches to main memory and, only once that is complete, start the big core. This would take time and energy.
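To make the migration argument concrete, here is a minimal toy model in Python that counts external memory accesses for the two approaches described above. It is only a sketch under simplifying assumptions: the caches are plain dictionaries, every line in the small core's cache is dirty, and the names (`Dram`, `migrate_software_managed`, `migrate_hardware_coherent`, `compare`) are invented for illustration. It does not model the actual ACE protocol or the behaviour of any real core.

```python
from dataclasses import dataclass, field


@dataclass
class Dram:
    """External memory that counts every access made to it."""
    data: dict = field(default_factory=dict)
    accesses: int = 0

    def read(self, addr):
        self.accesses += 1
        return self.data.get(addr, 0)

    def write(self, addr, value):
        self.accesses += 1
        self.data[addr] = value


def dirty_little_cache(n_lines):
    """Toy LITTLE-core cache with n_lines modified lines: addr -> (value, dirty)."""
    return {addr: (addr * 10, True) for addr in range(n_lines)}


def migrate_software_managed(little, dram):
    """No hardware coherency: clean to DRAM first, then the big core refetches."""
    working_set = list(little)
    for addr, (value, dirty) in little.items():
        if dirty:
            dram.write(addr, value)  # clean the line out to external memory
    little.clear()                   # invalidate the LITTLE cache
    # The big core starts cold and misses to DRAM on every line it needs.
    return {addr: (dram.read(addr), False) for addr in working_set}


def migrate_hardware_coherent(little, dram):
    """Hardware coherency: the big core's misses are served from the LITTLE cache."""
    big = {}
    for addr in list(little):
        # Cache-to-cache transfer; dram is never touched in this toy case.
        big[addr] = little.pop(addr)
    return big


def compare(n_lines):
    """Return (DRAM accesses without coherency, with coherency, same data?)."""
    dram_sw, dram_hw = Dram(), Dram()
    big_sw = migrate_software_managed(dirty_little_cache(n_lines), dram_sw)
    big_hw = migrate_hardware_coherent(dirty_little_cache(n_lines), dram_hw)
    same_values = ({a: v for a, (v, _) in big_sw.items()} ==
                   {a: v for a, (v, _) in big_hw.items()})
    return dram_sw.accesses, dram_hw.accesses, same_values
```

With eight dirty lines, `compare(8)` returns `(16, 0, True)`: the software-managed path pays eight writes plus eight reads to external memory, while the coherent path moves the same data without touching it. Real systems sit somewhere between these extremes, but the direction of the saving is the point.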
CoreLink CCI-400 Cache Coherent Interconnect is ARM’s first implementation of AMBA 4 ACE and has been designed from the start to support big.LITTLE. It has two full ACE ports for the processor clusters, each supporting up to a quad-core Cortex-A15 or Cortex-A7, and three ACE-Lite™ ports for I/O coherent devices like the Mali-T604 and Mali-T658. This I/O coherency allows the GPU to read shared data from either the big or the LITTLE core, depending on which is running (or even both!). This could benefit a range of real-world use cases, from the User Interface (UI) running on the LITTLE core with acceleration from the GPU, through to high-performance gaming with the big core and GPU sharing the game engine, physics and rendering responsibilities. In either case a seamless, efficient view of shared memory offers the high performance and power efficiency demanded by future mobile devices and GPU computing. Hardware coherency reduces the need for cache cleaning and invalidating when sharing data with these I/O devices, which can improve I/O performance and simplify software.
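The port arrangement described above can be written down as a small sanity check for a proposed system. The following Python sketch is purely illustrative: the limits are taken from this post rather than from a datasheet, and `Cluster` and `check_topology` are hypothetical names, not part of any ARM tool or API.

```python
from dataclasses import dataclass

# Limits taken from the description above, not from a datasheet.
ACE_PORTS = 2              # full ACE ports for processor clusters
ACE_LITE_PORTS = 3         # ACE-Lite ports for I/O coherent devices
MAX_CORES_PER_CLUSTER = 4  # up to quad core on each ACE port


@dataclass(frozen=True)
class Cluster:
    name: str
    cores: int


def check_topology(clusters, io_masters):
    """Return a list of problems; an empty list means the system fits."""
    problems = []
    if len(clusters) > ACE_PORTS:
        problems.append(
            f"{len(clusters)} clusters but only {ACE_PORTS} ACE ports")
    for cluster in clusters:
        if not 1 <= cluster.cores <= MAX_CORES_PER_CLUSTER:
            problems.append(
                f"{cluster.name}: {cluster.cores} cores is outside "
                f"1..{MAX_CORES_PER_CLUSTER}")
    if len(io_masters) > ACE_LITE_PORTS:
        problems.append(
            f"{len(io_masters)} I/O masters but only "
            f"{ACE_LITE_PORTS} ACE-Lite ports")
    return problems
```

For instance, a quad-core Cortex-A15 cluster plus a quad-core Cortex-A7 cluster with a single Mali-T604 as the I/O coherent master passes the check with an empty list, whereas a five-core cluster would be reported as a problem.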
At a system level, CCI-400 is designed to integrate seamlessly with other CoreLink 400 IP, including DMC-400 for dual-channel LPDDR2/DDR2/DDR3, MMU-400 for system virtualisation and NIC-400 for connecting the rest of the system peripherals and controllers with minimal routing and cost. Throughout the CCI-400 design process, ARM’s interconnect team has been working closely with the processor and graphics teams to ensure that the CCI-400 offers the right balance of system performance and power efficiency for big.LITTLE, making the best of ARM’s joined-up story. I’m excited to see the Cortex-A7 release, and to welcome another processor to the AMBA 4 family.
For more information on CoreLink CCI-400, AMBA 4 ACE and big.LITTLE check out the following:
- AMBA 4 ACE technology webinar and white paper
- CoreLink CCI-400 information
- big.LITTLE information, blog and white paper
- CoreLink 400 family: CCI-400, DMC-400, MMU-400, NIC-400
Neil Parris, Interconnect Product Manager, ARM. Neil joined ARM in 2000 working on system architecture, consulting projects and ASIC design built around ARM processors. Following the design role Neil supported a number of major customers as part of the US Field Applications team before moving to the current role of Interconnect Product Manager. Neil holds a 1st class Masters degree in Electronic Engineering Systems from the University of Sheffield, UK.
All company and product names appearing in the ARM Blogs are trademarks and/or registered trademarks of ARM Limited per ARM’s official trademark list. All other product or service names mentioned herein are the trademarks of their respective owners.