This was replaced in the ARMv6 architecture by exclusive loads and stores (LDREX and STREX). These work on the principle of a monitor that exists for a location in memory and effectively tags that memory with the identity of the agent(s) trying to access it. In a spinlock implementation, an exclusive load reads data from memory and tags the location with the reader's identifier. A few instructions later, the same agent uses an exclusive store to write data back to memory. The store only succeeds if the tag is still valid, and the tag is only valid if no other agent has modified that location since the exclusive load.
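The load, work, conditional-store pattern described above can be sketched in C as a retry loop. This is only an analogy under stated assumptions: it uses GCC's __sync_bool_compare_and_swap() built-in, which checks that the value is unchanged rather than checking a monitor tag, and the function name atomic_add_retry is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: atomically add `delta` to `*addr` and return the
 * new value. Illustrates the retry pattern only; it is not LDREX/STREX. */
uint32_t atomic_add_retry(uint32_t *addr, uint32_t delta)
{
    uint32_t old_val, new_val;

    do {
        /* Step 1: read the current value (plays the role of the exclusive load). */
        old_val = *(volatile uint32_t *)addr;

        /* Step 2: a few instructions of work on the value. */
        new_val = old_val + delta;

        /* Step 3: the write only succeeds if *addr still holds old_val,
         * i.e. nobody changed it since step 1. Otherwise go round again. */
    } while (!__sync_bool_compare_and_swap(addr, old_val, new_val));

    return new_val;
}
```

If another agent writes to the location between the read and the conditional write, the write fails and the loop simply retries with the fresh value, which is the same recovery strategy an LDREX/STREX sequence uses.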
At the same time that the load and store exclusives were added to the ARM architecture, the SWP instruction was deprecated, and the architecture notes that SWP is not guaranteed to work on SMP systems. The load and store exclusives and the deprecation of the SWP instruction are described in detail in the ARM ARM.
As an aid to removing legacy SWP instructions, ARMv7 allows you to disable the SWP instruction. In ARM SMP Linux, to help find legacy uses of SWP, we disable the instruction but emulate it (via the undefined instruction trap) and log those emulations. While this adds instruction overhead, it ensures that the software operates safely. You should also be aware that the SWP instruction does not exist in the Thumb-2 instruction set, so you will see errors if you try to assemble code containing SWP as Thumb-2.
For SMP performance we want to replace the use of the SWP instruction with appropriate load and store exclusive instructions in libraries and applications. The easiest way to achieve this is to make use of the GCC compiler built-ins (described here). ARM GCC will either directly generate the correct inline code or insert a call to a kernel user helper function containing the right code.
As an example, consider the following assembly code function implementing a spin lock:
1: swp r2,r1,[r0]
This would be replaced with:
int __spin_lock(spinlock_t *lock)
{
    while (__sync_lock_test_and_set(&lock->flag, 1)); /* spin until the previous value was 0 */
}
To release the lock, you need to call another of the GCC builtins, in this case __sync_lock_release():
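A minimal sketch of such a release function follows. The spinlock_t definition is an assumption (the article only shows that it has a flag member), and the name spin_unlock_sketch is hypothetical, chosen to mirror __spin_lock().

```c
/* Assumed definition: only the `flag` member is known from the article. */
typedef struct {
    volatile int flag;   /* 0 = free, 1 = held */
} spinlock_t;

/* Hypothetical counterpart to __spin_lock(). */
void spin_unlock_sketch(spinlock_t *lock)
{
    /* Writes 0 to the flag with release semantics, so stores made while
     * holding the lock become visible before the lock appears free. */
    __sync_lock_release(&lock->flag);
}
```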
Code wishing to lock a data structure would look something like this:
// modify the locked data structure
// release the lock
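Put together, the lock, modify, release sequence might look like the sketch below. The spinlock_t layout, the shared_counter structure, and the function name are all assumptions made for illustration; only the two GCC built-ins come from the text.

```c
/* Assumed definition: only the `flag` member is known from the article. */
typedef struct {
    volatile int flag;   /* 0 = free, 1 = held */
} spinlock_t;

/* Hypothetical data structure protected by a spin lock. */
struct shared_counter {
    spinlock_t lock;
    int value;
};

void shared_counter_increment(struct shared_counter *c)
{
    /* acquire the lock: spin while the previous flag value was 1 */
    while (__sync_lock_test_and_set(&c->lock.flag, 1))
        ;

    /* modify the locked data structure */
    c->value++;

    /* release the lock */
    __sync_lock_release(&c->lock.flag);
}
```

Keeping the critical section this short matters for a spin lock, since any other agent waiting on the flag burns cycles until the release.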
The built-in functions take care of all of the details for you, including dealing with weakly ordered memory systems via memory barriers. This is only a brief introduction; for more details I suggest that you read the “Barrier Litmus Tests and Cookbook” document in ARM’s Infocenter.
In the next article, I explain how to implement spin locks in assembler and describe how memory barriers should be used.
- ARM DDI 0406B_errata_2009_Q3 (ID100209) : ARM® Architecture Reference Manual ARM®v7-A and ARM®v7-R edition
- PRD03-GENC-007826 1.0 : Barrier Litmus Tests and Cookbook
David Rusling, ARM Fellow. David was born a few weeks before Sputnik was launched. He has always liked mathematics, but America's space program, together with 'Star Trek', made him think that computers were really interesting, and so he graduated in 1982 with a degree in Computer Science. The future turns out to have fewer flashing lights than he expected. After hacking networking boxes for Digital Equipment Corporation, he got involved in the port of Linux to the Alpha processor. This gave him an abiding respect for the power of open source in general and Linux in particular. He worked on StrongARM before moving to ARM, where he added tools experience. He's an ARM Fellow, which he says "really means that I'm a techno-dweeb with a wide freedom to meddle." His official role is to set the technical direction for ARM's tools and software story.