
Sid Sacek wrote:
Why would you need to worry about such a thing? That's the responsibility of the compiler/hardware.
Rob Stewart
Are you sure the compiler would take care of this?
I've worked on a number of different CPUs, including the SPARC and the RS/6000, and I've never seen any special assembly emitted by the compiler that would do that. I feel like there's still something missing from the picture.
How familiar are you with the term ccNUMA? As I'm not too familiar with it myself, I just did a quick lookup on Wikipedia (<http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access>):

Cache coherent NUMA (ccNUMA)

Nearly all CPU architectures use a small amount of very fast non-shared memory known as cache to exploit locality of reference in memory accesses. With NUMA, maintaining cache coherence across shared memory has a significant overhead. Although simpler to design and build, non-cache-coherent NUMA systems become prohibitively complex to program in the standard von Neumann architecture programming model. As a result, all NUMA computers sold to the market use special-purpose hardware to maintain cache coherence[citation needed], and thus class as "cache-coherent NUMA", or ccNUMA.

If this information is correct, I see little reason why you or your compiler would need to generate extra code.

Regards,
Thomas