
I do not intend to argue about the difference between cache coherency and spatial locality; suffice it to say they are related. If the nodes of a map are scattered across the heap, traversing them will generally touch many different cache lines. If those same nodes sit together on the stack, traversing them will generally touch far fewer. This is a platform-independent fact of all modern hardware.

Now, if you have an unordered_map<string, list<Foo> > on the stack using a monotonic::allocator, using that data structure will be much faster than using the exact same structure with a std::allocator. Why? Because the first one is not doing *any memory management*. Despite claims otherwise, I challenge anyone to show me where the proposed system does a single memory allocation or memory de-allocation. All the storage is in a boost::array<char, N>. That is the allocation buffer. It is allocated by the compiler, it is on an aligned boundary, and it is probably on the stack. I am just returning newly constructed objects from within that buffer. Should I align objects within that buffer? Sure, I've already done it; it's a trivial change. But it's still not an allocation, it is a re-interpretation of space within an already-allocated buffer.

So, the first reason why my data structures are much faster than the std:: ones with the default allocator is that they do no memory allocation. No new, no delete. "Allocation" is simply incrementing a pointer; "deallocation" does nothing at all.

The second reason why containers using monotonic::allocator are faster than otherwise is that they can use the stack for storage of std:: containers. This means that you can put a map<string, list<map<int, vector<foo> > > > on the stack. All the nodes and the pointers within those nodes are also on the stack.
This means that traversing that structure is fast, and that going from the inner vector<foo> to the outer map<string, ..> is fast, as the values for both will be stored adjacent to each other on the stack. Make a new structure using the same storage, somewhere down a call chain perhaps, and it will play very nicely with the existing ones. I have a set of motivational examples in the documentation.

So, the three things that are different about this proposal are: 1. there is no memory management; 2. the stack can be used to store the nodes in a map, set or list (or unordered_map, etc.); 3. there is no memory fragmentation.

The main compelling reason to use a monotonic::vector<T> over a std::vector<T> is that the former will not fragment memory, while the latter can. The downside, of course, is that resizing a monotonic::vector is very wasteful, as the memory for the old vector is not freed. That is why, for example, the following fails:

    monotonic::storage<100*sizeof(int)> storage; //< on the stack! not on the heap, as others have suggested
    monotonic::vector<int> v(storage);
    for (int n = 0; n < 100; ++n)
        v.push_back(n); // will fail at n = 34 with an exhausted storage

but this succeeds:

    monotonic::storage<100*sizeof(int)> storage;
    monotonic::vector<int> v(100, storage); // create with an initial size and storage
    for (int n = 0; n < 100; ++n)
        v[n] = n;

As the vector in the first case expands, it reallocates, but the old buffers are not freed; their capacities accumulate inside the fixed storage, which is exhausted long before 100 elements are reached. That is why it is explicitly called a "monotonic" allocator.

On Thu, Jun 11, 2009 at 8:18 AM, Scott McMurray <me22.ca+boost@gmail.com> wrote:
> 2009/6/10 Christian Schladetsch <christian.schladetsch@gmail.com>:
>> Cache coherency is King!
>
> Well, for portability, cache obliviousness is King, since there's no way you can tune for every specific combination of memory caches, instruction caches, page caches, etc. all at the same time.
> ( http://en.wikipedia.org/wiki/Cache-oblivious_algorithm )