
On Jun 10, 2009, at 4:32 AM, Christian Schladetsch wrote:
I was wrong.
boost::monotonic::inline_storage<N> storage;
std::map<int, float, std::less<int>, boost::monotonic::allocator<int> > v(storage);
Won't even work. Storage is not the allocator. Rather, to not use the container overloads would require:
boost::monotonic::inline_storage<N> storage;
std::map<int, float, std::less<int>, boost::monotonic::allocator<int> > v(boost::monotonic::allocator<int>(storage));
Which is dramatically worse still.
Yes, it is sad that C++ does not have parameterized typedefs, like haXe (http://www.haxe.org), but you could at least abstract away the specific containers, e.g.:

template <template <class, class, class> class C, typename T>
class map : public C<T, std::less<T>, boost::monotonic::allocator<T> >
{
    typedef C<T, std::less<T>, boost::monotonic::allocator<T> > base_cont;
    ...
};

and get rid of that polymorphic "storage_base", or at least use a good default. You should really make the storage a policy instead of having to pass a "storage_base" (polymorphic) object to the constructor of your container. Also, the fact that the default constructor of your containers creates a container that throws on any addition of elements is perhaps not ideal. Even the fact that the user has to make sure that this "storage_base" object resides in a safe place for the lifespan of your container is sad.
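A self-contained version of that wrapper idea, hedged: std::set is used because its (value, compare, allocator) signature matches the three-parameter form above, and std::allocator stands in for boost::monotonic::allocator only so the sketch compiles on its own. The name `assoc` is an assumption; the point is just that the allocator is injected in exactly one place.

#include <set>
#include <functional>
#include <memory>

template <template <class, class, class> class C,
          typename T,
          template <class> class Alloc = std::allocator>
class assoc : public C<T, std::less<T>, Alloc<T> >
{
public:
    typedef C<T, std::less<T>, Alloc<T> > base_cont;
};

int main()
{
    assoc<std::set, int> s;   // std::less and the allocator are supplied once, here
    s.insert(42);
    return 0;
}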
On Wed, Jun 10, 2009 at 8:16 PM, Christian Schladetsch < christian.schladetsch@gmail.com> wrote:
Hi David,
auto_buffer is, in fact, related, inasmuch as it also attempts to use the
stack for storage.
OK, so you could use Thorsten's auto_buffer as your storage, which could actually give you the best of both worlds (allocation speed and, the angle you seem to push, cache locality), plus dynamic heap growth beyond that initial segment.
Cache coherency is King!
Yes, I have noticed that you push that angle very hard. Three comments:

1. In most high-performance applications I have written, the number of re-allocations of std::vector (or similar abstractions in other languages) has been quite low, and the cache hit ratio pretty good, with that quite firm contiguous block. You seem to think that there is something magical about stack space relative to heap space here. Can you explain that in more detail: i.e., why is cache coherency for a fixed block, such as your "internal_storage", so much higher than for an equally fixed block on the heap (using std::vector)? I ask because you labelled the use of the latter as silly and a no-go in high-performance applications. To paraphrase, "nobody writing high-speed applications would ever consider using std::vector". (See the small sketch after this comment.)

2. Why is cache coherency so much better when you use an inline array than when Thorsten does it? He also allocates inline (at first), and you only have a first, anyway. So, can you please explain why cache coherency is so much more King in your

internal_storage<1000> storage;
vector<Foo> chrisCont(storage); // and let us just pray that 'storage' survives this ;-)

than in Thorsten's

auto_buffer<Foo, store_n_bytes<1000> > thorCont;

?

NOTE: I know that your contiguous block could potentially span various containers, which *could be* a good thing for cache coherence, but in real life there is often *one* container involved in a given high-performance task. What is the use case you had in mind with more than one container, while keeping good cache coherence? It would have to involve a lot of activity in a specific region of that block (i.e., basically working on one container), or deal with a tiny block, which would make it less viable for most high-performance applications. And there could not be many reallocations by the container class (such as growing).

3. Corollary: did you even look at Thorsten's auto_buffer before exclaiming that "cache coherency is King"?
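Small sketch of the point in (1): a std::vector reserved up front is also one fixed, contiguous block, just on the heap, so walking it is every bit as cache-friendly as walking a stack buffer of the same size.

#include <vector>
#include <numeric>
#include <cstdio>

int main()
{
    std::vector<float> v;
    v.reserve(1000);                      // single contiguous heap block, no reallocations below
    for (int i = 0; i < 1000; ++i)
        v.push_back(static_cast<float>(i));

    float sum = std::accumulate(v.begin(), v.end(), 0.0f);
    std::printf("%f\n", sum);
    return 0;
}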
auto_buffer<T> is not analogous to monotonic::inline_storage<N>
However, boost::monotonic provides an allocator that means that *all* STL
containers - and other ones - can use the stack for storage. Or the heap.
NOTE: your "storage_base" is not a base abstraction at all, but a "heap_storage"; having a non-leaf class with concrete behavior that is totally different from its leaf child is less than beautiful.
I agree that the current implementation is not ideal, but I don't follow what you mean here. storage_base may be ugly, but it is also required so that different containers can use the same storage.
What I mean is that if you look at the inheritance graph:

    storage_interface (abstract)
              |
    storage_base (allocates on heap)
              |
    internal_storage (allocates inline, or stack...)

it is weird that "storage_base" (i) is concrete and (ii) does a completely different thing than the leaf, "internal_storage"; and (iii) the name does not imply its behavior at all; it should be "heap_storage", and the graph should look like this:

    storage_interface (abstract)
              |
    storage_base (whatever concrete stuff is common for all or most storages)
        /                \
    heap_storage    internal_storage

I would actually merge the two abstract types, but that is just me.
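To make that suggested shape concrete, here is a rough, hedged sketch of the second graph in code, following the merge of the two abstract types mentioned at the end. All names other than storage_interface are suggestions, the bodies are placeholders, and there is no alignment, overflow or deallocation handling.

#include <cstddef>
#include <new>

class storage_interface                              // abstract
{
public:
    virtual ~storage_interface() {}
    virtual void *allocate(std::size_t n) = 0;
};

class heap_storage : public storage_interface        // what is currently called "storage_base"
{
public:
    void *allocate(std::size_t n) { return ::operator new(n); }
};

template <std::size_t N>
class inline_storage : public storage_interface      // fixed buffer, on the stack or inline
{
    char buffer[N];
    std::size_t used;
public:
    inline_storage() : used(0) {}
    void *allocate(std::size_t n)
    {
        void *p = buffer + used;                     // sketch only: no alignment or overflow check
        used += n;
        return p;
    }
};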
I agree that it could be improved. I had a lot of trouble with the more fundamental problems of std::allocator.
Why is the choice between heap storage ("storage_base"... ugh) and the stack (if not embedded in a heap-allocated object...) decided at runtime, via that polymorphic "storage_interface"? It is fixed at the creation of the allocator, so why not make it a type parameter of the allocator, with a default of, say, "inline_storage"? Do you think it is important for two containers to share the exact same type even when one uses "storage_base" and the other "inline_storage"?
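A hedged sketch of what that type parameter could look like: the storage is chosen when the allocator type is formed, with a default, instead of going through a virtual interface at runtime. The names are illustrative, and the rebind/construct/destroy boilerplate a real C++03 allocator needs is omitted.

#include <cstddef>
#include <new>

// Default storage policy: plain heap allocation. An inline_storage<N>-style
// policy with the same allocate() signature could equally be the default.
struct heap_policy
{
    void *allocate(std::size_t n) { return ::operator new(n); }
};

template <typename T, typename Storage = heap_policy>
struct policy_allocator
{
    typedef T value_type;

    Storage *storage;

    explicit policy_allocator(Storage &s) : storage(&s) {}

    T *allocate(std::size_t n)
    {
        return static_cast<T *>(storage->allocate(n * sizeof(T)));
    }
    void deallocate(T *, std::size_t) {}             // monotonic: no per-object free
};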
I agree that there are some ugly aspects to the current proposal. However, it also works as advertised which is very important.
I do not know what was advertised exactly, but (i) it does crash with a bus error when the container is default-constructed without explicitly setting a storage, and (ii) it does crash on certain platforms and use cases due to misaligned memory blocks. [I will comment more on this in another reply to you]

Besides, it does not even compile on GCC with strict error checking. And your 'map' does not work at all, since the allocator is given the key type instead of the proper 'pair<key, value>' which the standard container expects. Oh well, by skipping the 'map' and changing the 'P' to 'p' in a few instances, I got it to compile.

What is the meaning of your "New" function in your allocator? It does some funky back-stepping, repeating a construction.

GCC NOTE: on GCC 4.3 and 4.4, INT_MAX is not defined in your code (presumably a missing <climits>), whereas on GCC 4.0 it is.
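To make the misalignment point in (ii) concrete, here is a generic, hedged sketch (not the library's code) of the rounding step needed when carving allocations out of a raw char buffer; a real implementation would derive the alignment from something like boost::alignment_of<T> and check for overflow.

#include <cstddef>

inline std::size_t align_up(std::size_t value, std::size_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);   // alignment must be a power of two
}

template <std::size_t N>
struct aligned_inline_storage
{
    char buffer[N];
    std::size_t used;

    aligned_inline_storage() : used(0) {}

    void *allocate(std::size_t bytes, std::size_t alignment)
    {
        // Round the address, not just the offset, since buffer itself may
        // start at an arbitrary address. No overflow check in this sketch.
        std::size_t addr    = reinterpret_cast<std::size_t>(buffer) + used;
        std::size_t aligned = align_up(addr, alignment);
        used += (aligned - addr) + bytes;
        return reinterpret_cast<void *>(aligned);
    }
};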
I hate the current implementation's basis of using a virtual base class. It is anathema to me. I only presented it as it stands because I was more concerned about adoption of the general idea, rather than the smaller and less-relevant details of the implementation. I think most of us could write this from scratch with no problem, and without virtuals.
Point being, if the idea is adopted, then yes there are optimisations and beautifying to do.
That said, you may have also missed the point that multiple containers can use the same storage - which may be on the heap or the stack.
No, I did not miss that.
In that respect, auto_buffer<T> is a subset of the proposed boost::monotonic::vector<T>.
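For concreteness, the sharing referred to above would look like this with the proposed interface; monotonic::list is assumed to exist alongside the vector and map wrappers already shown, so treat this as a sketch rather than tested code.

boost::monotonic::inline_storage<1000> storage;
boost::monotonic::vector<int> v(storage);   // both containers allocate
boost::monotonic::list<float> l(storage);   // out of the same fixed block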
I still fail to see why you wrap all those containers, instead of providing
just that allocator. Is the only reason the one I just mentioned, to keep the storage mechanism variable ("storage_base" or "internal_storage") while keeping the container types equal? That is the only somewhat logical rationale I can see. Regarding injecting stuff, such as your allocator, into a template class higher up the hierarchy (such as std::vector), you should have a look at my proposal for a "type injector": http://blog.davber.com/2006/07/24/hierarchy-generation-in-c/
I will consider dropping the idea of including the containers on top of the allocator.
The 'wrapping of the containers' is very small, very straightforward, only requires the passing-forward of construction parameters, and is completely safe as none of them have any state outside of their base classes.
Inheriting from the std::containers also means that existing code that pattern-matches against the std::containers will also match against monotonic::containers.
I do not follow you.
They are functionally equivalent,
So, you can replace 'std::vector' with 'boost::monotonic::vector', perhaps with a textual replace all?
and in-memory equivalent. They are the same thing, except that monotonic::containers must be given either an allocator or storage from which to make one.
Ah, yes, so they are not functionally equivalent per se.
The fact that STL containers do not have virtual dtors is not relevant. There is no state in the proposed monotonic containers that is not in std::containers.
There is a case for:
boost::monotonic::inline_storage<N> storage;
std::vector<int, boost::monotonic::allocator<int> > v(storage); // (A)

versus:

boost::monotonic::inline_storage<N> storage;
boost::monotonic::vector<int> v(storage); // (B)

but what about:

boost::monotonic::inline_storage<N> storage;
boost::monotonic::map<int, float> v(storage);

versus:

boost::monotonic::inline_storage<N> storage;
std::map<int, float, std::less<int>, boost::monotonic::allocator<int> > v(storage);
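For reference, and echoing the earlier complaint about the allocator's value type: the strictly conforming spelling of that last std::map form takes an allocator for the map's value_type, std::pair<const int, float>, and in C++03 the comparator then has to be passed to the constructor as well. This mirrors the snippets above and assumes the proposed monotonic interfaces; it is a sketch, not tested code.

boost::monotonic::inline_storage<N> storage;
boost::monotonic::allocator<std::pair<const int, float> > alloc(storage);
std::map<int, float, std::less<int>,
         boost::monotonic::allocator<std::pair<const int, float> > > m(std::less<int>(), alloc);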
My main concern would be that people just aren't comfortable with the idea of using custom allocator types, but they will accept the idea of needing to pass the storage in once they realised that they needed a monotonic container.
The people you target, those who would find what is (in my opinion) the proper way of doing it so much hairier than your "wrapped" way, would also assume that they can use the container default-constructed.
There are other important issues, like having to give the predicate type for the associative containers first.
These issues are not important, in my opinion.
If I didn't think it was important, I wouldn't have proposed the library in the first place.
Alas, it is intrinsically important, QED?
Given that I think that it's important, it is also important to me that the usage is clean. (A) above may be `better`, but (B) is cleaner.
I think usage that *requires* an explicit storage object is less than clean. You should at least default to a specific storage, worst case heap-allocating it, since you do not want to slice a concrete storage object.

I like some aspects of your idea, which is why I take the time to go through your code and run some samples. I think Thorsten's auto_buffer covers most cases in performance-critical applications, but I like the idea of extending it to maps and lists. So get rid of those wrappers, tidy up the allocators a bit, and see how you can integrate this with Thorsten's effort, and we might have ourselves a winner.

/David
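Since this comes up repeatedly (the bus error on default construction, and the request above to default to some storage), here is one hedged way such a default could look: a default-constructed allocator falls back to a process-wide heap storage, so a default-constructed container simply works. All names are illustrative, this is not the proposed library's interface, and the std::allocator boilerplate (rebind, construct, destroy) is again omitted.

#include <cstddef>
#include <new>

struct heap_storage
{
    void *allocate(std::size_t n) { return ::operator new(n); }
};

// Shared fallback used whenever no explicit storage is supplied.
inline heap_storage &default_heap_storage()
{
    static heap_storage instance;
    return instance;
}

template <typename T>
struct fallback_allocator
{
    heap_storage *storage;

    fallback_allocator() : storage(&default_heap_storage()) {}   // safe default
    explicit fallback_allocator(heap_storage &s) : storage(&s) {}

    T *allocate(std::size_t n)
    {
        return static_cast<T *>(storage->allocate(n * sizeof(T)));
    }
    void deallocate(T *, std::size_t) {}
};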