
On Jun 10, 2009, at 4:32 AM, Christian Schladetsch wrote:
I was wrong.
boost::monotonic::inline_storage<N> storage;
std::map<int, float, std::less<int>, boost::monotonic::allocator<int> > v(storage);
Won't even work. Storage is not the allocator. Rather, to not use the container overloads would require:
boost::monotonic::inline_storage<N> storage;
std::map<int, float, std::less<int>, boost::monotonic::allocator<int> > v(boost::monotonic::allocator<int>(storage));
Which is dramatically worse still.
Yes, it is sad that C++ does not have parameterized typedefs, like haXe (http://www.haxe.org), but you could at least abstract away the specific containers, e.g.:

template <template <class, class, class> class C, typename T>
class map : public C<T, std::less<T>, boost::monotonic::allocator<T> >
{
    typedef C<T, std::less<T>, boost::monotonic::allocator<T> > base_cont;
    ...
};

and get rid of that polymorphic "storage_base", or at least use a good default. You should really make the storage a policy instead of having to pass a "storage_base" (polymorphic) object to the constructor of your container. Also, the fact that the default constructor of your containers creates a container that throws on any addition of elements is perhaps not ideal. Even the fact that the user has to make sure that this "storage_base" object resides in a safe place for the lifespan of your container is sad.
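A self-contained version of that wrapper idea, hedged: std::set is used because its (value, compare, allocator) signature matches the three-parameter form above, and std::allocator stands in for boost::monotonic::allocator only so the sketch compiles on its own. The name `assoc` is an assumption; the point is just that the allocator is injected in exactly one place.

#include <set>
#include <functional>
#include <memory>

template <template <class, class, class> class C,
          typename T,
          template <class> class Alloc = std::allocator>
class assoc : public C<T, std::less<T>, Alloc<T> >
{
public:
    typedef C<T, std::less<T>, Alloc<T> > base_cont;
};

int main()
{
    assoc<std::set, int> s;   // std::less and the allocator are supplied once, here
    s.insert(42);
    return 0;
}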
On Wed, Jun 10, 2009 at 8:16 PM, Christian Schladetsch < christian.schladetsch@gmail.com> wrote:
Hi David,
auto_buffer is, in fact, related, inasmuch as it also attempts to use the
stack for storage.
OK, so you could use Thorsten's auto_buffer as your storage, which could actually give you the best of both worlds (allocation speed and, the angle you seem to push, cache locality), plus dynamic heap growth beyond that initial segment.
Cache coherency is King!
Yes, I have noticed that you push that angle very hard. Three comments:

1. In most high-performance applications I have written, the number of re-allocations of std::vector (or similar abstractions in other languages) has been quite low, and the cache hit ratio pretty good, with that quite firm contiguous block. You seem to think that there is something magical about stack space relative to heap space here. Can you explain that in more detail: i.e., why is cache coherency for a fixed block, such as your "internal_storage", so much higher than for an equally fixed block on the heap (using std::vector)? I ask because you labelled the use of the latter as silly and a no-go in high-performance applications. To paraphrase, "nobody writing high-speed applications would ever consider using std::vector". (See the small sketch after this comment.)

2. Why is cache coherency so much better when you use an inline array than when Thorsten does it? He also allocates inline (at first), and you only have a first, anyway. So, can you please explain why cache coherency is so much more King in your

internal_storage<1000> storage;
vector<Foo> chrisCont(storage); // and let us just pray that 'storage' survives this ;-)

than in Thorsten's

auto_buffer<Foo, store_n_bytes<1000> > thorCont;

?

NOTE: I know that your contiguous block could potentially span various containers, which *could be* a good thing for cache coherence, but in real life there is often *one* container involved in a given high-performance task. What is the use case you had in mind with more than one container, while keeping good cache coherence? It would have to involve a lot of activity in a specific region of that block (i.e., basically working on one container), or deal with a tiny block, which would make it less viable for most high-performance applications. And there could not be many reallocations by the container class (such as growing).

3. Corollary: did you even look at Thorsten's auto_buffer before exclaiming that "cache coherency is King"?
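Small sketch of the point in (1): a std::vector reserved up front is also one fixed, contiguous block, just on the heap, so walking it is every bit as cache-friendly as walking a stack buffer of the same size.

#include <vector>
#include <numeric>
#include <cstdio>

int main()
{
    std::vector<float> v;
    v.reserve(1000);                      // single contiguous heap block, no reallocations below
    for (int i = 0; i < 1000; ++i)
        v.push_back(static_cast<float>(i));

    float sum = std::accumulate(v.begin(), v.end(), 0.0f);
    std::printf("%f\n", sum);
    return 0;
}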
auto_buffer<T> is not analogous to monotonic::inline_storage<N>
However, boost::monotonic provides an allocator that means that *all* STL
containers - and other ones - can use the stack for storage. Or the heap.
NOTE: your "storage_base" is not a base abstraction at all, but a "heap_storage"; having a non-leaf class with concrete behavior that is totally different from its leaf child is less than beautiful.
I agree that the current implementation is not ideal, but I don't follow what you mean here. storage_base may be ugly, but it is also required so that different containers can use the same storage.
What I mean is that if you look at the inheritance graph:

    storage_interface (abstract)
              |
    storage_base (allocates on heap)
              |
    internal_storage (allocates inline, or stack...)

it is weird that "storage_base" (i) is concrete and (ii) does a completely different thing than the leaf, "internal_storage"; and (iii) the name does not imply its behavior at all; it should be "heap_storage", and the graph should look like this:

    storage_interface (abstract)
              |
    storage_base (whatever concrete stuff is common for all or most storages)
        /                \
    heap_storage    internal_storage

I would actually merge the two abstract types, but that is just me.
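To make that suggested shape concrete, here is a rough, hedged sketch of the second graph in code, following the merge of the two abstract types mentioned at the end. All names other than storage_interface are suggestions, the bodies are placeholders, and there is no alignment, overflow or deallocation handling.

#include <cstddef>
#include <new>

class storage_interface                              // abstract
{
public:
    virtual ~storage_interface() {}
    virtual void *allocate(std::size_t n) = 0;
};

class heap_storage : public storage_interface        // what is currently called "storage_base"
{
public:
    void *allocate(std::size_t n) { return ::operator new(n); }
};

template <std::size_t N>
class inline_storage : public storage_interface      // fixed buffer, on the stack or inline
{
    char buffer[N];
    std::size_t used;
public:
    inline_storage() : used(0) {}
    void *allocate(std::size_t n)
    {
        void *p = buffer + used;                     // sketch only: no alignment or overflow check
        used += n;
        return p;
    }
};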
I agree that it could be improved. I had a lot of trouble with the more fundamental problems of std::allocator.
Why is the choice between heap storage ("storage_base"... ugh) and the stack (if not embedded in a heap-allocated object...) decided at runtime, via that polymorphic "storage_interface"? It is fixed at the creation of the allocator, so why not make it a type parameter of the allocator, with a default of, say, "inline_storage"? Do you think it is important for two containers to share the exact same type even when one uses "storage_base" and the other "inline_storage"?
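A hedged sketch of what that type parameter could look like: the storage is chosen when the allocator type is formed, with a default, instead of going through a virtual interface at runtime. The names are illustrative, and the rebind/construct/destroy boilerplate a real C++03 allocator needs is omitted.

#include <cstddef>
#include <new>

// Default storage policy: plain heap allocation. An inline_storage<N>-style
// policy with the same allocate() signature could equally be the default.
struct heap_policy
{
    void *allocate(std::size_t n) { return ::operator new(n); }
};

template <typename T, typename Storage = heap_policy>
struct policy_allocator
{
    typedef T value_type;

    Storage *storage;

    explicit policy_allocator(Storage &s) : storage(&s) {}

    T *allocate(std::size_t n)
    {
        return static_cast<T *>(storage->allocate(n * sizeof(T)));
    }
    void deallocate(T *, std::size_t) {}             // monotonic: no per-object free
};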
I agree that there are some ugly aspects to the current proposal. However, it also works as advertised which is very important.
I do not know what was advertised exactly, but (i) it does crash with a bus error when the container is default-constructed without explicitly setting a storage, and (ii) it does crash on certain platforms and use cases due to misaligned memory blocks. [I will comment more on this in another reply to you]

Besides, it does not even compile on GCC with strict error checking. And your 'map' does not work at all, since the allocator is given the key type instead of the proper 'pair<key, value>' which the standard container expects. Oh well, by skipping the 'map' and changing the 'P' to 'p' in a few instances, I got it to compile.

What is the meaning of your "New" function in your allocator? It does some funky back-stepping, repeating a construction.

GCC NOTE: on GCC 4.3 and 4.4, INT_MAX is not defined in your code (presumably a missing <climits>), whereas on GCC 4.0 it is.
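To make the misalignment point in (ii) concrete, here is a generic, hedged sketch (not the library's code) of the rounding step needed when carving allocations out of a raw char buffer; a real implementation would derive the alignment from something like boost::alignment_of<T> and check for overflow.

#include <cstddef>

inline std::size_t align_up(std::size_t value, std::size_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);   // alignment must be a power of two
}

template <std::size_t N>
struct aligned_inline_storage
{
    char buffer[N];
    std::size_t used;

    aligned_inline_storage() : used(0) {}

    void *allocate(std::size_t bytes, std::size_t alignment)
    {
        // Round the address, not just the offset, since buffer itself may
        // start at an arbitrary address. No overflow check in this sketch.
        std::size_t addr    = reinterpret_cast<std::size_t>(buffer) + used;
        std::size_t aligned = align_up(addr, alignment);
        used += (aligned - addr) + bytes;
        return reinterpret_cast<void *>(aligned);
    }
};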
I hate the current implementation's basis of using a virtual base class. It is anathema to me. I only presented it as it stands because I was more concerned about adoption of the general idea, rather than the smaller and less-relevant details of the implementation. I think most of us could write this from scratch with no problem, and without virtuals.
Point being, if the idea is adopted, then yes there are optimisations and beautifying to do.
That said, you may have also missed the point that multiple containers can use the same storage - which may be on the heap or the stack.
No, I did not miss that.
In that respect, auto_buffer<T> is a subset of the proposed boost::monotonic::vector<T>.
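For concreteness, the sharing referred to above would look like this with the proposed interface; monotonic::list is assumed to exist alongside the vector and map wrappers already shown, so treat this as a sketch rather than tested code.

boost::monotonic::inline_storage<1000> storage;
boost::monotonic::vector<int> v(storage);   // both containers allocate
boost::monotonic::list<float> l(storage);   // out of the same fixed block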
I still fail to see why you wrap all those containers, instead of providing
just that allocator. Is the only reason the one I just mentioned, to keep the storage mechanism variable ("storage_base" or "internal_storage") while keeping the container types equal? That is the only somewhat logical rationale I can see. Regarding injecting stuff, such as your allocator, into a template class higher up the hierarchy (such as std::vector), you should have a look at my proposal for a "type injector": http://blog.davber.com/2006/07/24/hierarchy-generation-in-c/
I will consider dropping the idea of including the containers on top of the allocator.
The 'wrapping of the containers' is very small, very straightforward, only requires the passing-forward of construction parameters, and is completely safe as none of them have any state outside of their base classes.
Inheriting from the std::containers also means that existing code that pattern-matches against the std::containers will also match against monotonic::containers.
I do not follow you.
They are functionally equivalent,
So, you can replace 'std::vector' with 'boost::monotonic::vector', perhaps with a textual replace all?
and in-memory equivalent. They are the same thing, except that monotonic::containers must be given either an allocator or storage from which to make one.
Ah, yes, so they are not functionally equivalent per se.
The fact that STL containers do not have virtual dtors is not relevant. There is no state in the proposed monotonic containers that is not in std::containers.
There is a case for:
boost::monotonic::inline_storage<N> storage;
std::vector<int, boost::monotonic::allocator<int> > v(storage); // (A)

versus:

boost::monotonic::inline_storage<N> storage;
boost::monotonic::vector<int> v(storage); // (B)

but what about:

boost::monotonic::inline_storage<N> storage;
boost::monotonic::map<int, float> v(storage);

versus:

boost::monotonic::inline_storage<N> storage;
std::map<int, float, std::less<int>, boost::monotonic::allocator<int> > v(storage);
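For reference, and echoing the earlier complaint about the allocator's value type: the strictly conforming spelling of that last std::map form takes an allocator for the map's value_type, std::pair<const int, float>, and in C++03 the comparator then has to be passed to the constructor as well. This mirrors the snippets above and assumes the proposed monotonic interfaces; it is a sketch, not tested code.

boost::monotonic::inline_storage<N> storage;
boost::monotonic::allocator<std::pair<const int, float> > alloc(storage);
std::map<int, float, std::less<int>,
         boost::monotonic::allocator<std::pair<const int, float> > > m(std::less<int>(), alloc);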
My main concern would be that people just aren't comfortable with the idea of using custom allocator types, but they will accept the idea of needing to pass the storage in once they realised that they needed a monotonic container.
The people you target, those who would find what is (in my opinion) the proper way of doing it so much hairier than your "wrapped" way, would also assume that they can use the container default-constructed.
There are other important issues, like having to give the predicate type for the associative containers first.
These issues are not important, in my opinion.
If I didn't think it was important, I wouldn't have proposed the library in the first place.
Alas, it is intrinsically important, QED?
Given that I think that it's important, it is also important to me that the usage is clean. (A) above may be `better`, but (B) is cleaner.
I think usage that *requires* an explicit storage object is less than clean. You should at least default to a specific storage, worst case heap-allocating it, since you do not want to slice a concrete storage object.

I like some aspects of your idea, which is why I take the time to go through your code and run some samples. I think Thorsten's auto_buffer covers most cases in performance-critical applications, but I like the idea of extending it to maps and lists. So get rid of those wrappers, tidy up the allocators a bit, and see how you can integrate this with Thorsten's effort, and we might have ourselves a winner.

/David
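Since this comes up repeatedly (the bus error on default construction, and the request above to default to some storage), here is one hedged way such a default could look: a default-constructed allocator falls back to a process-wide heap storage, so a default-constructed container simply works. All names are illustrative, this is not the proposed library's interface, and the std::allocator boilerplate (rebind, construct, destroy) is again omitted.

#include <cstddef>
#include <new>

struct heap_storage
{
    void *allocate(std::size_t n) { return ::operator new(n); }
};

// Shared fallback used whenever no explicit storage is supplied.
inline heap_storage &default_heap_storage()
{
    static heap_storage instance;
    return instance;
}

template <typename T>
struct fallback_allocator
{
    heap_storage *storage;

    fallback_allocator() : storage(&default_heap_storage()) {}   // safe default
    explicit fallback_allocator(heap_storage &s) : storage(&s) {}

    T *allocate(std::size_t n)
    {
        return static_cast<T *>(storage->allocate(n * sizeof(T)));
    }
    void deallocate(T *, std::size_t) {}
};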