Re: [boost] [unordered] Buffered functions?

28 Dec 2008

      on Sat Dec 27 2008, Howard Hinnant <hinnant-AT-twcny.rr.com> wrote:
...
On Dec 27, 2008, at 6:57 PM, David Abrahams wrote:
...
One thing I've always wondered about this: in that case -- at least when
you're not also trying to use the small object optimization -- why not
always use a Pimpl idiom where an empty container (with no capacity) can
be represented by a single null pointer?
I've simply had clients complain to me:  why is sizeof(vector) 4 words instead of 3?
Please make it 3.
I take it you mean that nobody has ever complained to you that
sizeof(vector) should have been one word.
...
Given that complaint, I can attempt to generate reasonable use cases
which motivate such a complaint.  But my basic premise is that those
complaining know their needs far better than I do.
I've never actually had anyone complain to me about the sizeof an
unordered_set.  However it seems logical to me that if clients are
concerned about the overhead of one container, they could be concerned
about the overhead of another (and thus any) container.
By the same token, it seems logical to me that if reducing the size of a
vector to 3 words matters, reducing it to one word matters even more ;-)
...
Concerning your suggestion about
container<unique_ptr<unordered_set<A>>> as a substitute for
container<unordered_set<A>> (or equivalent),
That's a twisty way to say it.  Just to be sure we understand one
another, I'll clarify that I never suggested about asking clients to use
the unique_ptr approach.  Rather, my suggestion was that the library use
unique_ptr (or equivalent) under the covers to minimize the size of
empty data structures.
...
my best guess is that there may be clients with use cases where the
median size of the inner container is small but non-zero, and thus the
extra heap allocation and indirection ends up costing more.
I'm guessing that most of those clients either haven't had the
opportunity to make the measurement, or else they use unique_ptr (or
equivalent) explicitly to get the effect they want.  Also, the lack, up
to now, of a conforming auto_ptr-sized thingy that can be stored in
containers has surely impacted the measurements people are able to
conveniently make.
...
Suffice it to say that from a general purpose library point of view,
more overhead is never better if you can get the equivalent
functionality/performance out of less overhead.  And historically, the
design of the std::containers has given precedence to speed and space
performance as opposed to turning basic exception safety into strong
exception safety (and I agree with that design philosophy for a
general purpose library).
Me too...
...
...
...
...
...
It seems like you think compressed pair can only compress away one
member, but IIUC you can chain them and then derive your hash table from
that to compress as much as you like.  Of course it would be nice to
have a compressed_tuple that handles it all for us, but I thikn you
_can_ do it.
I have high hopes that std::tuple will become the compressed_pair of
the future.  Technically, and especially with variadic template
language support, it is not that hard to do (tuple is a pain no matter
what without variadic template language support).  Standards-wise, the
current C++0X draft (N2798) does not outlaw, nor mandate, the empty
member optimization for tuple,
Except that there is no "empty member optimization," right?  Unless this
is one of the langauge changes I've failed to notice (and there are a
few), I think what you mean is that tuple is free to use derivation to
take full advantage of the EBO, right?
...
and I hope to keep it that way.  Recent straw polls on the LWG have
been favorable to allowing the empty member optimization for
std::tuple.
Until we get CWG to agree that the EMO exists, I doubt straw polls in
LWG can make much difference ;-)
I used the term "empty member optimization" to refer to the library technique for
optimization as opposed to a language optimization.   Your assumption regarding my
meaning referring to derivation under the covers is correct.  As long as we're on the
subject, I should warn off  naive attempts at this optimization which look like:
namespace std {
template <class T, class A>
class vector
   : private A
{
    ...
};
}  // std
Such an implementation of "EMO" is error prone as it associates the namespace of the
client-supplied A with that of the vector.  I'm not  preaching to Dave with this
statement as he has a proven track record  in this department with his implementation
of boost::noncopyable
(http://www.boost.org/doc/libs/1_37_0/libs/utility/utility.htm#Class_noncopya...
).
Probably you should be preaching to me.  That protection was added in
2004, and although I worry about unintended ADL a lot, the technique I
used there wouldn't help anyone trying to leverage EBO.  In fact overuse
of boost::noncopyable can undermine EBO for reasons pointed out in...
...
I mention this warning aimed at the widest possible audience.  Instead
one should use the techniques employed by boost::compressed_pair, or
go back to (as far as I know) the originator of this technique, Nathan
Myers (http://www.cantrip.org/emptyopt.html ).
...Nathan's article.  Nice article, Nathan!  I never saw that before.

The conundrum here is that if we use that technique for tuple, then a
tuple of empty types can never be empty itself.  I'm starting to get ill
just thinking about a generic framework for composing "logically empty"
types that undoes this "EBO erasure" effect when necessary.

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com