On Wed, Nov 5, 2008 at 5:27 AM,
There is no way to fine-tune the size of the bucket array: in general it is the smallest value from a list of prime numbers, each roughly double the previous, that is compatible with the specified maximum load factor. With the default maximum load factor mlf=1.0, the size of the bucket array can range approximately between n and 2n, where n is the number of elements. You can use max_load_factor(z) to set the maximum load factor slightly above 1.0 and see whether that improves the situation (that is, whether the bucket array stays at the size immediately preceding the one you have now). The member function bucket_count() gives you the size of the bucket array. Is it indeed much larger than the number of elements?
It is indeed much larger. My particular test graph has 22 attributes, and bucket_count() is 53. I investigated further, and this is because 53 is the smallest size bucket_array_base allows. Should I just stick with std::map when I know that N will be less than 53, or would modifying prime_list[] make sense?
Nevertheless, the differences in memory consumption do not look consistent to me: a std::map has an overhead of 12-16 bytes per element on 32-bit architectures (16 in most cases; 12 if some optimizations for the so-called "color" internal parameter are applied). For a hashed index the overhead (with mlf=1.0) should be between 8 and 12 bytes per element. We are missing something here. Can you provide more detailed info on how you're measuring memory consumption? Is there any aspect you might not be taking into account?
I'm using the MS-specific _CrtMemDumpStatistics function immediately after loading the graph. I've also used VTune, but that reports system-wide memory usage, so it's hard to be more detailed than "x uses more than y" with that profiler. --Michael Fawcett