
Michael Fawcett wrote:
> On Wed, Nov 5, 2008 at 3:22 PM, Joaquin M Lopez Munoz wrote:
>> Can you declare some iterator "it" to the hashed index and evaluate
>> the following expression
>>
>> sizeof(*(it.get_node()));
>>
>> and compare with
>>
>> sizeof(id_attribute_pair_type)
>
> You're right. The first prints out 32, the second prints out 24.
You're on a 32-bit platform, right? Then the first sizeof() should ideally be 24 + 4 = 28, not 32. I think you're having a padding issue due to the presence of a long type in id_attribute_pair_type. Can you try changing the order in which the long and the _variant_t are declared within id_attribute_pair_type? That might avoid the padding and get you sizeof(*(it.get_node())) == 28. If not, well, we're out of luck.
> In my test graph there are 106,096 edges. Each edge has 22 attributes.
>
> In the hashed_index case, this results in a bucket_array with
> bucket_count() == 23 and 8 bytes of overhead per node, with a node (in
> my case) at 24 bytes. So that means 106,096 * 23 * 32 == 78,086,656 bytes.
Not entirely correct. The right formula is n_edges * (n_attributes * sizeof(node) + bucket_count * 4), which in our case is 106,096 * (22 * 32 + 23 * 4) == 84,452,416 bytes.
> In the std::map case, this results in 22 nodes at 12 bytes of overhead
> per node: 106,096 * 22 * 36 == 84,028,032 bytes.
Not consistent with the previous calculation and your reported saving of 1% in memory consumption :-/ Also, you might want to investigate whether the overhead for std::map is 12 bytes per element or 16 bytes per element -- the latter is more likely.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo