Jeremy Murphy
OK. I might have missed it, but I can't see in the docs an explanation of how to choose L and M for a given data structure and cache line size. Is there a formula? ... I guess I'm concerned that these L and M values limit the ease of portability, by which I mean the additional effort required to consider the performance effects of the parameters on one machine or another is very different to the ease of use of std::vector.
Jeremy, I proposed good default values (L=30, M=60), which can however be changed. The default values are good for most PCs. I performed experiments on my laptop and found that these values sit on a plateau: if you halve or double L or M, the result changes by no more than 7 to 10%. However, if you decrease them strongly (L=4, M=4), you can lose 50% of the performance, and if you increase them strongly (M=500), insertion/deletion becomes slow. ___ To be brief, you should use the default values ___. (Compare this with vector behavior: you use the default growth strategy, but you can change it if you want.)

If you want the best performance, you should take into account not only the machine, but also the operations your application performs, the size of the structures you store, the memory manager, and the typical size of the container. If the application mostly reads/accesses, a large M is preferable. If it mostly inserts/deletes, a small M is preferable. The default L is good enough unless you want to split/merge containers or perform bulk insertions/deletions (in that case L should be slightly increased). A large L is slightly better for big containers. Not to mention memory allocation: if a ptree_seq contains 1 element, it still allocates a whole leaf of M+ elements. So mobile apps may prefer smaller L and M to save memory. ___ To be brief, if you want 100% performance, you should tune L and M during the profiling stage, when the whole application is complete ___.
Well, the point of the bibliography is primarily for anyone looking at your data structure to understand why it is designed that way. A specific reference to a textbook or paper allows us to see that and also potentially see what you missed. If your implementation is based on a Wikipedia article then cite that. Maybe I'm banging the science drum too hard, but I personally find that kind of explicit trail of theory helpful.
I mentioned the Wikipedia article about the rope just because it helps the common understanding: it shows how to keep a sequence in a tree-like structure. The idea is quite simple: keep the number of elements in each subtree next to the pointer to that subtree. If you insert (or delete) an element, you modify log(N) subtrees, so you update log(N) counts. If you want to access the element with a given index, you use a search algorithm which is very similar to the tree-search algorithm, but you compare the index against these counts instead of comparing keys. Also, I mention the rope because it is very respectable and is included in gcc under the name __gnu_cxx::rope. However, the rest is my own optimization. Firstly, I use a B-tree instead of a binary tree, which makes the data structure cache-efficient. Secondly, I use some algorithmic tricks which improve performance by a constant factor. The best example is the recursion-free, one-pass update algorithm, which I described on my title page. To be brief, I used the already known concept of keeping a sequence in a tree-like structure, and then effectively applied some known optimizations to it. Best regards, Aleksandr Kupriianov