[countertree] Formal Review Request

Hi all, I would like to request a formal review of the library “Countertree + Suballocator” [countertree] Project location ( zip file with code and documentation) : https://dl.dropbox.com/u/8437476/works/countertree_code_doc.zip Quick view of documentation with code download : https://dl.dropbox.com/u/8437476/works/countertree/index.html For the people who don't know this project, this is a description : *COUNTERTREE* This library is an implementation of a binary red-black counter tree. This tree have an additional counter in each leaf. This permit the access to the elements by the position, like in a vector. It is a random access container with random access iterators . Based on this tree we have : - With unordered information we have vectors (countertree::vector_tree) with identical interface than std::vector. The vector_tree have the same speed inserting and deleting in any position (all the operations are O(log N)).It is slower than std:vector inserting and deleting at end, but much faster for to insert and delete in any other position. - With ordered information, we have in the countertree namespace the classes set, multiset, map and multimap, with identical interface than the STL classes, with the plus of access to the elements by position, like in a vector. The iterators are random access , and you can subtract two iterators in a O(log N) time for to know the number of nodes between them (even with the end( ) and rend( ) iterators) *SUBALLOCATOR* In the allocation of equal size elements ( as in STL list, set, multiset,map and multimap), when the number of elements grows, many allocators begin to have speed problems. For to improve the speed, many allocators request to the Operating System big chucks of memory ( pool allocators). With this, the allocator don't need request memory to the operating system for each allocation. But many allocators don't return well the unused chucks of memory to the Operating System and the memory used by the allocator is the maximum used, never decrease . The *suballocator is a solution to these problems*, and others memory problems described in the suballocator page. The suballocator is a layer between the allocator and the data structures, compatible with any allocator with the STL definition. The suballocator request memory to the allocator, and return to it when unused. The suballocator replace to the allocator in the allocation of equal size elements With the suballocator a) *We have a very fast allocation* *(around 2 times faster than the std::allocator of GCC 4.7, CLANG 3.0 and 3 times than Visual Studio 10 *See details in the *Suballocator Benchmark*)* b) *Return the suballocator return memory to the allocator, this can use in the allocation of others types of data or for return to the *Operating System, decreasing the memory used by the program, *( as you can see in the *Suballocator Benchmark *)* c) *You can use with any allocator if it is according with the STL definition*. The suballocator provides speed and memory management to any allocator. d) Even the time of the allocation is a small part of the time spent in the insertion in a std::set, the suballocator obtain time reductions over over the 30% respect the std::allocator. The secret is the cache performance due to the data locality improvement. *COUNTERTREE + SUBALLOCATOR* The join of the two ideas provide us data structures with a suballocator built-in. They are, in the namespace countertree, the vector_tree_pool, set_pool, multiset_pool,map_pool and multimap_pool, with identical interface than the STL classes but better performance for big number of elements It is fast, useful and easy to understand and use,. They are the like the STL classes with a few additional functions. This library is designed thinking in programmers with a basic knowledge of C++. As I say in the documentation, if you know the STL classes vector, set , multiset, map , multimap and allocator, you know more than 95% needed for to use this library. I showed the library to several friends and colleagues, and one of them said me “If your potential users are not experts, and they need more than 5 minutes to understand what's the goal of the library and what they can do with it, many of them leave the page.... , and the library”. The first page of the documentation explain the library, the reasons and what can do. And in the next pages show the details and how can do in a a easy way. I had checked this code with GCC 4.7 , CLANG/LLVM 3.0 and Visual C++ 10 ( all with 32 and 64bits.). In code of the project is composed by the code of the classes, the test programs, the benchmarks programs used and mentioned in the documentation, and several examples of the code I had checked all the requirements for to request the review. But I am not sure if all is OK. If you miss something or something is wrong , please , mail me and I will correct as soon as possible Sincerely yours Francisco Tapia fjtapia@gmail.com

Hello Francisco De : Francisco José Tapia <fjtapia@gmail.com>
À : boost@lists.boost.org Envoyé le : Mercredi 3 octobre 2012 22h23 Objet : [boost] [countertree] Formal Review Request
Hi all, I would like to request a formal review of the library “Countertree + Suballocator” [countertree]
[snip]
For the people who don't know this project, this is a description :
*COUNTERTREE*
This library is an implementation of a binary red-black counter tree. This tree have an additional counter in each leaf. This permit the access to the elements by the position, like in a vector. It is a random access container with random access iterators .
I just had a look at your library because I was curious to see how you managed to implement random access iterators on a binary tree, but it seems the iterators you provide are not really random access: as far as I can tell from your code, moving an iterator forward or backward is an O(log N) operation (according to your remark about the "shift" function in file "countertree/tree/node.hpp"), and not a constant-time operation. Am I missing something? Luc P.S.: This is my first post on this mailing list, so please point out any rule or policy I could be breaking. Moreover, English is not my mother tongue, so please forgive the potential errors I could make.

This library is an implementation of a binary red-black counter tree. This tree have an additional counter in each leaf. This permit the access to the elements by the position, like in a vector. It is a random access container with random access iterators .
I just had a look at your library because I was curious to see how you managed to implement random access iterators on a binary tree, but it seems the iterators you provide are not really random access: as far as I can tell from your code, moving an iterator forward or backward is an O(log N) operation (according to your remark about the "shift" function in file "countertree/tree/node.hpp"), and not a constant-time operation.
Am I missing something?
I think that he means he implemented O(log N) operator[] on the container and presumably can perform O(log N) iterator arithmetic, allowing him to provide the same syntax as random access containers and iterators. However, since constant time is a requirement of STL random access iterator concept he probably should rephrase his description of the library. I'm not sure I see the use case of O(log N) operator[] or O(log N) iterator arithmetic on maps and sets. I find the sub-allocator much more interesting. Can't we implement the same allocation scheme using C++11 stateful allocators? Does this really need special support within the containers themselves? There has been some recent discussion of the need for a Boost.Allocator library. Regards, Luke

The countertree iterators don't store the position, because after the creation of an iterator, if you insert or delete elements in the tree, the position of the element pointed bu the iterator can change. The countertree iterators need to travel to the root of the tree for to know their position, with the exception of the iterators end(), and rend(). I understand that the random access iterators permit you access directly to the element, without a sequential access. Commonly they are associated with the vectors with iterators O(1), and the iterators of a list or set with are sequential and not random access iterators O(N). The iterators of the vector_tree are in the middle O (log N), but permit you access to the elements directly without sequential access. I don't know other iterators similar. Independently of the classification, I say random access iterator because you can use the random access algorithms of the STL library as std::sort, std::binary_search wit good results. Obviously they are not so fast as the iterators of a vector, but run well and fast. The std::sort over a counter_tree<int> with 1000000 random elements, spend 2.3 seconds on my computer, more or less like to insert the elements in a STL set If I don't define the iterator as random access I can't use these functions. I examined much information about the iterators, beginning with the Boost library iterators and many other documents and I have not a solution. Perhaps someone can help me in order to find a solution which permit to access to the STL library functions without the definition as random access. The main utility of the vector_tree is when you must insert or delete many elements in central positions. It is not worst or best than others data structures, it is different. It is only an option more, with different features than other data structures. If you consider it useful, use .Only that The suballocator concept have two big parts. First, from where and how obtain the memory the allocator. Second How dispatch the memory to the data structure. The suballocator is the second part with a big number of fixed size elements data structures ( STL list, set , multiset, map and multimap) About the suballocator. It is a simple layer over the allocator. It have the same interface. The data structure received a suballocator as template parameter. But the suballocator use internally an allocator. It request a big chuck of memory to the allocator , and manage in a fast and easy way. When a chuck of memory is unused by the data structure, is returned to the allocator. Usually, that big chucks of memory are returned to the Operating System from the allocator , and this reduce the memory consumption of the program. You can see this , running the benchmark_allocator.cpp. The suballocator run over ANY allocator with the STL allocator interface. It don't take care about the kind of allocator or the origin of the memory. If you want to know more about the algorithm of the suballocator, in the documentation , in the point 5.2 you have a document with a description of the data structure and algorithms. About the C++11 stateful allocators, I need a bite of time for to study in deep , and provide you something more than a trivial opinion Sorry! For Luc . Don't worry about your English. I am sure than sometimes I destroy the English dictionary, but the Boost people are well-mannered and don't say me nothing. And about myself, you can ask me, if I know, I will respond you, and if I don't know, we will share our ignorance, which is the main reason for to study more. Sincerely yours Francisco Tapia

The main utility of the vector_tree is when you must insert or delete many elements in central positions. It is not worst or best than others data structures, it is different. It is only an option more, with different features than other data structures. If you consider it useful, use .Only that
That's the problem, it shouldn't be up to us to figure out if your library is useful. We've all been happily coding in C++ for 10+ years without it. We know how to solve every programming problem we are presented with without the use of your library. Sometimes it is obvious, but in this case it isn't obvious to me where I would want to use your library. You need to provide some kind of motivation for why you think the library is needed. Something like "when you are doing xyz and you run into problem A you end up settling for solution S, but my library is a better solution because of t, u, v." Your iterators are not random access iterators because O(1) iterator arithmetic is a requirement of the standard. Calling std::sort or std::binary_search on a tree seems completely wrong. The tree is a search data structure, why would we want to call an algorithm that doesn't know how to take advantage of the tree data structure? You say you wanted to be able to call these stl algorithms on your iterator types, but you didn't explain why. I can't think of a single programming activity xyz where I run into problem A where your library is a better solution than solution S that I would normally apply, but I shouldn't have to. You should know why you wrote your library and be able to tell us.
About the C++11 stateful allocators, I need a bite of time for to study in deep , and provide you something more than a trivial opinion Sorry!
It is pretty important to figure out how what you're doing relates to the new standard. It takes a while for boost libraries to be accepted and released. Use of the new standard will be even more prevalent by the time your library would be getting widespread exposure. Keeping a small object arena around to recycle elements instead of going always to the allocator is a good idea. I'm pretty sure we can do it with the C++11 stl by baking it into the stateful allocator. It would be much more useful as a general allocator than as a feature of yet another implementation of a few containers. Regards, Luke

On Fri, Oct 5, 2012 at 2:20 AM, Francisco José Tapia <fjtapia@gmail.com>wrote:
The countertree iterators don't store the position, because after the creation of an iterator, if you insert or delete elements in the tree, the position of the element pointed bu the iterator can change. The countertree iterators need to travel to the root of the tree for to know their position, with the exception of the iterators end(), and rend().
I understand that the random access iterators permit you access directly to the element, without a sequential access. Commonly they are associated with the vectors with iterators O(1), and the iterators of a list or set with are sequential and not random access iterators O(N). The iterators of the vector_tree are in the middle O (log N), but permit you access to the elements directly without sequential access. I don't know other iterators similar.
The dynamically allocated augmented B+ trees support the functionality of both random access and bidirectional iterators. The computational complexity of operators of these iterators is better than in proposed augmented red black trees. However, I dropped the support of this functionality in array based augmented B+ trees. In addition to this, Boost.Container offers containers, such as stable_vector, with random access iterators that are not invalidated by insert/erase operations. This is possible, since these containers have features of both array based and dynamically allocated data structures. IMO, these iterators are superior to iterators in augmented dynamically allocated data structures.
Regards, Vadim Stadnik

Thanks Mr Stadnik. At end, I forgot the references. The “Introduction to Algorithms” is one of my favorite books. When I bought the book I had done my first implementation of the counter trees. I tried to use the balanced algorithms of the book in this version , but with counters of the nodes, was impossible, and I began from zero to design the algorithms based on the definition of Tree234. I will add references to the project. It can be useful for the people. About the double augmented tree for support accumulate function in a log (N) time, it is possible , but it is a hard work, because you must redesign all the algorithms for to balance the tree. In the countertree the insertion or deletion NOT invalidate the existing iterators. The iterator have a pointer to the node, and when you want to know the position, travel from their position to the root examining the node counters. If you insert an element in a previous position to one iterator, you can access to the element pointed by the iterator, but when you check the position you will obtain the next number. You can safely mix access, insertion and deletion with iterators and positions. You have arithmetic with the iterators, even with the end() and rend() iterators. unsigned D = V.end() - ( V.begin() +5) ; The arithmetic operations are log(N), because the iterators obtain the position as I described before. About the dynamically allocated augmented B+ trees, I must read more, in order to provide you a former opinion, but it looks very interested. About the Boost::container the data structures similar to the data structures of countertree, are implemented ( as I understand) over sorted vectors, or similar data structures, which have performance problems when insert or delete in central positions. But I must study more. Regards Francisco Tapia

Hi Mr Simonson. About the utility of vector_tree. Some years ago I worked in a atomic simulation of silicon doping for the semiconductor design. The silicon to high temperature produce spontaneous “defects” in the atomic structure of the silicon. This depend of the temperature, and the probability of the kind of “defect”. The "defects" makes the silicon semiconductor Each time interval we throw a buch of electrons with high energy against the silicon. When the electron crash with the defect, depending of the energy of the electron , and the angle, can occur several things. The defect absorb the electron, , the electron break the defect and generate new defects, others rebound and in others destroy the defect. The positions are continuously changing. You can't sort by any parameter. I had the defects in a vector_tree, and divide the number of defects between the number or cores of the machine, assign to each core their defects and begin to process. The possibility of insert and delete in any position at log(N), was very useful. As you can imagine, the complexity was bigger than the described here, but I thing this is not place for to show a detailed specification of the project. The word is bigger than I had seen, even bigger than I can imagine. And every day I learn and expect new things for to be astonished. It is the same since 28 years ago, when I began to program in C, and 22 years ago when I began to program in C++ ( I bought the TurboC++ 1.0 in 1990, 1 week after it appear in the market ). About the standard: The history shows than the facts change the rules. According to the standard, to use std::sort and binary search over a vector_tree is wrong. But it run well, and the time spent in sort a million elements is similar to insert the elements in a std::set. I don't question the standard. I only show the facts. It run well, and, the most important for me, I thing it can be useful to the programmers community. This is my motivation for to do this. The standard can be expanded, relaxed, modified or nothing. What do you suggest me ? Delete the library, because it is not according to the standard, or perhaps , think about what need to be changed, or modified in the standard for to adapt to the new reality. The new C++11 appear when the library was finished, and I was writing the documentation and checking the code. The new standard is very good, but have a problem, if you do according the last standard , you are discarding the majority of the C++ compilers used actually. My idea is to use the new features of C++11, but trying to maintain the compatibility with older compilers, at least until the new standard was widespread. About the statefull allocators. I have my own idea, but I need to study more in order to have a founded opinion. The suballocator can be transformed in statefull allocators with a few changes in the actual code, but I need to think about the utility and the problems related. Regards Francisco Tapia

You can't sort by any parameter. I had the defects in a vector_tree, and divide the number of defects between the number or cores of the machine, assign to each core their defects and begin to process. The possibility of insert and delete in any position at log(N), was very useful. As you can imagine, the complexity was bigger than the described here, but I thing this is not place for to show a detailed specification of the project.
Interesting that we both have background in chip design. So I surmise that you wanted to simulate equal probability that an electron interact with a defect by generating a random index into your countertree data structure holding defect elements. It's not clear to me why you needed to insert into the middle, however. If you weren't sorting by any parameter why not just push_back each new element onto a vector? Threading obviously brings in a lot of complexity, but I don't see how the countertree solves any of those problems. I could implement fast lookup and randomized indexing into a set using a std::vector and std::set that are cross referenced. First randomly select a vector index with equal then randomly select the set element based upon is value and the distribution function. When removing an element from the set swap its vector element to the end and pop_back() updaing set element index for the swapped end element. Perhaps there is some application where indexes needs to be ordered the same as the set, which would then pretty much require countertree to be made efficient.
About the statefull allocators. I have my own idea, but I need to study more in order to have a founded opinion. The suballocator can be transformed in statefull allocators with a few changes in the actual code, but I need to think about the utility and the problems related.
The list merge/split and thread safety/local being the two problems that spring immediately to my mind. It is because there are problems that we need a library. Regards, Luke

On Thu, Oct 4, 2012 at 6:23 AM, Francisco José Tapia <fjtapia@gmail.com>wrote
*COUNTERTREE*
This library is an implementation of a binary red-black counter tree. This tree have an additional counter in each leaf. This permit the access to the elements by the position, like in a vector. It is a random access container with random access iterators . Based on this tree we have :
Hi Francisco,
The method that you are using is called "augmenting data structures". I suggest you to add reference to this book, which gives detailed description of this method in chapter 14: T.H. Cormen, C.E. Leiserson, R.L. Rivest and C. Stein. *Introduction to Algorithms*. 2nd Edition, MIT Press and McGraw-Hill, 2001. Figure 14.1 is very similar to the figure in your documentation. Did you try double augmenting that supports algorithm accumulate() with logarithmic running time? Regards, Vadim Stadnik
participants (4)
-
Francisco José Tapia
-
Luc Touraille
-
Luke Simonson
-
Vadim Stadnik