
Dear Experts, I find myself in need of an augmented tree. My problem is something like [but not exactly] this: - Say I have a line of text, i.e. a sequence of characters, in a variable-width font; each character stores its width in pixels. - I need to efficiently insert and remove characters. - I need to efficiently find the range of characters that spans a given range of pixels. It seems to me that an augmented tree lets me do all of those operations in log(N) time - and no other data structure offers the same performance. I'm aware of two augmented trees that have been proposed here. Francisco Jose Tapia has proposed "Countertree", but this only stores the count of elements not the sum of a property of the elements (i.e. the widths in my example). Vadim Stadnik has proposed stl_ext_adv, which is a doubly-augmented B+ tree. Has anyone looked at Vadim's code since it was proposed last year? I have a couple of concerns about Vadim's proposal: 1. It uses B+ trees, and in the discussion it was suggested that these are inferior to red-black trees in some way. 2. In order to get the property to be accumulated (i.e. the widths in my case) it requires that arithmetic can be done on the stored type itself; this is fine for e.g. a sequence of ints, but if the stored type is a struct it is necessary to add various "unnatural" operator overloads. I think I would prefer it to have some sort of user-supplied accessor to get the property to be accumulated. Is there any other work that I've overlooked? Is there a "friendly" red-black tree implementation somewhere that could be augmented relatively easily? Regards, Phil.

On Thu, Oct 11, 2012 at 10:23 AM, Phil Endecott < spam_from_boost_dev@chezphil.org> wrote:
Dear Experts,
I find myself in need of an augmented tree. My problem is something like [but not exactly] this:
- Say I have a line of text, i.e. a sequence of characters, in a variable-width font; each character stores its width in pixels.
- I need to efficiently insert and remove characters.
- I need to efficiently find the range of characters that spans a given range of pixels.
It seems to me that an augmented tree lets me do all of those operations in log(N) time - and no other data structure offers the same performance.
I'm aware of two augmented trees that have been proposed here. Francisco Jose Tapia has proposed "Countertree", but this only stores the count of elements not the sum of a property of the elements (i.e. the widths in my example). Vadim Stadnik has proposed stl_ext_adv, which is a doubly-augmented B+ tree. Has anyone looked at Vadim's code since it was proposed last year? I have a couple of concerns about Vadim's proposal:
1. It uses B+ trees, and in the discussion it was suggested that these are inferior to red-black trees in some way.
2. In order to get the property to be accumulated (i.e. the widths in my case) it requires that arithmetic can be done on the stored type itself; this is fine for e.g. a sequence of ints, but if the stored type is a struct it is necessary to add various "unnatural" operator overloads. I think I would prefer it to have some sort of user-supplied accessor to get the property to be accumulated.
Is there any other work that I've overlooked?
Is there a "friendly" red-black tree implementation somewhere that could be augmented relatively easily?
You could use the implementation from the GCC STL! Actually, I think a lot of STL implementations use RB-trees for ordered associative containers. I bet there's a better choice out there, but I'm not really an expert. -Greg

On Thu, Oct 11, 2012 at 03:42:06PM -0400, Greg Rubino wrote:
You could use the implementation from the GCC STL! Actually, I think a lot of STL implementations use RB-trees for ordered associative containers. I bet there's a better choice out there, but I'm not really an expert.
The resulting GPL-licensed derived work would be quite unusable in pretty much anything. IANAL, but the runtime library exception [1] only seems to be applicable when the GCC compiler compiles and links the bits and pieces into your program. It wouldn't surprise me if libstdc++ was full of GCC intrinsics and extensions too. I have a personal policy of not reading GPL source code, so I cannot really say. [1] http://gcc.gnu.org/onlinedocs/libstdc++/manual/license.html -- Lars Viklund | zao@acc.umu.se

Dear Experts,
I find myself in need of an augmented tree. My problem is something like [but not exactly] this:
- Say I have a line of text, i.e. a sequence of characters, in a variable- width font; each character stores its width in pixels.
- I need to efficiently insert and remove characters.
- I need to efficiently find the range of characters that spans a given range of pixels.
It seems to me that an augmented tree lets me do all of those operations in log(N) time - and no other data structure offers the same performance.
I'm aware of two augmented trees that have been proposed here. Francisco Jose Tapia has proposed "Countertree", but this only stores the count of elements not the sum of a property of the elements (i.e. the widths in my example). Vadim Stadnik has proposed stl_ext_adv, which is a doubly-augmented B+ tree. Has anyone looked at Vadim's code since it was proposed last year? I have a couple of concerns about Vadim's proposal:
1. It uses B+ trees, and in the discussion it was suggested that these are inferior to red-black trees in some way.
2. In order to get the property to be accumulated (i.e. the widths in my case) it requires that arithmetic can be done on the stored type itself; this is fine for e.g. a sequence of ints, but if the stored type is a struct it is necessary to add various "unnatural" operator overloads. I think I would prefer it to have some sort of user-supplied accessor to get the property to be accumulated.
Is there any other work that I've overlooked?
Is there a "friendly" red-black tree implementation somewhere that could be augmented relatively easily?
This implementation was written to allow easy augmentation, and it's worked well for me. www.stroustrup.com/tree-appendix.pdf It is described at onlinelibrary.wiley.com/doi/10.1002/spe.564/pdf. I don't know about the license. Alec

On Thu, Oct 11, 2012 at 10:23 AM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
Dear Experts,
I find myself in need of an augmented tree. My problem is something like [but not exactly] this:
- Say I have a line of text, i.e. a sequence of characters, in a variable-width font; each character stores its width in pixels.
- I need to efficiently insert and remove characters.
- I need to efficiently find the range of characters that spans a given range of pixels.
It seems to me that an augmented tree lets me do all of those operations in log(N) time - and no other data structure offers the same performance.
I'm aware of two augmented trees that have been proposed here. Francisco Jose Tapia has proposed "Countertree", but this only stores the count of elements not the sum of a property of the elements (i.e. the widths in my example). Vadim Stadnik has proposed stl_ext_adv, which is a doubly-augmented B+ tree. Has anyone looked at Vadim's code since it was proposed last year? I have a couple of concerns about Vadim's proposal:
1. It uses B+ trees, and in the discussion it was suggested that these are inferior to red-black trees in some way.
I've done some experimentation with in-memory B+ trees, and they have different performance characteristics than red-black trees. They can be 10 times or more faster for search operations than red-black trees, particularly on architectures with very high speed hardware caches that are large in relation to the size of the tree. They can also use less space than an equivalent red-black tree. I'm not sure what would happen to the performance in an app dominated by inserts and deletes rather than searches, but a red-black tree is likely better for that scenario. The B-tree, read-black tree, UB-tree, and many of their variants were all invented by Rudolf Bayer. Each is superior for some usage scenarios and applications, and inferior for others. --Beman

On Fri, Oct 12, 2012 at 1:23 AM, Phil Endecott < spam_from_boost_dev@chezphil.org> wrote:
Dear Experts,
I find myself in need of an augmented tree. My problem is something like [but not exactly] this:
- Say I have a line of text, i.e. a sequence of characters, in a variable-width font; each character stores its width in pixels.
- I need to efficiently insert and remove characters.
- I need to efficiently find the range of characters that spans a given range of pixels.
Thank you for the interest in augmented data structures! Because of the last requirement it seems to me that you do not have to sum up width values. The problem requires a type that represents an interval or a range or a one-dimensional extension rectangle. There is an example in documentation to the stl_ext_adv library that shows how to count intersections of one interval with a sequence of intervals in logarithmic time. This example looks more relevant to the described problem.
2. In order to get the property to be accumulated (i.e. the widths in my case) it requires that arithmetic can be done on the stored type itself; this is fine for e.g. a sequence of ints, but if the stored type is a struct it is necessary to add various "unnatural" operator overloads. I think I would prefer it to have some sort of user-supplied accessor to get the property to be accumulated.
If my first guess was incorrect and you really need to sum the width values and would like to avoid complications of operator overloading then you can create for this algorithm an additional container, such as sequence or multiset, which supports fast accumulate() algorithm for type <int> that represents the width parameter only. This approach requires mapping and synchronization of processing between original data container(s) and the new additional container. Hope that my comments are helpful, Regards, Vadim Stadnik

Hi, I have designed on paper new algorithms for to insert, delete and balance the countertree, in order to improve the speed. The actual version is around 10% - 15% slower than the GCC version of red-black trees. The new version is easy to understand, and I hope, offer better results. In the new version you have a catalog of operations with the nodes. The node operations have two parts, the pointers and the node color, and the augmented information. In the countertree the augmented information are the node counters. But if you design the process to do with your augmented information for all the operations of the catalog, you can create your augmented tree with the pointer and color operations, and your code for each operation. And you can have augmented trees with several concepts, like the position, and an accumulator of the value of the nodes. I hope begin to work with the new algorithms at Christmas. If you are interested, when I have something useful , I will comment and offer to you. Sincerely yours Francisco Tapia

Phil Endecott wrote:
Dear Experts,
I find myself in need of an augmented tree. My problem is something like [but not exactly] this:
- Say I have a line of text, i.e. a sequence of characters, in a variable-width font; each character stores its width in pixels.
- I need to efficiently insert and remove characters.
- I need to efficiently find the range of characters that spans a given range of pixels.
It seems to me that an augmented tree lets me do all of those operations in log(N) time - and no other data structure offers the same performance.
I'm aware of two augmented trees that have been proposed here. Francisco Jose Tapia has proposed "Countertree", but this only stores the count of elements not the sum of a property of the elements (i.e. the widths in my example). Vadim Stadnik has proposed stl_ext_adv, which is a doubly-augmented B+ tree. Has anyone looked at Vadim's code since it was proposed last year? I have a couple of concerns about Vadim's proposal:
1. It uses B+ trees, and in the discussion it was suggested that these are inferior to red-black trees in some way.
2. In order to get the property to be accumulated (i.e. the widths in my case) it requires that arithmetic can be done on the stored type itself; this is fine for e.g. a sequence of ints, but if the stored type is a struct it is necessary to add various "unnatural" operator overloads. I think I would prefer it to have some sort of user-supplied accessor to get the property to be accumulated.
Is there any other work that I've overlooked?
Is there a "friendly" red-black tree implementation somewhere that could be augmented relatively easily?
I don't know if this will work for you, but in Boost.Geometry we have two implementations of the R-tree. Currently none of them is in the release branch but you can play with it. The R-tree is a spatial index based on B-tree, if you use 1 dimensional data you should get what you want. Implemented queries allows to retrieve values in various ways, e.g. values in some range or nearest to some other value. If you want to check it out, here is the address: http://svn.boost.org/svn/boost/sandbox-branches/geometry/index/ Regards, Adam

I examined your documentation, and it looks very interesting the R-trees. My idea with the countertree is provide a binary balanced tree, with the additional feature of access by the position like a vector. In the unsorted structures like vector_tree, and in the sorted structures like set, map,multiset and multimap. This feature can be useful sometimes, but mainly provide comfort to the SW developers, you can manage a map with a for loop , like in a vector. Reexamining the R-trees, I remember an old project, which perhaps can be interesting to you. Many years ago I implemented the functionality of a R-tree, with 3D objects using countertrees. The capability of know the position of the iterators was extremely useful, in order to optimize the search. The insertion, deletion and access are O (logN) operations. I don't know the true utility of this, because I must study in deep the R-trees, and check the performance of the two systems, in order to evaluate them. The implementation run well with any number of dimensions. If you are interested, say me and I will write you a description, here, in Boost , or if you prefer, in a private message. If not, sorry, and forget it. If I can help you about something, say me and I will try. And, as I say before, it looks very interesting Sincerely yours Francisco Tapia

Sorry Mr Wulkiewicz In my previous message, I talked about a implentation based in countertree. I had been reexamining my code, and I detect a fail in my solution. I have a case which my solution fail, and inhabilite my solution as implementation of the R-tree equivalent system. It is when an element is internal to other, and must search the nearest element. Sorry again, Yours Francisco Tapia
participants (8)
-
Adam Wulkiewicz
-
Beman Dawes
-
Chapman, Alec
-
Francisco José Tapia
-
Greg Rubino
-
Lars Viklund
-
Phil Endecott
-
Vadim Stadnik