Re: [boost] Review managers wanted for augmented data structures
on Mon Mar 04 2013, Vadim Stadnik
On Mon, Mar 4, 2013 at 7:35 PM, Stefan Strasser
wrote: Am 04.03.2013 08:48, schrieb Martin Knoblauch Revuelta:
Do you mean that you would declare them RandomAccess iterators?
We can expect to have containers with millions of elements. For that size, O(log N) is about 20 times faster than O(log N log N). That makes a big difference.
isn't that a reason for providing optimized specializations of some STL algorithms, instead of introducing another iterator category?
how would a new iterator category even help? as long as your iterator is a forward iterator, the user is allowed to call std::binary_search on your iterator, which uses the same algorithm on any iterator category. (only functions used by binary_search like std::advance are specialized on the category) where in all this would your tree-optimized binary_search code be called?
the standard doesn't require random access iterator ops to be constant time.
Unfortunately, there is a problem of iterator categories with augmented trees. The C++ standard specifies computational complexity for iterator operations, although they are not shown in tables, since all of them are required to be constant time. In both C++03 and C++11 this is formulated in the section 24.1.8.
Are you sure these iterators don't fall under the *amortized* constant time requirements, just like those of the ordered associative containers? -- Dave Abrahams
On Sun, Mar 31, 2013 at 1:37 PM, Dave Abrahams
on Mon Mar 04 2013, Vadim Stadnik
wrote: On Mon, Mar 4, 2013 at 7:35 PM, Stefan Strasser
Am 04.03.2013 08:48, schrieb Martin Knoblauch Revuelta:
Do you mean that you would declare them RandomAccess iterators?
We can expect to have containers with millions of elements. For that size, O(log N) is about 20 times faster than O(log N log N). That makes a big difference.
isn't that a reason for providing optimized specializations of some STL algorithms, instead of introducing another iterator category?
how would a new iterator category even help? as long as your iterator is a forward iterator, the user is allowed to call std::binary_search on your iterator, which uses the same algorithm on any iterator category. (only functions used by binary_search like std::advance are specialized on the category) where in all this would your tree-optimized binary_search code be called?
the standard doesn't require random access iterator ops to be constant time.
Unfortunately, there is a problem of iterator categories with augmented trees. The C++ standard specifies computational complexity for iterator operations, although they are not shown in tables, since all of them are required to be constant time. In both C++03 and C++11 this is formulated in the section 24.1.8.
Are you sure these iterators don't fall under the *amortized* constant time requirements, just like those of the ordered associative containers?
Yes, I am sure. The algorithms that implement random access to container elements by index (also often named 'rank') are efficient search algorithms using counters of elements stored in tree nodes. These algorithms are similar to algorithms searching an element by a key that support functions of associative containers, such as c.lower_bound(key) and c.upper_bound(key). In the general case, both kinds of search algorithms on trees have the same logarithmic running time. However, the augmented array based B+ trees have an important advantage over red-black trees. They offer constant time access to the nearest container elements in the worst case against constant amortized time. Thus, these new data structure can provide the efficiency of sequential processing close to that of std::vector. Regards, Vadim Stadnik
on Sun Mar 31 2013, Vadim Stadnik
The algorithms that implement random access to container elements by index (also often named 'rank') are efficient search algorithms using counters of elements stored in tree nodes. These algorithms are similar to algorithms searching an element by a key that support functions of associative containers, such as c.lower_bound(key) and c.upper_bound(key). In the general case, both kinds of search algorithms on trees have the same logarithmic running time. However, the augmented array based B+ trees have an important advantage over red-black trees. They offer constant time access to the nearest container elements in the worst case against constant amortized time. Thus, these new data structure can provide the efficiency of sequential processing close to that of std::vector.
Ah. Seems like we don't have appropriate concepts to characterize these things. What do you say about the big-O when the dominating factor will always have such a small constant that it becomes insignificant? It feels like a cop-out, but I guess I'd go with random_access. -- Dave Abrahams
On Mon, Apr 1, 2013 at 12:40 AM, Dave Abrahams
on Sun Mar 31 2013, Vadim Stadnik
wrote: The algorithms that implement random access to container elements by index (also often named 'rank') are efficient search algorithms using counters of elements stored in tree nodes. These algorithms are similar to algorithms searching an element by a key that support functions of associative containers, such as c.lower_bound(key) and c.upper_bound(key). In the general case, both kinds of search algorithms on trees have the same logarithmic running time. However, the augmented array based B+ trees have an important advantage over red-black trees. They offer constant time access to the nearest container elements in the worst case against constant amortized time. Thus, these new data structure can provide the efficiency of sequential processing close to that of std::vector.
Ah. Seems like we don't have appropriate concepts to characterize these things. What do you say about the big-O when the dominating factor will always have such a small constant that it becomes insignificant? It feels like a cop-out, but I guess I'd go with random_access.
Asymptotically, most of operators of these new iterators have O(1) cost, this group includes increment, decrement and comparison operators. However, a few operators {+= , -=, +, -, [ ]} have O(log N) cost. Quite obviously, such iterators are much closer to random access iterators than to bidirectional. In addition to this, without std::random_access_iterator_tag user algorithms can not take full advantage of the efficient algorithms offered by augmented data structures. The running time can increase quite significantly from O(log N) to O(N), since standard library algorithms would use the subset of operators of bidirectional iterators only. Regards, Vadim Stadnik
participants (2)
-
Dave Abrahams
-
Vadim Stadnik