[intrusive] Looking for some advice on intrusive splay trees

Hi to all, Lately I've been trying to implement splay trees for Intrusive. Excerpt taken from en.wikipedia.org/wiki/Splay_tree: "A splay tree is a self-balancing binary search tree with the additional unusual property that recently accessed elements are quick to access again. It performs basic operations such as insertion, look-up and removal in O(log(n)) amortized time. For many non-uniform sequences of operations, splay trees perform better than other search trees, even when the specific pattern of the sequence is unknown. The splay tree was invented by Daniel Sleator and Robert Tarjan." As most of you know, splay trees splay (rebalance) the tree also on searches, so that the target node is placed as root, yielding to better search times for frequently searched nodes (cache effect). The implementation I've taken as model is taken from the article written by Ralf Mattethat for C++ Users Journal (September 2005) titled "Implementing Splay Trees in C++" (http://www.ddj.com/architect/184402007). Following usual STL-like implementations I've used left and right pointers of the header to point to the leftmost and rightmost node respectively. Several intrusive rbtree implementation tricks and optimizations can be easily reused for splay trees so I think we can have a useful intrusive splay tree. My main issue is related to constness and thread-safety. I would like to offer a STL-like interface (that is, the interface actually implemented for other Intrusive containers like set, multiset and rbtree), but take in care that the tree will rebalance in each find/lower_bound/upper_bound operation so those operations couldn't be const members unless we declare the header as mutable. I plan to add optional (runtime) rebalancing for each operation (for example, the user could call find() without rebalancing if he doesn't want to alter the cache effect, or just needs more predictable search times). iterator it = splay_tree.find(a);//Splay operation performed iterator it = splay_tree.find(a, dont_splay);//No splay Allowing const overloads (casting the constness away internally) with splaying (rebalancing): const_iterator it = const_splay_tree.find(a); //Splaying performed would break usual de-facto thread-safety guarantees for STL and Intrusive containers (read-only access from different threads is thread-safe). So the main question is: what approach do you think would be better? ¿Just splay in const operations and put a thread-safety warning? ¿Make const versions non-splaying? ¿Don't offer const versions of search functions? Regards, Ion

const_iterator it = const_splay_tree.find(a); //Splaying performed
would break usual de-facto thread-safety guarantees for STL and Intrusive containers (read-only access from different threads is thread-safe).
In other words, this is dangerous and misleading.
So the main question is: what approach do you think would be better? ¿Just splay in const operations and put a thread-safety warning? ¿Make const versions non-splaying? ¿Don't offer const versions of search functions?
Making the const versions non-splaying would destroy the functional equivalence between const and non-const search functions which might come as a surprise for programmers that did not read the documentation well. So i would opt for "don't offer const versions of search functions". IMO using iterator it = splay_tree.find(a, dont_splay);//No splay is superior. Kevin

On 10/18/07, Kevin Sopp <baraclese@googlemail.com> wrote:
const_iterator it = const_splay_tree.find(a); //Splaying performed
would break usual de-facto thread-safety guarantees for STL and Intrusive containers (read-only access from different threads is thread-safe).
In other words, this is dangerous and misleading.
So the main question is: what approach do you think would be better? ¿Just splay in const operations and put a thread-safety warning? ¿Make const versions non-splaying? ¿Don't offer const versions of search functions?
Making the const versions non-splaying would destroy the functional equivalence between const and non-const search functions which might come as a surprise for programmers that did not read the documentation well. So i would opt for "don't offer const versions of search functions". IMO using iterator it = splay_tree.find(a, dont_splay);//No splay is superior.
How about reversing that? Offer the const find as a non-splaying operation, and offer a non-const find overload that explicitly asks for splaying? iterator it = splay_tree.find(a);//No splay iterator it = splay_tree.find(a, splay);//Splay operation performed or maybe spell out all splaying functions as *_and_splay (e.g. find_and_splay). --Michael Fawcett

Michael Fawcett escribió:
On 10/18/07, Kevin Sopp <baraclese@googlemail.com> wrote:
Making the const versions non-splaying would destroy the functional equivalence between const and non-const search functions which might come as a surprise for programmers that did not read the documentation well. So i would opt for "don't offer const versions of search functions". IMO using iterator it = splay_tree.find(a, dont_splay);//No splay is superior.
How about reversing that? Offer the const find as a non-splaying operation, and offer a non-const find overload that explicitly asks for splaying?
iterator it = splay_tree.find(a);//No splay iterator it = splay_tree.find(a, splay);//Splay operation performed
or maybe spell out all splaying functions as *_and_splay (e.g. find_and_splay).
Thanks for your comments. Both options seems reasonable. I think the user of a splay_tree expects splaying as the default option, so I'm starting to think that avoiding const versions could be a better choice (generic code using non-const containers would work). Users that are sure about thread-safety can uncast the container before passing it to the algorithm. Anyway, I'm afraid my opinion might change again ;-) Thanks, Ion

Thanks for your comments. Both options seems reasonable. I think the user of a splay_tree expects splaying as the default option, so I'm starting to think that avoiding const versions could be a better choice (generic code using non-const containers would work). Users that are sure about thread-safety can uncast the container before passing it to the algorithm. Anyway, I'm afraid my opinion might change again ;-)
I like this approach. A splay tree is inherently non-const. Attempting to support const operations is a mistake for a variety of reasons. Removing all const operations on a collection will have a number of negative consequences, but I'd like to believe that a lot of that is required when you have a collection that behaves like a splay tree. I can think of a bunch of ways of softening the blow to the STL collection concept, but they all revolve around hiding the fact that you don't want to do non-const lookups with a splay tree. I'd rather advertise that behavior rather than trying to hide it.

"Ion Gaztañaga" <igaztanaga@gmail.com> wrote in message news:471926A7.3000401@gmail.com... Dave Jenkins escribió:
Rather than remove the self-adjusting optimization, we decided to enable it conditionally when BOOST_DISABLE_THREADS is defined and exclude it otherwise.
Ummm.. IMHO this option is surely safe but many times a library is compiled as multithreaded even if only a thread is used.
Defining BOOST_DISABLE_THREADS doesn't prevent you from compiling multithreaded. And if it *is* defined, you can safely splay in all cases.
Apart from that, having threading enabled does not mean that containers will be shared between threads (we can have a GUI and the backend thread).
That's a good point. You could use a macro specific to your library to indicate that it's safe to splay in all cases. But, if it's only *sometimes* safe to splay, that won't work. In that case, my favorite is your idea of default splaying with a thread safety warning. And also offering a splay_tree.find(a, dont_splay) for when you want don't want splaying.
Sadly, a splay tree that does not splay on searches, is no longer a splay tree and will have quite bad performance. You didn't have any noticeable performance impact when multithreaded was enabled?
We did see a 10% drop in performance when the self-adjusting optimization is disabled, for a typical search. Of course, a lot depends on how skewed the access pattern is. Our ternary search trie is hidden inside other code, so it would be difficult to selectively disable/enable. That's why we went for the "big hammer" of BOOST_DISABLE_THREADS. Regards, Dave Jenkins

"Ion Gaztañaga" <igaztanaga@gmail.com> wrote in message news:4717AEEC.1030607@gmail.com... Splaying performed would break usual de-facto thread-safety guarantees for STL and Intrusive containers (read-only access from different threads is thread-safe). We had the same worry about thread-safety on the self-adjusting trie in xpressive (xpressive/detail/utility/symbols.hpp). Rather than remove the self-adjusting optimization, we decided to enable it conditionally when BOOST_DISABLE_THREADS is defined and exclude it otherwise.

Dave Jenkins escribió:
"Ion Gaztañaga" <igaztanaga@gmail.com> wrote in message news:4717AEEC.1030607@gmail.com... Splaying performed would break usual de-facto thread-safety guarantees for STL and Intrusive containers (read-only access from different threads is thread-safe).
We had the same worry about thread-safety on the self-adjusting trie in xpressive (xpressive/detail/utility/symbols.hpp). Rather than remove the self-adjusting optimization, we decided to enable it conditionally when BOOST_DISABLE_THREADS is defined and exclude it otherwise.
Ummm.. IMHO this option is surely safe but many times alibrary is compiled as multithreaded even if only a thread is used. Apart from that, having threading enabled does not mean that containers will be shared between threads (we can have a GUI and the backend thread). Sadly, a splay tree that does not splay on searches, is no longer a splay tree and will have quite bad performance. You didn't have any noticeable performance impact when multithreaded was enabled? Regards, Ion
participants (5)
-
Dave Jenkins
-
fred bertsch
-
Ion Gaztañaga
-
Kevin Sopp
-
Michael Fawcett