[multi_index] Multi-Index as fast running median solver

I'm hoping to use multi-index' ordered_index (non-unique) to do a fast running median. Given a set of M numbers and a window size of N compute the medians (for each of M - N + 1 solutions). Here is my solution: template<typename T> class FastMedianSolver { private: // boost::multi_index does most of the heavy lifting for this FastMedianSolver. // This index is given both "sequence" and "tree" semantics. typedef typename boost::multi_index_container< T,indexed_by<sequenced<>,ordered_non_unique<identity<T> > > > SortedList; typedef typename SortedList::template nth_index<0> SeqIndex; typedef typename SeqIndex::type SIndex; typedef typename SortedList::template nth_index<1> TreeIndex; typedef typename TreeIndex::type TIndex; SortedList mQueue; const size_t mSize; const size_t mHalfSize; public: typedef typename SIndex::iterator iterator; /** * Create a FastMedianSolver with queue size as argument * * @param size - size of the queue. */ // Note: fast divide by 2 is: X >> 1 FastMedianSolver(size_t size) : mSize(size), mHalfSize(size >> 1) {} /** * insert a value into the median queue */ void insert(const T& value) { mQueue.push_front(value); if(mQueue.size() > mSize) mQueue.pop_back(); } /** * Get the median for the elements within the queue. Note: * the algorithm requires a full queue, o.w. may throw an exception. */ const T& getMedian() { // Ensure queue is filled up before we ask for median! assert((mQueue.size() >> 1) == mHalfSize); typename TIndex::iterator tNdx(mQueue.get<1>().begin()); // Now grab the median element // Note that unfortunately the root() node is not exposed so the only way to get the central node // is to iterate to it :( (this will incur a O(N) operation on this function which is a little unfortunate for(int i = 0; i < mHalfSize; ++i,++tNdx); // This would be better if it were available and accurate! // typename TIndex::iterator tNdx(mQueue.get<1>().beginAtRoot()); // TODO: Check if we need to average the output of 2 positions if mSize is even return *tNdx; } }; Note, that to get to the central node I just iterate the ordered list half way through set: for(int i = 0; i < mHalfSize; ++i,++tNdx); However, what I was hoping was that I could get the root node directly. ordered_index doesn't expose an iterator to root() so I put it in ordered_index.hpp as: iterator beginAtRoot(){return make_iterator(root());} However, this isn't working out because the tree isn't always balanced. My question is 2-fold is there any way to force the internal red-black tree to stay balanced; and secondly, would that have an appreciable performance penalty. Thank you, Ryan

Ryan Fogarty <ryan.fogarty.msece <at> gmail.com> writes:
I'm hoping to use multi-index' ordered_index (non-unique) to do a fast running median. Given a set of M numbers and a window size of N compute the medians (for each of M - N + 1 solutions). Here is my solution:
[...]
However, what I was hoping was that I could get the root node directly. ordered_index doesn't expose an iterator to root() so I put it in ordered_index.hpp as:
iterator beginAtRoot(){return make_iterator(root());}
However, this isn't working out because the tree isn't always balanced. My question is 2-fold is there any way to force the internal red-black tree to stay balanced; and secondly, would that have an appreciable performance penalty.
Hi Ryan, Excuse my very late answering, been on vacation the last two weeks. ordered_index internal implementation relies on red-back trees, which are not perfectly balanced but only roughly so (the longest root-to-leaf path is at most twice as long as the shortest one). I don't think it's easy to modify the handling code to improve on that. Joaquín M López Muñoz Telefónica, Invewstigación y Desarrollo
participants (2)
-
Joaquin M Lopez Munoz
-
Ryan Fogarty