Accumulators: Using data vector from Interprocess

Hi,
I have a large data set to process, so I am planning to use a vector backed by a memory-mapped file from boost::interprocess. Could I use this vector directly with accumulators and perform computations on the data in the boost::interprocess container without having to copy it all into the container in accumulator_set?
Also, is there a way to remove an entry once I have added it to an accumulator_set? I want to emulate a sliding window calculation. For example, if my window size is 100, when I add the 101st entry, I want to remove the 1st entry so that the size is always 100.
-dhruva
-- Contents reflect my personal views only!

dhruva wrote:
> Hi, I have large data to be processed hence planning to use memory mapped file backed vector from boost::interprocess. Could I use this vector directly in accumulators and perform computations on data in the boost::interprocess container without having to copy it all into the container in accumulator_set?
> Also, is there a way to remove an entry once I add it to an accumulator_set? I want to emulate a sliding window calculation. Ex: If my window size is 100, when I add the 101st entry, I want to remove the 1st entry so that the size is always 100.
> -dhruva
You can use an accumulator with any container that allows you to walk through the contained values, so it will work with this one. Most accumulators don't store any of the data points, and the ones that do store some don't store all of them. So, in general, you will not be making a copy of all the data.
For the moving average on a window, the accumulators library doesn't currently include an accumulator that does this, so a custom accumulator has to be defined. However, you might look in the sandbox and see if one is there. I don't recall who did it, but I remember a discussion about making one of these in the past. I think there is an available version somewhere.
John

John Phillips wrote:
> dhruva wrote:
>> Hi, I have large data to be processed hence planning to use memory mapped file backed vector from boost::interprocess. Could I use this vector directly in accumulators and perform computations on data in the boost::interprocess container without having to copy it all into the container in accumulator_set?
>> Also, is there a way to remove an entry once I add it to an accumulator_set? I want to emulate a sliding window calculation. Ex: If my window size is 100, when I add the 101st entry, I want to remove the 1st entry so that the size is always 100.
> You can use an accumulator with any container that allows you to walk through the contained values, so it will work with this one. Most accumulators don't store any of the data points, and the ones that do store any, don't store all. So, in general you will not be making a copy of all the data.
Correct, thanks John.
> For the moving average on a window, the accumulators library doesn't currently include an accumulator that does this so a custom accumulator has to be defined. However, you might look in the sandbox and see if one is there. I don't recall who did it, but I remember a discussion about making one of these in the past. I think there is an available version somewhere.
Also correct. A moving average accumulator is an often requested feature. I knocked one together as a proof-of-concept during the accumulators review, but never polished it. You can find it in this message:
http://lists.boost.org/Archives/boost/2007/07/124979.php
HTH,
--
Eric Niebler
BoostPro Computing
http://www.boostpro.com

Eric Niebler wrote:
> A moving average accumulator is an often requested feature. I knocked one together as a proof-of-concept during the accumulators review, but never polished it. You can find it in this message:
Whoops, that's a rolling average algorithm for the time series library. The implementation for the accumulators library is considerably simpler. I just knocked this together, but it seems to work.

///////////////////////////////////////////////////////////////////////////////
// Copyright 2008 Eric Niebler. Distributed under the Boost
// Software License, Version 1.0. (See accompanying file
// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
#include <boost/circular_buffer.hpp>
#include <boost/accumulators/framework/extractor.hpp>
#include <boost/accumulators/framework/depends_on.hpp>
#include <boost/accumulators/framework/accumulator_base.hpp>
#include <boost/accumulators/framework/parameters/sample.hpp>
#include <boost/accumulators/numeric/functional.hpp>

namespace boost { namespace accumulators
{
    BOOST_PARAMETER_NESTED_KEYWORD(tag, rolling_average_window_size, window_size)

    namespace impl
    {
        template<typename Sample>
        struct rolling_average_impl
          : accumulator_base
        {
            // for boost::result_of
            typedef typename numeric::functional::average<Sample, std::size_t>::result_type result_type;

            template<typename Args>
            rolling_average_impl(Args const & args)
              : sum_(args[sample | Sample()])
              , buffer_(args[rolling_average_window_size])
            {
            }

            rolling_average_impl(rolling_average_impl const &that)
              : sum_(that.sum_)
              , buffer_(that.buffer_)
            {
                this->buffer_.set_capacity(that.buffer_.capacity());
            }

            template<typename Args>
            void operator ()(Args const & args)
            {
                this->push_back_(args[sample]);
            }

            result_type result(dont_care) const
            {
                BOOST_ASSERT(this->buffer_.size() != 0);
                return numeric::average(this->sum_, this->buffer_.size());
            }

        private:
            rolling_average_impl &operator=(rolling_average_impl const &);

            void push_back_(Sample const &value)
            {
                if(this->buffer_.full())
                {
                    this->sum_ -= this->buffer_[0];
                }
                this->sum_ += value;
                this->buffer_.push_back(value);
            }

            Sample sum_;
            circular_buffer<Sample> buffer_;
        };
    }

    namespace tag
    {
        struct rolling_average
          : depends_on< >
          , tag::rolling_average_window_size
        {
            typedef accumulators::impl::rolling_average_impl< mpl::_1 > impl;
        };
    }

    namespace extract
    {
        extractor<tag::rolling_average> const rolling_average = {};
    }

    using extract::rolling_average;
}}

#include <iostream>
#include <boost/accumulators/accumulators.hpp>

int main()
{
    using namespace boost;
    using namespace accumulators;

    accumulator_set<double, features< tag::rolling_average > >
        acc(tag::rolling_average::window_size = 5);

    acc(1.); std::cout << rolling_average(acc) << std::endl;
    acc(2.); std::cout << rolling_average(acc) << std::endl;
    acc(3.); std::cout << rolling_average(acc) << std::endl;
    acc(4.); std::cout << rolling_average(acc) << std::endl;
    acc(5.); std::cout << rolling_average(acc) << std::endl;
    acc(6.); std::cout << rolling_average(acc) << std::endl;
    acc(7.); std::cout << rolling_average(acc) << std::endl;
    return 0;
}

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com

Hello,
The help and support on this list is unmatched by any of the paid support you get from most companies. Thank you all for such detailed responses. For someone like me getting into boost, this really helps.
On Wed, Dec 24, 2008 at 12:49 AM, Eric Niebler <eric@boostpro.com> wrote:
> Eric Niebler wrote:
>> A moving average accumulator is an often requested feature. I knocked one together as a proof-of-concept during the accumulators review, but never polished it. You can find it in this message:
> Whoops, that's a rolling average algorithm for the time series library. The implementation for the accumulators library is considerably simpler. I just knocked this together, but it seems to work.
> ///////////////////////////////////////////////////////////////////////////////
> // Copyright 2008 Eric Niebler. Distributed under the Boost
> // Software License, Version 1.0. (See accompanying file
> // LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
> #include <boost/circular_buffer.hpp>
> BOOST_PARAMETER_NESTED_KEYWORD(tag, rolling_average_window_size, window_size)
In this example, the window size is known and fixed (I agree that this matches the requirement in my original post). However, my window is a time window, something like a weekly sliding-window average. Since the data collection frequency can vary in my case, I will not be able to use a circular list. I used a deque where I add at the end and pop off the front until the difference between the last time stamp (most recently added) and the first time stamp is NOT greater than the window size. It is this deque that I intend to create using interprocess.
The deque will hold pairs of time stamp and value (pair<timestamp, data>). Would an accumulator handle a complex type in a list, or should it be a list of data only, on which the computations will be made? Is there some concept like an accessor that I can implement to define how the data is extracted from the list before it is passed on? I could get the 'second' from my deque entry and return that from the function.
Before someone says RTFM: I am reading the accumulators docs, but I generally find boost a little difficult to get started with. I am waiting for a good book in English that teaches boost with some real-life examples, or a cookbook-style book. I learn best through cookbook-style books!
thanks,
-dhruva
-- Contents reflect my personal views only!
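The deque-based time window described in this message can be sketched as follows. This is only an illustration under stated assumptions, not an accumulator from the library: the class name TimeWindowAverage, the use of long for time stamps, and the requirement that samples arrive in non-decreasing time order are all hypothetical choices made for the example. It uses std::deque; whether a boost::interprocess deque can be dropped in unchanged is not verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical helper (not part of Boost.Accumulators): average of the
// values whose time stamps lie within `window` of the newest sample.
class TimeWindowAverage
{
public:
    typedef long Timestamp;  // assumed unit, e.g. seconds

    explicit TimeWindowAverage(Timestamp window)
      : window_(window), sum_(0.0)
    {
        assert(window > 0);
    }

    // Add a (time stamp, value) pair, then drop entries from the front
    // until the span between oldest and newest is not greater than the
    // window size.
    void add(Timestamp t, double value)
    {
        // Assumption: samples arrive in non-decreasing time order.
        assert(samples_.empty() || t >= samples_.back().first);

        samples_.push_back(std::make_pair(t, value));
        sum_ += value;

        // The newest entry has age 0, so the deque never becomes empty.
        while (t - samples_.front().first > window_)
        {
            sum_ -= samples_.front().second;  // only 'second' is data
            samples_.pop_front();
        }
    }

    double average() const
    {
        assert(!samples_.empty());
        return sum_ / static_cast<double>(samples_.size());
    }

    std::size_t size() const { return samples_.size(); }

private:
    Timestamp window_;
    double sum_;
    std::deque<std::pair<Timestamp, double> > samples_;
};
```

With a window of 7, adding (0, 10), (3, 20) and (8, 30) leaves the last two entries in the window. Keeping a running sum avoids rescanning the deque on every query, at the cost of some floating-point drift over very long runs. If library statistics are wanted instead, the 'second' member of each remaining entry is the value that would be passed to an accumulator_set.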
participants (3)
- dhruva
- Eric Niebler
- John Phillips