boost::time_series pre-review questions

boost::time_series is available here. The review stars at the end of the month. http://boost-consulting.com/vault/index.php?directory=Math%20-%20Numerics Here are some initial questions and observations. All questions in this post are related to the simple_moving_average_functor, which I have included at the bottom of this post. I have more questions, but I thought this would be a good start. 1) A simple way to do a "simple_moving_average" is to use a std::deque, adding elements using "push_back" and removing elements using "pop_front". This technique can be used for any of the smoothing algorithms that involve a fixed number of elements. Do any of the boost::time_series containers mimic the "deque" capability to add and remove elements at both ends of the array? I used a "deque" in the sample "functor" at bottom of this post. 2) The "simple_moving_average_functor" has two functions, "finalize()" and "operator()". (see below) The second function passes in the offsets, for "start" and "stop". However, the "finalize" function does not pass any offsets. Im not shur how to integrate these "offsets" into the calculating the "moving average". The functor that I created below just ignores them. Any suggestions? 3) After calling commit(), am I not able to add any more elements to a "time_series" container. In other words, do I have to re-create the "time_series" container each time. Take for example, the case when Im calculating the "simple_moving_average" over a period of 10 days. As I move through the data set, all that I need to do is add another datat-point, and then remove the last data point. To re-populate the "time_series" container each time would be non-optimal. I wonder if "time_series" container could release that data, and let me start adding/removing elements to it again? 4) Do "range_run_storage::transform" and "range_run_storage::for_each" support lambdas? The example using "range_run_storage::transform" might look like this: boost::range_run_storage::transform(series1, series2, boost::lambda::_1 + boost::lambda::_2, inserter); Example using "std::transform" std::transform(series1.begin(),series1.end(),series2.begin(), boost::lambda::_1 + boost::lambda::_2, std::back_inserter(series3)); 5) What is the prefered method to do simple math against the elements in a "time_series" container. Say for example I wanted to multiply the number 1.10 against each element. How would I do this using boost::range_run_storage::for_each(). The example might look like this: boost::range_run_storage::for_each(series1, boost::lambda::_1 * boost::lambda::constant(1.10)); Example using "std::for_each" std::for_each(series1.begin(),series1.end(), boost::lambda::_1 * boost::lambda::constant(1.10)); template<typename out_t, typename series_t> struct simple_moving_average_functor { std::size_t periods; std::deque<double> data; out_t out; simple_moving_average_functor(const out_t& out, std::size_t periods, const series_t& series) : out(out), periods(periods){} template <typename value_t, typename offset_t> void operator ()(const value_t& value, offset_t start, offset_t stop) { if (data.size() < periods) { data.push_back(value); return; } double sum = 0; std::for_each(data.begin(),data.end(), boost::lambda::var(sum) += boost::lambda::_1); out(sum/periods, start,stop); data.pop_front(); data.push_back(value); } out_t const &finalize() { double sum = 0; std::for_each(data.begin(),data.end(), boost::lambda::var(sum) += boost::lambda::_1); return out(sum/periods); } };

Tom Brinkman wrote:
boost::time_series is available here. The review stars at the end of the month. http://boost-consulting.com/vault/index.php?directory=Math%20-%20Numerics
Here are some initial questions and observations.
All questions in this post are related to the simple_moving_average_functor, which I have included at the bottom of this post.
I have more questions, but I thought this would be a good start.
1) A simple way to do a "simple_moving_average" is to use a std::deque, adding elements using "push_back" and removing elements using "pop_front". This technique can be used for any of the smoothing algorithms that involve a fixed number of elements.
Do any of the boost::time_series containers mimic the "deque" capability to add and remove elements at both ends of the array? I used a "deque" in the sample "functor" at bottom of this post.
No, none of the time series are like deque. Most of the time series types use a vector of some ilk as the backing storage. I could probably make this configurable. But I don't think you need a deque-ish time series here, unless I'm mistaken. This is just an implementation detail, and it doesn't need to be a time series, IIUC. Besides, deque is the wrong data structure. You really want a circular buffer.
2) The "simple_moving_average_functor" has two functions, "finalize()" and "operator()". (see below) The second function passes in the offsets, for "start" and "stop". However, the "finalize" function does not pass any offsets. Im not shur how to integrate these "offsets" into the calculating the "moving average". The functor that I created below just ignores them. Any suggestions?
I think the right thing to do is pretend that values at skipped offsets are zero, and factor them in to your moving average. The idea is that a sparse time series with a 3 at offset 1 and a 4 at offset 10 should yield the same moving average as a dense series with zeros everywhere else. Make sense? I think your implementation can be a lot more efficient. You're summing all the values in the window at each iteration. Can't you store a running sum, and a circular buffer of elements in the window? When you get a new element, you subtract the value popped from the back and add the value pushed to the front, divide by the size of the buffer, and you're done. (Well, you also need to factor in any zeros at offsets that have been skipped.)
3) After calling commit(), am I not able to add any more elements to a "time_series" container. In other words, do I have to re-create the "time_series" container each time. Take for example, the case when Im calculating the "simple_moving_average" over a period of 10 days. As I move through the data set, all that I need to do is add another datat-point, and then remove the last data point. To re-populate the "time_series" container each time would be non-optimal. I wonder if "time_series" container could release that data, and let me start adding/removing elements to it again?
One of us is confused---I don't know which. :-) The moving average function takes an input series and the size of the window, and creates a new time series that contains the averages, right? So your functor will stuff averages into an OutputInserter and when it's done, it calls commit(). And then you're done. You don't need to add any more to the output series. Forgive me if I've totally made a hash of what you're trying to do.
4) Do "range_run_storage::transform" and "range_run_storage::for_each" support lambdas?
The example using "range_run_storage::transform" might look like this:
boost::range_run_storage::transform(series1, series2, boost::lambda::_1 + boost::lambda::_2, inserter);
That should work. This version of range_run_storage::transform() is documented to accept a BinaryFunction as its 3rd parameter. "_1 + _2" should model BinaryFunction. range_run_storage::for_each() takes a TernaryFunction. Check the docs.
Example using "std::transform"
std::transform(series1.begin(),series1.end(),series2.begin(), boost::lambda::_1 + boost::lambda::_2, std::back_inserter(series3));
Careful here. Time series are not containers in the STL sense. They don't have begin() and end() member functions. It's not clear what that would mean when a series could stretch from -inf to +inf. (The only exception is a clipped_series<>, which *does* model an STL Container, because it is guaranteed to have a definite start and end offset.) And using a std::back_inserter isn't going to work because time series do not have an insert() member function. Stick to the range_run_storage algorithms---that's what they're there for.
5) What is the prefered method to do simple math against the elements in a "time_series" container. Say for example I wanted to multiply the number 1.10 against each element. How would I do this using boost::range_run_storage::for_each().
Check out the range_run_storage::transform_inplace() algorithm. HTH, -- Eric Niebler Boost Consulting www.boost-consulting.com The Astoria Seminar ==> http://www.astoriaseminar.com

I've read the documentation for this library and can't begin to understand what it is actually for. I'm pretty sure its going over my head, could someone put it in laymans terms for me?
participants (3)
-
Eric Niebler
-
Jan Stetka
-
Tom Brinkman