Re: [boost] Upcoming review of Time Series

26 Jul 2007


      Hugo Duncan wrote:
...
Hi,
...
And if anybody would like to peruse the documentation online, you can
find it here:
http://boost-sandbox.sourceforge.net/libs/time_series/doc/html/
Since I am currently doing some time series work, I thought I would have
a look.  First glance shows an impressive amount of clear thinking and
extensibility.
Thanks.
...
Below are some use cases that are important to me, and
which I cannot immediately see as possible within the framework.
I see no mention of how to handle large time series that need processing  
incrementally.  There are a lot of concepts in there, so I may well have  
missed something.  At the moment I handle these by reading the data into a  
circular buffer with enough capacity to provide as much time history as I  
need for the algorithms to run at each time point.  Using some  
imagination, I can see using the library to do this in either of two ways.
The first would be to use the existing series types and update the series  
values for each new datapoint - but I don't see a way of discarding data  
points.
The second would be to define a new series type that accepted new values,
discarded old values, and generally did all the bookkeeping - is that
possible within the confines of the framework?
I don't think either of these is the right approach. It shouldn't be the 
series' job to keep a circular buffer of data for the algorithm to use. 
Rather, if an algorithm requires a buffer of previously seen data, it 
should cache the data itself, as in the rolling average implementation I 
sent around a few days ago.
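
To illustrate the idea of an algorithm caching its own history, here is a
minimal sketch in plain C++. It is not the rolling average implementation
mentioned above and it does not use any Time Series library types; the
class name RollingMean and its interface are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the Time Series library): a rolling mean
// that owns a fixed-size circular buffer holding the last `window` samples.
class RollingMean {
public:
    explicit RollingMean(std::size_t window)
        : buf_(window, 0.0), next_(0), count_(0), sum_(0.0) {
        assert(window > 0);
    }

    // Feed one sample; returns the mean of the samples currently buffered.
    double operator()(double x) {
        if (count_ == buf_.size()) {
            sum_ -= buf_[next_];        // buffer full: evict the oldest sample
        } else {
            ++count_;                   // still filling the window
        }
        buf_[next_] = x;
        sum_ += x;
        next_ = (next_ + 1) % buf_.size();
        return sum_ / static_cast<double>(count_);
    }

private:
    std::vector<double> buf_;  // circular storage for the window
    std::size_t next_;         // slot the next sample will overwrite
    std::size_t count_;        // number of valid samples (<= window)
    double sum_;               // running sum of the buffered samples
};
```

The series only has to hand over one value at a time; all the bookkeeping
for discarding old points lives in the algorithm object.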

The Sequence concept on which all the time series are built requires 
readable and incrementable cursors. That means the time series 
algorithms *should* all work with an "input" or single-pass series type 
-- that is, one with a destructive read. That would be the way to go 
IMO. I could see a time series type implemented in terms of std::istream 
that reads runs from std::cin, for instance. Or more practically, one 
that memory-maps parts of a huge file and traverses it with single-pass 
cursors. This would be a very interesting time series! The algorithms 
haven't been tested with such a single pass series, but I don't see a 
fundamental problem with it.
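
As a rough sketch of what single-pass traversal looks like, the code below
uses std::istream_iterator as a stand-in for a single-pass cursor. It does
not model the library's Sequence or cursor concepts, and the names
visit_single_pass, sum_stream and sum_text are hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Traverse a stream of whitespace-separated samples exactly once, passing
// each (offset, value) pair to `visit`. Nothing is stored, so the read is
// destructive: once a value has been visited it cannot be revisited.
template <class Visitor>
std::size_t visit_single_pass(std::istream& in, Visitor visit) {
    std::size_t offset = 0;
    std::istream_iterator<double> it(in), end;
    for (; it != end; ++it, ++offset) {
        visit(offset, *it);
    }
    return offset;  // number of samples consumed
}

// Example consumer: sum every sample in constant memory.
inline double sum_stream(std::istream& in) {
    double total = 0.0;
    visit_single_pass(in, [&total](std::size_t, double v) { total += v; });
    return total;
}

// Convenience wrapper for trying the above on an in-memory string.
inline double sum_text(const std::string& text) {
    std::istringstream in(text);
    return sum_stream(in);
}
```

Passing std::cin to sum_stream would process input as it arrives, which is
the incremental behaviour asked about.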
...
I also have data that has non-constant sampling periods.  At the moment I  
handle these by piecewise sampling to another (constant period) timebase.   
Can this be fitted into the framework?
I'm not 100% sure I understand your use case. But most of the series 
types and algorithms allow non-discrete sequences. That is, the offsets 
can be floating point. Could that help?
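
For reference, resampling irregularly spaced samples onto a constant-period
timebase can be sketched as below. This assumes linear interpolation
between neighbouring samples, which may not be what "piecewise sampling"
means in your case. It uses plain standard containers rather than library
series types, and the names Samples and resample_linear are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// (time, value) pairs, assumed sorted by strictly increasing time.
typedef std::vector<std::pair<double, double> > Samples;

// Evaluate the series at t0, t0 + dt, ..., t0 + (n - 1) * dt by linear
// interpolation. Times outside the sampled range are extrapolated from
// the first or last segment.
std::vector<double> resample_linear(const Samples& s, double t0, double dt,
                                    std::size_t n) {
    assert(s.size() >= 2 && dt > 0.0);
    std::vector<double> out;
    out.reserve(n);
    std::size_t j = 0;  // index of the left end of the current segment
    for (std::size_t i = 0; i < n; ++i) {
        const double t = t0 + dt * static_cast<double>(i);
        // Advance to the segment [s[j], s[j+1]] containing t.
        while (j + 2 < s.size() && s[j + 1].first <= t) ++j;
        const double ta = s[j].first, tb = s[j + 1].first;
        const double w = (t - ta) / (tb - ta);
        out.push_back(s[j].second + w * (s[j + 1].second - s[j].second));
    }
    return out;
}
```

Because the output times only increase, the segment index never moves
backwards, so the input is also traversed in a single pass.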
...
Finally, I don't see a convolution algorithm for applying filters.   
Probably easy enough to implement, and is in my view important enough to  
include in the core algorithms.
Yup, no convolution yet. Sure would be nice. Patches welcome! :-)
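
As a starting point for anyone who wants to try, a direct (non-FFT)
discrete convolution over plain vectors might look like the sketch below.
It is not library code and does not use the series concepts; the function
name convolve is just for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Full discrete convolution: y[n] = sum over k of h[k] * x[n - k].
// The result has x.size() + h.size() - 1 elements.
std::vector<double> convolve(const std::vector<double>& x,
                             const std::vector<double>& h) {
    assert(!x.empty() && !h.empty());
    std::vector<double> y(x.size() + h.size() - 1, 0.0);
    for (std::size_t i = 0; i < x.size(); ++i) {
        for (std::size_t k = 0; k < h.size(); ++k) {
            y[i + k] += x[i] * h[k];  // x[i] contributes to output i + k
        }
    }
    return y;
}
```

The cost is O(N*M) for a signal of length N and a filter of length M, which
is fine for short filters; long filters would call for an FFT-based method.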


-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com

The Astoria Seminar ==> http://www.astoriaseminar.com