
On 7/31/07, John Phillips <phillips@delos.mps.ohio-state.edu> wrote:
My apologies for the delay in this posting, but the review period for the Time Series library submitted by Eric Neibler runs from Monday, July 30 until Wednesday, August 8. From the documentation:
I have started to review the library, and at this point I am confused :-) So I thought I would share my thoughts in case someone can de-confuse me and I can produce a more useful review. To preface, I would be *very* interested in having a time series library in Boost. I use time series frequently in my work, although I wouldn't call myself an expert. That said, it took me a while to understand the language of the documentation. The whole "runs, discretization, coarseness" business didn't readily make sense to me. The lightbulb finally went off when I saw: "The integral of a series is calculated by multiplying the value of each run in the series by the run's length, summing all the results, and multiplying the sum by the series' discretization." Aha! Now I knew exactly what run and discretization meant - it is as if you are modeling a piecewise constant function with something like a scalable "x" axis. I went on to experiments with some dense, sparse, and piecewise constant series to see how they behave. The first thing that struck me as odd was: dense_series<double, daily> dense(start = 0, stop = 2); sparse_series< double, daily, double > sparse; sparse = dense; // Compilation error! (GCC 4.0.1/darwin) However, if I change the discretization type to int, all is good. Is there a reason? Before assigning to sparse, I populated the dense series with: ordered_inserter< dense_series< double > > in_d( dense ); in_d(10,0)(11,1)(12,2).commit(); After assigning to sparse, I was a little surprised at what sparse was spitting back out via the [] operator. Basically, sparse[0] == 10, sparse[1] == 11, sparse[2] == 12, and seemingly 0 everywhere else, but the docs say: sparse_series<> A sparse series where runs have unit length. So how come I get sparse[0.1] == sparse[-0.1] == 0? Shouldn't there be a unit length interval around 0 where sparse is 10? I think you may have commented on this recently, but I found it rather unexpected, given what the docs say (otherwise, perfectly good behavior). I moved on to: piecewise_constant_series< double, int, double > pc; pc = dense; And now more weirdness... pc[x] == 10 for x \in [0, 1] (inclusive)!!! i.e., pc[0] == pc[1] == 10, but pc[1.01] == 11. The first reason why this strikes me as odd is, if I start with a discrete time series (as dense_series models very nicely), and put it into something that expands it to a piecewise constant function, the choice of translating "value 10 at time 0" to "value 10 at interval [0, 1]" is not obvious... if I wanted approximation, I would be more likely to assign to a particular pc[x] the value of the closest point in dense - i.e. I'd be more likely to set pc[0.9]==11 because 0.9 is closer to 1 than to 0. That strategy would make it difficult to deal with the infinite pre-run and post-run, though, and you could argue that assignment is not about this kind of approximation but about something else - so you may want to specify in the docs what the semantics of conversion through assignment is. The second reason why I find this odd is, I would at least expect that for an explicitly specified point such as dense[1], pc[1] would have the same value after assignment. Anyway, at this point I stopped experimenting as it seemed to me that there was something fundamental that I wasn't getting.
From reading the docs, I do have some other comments:
"It may seem odd that even the unit series have a value parameter. This is because 1 is not always convertible to the series' value type. Consider a delta_unit_series< std::vector< int > >. You may decide that for this series, the unit value should be std::vector< int >(3, 1). For unit series with scalar value types such as int, double and std::complex<>, the value parameter is ignored." Question: if I was dealing with something that needs to have the "1" specified, couldn't I just use delta_series instead of delta_unit_series? About discretization: your prenamed discretizations start with with daily as mpl::int<1>. That's great if that's how the integrals need to be calculated. However, in a lot of cases one would want to have "secondly" as mpl::int<1>. Could you provide another set of prenamed discretizations that provide that option? Although, if someone used both of them in the same app, then compile-time checking would consider them equivalent, no? Could the units lib be used for this instead? About the docs: time series can be expressed very clearly visually - I think that the docs would benefit greatly from some visual examples. I guess that's it for now. Sorry about my confusion :-) I can try to give a little more feedback if I start to understand the fundamentals of the library better. Back to bed for me... I actually tried to go to bed earlier but couldn't sleep because I was thinking about this lib. So I woke up to write this. Weird. Stjepan