
Steven Watanabe wrote:
AMDG
Eric Niebler <eric <at> boost-consulting.com> writes:
My point is, the phrase "if you want buffering" doesn't make any sense. The user (or generic algorithm) doesn't know and shouldn't care what the optimal insertion strategy for a given series type is.
Is there any data structure for which buffering actually helps?
If we're talking about buffering of sorted data being appended to the end of a sequence, then I think I agree with you. No *useful* data structure comes to mind. But IMO the problem is larger than "buffering sorted data being appended to the end of a sequence." More below.
Let's back up. Why do you believe that
back_insert (v1, from1, to1) back_insert (v2, from2, to2) back_insert (v3, from3, to3)
should be lower-level than
begin transmission back_insert (v1, from1, to1) back_insert (v2, from2, to2) back_insert (v3, from3, to3) end transmission
?
It is possible to implement either in terms of the other so in a sense neither one is intrinsically "lower-lever."
I disagree with this. If anything non-trivial is going on in commit(), then the second version cannot be *efficiently* implemented in terms of the first, which means the second is necessarily the more primitive of the two.
The first is simpler.
Agreed.
I do not think the second gains enough flexibility to justify the added complexity.
This is a judgment call, and I have done a poor job justifying the added complexity.
I've just been running some tests with binary trees and the results are surprising. Neither msvc 8.0 nor gcc 3.4 implement the range insertion in linear time for sorted ranges (In spite of the standard). Both just do an element by element insert.
Gah! That's terrible. :-(
Second, I discovered that the map_back_inserter I posted is as fast as a buffered version ( the buffering is insignificant, BTW).
Very interesting data points. Thanks for the tests. The inserter design (with commit()) came out of a design discussion with Dave A., which I took the time to review this morning. He was coming at the problem from the perspective of MTL4 -- matrices and vectors. For MTL4, he wanted generic ways of representing sparse data and of building sequences. We wanted a mechanism that was general. He imagined a desire to temporarily violate the invariants of a sequence while building it, and then using commit() to "make it good". Some examples of invariant breaking insertions might be inserting data that is unsorted, or sorted in reverse order, or inserting at the front of a sequence (not necessarily invariant-breaking, but possibly inefficient otherwise). For these purposes, there would be a collection of inserters, not just the "ordered_inserter" in Time_series, which is a rather uninteresting example. This perhaps explains why we've been talking past each other. I've had this desire for extreme generality in the back of my mind. The Sequence and RangeRunStorage concepts are general and can be reused in other contexts, where alternate representations may be more common and pre-sorted data may be less. Where does this leave inserters and the time series library? Perhaps we decide that within the time series domain, inserting pre-sorted data at the end of the sequence is the only scenario that matters, and everything else is unnecessary complication. For the purpose of generality though, the inserter interface should remain on the lower-level RangeRunStorage interface. Perhaps for the TimeSeries concepts, the inserters are hidden, and push_back How It's Done. How does this sound to you? Personally, I wasn't ready to say that appending pre-sorted data is the only scenario that matters; hence, I exposed the more general interface. Perhaps someone more steeped in the time series and signal processing domains could weight in. -- Eric Niebler Boost Consulting www.boost-consulting.com The Astoria Seminar ==> http://www.astoriaseminar.com