
Steven Watanabe wrote:
AMDG
Eric Niebler <eric <at> boost-consulting.com> writes:
I don't doubt it. std::map does it for its range-based insert so it must be possible. And this may seem like splitting hairs to you, but what I said is true -- you can't efficiently build a std::map by pushing in sorted data one element at a time, because the interface doesn't allow for it. You have to pass it a range of sorted elements. You may say, "Well, then don't use a std::map to build a time series!" Fair enough. But when a perfectly satisfactory data structure like std::map cannot be adapted to satisfy a concept, it suggests a faulty concept.
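For concreteness, a minimal sketch of the difference, using only what std::map already guarantees: the range constructor is required to run in linear time when the input range is sorted, while plain element-at-a-time insertion pays an independent O(log n) lookup per element.

    #include <map>
    #include <utility>
    #include <vector>

    int main()
    {
        // Samples already sorted by key, as a time series producer would emit them.
        std::vector<std::pair<int, double> > sorted;
        for (int i = 0; i != 1000; ++i)
            sorted.push_back(std::make_pair(i, i * 0.5));

        // Range-based: one call; linear because the input is sorted.
        std::map<int, double> fast(sorted.begin(), sorted.end());

        // Element-at-a-time: each insert does its own O(log n) search.
        std::map<int, double> slow;
        for (std::size_t i = 0; i != sorted.size(); ++i)
            slow.insert(sorted[i]);
    }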
Or that the container is not perfectly satisfactory for the particular task. std::map is simply not designed for inserting many elements at the end. Buffering the input still increases the overhead significantly because std::map is not allowed to assume that the input range is sorted. It must check. So using std::map is worse than a custom container for two reasons--it needs temporary storage and it must check the range before inserting it.
I agree that with a push-based insertion interface, std::map will be worse than a custom type that is designed with a push-based interface in mind. The question is whether models of the concept are *required* to have a push-based interface, or whether they can expose an opaque "collector" (as you refer to it below) that gives them a more flexible collection strategy.
And (crucially) you would do away with commit() on both ordered_assign and ordered_append? Or would you keep them on the low-level interface and add a higher level interface that called commit for you and handled the 90% case?
I would remove commit and require buffering to be explicit.
The thought I'm not sold on is:
- get rid of commit() on the lower level interface.
I consider immediate insertion lower level than buffering.
This is misrepresenting my position. Buffering is not inherent in what I've been endorsing. The low-level interface I've been suggesting is:

    begin transmission
    back_insert (v1, from1, to1)
    back_insert (v2, from2, to2)
    back_insert (v3, from3, to3)
    end transmission

Whether or not buffering happens during this operation is entirely up to the container. This is a very general form for a push-based interface. It can accommodate immediate insertion *or* buffering.
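For concreteness, a minimal sketch of that form in C++; the names (series_transmission, push_back) are invented for illustration, and the comments mark where a container could choose immediate insertion or buffering:

    // Invented sketch: begin/end transmission bracket the pushes; whether
    // anything is buffered in between is the container's private decision.
    template<typename Series>
    class series_transmission
    {
    public:
        explicit series_transmission(Series &s)
          : series_(s)
        {
            // "begin transmission": the container may allocate a buffer here, or not.
        }

        template<typename Value, typename Offset>
        void back_insert(Value const &v, Offset from, Offset to)
        {
            // The container's choice: insert now, or stash until the end.
            series_.push_back(v, from, to);
        }

        ~series_transmission()
        {
            // "end transmission": a buffering container would flush here.
        }

    private:
        Series &series_;
    };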
OK. It seems to me that such a thing could be built on top of an ordered_append that has a commit(), simply by calling commit() after every appended element. Would you agree?
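For concreteness, a minimal sketch of that adaptor; the inserter type and its interface (operator() plus commit()) are assumptions for illustration:

    // Invented sketch: wrap a commit-style ordered_append and flush after
    // every element, yielding the plain (commit-free) interface on top.
    template<typename OrderedAppend>
    struct eager_appender
    {
        explicit eager_appender(OrderedAppend a)
          : appender_(a)
        {}

        template<typename Value, typename Offset>
        void operator()(Value const &v, Offset from, Offset to)
        {
            appender_(v, from, to);
            appender_.commit();   // commit after every appended element
        }

    private:
        OrderedAppend appender_;
    };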
I would prefer to implement the "commit" version in terms of the plain version. See below.
Anyway, the STL-ish assign(begin,end) doesn't have this problem because the end is specified explicitly and it has extra benefits (the size may be known), but this doesn't mesh well with the signal processing domain. In assign(begin,end), the container *pulls* the data. In signal processing, data is often read by polling, and then *pushed* into a container, hence the desire for something like a back_inserter.
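For concreteness, a small sketch of the two directions with plain std::vector; poll_sample() is a stand-in for whatever polling source the application has:

    #include <iterator>
    #include <vector>

    double poll_sample();  // hypothetical polling source

    // Pull: the container walks a range that must already exist in full.
    void pull_style(std::vector<double> &v, std::vector<double> const &src)
    {
        v.assign(src.begin(), src.end());
    }

    // Push: the producer hands over one element at a time as it arrives.
    void push_style(std::vector<double> &v, int n)
    {
        std::back_insert_iterator<std::vector<double> > out(v);
        for (int i = 0; i != n; ++i)
            *out++ = poll_sample();
    }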
Just wondering... Isn't it possible to have a pull model from the container's side *and* a push model from the data's side with coroutines?
Yes, if we had them. And we do, in some sense (see last year's Google Summer of Code).
Time passes ...
If we ditched the assigners and the appenders and went with the range-based STL-ish interface, what happens? Any convenient push-based interface we build on top would have to resort to sticking in one value at a time, just like std::back_inserter. Do you really feel that's an improvement? It's an honest question.
Yes I do.
If you want buffering, it doesn't have to be done behind the scenes. By adjusting the interface slightly, it can be implemented in terms of concepts that already exist.
    typedef get_efficient_collector<series_type>::type collector_type;
    collector_type collector;
    back_inserter<collector_type> inserter(collector);
    for(;;)
        *inserter++ = make_tuple(poll_some_value(), last_time, current_time);
    series.assign(std::move(collector));
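For concreteness, a minimal sketch of what the get_efficient_collector trait might look like; none of this is existing library interface, and the nested value_type/offset_type names are assumptions. The primary template falls back to buffering into a std::vector of (value, from, to) tuples, and a series that can insert immediately would specialize it with a write-through proxy instead:

    #include <tuple>
    #include <vector>

    // Invented sketch of the trait: by default, collect into a vector of
    // (value, from, to) tuples that series.assign() can consume in one pass.
    template<typename Series>
    struct get_efficient_collector
    {
        typedef std::vector<
            std::tuple<
                typename Series::value_type,
                typename Series::offset_type,
                typename Series::offset_type
            >
        > type;
    };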
Precisely. This is how a buffered ordered inserter works today. Now consider the case of a generic time series algorithm that needs to construct a new series. It doesn't know the optimal way to build the series, so it has to ask the series for its "efficient collector". In short, the efficient collector is the de-facto standard way to build a series generically.

My point is, the phrase "if you want buffering" doesn't make any sense. The user (or generic algorithm) doesn't know and shouldn't care what the optimal insertion strategy for a given series type is. Only the series type knows that. This is why I made the "efficient collector" interface the default. Everything else can be built on top.

Let's back up. Why do you believe that

    back_insert (v1, from1, to1)
    back_insert (v2, from2, to2)
    back_insert (v3, from3, to3)

should be lower-level than

    begin transmission
    back_insert (v1, from1, to1)
    back_insert (v2, from2, to2)
    back_insert (v3, from3, to3)
    end transmission

?

--
Eric Niebler
Boost Consulting
www.boost-consulting.com

The Astoria Seminar ==> http://www.astoriaseminar.com