
On Thu, Dec 1, 2011 at 10:45 PM, Andrii Sydorchuk <sydorchuk.andriy@gmail.com> wrote:
You are right, if you mean the cost of constructing an advanced data structure: it is O(N log N). Once the required data have been written, the sum and mean value of any range of consecutive elements can be found in O(log N) time. Please have a look at the performance tests of the fast accumulate() algorithm in the section:
The next solution answers each query in O(1) and requires only an O(N) construction step:

    #include <cstddef>
    #include <vector>

    template <typename T>
    class accumulator {
    public:
        // O(N): build prefix sums, sum[i] = elems[0] + ... + elems[i-1].
        void init(const std::vector<T> &elems) {
            sum.reserve(elems.size() + 1);
            sum.push_back(0);
            for (typename std::vector<T>::const_iterator it = elems.begin();
                 it != elems.end(); ++it) {
                sum.push_back(sum.back() + *it);
            }
        }

        // O(1): sum of elems[first..last], inclusive.
        T operator()(std::size_t first, std::size_t last) const {
            return sum[last + 1] - sum[first];
        }

    private:
        std::vector<T> sum;
    };

I guess this solution is slightly better for those math targets you mentioned above. It also works with any associative operation that has an inverse (subtraction plays that role for sums).
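A quick usage sketch (the sample values are hypothetical, assuming the accumulator class above):

    #include <iostream>

    int main() {
        // Hypothetical sample data.
        int raw[] = {3, 1, 4, 1, 5, 9, 2, 6};
        std::vector<int> v(raw, raw + 8);

        accumulator<int> acc;
        acc.init(v);                               // O(N) construction
        std::cout << acc(2, 5) << std::endl;       // 4 + 1 + 5 + 9 = 19, O(1) query
        std::cout << acc(0, 7) / 8.0 << std::endl; // mean of all elements
        return 0;
    }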
Andrii
This is the best possible approach provided that the data are not modified or updated. Unfortunately, in practical data analysis, data fitting, and approximation this is a relatively rare case. For large and huge data sets the cost of vector update operations will be quite significant and often prohibitive.

It is also interesting to note that if the same solution were based on the container sequence<bp_tree_idx> from the proposed library, it would efficiently support update operations on container elements; however, it would still have linear cost for updating the sum data. The best performance for a user algorithm that requires a wide set of operations can be achieved with sequence<bp_tree_idx_acc>. Constant-cost summation similar to this example is possible, but not implemented yet; this is discussed in the section "Iterators and accumulators".

Thank you,
Vadim
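As an illustration of this trade-off, a Fenwick (binary indexed) tree gives both point updates and range sums in O(log N); the sketch below is a generic example, not the interface of sequence<bp_tree_idx_acc>:

    #include <cstddef>
    #include <vector>

    // Generic Fenwick tree sketch: point update and range sum, both O(log N).
    // Illustrative only; not the proposed library's API.
    template <typename T>
    class fenwick {
    public:
        explicit fenwick(std::size_t n) : tree(n + 1, T()) {}

        // Add delta to the element at 0-based index i: O(log N).
        void update(std::size_t i, T delta) {
            for (std::size_t j = i + 1; j < tree.size(); j += j & (~j + 1))
                tree[j] += delta;
        }

        // Sum of elements [0..i], inclusive: O(log N).
        T prefix(std::size_t i) const {
            T s = T();
            for (std::size_t j = i + 1; j > 0; j -= j & (~j + 1))
                s += tree[j];
            return s;
        }

        // Sum of elements [first..last], inclusive.
        T operator()(std::size_t first, std::size_t last) const {
            return prefix(last) - (first ? prefix(first - 1) : T());
        }

    private:
        std::vector<T> tree;
    };

Compared with the prefix-sum vector above, queries slow down from O(1) to O(log N), but an element update costs O(log N) instead of an O(N) rebuild of the partial sums.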