On 03/26/17 14:00, Vladimir Batov via Boost wrote:
On 2017-03-26 21:28, Andrey Semashev via Boost wrote:
What you're probably seeing is the overhead that comes from inserting in the middle of the vector that is used by flat_set internally. This especially matters if elements have inefficient move/copy.
Yes, I initially thought so too... but the value_type is merely a pair of pointers. The comparison operation is expensive. So, it appears it plays quite a role during initial construction. When I eliminated that as described and sorted once... huge boost.
Well, if the ordering predicate plays the defining role, I don't quite see where your gain is. Inserting N elements in an empty flat_set or inserting N elements in a vector and then sorting would give you the same number of calls to the predicate, unless I'm missing something. I still think that moving around lots of pointers on every insertion (you said the end size was ~50000 elements and I'm assuming normal distribution of insert positions) is the culprit. E.g. towards the end of construction inserting an element in the middle would have to move 50000 * 8 = ~390 KiB of data, which is quite a lot. Did you try profiling the code?
No, this won't work in general because these methods are not supposed to modify the container. In particular, they don't invalidate iterators or references, and calling these methods concurrently is thread-safe.
I do not feel those begin() et al will invalidate anything because the first call to, say, begin() will sort the underlying container once. After that it all works as usual, right?
No, not quite. flat_set< int > fs; fs.reserve(5); auto it1 = fs.begin(); auto it2 = fs.insert(42).first; auto it3 = f2.begin(); // invalidates it1 and it2 Also, as I mentioned, multiple threads may me calling begin() or find() concurrently. These methods must also have const-qualified versions to be able to be used on non-mutable containers.