
On Mar 16, 2006, at 4:29 PM, Vaclav Vesely wrote:
Howard, thank you very match for your suggestions. I didn't know about move semantics proposal so far.
My pleasure Vaclav. I think your code is close to optimum for what can be achieved today, and indeed looks quite similar to what I put into the CodeWarrior STL to simulate move semantics (before I had language support to do it right). I well understand your desire to see these kinds of optimizations today, as opposed to years from now.
There are three related operation to optimize:
template<typename Type> void move(Type& dst, Type& src)
You might consider calling this one move_assign. That gives you a nice symmetry to your naming scheme.
Default implementation for move is copy assignment. But for each type should be safe alternative implementation by swap. Maybe the swap could be good default implementation for types without trivial copy constructor. I can't imagine an example of a type with non- trivial constructor, where copy implementation would be more effective.
One example is pair<string, int>. :-( Admittedly this is because pair doesn't have a swap and should (and I will do my best to see to it that future std::pair does have swap - http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/ n1856.html#20.2.3%20-%20Pairs ). But the example stands: Any class that holds a heavy type where the designer neglected to give the class a swap will be pessimized by a move_assign that swaps instead of copies. I've also seen rather simple classes that are given copy constructors when the compiler-generated copy constructor would have been both trivial and correct. Admittedly that's a (performance) bug, but I've seen it happen often enough. I think I'd stick with a default implementation of copy assignment.
template<typename Type> void move_construct(Type& dst, Type& src)
Default implementation is copy construct. For classes with default constructor can be move effective to default construct and than swap. For specialize this function for 3rd party types (STL for example) it would be handy to be able to declare, that we prefer default-construct-swap alternative but to leave at compiler to decide if there is an default constructor and so if it's possible. Unfortunately I don't know, how to write has_default_constructor trait. I tried to use concept_check but it seems to be a wrong way. Maybe it's not possible without help from compiler. Any idea?
Sorry, I'm afraid I don't have any ideas short of compiler help. Other reasonable optimizations (for some classes) are a memcpy followed by memset the source to 0 (or equivalent). This is just a documentation suggestion for you, not a recommendation for you to make that the default. Some classes are specifically designed to default construct (or otherwise have a valid state) with all bits 0 (one of my favorite std::string implementations was designed with this in mind. ;-) Also for empty allocators, probably every std::vector could use that technique).
In the attached test there is a trait "use_swap_for_move" and "use_swap_for_move_construct". The first is false by default, the second takes over the value of the first. Thus it's possible to change implementation of all three functions by specializing only the "use_swap_for_move" trait (typical usage). But for non-default- constructible types it's possible to change only the implementation of the move function.
You've got a convenience - performance tradeoff going here. What you have seems most convenient. However the memcpy/memset algorithm, especially if it doesn't actually call memcpy and memset, is going to be faster than default construct then swap for some types. Perhaps that just adds up to a documentation issue? As an example: // assume vector is a friend, empty allocator assumed template <class T> inline void move_construct(std::vector<T>& dst, std::vector<T>& src) { dst.begin_ = src.begin_; dst.end_ = src. end_; dst.last_ = src. last_; src.begin_ = 0; src.end_ = 0; src.last_ = 0; } 3 reads, 6 writes. Versus: template <class T> inline void move_construct(std::vector<T>& dst, std::vector<T>& src) { // default construct dst dst.begin_ = 0; dst.end_ = 0; dst.last_ = 0; // swap src, dst tmp.begin_ = src.begin_; tmp.end_ = src. end_; tmp.last_ = src. last_; src.begin_ = dst.begin_; src.end_ = dst.end_; src.last_ = dst.last_; dst.begin_ = tmp.begin_; dst.end_ = tmp.end_; dst.last_ = tmp.last_; } 9 reads, 12 writes, which looks over 2x slower. With luck your optimizer might store all three words of tmp in registers (on a sufficiently register-rich platform) and that would bring your cost down to: 6 reads, 9 writes, which is still at least 50% slower than the memcpy/ memset algorithm. Good luck with your library. I hope to make it obsolete in C++0X! :-) -Howard