[smart ptr] Any interest in copy-on-write pointer for C++11?

Hi,
is there any interest in a copy-on-write pointer implementation? I wrote a cow_ptr<T> template class for C++11 which has the following use cases:
1. They are well suited as pimpl pointers for classes with value semantics. With most other smart pointers it is always necessary to reimplement the copy constructor and the copy assignment operator, sometimes also the destructor. This is not necessary here, and you'll get the right semantics and even the additional performance gain due to lazy copying. A further advantage is that constness of cow pointer member functions propagates to the Impl object naturally, making it easier to write const-correct code.
    // central_class_with_value_semantics.h
    class CentralClassWithValueSemantics
    {
    public:
        // ... public interface goes here ... //
    private:
        struct Impl; // forward declaration
        cow_ptr<Impl> m;
    };
    // central_class_with_value_semantics.cpp
    struct CentralClassWithValueSemantics::Impl
    {
        // ... definition of hidden members goes here ... //
    };
This is called the pimpl idiom, or private implementation idiom, handle body idiom, or compiler firewall idiom.
2. For classes with members whose copy operations are expensive and/or which take a lot of space in memory, these members can be wrapped in a cow pointer. Examples are matrix or image classes whose data might be stored in a std::vector. The matrix header information may be changed without a deep copy. This boosts performance and optimizes memory usage, but at the same time retains the value semantics one feels comfortable with.
    class Matrix
    {
    public:
        // ... public interface goes here ... //
    private:
        size_t nRows, nCols; // can be touched without deep copy.
        cow_ptr<std::vector<float>> data; // copy-on-write
    };
In such classes the default version of copy and move construction will usually work just fine.
3. You can add cloning to a class hierarchy from the outside. With
    cow_ptr<Base> a( new Derived1 );
    cow_ptr<Base> b( new Derived2 );
    cow_ptr<Base> c;
    c = a; // performs a shallow copy.
    c->doSomething(); // makes a deep copy of a as a Derived1 class.
                      // There is no slicing involved.
you copy Base objects polymorphically. The class Base can even be abstract here. It is only required that Derived1 and Derived2 be CopyConstructible.
4. You can create arrays with elements that retain polymorphic behaviour and have genuine value semantics at the same time with
    std::vector<cow_ptr<Base>> polymorphic_array_with_value_semantics;
--
There are probably other use cases. I published the code on github ( https://github.com/ralphtandetzky/cow_ptr). Any suggestions for improvements would be greatly appreciated.
Regards, Ralph Tandetzky.
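
The lazy-copy behaviour described in this post can be illustrated with a minimal sketch. It is not the cow_ptr from the repository: the class name simple_cow and the member shares_with are hypothetical, it is built on std::shared_ptr, it ignores polymorphic cloning, and the use_count() test is only dependable in single-threaded code.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sketch; simple_cow is not the cow_ptr from the repository.
template <typename T>
class simple_cow {
public:
    explicit simple_cow(T value) : p_(std::make_shared<T>(std::move(value))) {}

    // Read access never copies and only hands out a pointer to const.
    const T* operator->() const { return p_.get(); }

    // Write access detaches first if the object is currently shared.
    T* operator->() {
        detach();
        return p_.get();
    }

    // True if both handles still refer to the same underlying object.
    bool shares_with(const simple_cow& other) const { return p_ == other.p_; }

private:
    void detach() {
        // Assumption: single-threaded use, so use_count() is a reliable test.
        if (p_.use_count() > 1) p_ = std::make_shared<T>(*p_);
    }

    std::shared_ptr<T> p_;
};
```

Copying a simple_cow only increments a reference count. The first call to the non-const operator-> on a shared handle performs the deep copy, while access through a const handle never does, which is the const propagation mentioned in use case 1.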

Interesting. Is it thread safe?
On 8 Feb, 2013, at 11:16 PM, Ralph Tandetzky <ralph.tandetzky@googlemail.com> wrote:
Hi,
is there any interest in a copy-on-write pointer implementation? I wrote a cow_ptr<T> template class for C++11 which has the following use cases:
...
There are probably other use cases. I published the code on github ( https://github.com/ralphtandetzky/cow_ptr). Any suggestions for improvements would be greatly appreciated.
Regards, Ralph Tandetzky.
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

It is as thread-safe as necessary. The reference counter is atomic. It's all well documented on github <https://github.com/ralphtandetzky/cow_ptr>. You can find all the thread-safety guarantees there. Thread-safety is documented at the bottom of the long comment preceding the cow_ptr class.
On 02/08/2013 05:33 PM, Alexey Tkachenko wrote:
Interesting. Is it thread safe?
On 8 Feb, 2013, at 11:16 PM, Ralph Tandetzky <ralph.tandetzky@googlemail.com> wrote:
Hi,
is there any interest in a copy-on-write pointer implementation? [...]

On 08/02/13 17:50, Ralph Tandetzky wrote:
It is as thread-safe as necessary. The reference counter is atomic. It's all well documented on github <https://github.com/ralphtandetzky/cow_ptr>. You can find all the thread-safety guarantees there. Thread-safety is documented at the bottom of the long comment preceding the cow_ptr class.
This code is actually C++11 and incompatible with C++03. It is quite strange that you chose to implement a COW facility in the only dialect of the language that never needs it.

On 08/02/13 17:50, Ralph Tandetzky wrote:
It is as thread-safe as necessary. The reference counter is atomic. It's all well documented on github <https://github.com/ralphtandetzky/cow_ptr>. You can find all the thread-safety guarantees there. Thread-safety is documented at the bottom of the long comment preceding the cow_ptr class.
You should have just used std/boost::shared_ptr and added your COW logic on top. Your code reinvents the wheel, is quite inefficient (be it in terms of memory layout, construction or assignment), and I'm not even sure this is really thread-safe. Lock-free programming is tricky. If I remember right, boost::shared_ptr needed to make some copy operations atomic as well, which requires a spinlock or 128-bit CAS. Surely this also applies here.

On 02/08/2013 06:20 PM, Mathias Gaunard wrote:
On 08/02/13 17:50, Ralph Tandetzky wrote:
It is as thread-safe as necessary. The reference counter is atomic. It's all well documented on github <https://github.com/ralphtandetzky/cow_ptr>. You can find all the thread-safety guarantees there. Thread-safety is documented at the bottom of the long comment preceding the cow_ptr class.
You should have just used std/boost::shared_ptr and added your COW logic on top. Your code reinvents the wheel, is quite inefficient (be it in terms of memory layout, construction or assignment), and I'm not even sure this is really thread-safe. Lock-free programming is tricky. If I remember right, boost::shared_ptr needed to make some copy operations atomic as well, which requires a spinlock or 128-bit CAS. Surely this also applies here.
It wouldn't work to add the cow logic on top of a shared_ptr when polymorphism is involved. Then an explicit clone() member function of the template type would be needed. The nice thing about the cow_ptr is that you can get this cloning facility non-intrusively. Concerning thread-safety: Calling member functions of cow_ptr is not atomic. However, two cow_ptrs that point to the same object can be used simultaneously to modify the object. In the worst case, both cow_ptrs will each make a copy of the pointed-to object. In this sense it is safe to use cow_ptrs in multithreaded environments. However, my cow_ptr class is not thread-safe in the sense that you could write to one cow_ptr from several threads simultaneously. Implementing this might lead to a bad performance penalty.

On 02/08/2013 06:52 PM, Ralph Tandetzky wrote:
On 02/08/2013 06:20 PM, Mathias Gaunard wrote:
On 08/02/13 17:50, Ralph Tandetzky wrote:
It is as thread-safe as necessary. The reference counter is atomic. It's all well documented on github <https://github.com/ralphtandetzky/cow_ptr>. You can find all the thread-safety guarantees there. Thread-safety is documented at the bottom of the long comment preceding the cow_ptr class.
You should have just used std/boost::shared_ptr and added your COW logic on top. Your code reinvents the wheel, is quite inefficient (be it in terms of memory layout, construction or assignment), and I'm not even sure this is really thread-safe. Lock-free programming is tricky. If I remember right, boost::shared_ptr needed to make some copy operations atomic as well, which requires a spinlock or 128-bit CAS. Surely this also applies here.
It wouldn't work to add the cow logic on top of a shared_ptr, when polymorphism is involved. Then an explicit clone() member function of the template type would be needed. The nice thing about the cow_ptr is that you can get this cloning facility non-intrusively.
Concerning thread-safety: Calling member functions of cow_ptr is not atomic. However, two cow_ptrs that point to the same object can be used simultaneously to modify the object. In the worst case, both cow_ptrs will each make a copy of the pointed-to object. In this sense it is safe to use cow_ptrs in multithreaded environments. However, my cow_ptr class is not thread-safe in the sense that you could write to one cow_ptr from several threads simultaneously. Implementing this might lead to a bad performance penalty.
Now that we're at it: What's wrong with the memory layout, construction and assignment? How can it be made more effective? By the way, I have implemented copy assignment and move assignment separately now, so the code will be a little faster. Is that the kind of optimization you meant?

On 08/02/13 18:52, Ralph Tandetzky wrote:
It wouldn't work to add the cow logic on top of a shared_ptr, when polymorphism is involved. Then an explicit clone() member function of the template type would be needed. The nice thing about the cow_ptr is that you can get this cloning facility non-intrusively.
I don't see how doing that while using shared_ptr is a problem. Either you make a shared_ptr of your structure that contains your cloning thing, or you store it separately. Of course you'll need to change the virtual function declaration to T* clone(T*), since the structure wouldn't contain the object anymore. Storing it separately seems like a better idea since it avoids the unneeded use of dynamic allocation (all derived types are the same size).
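
One possible reading of the suggestion to store the cloning function separately is sketched below. The names cow_handle, clone_as, Shape and Polygon are hypothetical, the cloner takes a const T* rather than the T* clone(T*) mentioned above, and the use_count() check is not a thread-safe test. It is an illustration under those assumptions, not anyone's actual implementation.

```cpp
#include <cassert>
#include <memory>

// Hypothetical hierarchy used only for illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
    virtual void grow() = 0;
};

struct Polygon : Shape {
    int n = 3;
    int sides() const override { return n; }
    void grow() override { ++n; }
};

// Hypothetical sketch: a shared_ptr plus a separately stored clone function.
template <typename T>
class cow_handle {
public:
    // The concrete type U is known here, so a cloner for U can be recorded.
    template <typename U>
    explicit cow_handle(U* raw) : p_(raw), clone_(&clone_as<U>) {}

    const T* get() const { return p_.get(); }
    const T* operator->() const { return p_.get(); }

    // Write access: clone as the recorded type if the object is shared.
    T* operator->() {
        if (p_.use_count() > 1) p_.reset(clone_(p_.get()));
        return p_.get();
    }

private:
    template <typename U>
    static T* clone_as(const T* src) {
        return new U(*static_cast<const U*>(src));
    }

    std::shared_ptr<T> p_;
    T* (*clone_)(const T*);  // one function pointer per handle, no extra allocation
};
```

The cloner is a plain function pointer held next to the shared_ptr, so Shape itself needs no clone() member; the price is one extra pointer per handle.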

On 08/02/13 16:16, Ralph Tandetzky wrote:
Hi,
is there any interest in a copy-on-write pointer implementation? I wrote a cow_ptr<T> template class for C++11 which has the following use cases:
1. They are well suited as pimpl pointers for classes with value semantics. With most other smart pointers it is always necessary to reimplement the copy constructor and the copy assignment operator, sometimes also the destructor. This is not necessary here, and you'll get the right semantics and even the additional performance gain due to lazy copying. A further advantage is that constness of cow pointer member functions propagates to the Impl object naturally, making it easier to write const-correct code.
I don't see any reason to make this use lazy copying/COW. COW is only useful if you're copying when you do not want to. The solution is easy: do not copy if you don't want to; this is made easier when you have the option to move instead. COW could be used as an alternative implementation for compatibility with code that isn't move-aware, but that's essentially the only use case.
Now the reason why this isn't more common is that there is no clean way to copy an object polymorphically. There are two approaches:
- You can use the static type of the object when inserted and assume it is the same as the dynamic type. That's not ideal, and this requires storing that information separately, adding a bit of space overhead. This is the approach boost::any uses.
- You can make types that need polymorphic copy have some sort of virtual copy constructor, which follows the same principle as other virtual functions. A member function declared as T* clone() const is often used, but this has obvious problems with flexibility, since it couples allocation and construction. The best you can do is add two functions, one that gives the size and alignment, the other that constructs an object at a given address. With this approach, even if you insert an object from a reference/pointer to base, the derived object will correctly get copied, which is a very useful property when writing code that uses polymorphism with value objects.
From your description, I take it you're using the first approach.
Finally, and this will be my last complaint with this design: what you're modeling isn't really a pointer, but rather a smart object. It might therefore be worthwhile to change not only the name, but also the interface so that construction takes references rather than pointers.
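
The second approach (an intrusive virtual copy) might look roughly like the sketch below. All names (Animal, Bird, clone_size, clone_align, clone_at) are hypothetical and chosen for illustration; the sketch shows both the classic clone() and the decoupled variant that reports storage requirements and constructs at a given address.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical hierarchy used only for illustration.
struct Animal {
    virtual ~Animal() = default;
    virtual int legs() const = 0;

    // Classic virtual copy: couples allocation (new) and construction.
    virtual Animal* clone() const = 0;

    // Decoupled variant: report storage needs, then copy-construct in place.
    virtual std::size_t clone_size() const = 0;
    virtual std::size_t clone_align() const = 0;
    virtual Animal* clone_at(void* storage) const = 0;
};

struct Bird : Animal {
    int legs() const override { return 2; }
    Animal* clone() const override { return new Bird(*this); }
    std::size_t clone_size() const override { return sizeof(Bird); }
    std::size_t clone_align() const override { return alignof(Bird); }
    Animal* clone_at(void* storage) const override {
        // Placement new: the caller owns the storage and must run the destructor.
        return new (storage) Bird(*this);
    }
};
```

With clone_at the caller chooses where the copy lives (stack buffer, arena, small-object storage), at the cost of having to call the destructor explicitly. Every class in the hierarchy has to implement these members, which is what makes this approach intrusive.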

On 02/08/2013 06:01 PM, Mathias Gaunard wrote:
On 08/02/13 16:16, Ralph Tandetzky wrote:
Hi,
is there any interest in a copy-on-write pointer implementation?
I don't see any reason to make this use lazy copying/COW. COW is only useful if you're copying when you do not want to. The solution is easy: do not copy if you don't want to; this is made easier when you have the option to move instead. COW could be used as an alternative implementation for compatibility with code that isn't move-aware, but that's essentially the only use case.
Now the reason why this isn't more common is that there is no clean way to copy an object polymorphically.
There are two approaches:
- You can use the static type of the object when inserted and assume it is the same as the dynamic type. That's not ideal, and this requires storing that information separately, adding a bit of space overhead. This is the approach boost::any uses.
- You can make types that need polymorphic copy have some sort of virtual copy constructor, which follows the same principle as other virtual functions. A member function declared as T* clone() const is often used, but this has obvious problems with flexibility, since it couples allocation and construction. The best you can do is add two functions, one that gives the size and alignment, the other that constructs an object at a given address. With this approach, even if you insert an object from a reference/pointer to base, the derived object will correctly get copied, which is a very useful property when writing code that uses polymorphism with value objects.
From your description, I take it you're using the first approach.
Finally, and this will be my last complaint with this design: what you're modeling isn't really a pointer, but rather a smart object. It might therefore be worthwhile to change not only the name, but also the interface so that construction takes references rather than pointers.
Copy-on-write is the most important use case of the class design. Even larger libraries like Qt use copy-on-write for some types (like QImage or QPixmap) and make the user interface much easier to use. A major reason is that value semantics are easier to reason about than reference semantics. For this purpose cow_ptr<T> is an enabler. Moving isn't always sufficient.
The class design allows you to copy objects polymorphically. I tried it. It works. It's useful. I used it in production code.
As far as the semantics go, you're right. It's almost like wrapping an object. But the syntax is too close to the syntax of other smart pointers, so I decided not to make it take a reference. The operator-> would not have an appropriate equivalent.
Ralph Tandetzky

On 08/02/13 18:41, Ralph Tandetzky wrote:
Copy-on-write is the most important use case of the class design. Even larger libraries like Qt use copy-on-write for some types (like QImage or QPixmap) and make the user interface much easier to use.
That's because Qt is broken, and is written to support broken C++ programming practices. If you only copy when you need to copy, then there is no need for the COW mechanism. COW was invented to work around issues with code that copied but didn't actually need to. COW also has the nasty effect that your real object is gonna change its address just because you modified it. For this reason the C++11 standard prevents implementations of standard components like std::string from using that mechanism.
A major reason is that value semantics are easier to reason about than reference semantics. For this purpose cow_ptr<T> is an enabler. Moving isn't always sufficient.
Value semantics do not require COW. There is no logic between your statements. Value semantics mean that when you modify a copy of an object, then the original is left unmodified. To achieve this you can either copy when requested or delay it until you're actually modifying. The simplest way to deal with this is to simply copy when asked to copy by the user, which is not only straightforward and keeping a good separation of concerns, but it also means you don't need the delaying mechanism and the overhead attached to it. KISS.
The class design allows you to copy objects polymorphically. I tried it. It works. It's useful.
It would work just as well without COW. COW does not affect observable behaviour, it's purely an implementation-specific detail. It should not leak in the interface.
I used it in production code.
That just means the code does what you need and is stable enough to work reliably in your use cases. This doesn't say anything about the quality of the design.

On 02/09/2013 07:01 AM, Mathias Gaunard wrote:
On 08/02/13 18:41, Ralph Tandetzky wrote:
Copy-on-write is the most important use case of the class design. Even larger libraries like Qt use copy-on-write for some types (like QImage or QPixmap) and make the user interface much easier to use.
That's because Qt is broken, and is written to support broken C++ programming practices. If you only copy when you need to copy, then there is no need for the COW mechanism. COW was invented to work around issues with code that copied but didn't actually need to. COW also has the nasty effect that your real object is gonna change its address just because you modified it. For this reason the C++11 standard prevents implementations of standard components like std::string from using that mechanism.
Obviously, COW was invented for code that copied but didn't need to. And that's still a valid reason to use COW today in C++11. Yes, you can move objects and avoid a copy. Or you can swap cheaply, if you need to. But sometimes you need to copy, if you don't know for sure whether the reference count is 1.
Under your assertions the STL is just as broken. For example std::vector<T>::push_back() might change the address of the contained data and therefore invalidate all pointers and iterators to it. That's still a source of many bugs unfortunately, especially for newbies in C++.
The COW implementation of std::string had the following weird effect:
    std::string a("Hello world!");
    char & p = a[11];
    std::string b( a ); // makes a cheap copy
    p = '.'; // modifies a and b
The reason is the escaping reference. The same effect can be achieved with QImage because the interface allows escaping pointers to the contained image data. The cow_ptr<T> design I'm proposing tries to avoid this pitfall. There's a const version of the member function get(), but no mutable one, since that would let a pointer escape instantly. The only function that lets a non-const pointer escape directly is operator->(), which is reasonable, since it is unlikely for a user of the class to write
    T * raw = ptr.operator->();
To still make it possible to modify the pointed-to object easily, you can use the cow_ptr member function apply(), which takes a functor that takes a raw pointer to T, or use the macro COW_APPLY in the following way:
    auto ptr = make_cow<Type>( /* constructor arguments */ );
    ptr.apply( [&]( Type * p ) { p->modify(); } ); // equivalent to ptr->modify();
    COW_APPLY(ptr) { ptr->modify(); }; // does the same
If you really want to, you can still let a pointer or a reference to the pointed-to object escape. But it's harder. This way the interface is easy to use correctly and hard to use incorrectly. If you use it correctly (i.e. don't let references escape), then it does not suffer from the above problem with the cow implementation of string and I believe it's safe.
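
The idea behind apply() can be sketched as follows. This is not the cow_ptr from the repository: guarded_cow is a hypothetical class built on std::shared_ptr, its apply() only mirrors the call shape shown above, and the use_count() test assumes single-threaded use.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical sketch; guarded_cow is not the cow_ptr from the repository.
template <typename T>
class guarded_cow {
public:
    explicit guarded_cow(T value) : p_(std::make_shared<T>(std::move(value))) {}

    // Read-only access: never copies, only exposes a pointer to const.
    const T* get() const { return p_.get(); }

    // Mutation goes through a functor. The raw pointer is handed out only
    // for the duration of the call, after the object has been unshared.
    template <typename F>
    void apply(F f) {
        if (p_.use_count() > 1) p_ = std::make_shared<T>(*p_);
        f(p_.get());
    }

private:
    std::shared_ptr<T> p_;
};
```

Because there is no mutable get(), a write cannot happen through a pointer obtained before a copy was made, which is the situation in the std::string example. A functor that stores the pointer it receives can still defeat this, so the interface makes misuse harder rather than impossible.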
A major reason is that value semantics are easier to reason about than reference semantics. For this purpose cow_ptr<T> is an enabler. Moving isn't always sufficient.
Value semantics do not require COW. There is no logic between your statements. Value semantics mean that when you modify a copy of an object, then the original is left unmodified. To achieve this you can either copy when requested or delay it until you're actually modifying.
Sorry, my statement was incomplete. The goal is to avoid unnecessary copies, because copying can be really expensive (think of the data inside an image or matrix class). I would prefer a matrix class with value semantics instead of reference semantics. It's easier to reason about. And I would like to be able to write
    Matrix a, b(1000,1000);
    a = b;
without having to fear that the whole 1000x1000 matrix is deeply copied.
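
A minimal sketch of such a Matrix, assuming std::shared_ptr in place of cow_ptr: the member names at, set, rows, cols and shares_storage_with are hypothetical, and the use_count() test is only dependable in single-threaded code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical sketch of a value-semantic matrix with lazily copied storage.
class Matrix {
public:
    Matrix() : Matrix(0, 0) {}
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols),
          data_(std::make_shared<std::vector<float>>(rows * cols)) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Reading never triggers a copy.
    float at(std::size_t r, std::size_t c) const { return (*data_)[r * cols_ + c]; }

    // Writing copies the element storage first if it is shared.
    void set(std::size_t r, std::size_t c, float v) {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<float>>(*data_);
        (*data_)[r * cols_ + c] = v;
    }

    bool shares_storage_with(const Matrix& other) const { return data_ == other.data_; }

private:
    std::size_t rows_, cols_;                   // copied eagerly, cheap
    std::shared_ptr<std::vector<float>> data_;  // shared until first write
};
```

The compiler-generated copy and move operations are sufficient here: a = b copies two integers and one pointer, and the 1000x1000 buffer is duplicated only when one of the two matrices is written to.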
The simplest way to deal with this is to simply copy when asked to copy by the user, which is not only straightforward and keeping a good separation of concerns, but it also means you don't need the delaying mechanism and the overhead attached to it. KISS.
For client code of a class using cow_ptrs for data members internally it is even easier not to worry about making copies, but the class automatically does it for you. cow_ptr helps to implement that behaviour.
The class design allows you to copy objects polymorphically. I tried it. It works. It's useful.
It would work just as well without COW. COW does not affect observable behaviour, it's purely an implementation-specific detail. It should not leak in the interface.
It can be an implementation-specific detail and can make code faster without the client code knowing about it. But it can also be a thing the client code relies upon as with the Matrix class above. Sometimes client code might want to know, if copies can be made cheaply.
I used it in production code.
That just means the code does what you need and is stable enough to work reliably in your use cases. This doesn't say anything about the quality of the design.
At least I could convince you that there are legit use cases. That was harder than I expected.

On 09/02/13 09:49, Ralph Tandetzky wrote:
Under your assertions the STL is just as broken. For example std::vector<T>::push_back() might change the address of the contained data and therefore invalidate all pointers and iterators to it.
That is intrinsic to this data structure. Modifying the size of a sequence is a whole other thing than modifying its elements.
For client code of a class using cow_ptrs for data members internally it is even easier not to worry about making copies, but the class automatically does it for you. cow_ptr helps to implement that behaviour.
So you admit the only rationale for your cow_ptr is to support sloppy programming practices. Making everything slow and unnatural just to make it "easier for the user".
It can be an implementation-specific detail and can make code faster without the client code knowing about it. But it can also be a thing the client code relies upon as with the Matrix class above. Sometimes client code might want to know, if copies can be made cheaply.
A client should never rely on copying not being an O(n) operation, because it is. Don't want to copy? Don't copy. Now if you want to add an option to turn your copies into copy-on-writes to measure whether there is any performance gain, why not. But that's just a pessimization that may become an optimization in certain cases; it's not a general approach.

WebKit has a class vaguely like this for your case #2: https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit... used at https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit/Source/WebCore/rendering/style/RenderStyle.h&rcl=1360310731&l=137. Semantically every copy is a real copy, and, contrary to Mathias' assertion, couldn't be replaced by a move in C++11, but they want to share identical values when doing so is cheap.
On Fri, Feb 8, 2013 at 7:16 AM, Ralph Tandetzky <ralph.tandetzky@googlemail.com> wrote:
Hi,
is there any interest in a copy-on-write pointer implementation? I wrote a cow_ptr<T> template class for C++11 which has the following use cases:
...
There are probably other use cases. I published the code on github ( https://github.com/ralphtandetzky/cow_ptr). Any suggestions for improvements would be greatly appreciated.
Regards, Ralph Tandetzky.

On 02/08/2013 06:35 PM, Jeffrey Yasskin wrote:
WebKit has a class vaguely like this for your case #2: https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit... used at https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit/Source/WebCore/rendering/style/RenderStyle.h&rcl=1360310731&l=137. Semantically every copy is a real copy, and, contrary to Mathias' assertion, couldn't be replaced by a move in C++11, but they want to share identical values when doing so is cheap.
On Fri, Feb 8, 2013 at 7:16 AM, Ralph Tandetzky <ralph.tandetzky@googlemail.com> wrote:
Hi,
is there any interest in a copy-on-write pointer implementation?
Thank you for pointing this out.

On 08/02/13 18:35, Jeffrey Yasskin wrote:
WebKit has a class vaguely like this for your case #2: https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit... used at https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit/Source/WebCore/rendering/style/RenderStyle.h&rcl=1360310731&l=137. Semantically every copy is a real copy, and, contrary to Mathias' assertion, couldn't be replaced by a move in C++11, but they want to share identical values when doing so is cheap.
If you want identical values to use the same resource, you should use a flyweight factory.

On 02/08/2013 09:28 PM, Mathias Gaunard wrote:
On 08/02/13 18:35, Jeffrey Yasskin wrote:
WebKit has a class vaguely like this for your case #2: https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit...
used at https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit/Source/WebCore/rendering/style/RenderStyle.h&rcl=1360310731&l=137. Semantically every copy is a real copy, and, contrary to Mathias' assertion, couldn't be replaced by a move in C++11, but they want to share identical values when doing so is cheap.
If you want identical values to use the same resource, you should use a flyweight factory.
A flyweight factory has different use cases:
- Normally there is no reference count in a flyweight factory or its objects. Hence, when you want to modify an object, a copy is made in every case, even if there is only one copy. Therefore, a flyweight factory usually does not implement copy-on-write and can be less efficient in the case that one single object is modified regularly.
- If there's a single factory in a multi-threaded environment, then synchronization can be a real bottleneck. This is not the case for cow_ptr<T>, which works lock-free.
Hence flyweight factories have different use cases and performance characteristics. COPY-ON-WRITE HAS ITS OWN VALID USE CASES. A cow_ptr<T> template class can be utilized to implement copy-on-write for different situations. With cow_ptr<T> come some other nice use cases such as
- using it as a pimpl pointer without the need to implement copy and move constructors and assignments or the destructor,
- adding cloning to a type hierarchy non-intrusively and
- building arrays of polymorphic types with value semantics.
(See the file cow_ptr.h in the github repository <https://github.com/ralphtandetzky/cow_ptr.git> for details.) Sometimes these use cases mix. I've seen it in my own production code. Believe me, there are legit use cases of cow_ptr.
My questions are:
1. Would this fit into the boost libraries?
2. What improvements can I make to the current design and implementation?
I would really appreciate constructive feedback.
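
For comparison, a hand-rolled flyweight factory might look like the sketch below. StringFlyweights and intern are hypothetical names, this is not Boost.Flyweight, and the pool is neither reference-counted nor synchronized, which matches the two limitations listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical sketch of a flyweight factory for strings.
class StringFlyweights {
public:
    // Returns a pointer to the single stored instance equal to value,
    // inserting it first if it is not in the pool yet. Pointers stay valid
    // because std::set does not relocate its elements on insertion.
    const std::string* intern(const std::string& value) {
        return &*pool_.insert(value).first;
    }

    std::size_t size() const { return pool_.size(); }

private:
    std::set<std::string> pool_;  // entries are immutable and never released
};
```

Equal values always share one object, but every "modification" means building a new value and looking it up in the pool again, even when nobody else refers to the old one. A pool shared between threads would additionally need a lock around intern().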

On Feb 8, 2013 12:29 PM, "Mathias Gaunard" <mathias.gaunard@ens-lyon.org> wrote:
On 08/02/13 18:35, Jeffrey Yasskin wrote:
WebKit has a class vaguely like this for your case #2:
https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit...
used at https://code.google.com/p/chromium/codesearch/#chrome/src/third_party/WebKit/Source/WebCore/rendering/style/RenderStyle.h&rcl=1360310731&l=137 . Semantically every copy is a real copy, and, contrary to Mathias' assertion, couldn't be replaced by a move in C++11, but they want to share identical values when doing so is cheap.
If you want identical values to use the same resource, you should use a flyweight factory.
I'm sure the WebKit project would welcome a patch demonstrating that flyweights are a more efficient technique for CSS matching. I don't expect they are because these values are built up through several mutations, and a hash table lookup plus a copy on each mutation sounds more expensive than the current copy-on-write system. Jeffrey

On 08/02/13 22:44, Jeffrey Yasskin wrote:
I'm sure the WebKit project would welcome a patch demonstrating that flyweights are a more efficient technique for CSS matching. I don't expect they are because these values are built up through several mutations, and a hash table lookup plus a copy on each mutation sounds more expensive than the current copy-on-write system.
I don't pretend to know anything about what WebKit is actually doing. I interpreted what you said to mean that they want all values that are equal to use the same object. In that case the logical approach is indeed to use a flyweight factory. I think I just misunderstood what this was about; maybe they want to do partial COW on subtrees to minimize memory usage for redundant information, which is a whole different beast. COW on whole data structures is useless, but it is very useful when partial sharing is involved. That is not, however, the case that was presented here.

On 02/09/2013 07:13 AM, Mathias Gaunard wrote:
On 08/02/13 22:44, Jeffrey Yasskin wrote:
I'm sure the WebKit project would welcome a patch demonstrating that flyweights are a more efficient technique for CSS matching. I don't expect they are because these values are built up through several mutations, and a hash table lookup plus a copy on each mutation sounds more expensive than the current copy-on-write system.
I don't pretend to know anything about what WebKit is actually doing. I interpreted what you said to mean that they want all values that are equal to use the same object. In that case the logical approach is indeed to use a flyweight factory.
I think I just misunderstood what this was about; maybe they want to do partial COW on subtrees to minimize memory usage for redundant information, which is a whole different beast. COW on whole data structures is useless, but it is very useful when partial sharing is involved. That is not, however, the case that was presented here.
There are many different use cases. Here are two use cases for which I have used cow_ptr so far:
1. Implement an image class that has genuine value semantics. I explained that in other e-mails before.
2. For a property tree, that looks in essence like this:
class PropertyTree : public AbstractProperty { public: /* implementation of the public interface */
private: std::list<cow_ptr<AbstractProperty>> properties; };
Instead of std::list<cow_ptr<AbstractProperty>> you might as well use cow_ptr<std::list<AbstractProperty>>. The result is quite similar. Note the list of polymorphically behaving elements. Deep copies are performed properly, if necessary. You can keep a change history of the property tree in memory easily, since for little changes almost all the data of the stored trees are shared. At the same time you retain value semantics. And even "deep copies" are relatively cheap, since only the pointers in the next layer of the tree need to be copied. I found cow_ptr<T> quite useful for implementing this.
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing whether copy-on-write is a useful pattern in general.
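The change-history idea can be sketched in a few lines. This is only an illustration under stated assumptions: `cow`, `Property`, `shares_with` and `shared_after_one_edit` are hypothetical names, the wrapper is a simplified stand-in for cow_ptr<T> built on std::shared_ptr, it is not polymorphic, and the `use_count()` test is only reliable in single-threaded code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for cow_ptr<T>: copies share the pointee until one writes.
template <typename T>
class cow {
public:
    explicit cow(T value) : p_(std::make_shared<T>(std::move(value))) {}
    const T& read() const { return *p_; }
    T& write() {
        if (p_.use_count() > 1)             // shared: detach before writing
            p_ = std::make_shared<T>(*p_);
        return *p_;
    }
    bool shares_with(const cow& other) const { return p_ == other.p_; }
private:
    std::shared_ptr<T> p_;
};

struct Property {
    std::string name;
    std::string value;
};

struct PropertyTree {
    std::vector<cow<Property>> properties;
};

// Returns how many properties two snapshots still share after one edit.
std::size_t shared_after_one_edit() {
    PropertyTree v1;
    v1.properties.emplace_back(Property{"width", "640"});
    v1.properties.emplace_back(Property{"height", "480"});
    v1.properties.emplace_back(Property{"title", "demo"});

    std::vector<PropertyTree> history;
    history.push_back(v1);

    PropertyTree current = history.back();        // copies three handles only
    current.properties[0].write().value = "800";  // detaches just this element
    history.push_back(current);

    assert(history[0].properties[0].read().value == "640");  // old snapshot intact

    std::size_t shared = 0;
    for (std::size_t i = 0; i < 3; ++i)
        if (history[0].properties[i].shares_with(history[1].properties[i]))
            ++shared;
    return shared;
}
```

Only the edited element is duplicated; the other two are shared between both snapshots, which is what keeps a long history cheap.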

Ralph Tandetzky <ralph.tandetzky@googlemail.com> writes:
On 02/09/2013 07:13 AM, Mathias Gaunard wrote:
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
No doubt. As an author that's only natural. But this debate, about whether the library even needs to exist, is a necessary and vital part of the process, and keeps Boost the quality that it is. Don't take the scepticism personally. It's just the way peer review works. Alex -- Swish - Easy SFTP for Windows Explorer (http://www.swish-sftp.org)

On 9 February 2013 14:40, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
Ralph Tandetzky <ralph.tandetzky@googlemail.com> writes:
On 02/09/2013 07:13 AM, Mathias Gaunard wrote:
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
No doubt. As an author that's only natural. But this debate, about whether the library even needs to exist, is a necessary and vital part of the process, and keeps Boost the quality that it is.
Don't take the scepticism personally. It's just the way peer review works.
Saying that COW is *never* needed is not scepticism, it's dogma.

Daniel James <dnljms@gmail.com> writes:
On 9 February 2013 14:40, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
Ralph Tandetzky <ralph.tandetzky@googlemail.com> writes:
On 02/09/2013 07:13 AM, Mathias Gaunard wrote:
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
No doubt. As an author that's only natural. But this debate, about whether the library even needs to exist, is a necessary and vital part of the process, and keeps Boost the quality that it is.
Don't take the scepticism personally. It's just the way peer review works.
Saying that COW is *never* needed is not scepticism, it's dogma.
Maybe so, but the to-and-fro it causes is precisely what illuminates the issues involved for the rest of us. It doesn't really matter whether someone is being dogmatic or just playing devil's advocate. Alex -- Swish - Easy SFTP for Windows Explorer (http://www.swish-sftp.org)

On 02/09/2013 04:39 PM, Alexander Lamaison wrote:
Daniel James <dnljms@gmail.com> writes:
On 9 February 2013 14:40, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
Ralph Tandetzky <ralph.tandetzky@googlemail.com> writes:
On 02/09/2013 07:13 AM, Mathias Gaunard wrote:
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
No doubt. As an author that's only natural. But this debate, about whether the library even needs to exist, is a necessary and vital part of the process, and keeps Boost the quality that it is.
Don't take the scepticism personally. It's just the way peer review works.
Saying that COW is *never* needed is not scepticism, it's dogma.
Maybe so, but the to-and-fro it causes is precisely what illuminates the issues involved for the rest of us. It doesn't really matter whether someone is being dogmatic or just playing devil's advocate.
Alex
@Daniel Thank you for your understanding.
@Alex You're right. I was just a bit annoyed. Sorry. As you put it, it really makes sense, Alex. Someone has to have a go at it in order to prove something to be right. It's not just about design, but also about the usefulness of this class in implementing copy-on-write. I only hope that some people will agree with me that it is a helpful tool.

Ralph Tandetzky <ralph.tandetzky@googlemail.com> writes:
On 02/09/2013 04:39 PM, Alexander Lamaison wrote:
Daniel James <dnljms@gmail.com> writes:
On 9 February 2013 14:40, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
Ralph Tandetzky <ralph.tandetzky@googlemail.com> writes:
On 02/09/2013 07:13 AM, Mathias Gaunard wrote:
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
No doubt. As an author that's only natural. But this debate, about whether the library even needs to exist, is a necessary and vital part of the process, and keeps Boost the quality that it is.
Don't take the scepticism personally. It's just the way peer review works.
Saying that COW is *never* needed is not scepticism, it's dogma.
Maybe so, but the to-and-fro it causes is precisely what illuminates the issues involved for the rest of us. It doesn't really matter whether someone is being dogmatic or just playing devil's advocate.
Alex
@Daniel Thank you for your understanding.
@Alex You're right. I was just a bit annoyed. Sorry. As you put it, it really makes sense, Alex. Someone has to have a go at it in order to prove something to be right.
Nicely put.
It's not just about design, but also about the usefulness of this class in implementing copy-on-write.
Hang in there. I'm sure you'll get the technical critique you need. But most likely after the motivation hurdle has been passed. Alex -- Swish - Easy SFTP for Windows Explorer (http://www.swish-sftp.org)

On 09/02/13 16:05, Daniel James wrote:
Saying that COW is *never* needed is not scepticism, it's dogma.
Many things about programming frameworks are dogmatic, or nothing would ever be consistent.

On 10 February 2013 18:42, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
On 09/02/13 16:05, Daniel James wrote:
Saying that COW is *never* needed is not scepticism, it's dogma.
Many things about programming frameworks are dogmatic, or nothing would ever be consistent.
Boost isn't a programming framework.

Ralph Tandetzky wrote:
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
Looks pretty good, but I don't like the name much.

On 02/09/2013 05:25 PM, Peter Dimov wrote:
Ralph Tandetzky wrote:
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
Looks pretty good, but I don't like the name much.
What's wrong with the name?

Ralph Tandetzky wrote:
On 02/09/2013 05:25 PM, Peter Dimov wrote:
Ralph Tandetzky wrote:
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
Looks pretty good, but I don't like the name much.
What's wrong with the name?
Well. 1. It's an acronym. 2. ... but uses lowercase. 3. It's a farm animal. 4. ... enticing you to use member function names such as "moo". 5. It is sufficiently descriptive for one of the primary use cases (eliminating redundant expensive copies) but not for the other (a polymorphic value pointer).

On Sat, Feb 9, 2013 at 11:07 PM, Peter Dimov <lists@pdimov.com> wrote:
Ralph Tandetzky wrote:
On 02/09/2013 05:25 PM, Peter Dimov wrote:
Ralph Tandetzky wrote:
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
Looks pretty good, but I don't like the name much.
Ralph, IMO this library is useful (should not be overused, though). But I don't like the name either.
What's wrong with the name?
Well.
1. It's an acronym. 2. ... but uses lowercase. 3. It's a farm animal. 4. ... enticing you to use member function names such as "moo".
:) this got me wondering if the Moo language uses COW a lot.
5. It is sufficiently descriptive for one of the primary use cases (eliminating redundant expensive copies) but not for the other (a polymorphic value pointer).

On 09/02/13 10:15, Ralph Tandetzky wrote:
2. For a property tree, that looks in essence like this:
class PropertyTree : public AbstractProperty { public: /* implementation of the public interface */
private: std::list<cow_ptr<AbstractProperty>> properties; };
Instead of std::list<cow_ptr<AbstractProperty>> you might as well use cow_ptr<std::list<AbstractProperty>> instead.
The latter wouldn't allow polymorphism, which would be a problem. I suppose you meant cow_ptr< std::list< cow_ptr<AbstractProperty> > >. That kind of usage, however, does allow subtrees to be shared, and I can see how this can be really useful if you do a lot of out-of-place tree transformations (which is, AFAIK, the most popular approach for tree transformations).
I would really like to get feedback on the design of cow_ptr<T> <https://github.com/ralphtandetzky/cow_ptr.git> rather than discussing, if copy-on-write is a useful pattern in general.
This utility of yours is very similar to things that have existed in the C++ world for a good 10 years. There have been many other implementations, many other slightly different designs, and there are also proposals for standardization. Furthermore, COW does not appear to be strictly necessary to provide the required behaviour, so it seems there is unnecessary coupling here. I advise you to start by solving the issues with API and cloning requirements first. Once that's done, you can choose to use COW or not as a backend, but that's mostly transparent to the user and a QoI issue. I don't know the status of the proposal, but you also have to recognize that doing the same thing as a standard proposal but in an incompatible way is not something very positive, unless you can point out why the standard proposal is wrong and take steps to have it corrected.

Mathias Gaunard wrote:
I advise you to start by solving the issues with API and cloning requirements first.
...
I don't know the status of the proposal, but you also have to recognize that doing the same thing as a standard proposal but in an incompatible way is not something very positive, unless you can point out why the standard proposal is wrong and take steps to have it corrected.
I disagree on both counts. There's nothing* wrong with the API. There's nothing wrong in proposing the same thing that has already been proposed in another way. * Except the inability to convert cow_ptr<Derived> to cow_ptr<Base>. Implementing that is a challenge though.

On 08/02/13 16:16, Ralph Tandetzky wrote:
Hi,
is there any interest in a copy-on-write pointer implementation? I wrote a cow_ptr<T> template class for C++11 which has the following use cases:
As others I would prefer another name for the class. Could you compare how your class relates to value_ptr as defined here (file://localhost/Users/viboes/Downloads/n3339-1.pdf)?
3. You can add cloning to a class hierarchy from the outside. With cow_ptr<Base> a( new Derived1 ); cow_ptr<Base> b( new Derived2 ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // makes a deep copy of a as a Derived1 class. // There is no slicing involved. you copy Base objects polymorphically. The class Base can even be abstract here. It is only required that Derived1 and Derived2 be CopyConstructible.
It seems to me that the copy of a polymorphic object works only with the help of the user and that the following should be mentioned as a limitation:
Base* ba = new Derived1;
cow_ptr<Base> a( ba );
cow_ptr<Base> c;
c = a; // performs a shallow copy.
c->doSomething(); // couldn't make a deep copy of a as a Derived1 class, as the type has been lost.
// There is slicing involved.
The conversion to bool should be explicit:
explicit operator bool() const noexcept;
Best, Vicente
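The limitation can be reproduced with a much-simplified handle that, like the design under discussion, picks its copier from the static type of the pointer passed to the constructor. Everything here is a hypothetical sketch: `cloning_handle` and the two helper functions are illustrative names, there is no reference counting, and unlike the real cow_ptr there is no assertion guarding against the mismatch.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

struct Base {
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};
struct Derived1 : Base {
    std::string name() const override { return "Derived1"; }
};

// Hypothetical deep-copying handle; the cloner is fixed at construction time.
class cloning_handle {
public:
    template <typename Y>
    explicit cloning_handle(Y* p)
        : p_(p),
          clone_([](const Base* q) -> Base* {
              // Copies as Y, the static type the constructor saw.
              return new Y(*static_cast<const Y*>(q));
          }) {}

    cloning_handle(const cloning_handle& other)
        : p_(other.p_ ? other.clone_(other.p_.get()) : nullptr),
          clone_(other.clone_) {}

    const Base& operator*() const { return *p_; }

    // Explicit, so the handle does not silently convert to int or bool.
    explicit operator bool() const noexcept { return static_cast<bool>(p_); }

private:
    std::unique_ptr<Base> p_;
    std::function<Base*(const Base*)> clone_;
};

std::string copy_via_derived_pointer() {
    cloning_handle a(new Derived1);  // Y is deduced as Derived1
    cloning_handle c(a);             // cloned as Derived1
    return (*c).name();
}

std::string copy_via_base_pointer() {
    Base* ba = new Derived1;
    cloning_handle a(ba);            // Y is deduced as Base: the type is lost
    cloning_handle c(a);             // copy-constructs a Base, i.e. slices
    return (*c).name();
}
```

The first helper yields a Derived1 copy, the second a sliced Base, which is exactly the difference between the two snippets in this exchange.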

On 10/02/13 03:04, Vicente J. Botet Escriba wrote:
On 08/02/13 16:16, Ralph Tandetzky wrote:
Hi,
is there any interest in a copy-on-write pointer implementation? I wrote a cow_ptr<T> template class for C++11 which has the following use cases:
As others I would prefer another name for the class. Could you compare how your class relates to value_ptr as defined here (file://localhost/Users/viboes/Downloads/n3339-1.pdf)?
I meant " N3339: A Preliminary Proposal for a Deep-Copying Smart Pointer" http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3339.pdf
3. You can add cloning to a class hierarchy from the outside. With cow_ptr<Base> a( new Derived1 ); cow_ptr<Base> b( new Derived2 ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // makes a deep copy of a as a Derived1 class. // There is no slicing involved. you copy Base objects polymorphically. The class Base can even be abstract here. It is only required that Derived1 and Derived2 be CopyConstructible.
It seems to me that the copy of polymorphic object works only with the help of the user and that the following should be mentioned as a limitation
Base* ba = new Derived1; cow_ptr<Base> a( ba ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // couldn't make a deep copy of a as a Derived1 class as the type has been lost. // There is slicing involved.
If this is confirmed, I see it as a show-stopper, and the cloning strategy must be redefined. I don't see any relational operators. Is this intentional? Is it worth adding interactions with nullptr_t? What about a T* release() member function? What about a reset(U*) member function? Best, Vicente

On 10/02/13 10:27, Vicente J. Botet Escriba wrote:
It seems to me that the copy of polymorphic object works only with the help of the user and that the following should be mentioned as a limitation
Base* ba = new Derived1; cow_ptr<Base> a( ba ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // couldn't make a deep copy of a as a Derived1 class as the type has been lost. // There is slicing involved.
If this is confirmed, I see it as a show-stopper, and the cloning strategy must be redefined.
This is something that I already pointed out elsewhere on the thread. To me the best approach is to make the class have a "virtual copy constructor" (a clone virtual member function counts as one), which is the only way to fix this issue. I don't quite like the approach in N3339 because it couples allocation and copy construction.
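A "virtual copy constructor" in this sense could look like the following minimal sketch. `Shape`, `Circle` and `clone_through_base` are hypothetical names chosen for illustration; the point is only that the dynamic type, not the static type of the pointer, decides what gets copied.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Shape {
    virtual ~Shape() {}
    // The "virtual copy constructor": each derived class copies itself.
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    std::unique_ptr<Shape> clone() const override {
        return std::unique_ptr<Shape>(new Circle(*this));
    }
    std::string name() const override { return "Circle"; }
};

std::string clone_through_base() {
    std::unique_ptr<Shape> original(new Circle);
    const Shape* as_base = original.get();           // static type is Shape
    std::unique_ptr<Shape> copy = as_base->clone();  // dynamic type decides
    return copy->name();
}
```

The trade-off is that this approach is intrusive: every class in the hierarchy has to implement clone(), whereas the cloner captured at construction works with unmodified classes.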

On 02/10/2013 10:27 AM, Vicente J. Botet Escriba wrote:
On 10/02/13 03:04, Vicente J. Botet Escriba wrote:
On 08/02/13 16:16, Ralph Tandetzky wrote:
Hi,
is there any interest in a copy-on-write pointer implementation? I wrote a cow_ptr<T> template class for C++11 which has the following use cases:
As others I would prefer another name for the class. Could you compare how your class relates to value_ptr as defined here (file://localhost/Users/viboes/Downloads/n3339-1.pdf)?
I meant " N3339: A Preliminary Proposal for a Deep-Copying Smart Pointer" http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3339.pdf
Thank you for this awesome reference. That was very enlightening. The most obvious difference is that when value_ptrs are copied, the pointee object is cloned instantly. There is no copy-on-write. But the value semantics are quite similar. Both value_ptr and cow_ptr support polymorphic cloning. I might use some ideas from value_ptr's source code, like automatically choosing between default_clone and default_copy.
3. You can add cloning to a class hierarchy from the outside. With cow_ptr<Base> a( new Derived1 ); cow_ptr<Base> b( new Derived2 ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // makes a deep copy of a as a Derived1 class. // There is no slicing involved. you copy Base objects polymorphically. The class Base can even be abstract here. It is only required that Derived1 and Derived2 be CopyConstructible.
It seems to me that the copy of polymorphic object works only with the help of the user and that the following should be mentioned as a limitation
Base* ba = new Derived1; cow_ptr<Base> a( ba ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // couldn't make a deep copy of a as a Derived1 class as the type has been lost. // There is slicing involved.
In the DefaultCloner there is an assertion assert( typeid( *p ) == typeid( const T& ) ); which would be violated in the line you marked. So you'll notice that during debugging when the copy is performed. There's no way to check that at compile-time unfortunately.
If this is confirmed, I see it as a show-stopper, and the cloning strategy must be redefined.
I don't see any relational operators. Is this intentional?
To a degree. It's not totally obvious, if only the pointers will be compared, or if pointers are equal, if the pointed-to objects will be compared for equality as well. I'd prefer the first variant and I will add them in the next revision.
Is it worth adding interactions with nullptr_t?
I never needed it. I'm not sure. Maybe for completeness sake. Should I add it?
What about a T* release() member function?
Rather not. Should this make a copy? It has to, if there's more than one copy. If there's only one copy, then it depends if there is an internal concrete_counter or a wrapping_counter. A wrapping counter wouldn't be able to release without copying. But the most compelling reason for me is that the caller wouldn't know for sure how to delete the object. The cow_ptr could have been initialized with some kind of special deleter. (It's kind of similar to shared_ptr.)
What about a reset(U*) member function ?
This stuff can be done with the constructors and the assignment operators. So, strictly speaking there's no need for it. You might get a bit better performance with reset() though. (Is performance actually the reason for shared_ptr and unique_ptr to provide a reset() function?) At first I wanted to keep the class interface simple. I might add it in the future. Are there compelling reasons to add it now?
Best, Vicente
Thank you for your great feedback!
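The run-time check mentioned for the DefaultCloner boils down to comparing the dynamic type of the pointee with the static type the pointer was declared with. The sketch below isolates that comparison as a function returning bool rather than asserting, so it can be exercised without aborting; `dynamic_type_is_static_type` and the two helpers are hypothetical names, not part of cow_ptr.

```cpp
#include <cassert>
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived1 : Base {};

// True when *p really is a Y and not something derived from Y.
// Requires a polymorphic Y and a non-null p for the typeid lookup.
template <typename Y>
bool dynamic_type_is_static_type(const Y* p) {
    return typeid(*p) == typeid(Y);
}

bool check_exact_pointer() {
    Derived1 d;
    return dynamic_type_is_static_type(&d);  // Y is Derived1: types match
}

bool check_through_base_pointer() {
    Derived1 d;
    const Base* b = &d;
    return dynamic_type_is_static_type(b);   // Y is Base: a copy would slice
}
```

Wrapped in assert(), the second case is the one that would fire in a debug build and go unnoticed in a release build.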

On 11/02/13 19:15, Ralph Tandetzky wrote:
On 02/10/2013 10:27 AM, Vicente J. Botet Escriba wrote:
On 10/02/13 03:04, Vicente J. Botet Escriba wrote:
On 08/02/13 16:16, Ralph Tandetzky wrote:
Hi,
is there any interest in a copy-on-write pointer implementation? I wrote a cow_ptr<T> template class for C++11 which has the following use cases:
As others I would prefer another name for the class. Could you compare how your class relates to value_ptr as defined here (file://localhost/Users/viboes/Downloads/n3339-1.pdf)?
I meant " N3339: A Preliminary Proposal for a Deep-Copying Smart Pointer" http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3339.pdf
Thank you for this awesome reference. That was very enlightening. The most obvious difference is that when value_ptrs are copied, then the pointee object is cloned instantly. There is no copy-on-write. But the value semantics are quite similar. Both value_ptr and cow_ptr support polymorphic cloning.
I might use some ideas from value_ptr's source code, like automatically choosing between default_clone and default_copy.
Glad to hear this inspired you.
3. You can add cloning to a class hierarchy from the outside. With cow_ptr<Base> a( new Derived1 ); cow_ptr<Base> b( new Derived2 ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // makes a deep copy of a as a Derived1 class. // There is no slicing involved. you copy Base objects polymorphically. The class Base can even be abstract here. It is only required that Derived1 and Derived2 be CopyConstructible.
It seems to me that the copy of polymorphic object works only with the help of the user and that the following should be mentioned as a limitation
Base* ba = new Derived1; cow_ptr<Base> a( ba ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // couldn't make a deep copy of a as a Derived1 class as the type has been lost. // There is slicing involved.
In the DefaultCloner there is an assertion assert( typeid( *p ) == typeid( const T& ) ); which would be violated in the line you marked. So you'll notice that during debugging when the copy is performed. There's no way to check that at compile-time unfortunately.
Ok. So typeid( *p ) == typeid( const T& ) is a precondition of the constructor from a Y*. I find this quite unfortunate. What advantage does this provide to the user? Better performance? If yes, where?
If this is confirmed, I see it as a show-stopper, and the cloning strategy must be redefined.
I don't see any relational operators. Is this intentional?
To a degree. It's not totally obvious, if only the pointers will be compared, or if pointers are equal, if the pointed-to objects will be compared for equality as well. I'd prefer the first variant and I will add them in the next revision.
Could your class store nullptr? Does your class have value semantics or pointer semantics? IMHO, it should have value semantics and the comparison should be made on the stored value.
Is it worth adding interactions with nullptr_t?
I never needed it. I'm not sure. Maybe for completeness sake. Should I add it?
This depends on what you want, pointer or value semantics. If you want pointer semantics, interactions with nullptr_t should be added; otherwise, evidently not.
What about a T* release() member function?
Rather not. Should this make a copy? It has to, if there's more than one copy. If there's only one copy, then it depends if there is an internal concrete_counter or a wrapping_counter. A wrapping counter wouldn't be able to release without copying. But the most compelling reason for me is that the caller wouldn't know for sure how to delete the object. The cow_ptr could have been initialized with some kind of special deleter. (It's kind of similar to shared_ptr.)
OK, I see.
What about a reset(U*) member function ?
This stuff can be done with the constructors and the assignment operators. So, strictly speaking there's no need for it. You might get a bit better performance with reset() though. (Is performance actually the reason for shared_ptr and unique_ptr to provide a reset() function?) At first I wanted to keep the class interface simple. I might add it in the future. Are there compelling reasons to add it now?
I suspect that if the operation can be implemented with better performances than using constructor and assignment it is a compelling reason. Best, Vicente

On 02/11/2013 09:48 PM, Vicente J. Botet Escriba wrote:
On 11/02/13 19:15, Ralph Tandetzky wrote:
On 02/10/2013 10:27 AM, Vicente J. Botet Escriba wrote:
On 10/02/13 03:04, Vicente J. Botet Escriba wrote:
On 08/02/13 16:16, Ralph Tandetzky wrote:
Hi,
is there any interest in a copy-on-write pointer implementation? I wrote a cow_ptr<T> template class for C++11 which has the following use cases:
...
3. You can add cloning to a class hierarchy from the outside. With cow_ptr<Base> a( new Derived1 ); cow_ptr<Base> b( new Derived2 ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // makes a deep copy of a as a Derived1 class. // There is no slicing involved. you copy Base objects polymorphically. The class Base can even be abstract here. It is only required that Derived1 and Derived2 be CopyConstructible.
It seems to me that the copy of polymorphic object works only with the help of the user and that the following should be mentioned as a limitation
Base* ba = new Derived1; cow_ptr<Base> a( ba ); cow_ptr<Base> c; c = a; // performs a shallow copy. c->doSomething(); // couldn't make a deep copy of a as a Derived1 class as the type has been lost. // There is slicing involved.
In the DefaultCloner there is an assertion assert( typeid( *p ) == typeid( const T& ) ); which would be violated in the line you marked. So you'll notice that during debugging when the copy is performed. There's no way to check that at compile-time unfortunately.
Ok. So typeid( *p ) == typeid( const T& ) is a precondition of the constructor from a Y*. I find this quite unfortunate. Which is the advantage this provide to the user? Better performances? if yes where?
The advantage relative to what? I assume in comparison to make_cow. Well, there are several advantages: 1. You can construct from a pointer that points to something already allocated. You do not need to allocate and copy. 2. You have a custom copier/cloner and custom deleter. The assertion performs automatic checking in debug mode. In release mode everything can run at full speed.
If this is confirmed, I see it as a show-stopper, and the cloning strategy must be redefined.
I don't see any relational operators. Is this intentional?
To a degree. It's not totally obvious, if only the pointers will be compared, or if pointers are equal, if the pointed-to objects will be compared for equality as well. I'd prefer the first variant and I will add them in the next revision.
Could your class store nullptr? Does your class have value semantics or pointer semantics? IMHO, it should have value semantics and the comparison should be made on the stored value.
That's a good question. Right now I tend to prefer genuine value semantics, because right now I don't see a point in making it nullable. I never had this use case. Then cow_ptr would not be appropriate as a name anymore. In order to make the possibly polymorphic behaviour of the pointees more visible in the name of the class, clone_on_write<T> might be better than plain copy_on_write<T>.
What about a reset(U*) member function ?
This stuff can be done with the constructors and the assignment operators. So, strictly speaking there's no need for it. You might get a bit better performance with reset() though. (Is performance actually the reason for shared_ptr and unique_ptr to provide a reset() function?) At first I wanted to keep the class interface simple. I might add it in the future. Are there compelling reasons to add it now?
I suspect that if the operation can be implemented with better performances than using constructor and assignment it is a compelling reason.
Ok, well then, I will implement it soon.
Best, Vicente
Thanks, Ralph

SUMMARY
=======

I would like to summarize the discussion about copy-on-write so far:

* I wrote a copy-on-write-pointer implementation <https://github.com/ralphtandetzky/cow_ptr.git> with the following use cases:
  1. A const-correct pimpl pointer.
  2. Helper class for implementing copy-on-write for higher level structures, where copying is expensive. (E.g. matrix or image classes)
  3. cow_ptr<Base> wraps polymorphic classes giving them genuine value semantics, so they can be put into standard containers, even if Base is abstract.
  4. It can be used to add cloning to a class hierarchy non-intrusively.

* Thread-safety was discussed. (brought up by Alexey)
  - The reference counter is atomic.
  - All constant operations on cow_ptr and its pointee are thread-safe as long as const operations on the pointee are thread-safe.
  - If for the pointee constant operations are thread-safe and if it is safe to write to a pointee from one thread as long as no one else is reading or writing, then the same is true for all individual cow_ptrs pointing to that object and all access through these cow_ptrs.

* Is cow still necessary? (brought up by Mathias)
  - Since in C++11 you can move objects cheaply instead of copying them, an important use case of copy-on-write is gone. Before move semantics, returning objects by value was sometimes a bad performance issue. Copy-on-write solved that.
  - Cow is still useful today for matrix classes or image classes or even trees that share state under the hood, but should not influence each other when writing.
  - Example: If you want to implement a property tree, you can use the approach

        class PropertyTree : public AbstractProperty
        {
        public:
            /* implementation of public interface. */
        private:
            std::list<cow_ptr<AbstractProperty>> properties;
        };

  - Having this you can keep a history of a big property tree in memory easily.

        std::vector<PropertyTree> history;
        auto current = history.back();
        current.modify();
        history.push_back( current );

* Is COW unsafe? (brought up by Mathias)
  - COW is sometimes considered unsafe. That's why the C++ standard rules out COW implementations of std::string.
  - The code

        std::string a("Hello world!");
        char * p = &a[11];
        std::string b( a );
        *p = '.'; // modifies a and b, if std::string was implemented using COW.

    does not work correctly for COW implementations of std::string.
  - The reason this does not work is the escaped pointer. When escaping pointers are strictly avoided, this effect cannot happen. Therefore cow_ptr does not provide a non-const version of the get() member function, but a member function modify() (formerly known as apply()) which can be used in the following way:

        cow_ptr<MyType> p( new MyType );
        auto q = p;
        p.modify( [&]( MyType * p ){ p->doSomething(); p->doSomethingElse(); } );
        COW_MODIFY(p) { p->doSomething(); p->doSomethingElse(); }; // equivalent to the line above

  - It is still possible for a pointer to escape, but the interface is such that it is easy to use it correctly and hard to use incorrectly.
  - The interface design of std::string makes it impossible to implement it correctly with COW. Hence COW must be considered during the interface design phase of a class.

* Alternatives to COW (brought up by Mathias)
  - C++11 move and cloning.
    - Most often unnecessary copies can be avoided using C++11 move semantics and cloning where necessary.
  - Flyweight factory.
    - Objects are accessed by a hash value. There's always only one copy of identical objects. For complex objects that are modified often, recalculating the hash and synchronizing the hash table thread-safely can be a bad performance bottleneck.
  - shared_ptr<T const>
    - Even with shared_ptr<T const> you never know if there's a shared_ptr<T> object (non-const) through which the pointee is modified. shared_ptrs are really shared. It is likely more error prone to use shared_ptr to implement COW. If T is a polymorphic class but does not have a clone() member function, then cloning will not work properly because of slicing. shared_ptr is useful for many things, but it's probably not the best tool to implement COW.

* The name (brought up by Peter)
  - cow is an acronym and lower case. It's a farm animal ... enticing me to write member function names like "moo". The name does not reflect the ability to contain polymorphic value pointers. (Peter)
  - clone_on_write<T> would be a suggestion of mine. It might be useful to drop the _ptr suffix completely, since the class has value semantics.
  - Others have suggested splitting cow_ptr<T> into read_ptr<T> and write_ptr<T> classes.

* Slicing problems (brought up by Vicente)
  - The constructor taking a Y * pointer might lead to slicing problems, if the pointee is not a Y object, but something derived.
  - The default_copier will make a runtime-check assert( typeid(*p) == typeid(Y) ).

* Comparison to adobe::copy_on_write<T> <http://cppnow.org/session/value-semantics-and-concepts-based-polymorphism/> (brought up by Andreas)
  - This class is constructed by moving a T object into itself. Copying is implemented as cheap copy of a pointer with reference counting.
  - Other than constructors, destructors and assignment operators there are only the public member functions read() and write(). read() returns a const reference to the contained object, write() makes an internal copy, if the reference count is greater than 1, and then returns a non-const reference to the contained object.
  - The class does not support cloning for polymorphic T, but always uses the copy-constructor of T in order to copy.
  - Hence the class interface is extremely simple.

* Comparison to value_ptr<T> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3339.pdf> in N3339 (open-std) (brought up by Vicente)
  - Basic properties: A value_ptr<T> mimics the value semantics of its pointee. Hence the pointee lifetime is the pointer lifetime, and the pointee is copied whenever the pointer is copied. Internally the pointee can be of a derived class of T. In this case the object is cloned properly.
  - Hence value_ptr<T> has the use-cases 1, 3 and 4 of cow_ptr<T>, but does not implement copy-on-write (use case 2).
  - value_ptr has the cloner and the deleter as template arguments of the class. The current implementation of cow_ptr only has the pointee type as template parameter. The cloner and deleter are stored dynamically.
  - value_ptr does not have a reference counter.
  - Other than that, value_ptr<T> and cow_ptr<T> are extremely similar in their public interface.
  - In conjunction with copy_on_write<T> this can be used to do the same stuff as cow_ptr<T> does. The way to use it would be copy_on_write<value_ptr<T>>.

* pointer-semantics or value-semantics and nullptr (brought up by Vicente)
  - Should the COW-class be nullable? If not, then it should probably not be called cow_ptr.
  - This question has not been discussed to the end yet. Personally, I don't think that null-cow_ptrs are very useful.

* Different member and non-member functions (brought up by Vicente)
  - relational operators (brought up by Vicente)
    - It is not clear whether operator==() on cow_ptrs should only compare pointers or also pointees. This would depend on whether the COW-class is considered a pointer or a genuine value.
  - release()
    - Should not exist, because the caller would not know which deleter to call. (similar to shared_ptr)
  - reset()
    - Will be implemented in order to provide the performance benefits.

* The write_ptr<T> and read_ptr<T> solution (brought up by Peter)
  - read_ptr<T> would be similar to shared_ptr<T const> and write_ptr<T> would be a unique_ptr<T> equivalent. read_ptr<T> has a member function which returns a write_ptr<T> through which the pointee can be modified.
Afterwards the write_ptr<T> can be moved back into the read_ptr<T>: read_ptr<T> pr; if ( write_ptr<T> pw = pr.write() ) { pw->modify(); pr = std::move( pw ); } - This possibly provides a better separation of concerns (safer, clearer, more flexible). - However, the above code is not exception-safe, if pr becomes a nullptr when the write() function is called. It makes exception-safe code harder to write. - In case the use_count is greater than 1: Should pr.write() make the copy? Or should pw.operator->() make the copy? This is not sufficiently discussed yet. Thank you for all your constructive feedback! Ralph
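
To make the modify() idea and the non-intrusive cloning from the summary concrete, here is a minimal sketch. It is not the code from the GitHub repository: the names cow_sketch, Shape, Triangle and demo_cow_sketch are hypothetical, and std::shared_ptr is used internally only to keep the example short. The use_count() check is fine for a single-threaded illustration but would need more care under concurrent use.

```cpp
#include <cassert>
#include <memory>

// Hypothetical sketch of a copy-on-write handle; NOT the real cow_ptr interface.
template <typename T>
class cow_sketch {
public:
    // Remembers the dynamic type Y at construction, so later deep copies use
    // Y's copy constructor (no slicing, no virtual clone() required).
    template <typename Y>
    explicit cow_sketch(Y* p)
        : ptr_(p),
          clone_([](const T& t) -> std::shared_ptr<T> {
              return std::make_shared<Y>(static_cast<const Y&>(t));
          }) {}

    // Read access never copies and only hands out pointers to const.
    const T* get() const { return ptr_.get(); }
    const T* operator->() const { return ptr_.get(); }

    // Write access: detach from other owners first, then run f on the pointee.
    // The raw pointer is only visible inside f, which discourages escaping.
    template <typename F>
    void modify(F f) {
        if (ptr_.use_count() > 1) {
            ptr_ = clone_(*ptr_);  // deep copy happens here, lazily
        }
        f(ptr_.get());
    }

private:
    std::shared_ptr<T> ptr_;
    std::shared_ptr<T> (*clone_)(const T&);  // type-erased cloner
};

// Hypothetical hierarchy used only for the demonstration.
struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;
    int tag = 0;
};
struct Triangle : Shape {
    int sides() const override { return 3; }
};

inline bool demo_cow_sketch() {
    cow_sketch<Shape> a(new Triangle);
    cow_sketch<Shape> b = a;                          // shallow copy
    const bool shared_before = (a.get() == b.get());
    b.modify([](Shape* s) { s->tag = 42; });          // clones as a Triangle
    return shared_before && a.get() != b.get() &&
           a->tag == 0 && b->tag == 42 && b->sides() == 3;
}
```

Note that Shape is abstract and has no clone() member; the copy is still a Triangle because the cloner was captured when the handle was constructed.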

On 13/02/13 16:46, Ralph Tandetzky wrote:
* pointer-semantics or value-semantics and nullptr (brought up by Vicente)
  - Should the COW-class be nullable? If not, then it should probably not be called cow_ptr.
  - This question has not been discussed to the end yet. Personally, I don't think that null-cow_ptrs are very useful.
* Different member and non-member functions (brought up by Vicente)
  - relational operators (brought up by Vicente)
    - It is not clear whether operator==() on cow_ptrs should only compare pointers or also pointees. This would depend on whether the COW-class is considered a pointer or a genuine value.
If you choose for your class to have value semantics, the relational operators should compare pointees.

Vicente
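
As a rough illustration of relational operators that compare pointees, here is a small sketch. The class value_handle and the function demo_value_compare are hypothetical stand-ins for a COW class with value semantics, not part of cow_ptr. The pointer-identity shortcut assumes that T's operator== is reflexive.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical non-nullable handle with value semantics.
template <typename T>
class value_handle {
public:
    explicit value_handle(T v)
        : p_(std::make_shared<const T>(std::move(v))) {}

    const T& operator*() const { return *p_; }

    // True if both handles refer to the very same object.
    bool shares_with(const value_handle& other) const { return p_ == other.p_; }

private:
    std::shared_ptr<const T> p_;
};

// Equality compares the pointees. Identical pointers are a cheap shortcut,
// valid under the assumption that T's operator== is reflexive.
template <typename T>
bool operator==(const value_handle<T>& a, const value_handle<T>& b) {
    return a.shares_with(b) || *a == *b;
}

template <typename T>
bool operator!=(const value_handle<T>& a, const value_handle<T>& b) {
    return !(a == b);
}

// Ordering is likewise defined by the pointees, not by addresses.
template <typename T>
bool operator<(const value_handle<T>& a, const value_handle<T>& b) {
    return *a < *b;
}

inline bool demo_value_compare() {
    value_handle<std::string> a("cow"), b("cow"), c = a;
    return a == b && !a.shares_with(b)   // equal values, distinct objects
        && a == c && a.shares_with(c)    // equal because shared
        && !(a != b) && !(a < b);
}
```

With these semantics two handles compare equal whenever their contents do, regardless of whether they happen to share storage.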

On 02/13/2013 06:43 PM, Vicente J. Botet Escriba wrote:
On 13/02/13 16:46, Ralph Tandetzky wrote:
* pointer-semantics or value-semantics and nullptr (brought up by Vicente)
  - Should the COW-class be nullable? If not, then it should probably not be called cow_ptr.
  - This question has not been discussed to the end yet. Personally, I don't think that null-cow_ptrs are very useful.
* Different member and non-member functions (brought up by Vicente)
  - relational operators (brought up by Vicente)
    - It is not clear whether operator==() on cow_ptrs should only compare pointers or also pointees. This would depend on whether the COW-class is considered a pointer or a genuine value.
If you choose for your class to have value semantics, the relational operators should compare pointees.

Vicente

I totally agree. And in my opinion, making the class have non-nullable value semantics would be the best alternative. I would then call it clone_on_write<T>. Any comments, opinions or suggestions, everybody? That would also bury the idea of having a read_ptr<T> and a write_ptr<T>, where I find the interface unnecessarily complicated without much gain (IMHO).
Ralph

It seems to me that this class has some overloaded duties:

1) Allowing value semantics on polymorphic objects
2) Implementing copy-on-write
3) (not mentioned explicitly yet) Allowing objects that would normally be expensive to move to become cheap to move

It seems like a better solution would be to separate these into two classes, one implemented in terms of the other:

First, something more like the proposed value_ptr. It would allow 1 and 3 only. I personally see many use cases for this. 1 is obvious, but 3 allows you to, for example, create a std::vector of value_ptr that handles insertion into the middle with dramatically better overall performance than a plain std::vector or std::list in many use cases. value_ptr could easily be implemented in terms of std::unique_ptr, with the addition of a copy constructor and copy-assignment operator that perform a "deep copy". The main questions for this class would be whether it allows nullptr values, whether it includes the release member function (in line with goal 3, but out of line with goal 1), and how the cloning is implemented.

Second, your cow_ptr (hopefully with a different name). This could probably be implemented with no loss of efficiency as a wrapper around a value_ptr, dramatically simplifying the implementation.

I have no interest in the overhead of copy-on-write, but was planning on writing a value_ptr class. value_ptr seems to be an important enough building block to justify inclusion on its own, and copy-on-write is a fundamentally different problem than value semantics. As such, the two should not be coupled to each other.

For the cloning, have you considered a template function rather than requiring a virtual clone member function? The template function approach was used for the boost::ptr_container library. I disagree with their design decision of providing a default implementation for everything, though, as that seems like it could lead to errors. You could, for instance, create a template function (like the ptr_container library uses), such as new_clone, and use SFINAE on !std::is_class<T> to provide a default implementation of a plain copy (which is always safe on non-class types, as they cannot be the base of anything), and require users to define their own overload for their class. Have you considered this design decision? It seems that this would work more easily for non-class types, such as std::array.
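
A rough sketch of what this could look like, under my own assumptions: value_ptr_sketch is a hypothetical name (it is not the value_ptr from N3339), it is nullable, and it finds new_clone by argument-dependent lookup. The default new_clone is enabled only for non-class types. The Animal and Bird types and their new_clone overload are made up for the demonstration; how a user overload clones its hierarchy is entirely up to the user.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Default clone: a plain copy, enabled only for non-class types, which can
// never be the base of anything and therefore cannot be sliced.
template <typename T>
typename std::enable_if<!std::is_class<T>::value, T*>::type
new_clone(const T& x) {
    return new T(x);
}

// Hypothetical deep-copying pointer built on std::unique_ptr.
template <typename T>
class value_ptr_sketch {
public:
    explicit value_ptr_sketch(T* p = nullptr) : p_(p) {}

    // Deep copy through new_clone; user overloads are found via ADL.
    value_ptr_sketch(const value_ptr_sketch& other)
        : p_(other.p_ ? new_clone(*other.p_) : nullptr) {}

    // Moving only transfers the pointer, so it is cheap for any T.
    value_ptr_sketch(value_ptr_sketch&&) noexcept = default;

    // Unified copy/move assignment: the argument is copied or moved in.
    value_ptr_sketch& operator=(value_ptr_sketch other) noexcept {
        p_ = std::move(other.p_);
        return *this;
    }

    T& operator*() const { return *p_; }
    T* operator->() const { return p_.get(); }
    T* get() const { return p_.get(); }

private:
    std::unique_ptr<T> p_;
};

// Hypothetical user hierarchy: class types get no default new_clone,
// so the user has to provide an overload explicitly.
struct Animal {
    virtual ~Animal() {}
    virtual Animal* clone() const = 0;
    virtual int legs() const = 0;
};
struct Bird : Animal {
    Bird* clone() const override { return new Bird(*this); }
    int legs() const override { return 2; }
};
inline Animal* new_clone(const Animal& a) { return a.clone(); }

inline bool demo_value_ptr_sketch() {
    value_ptr_sketch<int> i(new int(7));
    value_ptr_sketch<int> j = i;      // deep copy via the default new_clone
    *j = 8;                           // does not affect i
    value_ptr_sketch<Animal> a(new Bird);
    value_ptr_sketch<Animal> b = a;   // deep copy via the user overload
    return *i == 7 && *j == 8 && a.get() != b.get() && b->legs() == 2;
}
```

Copying a value_ptr_sketch<T> for a class type T without a matching new_clone overload fails to compile, which is the "no silent default" behaviour argued for above.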
participants (10)
- Alexander Lamaison
- Alexey Tkachenko
- Daniel James
- David Stone
- Jeffrey Yasskin
- Mathias Gaunard
- Matus Chochlik
- Peter Dimov
- Ralph Tandetzky
- Vicente J. Botet Escriba