Preliminary release of clone_ptr library

(This will be the first and last "progress update" message until it comes time to submit the library to Boost for review.)
For better or for worse (it seems there's much debate on clone_ptr), I have started on the clone_ptr library and have a working preliminary clone_ptr class. Details can be found at the following URL, and I will update the website as my work progresses: http://www.peltkore.net/~alipha/clone_ptr.html
If anyone is interested in receiving email updates or would like to collaborate on the project (which could simply mean being someone I can bounce ideas off of), email me.
To contribute to the discussion: It has been noted that the Boost.pointer_container library could be used instead of having a container of clone_ptr. However, what if you want to use non-standard containers? Yes, Boost.pointer_container has ptr_*_adapter classes, but they can't be used for containers which don't follow the container requirements set by the C++ standard. Secondly, even if a third-party container class follows the container requirements, it is still unnecessary work to create your own pointer container. Consider this generic tree class:
http://www.aei.mpg.de/~peekas/tree/
http://www.aei.mpg.de/~peekas/tree/doxygen/html/functions.html
There are dozens of functions in the tree class that would have to be wrapped in a ptr_tree class, because the boost::ptr_sequence_adapter class doesn't wrap them for you. Compare this to:
    tree<clone_ptr<T> > foo;
The clone_ptr solution seems a lot easier. The same argument could be made with boost::graph, etc.
Also, I've decided to give clone_ptr clone-on-write semantics instead of clone-on-copy. This should alleviate any concerns about performance when inserting into a std::vector or copying containers (copying clone_ptr would give similar performance to boost::shared_ptr).
Kevin Spinar

Kevin Spinar wrote:
Also, I've decided to give clone_ptr clone-on-write semantics instead of clone-on-copy. This should alleviate any concerns about performance when inserting into a std::vector or copying containers (copying clone_ptr would give similar performance to boost::shared_ptr).
I see clone_ptr<base> just like a base object but with the ability to behave polymorphically. Hence to me it would seem normal that when copying actual cloning is done, and not refcounting which makes it more like shared_ptr which is designed to handle multiple ownership. In that case, std::vector<clone_ptr<base> > would be comparable in performance with std::vector<base>. When C++ supports proper move semantics, the containers will probably avoid the extensive copying that is being done nowadays. By the way, I think the boost smart pointers should be refactored to a policy-based design, which would allow people to choose more custom behaviours.
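To make the clone-on-copy view concrete, here is a minimal sketch in modern C++ (the thread predates C++11). It is not the clone_ptr implementation from the linked page: the names cloning_ptr, shape and triangle are invented for illustration, and it assumes the base class exposes a virtual clone(). Every copy of the handle deep-copies the pointee, so a std::vector of handles behaves like a vector of values.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical hierarchy with a virtual clone(); not part of the library.
struct shape {
    virtual ~shape() {}
    virtual shape* clone() const = 0;
    virtual int sides() const = 0;
};

struct triangle : shape {
    triangle* clone() const override { return new triangle(*this); }
    int sides() const override { return 3; }
};

// Illustrative clone-on-copy handle: copying the handle clones the pointee.
template <class T>
class cloning_ptr {
public:
    explicit cloning_ptr(T* p = nullptr) : p_(p) {}
    cloning_ptr(const cloning_ptr& other)
        : p_(other.p_ ? other.p_->clone() : nullptr) {}
    cloning_ptr& operator=(cloning_ptr other) {  // copy-and-swap
        p_.swap(other.p_);
        return *this;
    }
    T& operator*() const { return *p_; }
    T* operator->() const { return p_.get(); }
    T* get() const { return p_.get(); }
private:
    std::unique_ptr<T> p_;
};

// Copying the vector clones each element, so no object is shared.
inline bool copies_are_independent() {
    std::vector<cloning_ptr<shape> > v;
    v.push_back(cloning_ptr<shape>(new triangle));
    std::vector<cloning_ptr<shape> > w(v);
    return w[0].get() != v[0].get() && w[0]->sides() == 3;
}
```

The cost is one allocation per copy, which is the performance concern the clone-on-write design was meant to address.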

Glad to see it's coming along!
Another couple of suggestions -- it would be nice if the clone_ptr took a clone allocator as a policy ( http://boost.org/libs/ptr_container/doc/reference.html#the-clone-allocator-c...), and as well, I would personally like to see the clone-on-copy vs clone-on-write implementation able to be toggled through a template argument if it's not too much trouble, since clone-on-copy still has the advantages of smaller size, more deterministic performance, and perhaps most importantly, more predictable exceptions.
An explicit "separate" member function would also be nice, which would explicitly clone the object if it currently has shared ownership. The main purpose of this would be to get around the fact that non-const -> and * may throw an exception if the object isn't yet cloned -- this way, you may clone the object explicitly prior to doing other unrelated operations which otherwise would have a no-throw guarantee, potentially saving the programmer from having to explicitly try and catch or having to unintuitively call non-const get prior to performing other operations in order to implicitly remove the possibility of exceptions being thrown later on in code.
Finally, I like the fact that you are propagating the constness to the stored object; however, since that is the path you are taking, I don't think that clone_ptr is a proper name. Traditionally, a pointer merely references the object and its constness doesn't affect the target, and as well, you don't normally expect copying of the target when copying something that merely references the target. In fact, with the way it is now, I'd argue that the * and -> operators are only used because there is no way to directly forward the interface of the object, not because you are implementing a pointer concept (i.e. they're only used because you can't overload the . operator), so I wouldn't really call it a smart pointer at all. As you said in the documentation, it's really more like an object wrapper than a smart pointer. Because of this, perhaps "smart_object" or "dynamic_object" or something along those lines would be a more accurate name than "clone_ptr." Really, the pointer aspect is more of an implementation detail than it is a logical part of the concept.
--
-Matt Calabrese
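The "separate" suggestion and the const propagation described above could look roughly like the following. This is a sketch under assumptions, not the library's code: cow_ptr, widget and shares_with are made-up names, reference counting is delegated to std::shared_ptr, the payload is assumed to provide clone(), and thread safety is ignored.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type with a clone() member.
struct widget {
    int value;
    explicit widget(int v) : value(v) {}
    widget* clone() const { return new widget(*this); }
};

// Illustrative clone-on-write handle.
template <class T>
class cow_ptr {
public:
    explicit cow_ptr(T* p) : p_(p) {}

    // Const access never clones and propagates constness to the target.
    const T& operator*() const { return *p_; }
    const T* operator->() const { return p_.get(); }

    // Non-const access may clone first, and therefore may throw.
    T& operator*() { separate(); return *p_; }
    T* operator->() { separate(); return p_.get(); }

    // Clone now if the object is shared. Until another copy of this handle
    // is taken, later non-const access will not need to allocate.
    void separate() {
        if (p_ && p_.use_count() > 1) p_.reset(p_->clone());
    }

    bool shares_with(const cow_ptr& other) const { return p_ == other.p_; }
private:
    std::shared_ptr<T> p_;
};

inline bool demo_separate() {
    cow_ptr<widget> a(new widget(1));
    cow_ptr<widget> b(a);                  // cheap: both refer to one widget
    bool shared_before = a.shares_with(b);
    b.separate();                          // the allocation happens here
    b->value = 2;                          // already unique, no clone
    const cow_ptr<widget>& ca = a;
    return shared_before && !a.shares_with(b) && ca->value == 1 && b->value == 2;
}
```

Calling separate() up front moves the only potentially throwing step to a place the caller chooses.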

On 7/8/06, Matt Calabrese <rivorus@gmail.com> wrote:
Glad to see it's coming along!
Another couple of suggestions -- it would be nice if the clone_ptr took a clone allocator as a policy ( http://boost.org/libs/ptr_container/doc/reference.html#the-clone-allocator-c...),
clone_ptr accepts a clone allocator as a parameter to the constructor:
    template<class Y, class Alloc> clone_ptr(Y * p, Alloc);
My apologies that it was not more clear. However, it seems I am bending the requirements of a clone allocator by performing default and copy construction.
and as well, I would personally like to see the clone-on-copy vs clone-on-write implementation able to be toggled through a template argument if it's not too much trouble, since clone-on-copy still has the advantages of smaller size, more deterministic performance, and perhaps most importantly, more predictable exceptions.
The problem is, I don't see that it is feasible to write a clone-on-copy implementation that is compatible with associative containers. To be usable with associative containers, a clone_ptr must exhibit the following behavior:
    clone_ptr<base> p(new derived);
    clone_ptr<base> p2(p);
    // assertion: p and p2 are equal (p < p2 == false && p2 < p == false)
    p2->modify_object();
    // assertion: p and p2 are no longer equal
If the first assertion were not true, then copy construction of associative containers would fail: the clone_ptrs in the container would not maintain their relative sort order after being copied. If the second assertion were not true, then you would be unable to write code resembling the following:
    clone_ptr<base> p = *myset.begin();
    p->modify_object();
    myset.insert(p); // insert fails--object already exists in the set
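The two assertions above can be reproduced with a small clone-on-write handle whose operator< compares the internal pointer. This is only an illustration of the behavior being described: cow_handle, item, read and write are hypothetical names, and the clone step is simplified to a copy construction.

```cpp
#include <cassert>
#include <functional>
#include <memory>

struct item {                    // hypothetical payload
    int n;
    explicit item(int v) : n(v) {}
};

// Illustrative clone-on-write handle ordered by the address of the pointee.
template <class T>
class cow_handle {
public:
    explicit cow_handle(T* p) : p_(p) {}
    const T& read() const { return *p_; }
    T& write() {
        // Simplification: copy-construct instead of calling a clone hook.
        if (p_.use_count() > 1) p_ = std::make_shared<T>(*p_);
        return *p_;
    }
    friend bool operator<(const cow_handle& a, const cow_handle& b) {
        return std::less<const T*>()(a.p_.get(), b.p_.get());
    }
private:
    std::shared_ptr<T> p_;
};

inline bool demo_identity_ordering() {
    cow_handle<item> p(new item(1));
    cow_handle<item> p2(p);
    // First assertion: a fresh copy is equivalent to its source.
    bool equivalent_after_copy = !(p < p2) && !(p2 < p);
    p2.write().n = 2;
    // Second assertion: after a write the two handles are distinct.
    bool distinct_after_write = (p < p2) || (p2 < p);
    return equivalent_after_copy && distinct_after_write;
}
```

With clone-on-copy and the same address-based operator<, the first assertion would fail, because the copy would already live at a different address.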
An explicit "separate" member function would also be nice, which would explicitly clone the object if it currently has shared ownership. The main purpose of this would be to get around the fact that non-const -> and * may throw an exception if the object isn't yet cloned -- this way, you may clone the object explicitly prior to doing other unrelated operations which otherwise would have a no-throw guarantee, potentially saving the programmer from having to explicitly try and catch or have to unintuitively call non-const get prior to performing other operations in order to implicitly remove the possibility of exceptions being thrown later on in code.
Thank you, I'll plan to add it.
Finally, I like the fact that you are propagating the constness to the stored object; however, since that is the path you are taking, I don't think that clone_ptr is a proper name. Traditionally, a pointer merely references the object and its constness doesn't affect the target, and as well, you don't normally expect copying of the target when copying something that merely references the target. In fact, with the way it is now, I'd argue that the * and -> operators are only used because there is no way to directly forward the interface of the object, not because you are implementing a pointer concept (i.e. they're only used because you can't overload the . operator), so I wouldn't really call it a smart pointer at all. As you said in the documentation, it's really more like an object wrapper than a smart pointer. Because of this, perhaps "smart_object" or "dynamic_object" or something along those lines would be a more accurate name than "clone_ptr." Really, the pointer aspect is more of an implementation detail than it is a logical part of the concept.
Indeed. Perhaps poly_object? Kevin Spinar

On 7/9/06, Kevin Spinar <spinarkm@gmail.com> wrote:
The problem is, I don't see that it is feasible to write a clone-on-copy implementation that is compatible with associative containers. To be usable with associative containers, a clone_ptr must exhibit the following behavior:
This problem doesn't actually have to do with clone-on-write vs clone-on-copy; the problem is that the less-than operator is improperly defined for the concept that is being modeled. As we established, the types being created aren't really smart pointers but are actually smart objects. We are really only accessing the objects indirectly via * and -> because we have no way of more directly forwarding the interface in C++. Following this path, you begin to realize that the internal address of the object is an implementation detail and should be allowed to change without changing the logical state of the object. Just like with other types, copying an object copies the state of the object, only now the physical clone operation is performed lazily. If defined, comparing objects before and after the physical clone operation should not have different results, since the state is still logically the same.
What this all means is that < should not compare the internal pointer value, or rather, it shouldn't if you expect it to act correctly for the use that we are describing (as a smart object). The same logic holds true for other operations such as ==, !=, etc. Initially I was imagining that the operations wouldn't be defined at all, and if you wanted such functions you would have to do something like *left < *right, and if you wished to use them in standard containers and algorithms you would have to specify the sorting predicate rather than using the default less (you simply use an adapted predicate which performs the operation through a level of indirection).
In retrospect, perhaps I was a bit premature about stating that the operations should not be defined at all in my response to Thorsten, since it does make it more difficult to use with sorted containers and algorithms with default sorting predicates. Since we now consider * and -> as a necessary evil and the closest one can get to overloading the . operator, perhaps allowing some operations to be automatically forwarded isn't a bad thing. Taking that route, it would make sense to overload comparison operations in such a way that just forwards the call to the dynamically allocated object. This would allow you to work with the objects intuitively with standard containers and algorithms without explicitly having to specify sorting predicates and should go along with the concept of the instantiation as being a smart object rather than a smart pointer.
There is one more issue that I can see. Now that we are moving more towards the concept of a smart object, how is a null pointer handled, or rather, should it be allowable at all? Variants do not automatically allow a null state, so should a smart object follow suit? If you take the approach that the type cannot be in a null state, you may guarantee the user that they always have a valid object, which could be very beneficial for all of the same reasons that it is beneficial to have variants non-nullable. In particular, this allows the user to safely avoid all null pointer checks when they have a constructed smart object, which can potentially greatly reduce errors.
The problem with this is, in addition to the fact that some users may wish to have a nullable state, how do you default-construct the object? For instance, in the case of an abstract class, you wouldn't be able to default construct the type unless perhaps a default child type were specified via a template argument to the smart object type template itself (which would default to the first type if left unspecified). Ideally, I think it would be great to allow for something like this, though it does increase the complexity of the template. Since only a few of us are showing serious interest, it's hard to say how common or important such behavior could be. I wouldn't mind if such functionality were excluded, but it is definitely something to think about. If the desire is there, perhaps both a "smart object" and a "nullable smart object" concept could be maintained (with the former being internally implemented with the latter), but really that can all be thought about later, after everything else gets ironed out.
On 7/9/06, Kevin Spinar <spinarkm@gmail.com> wrote:
Indeed. Perhaps poly_object?
Sounds good to me. It could possibly even be shortened to poly_obj if it seems too long, though I don't know if "obj" is as immediately recognizable as meaning "object" as "ptr" is to "pointer."
--
-Matt Calabrese
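The forwarded-comparison idea from this message can be sketched as follows. The assumptions are loud ones: value_handle is an invented name, the stored type is a plain value type that already defines < and ==, and sharing is done with std::shared_ptr. The point is only that ordering depends on the stored value, never on where it lives.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Illustrative handle whose comparisons forward to the stored object.
template <class T>
class value_handle {
public:
    explicit value_handle(const T& v) : p_(std::make_shared<T>(v)) {}
    const T& operator*() const { return *p_; }

    // Forwarded comparisons: the internal address plays no part.
    friend bool operator<(const value_handle& a, const value_handle& b) {
        return *a.p_ < *b.p_;
    }
    friend bool operator==(const value_handle& a, const value_handle& b) {
        return *a.p_ == *b.p_;
    }
private:
    std::shared_ptr<const T> p_;
};

inline bool demo_forwarded_less() {
    std::set<value_handle<std::string> > s;   // default std::less works
    s.insert(value_handle<std::string>(std::string("banana")));
    s.insert(value_handle<std::string>(std::string("apple")));
    // Separate allocation, same value: treated as a duplicate by the set.
    s.insert(value_handle<std::string>(std::string("apple")));
    return s.size() == 2 && **s.begin() == "apple";
}
```

Because the result does not change when the pointee is physically cloned, the same operator< is valid for both clone-on-copy and clone-on-write.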

On 7/9/06, Matt Calabrese <rivorus@gmail.com> wrote:
What this all means is that < should not compare the internal pointer value, or rather, it shouldn't if you expect it to act correctly for the use that we are describing (as a smart object). The same logic holds true for other operations such as ==, !=, etc. Initially I was imagining that the operations wouldn't be defined at all, and if you wanted such functions you would have to do something like *left < *right, and if you wished to use them in standard containers and algorithms you would have to specify the sorting predicate rather than using the default less (you simply use an adapted predicate which performs the operation through a level of indirection).
First, let me make an assumption. Most people who would use clone_ptr<T> would have T be an abstract base class with no member variables. Now, how would someone write a sorting predicate with that? Double dispatch is a solution, but probably not a desirable solution. Or is my assumption incorrect? I agree with your logic, but I don't know if it's very practical to implement clone_ptr this way.
There is one more issue that I can see. Now that we are moving more towards the concept of a smart object, how is a null pointer handled, or rather, should it be allowable at all? Variants do not automatically allow a null state, so should a smart object follow suit? ... The problem with this is, in addition to the fact that some users may wish to have a nullable state, how do you default-construct the object? For instance, in the case of an abstract class, you wouldn't be able to default construct the type unless perhaps a default child type were specified via a template argument to the smart object type template itself (which would default to the first type if left unspecified). ... If the desire is there, perhaps both a "smart object" and a "nullable smart object" concept could be maintained (with the former being internally implemented with the latter),
Another alternative would be to throw an exception upon using * or -> on a default-constructed clone_ptr, though that would add yet more overhead to those operators.
On 7/9/06, Kevin Spinar <spinarkm@gmail.com> wrote:
Indeed. Perhaps poly_object?
Sounds good to me. It could possibly even be shortened to poly_obj if it seems too long, though I don't know if "obj" is as immediately recognizable as meaning "object" as "ptr" is to "pointer."
I feel poly_obj would be rather recognizable. Also, the clone_ptr library has been updated ( http://www.peltkore.net/~alipha/clone_ptr.html ): the boost::*_pointer_cast functions have been implemented. And loufoque, I didn't ignore your message; it just seemed that in answering Matt's message, I also responded to your comments. Kevin Spinar

On 7/9/06, Kevin Spinar <spinarkm@gmail.com> wrote:
First, let me make an assumption. Most people who would use clone_ptr<T> would have T be an abstract base class with no member variables. Now, how would someone write a sorting predicate with that? Double dispatch is a solution, but probably not a desirable solution. Or is my assumption incorrect?
Right, for most uses it would be difficult to implement. Still, as tough as it is, comparing the pointer value simply isn't correct, so I wouldn't even consider that an option at all. If you agree, then the only options I can think of are either some form of forwarded operation, or no definition at all. It's true that neither one seems great, so maybe there is a better solution out there, but I personally don't see it. The only reason I am leaning towards some form of forwarded solution is that in the case that comparison operators actually are implemented, it allows the user to work with sorted containers and algorithms without having to explicitly specify a comparison predicate. To be honest, I don't mind either way as I personally don't have much of a need for the comparison functionality. That said, I do remember that Korcan Hussein was working on a multimethods library just a few weeks ago -- if that's still going on, I'd imagine that it could make the implementation of comparison operators a little less painful than manual double-dispatch.
Another alternative would be to throw an exception upon using * or -> on a default-constructed clone_ptr, though that would add yet more overhead to those operators.
True, and I personally do not enjoy working with operations that can potentially throw or dereference null pointers, especially with such seemingly simple and common operations. Writing code is much easier when errors and exceptions are avoided entirely by design if possible. -- -Matt Calabrese
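On the question of how a comparison could be written for an abstract base at all: the sketch below uses neither full double dispatch nor a multimethods library. It swaps in a simpler technique, ordering first by dynamic type with std::type_info::before and then by a virtual same-type hook. All names (figure, circle, square, less_same_type) are hypothetical, and the cross-type order is implementation-defined and only stable within one program run.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical abstract base with a same-type comparison hook.
struct figure {
    virtual ~figure() {}
    // Called only when *this and rhs have the same dynamic type.
    virtual bool less_same_type(const figure& rhs) const = 0;
};

// Order by dynamic type first, then by value within one type.
inline bool operator<(const figure& a, const figure& b) {
    if (typeid(a) != typeid(b)) return typeid(a).before(typeid(b));
    return a.less_same_type(b);
}

struct circle : figure {
    int radius;
    explicit circle(int r) : radius(r) {}
    bool less_same_type(const figure& rhs) const override {
        return radius < static_cast<const circle&>(rhs).radius;
    }
};

struct square : figure {
    int side;
    explicit square(int s) : side(s) {}
    bool less_same_type(const figure& rhs) const override {
        return side < static_cast<const square&>(rhs).side;
    }
};

inline bool demo_polymorphic_less() {
    circle c1(1), c2(2);
    square s1(1);
    bool same_type_ok = (c1 < c2) && !(c2 < c1);
    // Objects of different types are ordered one way or the other, never both.
    bool cross_type_ok = (c1 < s1) != (s1 < c1);
    return same_type_ok && cross_type_ok;
}
```

A forwarded operator< on the handle could then simply call this comparison on the two pointees.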

Kevin Spinar wrote:
Also, the clone_ptr library has been updated ( http://www.peltkore.net/~alipha/clone_ptr.html ): the boost::*_pointer_cast functions have been implemented.
Regarding:
    template<class Y, class Alloc> clone_ptr(Y * p, Alloc); // Y must be complete
The template parameter name 'Alloc' implies a standard allocator. This is pretty confusing; you should rename it to CloneAlloc if you really want to use the 'CloneAllocator concept'. The alternative is to just take a standard allocator. You might also want to not ignore the actual argument and make an effort to support stateful allocators (although this is somewhat more complicated, it is also considerably more useful.)

Kevin Spinar wrote:
I feel poly_obj would be rather recognizable.
clone_ptr (or copy_ptr, or copy_on_write_ptr) is a better name from a "marketing standpoint" than poly_obj, because it's more recognizable.
Also, the clone_ptr library has been updated ( http://www.peltkore.net/~alipha/clone_ptr.html ): the boost::*_pointer_cast functions have been implemented.
The copy on write semantics are problematic from a usability standpoint. Consider this code:
    // thread 1
    int x1 = p->x;
    // thread 2
    int x2 = p->x;
Is this legal (assuming that both threads use the same pointer p)? You can't tell without looking at p. If it's const, or points at const, the code is legal. The problem is that there are no visible mutating operations in the code; it looks innocent enough as all we do is read p->x, right? One option is to always return a const from -> and * and provide a separate mutable accessor.
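The option in the last sentence could be sketched like this. The names cow_ptr_ro, point and mutate are invented for the example, and the clone is simplified to a copy construction. Reads through -> and * are always const and never clone; the only operation that can clone has to be spelled out by the caller.

```cpp
#include <cassert>
#include <memory>

struct point {                   // hypothetical payload
    int x;
    explicit point(int v) : x(v) {}
};

// Illustrative clone-on-write handle with read-only -> and *.
template <class T>
class cow_ptr_ro {
public:
    explicit cow_ptr_ro(T* p) : p_(p) {}

    // Reads: const result, no cloning, even on a non-const handle.
    const T& operator*() const { return *p_; }
    const T* operator->() const { return p_.get(); }

    // Writes: the single place where a clone can happen.
    // Simplification: copy-constructs T instead of calling a clone hook,
    // and makes no attempt at synchronization.
    T& mutate() {
        if (p_.use_count() > 1) p_ = std::make_shared<T>(*p_);
        return *p_;
    }
private:
    std::shared_ptr<T> p_;
};

inline bool demo_explicit_mutation() {
    cow_ptr_ro<point> p(new point(1));
    cow_ptr_ro<point> q(p);
    int before = p->x;           // plain read, nothing is modified
    q.mutate().x = 5;            // the clone happens here, visibly
    return before == 1 && p->x == 1 && q->x == 5;
}
```

With this interface the two p->x reads in the example only read, while any call to mutate() would still need external synchronization.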

On 7/09/06, Matt Calabrese <rivorus@gmail.com> wrote:
As we established, the types being created aren't really smart pointers but are actually smart objects.
If that is true, then is the following form really correct?
    clone_ptr<Base> p(new Derived);
Wouldn't it be more correct to write:
    Derived d;
    clone_ptr<Base> p(d);
On 7/10/06, Matt Calabrese <rivorus@gmail.com> wrote:
On 7/9/06, Kevin Spinar <spinarkm@gmail.com> wrote:
First, let me make an assumption. Most people who would use clone_ptr<T> would have T be an abstract base class with no member variables. Now, how would someone write a sorting predicate with that? Double dispatch is a solution, but probably not a desirable solution. Or is my assumption incorrect?
Right, for most uses it would be difficult to implement. Still, as tough as it is, comparing the pointer value simply isn't correct, so I wouldn't even consider that an option at all. If you agree, then the only options I can think of are either some form of forwarded operation, or no definition at all. It's true that neither one seems great, so maybe there is a better solution out there, but I personally don't see it. The only reason I am leaning towards some form of forwarded solution is that in the case that comparison operators actually are implemented, it allows the user to work with sorted containers and algorithms without having to explicitly specify a comparison predicate. To be honest, I don't mind either way as I personally don't have much of a need for the comparison functionality.
Ok, perhaps comparison forwarding would be the best solution.
Another alternative would be to throw an exception upon using * or -> on a default-constructed clone_ptr, though that would add yet more overhead to those operators.
True, and I personally do not enjoy working with operations that can potentially throw or dereference null pointers, especially with such seemingly simple and common operations. Writing code is much easier when errors and exceptions are avoided entirely by design if possible.
I suppose the best solution would be what you suggested earlier: "perhaps a default child type were specified via a template argument to the smart object type template itself (which would default to the first type if left unspecified)."
On 7/10/06, Peter Dimov <pdimov@mmltd.net> wrote:
clone_ptr (or copy_ptr, or copy_on_write_ptr) is a better name from a "marketing standpoint" than poly_obj, because it's more recognizable.
Opinions? Accurate name or recognizable name?
Regarding:
template<class Y, class Alloc> clone_ptr(Y * p, Alloc); // Y must be complete
The template parameter name 'Alloc' implies a standard allocator. This is pretty confusing; you should rename it to CloneAlloc if you really want to use the 'CloneAllocator concept'.
Indeed, and I'll admit there are other naming and documentation issues I should address before releasing this.
The alternative is to just take a standard allocator.
The reasoning behind having a clone allocator concept ( http://www.boost.org/libs/ptr_container/doc/reference.html#the-clone-allocat... ) is to allow for cloning of objects which don't have public copy constructors or to allow people to do something other than copy construct.
You might also want to not ignore the actual argument and make an effort to support stateful allocators (although this is somewhat more complicated, it is also considerably more useful.)
I agree. ptr_container's clone allocators don't seem to be meant for keeping stateful allocators (their functions are static). Perhaps it's time to break away from ptr_container's clone allocator and have my own clone allocator concept.
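A stateful clone allocator of the kind discussed here might look like the sketch below. Everything in it is an assumption made for illustration: cloned, node and counting_cloner are invented names, and the member names allocate_clone and deallocate_clone only mirror ptr_container's static functions, made non-static here so that the allocator object can carry state.

```cpp
#include <cassert>
#include <memory>

struct node {                    // hypothetical payload
    int id;
    explicit node(int i) : id(i) {}
};

// Hypothetical stateful clone allocator: counts how many clones it made.
// Copies of the allocator share one counter.
struct counting_cloner {
    std::shared_ptr<int> clones;
    counting_cloner() : clones(std::make_shared<int>(0)) {}
    node* allocate_clone(const node& n) const { ++*clones; return new node(n); }
    void deallocate_clone(const node* n) const { delete n; }
};

// Illustrative clone-on-copy handle that stores and uses its allocator.
template <class T, class CloneAlloc>
class cloned {
public:
    cloned(T* p, CloneAlloc a) : p_(p), alloc_(a) {}
    cloned(const cloned& o)
        : p_(o.alloc_.allocate_clone(*o.p_)), alloc_(o.alloc_) {}
    ~cloned() { alloc_.deallocate_clone(p_); }
    cloned& operator=(const cloned&) = delete;   // left out of the sketch
    const T& operator*() const { return *p_; }
private:
    T* p_;
    CloneAlloc alloc_;
};

inline int demo_stateful_cloner() {
    counting_cloner c;
    cloned<node, counting_cloner> a(new node(7), c);
    cloned<node, counting_cloner> b(a);          // first clone
    cloned<node, counting_cloner> d(b);          // second clone
    assert((*d).id == 7);
    return *c.clones;
}
```

Storing the allocator in the handle costs space per object, which is one reason a static, stateless concept is attractive for containers.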
The copy on write semantics are problematic from a usability standpoint. Consider this code:
// thread 1
int x1 = p->x;
// thread 2
int x2 = p->x;
Is this legal (assuming that both threads use the same pointer p)? You can't tell without looking at p. If it's const, or points at const, the code is legal. The problem is that there are no visible mutating operations in the code; it looks innocent enough as all we do is read p->x, right?
I intended to add mutexes and thread safety to the code. Is there more to this issue than that? Though I suppose adding locks to a getter function isn't that desirable efficiency-wise.
One option is to always return a const from -> and * and provide a separate mutable accessor.
Not exactly the most desirable solution, but it seems many things in this aren't turning out as I originally expected. Of course, if we go back to clone-on-copy semantics, this won't be an issue. Kevin Spinar
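The construction form asked about at the top of this message (Derived d; clone_ptr<Base> p(d);) can be supported without a virtual clone() by remembering the concrete type at construction time. The sketch below is one possible approach, not something the library is known to do; poly_value, animal and bird are hypothetical names, and Base is assumed to have a virtual destructor.

```cpp
#include <cassert>
#include <memory>

// Hypothetical hierarchy; note that no clone() member is required.
struct animal {
    virtual ~animal() {}
    virtual int legs() const = 0;
};
struct bird : animal {
    int legs() const override { return 2; }
};

// Illustrative value-style holder constructed from an object, not a pointer.
template <class Base>
class poly_value {
public:
    // Copies d into the holder and records how to clone its concrete type.
    template <class Derived>
    explicit poly_value(const Derived& d)
        : p_(new Derived(d)), clone_(&clone_as<Derived>) {}

    poly_value(const poly_value& o)
        : p_(o.clone_(*o.p_)), clone_(o.clone_) {}

    poly_value& operator=(poly_value o) {        // copy-and-swap
        p_.swap(o.p_);
        clone_ = o.clone_;
        return *this;
    }

    const Base& operator*() const { return *p_; }
    const Base* operator->() const { return p_.get(); }
    const Base* get() const { return p_.get(); }

private:
    // Clones through the concrete type captured by the constructor.
    template <class Derived>
    static Base* clone_as(const Base& b) {
        return new Derived(static_cast<const Derived&>(b));
    }
    std::unique_ptr<Base> p_;
    Base* (*clone_)(const Base&);
};

inline bool demo_poly_value() {
    bird b;
    poly_value<animal> p(b);     // no `new` at the call site
    poly_value<animal> q(p);     // deep copy of the stored bird
    return p->legs() == 2 && q->legs() == 2 && p.get() != q.get();
}
```

The trade-off is an extra function pointer per object and the loss of the ability to adopt an object that was already allocated with new.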
participants (4)
- Kevin Spinar
- loufoque
- Matt Calabrese
- Peter Dimov