Proposal: Polymorphic Value Objects

I am proposing the addition of the following very simple utility to Boost, and thus request a very informal and early review of the ideas and design.

Idea
----

The idea is to provide a utility that allows the manipulation of dynamically typed objects, which may be of any type derived from a given base type, as if they were simple base objects. This is quite similar to a smart pointer that does deep copying, but without any pointer or memory allocation exposed.

As code, it appears as such:

    poly_obj<Base> b = Derived();
    poly_obj<BaseBase> b2 = b; // b2 is a copy of b (Derived::Derived(const Derived&) was called), implicit upcasting

Motivation
----------

The motivation for this is that the use of dynamic typing with inheritance and inclusion polymorphism often means explicit memory allocation and deallocation, and is thus often a source of unsafety. shared_ptr is one way to solve the problem, yet the shared object doesn't behave as a normal C++ object: it has entity semantics (= reference) instead of value semantics (= copying).

shared_ptr is highly regarded as being one of the most useful utilities in Boost by neophytes because it allows easier and simpler programming, without the usual pointer-related issues people are used to. However, it might not always be the right solution, and I believe I have seen it being used even when sharing wasn't needed at all.

As a matter of fact, it appears that sharing is rarely needed, and makes the program more difficult to understand (sharing between threads also causes a few issues). Indeed, the object is freed when it is no longer used, but it's not always trivial to trace whether it's still used or not. Sharing also means you can expect side effects coming from anywhere.

Pure value objects make the program easier to understand, and thus to maintain, by making it more deterministic. Value semantics is what most of the standard library is built upon.
And thus it makes more sense to use the standard containers with poly_obj than with shared_ptr, because this is the kind of semantics they were designed for.

Design
------

To implement that utility, there needs to be a way to call the appropriate copy constructor and move constructor for the right types, even with type erasure. There is no standard way, however, to simply make constructors virtual, so something has to be built to do it.

A few implementations of clone_ptr, a similar idea, that I have seen tried to make this non-intrusive. This of course adds some overhead, mainly in the size of the object, but it has another downside: it requires assuming that the static types of the objects being inserted are their real types. Thus, writing this kind of code:

    poly_obj<Base> b = poly_obj<Base>(Derived()).get();

would produce type splitting (slicing). (The get() member function returns a real C++ reference to the object.) I believe this is not really a good thing, and the only way to fix it is to make the virtuality of the constructors intrusive.

I thus came up with the following design:

    class Foo
    {
        ...
    public:
        virtual void clone(void* p) const
        {
            new(p) Foo(*this);
        }

        virtual std::pair<std::size_t, std::size_t> size_and_align() const
        {
            return std::make_pair(sizeof(Foo), boost::alignment_of<Foo>::value);
        }
    };

(Of course, a 'mixin' can be used to avoid repeatedly rewriting such code for every class.)

It is quite important to stabilize that interface, so remarks are welcome.

It is judged very important that the user should be able to define how the memory is allocated. Since virtual templates aren't possible, an allocator cannot be passed to the 'clone' member function. The solution I came up with allows cloning the object into already allocated memory, and provides the necessary information to know what that memory should be like. It is, however, still not really possible to use standard allocators, which need to know the type.
An additional member for moving might also be needed.

Code
----

Some initial code is available in the vault, under "Utilities":
http://www.boost-consulting.com/vault/index.php?direction=0&order=&directory=Utilities

Note that the tests are just there to show how it is to be used, and are not real tests at all. There is no support for moving yet, and no support for custom allocators either (since an allocator interface would have to be agreed upon first).

operator= also needs a facility to compare types, and thus it uses RTTI. Since I have seen people complain about the usage of RTTI a few times, there is also a hack to replace it.
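As a rough illustration of the intrusive interface described under Design, here is a minimal sketch. It is not the code from the vault. The names Shape, Circle and clone_and_name are made up for the example, the template parameters of the virtual_ctors mixin are an assumption, and C++11 alignof stands in for boost::alignment_of. The point it tries to show is that the caller chooses how to allocate, while clone() only constructs.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Base class exposing the two "virtual constructor" members from the proposal.
struct Shape {
    virtual ~Shape() {}
    virtual void clone(void* p) const = 0;
    virtual std::pair<std::size_t, std::size_t> size_and_align() const = 0;
    virtual std::string name() const = 0;
};

// Assumed mixin (CRTP): implements both members once for any Derived.
template<typename Derived, typename Base>
struct virtual_ctors : Base {
    virtual void clone(void* p) const {
        // Construct only; the caller has already provided the memory.
        new (p) Derived(static_cast<const Derived&>(*this));
    }
    virtual std::pair<std::size_t, std::size_t> size_and_align() const {
        return std::make_pair(sizeof(Derived), alignof(Derived));
    }
};

struct Circle : virtual_ctors<Circle, Shape> {
    virtual std::string name() const { return "Circle"; }
};

// Copies src through its base interface and reports the copy's dynamic name.
inline std::string clone_and_name(const Shape& src) {
    const std::pair<std::size_t, std::size_t> sa = src.size_and_align();
    // ::operator new only guarantees fundamental alignment.
    assert(sa.second <= alignof(std::max_align_t));
    void* raw = ::operator new(sa.first);    // allocation: the caller's choice
    src.clone(raw);                          // construction: virtual
    // Assumes the Shape subobject starts at raw (single inheritance).
    Shape* copy = static_cast<Shape*>(raw);
    const std::string result = copy->name();
    copy->~Shape();
    ::operator delete(raw);
    return result;
}
```

Calling clone_and_name(Circle()) goes through Shape only, yet the copy is a Circle. Because clone() returns nothing, the sketch has to assume where the base subobject lives in the storage, which may be worth considering when stabilizing the interface.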

Mathias Gaunard wrote:
I am proposing the addition of the following very simple utility in Boost, and thus request a very informal and early review of the ideas and design.
Idea ----
The idea is to provide a utility that allows the manipulation of dynamically typed objects, which may be of any type derived from a given base type, as if they were simple base objects. This is quite similar to a smart pointer that does deep copying, but without any pointer or memory allocation exposed.
[ 8< -- snip -- ]
The motivation for this is that the use of dynamic typing with inheritance and inclusion polymorphism often means explicit memory allocation and deallocation, and is thus often a source of unsafety. shared_ptr is one way to solve the problem, yet the shared object doesn't behave as a normal C++ object: it has entity semantics (= reference) instead of value semantics (= copying).
shared_ptr is highly regarded as being one of the most useful utilities
[ 8< -- snip -- ]
As a matter of fact, it appears that sharing is rarely needed, and makes the program more difficult to understand (sharing between threads also causes a few issues). Indeed, the object is freed when it is no longer used, but it's not always trivial to trace whether it's still used or not. Sharing also means you can expect side effects coming from anywhere.
I'm confused. Surely the whole point of shared_ptr is to relieve the programmer of the burden of worrying when it's still used. And what side effects do you speak of?
Pure value objects make the program easier to understand, and thus to maintain, by making it more deterministic.
How are programs made more deterministic?
Design ------
To implement that utility, there needs to be a way to call the appropriate copy constructor and move constructor for the right types, even with type erasure. There is no standard way, however, to simply make constructors virtual, so something has to be built to do it.
A few implementations of clone_ptr, a similar idea, that I have seen tried to make this non-intrusive. This of course adds some overhead, mainly in the size of the object, but it has another downside: it requires assuming that the static types of the objects being inserted are their real types.
Thus, writing this kind of code: poly_obj<Base> b = poly_obj<Base>(Derived()).get(); would produce type splitting.
Actually, if you have the static type at the point of construction, you're set. Here's one I just threw together: http://rafb.net/p/bIQgtl24.html
Edd

Edd Dawson wrote:
Here's one I just threw together: http://rafb.net/p/bIQgtl24.html
Sorry, wasn't thinking straight. That link expires after 24 hours. Here we are:

    #include <boost/scoped_ptr.hpp>
    #include <typeinfo>
    #include <cassert>
    #include <iostream>

    template<typename T>
    class poly_obj
    {
    public:
        template<typename U>
        explicit poly_obj(const U &obj) :
            guts_(new guts<U>(obj))
        {
            if (typeid(obj) != typeid(U))
            {
                // static type != dynamic type.
                // copy-ctor will slice.
                throw "you've been naughty!";
                // ^ std::bad_cast, or related in real life, maybe?
            }
        }

        // this could be templated, if you like
        poly_obj(const poly_obj &other) :
            guts_(other.guts_->clone())
        {
        }

        // so could this
        poly_obj &operator= (const poly_obj &rhs)
        {
            poly_obj temp(rhs);
            temp.guts_.swap(guts_);
            return *this;
        }

        // These probably won't work if T is const, haven't thought it
        // through properly
        const T &get() const { assert(guts_); return guts_->get(); }
        T &get() { assert(guts_); return guts_->get(); }

    private:
        // Type-erased holder: knows how to clone the concrete U.
        struct guts_base
        {
            virtual ~guts_base() { }
            virtual guts_base *clone() const = 0;
            virtual T &get() = 0;
        };

        template<typename U>
        struct guts : guts_base
        {
            guts(const U &obj) : obj_(obj) { }
            guts_base *clone() const { return new guts(obj_); }
            T &get() { return obj_; }
            U obj_;
        };

    private:
        boost::scoped_ptr<guts_base> guts_;
    };

    struct B { virtual ~B() {} };
    struct D : B { };
    struct E : D { };

    int main()
    {
        D d;
        poly_obj<B> po(d);
        (void)dynamic_cast<D &>(po.get());
        std::cout << "We have a D!\n";

        E e;
        try
        {
            // static type D, dynamic type E: the constructor throws
            poly_obj<B>(static_cast<D &>(e));
        }
        catch(const char *bad)
        {
            std::cerr << bad << '\n';
        }

        return 0;
    }

Edd

Edd Dawson wrote:
I'm confused. Surely the whole point of shared_ptr is to relieve the programmer of the burden of worrying when it's still used.
The problem is that with large programs you're not able to tell easily where it is being used. (Also, you may create cycles by oversight, but that could be fixed eventually.) The fact that it is automatically destructed when not used does not mean that it can't still be used in the wrong places, especially after code refactoring and modifications.
And what side effects do you speak of?
http://en.wikipedia.org/wiki/Side_effect_%28computer_science%29
How are programs made more deterministic?
Because ownership is fixed to a scope. You thus perfectly know when the object will be constructed and destructed. With shared ownership, ownership is fully dynamic, and thus not determined.
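To make the difference in destruction timing concrete, here is a small sketch. The Tracer type and both function names are invented for the illustration, and std::shared_ptr is used as a stand-in for boost::shared_ptr. With the scoped object the destruction point can be read off the braces; with the shared one it depends on whoever holds the last owner.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Records its own destruction in a log owned by the caller.
struct Tracer {
    std::vector<std::string>* log;
    explicit Tracer(std::vector<std::string>* l) : log(l) { assert(log != 0); }
    ~Tracer() { log->push_back("destroyed"); }
};

// Value with scoped ownership: destroyed exactly at the closing brace.
inline std::vector<std::string> scoped_lifetime() {
    std::vector<std::string> log;
    {
        Tracer t(&log);
        log.push_back("in scope");
    }   // t is destroyed here
    log.push_back("after scope");
    return log;
}

// Shared ownership: the object outlives the scope that created it.
inline std::vector<std::string> shared_lifetime() {
    std::vector<std::string> log;
    std::shared_ptr<Tracer> keeper;   // could live anywhere in a large program
    {
        std::shared_ptr<Tracer> p = std::make_shared<Tracer>(&log);
        keeper = p;                   // a second owner appears
        log.push_back("in scope");
    }   // p goes away, but the Tracer does not
    log.push_back("after scope");
    keeper.reset();                   // the Tracer is destroyed only now
    return log;
}
```

scoped_lifetime() logs "in scope", "destroyed", "after scope", while shared_lifetime() logs "in scope", "after scope", "destroyed".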
Actually, if you have the static type at the point of construction, you're set. Here's one I just threw together: http://rafb.net/p/bIQgtl24.html
I looked at your code and I am not sure throwing an exception is the right thing to do if the static and dynamic types aren't the same. The stuff should "just work".
In your code also you used 'new' directly. It is obvious that people may want to use their custom allocator.
Plus, if you think about it, a virtual constructor should only be responsible for constructing, not allocating the memory for the object.

Mathias Gaunard wrote:
Edd Dawson wrote:
I'm confused. Surely the whole point of shared_ptr is to relieve the programmer of the burden of worrying when it's still used.
The problem is that with large programs you're not able to tell easily where it is being used. (Also, you may create cycles by oversight, but that could be fixed eventually.) The fact that it is automatically destructed when not used does not mean that it can't still be used in the wrong places, especially after code refactoring and modifications.
And what side effects do you speak of?
http://en.wikipedia.org/wiki/Side_effect_%28computer_science%29
Perhaps I should have said "*which* side effects do you speak of"! I'm aware of the definition of the term as is pertinent to this discussion, but I'm unsure of the particular side effects exhibited by the use of shared_ptr that are deemed undesirable and supposedly avoided by the use of poly_obj.
How are programs made more deterministic?
Because ownership is fixed to a scope. You thus perfectly know when the object will be constructed and destructed. With shared ownership, ownership is fully dynamic, and thus not determined.
It's deterministic in that it will be destructed at the point where it is no longer used. Often that's all the determinism you need. I agree with you that shared_ptr is often overused, or at least used rather flippantly, but when used properly it is exactly deterministic enough. Perhaps I'm agreeing with you, here? Hard to tell :)
Actually, if you have the static type at the point of construction, you're set. Here's one I just threw together: http://rafb.net/p/bIQgtl24.html
I looked at your code and I am not sure throwing an exception is the right thing to do if the static and dynamic types aren't the same.
I'm not 100% convinced either. But there's a tradeoff (more on this in a sec). It does however ensure (not at compile-time, unfortunately) that the user is passing in the object in a state where slicing can be avoided.
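The run-time check in question can be isolated as below. This is only a sketch: would_slice and copy_exact are names made up here, and throwing std::bad_cast is just the alternative floated in the comment of the posted code, not a settled choice.

```cpp
#include <cassert>
#include <typeinfo>

struct B { virtual ~B() {} };
struct D : B { };
struct E : D { };

// True when obj's dynamic type differs from the static type U, i.e. when
// copy-constructing a U from obj would slice. B must be polymorphic for
// typeid to report the dynamic type.
template<typename U>
bool would_slice(const U& obj) {
    return typeid(obj) != typeid(U);
}

// Copies obj only if the copy is exact; otherwise reports the problem.
template<typename U>
U copy_exact(const U& obj) {
    if (would_slice(obj)) {
        throw std::bad_cast();
    }
    return obj;
}

// Small usage demonstration.
inline void demo_would_slice() {
    D d;
    E e;
    D& e_seen_as_d = e;
    assert(!would_slice(d));           // static D, dynamic D
    assert(would_slice(e_seen_as_d));  // static D, dynamic E
}
```

The check costs one typeid comparison per construction and needs RTTI, which is part of the tradeoff.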
The stuff should "just work".
That's certainly desirable, yes. My code wasn't meant to be a full solution (only a mockup of an alternative). I just don't like having to change existing code; it's fine deriving from virtual_ctors<> when creating new classes, but it won't work with the stuff I already have (or have I missed something?). One could imagine modifying my code to check for a specialisation of clonable_object<> or something similar, which defines how a clone may be created without changing existing code.
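A rough sketch of what such a trait might look like follows. Only the name clonable_object<> comes from the paragraph above; its members, and the Node/Leaf hierarchy standing in for "the stuff I already have", are assumptions made for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical trait. By default, clone with the copy constructor of the
// static type (which slices if the dynamic type differs).
template<typename T>
struct clonable_object {
    static T* clone(const T& obj) { return new T(obj); }
};

// Stand-in for an existing hierarchy that already has its own virtual
// copy function and that we do not want to modify.
struct Node {
    virtual ~Node() {}
    virtual Node* copy() const { return new Node(*this); }
    virtual std::string kind() const { return "Node"; }
};

struct Leaf : Node {
    virtual Node* copy() const { return new Leaf(*this); }
    virtual std::string kind() const { return "Leaf"; }
};

// Specialisation written next to the holder, not inside Node: it tells the
// holder to use the hierarchy's own copy(), so the dynamic type is kept.
template<>
struct clonable_object<Node> {
    static Node* clone(const Node& obj) { return obj.copy(); }
};

// Clones through the trait and reports the dynamic type of the clone.
inline std::string cloned_kind(const Node& n) {
    Node* c = clonable_object<Node>::clone(n);
    assert(c != 0);
    const std::string k = c->kind();
    delete c;
    return k;
}
```

With this, cloned_kind(Leaf()) yields "Leaf" even though only Node is named at the call site, and classes without a specialisation fall back to plain copy construction.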
In your code also you used 'new' directly. It is obvious that people may want to use their custom allocator.
Sure. I haven't thought about allowing custom allocation. To be honest I'm not at all well read on allocators. If you consider my "design" any further, I'll leave you to consider how an allocator interface would be incorporated :) My naive gut feeling is that it shouldn't be too difficult.
Plus, if you think about it, a virtual constructor should only be responsible for constructing, not allocating the memory for the object.
I guess. But I was thinking about it at the "polymorphic value object" level, more than implementing a "virtual constructor". Again, my vote would be to go with anything that provides a non-intrusive solution.
Edd