Re: [boost] [interest] underlying type library

22 Aug 2011

On 22.08.2011 13:04, Julian Gonggrijp wrote:
...
...
Nevin Liber wrote:
Even examining the implementation for all your member variables isn't
enough.  The boost::function which holds a boost::bind(..., this, ...) where
the function object is stored inside the boost::function object itself may
now exhibit different semantics than when the function object it is holding
is in the heap.  Ugh.
Users should simply be warned that if their type has member data that
were implemented by a third party and which are initialized with a
pointer to (a part of) the main object, they should assume their type
to depend on the object's memory location for its validity.
There is nothing simple about this warning. It's a pretty complicated 
condition.
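To make the condition concrete, here is a minimal sketch of a type whose validity depends on its own address. The names SelfRef, init and points_to_own_buffer are made up for illustration and appear nowhere in the proposal. The struct is deliberately kept trivially copyable so that the memcpy itself is well defined; the point is only that the copied pointer still refers to the source object.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical type: 'cursor' is meant to point into this object's own 'buf'.
struct SelfRef {
    char buf[16];
    char* cursor;
};

// Establish the invariant: cursor points at the start of our own buffer.
inline void init(SelfRef& s) {
    std::memset(s.buf, 0, sizeof s.buf);
    s.cursor = s.buf;
}

// True only while the invariant set up by init() still holds.
inline bool points_to_own_buffer(const SelfRef& s) {
    return s.cursor == s.buf;
}

// Bitwise copy: legal here because SelfRef is trivially copyable,
// but the copied cursor still points into 'src', not into 'dst'.
inline void bitwise_copy(SelfRef& dst, const SelfRef& src) {
    std::memcpy(&dst, &src, sizeof(SelfRef));
}
```

After bitwise_copy(b, a), b.cursor equals a.buf, so b silently depends on a staying alive. With a third-party member such as the boost::function above, the user cannot even see that such a pointer exists.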
...
Way back when, C++ did bitwise copying for the compiler generated copy
operations, and was changed to member wise copying for good reason.  I
really don't want to go back to that world.
Let me reassure you that move_raw is not inherently about bitwise
copying; you can do exactly the same with memberwise copying but that
requires the author of the type to provide their own implementation.
As far as I can tell, in fact, in Stepanov's paper move_raw has 
absolutely nothing to do with bitwise copying.
...
...
On 21.08.2011 21:23, Julian Gonggrijp wrote:
...
Dear all,
I think the set of types with which bitwise move_raw will yield
undefined behaviour can be sharply defined:
The standard already does. Undefined behavior is a notion of the
standard, not of a particular implementation. The standard says that
this technique works only for PODs (trivially copyable types in C++0x),
nothing else.
Doesn't the standard say that bitwise copying of non-PODs is undefined
behaviour /because there are cases where bitwise copying will give you
an invalid result/?
No, the standard doesn't give a reason.
  Have we not identified the cases for which an
invalid result will be obtained? Can we therefore not maintain --
making use of the semantical definition of move_raw and restricting
ourselves to the set of unproblematic types -- that bitwise copying is
a safe implementation of move_raw even though in a very strict
juridical sense it may be undefined behaviour?
Absolutely not.
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
Sebastian Redl wrote:
Undefined behavior isn't just about "a very strict juridical sense" of
things. Compilers are allowed to assume that UB doesn't happen. If you
memcpy over a non-POD, the compiler is allowed to assume the whole
branch containing the memcpy is dead code - it cannot ever be reached,
because reaching it would invoke UB. Let's say you have this:

if (x != 42) {
   memcpy(&nonpod, &source, size);
} else {
   other_code();
}
std::cout << x << std::endl;

The behavior of the memcpy is undefined. As such, the compiler can
generate any code it wants for this branch - like the code of the second
branch, thus eliminating the branching entirely.
But not only that. In fact, because the compiler will assume that the
program is valid, and entering the memcpy branch would invoke UB, it can
deduce that x cannot possibly be anything but 42! That is, the std::cout
could output "42" even if you set x to something other than 42, because
the optimizer replaced all occurrences of x with the constant 42.

Now, I don't know any compiler that is actually that strict, especially
with memcpy, but my point is that *there is no such thing as benign
undefined behavior*.

...
...
Conceptually, if you think of classes as "aggregates and then some",
there is the underlying aggregate.
I'm not thinking of classes in that way. The underlying type is
something different from an "underlying aggregate" (although I realise
I have given that impression in my first email; my apologies).
Instead, the underlying type is defined by its semantic relation to
the 'overlying type' as I have also stated in my reply to Mathias
Gaunard (http://lists.boost.org/Archives/boost/2011/08/184913.php):
The underlying type of T is a POD which can store the state of an
instance of T.
So far, it seems that those who have read Stepanov's paper are more
positive about the possible value of my proposal than those who didn't
read it. This seems to confirm that Stepanov is still better at
explaining the value of move_raw than I am.
OK, here's the problem I see.
Stepanov's approach requires quite a bit of manual interaction from the
user. Not only does it require that the user define underlying_type for
types where the copy constructor isn't good enough, it also requires
three separate overloads of move_raw.
IIUC, your library attempts to solve this task generically, by providing 
an underlying type and move_raw implementations that work with all or at 
least most types.
C++03 provides two generic ways of transferring data from one place to 
another. One is the copy constructor (and the whole point of move_raw is 
that copying is a waste of time). The other is memcpy, which is 
undefined for non-PODs. (Note that for PODs, the copy constructor is 
always good enough.)
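As a rough sketch of how those two routes could be kept apart, the helper below uses memcpy only where the standard permits it and otherwise falls back to an ordinary (move) assignment. It assumes a C++0x/C++11 compiler with <type_traits>; the name transfer is hypothetical and is not part of the proposed library or of Stepanov's notes. It also assumes dst and src are distinct objects, since memcpy requires non-overlapping ranges.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>
#include <utility>

// Trivially copyable types: a bitwise copy is well defined.
template <typename T>
typename std::enable_if<std::is_trivially_copyable<T>::value>::type
transfer(T& dst, T& src) {
    std::memcpy(&dst, &src, sizeof(T));
}

// Everything else: go through the type's own assignment operator,
// which is a move assignment where the type provides one.
template <typename T>
typename std::enable_if<!std::is_trivially_copyable<T>::value>::type
transfer(T& dst, T& src) {
    dst = std::move(src);
}
```

Note that this buys nothing over plain assignment for the trivially copyable case, which is the point made in the parenthetical above.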
Therefore, what services can your library provide?
- It cannot provide a perfect underlying type: it would require 
introspection, and such mechanisms are basically non-existent in C++. In 
fact, I do not see how you could provide any approximation beyond what 
Mathias proposed without the user effectively doing the work, but I'm 
interested in your approach.
- It cannot provide a reliable, generic move_raw. I pointed out problems 
with memcpy(), and others did as well. Some people on this list might 
see these as minor or theoretical, but I don't.
What, then, does your library still do? I can see it offering a standard
way of implementing underlying_type<>. That's nice, but I'd rather
implement proper move construction/assignment and fix my type if it
doesn't have a cheap default constructor. Boost.Move makes this possible
in C++03. I can see it offering a memcpy()-based move_raw for opt-in, but I
don't think it's a good idea to do that. What's left?

In my opinion, C++0x's move support has obsoleted the move_raw concept. 
Yes, there may be some types for which move_raw is more efficient, 
because it's expensive to bring a moved-from object into a valid state. 
But such types will, I believe, die out.
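For illustration, here is a minimal sketch of the kind of type I have in mind, written against C++0x move semantics. Buffer is a made-up example class, not something from Boost or from the paper. Leaving the moved-from object in a valid state costs two stores here, which is why I expect the advantage of move_raw to be small for well-designed types.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer with a cheap empty state.
class Buffer {
public:
    explicit Buffer(std::size_t n)
        : data_(n ? new int[n]() : nullptr), size_(n) {}
    ~Buffer() { delete[] data_; }

    // Move construction: steal the pointer, leave 'other' empty but valid.
    Buffer(Buffer&& other) noexcept
        : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    // Move assignment: release our storage, then steal from 'other'.
    Buffer& operator=(Buffer&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = other.data_;
            size_ = other.size_;
            other.data_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    // Copying is disabled to keep the sketch short.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    int* data_;
    std::size_t size_;
};
```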
...
Therefore I invite those who haven't read the paper yet to still do
so. Only lectures 4 and 5 are required; it's only 11 pages with some
trailing source code that you can skip. It's a very enjoyable read and
I almost dare to promise you that you'll want to read the rest of the
paper as well. :-)
The URL: http://www.stepanovpapers.com/notes.pdf
This is a very interesting paper, thank you.

Sebastian