[serialization] patch for portable_binary_oarchive.hpp

Two serialization library tests: test_demo_portable_archive[_dll] fail to compile on many platforms with invalid type conversion error. I did not check all the platforms, but I'd expect it to fail on any big endian platform. The patch below fixes the problem, verified on HP-UX with both aC++ and gcc. Ok to commit? Index: example/portable_binary_oarchive.hpp =================================================================== --- example/portable_binary_oarchive.hpp (revision 40687) +++ example/portable_binary_oarchive.hpp (working copy) @@ -76,7 +76,7 @@ // we choose to use litle endian #ifdef BOOST_BIG_ENDIAN - char * first = static_cast<char *>(static_cast<void *>(& l)); + char * first = reinterpret_cast<char *>(const_cast<long *>(& l)); char * last = first + sizeof(l) - 1; for(;first < last;++first, --last){ char x = *last; Thanks, Boris

I've discovered that recent changes in binary_?archive render portable_binary_?archive unusable. It was originally made as an example to demostrate extension via inheritance. Since it no longer serves that purpose, it will be removed in the near future. Robert Ramey Boris Gubenko wrote:

on Fri Nov 02 2007, "Boris Gubenko" <Boris.Gubenko-AT-hp.com> wrote:
I can't imagine how your change would fix an endianness problem. Also, your change relies on the nonportable (implementation-defined) behavior of reinterpret_cast. I think the correct change to silence the type conversion error is char * first = static_cast<char *>(static_cast<void *>(const_cast<long*>(& l))); However the code in itself invokes undefined behavior by modifying a constant object (l). The byte reversal needs to be done in a separate buffer.
-- Dave Abrahams Boost Consulting http://www.boost-consulting.com

David Abrahams wrote:
I can't imagine how your change would fix an endianness problem.
It does not fix "an endianness problem". The bug is in the code conditionalized with '#ifdef BOOST_BIG_ENDIAN'. This is the only connection between endianness and the fix.
Also, your change relies on the nonportable (implementation-defined) behavior of reinterpret_cast.
It does. As well as other Boost code using reinterpret_cast :-) Scanning Boost sources (trunk) for reinterpret_cast yields 700 hits, including 8 instances in the Boost python library (two of them are commented out): bash-2.03$ find ./boost/libs/python -type f -exec grep reinterpret_cast {} \; instance<>* self = reinterpret_cast<instance<>*>(inst); _doc += str(reinterpret_cast<const char*>(detail::py_signature_tag)); // _doc += "\n"+str(reinterpret_cast<const char*>(detail::cpp_signature_tag)); _doc += str(reinterpret_cast<const char*>(detail::cpp_signature_tag)); instance<>* self = reinterpret_cast<instance<>*>(inst); _doc += str(reinterpret_cast<const char*>(detail::py_signature_tag)); // _doc += "\n"+str(reinterpret_cast<const char*>(detail::cpp_signature_tag)); _doc += str(reinterpret_cast<const char*>(detail::cpp_signature_tag)); bash-2.03$ You are, probably, referring to 5.2.10 - Reinterpret cast [expr.reinterpret.cast], para 3: -3- The mapping performed by reinterpret_cast is implementation-defined. [Note: it might, or might not, produce a representation different from the original value. ] In my fix, reinterpret_cast performs a mapping from non-null pointer to long to pointer to char. If reinterpret_cast cannot be used in this case without invoking implementation-defined behavior, I'm not sure how it can be used in a multiplatform code like Boost at all.
I'm not sure my change is incorrect. In terms of style, I think that your change is better. Thanks, Boris ----- Original Message ----- From: "David Abrahams" <dave@boost-consulting.com> To: <boost@lists.boost.org> Sent: Friday, November 02, 2007 4:33 PM Subject: Re: [boost] [serialization] patch for portable_binary_oarchive.hpp

Boris Gubenko:
Dave's insistence that we have to obfuscate our reinterpret_casts is well known and a very strict reading of the standard is on his side _in this specific case_, but I personally don't find his style better at all. (A reinterpret_cast<char*>( p ) is indirectly required to work when p is a pointer to a class type, but it's possible in theory for a sufficiently mischievous compiler to break it for long*. I know of no such compiler, present or future, of course.)

Peter Dimov wrote:
Is reinterpret_cast<char*>( pointer-to-a-class-type ) the only case when reinterpret_cast is required to work? Is it required to work in this case: array of a scalar-type to a pointer-to-scalar-type ? extern char cpp_signature_tag[]; ... reinterpret_cast<const char*>(detail::cpp_signature_tag) Just trying to understand when it is safe to use reinterpret_cast. Thanks, Boris ----- Original Message ----- From: "Peter Dimov" <pdimov@pdimov.com> To: <boost@lists.boost.org> Sent: Friday, November 02, 2007 7:03 PM Subject: Re: [boost] [serialization] patch for portable_binary_oarchive.hpp

on Fri Nov 02 2007, "Peter Dimov" <pdimov-AT-pdimov.com> wrote:
Me neither. But that's the way it is. BTW, when I asked this question of the standard committee's core working group, the response was a kind of stunned surprise that anyone would even ask, and aside from one person, they all thought it was obviously true that static_cast-ing through void was the only correct thing to do.
(A reinterpret_cast<char*>( p ) is indirectly required to work when p is a pointer to a class type,
Where does the standard say that, and what do you mean by "work?" As far as I know the only things that reinterpret_cast is required to work for are round-trip pointer conversions and -- if a suitable integer type exists -- some kind of conversion of pointers into integer types.
And most probably won't, because reinterpret_cast is so widely used with the assumption it does what everyone wants. However, we might want Boost code to pass cleanly through code verification tools, for example. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

David Abrahams:
Consider this code: class X; char * f( X * p ) { return reinterpret_cast<char*>( p ); } This must give you a char* pointing to the first byte of *p because X's definition may be struct X { char v; }; and the standard requires a reinterpret_cast to yield a pointer to the first element (9.2/17). Because of ODR, the complete type case must be compiled in the same way, even when X happens to not have a first member of type char (link-time code generation aside).

Peter Dimov wrote:
The standard says no such thing. It speaks of a pointer "suitably converted". It doesn't say that reinterpret_cast is such a suitable conversion; in fact, the definition of reinterpret_cast itself makes it very clear that it is not. A suitable conversion is for example the double static_cast that David has shown. Sebastian Redl

Sebastian Redl: ...
The sentence is A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. Note the "using a reinterpret_cast" part that immediately follows the "suitably converted" part.

All right, sorry. Seems they changed that in the final. (Yeah, I should really get the real thing. But it's damn expensive for a student.)
You should try googling for "ISO 14882". http://webstore.ansi.org/RecordDetail.aspx?sku=INCITS%2FISO%2FIEC+14882-2003 sells it for $30, and some of the other hits are even cheaper. ;-)

Peter Dimov wrote:
I'm not sure what "suitably converted" means but if the compiler has enough information to prove that a type is not a POD-struct then what relevance does this have? If none than it must also have no relevance to generic programming. The cast chain is very ugly, hard to read, and is certainly not self documenting. Perhaps this is a spot for another utility cast? Thanks, Michael Marcin

on Sat Nov 03 2007, "Peter Dimov" <pdimov-AT-pdimov.com> wrote:
Any POD struct will do.
and the standard requires a reinterpret_cast to yield a pointer to the first element (9.2/17).
Nice.
Right. OK, then, look for a core issue from me. It's ridiculous that reinterpret_cast should be defined to be so close to what everyone thinks it is, and yet, not quite. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

on Fri Nov 02 2007, "Boris Gubenko" <Boris.Gubenko-AT-hp.com> wrote:
Those are most likely all wrong; they should be looked at one by one and fixed.
I am apparently responsible for one of those, though only a careful combing through history will tell for sure. I need to bother other people about the other instances.
Very perceptive. There are almost no ways to use reinterpret_cast portably. If you want to get a char* that traverses the bytes of a long, you do it as below:
It is if you're trying to write portable code. You've already found the place in the standard where it says reinterpret_cast performs an implementation-defined mapping. If you want to assert your change is portable, please show me where in the standard it defines the behavior of your specific usage.
In terms of style, I think that your change is better.
Hah, I think it's stylistically distasteful, but at least it's correct. I wish that it were properly spelled reinterpret_cast<char*>(&l) ;-) -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

David Abrahams wrote:
No, it is not portable, it just "works" :-)
The generated code is the same with both your change (a chain of static_cast's) and my change (reinterpet_cast). This is the case with every existing compiler. However, unlike my change, your change is portable. This is what I meant by a better style: a portable construct vs. non-portable. Thanks for the explanation. Boris ----- Original Message ----- From: "David Abrahams" <dave@boost-consulting.com> To: <boost@lists.boost.org> Sent: Friday, November 02, 2007 10:28 PM Subject: Re: [boost] [serialization] patch for portable_binary_oarchive.hpp

David Abrahams wrote: programs that misuse reinterpret_cast, but, as far as I can tell, perfectly compliant. It's simple: on every reinterpret_cast, flip the most significant bit of the pointer. When you do the cast back to the original type, you have your pointer back. But the intermediate value is almost guaranteed to be invalid. (Think of the 2G barrier on typical x86-32 systems, or the fact that x86-64 only uses 48 address bits.) The tricky part is that you have to compare 0x00000000 and 0x70000000 equal, or not flip the bit for null, in order to keep the null preserving that is required of reinterpret_cast. Sebastian Redl

On Fri, 2 Nov 2007, David Abrahams wrote:
Finally; I'd been looking for the "portable" solution -- and stuck with reinterpret_cast. Maybe Boost could define a standard char_cast or Do What I Mean dwim_cast? Something like template <class T> inline char * char_cast(T *x) { return static_cast<char *>(static_cast<void *>(x)); } template <class T> inline T * char_uncast(char *x) { return static_cast<T *>(static_cast<void *>(x)); } - Daniel
participants (8)
-
Beman Dawes
-
Boris Gubenko
-
David Abrahams
-
dherring@ll.mit.edu
-
Michael Marcin
-
Peter Dimov
-
Robert Ramey
-
Sebastian Redl