[Off topic] Strict Aliasing white paper redux

Some time ago, I posted a link here to my (then new) white paper on strict aliasing. I wrote it because people keep posting, on this and on other lists, questions that show a basic misunderstanding of aliasing and of what the aliasing rules in the C and C++ standards are intended to communicate. I got some great feedback and have posted a new revision: http://dbp-consulting.com/StrictAliasing.pdf. I'm hoping that people on this list will review the paper and suggest more ways of improving it. My intention is to write a paper that saves me from answering the same questions over and over again ;) I'll just point people at the paper.
Best regards,
Patrick

On Mon, Jan 3, 2011 at 2:11 PM, Patrick Horgan wrote:
> I got some great feedback and have posted a new revision, http://dbp-consulting.com/StrictAliasing.pdf.
Way cool Patrick! Thanks for sharing the link and the well-written document! :)
--
Dean Michael Berris
about.me/deanberris

Patrick Horgan wrote, On 3.1.2011 7:11:
> I'm hoping that people on this list will review the paper and tell me some more ways of improving it.
Nice paper. But I believe parts are wrong or misleading.
Misleading is the part about -fno-strict-aliasing. It is GCC specific.
The wrong part, I think, is the part suggesting union as a solution. As far as I know, you can only read from a union through the member that you last stored into it. The fact that you can access a different union member is also an extension, though one more common than just GCC.
--
VH

On 01/02/2011 10:53 PM, Václav Haisman wrote:
> Nice paper. But I believe parts are wrong or misleading. Misleading is the part about -fno-strict-aliasing. It is GCC specific.
That's a good point, and I thought about that when writing it, but so far it seems that only gcc gives warning messages about strict aliasing. At least, when googling, gcc's are the only ones I see. You're right, though, that I should point out that it is a gcc-specific option.
> The wrong part, I think, is the part suggesting union as a solution. As far as I know you can only read from union through a member that you have put into it. The fact that you can access different union member is also an extension, though one more common than just GCC specific.
I'm not so sure. This idiom has been around for as long as unions have been in C. Do you know of any compilers that don't support it? Of course memcpy, or any other solution using character pointers, would be supported, but compilers wouldn't generate efficient code for what is, in this case, a simple swap of 16-bit ints. Clearly the specs say that a union can only contain one object at a time. Hmmm. The C99 spec has a footnote to section 6.5.2.3/3 that seems to clearly say you can do this:
    85) If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
The current C++ spec doesn't say anything about this, so it seems to be a difference between C and C++. That is not a surprise, since C++ wants you to do a delete and placement new to switch active members of a union. C++ unions get implicit copy constructors. Nevertheless, it bothers me to think that a POD union would act differently in C and C++.
I've CC'd a friend, Nick Stoughton, to see if he has any thoughts on this.
Patrick
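For readers without the paper at hand, the union idiom under debate looks roughly like the sketch below. It is an illustration only, not the paper's exact code; the names `Halves` and `swaphalves_union` are made up here. It is written as C, where the footnote quoted above describes reading a member other than the one last stored. Whether the same source is guaranteed to work when compiled as C++ is exactly what this thread questions.

```c
#include <stdint.h>

/* Hypothetical names; a sketch of the union idiom, not the paper's code. */
union Halves {
    uint32_t whole;     /* the full 32-bit value */
    uint16_t parts[2];  /* the same storage viewed as two 16-bit halves */
};

/* Swap the two 16-bit halves of a 32-bit value through the union. */
uint32_t swaphalves_union(uint32_t a)
{
    union Halves u;
    uint16_t tmp;

    u.whole = a;              /* store through one member */
    tmp = u.parts[0];         /* read through the other (type punning) */
    u.parts[0] = u.parts[1];
    u.parts[1] = tmp;
    return u.whole;           /* read back through the first member */
}
```

Because the two halves are simply exchanged, the result does not depend on byte order: 0x12345678 becomes 0x56781234 on little-endian and big-endian machines alike.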

On Mon, 03 Jan 2011 00:08:23 -0800, Patrick Horgan wrote:
> The current C++ spec doesn't say anything about this, so it seems to be a difference between C and C++, not a surprise, since C++ wants you to do a delete and placement new to switch active members of a union. C++ unions get implicit copy constructors. Nevertheless, it bothers me to think that a pod union would act differently in C and C++.
> I've CC'd a friend Nick Stoughton to see if he has any thoughts on this.
I conclude that from 9.5/1 (excerpt):
    "In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time. [Note: one special guarantee is made in order to simplify the use of unions: If a POD-union contains several POD-structs that share a common initial sequence (9.2), and if an object of this POD-union type contains one of the POD-structs, it is permitted to inspect the common initial sequence of any of POD-struct members; see 9.2. ]"
Note the "at most one of the data members can be active at any time". There is no allowance for access using a different member, except for the case of the "common initial sequence", which is not the case your paper discusses.
Also, from 3.9.2/1:
    "- unions, which are classes capable of containing objects of different types at different times, 9.5;"
The important part is "different types at different times". A union cannot hold both a uint16_t and a uint32_t at the same time.
IANALL
--
VH
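The "common initial sequence" exception quoted from 9.5/1 can be illustrated with a small sketch. The struct, union, and function names below are hypothetical, and the code is written so that it compiles as C (which has a similar rule) as well as C++. Only the shared leading member is inspected through the "other" struct; reading the differing `payload` members that way would not be covered by the exception.

```c
#include <stdint.h>

/* Two POD structs sharing a common initial sequence: the leading int. */
struct IntMsg   { int kind; int32_t payload; };
struct FloatMsg { int kind; float   payload; };

/* Hypothetical union holding either struct. */
union Msg {
    struct IntMsg   as_int;
    struct FloatMsg as_float;
};

/* Inspecting `kind` through as_int falls under the exception
   whichever of the two structs was stored last. */
int msg_kind(const union Msg *m)
{
    return m->as_int.kind;
}
```

This is a much narrower permission than the swap in the paper needs, since a uint32_t and an array of uint16_t share no initial sequence of members.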

At Mon, 03 Jan 2011 00:08:23 -0800, Patrick Horgan wrote:
> Clearly the specs say that a union can only contain one object at a time. Hmmm. The C99 spec has a footnote to section 6.5.2.3/3 that seems to clearly say you can do this
The C99 spec is irrelevant to C++; it isn't even "included by reference," as the C89 spec is.
--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

On 01/03/2011 08:01 AM, Dave Abrahams wrote:
> At Mon, 03 Jan 2011 00:08:23 -0800, Patrick Horgan wrote:
>> The C99 spec has a footnote to section 6.5.2.3/3 that seems to clearly say you can do this
> The C99 spec is irrelevant to C++; it isn't even "included by reference," as the C89 spec is.
Good point, Dave, but I think I've let go of it as a possibility for C++ anyway. I'll have to rewrite that part. The C spec also mentions how the non-shared parts become undefined when switching to a different member of the union, but since the C++ spec calls for destruction and in-place construction to switch members, it's clearly out.
Patrick

Václav Haisman wrote:
> The wrong part, I think, is the part suggesting union as a solution. As far as I know you can only read from union through a member that you have put into it. The fact that you can access different union member is also an extension, though one more common than just GCC specific.
I second that. FWIW, I tried the memcpy version*. Surprise: it leads to the exact same (assembly) code. I guess you should point that out, and not rely on that extension.
*:
    #include <stdint.h>
    #include <string.h>
    uint32_t swaphalves(uint32_t a)
    {
        uint16_t as16bit[2];
        uint16_t tmp;
        /* copy the bytes out; memcpy does not violate the aliasing rules */
        memcpy(as16bit, &a, sizeof(a));
        tmp = as16bit[0];
        as16bit[0] = as16bit[1];
        as16bit[1] = tmp;
        /* copy the swapped halves back into the 32-bit value */
        memcpy(&a, as16bit, sizeof(a));
        return a;
    }

On 03/01/11 06:11, Patrick Horgan wrote:
> My intention is to write a paper that will make it so that I won't have to keep answering the same questions over and over again;) I'll just point people at the paper.
There is one already written. It is "Understanding Strict Aliasing" by Mike Acton:
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-al...
Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
Member of ACCU, http://accu.org

On 01/03/2011 11:02 AM, Mateusz Loskot wrote:
> On 03/01/11 06:11, Patrick Horgan wrote:
>> My intention is to write a paper that will make it so that I won't have to keep answering the same questions over and over again;) I'll just point people at the paper.
> There is one already written. It is "Understanding Strict Aliasing" by Mike Acton
> http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-al...
> Best regards,
Yes, I've read that one (which I quite like), and many others, some of high quality and some not quite so much. What I think mine usefully adds is that it points out the misunderstanding that leads to so much confusion. If you have this confusion, you read papers such as the one you referenced from the viewpoint of your misunderstanding, and you leave as confused as you came.
The paper you referred to, while one of the best, assumes that people understand more than many do. It doesn't dispel the confusion; instead it mixes, as it goes along, information about type punning with information about the aliasing rules. I hope my paper fixes that problem by speaking clearly to the two audiences and to what each wants to accomplish. I haven't found any other paper that addresses that misunderstanding, but it's at the heart of most of the posts I see on the web.
Patrick

participants (6)
- Dave Abrahams
- Dean Michael Berris
- Mateusz Loskot
- Patrick Horgan
- Thomas Heller
- Václav Haisman