[endian] Endian types in union
Hello,
This may be more appropriate for the devel list, but someone just
pointed out on this list that an endian library has been submitted
for approval. I've *just* needed classes like this (actually I wrote
my own similar classes), and decided to see how the Boost library
stacked up. It looks quite nice, and I'd be excited to see this
included. One suggestion I would make is that the endian classes
should not have constructors. As of version 0.5, trying to put them
into a union causes a compiler error. Take the struct from the
example, and add a union:
struct header
{
big4_t file_code;
big4_t file_length;
little4_t version;
little4_t shape_type;
union
{
big4_t foo;
little4_t bar;
} baz;
};
I get the following errors:
endian_example.cpp:37: error: member 'boost::integer::big4_t
<unnamed>::header::<anonymous union>::foo' with constructor not
allowed in union
endian_example.cpp:38: error: member 'boost::integer::little4_t
<unnamed>::header::<anonymous union>::bar' with constructor not
allowed in union
By removing all constructors, the errors go away. I cam across this
very issue when developing my classes, and found the answer in this
comp.lang.c++.moderated post:
<http://groups.google.com/group/comp.lang.c++.moderated/msg/
103835a7b493723e?hl=en&>
As an experiment, I modified boost/integer/endian.hpp to remove the
constructors, and add an operator=() to see if this worked, and it did:
template
On 7/19/06, Dave Dribin
I'd suggest making this change to all classes so that they can be used in unions.
The original discussion mentioned something that appears similar and
was handled by templating the full struct, along the lines of the
following:
template <endianness foobar_endianness>
struct header
{
big4_t file_code;
big4_t file_length;
little4_t version;
little4_t shape_type;
endian
On Jul 19, 2006, at 10:20 PM, me22 wrote:
The original discussion mentioned something that appears similar and was handled by templating the full struct, along the lines of the following:
Hmmm... I'm not sure that would help in my case. The endianness need not be different for unions to be useful.
Can you explain what your use case is for this union?
Sure, I'll go over the details in a second. But what's the use case/ rationale for *not* wanting to support unions? I was actually porting some code that directly reads a disk formated as Apple's HFS+ filesystem, where endian and alignment are paramount. Apple provides structures that describes the on-disk format in /usr/include/hfs/hfs_format.h. You can see it here, since it's open source: <http://darwinsource.opendarwin.org/10.4.6.ppc/xnu-792.6.70/bsd/ hfs/hfs_format.h> Of course, these structures assume that you're on a big-endian machine. But I needed to port code from PowerPC to Intel for the new Macs. A few unions are used in that header, for example: /* A generic Attribute Record*/ union HFSPlusAttrRecord { u_int32_t recordType; HFSPlusAttrInlineData inlineData; /* NOT USED */ HFSPlusAttrData attrData; HFSPlusAttrForkData forkData; HFSPlusAttrExtents overflowExtents; }; typedef union HFSPlusAttrRecord HFSPlusAttrRecord; Which has integer types mixed with other structures, such as: /* * Atrributes B-tree Data Record * * For small attributes, whose entire value is stored * within a single B-tree record. */ struct HFSPlusAttrData { u_int32_t recordType; /* == kHFSPlusAttrInlineData */ u_int32_t reserved[2]; u_int32_t attrSize; /* size of attribute data in bytes */ u_int8_t attrData[2]; /* variable length */ }; typedef struct HFSPlusAttrData HFSPlusAttrData; Since my endian classes had no constructors, and hence supported unions, I was able to get the whole file to work as endian-neutral by doing some nasty preprocessor magic, such as: #define u_int32_t ubig4_t // #define's for other types used ... #include "hfs_format.h" #undef u_int32_t // #undef's for other types used ... Then, all the old legacy code basically just worked on Intel without modification. -Dave
On 7/20/06, Dave Dribin
Hmmm... I'm not sure that would help in my case. The endianness need not be different for unions to be useful.
I was just guessing. Now that you've described your case, I see what your reason is. I'm not a fan of that macro hackery, however...
Sure, I'll go over the details in a second. But what's the use case/ rationale for *not* wanting to support unions?
The problem is that the classes can't properly act "like ints". The
way they are now they can't be used in a union, as you've mentions.
But take out the constructors and then the following rather logical
program doesn't work any more:
#include
On Jul 20, 2006, at 10:34 AM, me22 wrote:
I'm not a fan of that macro hackery, however...
Neither am I. :) I do feel a bit dirty for doing that, and wouldn't recommend it, if you can avoid it. But it's a system header than I cannot change. I will probably end up copying it to my project, renaming it to avoid a conflict, and do a search/replace, actually.
The problem is that the classes can't properly act "like ints". The way they are now they can't be used in a union, as you've mentions.
I totally agree that they should act as much like ints as possible.
But take out the constructors and then the following rather logical program doesn't work any more: #include
#include <iostream> int main() { boost::integer::big4_t x = 42; std::cout << x << "\n"; }
True, but a workaround is to do it in two steps: boost::integer::big4_t x; x = 42;
And it can no longer be initialised in member initializer lists. I don't know whether this or unions is a more pressing concern, however.
At least initialization has a workaround by doing it in two steps. There is no work around to allow them in unions. One argument for favoring unions over initialization is that a prime requirement of the endian types are they "must be suitable for I/O - in other words, must be memcpyable." Hence, they are designed to be used in structs and and then used directly with I/O, i.e. it's great for file formats. I doubt HFS+ is alone in using unions for file formats. I can see other uses cases for unions, too. It would be a shame if it were not possible to model these.
Do SFINAE-awayed constructors prevent a class from being used in unions? That might be a way to allow both if an eventual review can't come to a conclusion.
"SFINAE-awayed constructors" is a new term for me. After a bit of Google work, I'm still not sure I understand SFINAE enough to know what it can do, and if could help here. But as far as I know, *any* constructor prevents a class from being used in a union. One possible solution is to use add static init() methods to the template: static endian init(T i) { endian e; e = i; return e; } Then, you could initialize it like: boost::integer::big4_t x = boost::integer::big4_t::init(43); Not perfect, but less ugly then using two steps, especially, if you ignore the namespaces: big_t x = big_t::init(43); -Dave
On Jul 20, 2006, at 10:34 AM, me22 wrote:
The problem is that the classes can't properly act "like ints". The way they are now they can't be used in a union, as you've mentions.
One more area where they can't act like int's is in variable length
arguments. Even using my init() method, the following code won't
compile on the printf:
big4_t x = big4_t::init(42);
std::cout << x << "\n";
printf("%d\n", x);
warning: cannot pass objects of non-POD type 'struct
boost::integer::big4_t' through '...'; call will abort at runtime
One way around this is to add a get() method, so you can use:
printf("%d\n", x.get());
Or, as I prefer, use operator():
printf("%d\n", x());
Implemented as follows:
T operator()() const
{ return detail::load_big_endian
On 7/20/06, Dave Dribin
One more area where they can't act like int's is in variable length arguments. Even using my init() method, the following code won't compile on the printf: [snip code]
I don't think this is a worry. varargs aren't typesafe and can certainly be avoided.
One way around this is to add a get() method, so you can use:
printf("%d\n", x.get());
Or, as I prefer, use operator():
printf("%d\n", x());
The .get is much better. operator() is for calling functions, while .get is already used in many places (smart pointers, for example) for that purpose. However, neither is needed. (int)x or boost::implicit_cast<int>(x) will both work fine. Of course, this is assuming that the value in x fits properly in an int. Better would be typeof_x::value_type, but then you can't be sure that your format string matches, which is the same point as varargs being evil in the first place.
Implemented as follows: [snip code]
Much better to implement it like this: T get() const { return *this; } and not duplicate knowledge. ~ Scott McMurray
On Jul 20, 2006, at 4:40 PM, me22 wrote:
I don't think this is a worry. varargs aren't typesafe and can certainly be avoided.
printf-like interfaces are still used quite a bit. The HFS+ code I was writing was actually a device driver. The kernel debugging print functions had printf-like interfaces, which is the only reason I know this is an issue.
The .get is much better. operator() is for calling functions, while .get is already used in many places (smart pointers, for example) for that purpose.
Makes sense. I can live with .get(), and if it's consistent with other libraries, then even better.
However, neither is needed. (int)x or boost::implicit_cast<int>(x) will both work fine. Of course, this is assuming that the value in x fits properly in an int. Better would be typeof_x::value_type, but then you can't be sure that your format string matches, which is the same point as varargs being evil in the first place.
The issue with using either cast is I have to remember the correct type to use in the cast. By using a .get(), I don't have to worry about it, and it's less repeated information. And as bonus, a compiler that recognizes format strings will warn me if I use an incorrect format string: big8_t y = big8_t::init(5); printf("%d\n", y.get()); gcc give's me this warning: warning: format '%d' expects type 'int', but argument 2 has type 'int_least64_t' If I just instinctively cast to int, I'd silently lose information. The warning gives me a chance to use '%lld'.
Much better to implement it like this: T get() const { return *this; } and not duplicate knowledge.
Agreed. Much better, thanks. BTW, I started a thread on Boost-dev, as well, since that may be the more appropriate list. -Dave
participants (2)
-
Dave Dribin
-
me22