[endian] endian flip and endian domain

Hi,
I think I have found a good name to use instead of swap_in_place: "flip". It recalls the flip operation on bitset, which exchanges the value of a bit. In the endian context, it exchanges in place the endianness of a datum from/to the native endianness.
As the same data can be used with different endianness depending on the context, the flip function needs a specific parameter; let me call it the endian domain. As the data to flip can contain parts that are big endian and others that are little endian, we need to map all the leaves of the tree associated with the data structure. The domain will statically map the data to its endianness. For fundamental types this reduces to a single endian value, but for structures the map will be a tree whose leaves are endian types. This mapping lets us separate the endian concern from the data itself in a clear way.
For structures that have been adapted as Fusion sequences, we can define flip_to/from functions that do the flip in a generic way. Of course, the user can always overload the flip functions to provide their own implementation.
I have started to play with this and, even though I'm not a metaprogramming expert, I have a prototype implementation that I need to polish a little more before I share it with you. I will need your help later to make it much more elegant.
Anyway, let me know what you think of this design.
Best,
Vicente Juan Botet Escribá
http://viboes.blogspot.com/
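As a rough illustration of the domain idea in the post above, the following C++ sketch maps each leaf of a small structure to an endianness and flips the leaves in place. It does not use Boost.Fusion: the per-field typedefs stand in for the generic traversal. Every name here (big_endian, little_endian, header, wire_a, wire_b, flip_leaf, byte_at) is hypothetical and not part of the proposed library, and the host is assumed to be either big or little endian.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical endianness tags.
struct big_endian {};
struct little_endian {};

// Detect the host byte order at run time (a library would do this at compile time).
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Reverse the bytes of one fundamental leaf in place.
template <class T>
void reverse_in_place(T& leaf) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &leaf, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&leaf, bytes, sizeof(T));
}

// Flip one leaf between native order and the tagged order.
// For a plain byte reversal, "to" and "from" are the same operation.
template <class T>
void flip_leaf(T& leaf, big_endian) {
    if (host_is_little_endian()) reverse_in_place(leaf);
}
template <class T>
void flip_leaf(T& leaf, little_endian) {
    if (!host_is_little_endian()) reverse_in_place(leaf);
}

// The application data carries no endianness information itself.
struct header {
    std::uint16_t id;
    std::uint32_t length;
};

// Two domains for the same type: each maps every leaf to an endianness.
struct wire_a {
    typedef big_endian id_order;
    typedef little_endian length_order;
};
struct wire_b {
    typedef big_endian id_order;
    typedef big_endian length_order;
};

template <class Domain>
void flip_to(header& h) {
    flip_leaf(h.id, typename Domain::id_order());
    flip_leaf(h.length, typename Domain::length_order());
}

template <class Domain>
void flip_from(header& h) {
    flip_to<Domain>(h);  // byte reversal is its own inverse
}

// Read byte i of an object, for inspecting the layout.
template <class T>
unsigned char byte_at(const T& object, std::size_t i) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &object, sizeof(T));
    return bytes[i];
}

// Usage: the same native value is laid out for one interface, then restored.
inline void flip_example() {
    header h = {0x0102, 0x0A0B0C0D};
    flip_to<wire_a>(h);
    assert(byte_at(h.id, 0) == 0x01);      // id is now big endian
    assert(byte_at(h.length, 0) == 0x0D);  // length is now little endian
    flip_from<wire_a>(h);
    assert(h.id == 0x0102 && h.length == 0x0A0B0C0D);
}
```

The point of the sketch is that header itself never changes: choosing wire_a or wire_b at the call site is what selects the layout.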

vicente.botet wrote:
I think I have found a good name to use instead of swap_in_place: "flip". It recalls the flip operation on bitset, which exchanges the value of a bit. In the endian context, it exchanges in place the endianness of a datum from/to the native endianness.
"Flip" is not appropriate to my mind. It works well for bits because they are binary, but it doesn't convey the idea of reordering bits/bytes to a specific sequence. I prefer my suggestion of "reorder" because, after all, the operation (ostensibly) reorders the data.
As the same data can be used with different endianness depending on the context, the flip function needs a specific parameter; let me call it the endian domain. As
Your "domain" idea is nothing other than "endianness" as far as I can tell. Why invent a new term?
the data to flip can contain parts that are big endian and others that are little endian, we need to map all the leaves of the tree associated with the data structure. The domain will statically map the data to its endianness. For fundamental types this reduces to a single endian value, but for structures the map will be a tree whose leaves are endian types.
When applied to a UDT, "flip" seems even worse as it suggests that things that are currently big endian will be "flipped" to little and vice versa, whereas "reorder" and "swap" can be interpreted easily as making everything uniform. The rest of what you're suggesting, as near as I can tell, is to create a compile-time data structure constructed from an existing structure by using Fusion. Presumably, you would be able to declare each field's endianness independently. Using the compiler, via macros and template meta-programming, to create the parallel UDT for a given UDT (such as an OS-provided structure) is highly valuable as it removes most of the hassle of and propensity for error from duplication in such cases.
This mapping lets us separate the endian concern from the data itself in a clear way. For structures that have been adapted as Fusion sequences, we can define flip_to/from functions that do the flip in a generic way. Of course, the user can always overload the flip functions to provide their own implementation.
How is flip_to/flip_from clearer than the previous suggestions of to/from? _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

At Tue, 8 Jun 2010 07:05:36 -0400, Stewart, Robert wrote:
"Flip" is not appropriate to my mind. It works well for bits because they are binary, but it doesn't convey the idea of reordering bits/bytes to a specific sequence. I prefer my suggestion of "reorder" because, after all, the operation (ostensibly) reorders the data.
+1 You might also consider “invert,” as in “invert the endianness,” which does what “flip” tries to do with IMO a more appropriate word. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

David Abrahams wrote:
At Tue, 8 Jun 2010 07:05:36 -0400, Stewart, Robert wrote:
"Flip" is not appropriate to my mind. It works well for bits because they are binary, but it doesn't convey the idea of reordering bits/bytes to a specific sequence. I prefer my suggestion of "reorder" because, after all, the operation (ostensibly) reorders the data.
+1
You might also consider "invert," as in "invert the endianness," which does what "flip" tries to do with IMO a more appropriate word.
Given just little and big endianness, "invert" is reasonable. However, considering the possibility of mixed and middle endianness, particularly as managing endianness of multi-field structures is in scope, "invert" doesn't work so well. (That happens to be a good argument against "flip," too.)
_____ Rob Stewart

Sent: Tuesday, June 08, 2010 7:15 AM To: boost@lists.boost.org Subject: Re: [boost] [endian] endian flip and endian domain
At Tue, 8 Jun 2010 07:05:36 -0400, Stewart, Robert wrote:
"Flip" is not appropriate to my mind. It works well for bits because they are binary, but it doesn't convey the idea of reordering bits/bytes to a specific sequence. I prefer my suggestion of "reorder" because, after all, the operation (ostensibly) reorders the data.
+1
You might also consider "invert," as in "invert the endianness," which does what "flip" tries to do with IMO a more appropriate word.
-- Dave Abrahams BoostPro Computing http://www.boostpro.com
Another name for this kind of operation that has some previous history behind it is "permute". The PPC instruction set has a permute vector instruction that takes an altivec vector register (16 bytes) and re-arranges it at the byte level using the contents of another register as a list of source byte indices. Since some systems have used strange byte orderings "permute" would be a more general name than even "flip". Glenn Schrader - MITLL
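To make the "permute" idea concrete, here is a minimal portable sketch of a byte-level permute driven by a list of source byte indices. It only imitates the shape of the operation described above; it is not the AltiVec instruction, and permute_bytes and from_bytes are names invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Output byte i is taken from input byte order[i].
template <class T>
T permute_bytes(T value, const std::array<std::size_t, sizeof(T)>& order) {
    unsigned char in[sizeof(T)];
    unsigned char out[sizeof(T)];
    std::memcpy(in, &value, sizeof(T));
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        out[i] = in[order[i]];
    }
    std::memcpy(&value, out, sizeof(T));
    return value;
}

// Build a 32-bit value whose in-memory bytes are exactly b0 b1 b2 b3.
inline std::uint32_t from_bytes(unsigned char b0, unsigned char b1,
                                unsigned char b2, unsigned char b3) {
    const unsigned char bytes[4] = {b0, b1, b2, b3};
    std::uint32_t value = 0;
    std::memcpy(&value, bytes, sizeof(value));
    return value;
}

inline void permute_example() {
    const std::uint32_t v = from_bytes(1, 2, 3, 4);
    // A full reversal is the usual big/little swap.
    const std::array<std::size_t, 4> full_reverse = {{3, 2, 1, 0}};
    // Swapping within each 16-bit half is one of the "strange" orderings.
    const std::array<std::size_t, 4> swap_in_halves = {{1, 0, 3, 2}};
    assert(permute_bytes(v, full_reverse) == from_bytes(4, 3, 2, 1));
    assert(permute_bytes(v, swap_in_halves) == from_bytes(2, 1, 4, 3));
}
```

An index list covers big, little, and mixed orderings with one function, which is the generality the name is meant to suggest.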

I think I have found a good name to use instead of swap_in_place: "flip". It recalls the flip operation on bitset, which exchanges the value of a bit. In the endian context, it exchanges in place the endianness of a datum from/to the native endianness.
"Flip" is not appropriate to my mind. It works well for bits because they are binary, but it doesn't convey the idea of reordering bits/bytes to a specific sequence. I prefer my suggestion of "reorder" because, after all, the operation (ostensibly) reorders the data.
"transpose"? -- Alec Ross

Alec Ross wrote:
Rob Stewart wrote:
vicente.botet wrote:
I think I have found a good name to use instead of swap_in_place: "flip". It recalls the flip operation on bitset, which exchanges the value of a bit. In the endian context, it exchanges in place the endianness of a datum from/to the native endianness.
"Flip" is not appropriate to my mind. It works well for bits because they are binary, but it doesn't convey the idea of reordering bits/bytes to a specific sequence. I prefer my suggestion of "reorder" because, after all, the operation (ostensibly) reorders the data.
"transpose"?
That works, too. Do you object to "reorder" or were you just trying to enlarge the set of choices? "reorder" has few characters to type, but it does have an additional syllable to speak.
_____ Rob Stewart

"Flip" is not appropriate to my mind. It works well for bits because they are binary, but it doesn't convey the idea of reordering bits/bytes to a specific sequence. I prefer my suggestion of "reorder" because, after all, the operation (ostensibly) reorders the data.
"transpose"?
That works, too. Do you object to "reorder" or were you just trying to enlarge the set of choices?
No objection - but to me its sense includes the possibility of arbitrary re-ordering. And certainly I was trying to enlarge the set of candidates for consideration.
"reorder" has few characters to type, but it does have an additional syllable to speak.
-- Alec Ross

It's obviously clear that `toggle', or possibly `frobnicate' are the best verbs. -- Lars Viklund | zao@acc.umu.se

Rob Stewart wrote:
vicente.botet wrote:
I think I have found a good name to use instead of swap_in_place: "flip". It recalls the flip operation on bitset, which exchanges the value of a bit. In the endian context, it exchanges in place the endianness of a datum from/to the native endianness.
"Flip" is not appropriate to my mind. It works well for bits because they are binary, but it doesn't convey the idea of reordering bits/bytes to a specific sequence. I prefer my suggestion of "reorder" because, after all, the operation (ostensibly) reorders the data.
Well, it seems that flip is not much appreciated :(
Rob Stewart wrote:
As the same data can be used with different endianness depending on the context, the flip function needs a specific parameter; let me call it the endian domain. As
Your "domain" idea is nothing other than "endianness" as far as I can tell. Why invent a new term?
It is more a mapping from a type to the endianness of its leaves. For example, if you have two interfaces with different endianness for some of the leaves of a given type, you will be able to convert the native type to either interface just by supplying the domain.
Rob Stewart wrote:
the data to flip can contain parts that are big endian and others that are little endian, we need to map all the leaves of the tree associated with the data structure. The domain will statically map the data to its endianness. For fundamental types this reduces to a single endian value, but for structures the map will be a tree whose leaves are endian types.
When applied to a UDT, "flip" seems even worse as it suggests that things that are currently big endian will be "flipped" to little and vice versa, whereas "reorder" and "swap" can be interpreted easily as making everything uniform.
For me, swap has the same drawback as flip, invert, and toggle.
Rob Stewart wrote:
The rest of what you're suggesting, as near as I can tell, is to create a compile-time data structure constructed from an existing structure by using Fusion. Presumably, you would be able to declare each field's endianness independently. Using the compiler, via macros and template meta-programming, to create the parallel UDT for a given UDT (such as an OS-provided structure) is highly valuable as it removes most of the hassle of and propensity for error from duplication in such cases.
I would prefer to keep this separate since, as others have said, the structure of the message can be provided by a third party. In addition, the message structures could be much more complex than simple POD structs.
Rob Stewart wrote:
This mapping lets us separate the endian concern from the data itself in a clear way. For structures that have been adapted as Fusion sequences, we can define flip_to/from functions that do the flip in a generic way. Of course, the user can always overload the flip functions to provide their own implementation.
How is flip_to/flip_from clearer than the previous suggestions of to/from?
to/from are not verbs. I would prefer present_to/from, as what these functions do is part of the presentation layer, for messages in which the only difference between the physical and the application view is the endianness of the leaves. For other messages, containing unaligned integers, the presentation will be more complex. In that case the application structure and the physical structure are represented by different C/C++ types. The domain could be used to define this mapping as well.
Best,
Vicente

Vicente Botet Escriba wrote:
Rob Stewart wrote:
Your "domain" idea is nothing other than "endianness" as far as I can tell. Why invent a new term?
It is more a mapping from a type to the endianness of its leaves. For example, if you have two interfaces with different endianness for some of the leaves of a given type, you will be able to convert the native type to either interface just by supplying the domain.
I still don't get it. Aren't you just indicating the endianness you want to wind up with, regardless of how the data is ordered currently?
When applied to a UDT, "flip" seems even worse as it suggests that things that are currently big endian will be "flipped" to little and vice versa, whereas "reorder" and "swap" can be interpreted easily as making everything uniform.
For me, swap has the same drawback as flip, invert, and toggle.
"Swap" connotes changing places. From the standpoint of "swap-in-place," it is certainly awkward. "Flip" fits the in-place behavior better, though I still don't care for it because the bytes or bits are not themselves being flipped. "Invert" can imply things like turning something inside out, reversing, etc., so it easily connotes more than "flip," though "swap" is the long established word for the operation. "Toggle" means essentially the same as "flip," so it's a non-starter for me. "Reorder" and "transpose" allow for moving things around exactly as these operations do, without causing swap's confusion when "in-place" is introduced. I favor one of those two choices.
How is flip_to/flip_from clearer than the previous suggestions of to/from?
to/from are not verbs.
Touché! That's my own argument! I view the action as implied in those names: "to<big_endian>" means "change from host order to big endian," for example. That justifies the names not being verbs, but weakly. Still, I find "reorder_to<big_endian>," "transpose_to<big_endian>," and similar constructions somewhat less satisfying.
I would prefer present_to/from, as what these functions do is part of the presentation layer, for messages in which the only difference between the physical and the application view is the endianness of the leaves.
I object to your terminology since this has nothing to do with presentation. Rather, it has to do with internal representation.
_____ Rob Stewart

<Replying to the entire naming thread, not one message in particular>
convert seems like the best word. To me.
In all other cases (swap, flip, invert, reorder,...) I would expect something to *always* happen. Whereas with 'convert' I wouldn't be surprised when the operation is a no-op. i.e. a swap/flip/invert/etc that *sometimes* doesn't actually need to swap/flip/invert seems wrong to me. convert just says do whatever is needed. That may be swap or some more complicated reordering, or nothing.
So I mentioned 'reorder' there. The mathematician in me will admit that the non-reorder is still a reordering, but between 'reorder' and 'convert' I still expect reorder to do something more specific than convert. convert can do nothing, in my mind.
Feel free to discuss convert_to/from/in_place/etc...
Tony
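The "convert may do nothing" argument can be sketched in a few lines of C++. This is only an illustration under assumptions: byte_order, host_order and reverse_bytes are invented helper names, convert_to/convert_from are written here just to show the no-op case, and the host is assumed to be either big or little endian.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

enum class byte_order { little, big };

// Detect the host byte order at run time.
inline byte_order host_order() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1 ? byte_order::little : byte_order::big;
}

// Return a copy of value with its bytes reversed.
template <class T>
T reverse_bytes(T value) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&value, bytes, sizeof(T));
    return value;
}

// Host order -> Target order. Does nothing when the two already agree.
template <byte_order Target, class T>
T convert_to(T value) {
    return host_order() == Target ? value : reverse_bytes(value);
}

// Source order -> host order. Also a no-op when the two agree.
template <byte_order Source, class T>
T convert_from(T value) {
    return host_order() == Source ? value : reverse_bytes(value);
}

inline void convert_example() {
    const std::uint32_t x = 0x01020304;
    const std::uint32_t as_big = convert_to<byte_order::big>(x);
    const std::uint32_t as_little = convert_to<byte_order::little>(x);
    // Exactly one of the two conversions is a no-op on any given host.
    assert((as_big == x) != (as_little == x));
    // Converting back always restores the host value.
    assert(convert_from<byte_order::big>(as_big) == x);
    assert(convert_from<byte_order::little>(as_little) == x);
}
```

A caller of convert_to never needs to know which branch ran, which is the property that "swap" or "flip" would misdescribe.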

On Tue, Jun 8, 2010 at 10:39 AM, Gottlob Frege <gottlobfrege@gmail.com> wrote:
<Replying to the entire naming thread, not one message in particular>
convert seems like the best word. To me.
While I'm at it. My other thoughts on endian:
- I agree that the operators on an endian type are dangerous and should be separated. (I also think operators on atomics are dangerous, but that's another story.)
- I think we should consider that read on endian types is dangerous - that is what the 'function-style' seems to be saying. ie an endian type that I can read from will be read from multiple times and be inefficient. But useful at times. So separated?
- so am I suggesting a base endian type that can't even be read? Possibly.
Note that the endian_cast<> discussion mentioned that it didn't follow the cast<> syntax exactly, because return type and parameter type are the same. This is essentially because we want to take the source int and cast it to an endian, then cast it back to an int, all because we didn't/couldn't define our source value as an endian type. ie
    int j = from_big_endian_file();
    int i = endian_cast<big>(j);
should really be (if using types)
    big_endian_int j = from_big_endian_file();
    int i = endian_cast<int>(j);
or
    int i = endian_cast<int>(reinterpret_cast<big_endian_int>(j));
That's just too much to type. (And not necessarily the right names, but leave that aside for now. I'm purposely using 'big_endian_int' instead of trying to use a decent name, to avoid the naming discussions)
So, I'm not sure if this can work out, but if we decide that endian-types and type-safety is important, then endian_cast<> (or convert, or...) should *only take an endian type*. Then we get
    big_endian_int j = from_big_endian_file();
    int i = endian_cast<int>(j); // OK
    int k = from_big_endian_file();
    int i = endian_cast<int>(k); // FAILS - not sure endian type for k
but then allow specifying the type for k:
    int i = endian_cast<int, big_endian_int>(k); // OK - k is reinterpreted as big_endian_int
Does this make sense? It's still a bit vague in my mind, but I'm just wondering if we can somehow manage to use the same functions and syntax in both the typed and untyped scenarios.
Tony
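One way the typed and untyped calls above could share a name is sketched below. This is an assumption-laden illustration, not the proposed library: endian_value, big_order, little_order and matches_host are invented for the example, the raw member is left public for brevity, and the byte array stands in for from_big_endian_file().

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical byte-order tags.
struct big_order {};
struct little_order {};

inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
inline bool matches_host(big_order) { return !host_is_little_endian(); }
inline bool matches_host(little_order) { return host_is_little_endian(); }

template <class T>
T reverse_bytes(T value) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&value, bytes, sizeof(T));
    return value;
}

// An "endian type": raw bytes labelled with the order they are stored in.
template <class T, class Order>
struct endian_value {
    T raw;
};
typedef endian_value<std::int32_t, big_order> big_endian_int;

// Typed form: the source endianness comes from the argument's type.
template <class T, class Order>
T endian_cast(const endian_value<T, Order>& source) {
    return matches_host(Order()) ? source.raw : reverse_bytes(source.raw);
}

// Untyped form: the caller names the endian type the plain value is
// to be reinterpreted as, then the typed form does the work.
template <class T, class EndianType>
T endian_cast(T plain) {
    EndianType labelled = {plain};
    return endian_cast<T>(labelled);
}

inline void endian_cast_example() {
    // The bytes 00 00 01 02 are 258 in big endian.
    const unsigned char file_bytes[4] = {0x00, 0x00, 0x01, 0x02};

    big_endian_int j;
    std::memcpy(&j.raw, file_bytes, 4);
    assert(endian_cast<std::int32_t>(j) == 258);  // typed source: OK

    std::int32_t k = 0;
    std::memcpy(&k, file_bytes, 4);
    // endian_cast<std::int32_t>(k);  // does not compile: no endianness for k
    assert((endian_cast<std::int32_t, big_endian_int>(k)) == 258);
}
```

With a plain int argument and only the target type given, neither overload is viable, so the "FAILS" case is rejected at compile time rather than at run time.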

On Tue, Jun 8, 2010 at 11:13 AM, Gottlob Frege <gottlobfrege@gmail.com> wrote:
On Tue, Jun 8, 2010 at 10:39 AM, Gottlob Frege <gottlobfrege@gmail.com> wrote:
- so am I suggesting a base endian type that can't even be read? Possibly.
big_endian_int j = from_big_endian_file(); int i = endian_cast<int>(j); // OK
int k = from_big_endian_file(); int i = endian_cast<int>(k); // FAILS - not sure endian type for k
but then allow specifying the type for k:
int i = endian_cast<int, big_endian_int>(k); // OK - k is reinterpreted as big_endian_int
Does this make sense? It's still a bit vague in my mind, but I'm just wondering if we can somehow manage to use the same functions and syntax in both the typed and untyped scenarios.
Tony
I think what I am saying is that, conceptually at least, the endian-types come first, and the straight functions are built on top of that (instead of the other way around, as most have suggested). Now, I wouldn't want that to impact performance, which is why I say 'conceptually'. We can specialize the heck out of everything so that performance wins, but I think the conceptual foundation is important. Which is to say that an int that is in the wrong endian order isn't really an int, conceptually. So we should avoid that. Conceptually. :-)
And I'm thinking that the base endian-type should be unreadable on its own. This enforces type-safety *and* performance. From that base type you can then choose the function-style code or a derived type that has readability and/or operators.
We should be allowed to swap in place, etc, but the framework should make it clear that you are doing something like a reinterpret-cast.
To be clear, the function-style doesn't need to start with an endian-type, but the functions take endian types - which can appear in the <> brackets, so that you don't have to actually have endian-types. ie hopefully we can allow you to do whatever you want, but at the same time get the benefit of type-safety and performance, and a consistent interface to it all. ie it is more of a spectrum of more/less typish, than a this or that divide, to me at least.
Tony
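The "unreadable base, readable derived type" layering might look roughly like the following sketch. All names (endian_base, readable_endian, to_host, big_order, little_order) are hypothetical, and granting access through a friend function template is just one possible way to keep the raw value private.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

struct big_order {};
struct little_order {};

inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
inline bool matches_host(big_order) { return !host_is_little_endian(); }
inline bool matches_host(little_order) { return host_is_little_endian(); }

template <class T>
T reverse_bytes(T value) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&value, bytes, sizeof(T));
    return value;
}

// Base endian type: it can be created and copied, but not read.
template <class T, class Order>
class endian_base {
public:
    // The only way in: label raw bytes with their byte order.
    static endian_base from_raw(T raw) {
        endian_base result;
        result.raw_ = raw;
        return result;
    }

private:
    endian_base() : raw_() {}
    T raw_;  // deliberately has no public accessor

    template <class U, class O>
    friend U to_host(const endian_base<U, O>& source);
};

// Function style: an explicit, one-time conversion to host order.
template <class U, class O>
U to_host(const endian_base<U, O>& source) {
    return matches_host(O()) ? source.raw_ : reverse_bytes(source.raw_);
}

// Derived type that opts in to implicit reads (each read converts again).
template <class T, class Order>
class readable_endian : public endian_base<T, Order> {
public:
    explicit readable_endian(const endian_base<T, Order>& base)
        : endian_base<T, Order>(base) {}
    operator T() const {
        return to_host(static_cast<const endian_base<T, Order>&>(*this));
    }
};

inline void unreadable_base_example() {
    const unsigned char wire[4] = {0x00, 0x00, 0x01, 0x02};  // 258, big endian
    std::int32_t raw = 0;
    std::memcpy(&raw, wire, 4);

    typedef endian_base<std::int32_t, big_order> big_int;
    const big_int b = big_int::from_raw(raw);
    assert(to_host(b) == 258);  // conversion must be requested

    const readable_endian<std::int32_t, big_order> r(b);
    const std::int32_t value = r;  // implicit read through the derived type
    assert(value == 258);
}
```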

Gottlob Frege wrote:
- I agree that the operators on an endian type are dangerous and should be separated. (I also think operators on atomics are dangerous, but that's another story.)
Dangerous? My only concern is that they are inefficient and can lead one to write unnecessarily inefficient code.
- I think we should consider that read on endian types is dangerous - that is what the 'function-style' seems to be saying. ie an endian type that I can read from will be read from multiple times and be inefficient. But useful at times. So separated?
We've all agreed that they should be separate for reasons of efficiency and syntax.
- so am I suggesting a base endian type that can't even be read? Possibly.
That's interesting.
Note that the endian_cast<> discussion mentioned that it didn't follow the cast<> syntax exactly, because return type and parameter type are the same. This is essentially because we want to take the source int and cast it to an endian, then cast it back to an int, all because we didn't/couldn't define our source value as an endian type. ie
int j = from_big_endian_file(); int i = endian_cast<big>(j);
should really be (if using types)
big_endian_int j = from_big_endian_file(); int i = endian_cast<int>(j);
or
int i = endian_cast<int>(reinterpret_cast<big_endian_int>(j));
That's just too much to type. (And not necessarily the right names, but leave that aside for now. I'm purposely using 'big_endian_int' instead of trying to use a decent name, to avoid the naming discussions)
That's an interesting view.
So, I'm not sure if this can work out, but if we decide that endian-types and type-safety is important, then endian_cast<> (or
We have.
convert, or...) should *only take an endian type*.
That's an interesting way to view endian_cast.
Then we get
big_endian_int j = from_big_endian_file(); int i = endian_cast<int>(j); // OK
int k = from_big_endian_file(); int i = endian_cast<int>(k); // FAILS - not sure endian type for k
but then allow specifying the type for k:
int i = endian_cast<int, big_endian_int>(k); // OK - k is reinterpreted as big_endian_int
The latter is a departure from the new-style casts. This harkens back to the earlier notion of swap<to,from>() that I suggested and Tomas dismissed because of the confusion that can arise over template argument order. However, by naming the function "endian_cast," you sidestep that problem because the "to" type would always be first in keeping with the normal cast use case.
Let's see how this looks in code (I'm going to assume "endian_wrapper" as the base class name for exposition):
    template <class T, class Endianness>
    T endian_cast(endian_wrapper<Endianness> _wrapped)
    {
        T const result(swap_to_host_order<Endianness>(_wrapped.value()));
        return result;
    }
    template <class T, class Endianness>
    T endian_cast(T _wrapped)
    {
        T const result(swap_to_host_order<Endianness>(_wrapped));
        return result;
    }
It would be possible to also include the cases in which the input and output endiannesses are specified (the filter case), but those lead right back to confusion about the template argument order.
Notice that endian_wrapper<> needed a means to get the value. I don't see the value of a wrapper type that specifies the endianness but gives no means to access the value.
Does this make sense? It's still a bit vague in my mind, but I'm just wondering if we can somehow manage to use the same functions and syntax in both the typed and untyped scenarios.
I think I understood you and it seems rather interesting. It is useful to make the source endianness explicit in the call which this approach does by requiring either a wrapper type with endianness or a template argument, but that is also made explicit by the other function-based interfaces we've discussed. If the value to be converted using something like endian_cast is a field in a struct, and that structure's definition is provided outside the client program, then the field will likely be a built-in type, not an endian_wrapper. That means the conversion is from T to T. convert_from<big_endian>() seems better than endian_cast<T,big_endian>(). IOW, I don't think endian_cast can be extended helpfully as you've suggested. Instead, endian_cast<int>(make_endian<big_endian>(t)) would be needed and convert_from<big_endian>(t) is much clearer. (Perhaps the latter would be implemented using the former.) That concern aside, I do like your suggested use of endian_cast with the endian_wrapper type.
_____ Rob Stewart
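The parenthetical above, that convert_from<big_endian>(t) might be implemented using endian_cast<int>(make_endian<big_endian>(t)), can be sketched as follows. The signatures are guesses made for illustration: in particular endian_wrapper is given a second template parameter for the value type, which the exposition code in the thread leaves out, and os_record is an invented stand-in for an externally defined structure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

struct big_endian {};
struct little_endian {};

inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
inline bool matches_host(big_endian) { return !host_is_little_endian(); }
inline bool matches_host(little_endian) { return host_is_little_endian(); }

template <class T>
T reverse_bytes(T value) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&value, bytes, sizeof(T));
    return value;
}

// Reorder raw bytes stored in Endianness into host order.
template <class Endianness, class T>
T swap_to_host_order(T raw) {
    return matches_host(Endianness()) ? raw : reverse_bytes(raw);
}

// Wrapper that records the endianness of the bytes it holds.
template <class Endianness, class T>
class endian_wrapper {
public:
    explicit endian_wrapper(T raw) : raw_(raw) {}
    T value() const { return raw_; }  // the wrapped bytes, not yet converted
private:
    T raw_;
};

// Label a built-in value with its endianness.
template <class Endianness, class T>
endian_wrapper<Endianness, T> make_endian(T raw) {
    return endian_wrapper<Endianness, T>(raw);
}

// Cast from a wrapped value to a host-order T.
template <class T, class Endianness>
T endian_cast(const endian_wrapper<Endianness, T>& wrapped) {
    return swap_to_host_order<Endianness>(wrapped.value());
}

// The T-to-T convenience form, built on the two pieces above.
template <class Endianness, class T>
T convert_from(T raw) {
    return endian_cast<T>(make_endian<Endianness>(raw));
}

// A structure defined elsewhere: its field is a built-in type.
struct os_record {
    std::int32_t length;
};

inline void convert_from_example() {
    const unsigned char wire[4] = {0x00, 0x00, 0x01, 0x02};  // 258, big endian
    os_record record;
    std::memcpy(&record.length, wire, 4);
    assert(endian_cast<std::int32_t>(make_endian<big_endian>(record.length)) == 258);
    assert(convert_from<big_endian>(record.length) == 258);
}
```

Both spellings produce the same result; the difference is only in how much the caller has to write when the field is a plain built-in type.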

On Tue, Jun 8, 2010 at 11:52 AM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
Gottlob Frege wrote:
- I agree that the operators on an endian type are dangerous and should be separated. (I also think operators on atomics are dangerous, but that's another story.)
Dangerous? My only concern is that they are inefficient and can lead one to write unnecessarily inefficient code.
OK, not the best word, I guess. Or 'dangerous' for sufficiently small values of danger.
The latter is a departure from the new-style casts. This harkens back to the earlier notion of swap<to,from>() that I suggested and Tomas dismissed because of the confusion that can arise over template argument order. However, by naming the function "endian_cast," you sidestep that problem because the "to" type would always be first in keeping with the normal cast use case.
Let's see how this looks in code (I'm going to assume "endian_wrapper" as the base class name for exposition):
template <class T, class Endianness> T endian_cast(endian_wrapper<Endianness> _wrapped) { T const result( swap_to_host_order<Endianness>(_wrapped.value())); return result; }
template <class T, class Endianness> T endian_cast(T _wrapped) { T const result(swap_to_host_order<Endianness>(_wrapped)); return result; }
It would be possible to also include the cases in which the input and output endiannesses are specified (the filter case), but those lead right back to confusion about the template argument order.
Technically, the output endianness would take the place of the int. But then you'd need a secondary cast, I suppose. ie
    big j = endian_cast<big, little>(k);
    int i = reinterpret_cast<int>(endian_cast<big, little>(k));
which could maybe be somehow shortened? But that is what is actually being done. If I understand the use case correctly.
Notice that endian_wrapper<> needed a means to get the value. I don't see the value of a wrapper type that specifies the endianness but gives no means to access the value.
Alternatively, you would reinterpret_cast<> internally. Otherwise I worry about the value of the wrapper - is it converted to host or not? I could live with something like:
    wrapper.data()  // raw data. or raw_data() or raw() or...
    wrapper.value() // converted to host *value*
Does this make sense? It's still a bit vague in my mind, but I'm just wondering if we can somehow manage to use the same functions and syntax in both the typed and untyped scenarios.
I think I understood you and it seems rather interesting. It is useful to make the source endianness explicit in the call which this approach does by requiring either a wrapper type with endianness or a template argument, but that is also made explicit by the other function-based interfaces we've discussed.
If the value to be converted using something like endian_cast is a field in a struct, and that structure's definition is provided outside the client program, then the field will likely be a built-in type, not an endian_wrapper. That means the conversion is from T to T. convert_from<big_endian>() seems better than endian_cast<T,big_endian>(). IOW, I don't think endian_cast can be extended helpfully as you've suggested. Instead, endian_cast<int>(make_endian<big_endian>(t)) would be needed and convert_from<big_endian>(t) is much clearer. (Perhaps the latter would be implemented using the former.)
convert_from<big_endian>() is slightly nicer, but endian_cast<T, big_endian>() isn't really much worse (only 2 characters! (not really...)) The benefit of the latter is consistency and explicitness.
That concern aside, I do like your suggested use of endian_cast with the endian_wrapper type.
I'm just glad any of it made sense at all. I'd rather have you disagree with some of it than not understand what I was trying to say. Tony

On Tue, Jun 8, 2010 at 12:46 PM, Gottlob Frege <gottlobfrege@gmail.com> wrote:
On Tue, Jun 8, 2010 at 11:52 AM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
Technically, the output endianness would take the place of the int. But then you'd need a secondary cast, I suppose. ie
big j = endian_cast<big, little>(k); int i = reinterpret_cast<int>(endian_cast<big, little>(k));
which could maybe be somehow shortened? But that is what is actually being done. If I understand the use case correctly.
int i = endian_cast<big, little>(k).raw(); Not saying it's perfect or anything. In particular, I still cringe at any code of the form "functionA().functionB()" (Well, functionA()->functionB() more so). But it keeps some consistency and makes the non-type-safe stuff more explicit. Tony

Gottlob Frege wrote:
On Tue, Jun 8, 2010 at 11:52 AM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
Gottlob Frege wrote:
It would be possible to also include the cases in which the input and output endiannesses are specified (the filter case), but those lead right back to confusion about the template argument order.
Technically, the output endianness would take the place of the int.
That doesn't fit the pattern because the result is an int, not a "big endian," for example. Hence the cast is to int first.
But then you'd need a secondary cast, I suppose. ie
big j = endian_cast<big, little>(k); int i = reinterpret_cast<int>(endian_cast<big, little>(k));
Those aren't casts following the pattern of the new-style casts in the language.
Notice that endian_wrapper<> needed a means to get the value. I don't see the value of a wrapper type that specifies the endianness but gives no means to access the value.
Alternatively, you would reinterpret_cast<> internally. Otherwise I worry about the value of the wrapper - is it converted to host or not? I could live with something like:
wrapper.data() // raw data. or raw_data() or raw() or...
wrapper.value() // converted to host *value*
I could live with raw() and host(). In fact, the copying conversion logic could actually be on that class:
w.convert<big_endian>();
w.convert<little_endian>();
w.raw();
The latter returns the value in the unconverted order as described by w's type's nested type "type." (Whew!) The former two will implement endianness conversion (or not) depending upon w's endianness and the template argument. That is, when the two types are the same, just return raw(). When they differ, do the conversion and return the result.
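The wrapper interface described above could be sketched roughly as follows. This is only an illustration under assumptions: the tag types `big_endian` and `little_endian`, the helper `reverse_bytes`, and the exact shape of `endian_wrapper` are hypothetical, since the thread names them without defining them. `raw()` returns the stored bytes untouched, and `convert<Target>()` returns `raw()` when the orders match and a byte-reversed copy otherwise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical order tags; the thread uses these names without defining them.
struct big_endian {};
struct little_endian {};

// Hypothetical helper: return a copy of v with its bytes reversed.
template <class T>
T reverse_bytes(T v) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &v, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&v, bytes, sizeof(T));
    return v;
}

// Sketch of the wrapper: the bytes are stored untouched and their
// order is recorded in the type.
template <class T, class Order>
class endian_wrapper {
public:
    typedef T type;
    typedef Order order;

    explicit endian_wrapper(T raw) : raw_(raw) {}

    // The value exactly as stored, i.e. in Order's byte order.
    T raw() const { return raw_; }

    // The value in Target's byte order: raw() when the orders match,
    // a byte-reversed copy when they differ.
    template <class Target>
    T convert() const {
        return std::is_same<Order, Target>::value ? raw_ : reverse_bytes(raw_);
    }

private:
    T raw_;
};
```

With this sketch, `endian_wrapper<std::uint32_t, big_endian> w(x); w.convert<big_endian>()` simply returns `x`, while `w.convert<little_endian>()` returns `x` with its bytes reversed. A `host()` accessor would additionally need to know the host order, which is left out here.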
convert_from<big_endian>() is slightly nicer, but endian_cast<T, big_endian>() isn't really much worse (only 2 characters! (not really...))
Actually, you're comparing those wrong. convert_from<big_endian>() means convert from big endian to host order. The equivalent cast requires knowing the host order: endian_cast<T, native_endian>(). I like the former better because it reads nicely, not just that it is shorter.
The benefit of the latter is consistency and explicitness.
Notice how easy it was to get wrong! It is also not consistent with the normal casts. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

First of all, I think we are confusing each other somewhere in these emails - I you, or you me, or both. Which, as you mention later down, does highlight the needs/problems. So if I talk past some of your points, that might be why. Somehow, even in the confusion, I still think we are highlighting worthwhile points.
On Tue, Jun 8, 2010 at 12:59 PM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
Gottlob Frege wrote:
On Tue, Jun 8, 2010 at 11:52 AM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
Gottlob Frege wrote:
It would be possible to also include the cases in which the input and output endiannesses are specified (the filter case), but those lead right back to confusion about the template argument order.
Technically, the output endianness would take the place of the int.
That doesn't fit the pattern because the result is an int, not a "big endian," for example. Hence the cast is to int first.
If you are explicitly converting with a from_type and a to_type, then (to me) the result is not actually an int, but a to_type. If you want to store it into an int, then cast it as one, or use to_type.raw().
But then you'd need a secondary cast, I suppose. ie
big j = endian_cast<big, little>(k); int i = reinterpret_cast<int>(endian_cast<big, little>(k));
Those aren't casts following the pattern of the new-style casts in the language.
Sorry if I wasn't clear, or don't understand - in my mind, k was typed/stored/declared as an int, it is holding a little-endian value that needs to be converted to big endian, then stored as another int (i). So:
int k = some_little_endian_int();
int i = reinterpret_cast<int>(endian_cast<big, little>(k));
or
int i = endian_cast<big, little>(k).raw(); // interpret k as little, convert/cast to big, return raw (big) data.
wrapper.data() // raw data. or raw_data() or raw() or...
wrapper.value() // converted to host *value*
I could live with raw() and host(). In fact, the copying conversion logic could actually be on that class:
w.convert<big_endian>(); w.convert<little_endian>(); w.raw();
Yes for the object-based approach. I like forcing everyone to use endian_cast<> but I can live with the above as well.
convert_from<big_endian>() is slightly nicer, but endian_cast<T, big_endian>() isn't really much worse (only 2 characters! (not really...))
Actually, you're comparing those wrong. convert_from<big_endian>() means convert from big endian to host order. The equivalent cast requires knowing the host order: endian_cast<T, native_endian>(). I like the former better because it reads nicely, not just that it is shorter.
To me,
endian_cast<int>(e)
takes a known endian type and converts to a host int, i.e. e was previously declared to be a big_endian or something. Thus
endian_cast<int, big_endian>(i)
takes an int (I suppose), reinterprets it as a big_endian, then converts (swaps if necessary) it to an int. The 'int' in the cast is for the return type, not the input type of i. I wonder if this is the source of confusion? If i is not an int (in particular, if i is an endian type?), I'm not sure what it should do. And, as mentioned,
endian_cast<little_endian, big_endian>(i)
takes an int, reinterprets it as a big_endian, converts it to a little_endian, returning a little_endian.
endian_cast<little_endian, big_endian>(i).raw();
as above, returning an "int". At least an int by declaration and storage.
The benefit of the latter is consistency and explicitness.
Are the above 4 cases clear/consistent/explicit? It wasn't completely formed in my mind at the start of this all, so thanks for the back-and-forth.
Notice how easy it was to get wrong! It is also not consistent with the normal casts.
Yes! I'm not sure where we diverged, but we have. :-) Tony

Gottlob Frege wrote:
On Tue, Jun 8, 2010 at 12:59 PM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
Gottlob Frege wrote:
On Tue, Jun 8, 2010 at 11:52 AM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
It would be possible to also include the cases in which the input and output endiannesses are specified (the filter case), but those lead right back to confusion about the template argument order.
Technically, the output endianness would take the place of the int.
That doesn't fit the pattern because the result is an int, not a "big endian," for example. Hence the cast is to int first.
If you are explicitly converting with a from_type and a to_type, then (to me) the result is not actually an int, but a to_type. If you want to store it into an int, then cast it as one, or use to_type.raw().
Here we must diverge, at least partially. For me, the point of any conversion is to get usable data. The object-based approach achieves this by providing access to a specifically ordered value each time it is requested, whether that involves a conversion or not. The function-based approach achieves this by replacing a value with the desired order (in-place) or by returning the value with the desired order. The function-based approach should always result in a client type T ready to use. I see no point in wrapping T in something to call a function that will, potentially, change its order and return a wrapped T when the point of the conversion is to get a usable T. Doing otherwise simply clutters the interface and requires an additional step from the client.
But then you'd need a secondary cast, I suppose. ie
big j = endian_cast<big, little>(k); int i = reinterpret_cast<int>(endian_cast<big, little>(k));
Those aren't casts following the pattern of the new-style casts in the language.
Sorry if I wasn't clear, or don't understand - in my mind, k was
The particular point I was raising is the two template arguments for endian_cast in the above not being new-style cast compatible.
typed/stored/declared as an int, it is holding a little-endian value that needs to be converted to big endian, then stored as another int (i).
so
int k = some_little_endian_int();
int i = reinterpret_cast<int>(endian_cast<big, little>(k));
or
int i = endian_cast<big, little>(k).raw(); // interpret k as little, convert/cast to big, return raw (big) data.
This example illustrates the extra syntax I don't want. Here's the simpler version of what you've shown:
int const k(get_little_endian_int());
int const i(convert_to<big_endian>(k));
With a wrapper type, that might look like this:
endian_wrapper<int,little_endian> k(get_little_endian_int());
int const i(endian_cast<int>(k));
It's probably not unreasonable to support both.
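A minimal sketch of how both forms might coexist, under several assumptions not settled in the thread: `convert_to<Order>` takes a host-order value and returns the same type in `Order`, `convert_from<Order>` does the reverse, and `endian_cast<T>` accepts only an `endian_wrapper` and returns a host-order `T`. The tags, `reverse_bytes`, and the runtime probe `host_is_little_endian` are hypothetical helpers introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical order tags.
struct big_endian {};
struct little_endian {};

// Hypothetical helper: return a copy of v with its bytes reversed.
template <class T>
T reverse_bytes(T v) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &v, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&v, bytes, sizeof(T));
    return v;
}

// Runtime probe of the host byte order (assumes a plain big- or
// little-endian host, not a mixed-endian one).
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

template <class Order>
bool matches_host() {
    return std::is_same<Order, little_endian>::value == host_is_little_endian();
}

// Function-based form: host-order value in, Order-ordered value out.
template <class Order, class T>
T convert_to(T host_value) {
    return matches_host<Order>() ? host_value : reverse_bytes(host_value);
}

// Function-based form: Order-ordered value in, host-order value out.
template <class Order, class T>
T convert_from(T ordered_value) {
    return matches_host<Order>() ? ordered_value : reverse_bytes(ordered_value);
}

// Minimal wrapper recording the order of the stored bytes in its type.
template <class T, class Order>
class endian_wrapper {
public:
    explicit endian_wrapper(T raw) : raw_(raw) {}
    T raw() const { return raw_; }
private:
    T raw_;
};

// Wrapper-based form: one explicit template argument, like the
// new-style casts; the source order is deduced from the wrapper type.
template <class T, class Order>
T endian_cast(endian_wrapper<T, Order> const& w) {
    return convert_from<Order>(w.raw());
}
```

In this sketch the unwrapped functions never state the source order in the call; it is implied to be the host order. The wrapper form carries the source order in the argument's type, which is why `endian_cast` needs only the result type.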
wrapper.data() // raw data. or raw_data() or raw() or...
wrapper.value() // converted to host *value*
I could live with raw() and host(). In fact, the copying conversion logic could actually be on that class:
w.convert<big_endian>(); w.convert<little_endian>(); w.raw();
Yes for the object-based approach. I like forcing everyone to use endian_cast<> but I can live with the above as well.
If endian_wrapper is found a reasonable means to indicate endianness assumptions, and the object-based approach is based upon endian_wrapper, then the conversion logic, other than that for in-place reordering, can be put in one place.
convert_from<big_endian>() is slightly nicer, but endian_cast<T, big_endian>() isn't really much worse (only 2 characters! (not really...))
Actually, you're comparing those wrong. convert_from<big_endian>() means convert from big endian to host order. The equivalent cast requires knowing the host order: endian_cast<T, native_endian>(). I like the former better because it reads nicely, not just that it is shorter.
to me
endian_cast<int>(e)
takes a known endian type and converts to a host int. ie e was previously declared to be a big_endian or something. Thus
endian_cast<int, big_endian>(i)
takes an int (I suppose), reinterprets it as a big_endian, then converts (swaps if necessary) it to an int. The 'int' in the cast is for the return type, not the input type of i. I wonder if this is the source of confusion?
I don't like that at all. Not only does it violate cast syntax with the extra template argument, but one dictates the result type and the other the input endianness. This would be a major source of confusion. I'd say this is the purview of convert_to/from rather than endian_cast. Thus, convert_from<big_endian>(i) would return a host order object of i's type, assuming i to be in big_endian form. The question is whether convert_to/from should accept unwrapped types. (More below.)
If i is not an int, (in particular, if i is an endian type?) I'm not sure what it should do. And, as mentioned
endian_cast<little_endian, big_endian>(i)
takes an int, reinterprets it as a big_endian, converts it to a little_endian, returning a little_endian.
I don't see the point of that. Surely the endian types can provide converting constructors.
endian_cast<little_endian, big_endian>(i).raw();
as above, returning an "int". At least an int by declaration and storage.
That's an interesting case, but I don't like using a cast for it. Perhaps the following, more verbose incantation would be appropriate:
convert_to<little_endian>(endian_wrapper<int,big_endian>(i));
That makes the big endian assumption about i explicit. Of course that could be made prettier with the following:
template <class T>
endian_wrapper<T,big_endian> as_big_endian(T _value)
{
    return endian_wrapper<T,big_endian>(_value);
}
convert_to<little_endian>(as_big_endian(i));
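To make that concrete, here is one way the wrapper-taking `convert_to` could look. It is a sketch under assumptions: the thread does not say what `convert_to` returns for a wrapped argument, so this version returns the plain `T` in the target order (the "int by declaration and storage" case). The tags, `reverse_bytes`, and the minimal `endian_wrapper` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical order tags.
struct big_endian {};
struct little_endian {};

// Hypothetical helper: return a copy of v with its bytes reversed.
template <class T>
T reverse_bytes(T v) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &v, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&v, bytes, sizeof(T));
    return v;
}

// Minimal wrapper recording the order of the stored bytes in its type.
template <class T, class Order>
class endian_wrapper {
public:
    explicit endian_wrapper(T raw) : raw_(raw) {}
    T raw() const { return raw_; }
private:
    T raw_;
};

// Helper from the message above: states the big-endian assumption
// about the value at the call site.
template <class T>
endian_wrapper<T, big_endian> as_big_endian(T value) {
    return endian_wrapper<T, big_endian>(value);
}

// Assumed overload: both orders are known at compile time, so the host
// order is never consulted. Returns the raw value in Target's order.
template <class Target, class T, class Source>
T convert_to(endian_wrapper<T, Source> const& w) {
    return std::is_same<Target, Source>::value ? w.raw() : reverse_bytes(w.raw());
}
```

Because source and target are both encoded in types, a call such as `convert_to<little_endian>(as_big_endian(i))` reads left to right as "to little, from big", which avoids the argument-order ambiguity of a two-argument `endian_cast`.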
Are the above 4 cases clear/consistent/explicit? It wasn't completely formed in my mind at the start of this all, so thanks for the back-and-forth.
I think I understand you better. I hope I've made my concerns clear and that we can soon converge on a concrete proposal of interfaces that others can critique. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com
participants (8)
- Alec Ross
- David Abrahams
- Gottlob Frege
- Lars Viklund
- Schrader, Glenn - 1002 - MITLL
- Stewart, Robert
- Vicente Botet Escriba
- vicente.botet