Re: [boost] [boost::endian] Request for comments/interest

28 May 2010

      ----- Original Message ----- 
From: "Tomas Puverle" <tomas.puverle@morganstanley.com>
Newsgroups: gmane.comp.lib.boost.devel
To: <boost@lists.boost.org>
Sent: Friday, May 28, 2010 11:05 AM
Subject: Re: [boost::endian] Request for comments/interest
...
Thanks Dave,
...
2)      To copy or not to copy.
<snip>
Dave brings up an important example which I'd like to expand on a little:
Suppose your application generates a large amount of data which may need
to be endian-swapped.
For the sake of argument, say I've just generated a 10GB array that
contains some market data, which I want to send in little-endian format to
some external device.
In the case of the typed interface, in order to send this data, I would
have to construct a new 10GB array of little32_t and then copy the data
from the host array to the destination array.
Since IP packets cannot be 10GB, I submit that you're going to have to break 
your 10GB array down into messages.  Then you're going to copy portions of 
the 10GB array into those messages and send them.   In the type-based 
approach the message may indeed contain an array.

   boost::array<endian<little, uint32_t>, MaxFragmentSize> buffer;

You copy fragments of the 10GB array into that buffer before sending, and 
then on the receiving side, copy them out.
The user on either side of the interface can extract the data from the 
fields without knowing the endianness of the field or the endianness of the 
machine he's working on.
He doesn't have to know to call a swap function.  He just extracts the data 
using the standard copy algorithm.  The conversion happens automatically by 
implicit conversions.
One copy into each message.  One copy out.  What could be better than that?
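
The copy-in/copy-out pattern described above can be sketched as follows. This is only an illustration: `little_u32` is a hypothetical stand-in for `endian<little, uint32_t>`, and `fragment`, `pack`, `unpack` and the tiny `MaxFragmentSize` value are names made up for the example, not part of any posted library.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for endian<little, uint32_t>: the value is stored as
// raw bytes in little-endian order, whatever the host byte order is.
class little_u32 {
public:
    little_u32() = default;
    little_u32(std::uint32_t v) {             // implicit: host value -> wire bytes
        for (std::size_t i = 0; i < sizeof bytes_; ++i)
            bytes_[i] = static_cast<unsigned char>(v >> (8 * i));
    }
    operator std::uint32_t() const {          // implicit: wire bytes -> host value
        std::uint32_t v = 0;
        for (std::size_t i = 0; i < sizeof bytes_; ++i)
            v |= static_cast<std::uint32_t>(bytes_[i]) << (8 * i);
        return v;
    }
    const unsigned char* data() const { return bytes_; }
private:
    unsigned char bytes_[4] = {};
};

constexpr std::size_t MaxFragmentSize = 4;    // assumed value, tiny for illustration
using fragment = std::array<little_u32, MaxFragmentSize>;

// Sender side: one copy into the message; each assignment converts implicitly.
inline fragment pack(const std::vector<std::uint32_t>& host, std::size_t offset) {
    fragment msg;                             // unused slots stay zero
    const std::size_t n = std::min(MaxFragmentSize, host.size() - offset);
    const auto first = host.begin() + static_cast<std::ptrdiff_t>(offset);
    std::copy(first, first + static_cast<std::ptrdiff_t>(n), msg.begin());
    return msg;
}

// Receiver side: one copy out; each element read converts back to host order.
inline std::vector<std::uint32_t> unpack(const fragment& msg) {
    return std::vector<std::uint32_t>(msg.begin(), msg.end());
}
```

Neither `pack` nor `unpack` mentions byte order; the conversions live entirely in the element type, which is the point being argued.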
...
This has several problems:
1) It is relying on the fact that the typed class can be exactly overlaid
over the space required by the underlying type.  This is an implementation
detail but a concern nonetheless, especially if, for example, you start
packing your members for space efficiency.
In the example I posted, on non-native machines, an object "T" is 
represented inside of endian<endian_t, T> as "char storage[sizeof(T)]".
Provided that the compiler provides some kind of "packed" directive (all 
that I use do), then field alignment isn't an issue.
Doesn't swap_in_place<>() make the same assumption of overlaying types?
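
One way to make the overlay assumption explicit is to state it with compile-time checks, so a platform where it fails is rejected at build time. The sketch below assumes a byte-array representation like the one described above; `raw_field` and `header` are hypothetical names, and the expected sizes hold on mainstream compilers but are not guaranteed by the language.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical byte-storage wrapper, mirroring "char storage[sizeof(T)]".
template <class T>
struct raw_field {
    unsigned char storage[sizeof(T)];
};

// Hypothetical message header built only from byte-storage fields.
struct header {
    raw_field<std::uint16_t> type;
    raw_field<std::uint32_t> length;
    raw_field<std::uint16_t> checksum;
};

// The overlay assumption, written down so the compiler can check it.
static_assert(sizeof(raw_field<std::uint32_t>) == sizeof(std::uint32_t),
              "field wrapper adds padding");
static_assert(alignof(raw_field<std::uint32_t>) == 1,
              "field wrapper is over-aligned");
static_assert(sizeof(header) == 8, "header contains padding");
static_assert(offsetof(header, length) == 2, "unexpected field offset");
```

Because every member is an array of `unsigned char`, the fields need only byte alignment, so the 32-bit `length` can sit at offset 2 without a compiler-specific packing directive in this sketch.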
...
2) The copy always happens, even if the data doesn't need to change, since
it's already in the correct "external" format.  This is useless work - not
only does it use one CPU to do nothing 10 billion times, it also
unnecessarily taxes the memory interfaces, potentially affecting other
CPUs/threads (and more, but I hope this is enough of an illustration).
In the message-based interfaces that I am used to, one must always copy some 
data structures into a message before sending it.
After all, if you're using byte-streams, then endianness doesn't really 
apply.
There is always at least one copy into the message.  The typed-interface 
only requires one copy of data into each message.
In both techniques you have to copy the information out of the message, if 
you use it, at least one time.  The problem with the swapping mechanism is 
that the swap requires a read and a write at every location before you 
even use it, whether you actually read the fields or not.  And/or the user 
has to remember whether he/she has already swapped each field.  Since 
messages are often passed from one protocol layer to the next, usually 
written by different authors, I shudder to think of the integration 
experience.  The typed method requires one read from each memory location no 
matter what the endianness is.  (Unfortunately, in the case of poorly 
optimizing compilers, the read on non-native machines may actually make two 
copies.)  The only efficiency issue with the typed interface is that 
non-native-endianness values are read out in reverse order byte-by-byte, 
whereas native-endian fields can be read out of the message more 
efficiently using word-sized and aligned data transfers.
...
swap_in_place<>(r) where r is a range (or swap_in_place<>(begin,end),
which is provided for convenience) will be zero cost if no work needs to be
done, while having the same complexity as the above (but only!) if swapping
is required.
With the swap_in_place<>() approach, you only pay for what you need (to
borrow from the C++ mantra).
With the typed-approach you only pay for the message fields that you read. 
No extra work is required on native-endian machines.
I think the typed-approach actually fits the "only pay for what you use" 
mantra better.
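
For comparison, the in-place approach being discussed might look roughly like the sketch below. It is not the posted library's code: `swap_in_place_sketch`, `byte_order` and `host_order` are hypothetical names, and the host byte order is probed at run time only to keep the example portable to older compilers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

enum class byte_order { little, big };

// Run-time probe of the host byte order (assumes a little- or big-endian host).
inline byte_order host_order() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1 ? byte_order::little : byte_order::big;
}

// Hypothetical analogue of swap_in_place<>(begin, end): reverses the bytes of
// every element, but only when the target order differs from the host order.
template <class It>
void swap_in_place_sketch(byte_order target, It first, It last) {
    if (target == host_order())
        return;                                // no work, no memory traffic
    for (; first != last; ++first) {
        unsigned char* p = reinterpret_cast<unsigned char*>(&*first);
        std::reverse(p, p + sizeof(*first));   // one read and one write per byte
    }
}
```

Note that calling the function twice with a non-native target restores the original values, so the caller has to track whether a buffer has already been swapped.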

I get the impression that I'm missing something.  If you're game, I'd like 
to consider a real-world use-case that uses multiple endians and has 
different protocol layers.
That is, one over-the-wire packet has several layers of headers, possibly 
with a different endianness than the user payload they carry.  This is 
common on PCs, which often have big-endian IP headers and then have a 
little-endian user payload.  The whole packet is read in from a socket at 
once into a data buffer owned by a unique_ptr, so the message is not copied 
from layer-to-layer.  I work on proprietary, non-internet networks, so I'm 
not sure which protocol headers we should use for a use-case.  In my 
wireless applications, the headers are usually padded to an integral number 
of bytes, but fields within the headers are sometimes not byte-aligned.
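
As a rough illustration of the scenario described above (not taken from either library, and not a real protocol), the sketch below reads a big-endian header field and a little-endian payload word from one buffer owned by a unique_ptr, without copying or modifying the buffer. The 2-byte length header, `packet_view`, `load_big16`, `load_little32` and `make_sample_packet` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Assemble values byte-by-byte so the result is independent of host byte order.
inline std::uint16_t load_big16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}
inline std::uint32_t load_little32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical two-layer packet: a 2-byte big-endian payload length, followed
// by little-endian 32-bit payload words. The view does not own the bytes.
struct packet_view {
    const unsigned char* bytes;
    std::uint16_t payload_length() const { return load_big16(bytes); }
    std::uint32_t payload_word(std::size_t i) const {
        return load_little32(bytes + 2 + 4 * i);
    }
};

// Builds a sample packet: length 4, then one payload word 0x11223344.
inline std::unique_ptr<unsigned char[]> make_sample_packet() {
    const unsigned char raw[6] = {0x00, 0x04, 0x44, 0x33, 0x22, 0x11};
    std::unique_ptr<unsigned char[]> buf(new unsigned char[6]);
    for (std::size_t i = 0; i < 6; ++i)
        buf[i] = raw[i];
    return buf;
}
```

Each layer reads only the fields it needs, and the buffer stays in wire order as it is handed from one layer to the next.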

We're only considering byte-ordering here too.  An equally important part of 
the endian problem for me is the bit-ordering.  For this I use a similar 
technique for portable bitfields:

bitfield<endian_t, w1, w2, w3, w4, w5, ...>

I'm not sure yet how your swapping technique would affect that.

If we can find the time, I think our discussions would benefit from a 
concrete example to measure against.

BTW, I like the interface design of your library and the way you use macros 
and iterators to ease the swappability of classes, including inheritance.
I'm arguing against swapping though because I've been using the type-based 
method (but not Beman's exactly) successfully for a long time.  I'm very 
biased.  :o).

terry

Terry Golubiewski