Re: [boost] [boost::endian] Request for comments/interest

2 Jun 2010

      Aha!

----- Original Message ----- 
From: "Tomas Puverle" <tomas.puverle@morganstanley.com>
Newsgroups: gmane.comp.lib.boost.devel
To: <boost@lists.boost.org>
Sent: Tuesday, June 01, 2010 11:01 AM
Subject: Re: [boost::endian] Request for comments/interest
...
...
...
...
struct Packet {
  internet::IpHeader ipHeader;
  internet::UdpHeader udpHeader;
  UserMessage userMessage;
}; // Packet
I do see one problem with this - you have defined your struct with a 
specified
endianness.
Yes, the over the wire endianness is always (in my world) specified 
independently and does not depend on the source or destination endianness.
Both endpoints use the same header file (ideally) or each side constructs 
their own.
If it gets more complicated than this, then we use CORBA.
...
However, if you wanted to send it to a different destination which 
requires a
different endianness, you would have to duplicate your data structure.
Yes, this design assumes compile-time knowledge of endianness.
So, the author cannot specify the endianness at runtime; its a template 
argument.
...
The more I think about it, the more it's clear to me that this represents 
an
example of the wrong separation of concerns (no flame intended - simply a
software engineering term):
In the typed approach, the endianness becomes the property of the message.
I think this is the appropriate separation of concerns.  The endianness is a 
property of how the data in the message is represented.
...
However, I think endianness should be the property of the destination, no?
I definitely disagree.  Because the code should not know the endianness of 
the destination.
There could be several destinations, possibly with differing endianness, for 
the same message.
Also, if a new board gets added that is has a different endian, I should 
only have to change the code on that board.  No other boards should be 
affected.
...
...
char buf[BIG_ENOUGH];
<snip>
Terry, I like your example as it does indeed seem to make simple things 
simple.
For the kind of work that I do, yes.
...
- read a matrix from disk and for the sake of argument, let's say it's in 
a
big-endian format
- read a number representing the number of its dimensions
- based on the number of elements, swap each element to the machine's
representation
//this is the part I am most interested in how you'd solve using the typed
//approach.  Ignore the potential alignment issue.
double * dataBeg = static_cast<unsigned char*>(data) + sizeof(mh),
        dataEnd = dataBeg + nElem;
swap_in_place<big_to_machine>(dataBeg, dataEnd);
//process dataBeg ...
I can't help but think that the typed approach will have to be O(N) in 
both
little and big-endian scenarios.
AHA!  Yes. If done this way, reading in the entire array and then swapping 
it in one step, then one would need an endian_swapper, like yours to make it 
a no-op in the native case.
However, if I had to solve this problem with my tools, I wouldn't read the 
whole array in at once.  I would only read in the header, but not the data 
at first.
Unfortunately, I can't remember how to do this with boost::multiarray, if we 
just allocate a really big vector for the data, then I would have read in 
each element into the data portion like this.

vector<double> dataPortion;
dataPortion.reserve(nElem);
for (int i=0; i!=nElem; ++i) {
  endian<big, double> x;
  read(fh, &x, sizeof(x));
  dataPortion.push_back(double(x));
}

This requires an extra copy into the temporary x, which cannot be a 
register, before storing the native version into the vector.
I see your point now for the in-place case.  Thank you!

Assuming I read in the whole array into memory at once, as you did, then I 
might try this (assuming aliasing rules don't really matter).
To address the in-place case with a typed interface, I would make a 
templated copy function.

template<endian_t E1, class T, endian_t E2> inline
endian<E2,T>* copy(const endian<E1, T>* first, const endian<E1,T>* last, 
endian<E2,T>* dest) {
  if (E1 == E2 && first == dest)
    return dest + nElem;
  while (first != last)
    *dest++ = *first++;
  return dest;
}

(I don't recall how to generalize the parameters as iterators.  Help 
anyone?)

... then to swap in place  ...
endian<big, double>* src = reinterpret_cast<endian<big, double>*>(mh +1);
double* dst = reinterpret_cast<double*>(src);
(void) copy(src, src+nElem, dst);  // Almost a no-op in the native, endian 
case.

... but I'm not sure if this is legal from an aliasing standpoint.

I think this approach would have similar performance to your swap() and 
swap_in_place().  Tomorrow night, I'll make some measurements.

terry

Re: [boost] [boost::endian] Request for comments/interest

Terry Golubiewski