
Thanks Dave,
2) To copy or not to copy. <snip>
Dave brings up an important example which I'd like to expand on a little. Suppose your application generates a large amount of data which may need to be endian-swapped. For the sake of argument, say I've just generated a 10GB array that contains some market data, which I want to send in little-endian format to some external device.

In the case of the typed interface, in order to send this data, I would have to construct a new 10GB array of little32_t and then copy the data from the host array to the destination array. This has several problems:

1) It relies on the fact that the typed class can be exactly overlaid over the space required by the underlying type. This is an implementation detail but a concern nonetheless, especially if, for example, you start packing your members for space efficiency.

2) The copy always happens, even if the data doesn't need to change because it's already in the correct "external" format. This is useless work: not only does it use one CPU to do nothing 10 billion times, it also unnecessarily taxes the memory interfaces, potentially affecting other CPUs/threads.

(There is more, but I hope this is enough of an illustration.)

swap_in_place<>(r), where r is a range (or swap_in_place<>(begin,end), which is provided for convenience), will be zero cost if no work needs to be done, and will have the same complexity as the copy above only if swapping is actually required. With the swap_in_place<>() approach, you only pay for what you need (to borrow from the C++ mantra).

Tom
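
P.S. To make the "only pay for what you need" point concrete, here is a minimal sketch of the idea. It is not the actual swap_in_place<>() code: the names byte_order, reverse_bytes, swap_range and swap_in_place_sketch are made up for illustration, it handles only 32-bit elements, and it takes both byte orders as explicit template arguments rather than detecting the host order.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical names throughout; this is not the proposed library's interface.
enum class byte_order { little, big };

// Reverse the four bytes of one 32-bit value.
inline std::uint32_t reverse_bytes(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Orders differ: one pass over the range, swapping each element in place.
template <class It>
void swap_range(It first, It last, std::true_type) {
    for (; first != last; ++first) *first = reverse_bytes(*first);
}

// Orders match: empty body, so there is no loop and no memory traffic.
template <class It>
void swap_range(It, It, std::false_type) {}

// The decision is made at compile time from the two template arguments.
template <byte_order From, byte_order To, class It>
void swap_in_place_sketch(It first, It last) {
    swap_range(first, last, std::integral_constant<bool, From != To>());
}
```

With a std::vector<std::uint32_t> v, calling swap_in_place_sketch<byte_order::big, byte_order::little>(v.begin(), v.end()) walks the array once and swaps each element where it sits, with no second 10GB buffer. Calling it with two identical orders selects the empty overload, which an optimizing compiler should reduce to nothing.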