
That is surprising. swap() should have no overhead compared to the endian object types. Then again, it's very early code, and I've done no optimization yet. It could be something as simple as a missing "inline".
Both approaches are equally efficient for the same-endian case.
Do you mean endian object and swap_in_place or endian object and swap?
I didn't see a missing inline, but I think I'm beginning to understand what is going on. The type-based endian approach suffers in the swap-in-place case because it's designed to be an efficient copier. My implementation of type-based endian requires a copy to a temporary, which must be in memory (not a register), before rewriting the data to the same location in converted form. std::swap() must do this too, but it can use a register for the temporary. This (I think) explains the performance advantage swapping shows for in-place conversion. However, the swap approach suffers when doing a copy: if you don't want to modify the original data, it has to read the data, write it (in place), and then read it again before writing it to the destination.

terry
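To make the comparison concrete, here is a minimal C++ sketch of the four cases discussed above. It is not the code from either implementation: reverse_bytes, swap_in_place, swap_copy, reversed_u32, typed_copy and typed_in_place are all hypothetical names invented for illustration, and reversed_u32 is only a stand-in for a type-based endian object. The swap-as-copier function shows one possible reading of the extra read and write (copy first, then convert the destination in place).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reverse the bytes of a 32-bit value. The value is
// passed and returned by value, so the temporary can live in a register.
inline std::uint32_t reverse_bytes(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Swap approach, in place: one load, one store.
inline void swap_in_place(std::uint32_t& x) {
    x = reverse_bytes(x);
}

// Swap approach used as a copier with the source left untouched:
// copy first, then convert the destination in place. Taken literally
// this is read src, write dst, read dst, write dst.
inline void swap_copy(const std::uint32_t& src, std::uint32_t& dst) {
    dst = src;
    swap_in_place(dst);
}

// Hypothetical stand-in for a type-based endian object: it always holds
// its value in reversed byte order and converts on assignment and read.
struct reversed_u32 {
    unsigned char bytes[4];

    reversed_u32& operator=(std::uint32_t v) {
        v = reverse_bytes(v);
        std::memcpy(bytes, &v, sizeof v);
        return *this;
    }
    operator std::uint32_t() const {
        std::uint32_t v;
        std::memcpy(&v, bytes, sizeof v);
        return reverse_bytes(v);
    }
};

// Type-based approach as a copier: one read of the source, one write of
// the converted form.
inline void typed_copy(const std::uint32_t& src, reversed_u32& dst) {
    dst = src;
}

// Type-based approach in place: the native value is staged through a
// temporary before the converted form is written back to the same bytes.
inline void typed_in_place(unsigned char* p) {
    std::uint32_t tmp;
    std::memcpy(&tmp, p, sizeof tmp);
    reversed_u32 converted;
    converted = tmp;
    std::memcpy(p, converted.bytes, sizeof converted.bytes);
}
```

An optimizing compiler may well collapse some of these extra loads and stores, so the sketch only shows where the additional memory traffic comes from in principle. Any real difference would have to be measured.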