[endian] Use in Hash algorithms

Popular hash algorithms (MD5, SHA, ...) involve a preprocessing stage to turn bytes (or individual bits) into (32- or 64-bit) words, using a certain byte (and bit) order. For my Hash library, I ended up writing a pack function that takes an endianness and the number of bits in the input and output values, and combines or splits the input as needed. A simple example: { array<uint8_t, 8> in = {{0x67, 0x45, 0x23, 0x01, 0xEF, 0xCD, 0xAB, 0x89}}; array<uint32_t, 2> out; pack<little_octet_big_bit, 8, 32>(in, out); array<uint32_t, 2> eout = {{0x01234567, 0x89ABCDEF}}; assert(out == eout); } As a bonus, it also handles non-bytes units: { array<uint8_t, 3> in = {{31, 17, 4}}; array<uint16_t, 1> out; pack<big_bit, 5, 15>(in, out); array<uint16_t, 1> eout = {{(31 << 10) | (17 << 5) | (4 << 0)}}; assert(out == eout); } An extensive set of examples can be found here: <http://svn.boost.org/svn/boost/sandbox/hash/libs/hash/test/pack.cpp> This is used to turn the input into words, to turn the length into words for padding, for figuring out where in the word the "1" padding bit goes, and for turning the state back into octets for display. There are enough optimizations SFINAEed in that the first example just results in a memcpy on x86, but doesn't require contiguous input. (It's perfectly happy with single-pass input, though usually somewhat slower in that case.) I'm not sure how widely applicable this form of the solution would be, but I think it's a case where both the byte-swapping version and Beman's swap-on-load approach are awkward. ~ Scott McMurray

Popular hash algorithms (MD5, SHA, ...) involve a preprocessing stage to turn bytes (or individual bits) into (32- or 64-bit) words, using a certain byte (and bit) order.
<snip> Hi Scott, I am sorry but I can't speak intelligently about the suitability of my endian implementation for this particular purpose. Perhaps someone else with some experience in cryptographic hash functions can comment. I didn't want to just ignore your comment but this simply isn't an area of my expertise. Tom

On 28 May 2010 12:26, Tomas Puverle <tomas.puverle@morganstanley.com> wrote:
I am sorry but I can't speak intelligently about the suitability of my endian implementation for this particular purpose.
I'll try to pose it differently. How would you implement the following using your library? Magic<8,32> m; m.put(0xAA); m.put(0xBB); m.put(0xCC); m.put(0xDD); assert(m.get() == 0xAABBCCDD); What about supporting this slightly more complicated case? Magic<16,32> m; m.put(0xAAAA); m.put(0xBBBB); assert(m.get() == 0xAAAABBBB); And how about this far more challenging one? Magic<4, 16> m; m.put(0xA); m.put(0xB); m.put(0xC); m.put(0xD); assert(m.get() == 0xABCD); I think I see how I'd do the first, and probably the second, but the third may require a very different approach. ~ Scott McMurray

What is Magic<int, int>? terry ----- Original Message ----- From: "Scott McMurray" <me22.ca+boost@gmail.com> Newsgroups: gmane.comp.lib.boost.devel To: <boost@lists.boost.org> Sent: Friday, May 28, 2010 11:42 AM Subject: Re: [endian] Use in Hash algorithms
On 28 May 2010 12:26, Tomas Puverle <tomas.puverle@morganstanley.com> wrote:
I am sorry but I can't speak intelligently about the suitability of my endian implementation for this particular purpose.
I'll try to pose it differently.
How would you implement the following using your library?
Magic<8,32> m; m.put(0xAA); m.put(0xBB); m.put(0xCC); m.put(0xDD); assert(m.get() == 0xAABBCCDD);
What about supporting this slightly more complicated case?
Magic<16,32> m; m.put(0xAAAA); m.put(0xBBBB); assert(m.get() == 0xAAAABBBB);
And how about this far more challenging one?
Magic<4, 16> m; m.put(0xA); m.put(0xB); m.put(0xC); m.put(0xD); assert(m.get() == 0xABCD);
I think I see how I'd do the first, and probably the second, but the third may require a very different approach.
~ Scott McMurray

On 29 May 2010 00:05, Terry Golubiewski <tjgolubi@netins.net> wrote:
What is Magic<int, int>?
Something that needs implementing as a test of an endian library. For example, I could implement it with what I wrote as follows: template <int In, int Out> struct Magic { array< uint_t<In>::least, Out/In > in; unsigned i; Magic() : i() {} void put(uint_t<In>::least x) { in[i++] = x; } uint_t<Out>::least get() const { array<uint_t<Out>::least, 1> out; pack<big_bit, In, Out>(in, out); return out[1]; } }; With Beman's it would be something like this: template <int In, int Out> struct Magic { union { array< endian<big, uint_t<In>::exact>, Out/In> in; endian<big, uint_t<Out>::exact> out; }; array< uint_t<In>::least, Out/In > in; unsigned i; Magic() : i() {} void put(uint_t<In>::least x) { in[i++] = x; } uint_t<Out>::least get() const { return out; } }; I think, anyways, though it requires C++0x unions. That, of course, can't handle Ins or Outs that are not multiples of CHAR_BIT.
participants (3)
-
Scott McMurray
-
Terry Golubiewski
-
Tomas Puverle