
On Tue, Dec 3, 2024 at 9:09 PM Peter Dimov wrote:
> On the other hand, it makes it trivial to generate collisions
>
> pair ( "foo", "bar" )
> pair ( "foob", "ar" )
> pair ( "fooba", "r" )
>
> so maybe not.
Ah yes, it is obvious now. :)
Thank you. But this makes me wonder if we are talking about two different APIs/scenarios here:

1) I am combining hashes of different objects (e.g. the std::pair example above, with its collision potential).
2) I am hashing one stream of bytes, but I do not have all of them yet, so I pass them to the hasher as they arrive (e.g. receiving a long message over TCP and hashing each part as it comes in, to minimize the latency of computing the hash once the entire message has been received).

Currently, if we have a span sp, we can call either hash_append or hash_append_range to get one of the two behaviors above, and that feels quite error-prone to me. In any case I do not have much to suggest here; it seems like a hard problem, unless somebody can think of a way to make this less error-prone for users.

But the original reason I wanted to do this with span is that I wondered about performance. I wanted to see what happens when we do the same thing (hashing all contiguous struct members) in three ways:

1. Just call hash_append for each member.
2. Exploit the fact that, up to the padding, the members are contiguous and trivial, and pass that region in one call to hash_append_range.
3. Use the BOOST_DESCRIBE_STRUCT automagic.

Long story short: for some hashes it makes no difference, but on my machine there are small but reliable performance differences for sha2_256 and ripemd_128. The ranking is: 2. is fastest, 3. is slowest, and 1. is in the middle. I am not surprised that 2. is a bit faster than 1., since we make just one call, but it is a bit weird that 3. is slower than 1.; I would have guessed that a described class expands into the same code as case 1. It could just be the compiler randomly deciding to inline/unroll something or not, but I wonder if you are aware of these differences. Maybe my assumption that Describe does not do anything differently from manually written calls for each member is wrong?
Code is here: https://godbolt.org/z/WzPG54aTd, but it does not compile there since I do not know how to get hash2 onto godbolt, and anyway godbolt is too noisy for testing small perf differences. I tested this with libc++ and clang 19. I know SIMD is not used, so maybe it is pointless to talk about performance, but as I said, I assumed there would be no difference between 1. and 3.