
Ivan Matek wrote:
But the original reason I wanted to do this with span is that I wondered about performance. I wanted to see what happens when we do the same thing (hashing all contiguous struct members) in three ways:
1. Just call hash_append for each member.
2. Exploit the fact that, up to the padding, the members are contiguous and trivial, and pass that region in a single hash_append_range call.
3. Use the BOOST_DESCRIBE_STRUCT automagic.
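
Roughly, the three variants look like this (a simplified sketch, not the exact benchmark code; MyStruct and its members are placeholders):

    #include <boost/hash2/hash_append.hpp>
    #include <boost/hash2/sha2.hpp>
    #include <boost/describe/class.hpp>
    #include <cstdint>

    struct MyStruct
    {
        std::uint32_t a;
        std::uint32_t b;
        std::uint32_t c;
    };

    BOOST_DESCRIBE_STRUCT(MyStruct, (), (a, b, c))

    // 1. one hash_append call per member
    template<class Hash> void hash_v1( Hash& h, MyStruct const& s )
    {
        boost::hash2::hash_append( h, {}, s.a );
        boost::hash2::hash_append( h, {}, s.b );
        boost::hash2::hash_append( h, {}, s.c );
    }

    // 2. the members form one contiguous, trivially copyable block,
    //    so feed the whole region to the hash in a single call
    template<class Hash> void hash_v2( Hash& h, MyStruct const& s )
    {
        unsigned char const* p = reinterpret_cast<unsigned char const*>( &s );
        boost::hash2::hash_append_range( h, {}, p, p + sizeof( s ) );
    }

    // 3. let the BOOST_DESCRIBE_STRUCT metadata drive hash_append
    template<class Hash> void hash_v3( Hash& h, MyStruct const& s )
    {
        boost::hash2::hash_append( h, {}, s );
    }

    int main()
    {
        MyStruct s = { 1, 2, 3 };

        boost::hash2::sha2_256 h1, h2, h3;

        hash_v1( h1, s );
        hash_v2( h2, s );
        hash_v3( h3, s );
    }
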
Long story short: for some hashes it makes no difference, but on my machine there are small but reliable performance differences for sha2_256 and ripemd_128. The ranking is: 2 is the fastest, 3 the slowest, and 1 is in the middle. I am not surprised that 2 is a bit faster than 1, since we make just one call, but it is a bit odd that 3 is slower than 1; I would have guessed that the described class expands into the same code as case 1.
It could just be the compiler randomly deciding to inline/unroll something or not, but I wonder if you are aware of these differences. Maybe my assumption that Describe does not do anything differently from manually written calls for each member is wrong? The code is here: https://godbolt.org/z/WzPG54aTd , but it does not compile since I don't know how to get hash2 onto godbolt, and in any case godbolt is too noisy to measure small perf differences. I have tested this with libc++ and Clang 19.
Yeah, it's annoying that it can't "just work" somehow on CE. That's why I submit libraries to Boost, so that they show up there. One can manually copy and paste the necessary headers: https://godbolt.org/z/5Maf9Mx9h

Using a "hash archetype" (a skeleton algorithm that doesn't have any definitions), it seems that MyStruct1 and MyStruct3 generate the exact same code, so it must be inlining shenanigans. mp_for_each, which the described-class code uses, probably adds a few more function calls that push things over some Clang limit. As a general rule, Clang is more conservative with inlining than GCC, especially at -O3.
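
For reference, the archetype is just a skeleton that has the shape of a hash algorithm but leaves the members undefined, so the update calls stay visible in the generated assembly instead of being folded into real hashing code. Something along these lines (the name and the exact member set here are illustrative, not the actual test code):

    #include <array>
    #include <cstddef>
    #include <cstdint>

    struct hash_archetype
    {
        using result_type = std::array<unsigned char, 32>;

        hash_archetype();
        explicit hash_archetype( std::uint64_t seed );
        hash_archetype( void const* seed, std::size_t n );

        void update( void const* data, std::size_t n ); // declared, never defined

        result_type result();                           // declared, never defined
    };

And the described-class path is, conceptually, just an mp_for_each over the member descriptors, something like this (a rough sketch, not the actual library source):

    #include <boost/describe/members.hpp>
    #include <boost/mp11/algorithm.hpp>
    #include <boost/hash2/hash_append.hpp>

    template<class Hash, class Flavor, class T>
    void hash_append_described( Hash& h, Flavor const& f, T const& t )
    {
        using members = boost::describe::describe_members<T, boost::describe::mod_public>;

        boost::mp11::mp_for_each<members>( [&]( auto D )
        {
            // one more call/lambda per member for the inliner to chew through
            boost::hash2::hash_append( h, f, t.*D.pointer );
        } );
    }

Logically that's the same thing as the hand-written per-member calls, which is why the archetype codegen comes out identical; the extra layers just give the inliner more work.
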