
Ivan Matek wrote:
But the original reason I wanted to do this with span is that I wondered about performance. I wanted to see what happens when we do the same thing (hashing of all contiguous struct members) in three ways (sketched below):
1. just call hash_append for each member
2. exploit that the members before the padding are contiguous and trivial, so pass that region in one call to hash_append_range
3. use the BOOST_DESCRIBE_STRUCT automagic
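Condensed, the three variants look roughly like this (a sketch of what I benchmarked, not the exact code from the link; MyStruct and the hash_v* helpers are made-up names, and I'm assuming the hash_append / hash_append_range overloads that take a default-constructed flavor):

#include <boost/hash2/hash_append.hpp>
#include <boost/hash2/sha2.hpp>
#include <boost/describe/class.hpp>
#include <cstdint>

struct MyStruct
{
    std::uint32_t a;
    std::uint32_t b;
    std::uint32_t c;
};

BOOST_DESCRIBE_STRUCT(MyStruct, (), (a, b, c))

// 1. one hash_append call per member
template<class Hash> void hash_v1( Hash& h, MyStruct const& s )
{
    boost::hash2::hash_append( h, {}, s.a );
    boost::hash2::hash_append( h, {}, s.b );
    boost::hash2::hash_append( h, {}, s.c );
}

// 2. the members form one contiguous, trivially copyable region,
//    so pass its bytes to the hash in a single call
template<class Hash> void hash_v2( Hash& h, MyStruct const& s )
{
    auto p = reinterpret_cast<unsigned char const*>( &s );
    boost::hash2::hash_append_range( h, {}, p, p + 3 * sizeof(std::uint32_t) );
}

// 3. rely on the BOOST_DESCRIBE_STRUCT support
template<class Hash> void hash_v3( Hash& h, MyStruct const& s )
{
    boost::hash2::hash_append( h, {}, s );
}

int main()
{
    MyStruct s{ 1, 2, 3 };

    boost::hash2::sha2_256 h;
    hash_v2( h, s );
    auto digest = h.result(); // same idea for hash_v1 / hash_v3 and for ripemd_128
    (void)digest;
}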
Long story short: for some hashes it makes no difference, but on my machine for sha2_256 and ripemd_128 there are small but reliable performance differences. The ranking is: 2 is fastest, 3 is slowest, and 1 is in the middle. I am not surprised that 2 is a bit faster than 1, since we make just one call, but it is a bit weird that 3 is slower than 1; I would have guessed that the described class expands into the same code as case 1.
It could just be the compiler randomly deciding to inline/unroll something or not, but I wonder if you are aware of these differences. Maybe my assumption that describe does not do anything differently from manually written calls for each member is wrong? The code is here <https://godbolt.org/z/WzPG54aTd>, but it does not compile since I don't know how to get hash2 onto godbolt, and anyway godbolt is too noisy to measure small performance differences. I have tested this with libc++ and Clang 19.
Yeah, it's annoying that it can't "just work" somehow on CE. That's why I submit libraries to Boost, so that they show up there. One can manually copy and paste the necessary headers: https://godbolt.org/z/5Maf9Mx9h

Using a "hash archetype" (a skeleton algorithm that doesn't have any definitions), it seems that MyStruct1 and MyStruct3 generate the exact same code, so it must be inlining shenanigans. mp_for_each, which the described-class code uses, probably adds a few more function calls that push things over some Clang limit. As a general rule, Clang is more conservative with inlining than GCC, especially at -O3.
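The archetype is essentially this (an illustrative sketch; archetype, X, f1, f3 are names made up here, and the member signatures follow what I understand the HashAlgorithm requirements to be):

#include <boost/hash2/hash_append.hpp>
#include <boost/describe/class.hpp>
#include <cstdint>
#include <cstddef>

// Skeleton algorithm: everything is declared but nothing is defined,
// so the compiler has to emit real calls to update() and the shape of
// the code produced by hash_append stays visible in the assembly.
struct archetype
{
    using result_type = std::uint64_t;

    archetype();
    explicit archetype( std::uint64_t seed );
    archetype( void const* seed, std::size_t n );

    void update( void const* data, std::size_t n );
    result_type result();
};

struct X
{
    std::uint32_t a;
    std::uint32_t b;
    std::uint32_t c;
};

BOOST_DESCRIBE_STRUCT(X, (), (a, b, c))

// per-member calls (case 1)
void f1( archetype& h, X const& x )
{
    boost::hash2::hash_append( h, {}, x.a );
    boost::hash2::hash_append( h, {}, x.b );
    boost::hash2::hash_append( h, {}, x.c );
}

// described-class support (case 3)
void f3( archetype& h, X const& x )
{
    boost::hash2::hash_append( h, {}, x );
}

Comparing the compiler output for f1 and f3 is the comparison I mean above.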