Re: [boost] [hash2][review] Early review (due to holidays)

6 Dec 2024

On Fri, Dec 6, 2024 at 6:12 PM Peter Dimov via Boost <boost@lists.boost.org> wrote:
...
Andrey Semashev wrote:
...
And fixed extents are not as useful as the dynamic extent in general, in my
experience, as most of the time we deal with variable-sized sequences.
The purpose of span is to replace pointer arguments. If your function takes
I agree, but I wonder if we can maybe have our cake and eat it too, i.e.
no need to pick one or the other: just provide an easy default for most
users and keep span out of the "low level" API.
I think one important thing to remember is that currently a span is
processed as its bytes + its size when you call hash_append.
If you want it processed as just its bytes, you need to call
hash_append_range with begin and end iterators.
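
To make the difference concrete, here is a minimal sketch. It does not use
Hash2 at all: toy_hash is a stand-in FNV-1a, append_bytes and
append_bytes_and_size are made-up names that only mimic the two treatments
described above, and std::string_view stands in for span so that the
example stays C++17. The width and encoding of the appended size are also
just assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Toy 64-bit FNV-1a, used only as a stand-in for a real hash algorithm.
struct toy_hash {
    std::uint64_t state = 14695981039346656037ull;  // FNV offset basis

    void update(const void* p, std::size_t n) {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        for (std::size_t i = 0; i < n; ++i) {
            state ^= b[i];
            state *= 1099511628211ull;  // FNV prime
        }
    }
};

// Assumed "range" treatment: feed the element bytes and nothing else.
inline void append_bytes(toy_hash& h, std::string_view s) {
    h.update(s.data(), s.size());
}

// Assumed "value" treatment: element bytes, then the element count.
inline void append_bytes_and_size(toy_hash& h, std::string_view s) {
    h.update(s.data(), s.size());
    const std::uint64_t n = s.size();
    h.update(&n, sizeof n);
}

// Hash "1234" in one call and as "12" + "34"; report whether results agree.
template <class Append>
bool split_gives_same_result(Append append) {
    toy_hash whole, parts;
    append(whole, "1234");
    append(parts, "12");
    append(parts, "34");
    return whole.state == parts.state;
}
```

With this toy, split_gives_same_result(append_bytes) holds, while
split_gives_same_result(append_bytes_and_size) does not, because each
chunk contributes its own size to the message.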

My first attempt at fixing this potential for confusion, while also
addressing the concerns about performance, would be to provide a higher
level API for users and keep the low level API for authors of algorithms,
since when it comes to hashing even the tiny overhead of using span can be
problematic.

To steal a placeholder name from other discussions, we will call these
helpers simple_hash_{something}. They would use the default flavor and not
provide a ptr, len interface.
But here we get to the point that, as mentioned above, sometimes a span is
not treated as just its bytes.
So my intuition is that we would need 2 different helpers addressing 2
distinct use cases:

   1. simple_hash_bytes
   2. simple_hash_values

Naming is not great, but I hope you get the idea.
simple_hash_bytes can only process span/vector/array-like arguments
(contiguous ranges whose elements have the same size as char); it would
only append bytes, not size. It does not know how to hash int, std::string,
etc., just byte-like contiguous ranges.
simple_hash_values processes values, e.g. a user-defined type for which
you implemented tag_invoke, std::string, std::pair, int, etc. Here a span
is hashed as its bytes and size.
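
A rough sketch of what the two helpers could look like follows. Everything
in it is hypothetical: the class shapes, the add/result member names, and
the toy FNV-1a state that stands in for a real algorithm with the default
flavor. The overloads for int and std::string are only there to show the
idea of "values"; a real helper would presumably forward to hash_append.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <string>
#include <type_traits>
#include <vector>

// Toy FNV-1a state; a placeholder for a real hash algorithm.
struct toy_hash {
    std::uint64_t state = 14695981039346656037ull;
    void update(const void* p, std::size_t n) {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        for (std::size_t i = 0; i < n; ++i) {
            state ^= b[i];
            state *= 1099511628211ull;
        }
    }
};

// Hypothetical helper: accepts only contiguous ranges of byte-sized
// elements and appends their bytes, never their size.
class simple_hash_bytes {
    toy_hash h_;
public:
    template <class Range>
    void add(const Range& r) {
        using elem = std::remove_pointer_t<decltype(std::data(r))>;
        static_assert(sizeof(elem) == 1,
                      "simple_hash_bytes takes byte-like ranges only");
        h_.update(std::data(r), std::size(r));
    }
    std::uint64_t result() const { return h_.state; }
};

// Hypothetical helper: hashes values; a string contributes its bytes
// followed by its size.
class simple_hash_values {
    toy_hash h_;
public:
    void add(int v) { h_.update(&v, sizeof v); }
    void add(const std::string& s) {
        h_.update(s.data(), s.size());
        const std::uint64_t n = s.size();
        h_.update(&n, sizeof n);
    }
    std::uint64_t result() const { return h_.state; }
};
```

Note that the static_assert in this sketch only checks the element size, so
a std::string would also be accepted by simple_hash_bytes::add; whether
that should be rejected is a design decision the sketch does not settle.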

I think this would cover 2 common use cases:

   1. simple_hash_bytes - I am getting my data in batches and hashing
   them, but I want the same result, e.g. if my file content is 1234 bytes,
   no matter how I split those 1234 bytes across multiple calls with span,
   I will get the same result.
   2. simple_hash_values - I am hashing multiple values together, e.g.
   std::pair<std::span<char>, std::span<char>>, and I do *not* want .first
   and .second to just be concatenated.
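
The second use case can be illustrated with a small sketch. Again this is
not Hash2 code: fnv1a, hash_concatenated and hash_as_values are made-up
names, std::string_view stands in for std::span<char> to stay within C++17,
and appending a 64-bit size after each member is only an assumption about
how a value-style helper might delimit the members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <string_view>
#include <utility>

using view_pair = std::pair<std::string_view, std::string_view>;

constexpr std::uint64_t fnv_basis = 14695981039346656037ull;

// One FNV-1a pass over n bytes, continuing from `state`.
inline std::uint64_t fnv1a(std::uint64_t state, const void* p, std::size_t n) {
    const unsigned char* b = static_cast<const unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i) {
        state ^= b[i];
        state *= 1099511628211ull;
    }
    return state;
}

// Bytes only: .first and .second are effectively concatenated.
inline std::uint64_t hash_concatenated(const view_pair& v) {
    std::uint64_t s = fnv_basis;
    s = fnv1a(s, v.first.data(), v.first.size());
    s = fnv1a(s, v.second.data(), v.second.size());
    return s;
}

// Value style: each member contributes its bytes, then its size.
inline std::uint64_t hash_as_values(const view_pair& v) {
    std::uint64_t s = fnv_basis;
    for (std::string_view part : {v.first, v.second}) {
        s = fnv1a(s, part.data(), part.size());
        const std::uint64_t n = part.size();
        s = fnv1a(s, &n, sizeof n);
    }
    return s;
}
```

With the pairs {"ab", "c"} and {"a", "bc"}, hash_concatenated sees the same
three bytes "abc" both times and returns the same value, while
hash_as_values feeds different messages and so keeps the two pairs apart.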

Now you can obviously construct a use case where neither of the helpers
works, e.g. I am getting some data from a file in chunks, and then I want
to append the hash of a std::string. But I think most cases are either
people hashing bytes in chunks or people hashing values (each value in 1
call). A user could wrongly assume that simple_hash_values with a span
will behave like simple_hash_bytes, but I think the documentation can
explain that pretty clearly in a few sentences.
...

Ivan Matek