Re: [boost] [serialization] a proposal for an alternative to the new const-saving rule

25 Jun 2005


      Hi Dave,

----- Mensaje original -----
De: David Abrahams <dave@boost-consulting.com>
Fecha: Sábado, Junio 25, 2005 2:09 pm
Asunto: Re: [boost] [serialization] a proposal	for	an
	alternative tothenewconst-saving rule
...
"JOAQUIN LOPEZ MU?Z" <joaquin@tid.es> writes:
...
David Abrahams <dave@boost-consulting.com> wrote:
  to thenewconst-saving rule
...
"JOAQUIN LOPEZ MU?Z" <joaquin@tid.es> writes:
...
De: David Abrahams <dave@boost-consulting.com>
I'm not sure it is.  There's an imposition on users: all the types
they want to serialize have to support hashing.
No, no. Please check the piece of pseudocode on my first
message: the hash code is automatically built by
Boost.Serialization, without any intervention from the user,
I understand that.
...
and certainly without any requirement that the type be hashable
(in the sense of providing a hash_value overload or something
like this.)
I was talking about the type's sub-objects.
...
Let me illustrate with an example:
<snip>
I actually did understand that, as you can see from...
...
The process recursively goes down to primitive types,
as specified in my proposed pseudocode, and only these
have to be hashable, but fortunately they are.
...
It is nice that the serialization library automatically takes
care of
...
hashing aggregated types and leaving out the unserialized 
data...
...this remark, and the remarks at the end of my message, which you
snipped.
OK, excuse my verbosity, but now I'm satisfied that 
we both agree on the terms the algorithm is laid out.
...
...
...
uh, wait: this will never work unless you plan only to do shallow
hashing.  Otherwise you will get an exponential explosion for some
object graphs.  Is that your intention?
I'm not getting you. The hash value is calculated as part of the
saving process itself, so it has the very same complexity.
I've got the hunch you might be meaning something else,
could you please elaborate?
The question is, when encountering a pointer or reference member that
needs to be hashed, do you do what the serialization process does and
hash the thing that it refers to, or do you just hash its address?
The former is deep hashing; the latter is shallow hashing.  In a graph
with tracking, you serialize an object X once, and all subsequent
times you encounter X in the graph you just serialize an id.  If you
add hashing, the first time you encounter X you'll hash it and
serialize it.  For this one time it doesn't matter if hashing is deep
or shallow from a big-O perspective.  However, when you encounter X
again, if hashing is deep, you'll do a lot of extra work.  
Actually, a
potentially infinite amount if there are cycles, so hashing really
does have to be shallow.  I guess I answered my own question.
I guess so. Correct, pointer hashing has to be shallow.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

JOAQUIN LOPEZ MU?Z

tags

participants (1)