[Serialization] why operator &()?

I've been looking at the serialization library (nice work, Robert) and I have a question. I searched the archives and didn't see any discussion on this, and I'm sorry I missed all the pre-approval discussion, or I would have said something sooner. What is the rationale behind using 'operator &()' to mean 'serialize this item?' I find a statement like 'ar & member1;' to be very non-intuitive: how does bitwise-and relate to serialization? It seems to me that most sensible code guidelines admonish against this kind of operator abuse (pardon my bluntness, but that's what it is when you get down to it). -- Jim

Jim Hyslop wrote:
I've been looking at the serialization library (nice work, Robert) and I have a question. I searched the archives and didn't see any discussion on this, and I'm sorry I missed all the pre-approval discussion, or I would have said something sooner.
What is the rationale behind using 'operator &()' to mean 'serialize this item?'
I find a statement like 'ar & member1;' to be very non-intuitive: how does bitwise-and relate to serialization?
It seems to me that most sensible code guidelines admonish against this kind of operator abuse (pardon my bluntness, but that's what it is when you get down to it).
What other operator would you suggest?
- "<<" is not good because the library can not just save, but also load data
- ">>" is not good for the same reason
- Writing a.serialize(member1).serialize(member2) is very inconvenient.
- Volodya

Vladimir Prus wrote:
What other operator would you suggest?
- "<<" is not good because the library can not just save, but also load data
- ">>" is not good for the same reason
- Writing
a.serialize(member1).serialize(member2)
is very inconvenient.
I agree. It may not be the most obvious but it allows a single serialize function rather than explicit save/load functions or chained calls as you've shown above. If someone thinks a different operator is more suitable then fine, but I like the single operator that is present at the moment. Cheers Russell
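A minimal sketch of the point made above: one operator lets a single serialize function describe both saving and loading. The names toy_oarchive, toy_iarchive, point and round_trip are hypothetical and written for this illustration only; they are not the Boost.Serialization classes, and the version parameter that the real library passes to serialize is left out to keep the sketch short.

```cpp
#include <cassert>
#include <sstream>

// Toy output archive (hypothetical): '&' writes the value as text.
class toy_oarchive {
    std::ostream& os_;
public:
    explicit toy_oarchive(std::ostream& os) : os_(os) {}
    template <class T>
    toy_oarchive& operator&(const T& value) {
        os_ << value << ' ';
        return *this;
    }
};

// Toy input archive (hypothetical): '&' reads text back into the value.
class toy_iarchive {
    std::istream& is_;
public:
    explicit toy_iarchive(std::istream& is) : is_(is) {}
    template <class T>
    toy_iarchive& operator&(T& value) {
        is_ >> value;
        return *this;
    }
};

struct point {
    int x = 0;
    int y = 0;
    // One function covers both directions; the archive type decides
    // whether each '&' saves or loads, so the member order cannot drift.
    template <class Archive>
    void serialize(Archive& ar) {
        ar & x & y;
    }
};

// Saves a point to a text buffer and loads it back via the same serialize().
inline point round_trip(point p) {
    std::stringstream buffer;
    toy_oarchive out(buffer);
    p.serialize(out);

    point restored;
    toy_iarchive in(buffer);
    restored.serialize(in);
    return restored;
}

// Small self-check of the round trip.
inline void check_round_trip() {
    point p;
    p.x = 3;
    p.y = 4;
    point r = round_trip(p);
    assert(r.x == 3 && r.y == 4);
}
```

With separate "<<" and ">>" operators, point would need two functions that list x and y in the same order; here that order is written once.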

Russell Hind <rh_gmane <at> mac.com> writes:
Vladimir Prus wrote:
- Writing
a.serialize(member1).serialize(member2)
is very inconvenient.
I agree. It may not be the most obvious but it allows a single serialize function rather than explicit save/load functions or chained calls as you've shown above.
So who says they have to be chained? I agree the chained calls above look rather ugly, but what's wrong with: a.serialize(member1); a.serialize(member2); Clear, concise, and consistent. -- Jim

Jim Hyslop wrote:
So who says they have to be chained? I agree the chained calls above look rather ugly, but what's wrong with:
a.serialize(member1); a.serialize(member2);
Clear, concise, and consistent.
I guess I was just used to a different archiving system that used << and >> so that it acted like streams, so the suggestion to replace both with a single & seemed logical and clean to me. I don't have any problem with the use of & in this context. FWIW, we don't chain calls together either, as all our code is for XML archives, which therefore has long make_nvp calls, but I still prefer the & to a .serialize call. Must be just personal preference. Cheers Russell

"Jim" == Jim Hyslop <jhyslop@dreampossible.ca> writes:
[...]
> I find a statement like 'ar & member1;' to be very non-intuitive: how does bitwise-and relate to serialization?
> It seems to me that most sensible code guidelines admonish against this kind of operator abuse (pardon my bluntness, but that's what it is when you get down to it).
Then how does bitwise-shift relate to I/O (iostream), or arithmetic addition relate to string concatenation (string), or modulo relate to formatting strings (boost.format)? What's wrong when two irrelevant domains accidentally choose the same notation?
Regards, Liu Jin

Liu Jin <cpp <at> vip.163.com> writes:
"Jim" == Jim Hyslop <jhyslop <at> dreampossible.ca> writes:
[...]
> I find a statement like 'ar & member1;' to be very non-intuitive: how does bitwise-and relate to serialization?
> It seems to me that most sensible code guidelines admonish against this kind of operator abuse (pardon my bluntness, but that's what it is when you get down to it).
Then how does bitwise-shift relate to I/O (iostream), or arithmetic addition relate to string concatenation (string),
I was waiting for someone to bring these up. They don't. The aforementioned coding guidelines often use these two overloads as prime examples of how *NOT* to overload operators. But, they have been in the language for so long that they are now part of the language's idiom. However, that does not make it "right" to perpetuate further operator overloads (note: I have not yet read Robert's reply in which he outlines the rationale behind the operator).
or modulo relate to formatting strings (boost.format)?
I haven't looked at Boost.format, but I suspect it's adopted from the well-known printf formatter.
What's wrong when two irrelevant domains accidentally choose the same notation?
Let me turn this back at you: what's wrong with using a word instead of a symbol? Sure, using a symbol may save a few keystrokes of typing. But, code is _read_ far more often than it is written, so the few seconds you save in typing are more than offset by the several seconds' confusion experienced by each and every programmer the first time they see this new and unusual usage (my first thought, actually, was that it was a declaration, then I thought "address-of", and then finally clicked in to operator overloading). -- Jim

"Jim" == Jim Hyslop <jhyslop@dreampossible.ca> writes:
> Liu Jin <cpp <at> vip.163.com> writes:
>> Then how does bitwise-shift relate to I/O (iostream), or arithmetic addition relate to string concatenation (string),
> I was waiting for someone to bring these up.
> They don't. The aforementioned coding guidelines often use these two overloads as prime examples of how *NOT* to overload operators. But, they have been in the language for so long that they are now part of the language's idiom.
Despite what those guidelines say, the two overloadings are easily received by newcomers. (I know because I'm a teaching assistant in a freshman C++ course.) People are not surprised by cout<<i or str1+str2, expecting those to mean shifting cout i bits left, or doing arithmetic addition on two strings. They just recognize it as something alien, and the moment you explain the new meaning, there's no further confusion.
> However, that does not make it "right" to perpetuate further operator overloads
Correct. But they are in the same category of operator overloading: overloading an operator in a completely different context, to have a completely different meaning from the built-in one. The rationale behind this kind of overloading is similar.
>> What's wrong when two irrelevant domains accidentally choose the same notation?
> Let me turn this back at you: what's wrong with using a word instead of a symbol? Sure, using a symbol may save a few keystrokes of typing. But, code is _read_ far more often than it is written, so the few seconds you save in typing are more than offset by the several seconds' confusion experienced by each and every programmer the first time they see this new and unusual usage (my first thought, actually, was that it was a declaration, then I thought "address-of", and then finally clicked in to operator overloading).
Nothing wrong. Just a little more readable to those new to the code, and a little less readable to those familiar with it (long words require more mental power to process, exactly the reason 1+2*3 is more readable than one plus two multiply three). I guess code is read far more often by those who know it than by those who don't. So after all it's just a minor design issue. Had Robert chosen the other path, say, serialize(ar, obj), I'd raise no objection. BTW, `serialize' is not the perfect choice here. It kinda makes me wonder: where are the corresponding deserialize calls? ;-)
Regards, Liu Jin

Here is the background/history.

The original inspiration for the serialization library was my experience with the serialization system in Microsoft's MFC library. I liked it but wanted to make it more generalized. I came upon Boost while looking for a multi-threading "wrapper" which would "guarantee" thread safety if one followed the wrapper's rules. I was very impressed with the thought and rigor of the Boost threading library. Hence I became interested in Boost. At that time there was a small "persistence" project by Jens Maurer. It had the beginnings of what I wanted a serialization library to do. I was convinced that to be really useful, a serialization library had to do just about everything. (Usually, I disagree with such a notion - but this case was an exception.)

Jens's library had the & operator - which at first surprised and confused me. I was just starting to understand the usage of templates beyond simple parameterized class declarations/definitions and it took me a little while to actually see what the & operator was doing in this case. Of course it eventually became clear to me and I embraced it. It provides one huge benefit: it guarantees that saving and loading are symmetric for most usages of the library. This seems like a small thing. But tracking down obscure asymmetries in binary archives turns out to be surprisingly time consuming. So the existence of the & operator, along with XML archives which can detect asymmetric serialization implementations, has effectively eliminated this problem.

To summarize, I included the & operator because Jens did, and I came to like the idea. The question remained - why & rather than some other operator? From strictly aesthetic considerations I would have preferred && because it looks like << and >>. But I elected not to change it. I suspected that the & operator was chosen due to its operator precedence being at a particular spot in the operator precedence hierarchy, and I never had any problems with things like ar & m1 & m2 ... Having many wheels to invent, I was pleased to presume that this choice was a studied one and a good one, so I accepted it and moved on to fry my own fish.

So that's how we got to where we are. It has gained, up to now, universal acceptance, so I don't see this changing. Of course, its usage is optional. Also, you're free to define your own global function like serialize(ar, data) which would be equivalent and suit your own taste. The usage of << and >> is my own and is obviously analogous to the concept of using << and >> for stream output / input respectively. So, for better or worse, that's how we arrived here.

Robert Ramey

Jim Hyslop wrote:
I've been looking at the serialization library (nice work, Robert) and I have a question. I searched the archives and didn't see any discussion on this, and I'm sorry I missed all the pre-approval discussion, or I would have said something sooner.
What is the rationale behind using 'operator &()' to mean 'serialize this item?'
I find a statement like 'ar & member1;' to be very non-intuitive: how does bitwise-and relate to serialization?
It seems to me that most sensible code guidelines admonish against this kind of operator abuse (pardon my bluntness, but that's what it is when you get down to it).
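Robert's remark that one is free to define a global function equivalent to the & operator can be sketched roughly as follows. This is only an illustration under stated assumptions: toy_oarchive, transfer, employee and save_employee are hypothetical names invented here, not part of Boost.Serialization, and the wrapper is deliberately not called serialize so that it is not mistaken for any function the real library defines.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy output archive (hypothetical, not a Boost class).
struct toy_oarchive {
    std::ostringstream out;
    template <class T>
    toy_oarchive& operator&(const T& value) {
        out << value << ';';
        return *this;
    }
};

// Hypothetical named wrapper: reads as a word, forwards to operator&.
// It works with any archive type that provides operator&.
template <class Archive, class T>
void transfer(Archive& ar, T& item) {
    ar & item;
}

struct employee {
    std::string name;
    int id;
    // Written with the named wrapper only; no '&' appears in user code.
    template <class Archive>
    void serialize(Archive& ar) {
        transfer(ar, name);
        transfer(ar, id);
    }
};

// Serializes an employee and returns the text the toy archive produced.
inline std::string save_employee(employee e) {
    toy_oarchive ar;
    e.serialize(ar);
    return ar.out.str();
}

// Small self-check: name and id are written in declaration order.
inline void check_save_employee() {
    employee e{"ann", 7};
    assert(save_employee(e) == "ann;7;");
}
```

Because the wrapper only forwards, it keeps the property Robert describes: the same call saves with an output archive and would load with an input archive.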

Robert Ramey <ramey <at> rrsd.com> writes:
Here is the background/history.
[...]
Jens's library had the & operator - which at first surprised and confused me.
My point exactly - has this group forgotten the Principle of Least Astonishment?
I was just starting to understand the usage of templates beyond simple parameterized class declarations/definitions and it took me a little while to actually see what the & operator was doing in this case.
Templates are a red herring here - the use of operator & in this case is independent of templates.
[...]
I suspected that the & operator was chosen due to its operator precedence being at a particular spot in the operator precedence hierarchy[...]
Except that operator precedence only applies to "built-in" operators - user defined operators are function calls, and have that level of precedence.
The usage of << and >> are my own and are obviously analog to the concept of using << and >> for stream input / output respectively.
These I understand and agree with, because the analogy is well-known and understood.
So, for better or worse, that's how we arrived here.
Thanks for the explanation. Now that the library has been accepted, I'm not expecting this will change (at least not in the near future), I just wanted to know the rationale behind it. I'll have to pay more attention in the future ;=)
-- Jim

Jim Hyslop wrote:
Robert Ramey <ramey <at> rrsd.com> writes:
Templates are a red herring here - the use of operator & in this case is independent of templates.
Maybe - but it's not independent of the type. The key feature of the & operator in this instance is that its meaning (save vs. load) changes depending on the type of archive it's applied to. Generally I dislike and avoid this kind of invisible mode shifting. But in this case it has proved more useful than error-prone.
I suspected that the & operator was chosen due to its operator precedence being at a particular spot in the operator precedence hierarchy[...]
Except that operator precedence only applies to "built-in" operators - user defined operators are function calls, and have that level of precedence.
I realize this. But once one decides he wants to overload an operator, one has to pick an operator with consideration of its precedence. So that's the main reason I stuck with it.
The usage of << and >> are my own and are obviously analog to the concept of using << and >> for stream input / output respectively.
These I understand and agree with, because the analogy is well-known and understood.
Who knows, maybe someday & will be added to the above.
So, for better or worse, that's how we arrived here.
Thanks for the explanation. Now that the library has been accepted, I'm not expecting this will change (at least not in the near future), I just wanted to know the rationale behind it. I'll have to pay more attention in the future ;=)
I don't think coming in earlier would have changed this. The usage of the & operator for this purpose seems to have been pretty well received. And of course it's not obligatory. One who believes it's a detriment can just avoid its usage. That is why is_saving and is_loading are included in the archive interface. Robert Ramey
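The idea of avoiding & by asking the archive for its direction might look roughly like the sketch below. It assumes toy archives whose is_saving and is_loading members are typedefs for std::true_type / std::false_type; the real archives expose members with these names, but their exact types are not shown in this thread, so treat that detail as an assumption. toy_oarchive, toy_iarchive, temperature, transfer and round_trip are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <type_traits>

// Toy archives (hypothetical). Each advertises its direction through
// is_saving / is_loading typedefs and offers only a named operation.
class toy_oarchive {
    std::stringstream& stream_;
public:
    typedef std::true_type  is_saving;
    typedef std::false_type is_loading;
    explicit toy_oarchive(std::stringstream& s) : stream_(s) {}
    template <class T> void save(const T& v) { stream_ << v << ' '; }
};

class toy_iarchive {
    std::stringstream& stream_;
public:
    typedef std::false_type is_saving;
    typedef std::true_type  is_loading;
    explicit toy_iarchive(std::stringstream& s) : stream_(s) {}
    template <class T> void load(T& v) { stream_ >> v; }
};

struct temperature {
    double celsius = 0.0;

    template <class Archive>
    void serialize(Archive& ar) {
        // Tag dispatch: pick an overload from the archive's direction,
        // so only the matching branch is ever instantiated.
        transfer(ar, typename Archive::is_saving());
    }

    template <class Archive>
    void transfer(Archive& ar, std::true_type /*saving*/) { ar.save(celsius); }

    template <class Archive>
    void transfer(Archive& ar, std::false_type /*loading*/) { ar.load(celsius); }
};

// Saves a temperature to text and loads it back without using '&'.
inline temperature round_trip(temperature t) {
    std::stringstream buffer;
    toy_oarchive out(buffer);
    t.serialize(out);

    temperature restored;
    toy_iarchive in(buffer);
    restored.serialize(in);
    return restored;
}

// Small self-check of the explicit save/load path.
inline void check_temperature_round_trip() {
    temperature t;
    t.celsius = 21.5;
    assert(round_trip(t).celsius == 21.5);
}
```

The trade-off is the one discussed earlier in the thread: the explicit form is easier to read for newcomers, but the save and load paths are now written separately and can fall out of step.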

Jim Hyslop <jhyslop@dreampossible.ca> writes:
Robert Ramey <ramey <at> rrsd.com> writes:
Here is the background/history.
[...]
Jens's library had the & operator - which at first surprised and confused me.
My point exactly - has this group forgotten the Principle of Least Astonishment?
There's always a tension between innovation and familiarity.
[...]
I suspected that the & operator was chosen due to its operator precedence being at a particular spot in the operator precedence hierarchy[...]
Except that operator precedence only applies to "built-in" operators - user defined operators are function calls, and have that level of precedence.
No, that's wrong. In fact, IIRC, the C++ built-in operator bindings can't be described correctly in terms of operator precedence, but you can get pretty close. And to the extent you can get close, user-defined operators bind exactly as the builtin operators of the same names would. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Jim Hyslop <jhyslop@dreampossible.ca> writes:
Except that operator precedence only applies to "built-in" operators - user defined operators are function calls, and have that level of precedence.
No, that's wrong.
You are correct, of course. 13.5 paragraph 6 states it quite clearly: "It is not possible to change the precedence, grouping, or number of operands of operators." My apologies for the inaccuracy. -- Jim

Jim Hyslop wrote:
Robert Ramey <ramey <at> rrsd.com> writes:
[...]
I suspected that the & operator was chosen due to its operator precedence being at a particular spot in the operator precedence hierarchy[...]
Except that operator precedence only applies to "built-in" operators - user defined operators are function calls, and have that level of precedence.
Really? I thought the operator precedence was implemented by grammar rules. How does the grammar know whether an operator is builtin or user defined? I am pretty sure that a+b*c means a+(b*c) irrespective of whether a,b,c are builtin or UDT. Cheers, Ian
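The point that overloaded operators keep the grouping of their built-in counterparts can be checked with a small sketch. The type expr is hypothetical: each overloaded operator just records, with parentheses, how its operands were grouped.

```cpp
#include <cassert>
#include <string>

// Hypothetical type that records how an expression was grouped.
struct expr {
    std::string text;
};

// Builds "(lhs OP rhs)" so the parse tree becomes visible as text.
inline expr combine(const expr& a, const char* op, const expr& b) {
    return expr{"(" + a.text + op + b.text + ")"};
}

inline expr operator+(const expr& a, const expr& b)  { return combine(a, "+", b); }
inline expr operator*(const expr& a, const expr& b)  { return combine(a, "*", b); }
inline expr operator&(const expr& a, const expr& b)  { return combine(a, "&", b); }
inline expr operator<<(const expr& a, const expr& b) { return combine(a, "<<", b); }

// Small self-check of the three groupings discussed in the thread.
inline void check_grouping() {
    expr a{"a"}, b{"b"}, c{"c"};
    // '*' binds tighter than '+', exactly as for built-in types.
    assert((a + b * c).text == "(a+(b*c))");
    // '&' groups left to right, which is what 'ar & m1 & m2' relies on.
    assert((a & b & c).text == "((a&b)&c)");
    // '<<' binds tighter than '&', again as for built-in types.
    assert((a & b << c).text == "(a&(b<<c))");
}
```

So the choice of operator does fix how chained and mixed expressions parse, which is why precedence is a legitimate consideration when picking one to overload.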
participants (7)
- David Abrahams
- Ian McCulloch
- Jim Hyslop
- Liu Jin
- Robert Ramey
- Russell Hind
- Vladimir Prus