Re: [boost] [optional_io] InputStreamable refinement

I just heard back from Robert, and I figure it's OK to forward this information:

On Feb 26, 2009, at 10:00 AM, Robert Ramey wrote:
The serialization library does not depend upon optional or its implementation. If you change the implementation of optional, just make sure the changes are consistent with the requirements of the serialization library.
The fact that the serialization of optional is implemented and tested as part of the serialization library is an anomaly. In fact, this code and test should be moved into the optional library.
I checked the optional test in the serialization library and it doesn't seem to even include optional_io.hpp. And given Robert's comments, it seems like it would be OK to make changes to optional_io.hpp.

With respect to the change I suggested for operator>>() (I think in my last post I typoed and wrote operator<<(), I always get those silly things backwards!): in the case where we fail to extract optional<T>::value_type, I'm not sure what would be the best thing to do with respect to the position of the get pointer or other state bits. I had suggested this:

    if (in.fail () && !in.eof ()) {
        in.clear ();
        in.get ();
    }

The intent of calling get() there is to enable this use case:

    stringstream sin ("one 2 three 4 five 6");
    optional<int> i;
    while (sin >> i) {
        if (i) cout << "found int: " << i << endl;
    }
    // which finds 2, 4, and 6

But why move the get pointer up by one? Why one? Does this make sense? I can't provide a reasoning as to why it does, if it does. So another possibility is this:

    if (in.fail () && !in.eof ()) {
        in.clear ();
        in.setstate (std::ios::eofbit);
    }

This has the added benefit of making the following possible w/o any code change to lexical_cast:

    // doesn't throw, returns 0
    lexical_cast<optional<unsigned> > ("-1").get_value_or (0);

And if it's well documented that extracting an uninitialized optional<T> sets the eofbit on the istream, then a prudent user could (if they really wanted to) reformulate the first use case as:

    stringstream sin ("one 2 three 4 five 6");
    optional<int> i;
    while (sin >> i) {
        if (i) cout << "found int: " << i << endl;
        else {
            sin.clear ();
            sin.get ();
        }
    }

This leaves the why of moving the get pointer up to the user, where I think it really belongs, because I can imagine some use cases where you'd want to do something other than just call get(). For example, maybe you'd want to call getline() and ignore an entire line before trying to extract another optional<T>. Or maybe you'd want to read until you found some delimiter before trying again.

--
Andrew Troschinetz
Applied Research Laboratories
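The eofbit variant and the user-side skip loop described above can be tried out with a small sketch. It rests on assumptions: std::optional (C++17) stands in for boost::optional, and extract() and find_ints() are hypothetical helper names rather than anything in optional_io.hpp.

```cpp
#include <cassert>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for operator>>(istream&, optional<int>&) under the
// eofbit proposal. std::optional replaces boost::optional (assumption).
inline std::istream& extract(std::istream& in, std::optional<int>& v)
{
    int x = 0;
    if (in >> x) {
        v = x;
    } else {
        v.reset();
        // A parse failure that is not a real end of input is reported
        // through eofbit instead of failbit.
        if (in.fail() && !in.eof()) {
            in.clear();
            in.setstate(std::ios::eofbit);
        }
    }
    return in;
}

// The caller decides how to skip unparsable input: here, one character
// at a time, as in the reformulated use case.
inline std::vector<int> find_ints(const std::string& text)
{
    std::istringstream sin(text);
    std::vector<int> found;
    std::optional<int> i;
    while (extract(sin, i)) {
        if (i) {
            found.push_back(*i);
        } else {
            sin.clear();
            sin.get();
        }
    }
    return found;
}
```

Calling find_ints("one 2 three 4 five 6") collects 2, 4 and 6. The loop ends at the real end of input because failbit is left set there.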

Andrew Troschinetz <ast <at> arlut.utexas.edu> writes:
On Feb 26, 2009, at 10:00 AM, Robert Ramey wrote:
The fact that the serialization of optional is implemented and tested as part of the serialization library is an anomaly. In fact, this code and test should be moved into the optional library.
Serialization and optional_io have different approaches. We refuse to read/write an uninitialized object, while serialization encodes whether an object is initialized or not.
But why move the get pointer up by one? Why one? Does this make sense? I can't provide a reasoning as to why it does, if it does.
Please don't guess about iostream implementation details. All we need is consistency with the underlying type; that is, given

    T t = /* ...*/ ;
    optional<T> o(t);

the expressions below must exhibit the same runtime behavior:

    cout << t << '\n';
    cout << o << '\n';

I have a sketch of a new implementation:

--- boost/optional/optional_io.hpp.orig 2009-03-02 17:53:09.000000000 +0900
+++ boost/optional/optional_io.hpp 2009-03-02 19:18:28.000000000 +0900
@@ -42,11 +42,18 @@
 operator<<(std::basic_ostream<CharType, CharTrait>& out, optional<T> const& v)
 #endif
 {
-  if ( out.good() )
+  if(v)
+    out << *v;
+  else
   {
-    if ( !v )
-      out << "--" ;
-    else out << ' ' << *v ;
+    // Create a sentry for side effects (e.g. out.tie()->flush())
+#if defined (BOOST_NO_TEMPLATED_STREAMS)
+    std::ostream::sentry guard(out);;
+#else
+    typename std::basic_ostream<CharType, CharTrait>::sentry guard(out);
+#endif
+    if(guard)
+      out.setstate(std::ios_base::failbit);
   }

   return out;
@@ -62,17 +69,18 @@
 operator>>(std::basic_istream<CharType, CharTrait>& in, optional<T>& v)
 #endif
 {
-  if ( in.good() )
+  if(!v)
+    v = T(); // TODO: what if it throws?
+
+  try
+  {
+    if( !(in >> *v) )
+      v = optional<T>();
+  }
+  catch(std::ios_base::failure)
   {
-    int d = in.get();
-    if ( d == ' ' )
-    {
-      T x ;
-      in >> x;
-      v = x ;
-    }
-    else
-      v = optional<T>() ;
+    v = optional<T>();
+    throw;
   }

   return in;

Alex
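The output half of this proposal can be checked in isolation with a short sketch. The assumptions are that std::optional (C++17) stands in for boost::optional and that put() and written() are hypothetical names replacing the real operator<<.

```cpp
#include <cassert>
#include <optional>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical free function standing in for operator<<(ostream&, optional<T>).
template <class T>
std::ostream& put(std::ostream& out, const std::optional<T>& v)
{
    if (v) {
        out << *v;  // same characters as writing the bare T
    } else {
        // The sentry keeps the usual side effects (e.g. flushing a tied stream).
        std::ostream::sentry guard(out);
        if (guard)
            out.setstate(std::ios_base::failbit);  // refuse to write an empty optional
    }
    return out;
}

// Helper for inspection: returns what was written and whether failbit was set.
inline std::string written(const std::optional<int>& v, bool& failed)
{
    std::ostringstream out;
    put(out, v);
    failed = out.fail();
    return out.str();
}
```

With this sketch an engaged optional<int> holding 7 produces exactly "7", while an empty one writes nothing and leaves the stream in a failed state.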

Hi Andrew,

I was in the middle of the pre-vacation rush when this thread started. Now I'm back from vacation, so I can finally take a serious look into the subject :)

Your requirement is to be able to do something like this:

    istringstream in ("test");
    optional<string> s;
    in >> s;
    assert (s);

It fails with the current "optional_io.hpp" implementation because it uses an explicit encoding to indicate the initialization state in the code as a way to guarantee a closed roundtrip conversion from and to an uninitialized optional *without any dependency on the encoding used by the underlying type*.

That is, the current optional_io.hpp is explicitly not designed to support your use case. Instead, it is designed to support the following use case:

    optional<T> in = -value_or_empty-
    stream << in ;
    optional<T> out ;
    stream >> out ;
    assert( in == out ) ;
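The closed round trip with an explicit marker can be sketched as follows. The logic mirrors the lines that the patch quoted earlier in this thread removes ("--" for an empty optional, a leading space before a value). std::optional stands in for boost::optional, and put_marked(), get_marked() and roundtrip() are hypothetical names.

```cpp
#include <cassert>
#include <istream>
#include <optional>
#include <ostream>
#include <sstream>

// Write "--" for an empty optional, or a space followed by the value.
template <class T>
std::ostream& put_marked(std::ostream& out, const std::optional<T>& v)
{
    if (out.good()) {
        if (!v) out << "--";
        else    out << ' ' << *v;
    }
    return out;
}

// Read one marker character: a space means a value follows, anything
// else is treated as an empty optional.
template <class T>
std::istream& get_marked(std::istream& in, std::optional<T>& v)
{
    if (in.good()) {
        int d = in.get();
        if (d == ' ') {
            T x{};
            in >> x;
            v = x;
        } else {
            v.reset();
        }
    }
    return in;
}

// Insert one optional and extract it again from the same stream.
template <class T>
std::optional<T> roundtrip(const std::optional<T>& in_value)
{
    std::stringstream stream;
    put_marked(stream, in_value);
    std::optional<T> out_value;
    get_marked(stream, out_value);
    return out_value;
}
```

The sketch only checks a single optional per stream. As written, the reader consumes just one character of the "--" marker, so it says nothing about sequences of optionals.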
Andrew Troschinetz <ast <at> arlut.utexas.edu> writes:
On Feb 26, 2009, at 10:00 AM, Robert Ramey wrote:
The fact that the serialization of optional is implemented and tested as part of the serialization library is an anomaly. In fact, this code and test should be moved into the optional library.
Serialization and optional_io have different approaches.
Considering the current "optional_io.hpp", not really.
We refuse to read/write an uninitialized object
Which is a reasonable requirement in the context of lexical_cast.
while serialization encodes whether an object is initialized or not.
And so does "optional_io.hpp".
But why move the get pointer up by one? Why one? Does this make sense? I can't provide a reasoning as to why it does, if it does.
Please don't guess about iostream implementation details. All we need is consistency with the underlying type
That's not all... you also need a way to detect an uninitialized value. Your proposal uses extraction failure for that because it works perfectly fine in the context of lexical_cast, since the stream contains exclusively the given optional (possibly empty) and nothing else. But if the stream can contain other items then you need an explicit marker. I don't think both requirements are compatible. Consider the following:

    stringstream stream ;
    optional<double> ind ; // <= empty
    optional<string> ins("123");
    stream << ind << ins ;
    optional<double> outd ;
    optional<string> outs ;
    stream >> outd >> outs ;
    assert( ind == outd ) ;
    assert( ins == outs ) ;

That would fail using your implementation because the stream would only contain 123, which just coincidentally represents a valid double, so at extraction the string, instead of the double, will be empty.

This is what happens when failure to extract is used to indicate an empty optional. This deeply depends on the encoding details of the underlying type. For example, in one of Andrew's use cases, there is a sequence of numbers and not-numbers. Just incidentally, since not-numbers fail to extract as numbers, that works.

I recall very well now having this very same discussion in the distant past. It used to be as you want it, but for the sake of the serialization library I chose the current semantics, precisely because, as Robert indicated, the serialization library doesn't explicitly depend on Boost.Optional. Without any explicit dependency, optional<> *itself* needs to provide roundtrip IO functions not depending on EOF or the encoding details of the underlying type for uninitialized optionals.

I implemented the IO operators in a separate header precisely to allow users to provide the other semantics, but that of course doesn't work if such a header ends up in another general utility like lexical_cast or the serialization headers.

Off the top of my head, I think the best course of action would be for you to provide your implementation in "optional_io.hpp" but within a separate namespace, like lexical_cast_detail or so, keeping the current operators untouched. Then within the lexical_cast function a "using namespace lexical_cast_detail;" would resolve to the proper IO operators.

HTH
--
Fernando Cacciola
SciSoft Consulting
http://www.scisoft-consulting.com
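The extraction side of that counter-example can be reproduced with a short sketch. It assumes, as the message describes, that the stream ends up containing only 123. std::optional stands in for boost::optional, and get_or_empty(), Extracted and read_double_then_string() are hypothetical names.

```cpp
#include <cassert>
#include <istream>
#include <optional>
#include <sstream>
#include <string>

// Failure-based semantics: a failed extraction yields an empty optional.
template <class T>
std::optional<T> get_or_empty(std::istream& in)
{
    T x{};
    if (in >> x) return x;
    return std::nullopt;
}

struct Extracted {
    std::optional<double> d;
    std::optional<std::string> s;
};

// Extract an optional<double> and then an optional<string> from the
// given stream contents, in that order.
inline Extracted read_double_then_string(const std::string& contents)
{
    std::istringstream stream(contents);
    Extracted r;
    r.d = get_or_empty<double>(stream);
    r.s = get_or_empty<std::string>(stream);
    return r;
}
```

For the contents "123" the double comes back engaged and the string comes back empty, which is the reverse of what was inserted in the example above.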

Fernando Cacciola <fernando.cacciola <at> gmail.com> writes:
[skip] That is, the current optional_io.hpp is explicitly not designed to support your use case. Instead, it is designed to support the following use case:
optional<T> in = -value_or_empty-
stream << in ;
optional<T> out ; stream >> out ;
assert( in == out ) ;
How can you guarantee this if not all underlying types guarantee this? E.g.

    string in("two words");
    stream << in;
    string out;
    stream >> out;
    assert( in != out ) ;
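This counter-example is easy to check in isolation. The sketch below wraps it in a hypothetical helper, roundtrip_string(), and uses nothing beyond the standard string stream.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Insert a string into a stream and extract it again with operator>>.
inline std::string roundtrip_string(const std::string& in_value)
{
    std::stringstream stream;
    stream << in_value;
    std::string out_value;
    stream >> out_value;  // formatted extraction stops at the first whitespace
    return out_value;
}
```

Here roundtrip_string("two words") gives back only "two", so the round trip is not closed for std::string itself, with or without optional.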
That's not all... you also need a way to detect an uninitialized value. Your proposal uses extraction failure for that because it works perfectly fine in the context of lexical_cast, since the stream contains exclusively the given optional (possibly empty) and nothing else.
I don't care about lexical_cast here, my implementation is based on reading Boost.Optional documentation: [--- optional<T> intends to formalize the notion of initialization (or lack of it) allowing a program to test whether an object has been initialized and stating that access to the value of an uninitialized object is undefined behavior. That is, when a variable is declared as optional<T> and no initial value is given, the variable is formally uninitialized. A formally uninitialized optional object has conceptually no value at all and this situation can be tested at runtime. It is formally undefined behavior to try to access the value of an uninitialized optional. An uninitialized optional can be assigned a value, in which case its initialization state changes to initialized. Furthermore, given the formal treatment of initialization states in optional objects, it is even possible to reset an optional to uninitialized. ---]
From "it is formally undefined behavior to try to access the value of an uninitialized optional" I conclude that it is formally undefined behavior to output an uninitialized optional.
[skip] I recall very well now having this very same discussion in the distant past. It used to be as you want it, but for the sake of the serialization library I chose the current semantics, precisely because, as Robert indicated, the serialization library doesn't explicitly depend on Boost.Optional.
Was it during review or after? Do you have a link to this discussion? Thanks, Alex

Hi Alexander,
Fernando Cacciola <fernando.cacciola <at> gmail.com> writes:
[skip] That is, the current optional_io.hpp is explicitly not designed to support your use case. Instead, it is designed to support the following use case:
optional<T> in = -value_or_empty-
stream << in ;
optional<T> out ; stream >> out ;
assert( in == out ) ;
How can you guarantee this if not all underlying types guarantee this?
I can't of course, and I wouldn't even try. For instance, your counter-example fails with both the current and your proposed implementation. Or are you saying that, since supporting such a requirement is not entirely possible since in the end it is up to T, it isn't worth doing the best optional<T> itself can?
I don't care about lexical_cast here
OK, but you two clearly care for extracting an optional<T> from a stream that was inserted a bare T instead of an optional<T>. The current implementation is just not intended to support that.
my implementation is based on reading Boost.Optional documentation:
I figured. This is once again the now classic dichotomy between optional<T> as a super T or as a singleton container of T. This always raises endless discussions simply because both views are reasonable.
on Boost.Optional.
Was it during review or after?
Long way after the review of Optional, when lots of people started using it.
Do you have a link to this discussion?
Not that I can find real quick.

I personally never ever needed to extract an optional<T> where a bare T (instead of an optional<T>) was inserted, so the current semantics just worked for me.

So, let me step back a bit... Why do you (Andrew and you Alexander) need this? With the current implementation you can certainly correctly extract the correct output provided the stream was inserted an optional<T>, subject of course to the details of T (as exposed in your counter-example). Why isn't this enough practically speaking rather than theoretically?

Best
--
Fernando Cacciola
SciSoft Consulting
http://www.scisoft-consulting.com

Fernando Cacciola <fernando.cacciola <at> gmail.com> writes:
[skip] Or are you saying that, since supporting such a requirement is not entirely possible since in the end it is up to T, it isn't worth doing the best optional<T> itself can?
Yes, that's what I'm saying except that I tend to disagree with the word "best" in this part of your sentence "doing the best optional<T> itself can" :) I believe, based on my reading of documentation, that my proposal is more correct.
I don't care about lexical_cast here
OK, but you two clearly care for extracting an optional<T> from a stream that was inserted a bare T instead of an optional<T>.
You're right. The current version of Boost doesn't support this:

    lexical_cast< optional<int> >(0)

but I'm going to make a special case when TargetType is optional<T> to return a default value rather than throwing an exception:

    lexical_cast< optional<int> >("not a number").get_value_or(0);

Since it's a special case, I can easily avoid using optional_io.hpp.

Alex
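A conversion that yields an empty optional instead of throwing could look roughly like this. It is only a sketch under assumptions: std::optional stands in for boost::optional, parse_optional() is a hypothetical name, and this is not how lexical_cast is implemented.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Convert text to T, returning an empty optional instead of throwing.
// Hypothetical helper, not part of lexical_cast.
template <class T>
std::optional<T> parse_optional(const std::string& text)
{
    std::istringstream in(text);
    T value{};
    if (!(in >> value))
        return std::nullopt;  // extraction failed
    if (in.peek() != std::char_traits<char>::eof())
        return std::nullopt;  // trailing characters were left unread
    return value;
}
```

With this helper, parse_optional<int>("not a number").value_or(0) evaluates to 0, where a plain lexical_cast<int> would throw bad_lexical_cast.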

Alexander Nasonov wrote:
Fernando Cacciola <fernando.cacciola <at> gmail.com> writes:
[skip] Or are you saying that, since supporting such a requirement is not entirely possible since in the end it is up to T, it isn't worth doing the best optional<T> itself can?
Yes, that's what I'm saying except that I tend to disagree with the word "best" in this part of your sentence "doing the best optional<T> itself can" :) I believe, based on my reading of documentation, that my proposal is more correct.
And as I said, based on experience (*), arguing on which one is more correct would quickly wind up in an endless discussion as the number of participants goes up.

I could or could not agree with you now, but only to have someone else arguing for the current semantics in the near future.

So I'm much more interested in which one is more generally useful to the most users.

Robert's reply seems to indicate that the serialization library doesn't require the current semantics... Is that so? Can this be verified? Only if so can we discuss whether the current semantics should be replaced.

(*) This month Boost.Optional (well, Boost 1.30.0) is turning 6 :)

Best
--
Fernando Cacciola
SciSoft Consulting
http://www.scisoft-consulting.com

Fernando Cacciola <fernando.cacciola <at> gmail.com> writes:
And as I said, based on experience (*), arguing on which one is more correct would quickly wind up in an endless discussion as the number of participants goes up.
If you request a fast track review, it would set time limits on discussions.
I could or could not agree with now, but only to have someone else arguing for the current semantics in the near future.
So I'm much more interested in which one is more generally useful to the most users.
OK, let's look at it from a different angle. The documentation compares optional<T> with variant<T,nil_t>. The latter is not InputStreamable even if you make nil_t InputStreamable. Boost.Variant is only OutputStreamable. Users seem to be happy (but I'm not a maintainer to say for sure).
Robert's reply seems to indicate that the serialization library doesn't require the current semantics.. is that so? can this be verified?
    $ grep -rl optional_io boost/serialization/
    <no matches>

BTW, serialization implements its own Boost.Variant serialization.

Alex

Fernando Cacciola wrote:
Or are you saying that, since supporting such a requirement is not entirely possible since in the end it is up to T, it isn't worth doing the best optional<T> itself can?
I think so. (We have Boost.Serialization for that)
I personally never ever needed to extract an optional<T> where a bare T (instead of an optional<T>) was inserted, so the current semantics just worked for me.
How are you using operator>> with optional<T>? I can't imagine any common use case other than lexical_cast.
So, let me step back a bit...
Why do you (Andrew and you Alexander) need this?
With the current implementation you can certainly correctly extract the correct output provided the stream was inserted an optional<T>, subject of course to the details of T (as exposed in your counter-example).
Btw, the magic space and '--' string are not documented.
Why isn't this enough practically speaking rather than theoretically?
The current semantics is useless for me.

Hi Ilya,
Fernando Cacciola wrote:
Or are you saying that, since supporting such a requirement is not entirely possible since in the end it is up to T, it isn't worth doing the best optional<T> itself can?
I think so. (We have Boost.Serialization for that)
But I had always believed the current semantics were precisely to support Boost.Serialization... Apparently I've been wrong all along... Since I never used it I never became familiar with it.
I personally never ever needed to extract an optional<T> where a bare T (instead of an optional<T>) was inserted, so the current semantics just worked for me.
How are you using operator>> with optional<T>? I can't imagine any common use case other than lexical_cast.
Simply not using Boost.Serialization but other stuff that nevertheless relies on IO operators to do a closed roundtrip conversion.
So, let me step back a bit...
Why do you (Andrew and you Alexander) need this?
With the current implementation you can certainly correctly extract the correct output provided the stream was inserted an optional<T>, subject of course to the details of T (as exposed in your counter-example).
Btw, the magic space and '--' string are not documented.
Hmmm, that really doesn't look right ;)
Why isn't this enough practically speaking rather than theoretically?
The current semantics is useless for me.
OK. Use case counted :) Best -- Fernando Cacciola SciSoft Consulting http://www.scisoft-consulting.com
participants (4)
- Alexander Nasonov
- Andrew Troschinetz
- Fernando Cacciola
- Ilya Sokolov