Complex type failure with boost::lexical_cast

Hi, check out this code:

    #include <boost/lexical_cast.hpp>
    #include <iostream>

    struct my_type
    {
        int a;
        int b;
    };

    template<typename C, typename T>
    std::basic_istream<C,T> &operator>>(std::basic_istream<C,T> &is, my_type &mt)
    {
        is >> mt.a >> mt.b;
        return is;
    }

    int main()
    {
        std::string input = "1 2";
        my_type mt;
        try
        {
            mt = boost::lexical_cast<my_type>(s);
            std::cout << "TEST 1: " << mt.a << " " << mt.b << std::endl;
        }
        catch (boost::bad_lexical_cast &)
        {
            std::cout << "TEST 1 FAILED" << std::endl;
        }

        input = "3 4";
        std::stringstream ss;
        ss << s;
        ss >> mt >> std::ws;
        if (ss.eof())
            std::cout << "TEST 2: " << mt.a << " " << mt.b << std::endl;
        else
            std::cout << "TEST 1 FAILED" << std::endl;
        return 0;
    }

The output I get is:

    TEST 1 FAILED
    TEST 2: 3 4

My investigation has shown that this occurs because boost::lexical_cast calls stream.unsetf(std::ios::skipws);

My question is, why does it do this - as this specifically stops lexical_cast from working with complex types. So could this be an option, or completely disabled, or made an option (somehow) to lexical_cast? I understand some logic to not skipping whitespace, however I believe its use is limited - especially since there is a specialization for std::string and std::wstring already which does not even use operator>>.

As a side note, yes, I could change my operator>>, however the sample operator>> above is typical for an operator>> for a complex class, almost always they assume that skipws is turned on, and don't bother checking.

-- PreZ :) Founder. The Neuromancy Society (http://www.neuromancy.net)
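The reported behaviour can be reproduced with the standard library alone. The following is a minimal sketch, not the Boost implementation: `pair_type` and `parse_pair` are hypothetical names introduced here, and the only thing taken from the post is that skipws is cleared before extraction.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for my_type from the post above.
struct pair_type { int a; int b; };

// Same shape as the operator>> in the post: it relies on skipws being set
// to get past the space between the two integers.
std::istream &operator>>(std::istream &is, pair_type &p) {
    return is >> p.a >> p.b;
}

// Hypothetical helper. With strip_skipws == true the stream is prepared the
// way the post says lexical_cast prepares it; otherwise it keeps the
// defaults of a freshly constructed stringstream.
bool parse_pair(const std::string &text, bool strip_skipws, pair_type &out) {
    std::stringstream ss;
    if (strip_skipws)
        ss.unsetf(std::ios::skipws);
    ss << text;
    ss >> out;
    return !ss.fail();
}
```

With the default flags, "1 2" parses into both fields. With skipws cleared, the first integer is read but the second extraction stops at the space and sets failbit, which is the failure reported as TEST 1.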

From: "Preston A. Elder" <prez@neuromancy.net>
"s" is used a couple of places in the code, where I assume "input" should be used instead (or it gives an error when compiling). Apart from that, I get the same result.
My investigation has shown that this occurs because boost::lexical_cast calls stream.unsetf(std::ios::skipws);
There was recently another whitespace-with-lexical_cast related question on the Boost users list, where the poster preferred no whitespace stripping. To quote:
--- Start quote ---
From: "Yuval Ronen" <ronen_yuval@yahoo.com>
To: <boost-users@lists.boost.org>
Sent: Saturday, January 15, 2005 5:47 PM
Subject: [Boost-users] Two issues with lexical_cast
<snip>
--- End quote ---
So what should lexical_cast do; strip or not? Ladies and gentlemen, can we have your votes? ;)
Adding options to lexical_cast is difficult. It's meant to resemble regular casts, and any additional template or function arguments make it no longer look like a cast.
Good point. I've cc'ed this reply to Kevlin Henney, as I don't think he's subscribed to the list. Regards, Terje

There's nothing more flattering than being quoted, thanks! :-)
I get the feeling that making it look like a cast is being overrated. This is clearly a case where there's a real difficulty in determining what is the most "common", or "natural" way of doing it for most people (without any intention of weakening my case in the user's list thread, of course :-) ). So IMHO, the customization's benefits will prove more important than cast resemblance.

From: "Yuval Ronen" <ronen_yuval@yahoo.com>
<snip>
There's nothing more flattering than being quoted, thanks! :-)
My question is, why does it do this - as this specifically stops lexical_cast from working with complex types. So could this be an option,
You're welcome. :)
That argument isn't really mine to begin with; Kevlin has been quite insistent on it. :) I've been questioning it, as well. However, he has insisted that lexical_cast is supposed to be a simple, quite general-purpose conversion component, and that more "advanced" features (such as numeric formatting, etc.) are better handled by using std::stringstream directly. The topic has come up periodically. Regards, Terje

On Sat, 29 Jan 2005 23:53:45 +0100, Terje Slettebø wrote:
I am of the opinion we should follow the standard (ie. STL) in this one. The STL would accept both, and since we rely on operator>>, so should we. I mean, if you're going to be anal about spaces, be completely anal about them, but I don't think we should be, I think accepting whitespace before and after is correct, just as the stringstream operator>> does by default.
Not really, make it a policy - defaulting to one or the other, eg:

    myclass a = boost::lexical_cast<myclass>(s);

will not care about spaces.

    myclass a = boost::lexical_cast<myclass, noskipws>(s);

will care. It's easy enough to default the argument:

    enum ws_policy { skipws, noskipws };
    template<typename T, ws_policy wsp = skipws>
    class lexical_cast { ... };

Or even better:

    template<typename T, std::ios_base::fmtflags newflags = std::ios_base::skipws>
    class lexical_cast { ... };

Then just do:

    std::stringstream ss;
    ss.flags(newflags);

And carry on about our merry way. The latter way gives us great flexibility - allowing us to control the IOS flags ourselves, while still keeping a simple and backward compatible interface, and most importantly, it's dead easy to implement :)
-- PreZ :) Founder. The Neuromancy Society (http://www.neuromancy.net)

But lexical_cast is not operator>>, it just uses it. BTW, I've thought of the following (rare) scenario: There's a class A which has an operator<< that streams a single space, then its content and then another space at the end. Class A's operator>> requires these leading and trailing spaces, otherwise it fails. In this case, a "whitespace-loose" lexical_cast should *know* to bypass all leading spaces in the input string, *except* the last space before the content, and go on to class A's operator>>. Can it do that? Is this even possible? What I'm saying is that using a "whitespace-loose" lexical_cast forces us to write a "whitespace-loose" operator>>, which I'm not sure is always possible.
Unfortunately, it's not that easy. Calling lexical_cast<T, noskipws> according to your second proposal will make it not ignore leading whitespace, but it will still ignore trailing whitespace. Ignoring/not ignoring trailing whitespace requires some additional code other than setting ios_base flags. I'm not saying it's not possible, just that it's not that easy :-(
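The asymmetry described here can be shown with a plain std::istringstream. This is a sketch under stated assumptions: `parse_int_noskipws` is a hypothetical helper, and it only mirrors the "extract, then apply std::ws, then test eof()" sequence discussed in this thread, not the actual lexical_cast source.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: extracts an int with skipws cleared, then consumes
// trailing whitespace with std::ws and requires that nothing is left.
bool parse_int_noskipws(const std::string &text, int &out) {
    std::istringstream ss(text);
    ss.unsetf(std::ios::skipws);
    if (!(ss >> out))
        return false;      // leading whitespace is not skipped, so " 42" fails here
    // std::ws does not consult the skipws flag, so trailing whitespace is
    // still consumed. It is applied only while input remains, because
    // standard library versions differ on whether std::ws sets failbit
    // when the stream is already at end of input.
    if (!ss.eof())
        ss >> std::ws;
    return ss.eof();
}
```

Clearing the flag alone rejects " 42" but still accepts "42 ", so rejecting trailing whitespace needs a change to the std::ws step as well.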

On Sun, 30 Jan 2005 10:01:06 +0200, Yuval Ronen wrote:
BTW, I've thought of the following (rare) scenario:
You put your finger on it, 'rare'.
Besides, what's to stop them doing 'ss.unsetf(skipws)' in their operator>>. If memory serves the white space skipping isn't done until something is actually going to be pulled off the rdbuf anyway (I could be wrong, of course). If this is the case, they would have:

    std::istream &operator>>(std::istream &is, my_class &c)
    {
        std::ios_base::fmtflags flags = is.flags();
        is.flags(flags & ~std::ios_base::skipws);
        is >> c.field1 >> c.field2 >> c.field3;
        is.flags(flags);
        return is;
    }

Of course, if I'm wrong about the whitespace being skipped only at the point where something is pulled off the buffer, then this won't work, and we're back to the other suggestion I had. My point is though, if you're going to use operator<< and operator>> with istream/ostream in a manner that is not default, then you should accommodate the way you're breaking the defaults yourself in your operator<< and operator>> - as opposed to forcing everyone ELSE who uses operator<< and operator>> to ensure they get things in the default manner (which often the client programmer can't do if they didn't write the original code).
Not so. Right now, trailing white space is skipped only because we actually have this code:

    T result;
    ss >> result >> std::ws;
    if (!ss.eof())
        throw bad_lexical_cast();

We should just take out the '>> std::ws' as well. Which should then not skip trailing white space on noskipws, and if memory serves, will do so if skipws is on. Even if it doesn't, it's a simple matter of:

    T result;
    ss >> result;
    if (ss.flags() & skipws)
        ss >> std::ws;
    if (!ss.eof())
        throw bad_lexical_cast();

This handles what we want beautifully. If skipws is on, then white space at the beginning and end are skipped, AND white space between values (remember the original string "4 5"). If skipws is off, then no white space is skipped anywhere, and if you have leading or trailing white space, you'd better handle it, or the eof() check will fail. Don't fall into the trap of over-engineering the problem :)
-- PreZ :) Founder. The Neuromancy Society (http://www.neuromancy.net)
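The conditional std::ws step proposed above can be sketched as a standalone function. Everything named here is an assumption made for illustration: `flagged_cast` is hypothetical and not part of Boost, std::runtime_error stands in for bad_lexical_cast, and the default is `skipws | dec` rather than skipws alone, because replacing all flags with just skipws also clears `dec` and lets integer extraction auto-detect the number base.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical name; not part of Boost. Converts text to T using the given
// format flags, skipping trailing whitespace only when skipws is set.
template <typename T>
T flagged_cast(const std::string &text,
               std::ios_base::fmtflags flags =
                   std::ios_base::skipws | std::ios_base::dec) {
    std::istringstream ss(text);
    ss.flags(flags);                     // replaces every format flag
    T result;
    if (!(ss >> result))
        throw std::runtime_error("flagged_cast: extraction failed");
    const bool skipping = (ss.flags() & std::ios_base::skipws) != 0;
    if (skipping && !ss.eof())
        ss >> std::ws;                   // trailing whitespace is tolerated
    if (!ss.eof())
        throw std::runtime_error("flagged_cast: trailing characters");
    return result;
}
```

Under these assumptions, `flagged_cast<int>(" 7 ")` yields 7, while `flagged_cast<int>("7 ", std::ios_base::dec)` throws because the trailing space is left unread.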

I guess you're right about this. It sounds right to me.
It's not a matter of defaults. I agree with what Kevlin said: an extraction operator should not assume the stream is flagged one way or another, but should make sure the stream is in the exact form it needs, or keep the flags as they are *if* this is truly a point where the extraction operator leaves room for the caller to manipulate its behaviour. In the case of third-party code, I agree a customization parameter to lexical_cast would be best.
It doesn't, but (scroll down...)
this should work.
I think this is a good suggestion (except for the default skipws part :-) )

In message <033901c50655$6347e4d0$0300000a@pc>, Terje Slettebø <tslettebo@broadpark.no> writes
Yes, lexical_cast unsets the skipping of whitespace, but that does not, to my knowledge, introduce problems for correctly written stream extraction operators. If reading a representation in and writing it out for a custom type depends on the skipping of whitespace, the stream extraction operator must guarantee this -- it cannot assume the state of the input stream will be what it needs. In other words, there is a bug in the implementation of operator>>. Here is a simplified (untested and not exception safe) sketch of the basic logic:

    ... operator>>(... &is, my_type &mt)
    {
        std::ios::fmtflags old_fmt = is.flags();
        is.setf(std::ios::skipws);
        is >> mt.a >> mt.b;
        is.flags(old_fmt);
        return is;
    }

HTH
Kevlin
--
____________________________________________________________
  Kevlin Henney                   phone:  +44 117 942 2990
  mailto:kevlin@curbralan.com     mobile: +44 7801 073 508
  http://www.curbralan.com        fax:    +44 870 052 2289
  Curbralan: Consultancy + Training + Development + Review
____________________________________________________________
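The sketch above is described as not exception safe. One way to close that gap is a small RAII guard that restores the flags on scope exit. This is an illustrative variant, not code from the thread: `flags_guard` is a hypothetical helper written out here so the example stays self-contained.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>

struct my_type { int a; int b; };

// Hypothetical RAII helper: restores the stream's format flags when it goes
// out of scope, including when an extraction throws.
class flags_guard {
public:
    explicit flags_guard(std::ios_base &s) : stream_(s), saved_(s.flags()) {}
    ~flags_guard() { stream_.flags(saved_); }
private:
    flags_guard(const flags_guard &);             // non-copyable
    flags_guard &operator=(const flags_guard &);  // non-assignable
    std::ios_base &stream_;
    std::ios_base::fmtflags saved_;
};

template <typename C, typename T>
std::basic_istream<C, T> &operator>>(std::basic_istream<C, T> &is, my_type &mt) {
    flags_guard guard(is);
    is.setf(std::ios_base::skipws);   // this operator needs whitespace skipping
    is >> mt.a >> mt.b;
    return is;                        // guard restores the caller's flags here
}
```

With this operator, a stream that arrives with skipws cleared still parses "1 2" and leaves with skipws cleared, so the caller's settings are untouched.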

On Sun, 30 Jan 2005 13:45:21 +0100, Kevlin Henney wrote:
they will turn it off on their operator>>. Pretty much every operator>> I've seen does not check to see if skipws is set or not.
library) where they have no control over operator>> - and that operator>> assumes the defaults (like most people writing operator>> do).

Is it so hard to have lexical_cast have an OPTIONAL second template argument of the std::ios::fmtflags to use (defaulting to the default fmtflags, which is skipws) - and only do the ">> std::ws" which we do if skipws is turned on in the fmtflags we are to use? For probably >95% of cases, no code using lexical_cast will ever need to change, and any code that DOES need to change will have a big marker in it that the operator>> in this case expects skipws to be turned off (ie. does not do it itself). And for correctly written code, as you said, it will have no effect (they can still use a single template argument).

I'm sending this to you via mail because I've been told you don't read the boost-dev list, thus did not see my suggestions before. My suggestion (sorry for those who have seen it) is to have:

    template<typename Target, typename Source, std::ios::fmtflags flags = std::ios::skipws>
    class lexical_stream
    {
    public:
        lexical_stream()
        {
            stream.flags(flags);
            if (std::numeric_limits<Target>::is_specialized)
                stream.precision(std::numeric_limits<Target>::digits10 + 1);
            else if (std::numeric_limits<Source>::is_specialized)
                stream.precision(std::numeric_limits<Source>::digits10 + 1);
        }
        // ...
        template<typename InputStreamable>
        bool operator>>(InputStreamable &output)
        {
            if (is_pointer<InputStreamable>::value)
                return false;
            if (!(stream >> output))
                return false;
            if (flags & std::ios::skipws)
                stream >> std::ws;
            return (stream && stream.eof());
        }
        // ...
    };

Though this would require:

    template<typename Target, typename Source, std::ios::fmtflags flags = std::ios::skipws>
    Target lexical_cast(Source in)
    {
        detail::lexical_stream<Target, Source, flags> interpreter;
        // ...
    }

So possibly a better solution would be:

    template<typename Target, typename Source>
    class lexical_stream
    {
    public:
        lexical_stream(std::ios::fmtflags flags = std::ios::skipws)
        {
            stream.flags(flags);
            if (std::numeric_limits<Target>::is_specialized)
                stream.precision(std::numeric_limits<Target>::digits10 + 1);
            else if (std::numeric_limits<Source>::is_specialized)
                stream.precision(std::numeric_limits<Source>::digits10 + 1);
        }
        // ...
        template<typename InputStreamable>
        bool operator>>(InputStreamable &output)
        {
            if (is_pointer<InputStreamable>::value)
                return false;
            if (!(stream >> output))
                return false;
            if (stream.flags() & std::ios::skipws)
                stream >> std::ws;
            return (stream && stream.eof());
        }
        // ...
    };

which would require:

    template<typename Target, typename Source>
    Target lexical_cast(Source in, std::ios::fmtflags flags = std::ios::skipws)
    {
        detail::lexical_stream<Target, Source> interpreter(flags);
        // ...
    }

Either solution will still allow:

    MyType a = boost::lexical_cast<MyType>(s);

However for the first, to turn off skipws, I would do:

    MyType a = boost::lexical_cast<MyType, std::string, 0>(s);

and for the second:

    MyType a = boost::lexical_cast<MyType>(s, 0);

Obviously the second is preferable - and as I said, it adds flexibility to lexical_cast without too much complexity - while maintaining backward compatibility for >95% of usages, and adds compatibility with I'd say the majority of classes that use operator>> (most of which don't check the state of skipws) for complex classes.

That is my $.02 - and I'll say no more on the subject (unless questioned directly), since I now sound like a zealot enough as it is :P
-- PreZ :) Founder. The Neuromancy Society (http://www.neuromancy.net)

From: "Preston A. Elder" <prez@neuromancy.net>
When this has been discussed before, other requests for conversion customisation have also come up, such as being able to use a different locale, or other stream customisations not available from just the format flags (such as setting the number base, or other number formatting). Therefore, if something were to be optionally passable, maybe it should be a full stream object? (which reminded me of this one: http://www.refactoring.com/catalog/preserveWholeObject.html :) )

However, passing a stream may be a little hard to implement, as you can't bind a default argument to a non-const reference, so the following won't work:

    template<typename Target, typename Source>
    Target lexical_cast(const Source &arg,
        std::basic_stringstream<typename detail::select_stream_char<Source, Target>::type> &stream =
            std::basic_stringstream<typename detail::select_stream_char<Source, Target>::type>())
    ...

("select_stream_char<>" is made up for the occasion; it contains the typedef for char_type in lexical_stream)

Regards, Terje

From: Terje Slettebø <tslettebo@broadpark.no>
I agree with Kevlin that a correctly written extraction operator should be managing skipws itself. I also agree with Preston that it is the rare one that is correctly written (wrt skipws, at least). Thus, it does seem pedantic to cause grief to so many for something apparently handled quite easily.
That's awfully heavyweight for what is rightly a simple, common expectation, don't you think? I'm sure there are times when that would be useful, but it really deviates from the concept of a cast, don't you think?
There's no reason you can't overload it, is there? -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;
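The overloading idea can be sketched as a pair of function templates, one taking a caller-configured stream and one supplying its own. This is an illustration only: `stream_cast` is a hypothetical name, it is fixed to std::stringstream rather than selecting a character type, and std::runtime_error stands in for bad_lexical_cast.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical name; not part of Boost. Runs the conversion through a stream
// the caller has already configured (flags, locale, number base, ...).
template <typename Target, typename Source>
Target stream_cast(const Source &arg, std::stringstream &stream) {
    Target result;
    if (!(stream << arg) || !(stream >> result))
        throw std::runtime_error("stream_cast: conversion failed");
    // Reject any leftover input, including trailing whitespace.
    if (!stream.eof() && stream.peek() != std::char_traits<char>::eof())
        throw std::runtime_error("stream_cast: trailing characters");
    return result;
}

// Convenience overload that supplies a default-constructed stream, which
// avoids needing a default argument bound to a non-const reference.
template <typename Target, typename Source>
Target stream_cast(const Source &arg) {
    std::stringstream stream;
    return stream_cast<Target>(arg, stream);
}
```

A caller who needs hexadecimal input could prepare a stream with `std::hex` and pass it to the two-argument form, while the one-argument form keeps the cast-like call syntax for the common case.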
participants (5)
- Kevlin Henney
- Preston A. Elder
- Rob Stewart
- Terje Slettebø
- Yuval Ronen