
On 1/30/07, Alexander Nasonov <alnsn@yandex.ru> wrote:
John Salmon <john <at> thesalmons.org> writes:
Yes. It's possible to write more "defensive" inserters and extractors. That's beside the point. The patch allows lexical_cast to work with Udts whose implementations are not so careful. It does so while passing all existing regressions (except for two which are modified to return a very reasonable result rather than throwing a somewhat surprising exception).
I don't have an access to my home system to check if your patch doesn't break this: t> BOOST_CHECK_EQUAL(" a b c " == lexical_cast<std::string>(" a b c "));
Does it?
The patch satisfies that test on my system (Linux, gcc-3.3.3). I was surprised that such a test was not already in the regression suite. Something like it obviously should be, so I added some very similar tests just to be sure. These are in the original patch: diff -u -r1.24 lexical_cast_test.cpp --- libs/conversion/lexical_cast_test.cpp 28 Oct 2006 19:33:32 -0000 1.24 +++ libs/conversion/lexical_cast_test.cpp 27 Jan 2007 23:47:03 -0000 @@ -215,6 +213,12 @@ BOOST_CHECK_EQUAL(" ", lexical_cast<std::string>(" ")); BOOST_CHECK_EQUAL("", lexical_cast<std::string>("")); BOOST_CHECK_EQUAL("Test", lexical_cast<std::string>(std::string("Test"))); + BOOST_CHECK_EQUAL("TrailingSpace ", lexical_cast<std::string>(std::string("TrailingSpace "))); + BOOST_CHECK_EQUAL("TrailingSpace ", lexical_cast<std::string>("TrailingSpace ")); + BOOST_CHECK_EQUAL(" LeadingSpace ", lexical_cast<std::string>(std::string(" LeadingSpace "))); + BOOST_CHECK_EQUAL(" LeadingSpace ", lexical_cast<std::string>(" LeadingSpace ")); + BOOST_CHECK_EQUAL(" Embedded Space ", lexical_cast<std::string>(std::string(" Embedded Space "))); + BOOST_CHECK_EQUAL(" Embedded Space ", lexical_cast<std::string>(" Embedded Space ")); BOOST_CHECK_EQUAL(" ", lexical_cast<std::string>(std::string(" "))); BOOST_CHECK_EQUAL("", lexical_cast<std::string>(std::string("")));
Nothing in lexical_cast's documentation even hints that one should beware of whitespace when reading and writing from streams. There is no reason for lexical_cast to insist on this degree of care and attention to detail in the implementations of user-defined types. The naive pair of operator<< and operator>> in the original post work perfectly well when operating on iostreams in their standard, default state. They work perfectly well for the user who never heard of lexical_cast who uses the standard istringstream idiom:
istringstream iss("13 19"); Udt Foo; iss >> Foo; assert( Foo == Udt(13, 19) );
Shouldn't lexical_cast do just as well? When I see something like the above in a colleague's code, and I suggest lexical_cast, should I really have to also warn him about carefully handling whitespace in the Udt extractor?
I'll think about it. But noskipws is there for a good reason. You can search the boost archive.
Yes. There have been a few discussions over the years. There are two basic themes: 1 - somebody thinks that lexical_cast ought to work on a type whose operator>> isn't smart enough to handle a non-default skipws flag. The response is typically "your operator>> is broken", followed by more or less discussion about whether lexical_cast should accommodate "broken" operator>> anyway. We are repeating this theme. 2 - a desire for consistency between the handling of leading whitespace and trailing whitespace. I would suggest that the fact that questions of type 1 keep coming up is an indication that the current behavior is surprising and alternatives should be explored. If we can change it so that it is less surprising to "newbies" without breaking anything else and without introducing new complexity, I think we should. I would suggest that the answer to questions of type 2 is to observe that it's not trailing whitespace that causes the exception, it's trailing *anything*. As such, it should be clearly documented. In fact, the proposed text for the specification of lexical_cast in TR2 makes this very clear. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1973.html Boost's documentation should be reconciled with the TR2 proposal. Here's an excerpt: <snip> lexical_cast behavior is specified in terms of operator<< and operator>> on a std::basic_stringstream object. Implementations are not required to actually use a std::basic_stringstream object to achieve the required behavior. Implementations are permitted to provide specializations of the lexical_cast template. [Note: Implementations may use this "as if" leeway to achieve efficiency. -- end note.] template<typename Target, typename Source> Target lexical_cast(const Source& arg); Effects: * Inserts arg into an empty std::basic_stringstream object via operator<<. * Extracts the result, of type Target, from the std::basic_stringstream object via operator>>. Throws: bad_lexical_cast if: * Source is a pointer type. * fail() for the std::basic_stringstream object is true after either operator<< or operator>> is applied. * get() for the std::basic_stringstream object is not std::char_traits<char_type>::eof() after both operator<< and operator>> are applied. Returns: The result as created by the effects. Remarks: If Target is either std::string or std::wstring, stream extraction takes the whole content of the string, including spaces, rather than relying on the default operator>> behavior. </snip>
-- Alexander Nasonov
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost