New subject: [boost] whitespace and user-defined types in lexical_cast

28 Jan 2007

I am having some trouble with lexical_cast that I think illustrates
a problem with the underlying implementation.  My basic
problem is that I'd like lexical_cast to "work" for user-defined
types (UDTs) that conform to lexical_cast's documented specification of
OutputStreamable, InputStreamable, CopyConstructible and
DefaultConstructible.

For example:

struct Udt{
    int a, b;
    Udt(int _a, int _b) : a(_a), b(_b){}
    Udt() : a(0), b(0){}
};

std::ostream& operator<<(std::ostream& s, const Udt& u){
    return s << u.a << " " << u.b;
}

std::istream& operator>>(std::istream& s, Udt& u){
    return s >> u.a >> u.b;
}

I believe I should be able to write:

bool operator==(const Udt &u1, const Udt &u2){
    return u1.a==u2.a && u1.b==u2.b;
}

Udt(3,19) == lexical_cast<Udt>("3 19");

In fact, with the current (CVS) implementation of lexical_cast the above
expression throws a bad_lexical_cast because of lexical_cast's
insistence on unsetting the skipws flag in the internal
lexical_stream.  I could fix this by doing s.setf(std::ios::skipws) in
the Udt extractor, but that just seems wrong.  Nothing in
lexical_cast's documentation suggests that the author of Udt should do
that.
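
To see the failure outside of lexical_cast, here is a minimal sketch that uses only a plain std::istringstream.  The helper name extract() and its skip_whitespace parameter are made up for this illustration and are not part of Boost; the sketch only assumes that lexical_stream clears skipws before calling the extractor, as described above.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

struct Udt {
    int a, b;
    Udt(int a_, int b_) : a(a_), b(b_) {}
    Udt() : a(0), b(0) {}
};

std::istream& operator>>(std::istream& s, Udt& u) {
    return s >> u.a >> u.b;
}

// Hypothetical helper (not a Boost function): run the Udt extractor on
// `text`, optionally clearing skipws first to mimic lexical_stream.
bool extract(const std::string& text, bool skip_whitespace, Udt& out) {
    std::istringstream stream(text);
    if (!skip_whitespace)
        stream.unsetf(std::ios::skipws);
    return !(stream >> out).fail();
}

// With skipws left on, "3 19" parses; with it cleared, the second
// integer extraction stops at the space and the stream fails.
void check_extract() {
    Udt u;
    assert(extract("3 19", true, u));
    assert(u.a == 3 && u.b == 19);
    Udt v;
    assert(!extract("3 19", false, v));
}
```

Calling s.setf(std::ios::skipws) at the top of the extractor would make the second case succeed as well, which is the workaround mentioned above.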

The patch below (which applies to the CVS tree, NOT to boost 1_33)
fixes this problem and makes the above lexical_cast "work".

It limits the skipws manipulation to the case where the output type is
a char or a wchar_t.  This change preserves the desired behavior of
lexical_cast<char>(" "), but it does not introduce the confusing
whitespace sensitivity when the target is not a char.  It has the
side-effect of changing the behavior of some of the existing unit
tests so that they return reasonable results rather than throwing
exceptions.  I believe that users will find this behavior less
surprising.  E.g., with the patch,

    lexical_cast<int>(" 123")

returns 123 rather than throwing an exception, so the patch also
contains changes to the unit tests that reflect this behavioral
change.  The patch also contains some new checks to make sure that
strings with leading, trailing and embedded spaces convert "correctly".
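
The intended behavior can be sketched without Boost.  The function sketch_cast below is a hypothetical stand-in written for this message, not lexical_cast's actual implementation; it uses std::is_same from <type_traits> where the patch uses boost::is_same, it checks only for char because it reads from a narrow stream, and it makes no attempt to handle std::string targets.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical stand-in for lexical_cast (assumption: not Boost's code).
// skipws is cleared only when the target is a single character.
template <typename Target>
Target sketch_cast(const std::string& text) {
    std::istringstream stream(text);
    if (std::is_same<Target, char>::value)
        stream.unsetf(std::ios::skipws);

    Target result = Target();
    // The extraction must succeed and must consume the whole input.
    if ((stream >> result).fail() ||
        stream.get() != std::char_traits<char>::eof())
        throw std::runtime_error("sketch_cast: conversion failed");
    return result;
}

// Leading whitespace is skipped for int, a lone space still converts
// to char, and trailing whitespace is still rejected.
void check_sketch_cast() {
    assert(sketch_cast<int>(" 123") == 123);
    assert(sketch_cast<char>(" ") == ' ');
    bool threw = false;
    try {
        sketch_cast<int>("123 ");
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);
}
```

In this sketch, as in the patched unit tests, "123 " is still an error: only leading whitespace is skipped by the extractor, and the leftover trailing space means the input was not fully consumed.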

Finally, there is a new unit test:  libs/conversion/test/lexical_cast_udt_test.cpp
that verifies the new behavior for user defined types.

I hope this patch, or something like it, can make it into a future
release of boost.

John Salmon

-------

[jsalmon@river boost]$ diff -Nau /dev/null libs/conversion/test/lexical_cast_udt_test.cpp

--- /dev/null	2004-02-23 16:02:56.000000000 -0500
+++ libs/conversion/test/lexical_cast_udt_test.cpp	2007-01-27 18:23:12.234392725 -0500
@@ -0,0 +1,48 @@
+#include <boost/config.hpp>
+#include <boost/test/unit_test.hpp>
+#include <boost/lexical_cast.hpp>
+#include <cfloat>
+#include <iostream>
+using namespace boost;
+
+void test_udt();
+
+unit_test::test_suite *init_unit_test_suite(int, char *[])
+{
+    unit_test_framework::test_suite *suite =
+        BOOST_TEST_SUITE("lexical_cast unit test");
+    suite->add(BOOST_TEST_CASE(&test_udt));
+
+    return suite;
+}
+
+// Udt:  a simple user-defined type that models InputStreamable,
+// OutputStreamable, CopyConstructible and DefaultConstructible.
+// I.e., it should "work" with lexical_cast.
+struct Udt{
+    int a, b;
+    Udt(int _a, int _b) : a(_a), b(_b){}
+    Udt() : a(0), b(0){}
+};
+
+std::ostream& operator<<(std::ostream& s, const Udt& f){
+    return s << f.a << " " << f.b;
+}
+
+std::istream& operator>>(std::istream& s, Udt& f){
+    return s >> f.a >> f.b;
+}
+
+bool operator==(const Udt& f1, const Udt& f2){
+    return f1.a == f2.a && f1.b == f2.b;
+}
+
+void test_udt()
+{
+    Udt f(13, -11);
+    BOOST_CHECK_EQUAL(f, lexical_cast<Udt>(lexical_cast<std::string>(f)));
+    BOOST_CHECK_EQUAL(f, lexical_cast<Udt>("13 -11"));
+    Udt g(99999, 0);
+    BOOST_CHECK_EQUAL(g, lexical_cast<Udt>(lexical_cast<std::string>(g)));
+    BOOST_CHECK_EQUAL(g, lexical_cast<Udt>("99999 0"));
+}
[jsalmon@river boost]$ 

Index: boost/lexical_cast.hpp
===================================================================
RCS file: /cvsroot/boost/boost/boost/lexical_cast.hpp,v
retrieving revision 1.33
diff -u -r1.33 lexical_cast.hpp
--- boost/lexical_cast.hpp	16 Jan 2007 21:03:47 -0000	1.33
+++ boost/lexical_cast.hpp	27 Jan 2007 23:47:03 -0000
@@ -25,6 +25,7 @@
 #include <boost/mpl/if.hpp>
 #include <boost/throw_exception.hpp>
 #include <boost/type_traits/is_pointer.hpp>
+#include <boost/type_traits/is_same.hpp>
 #include <boost/call_traits.hpp>
 #include <boost/static_assert.hpp>
 #include <boost/detail/lcast_precision.hpp>
@@ -528,7 +529,8 @@
         public:
             lexical_stream(char_type* = 0, char_type* = 0)
             {
-                stream.unsetf(std::ios::skipws);
+                if( is_same<char, Target>::value || is_same<wchar_t, Target>::value)
+                    stream.unsetf(std::ios::skipws);
                 lcast_set_precision(stream, (Source*)0, (Target*)0);
             }
             ~lexical_stream()
@@ -693,7 +695,8 @@
 
                 this->setg(start, start, finish);
                 std::basic_istream<CharT> stream(static_cast<Base*>(this));
-                stream.unsetf(std::ios::skipws);
+                if( is_same<char, InputStreamable>::value || is_same<wchar_t, InputStreamable>::value)
+                    stream.unsetf(std::ios::skipws);
                 lcast_set_precision(stream, (InputStreamable*)0);
 #if (defined _MSC_VER)
 # pragma warning( pop )
Index: libs/conversion/lexical_cast_test.cpp
===================================================================
RCS file: /cvsroot/boost/boost/libs/conversion/lexical_cast_test.cpp,v
retrieving revision 1.24
diff -u -r1.24 lexical_cast_test.cpp
--- libs/conversion/lexical_cast_test.cpp	28 Oct 2006 19:33:32 -0000	1.24
+++ libs/conversion/lexical_cast_test.cpp	27 Jan 2007 23:47:03 -0000
@@ -140,14 +140,12 @@
     BOOST_CHECK_EQUAL(1, lexical_cast<int>(true));
     BOOST_CHECK_EQUAL(0, lexical_cast<int>(false));
     BOOST_CHECK_EQUAL(123, lexical_cast<int>("123"));
-    BOOST_CHECK_THROW(
-        lexical_cast<int>(" 123"), bad_lexical_cast);
+    BOOST_CHECK_EQUAL(123, lexical_cast<int>(" 123"));
     BOOST_CHECK_THROW(lexical_cast<int>(""), bad_lexical_cast);
     BOOST_CHECK_THROW(lexical_cast<int>("Test"), bad_lexical_cast);
     BOOST_CHECK_EQUAL(123, lexical_cast<int>("123"));
     BOOST_CHECK_EQUAL(123, lexical_cast<int>(std::string("123")));
-    BOOST_CHECK_THROW(
-        lexical_cast<int>(std::string(" 123")), bad_lexical_cast);
+    BOOST_CHECK_EQUAL(123, lexical_cast<int>(std::string(" 123")));
     BOOST_CHECK_THROW(
         lexical_cast<int>(std::string("")), bad_lexical_cast);
     BOOST_CHECK_THROW(
@@ -215,6 +213,12 @@
     BOOST_CHECK_EQUAL(" ", lexical_cast<std::string>(" "));
     BOOST_CHECK_EQUAL("", lexical_cast<std::string>(""));
     BOOST_CHECK_EQUAL("Test", lexical_cast<std::string>(std::string("Test")));
+    BOOST_CHECK_EQUAL("TrailingSpace ", lexical_cast<std::string>(std::string("TrailingSpace ")));
+    BOOST_CHECK_EQUAL("TrailingSpace ", lexical_cast<std::string>("TrailingSpace "));
+    BOOST_CHECK_EQUAL(" LeadingSpace ", lexical_cast<std::string>(std::string(" LeadingSpace ")));
+    BOOST_CHECK_EQUAL(" LeadingSpace ", lexical_cast<std::string>(" LeadingSpace "));
+    BOOST_CHECK_EQUAL(" Embedded Space ", lexical_cast<std::string>(std::string(" Embedded Space ")));
+    BOOST_CHECK_EQUAL(" Embedded Space ", lexical_cast<std::string>(" Embedded Space "));
     BOOST_CHECK_EQUAL(" ", lexical_cast<std::string>(std::string(" ")));
     BOOST_CHECK_EQUAL("", lexical_cast<std::string>(std::string("")));
 }
@@ -358,7 +362,7 @@
 
 void test_no_whitespace_stripping()
 {
-    BOOST_CHECK_THROW(lexical_cast<int>(" 123"), bad_lexical_cast);
+    BOOST_CHECK_EQUAL(123, lexical_cast<int>(" 123"));
     BOOST_CHECK_THROW(lexical_cast<int>("123 "), bad_lexical_cast);
 }
 
Index: libs/conversion/test/Jamfile
===================================================================
RCS file: /cvsroot/boost/boost/libs/conversion/test/Jamfile,v
retrieving revision 1.10
diff -u -r1.10 Jamfile
--- libs/conversion/test/Jamfile	20 Jan 2007 13:17:35 -0000	1.10
+++ libs/conversion/test/Jamfile	27 Jan 2007 23:47:03 -0000
@@ -32,6 +32,7 @@
         [ run lexical_cast_loopback_test.cpp <lib>../../test/build/boost_unit_test_framework ]
         [ run lexical_cast_abstract_test.cpp <lib>../../test/build/boost_unit_test_framework ]
         [ run lexical_cast_noncopyable_test.cpp <lib>../../test/build/boost_unit_test_framework ]
+        [ run lexical_cast_udt_test.cpp <lib>../../test/build/boost_unit_test_framework ]
       ;
 }
       
Index: libs/conversion/test/Jamfile.v2
===================================================================
RCS file: /cvsroot/boost/boost/libs/conversion/test/Jamfile.v2,v
retrieving revision 1.7
diff -u -r1.7 Jamfile.v2
--- libs/conversion/test/Jamfile.v2	20 Jan 2007 13:17:35 -0000	1.7
+++ libs/conversion/test/Jamfile.v2	27 Jan 2007 23:47:03 -0000
@@ -24,6 +24,7 @@
     [ run lexical_cast_loopback_test.cpp ../../test/build//boost_unit_test_framework/<link>static ]
     [ run lexical_cast_abstract_test.cpp ../../test/build//boost_unit_test_framework/<link>static ]
     [ run lexical_cast_noncopyable_test.cpp ../../test/build//boost_unit_test_framework/<link>static ]
+    [ run lexical_cast_udt_test.cpp ../../test/build//boost_unit_test_framework/<link>static ]
   ;

    
