New subject: Input archive loads seem to corrupt shared characterstd::strings

3 Dec 2007

Folks,

We were recently tracking down some weirdness in our application, and think it may interest other users of boost archives.  In essence, we are using the std::string library that comes with gcc 4.x, which reference counts the underlying characters.  The iarchive loads seem to corrupt these characters if they are shared with other std::string instances.  Here is a paraphrased example of what we were doing.

// First, we have a string instance
static std::string defaultStringValue = "defaultValue";

// Next we make a class that uses this default to initialize a member in its base constructor
class A
{
public:
    std::string member;
    A() : member(defaultStringValue);
    template<class Archive>
        void serialize(Archive & ar, const unsigned int serializer_version)
    {
        ar & member;
    }
}

What we find happening is that if you deserialize an instance of A, the defaultStringValue *can* get clobbered (this has a bit to do with the length of the underlying value, as to whether the characters get clobbered or not).  The scenario goes something like this:
1. Create an instance of A, this member will get a reference to the underlying "defaultValue" characters.
2. Load the instance of A from an archive that has a different value of member (the same length, or maybe shorter.  If you test this, try something like "eefaultValue").
3. This seems to mess up the value of defaultStringValue, you can print it out and see.

Inspection of the code reveals the following is happening (both in 1.34 and HEAD):
text_iarchive_impl<Archive>::load(std::string &s)
{
    std::size_t size;
    * this->This() >> size;
    // skip separating space
    is.get();
    // borland de-allocator fixup
    #if BOOST_WORKAROUND(_RWSTD_VER, BOOST_TESTED_AT(20101))
    if(NULL != s.data())
    #endif
        s.resize(size);
    is.read(const_cast<char *>(s.data()), size); // <== This looks suspect
}

Basically, data() is const because of the reference counting.  Doing this const_cast and modification causes other strings that share the ref counted characters to be corrupted.

We have been applying the following patch to our production version of boost (1.34.0), to work around this issue.  Basically, it makes its own temp string and then does the assignment.

diff -du -r boost_1_34_0/boost/archive/impl/basic_binary_iprimitive.ipp boost_1_34_0.my/boost/archive/impl/basic_binary_iprimitive.ipp

--- boost_1_34_0/boost/archive/impl/basic_binary_iprimitive.ipp	2006-02-09 01:58:07.000000000 -0500
+++ boost_1_34_0.my/boost/archive/impl/basic_binary_iprimitive.ipp	2007-12-03 18:05:43.000000000 -0500
@@ -85,8 +85,9 @@
 
 template<class Archive, class Elem, class Tr>
 BOOST_ARCHIVE_OR_WARCHIVE_DECL(void)
-basic_binary_iprimitive<Archive, Elem, Tr>::load(std::string & s)
+basic_binary_iprimitive<Archive, Elem, Tr>::load(std::string & res_s)
 {
+    std::string s;
     std::size_t l;
     this->This()->load(l);
     // borland de-allocator fixup
@@ -96,6 +97,9 @@
         s.resize(l);
     // note breaking a rule here - could be a problem on some platform
     load_binary(const_cast<char *>(s.data()), l);
+
+    // Respect other references to the chars by assigning result string
+    res_s = s;
 }
 
 #ifndef BOOST_NO_CWCHAR
@@ -113,8 +117,9 @@
 #ifndef BOOST_NO_STD_WSTRING
 template<class Archive, class Elem, class Tr>
 BOOST_ARCHIVE_OR_WARCHIVE_DECL(void)
-basic_binary_iprimitive<Archive, Elem, Tr>::load(std::wstring & ws)
+basic_binary_iprimitive<Archive, Elem, Tr>::load(std::wstring & res_ws)
 {
+    std::wstring ws;
     std::size_t l;
     this->This()->load(l);
     // borland de-allocator fixup
@@ -124,6 +129,9 @@
         ws.resize(l);
     // note breaking a rule here - is could be a problem on some platform
     load_binary(const_cast<wchar_t *>(ws.data()), l * sizeof(wchar_t) / sizeof(char));
+
+    // Respect other references to the chars by assigning result string
+    res_ws = ws;
 }
 #endif
 
diff -du -r boost_1_34_0/boost/archive/impl/text_iarchive_impl.ipp boost_1_34_0.my/boost/archive/impl/text_iarchive_impl.ipp
--- boost_1_34_0/boost/archive/impl/text_iarchive_impl.ipp	2005-12-11 01:12:52.000000000 -0500
+++ boost_1_34_0.my/boost/archive/impl/text_iarchive_impl.ipp	2007-12-03 17:56:04.000000000 -0500
@@ -42,8 +42,9 @@
 
 template<class Archive>
 BOOST_ARCHIVE_DECL(void)
-text_iarchive_impl<Archive>::load(std::string &s)
+text_iarchive_impl<Archive>::load(std::string &res_s)
 {
+    std::string s;
     std::size_t size;
     * this->This() >> size;
     // skip separating space
@@ -54,6 +55,9 @@
     #endif
         s.resize(size);
     is.read(const_cast<char *>(s.data()), size);
+
+    // Respect other references to the chars by assigning result string
+    res_s = s;
 }
 
 #ifndef BOOST_NO_CWCHAR
@@ -74,8 +78,9 @@
 #ifndef BOOST_NO_STD_WSTRING
 template<class Archive>
 BOOST_ARCHIVE_DECL(void)
-text_iarchive_impl<Archive>::load(std::wstring &ws)
+text_iarchive_impl<Archive>::load(std::wstring &res_ws)
 {
+    std::wstring ws;
     std::size_t size;
     * this->This() >> size;
     // borland de-allocator fixup
@@ -86,6 +91,9 @@
     // skip separating space
     is.get();
     is.read((char *)ws.data(), size * sizeof(wchar_t)/sizeof(char));
+
+    // Respect other references to the chars by assigning result string
+    res_ws = ws;
 }
 
 #endif // BOOST_NO_STD_WSTRING
diff -du -r boost_1_34_0/boost/archive/impl/text_wiarchive_impl.ipp boost_1_34_0.my/boost/archive/impl/text_wiarchive_impl.ipp
--- boost_1_34_0/boost/archive/impl/text_wiarchive_impl.ipp	2005-07-02 01:52:14.000000000 -0400
+++ boost_1_34_0.my/boost/archive/impl/text_wiarchive_impl.ipp	2007-12-03 18:05:07.000000000 -0500
@@ -79,8 +79,10 @@
 #ifndef BOOST_NO_STD_WSTRING
 template<class Archive>
 BOOST_WARCHIVE_DECL(void)
-text_wiarchive_impl<Archive>::load(std::wstring &ws)
+text_wiarchive_impl<Archive>::load(std::wstring &res_ws)
 {
+    std::wstring ws;
+
     std::size_t size;
     * this->This() >> size;
     // skip separating space
@@ -93,6 +95,9 @@
         ws.resize(size);
     // note breaking a rule here - is this a problem on some platform
     is.read(const_cast<wchar_t *>(ws.data()), size);
+
+    // Respect other references to the chars by assigning result string
+    res_ws = ws;
 }
 #endif


Curious if anyone has thoughts on the matter...

Regards,
Chuck

    

Input archive loads seem to corrupt shared character std::strings

Chuck Bear

Robert Ramey

Chuck Bear

Corrado Zoccolo

Hervé Brönnimann

Jens Seidel

Hervé Brönnimann

Martin Bonner

Giovanni Piero Deretta

Herve Bronnimann

Robert Ramey

Felipe Magno de Almeida

Peter Dimov

Herve Bronnimann

Peter Dimov

Chuck Bear

Hervé Brönnimann

Corrado Zoccolo

Ulrich Eckhardt

tags

participants (11)