
I'd like to go back to Chuck's original posting:
Inspection of the code reveals the following is happening (both in 1.34 and HEAD): text_iarchive_impl<Archive>::load(std::string &s) { std::size_t size; * this->This() >> size; // skip separating space is.get(); // borland de-allocator fixup #if BOOST_WORKAROUND(_RWSTD_VER, BOOST_TESTED_AT(20101)) if(NULL != s.data()) #endif s.resize(size); is.read(const_cast<char *>(s.data()), size); // <== This looks suspect }
Inspection of the g++ string implementation (latest 4.1.x) shows that resize() will *almost* always create a new rep and thus a new data() for the string, if it is shared. The only exception is when the requested size == s.size(). Chuck, was this the bug that hit you? Seems to me, since you are going to rewrite your string anyway, you might as well clear() it first, then resize(size). That'll take care of creating a new representation (either clear, or resize if the string was already empty). The problem with that is that it will write over the string twice, first filling it with 0s then with the read data. But think about it: if the string is shared, you need to construct a new representation with the requested size anyway, and it will have to be filled with 0s (or with a copy of what's already in the string if you decide to resize to a different size instead of clearing). Still, suppose you wanted to avoid going over your string twice, what you'd need is to create a string with an unitialized buffer and start writing into it directly. There's only one way I can think of to achieve that: have an input iterator from your "is" stream, such as std::istream_iterator<char>(is), and use s.assign(is_begin, is_end), assuming you can define an end iterator that will stop reading after size chars (don't even know how to do that with istream_iterator, and there is no assign(int, InputIterator) method in string, sadly). In any case, however, you then lose the benefit of the single is.read() call, because the iterator is going to do ++iter and read character by character. If your stream is buffered, there shouldn't much of an impact, but then you have a double copy from stream buffer to string buffer anyway, which is akin to going over your string twice. What would be nice would be to create a new buffer and read directly into it, then transfer ownership to the string... Can't do that with std::string though. In short, I don't think you can avoid the cost of going over your string twice. The problem with clear() in gcc 4.x's implementation is that it creates a new rep with the same capacity as the old string (), at least if I've read the code correctly. With a shared string implementation, it is cheaper to assign the empty string, and that works with any other implementation. In short, I've argued that you can't do much better than: text_iarchive_impl<Archive>::load(std::string &s) { std::size_t size; * this->This() >> size; // skip separating space is.get(); // borland de-allocator fixup #if BOOST_WORKAROUND(_RWSTD_VER, BOOST_TESTED_AT(20101)) if(NULL != s.data()) #endif + s = std::string(); s.resize(size); is.read(const_cast<char *>(s.data()), size); // <== This looks suspect } Compared to Corrado's implementation, it's almost the same, but frees up resources earlier. You may actually be able to reuse the same block of memory for the resize() that you just freed. HTH, -- Hervé Brönnimann hervebronnimann@mac.com