We have a massive amount of data to serialize, on the order of several gigabytes, with lots of strings involved -- perhaps hundreds of millions. We discovered that the data structure bloats enormously in memory when read back in from disk (say, from 2 GB to 3.1 GB). We think we have tracked this down to string reference counts (in the gcc implementation) not being "restored": strings that shared one buffer before serialization come back as independent copies.

I think a solution for us is to do something like the following:

    static map<string, bool> string_map;

    template <class Archive>
    void read_string(Archive& ar, string& a_string)
    {
        string s;
        ar >> s;  // read from disk
        map<string, bool>::iterator i = string_map.find(s);
        if (i == string_map.end()) {
            // insert() returns pair<iterator, bool>; keep the iterator
            i = string_map.insert(make_pair(s, true)).first;
        }
        a_string = i->first;  // copy shares the pooled string's buffer
    }

    void destroy_map()
    {
        string_map.clear();
    }

Then, when the data structures have all been read, invoke destroy_map() to clear the string_map object, thus decrementing each string's refcount by one.

Has anyone else encountered this and found a solution? Also, if anyone has bright ideas on a better data structure than std::map for storing hundreds of millions of strings at once for the above purpose, that would also be welcome.

Thanks.
Bill
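[One idea, as a follow-up: since the bool payload in map<string, bool> is never used, a std::set<std::string> does the same job with less overhead per entry. Below is a minimal sketch of such an interning pool -- the name StringPool is my own, and the refcount-sharing payoff assumes a copy-on-write basic_string implementation such as the old libstdc++ one (C++11 and later forbid COW strings, so on modern compilers this deduplicates logically but each assignment still copies).]

    #include <cstddef>
    #include <set>
    #include <string>

    // Hypothetical interning pool.  std::set nodes are stable, so the
    // reference returned by intern() stays valid until clear() is called.
    // Under a COW string implementation, assigning the returned string
    // to a field just bumps a refcount instead of copying the buffer.
    class StringPool {
    public:
        // Return the pooled copy of s, inserting it on first sight.
        const std::string& intern(const std::string& s) {
            return *pool_.insert(s).first;  // no-op insert if already present
        }

        std::size_t size() const { return pool_.size(); }

        // Drop the pool's own reference to each string once all the
        // data structures have been read.
        void clear() { pool_.clear(); }

    private:
        std::set<std::string> pool_;
    };

[After deserialization, calling clear() plays the role of destroy_map() above: the pool releases its references, leaving only the ones held by the real data structures.]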