Re:[boost] Serializer and when to stop

Darren Cook wrote:
I'm using boost::serializer between two programs: the first creates the data then serializes it, the second loads that data in and analyzes it.
When I had just 420 data samples it all worked fine: I created a vector<Sample> in memory, serialized it, and loaded it on the other side. When I moved to 7000 data samples the first program, the creator, sucked all the memory from the machine and then some.
7000 samples? That doesn't seem very large in today's environment. I can't see why creating an archive should consume any significant memory at all. Nothing significant is constructed. Perhaps there is a bug somewhere. In any case, you do raise an interesting issue which would crop up if serialization were used in something like data logging.
So I moved to serializing each data sample as it was created. To read it back in I changed my code from this:
    std::ifstream input_file(fname, std::ios::binary);
    boost::archive::binary_iarchive archive(input_file);
    std::vector<Sample> samples;
    archive >> samples;
    input_file.close();
Into this:
    std::ifstream input_file(fname, std::ios::binary);
    boost::archive::binary_iarchive archive(input_file);
    std::vector<Sample> samples;
    samples.reserve(8000);
    while (!input_file.eof()) {
        Sample s;
        archive >> s;
        samples.push_back(s);
    }
    input_file.close();
But every time it fails with an assert [1]. This seems to be when it has reached end of file. I cannot know in advance how many data samples I'll write to disk, so it seems I have three options:
A: Make a special "zero" version of Sample to mark end of file.
B: At the end of program 1, write the number of samples to a special file, and read that in before starting the above loop, so I know when to stop.
C: Alter the serializer class to throw an exception instead of an assert; I can then catch it and carry on.
I think I changed the assert/exception to check before loading rather than after for the next version. This would permit your code above to work as you expect. I'll have to double check.
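One plausible contributor to the failure, independent of the library, is the general iostream behaviour of eof(): it only becomes true after a read has already run into the end of the stream, so a `while(!eof())` loop attempts one read too many. The sketch below shows this with plain std::istream text extraction as a stand-in for the archive. The Sample fields and the function names are hypothetical, chosen only for illustration, and this is not how the serialization library itself detects end of data.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the thread's Sample class.
struct Sample {
    int id;
    double value;
};

// The pattern from the post: eof() only becomes true after a read has already
// hit the end, so the body runs once more than there are records.
// (Assumes well-formed input; malformed input would never set eof here.)
std::size_t passes_with_eof_test(std::istream& in) {
    std::size_t passes = 0;
    while (!in.eof()) {
        Sample s{};
        in >> s.id >> s.value;  // on the final pass this read fails
        ++passes;
    }
    return passes;
}

// Using the read itself as the loop condition: a failed read ends the loop
// before anything is stored.
std::vector<Sample> read_all(std::istream& in) {
    std::vector<Sample> samples;
    Sample s{};
    while (in >> s.id >> s.value) {
        samples.push_back(s);
    }
    return samples;
}

// Usage sketch: two records, each followed by a newline.
void example_usage() {
    std::istringstream first("1 1.5\n2 2.5\n");
    assert(passes_with_eof_test(first) == 3);  // one pass too many

    std::istringstream second("1 1.5\n2 2.5\n");
    assert(read_all(second).size() == 2);
}
```

With an archive in place of the raw stream, that extra pass is where a load would be attempted with no data left, which would fit the assert being hit at end of file.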
What is the best way? Is adding a terminator byte to boost::serializer an option (i.e. so I don't have to make a terminator version of each data structure I want to serialize).
Incidentally my above loop does a copy of each sample as it adds it to the vector. Is the above code going to be common enough to make it worth including into boost::serializer as a helper function, which could then be optimized with in-place construction or something clever like that? (And then it could handle the termination handling itself as well.)
The default implementation of STL collections creates a new element on the stack and copies it into the collection. It's as clever as I can make it without getting tripped up in things like exception safety, objects without default constructors, etc.
Robert Ramey
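For the caller-side loop, one way to avoid the extra copy is to grow the vector first and read straight into the new element. The sketch below assumes a C++11 compiler (for emplace_back) and a default-constructible Sample, and it uses plain std::istream extraction in place of the archive. Sample's fields and read_in_place are hypothetical names. It runs into exactly the caveats mentioned above: it does not work for types without a default constructor, and if the read throws, a half-filled element is left in the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the thread's Sample class.
struct Sample {
    int id;
    double value;
};

// Reads records directly into storage owned by the vector, so no temporary
// Sample is copied in. `expected` is only a capacity hint.
std::vector<Sample> read_in_place(std::istream& in, std::size_t expected = 0) {
    std::vector<Sample> samples;
    samples.reserve(expected);
    for (;;) {
        samples.emplace_back();          // value-initialized element in place
        Sample& slot = samples.back();
        if (!(in >> slot.id >> slot.value)) {
            samples.pop_back();          // read failed: drop the unused slot
            break;
        }
    }
    return samples;
}

// Usage sketch.
void example_in_place() {
    std::istringstream in("1 1.5\n2 2.5\n");
    std::vector<Sample> samples = read_in_place(in, 8);
    assert(samples.size() == 2);
    assert(samples.back().id == 2);
}
```

For a small, cheaply copied Sample the gain is probably negligible. It matters more when each element owns large buffers.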

When I had just 420 data samples it all worked fine: I created a vector<Sample> in memory, serialized it, and loaded it on the other side. When I moved to 7000 data samples the first program, the creator, sucked all the memory from the machine and then some.
7000 samples? That doesn't seem very large in today's environment. I can't see why creating an archive should consume any significant memory at all.
Sorry if it sounded like the archive was the problem. The issue was holding 7000 samples in memory in a std::vector, along with all the other stuff the creator program also holds in memory. So I needed to move to holding only one sample in memory at a time.
I think I changed the assert/exception to check before loading rather than after for the next version. This would permit your code above to work as you expect.
OK.
What is the best way? Is adding a terminator byte to boost::serializer an option (i.e. so I don't have to make a terminator version of each data structure I want to serialize).
Actually, adding a terminator value for the Sample class took 15 minutes and worked first time. So perhaps it is less fuss than I originally thought to add it for each class I want to archive in an open-ended list like this.
Darren
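The terminator approach described above can be sketched roughly as follows. This is a minimal illustration using plain text streams rather than a Boost archive, and it is not the code from the thread: the Sample fields, the sentinel id of -1, and the function names are all assumptions made for the example. The writer emits one record at a time, so only a single Sample needs to be in memory, and then a terminator record. The reader stops when it sees the terminator.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the thread's Sample class.
struct Sample {
    int id;
    double value;
};

// Assumed sentinel: real samples are taken to have non-negative ids.
const int kTerminatorId = -1;

// Writes one record; called once per sample as it is produced.
void write_sample(std::ostream& out, const Sample& s) {
    assert(s.id != kTerminatorId);  // a real sample must not look like the end marker
    out << s.id << ' ' << s.value << '\n';
}

// Writes the end-of-list marker after the last real sample.
void write_terminator(std::ostream& out) {
    out << kTerminatorId << " 0\n";
}

// Reads records until the terminator. A failed read also ends the loop, so
// a file truncated before its terminator does not cause an endless loop.
std::vector<Sample> read_until_terminator(std::istream& in) {
    std::vector<Sample> samples;
    Sample s{};
    while (in >> s.id >> s.value && s.id != kTerminatorId) {
        samples.push_back(s);
    }
    return samples;
}
```

One limitation of this sketch is that the reader cannot tell a truncated file from a properly terminated one. Returning a flag that records whether the terminator was actually seen would close that gap.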
participants (2)
- Darren Cook
- Robert Ramey