
Hello boost-users. I am writing to report an annoying serialization bug encountered while developing a project with Boost.multi_index, which is a wonderfully useful data structure. Thank you Joaquín! First a fast overview of my project: I am developing a virtual file system for my internship. In this file system, I have to store user paths which I call "entries". "entries" are associated with another "internal entry" which is the path to the physical target inside the file system. "entries" are also contained in a concept of "directory". To ease up things I store entries in a boost::multi_index with at least 3 indexes: - an unique index to look up user entries. - a non-unique index to do reverse look up (by internal targets). - a non-unique index to do directory listings. The latter is the one that interests us. The "entry" class splits the entry into the "dirname" and "file" parts. A getter function returns a const-reference to "dirname", let's call it "entry::dirname()". To do directory listings efficiently, I define an ordered-non-unique index pointinh to the const-reference returned by the entry.dirname() thanks to a wrapper "mem_fun" named getdirname(). Now comes the bug. To test my file system, I import some existing directories into my file system. When I unload it, I expect that the internal boost::multi_index is serialized and restored properly when I reload the file system program. When working on huge batches of small files, like untaring a linux kernel or copying the thousands of files from /usr/share/doc, I noticed that the import works but reloading the serialized mappings fails with a (for now unexplained but reproducable) boost::serialization::other_exception. After hunting through the headers, I located the triggered exception to line 1080 of boost/multi_index/ordered_index.hpp (line number from boost 1.33) Here is the context from the "rearranger" method where the exception is thrown:
1080 if(position!=x){ 1081 /* check the rearrangement is consistent */ 1082 if(!in_place(x->value,position,Category())){ 1083 throw_exception( 1084 archive::archive_exception( 1085 archive::archive_exception::other_exception)); 1086 }
Since my multi_index works perfectly until I save, I checked that my data is properly serialized. Since I saw no obvious errors, I modified the header to swallow the exception and see what happens.
1080 if(position!=x){ 1081 /* check the rearrangement is consistent */ 1082 if(!in_place(x->value,position,Category())){ 1083 return; throw_exception( 1084 archive::archive_exception( 1085 archive::archive_exception::other_exception)); 1086 }
After this small hack, everything "seems" to work fine but if I do a "find" on my file system some entries appear "missing". They are actually not missing but they were attributed to the wrong directory! And in the wrong parent directory, a lookup says "file not found" when find tries to access it. However, I can still access the entries by naming them explicitely (using the main index, not the dirname one). I investigated and found out that the "dirname" ordered_index (const mem_fun) is not restored properly, even through it is saved correctly. The "entry" and the "target" ordered indexes (member) are restored as intended. However look what happens to the mem_fun "dirname" index: its values are restored in positions that are not a valid binary tree. Actually it is "almost" a binary tree, just some values are wrongly restored. Like a dozen out of 1000 entries, but always the same ones. As a consequence even if my entry::operator<() predicate is proved correct, lookups through the "dirname" index give byzantine results. I tracked down the problem and found a small set of "entries" that are not restored properly and wrote a small example file. Please have a look at the attached source "seri-bug.cpp". The example dumps the contents of a 5 entries multi_index, first by entry and then by "dirname". The displaying is done before serializing to disk and after reloading the mappings. The corruption can then be see by checking the different position inside the dirname index for the "/doc/binutils-doc" entry before and after. Sample run: Original structure dump by dirname, pre serialization.
entry `/' with dirname `'. entry `/doc' with dirname `'. entry `/doc/binutils-doc' with dirname `/doc'. entry `/doc/binutils-doc/ld' with dirname `/doc/binutils-doc'. entry `/doc/binutils-doc/gas' with dirname `/doc/binutils-doc'.
Post serialized loading dump. Notice that the "/doc/binutils-doc" entry was not restored at the correct position. Since the dirnames are not properly sorted in memory, doing a find() by dirname obtains byzantine results (doing a find of "/doc" dirname returns the `/doc/binutils-doc/ld' result instead of the expected `/doc/binutils-doc' entry).
entry `/' with dirname `'. entry `/doc' with dirname `'. entry `/doc/binutils-doc/ld' with dirname `/doc/binutils-doc'. entry `/doc/binutils-doc' with dirname `/doc'. entry `/doc/binutils-doc/gas' with dirname `/doc/binutils-doc'.
Even if this sample triggers the restore anomaly, to see a boost::archive::other_exception one needs more data inside the multi_index, like a few thousand entries. I would really love to see a this bug fixed since it stalls my projects. My c++ template metaprogramming skills aren't very sharp but I think that this bug is a good opportunity to learn some about this subject. I am looking forward to test patches and offer my help testing, so that this little quirk gets solved. Sincerely, Julien Perville Intern, Global Core, Amadeus IT Group.