Re: [Boost-users] [serialization] Skipping unexpected elements in XMLarchive

22 Jul 2006

It sounds to me like others might find this useful. How about making a little
write-up, and maybe a small test/demo, zipping it up, and uploading it to the
serialization section of the Boost Vault?

Robert Ramey.

Todd Greer wrote:
...
Hi,
I'm using the Serialization library, but need to skip unrecognized
data when reading in an archive. This is for forward compatibility
purposes; I need older versions of my software to consume files
produced by newer ones, which may have added elements.
As this functionality is not currently in the library, I have created
a new archive type (as I think Robert Ramey suggested at some point).
It inherits from xml_iarchive_impl and overrides load_override to
identify when it has just read an opening tag with an unexpected
name. It then skips whatever XML it finds within the unexpected
element, reads the closing tag, and returns to normal processing. I
use it with the unaltered xml_oarchive, and the pair satisfy my
requirements.
I'm including it below in the hopes that someone will find it useful.
I've tested it on MSVC 7.1 (Visual Studio .NET 2003) with a reasonably
complex XML file with both nested and non-nested unexpected elements.
If you'd like, I can post some tests as well (I just need to strip
out some project-specific details).
Please let me know if you see any places where it could be improved,
or where I've completely missed something; this is my first use of the
serialization library, and my understanding is far from comprehensive.
Many thanks to all who've contributed to the Serialization library,
Todd Greer    <tgreer <at> affinegy dot com>
Senior Software Developer, Affinegy LLC
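#include <boost/archive/xml_iarchive.hpp>
#include <boost/archive/archive_exception.hpp>
#include <boost/detail/workaround.hpp>
#include <boost/range/end.hpp>
#include <boost/serialization/nvp.hpp>
#include <boost/throw_exception.hpp>
#include <algorithm>
#include <cassert>
#include <istream>
#include <string>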
const char whitespace[] = " \t\n\r";
//Skip whitespace, then return true if we're looking at '</'.
//Leave the stream right before the '<'.
template<class IStream>
bool at_close_tag(IStream& is)
{
 char ch;
 using boost::end;
 do
 {
   ch = is.get();
 } while(std::find(whitespace, end(whitespace), ch) != end(whitespace));
 assert(ch == '<');
 ch = is.get();
 bool const end_tag = ch == '/';
 is.putback(ch);
 is.putback('<');
 return end_tag;
}
template<class Archive>
class skip_xml_iarchive_impl:
 public boost::archive::xml_iarchive_impl<Archive>
{
#if BOOST_WORKAROUND(BOOST_MSVC, <= 1300) //see xml_iarchive_impl
public:
#elif defined(BOOST_MSVC)
 friend boost::archive::detail::interface_iarchive<Archive>;
protected:
#else
 friend class boost::archive::detail::interface_iarchive<Archive>;
protected:
#endif
 // The base class lives in boost::archive, so qualify it explicitly.
 skip_xml_iarchive_impl(std::istream & is, unsigned int flags = 0):
   boost::archive::xml_iarchive_impl<Archive>(is, flags)
 {}
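 // Skip the remaining content of the current (unexpected) element,
 // recursing into any nested elements. Stops just before its end tag.
 void skip_all_elements()
 {
   std::string ignore;
   // Non-whitespace character data means this is a leaf value: done.
   if(this->This()->gimpl->parse_string(this->This()->get_is(), ignore)
     && ignore.npos != ignore.find_first_not_of(whitespace))
     return;
   // We're either looking at a start tag or an end tag.
   while(!at_close_tag(this->This()->get_is()))
   { // It's a start tag.
     if(!this->This()->gimpl->parse_start_tag(this->This()->get_is()))
       boost::throw_exception(boost::archive::archive_exception(
         boost::archive::archive_exception::stream_error));
     ++this->depth; // paired with the load_end() call below
     std::string const name = this->This()->gimpl->rv.object_name;
     skip_all_elements();
     this->load_end(name.c_str());
   }
 }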
 // Load a name-value pair. When no_xml_tag_checking is set, first skip
 // any element whose tag name differs from the expected one.
 template<class T>
 void load_override(
#ifndef BOOST_NO_FUNCTION_TEMPLATE_ORDERING
   const
#endif
   boost::serialization::nvp<T> & t,
   int
 ){
   if(!t.name())
     return boost::archive::xml_iarchive_impl<Archive>::load_override(t, 0);
   this->load_start(t.name());
   const std::string name_found = this->This()->gimpl->rv.object_name;
   // Don't check the start tag at the highest level.
   bool const skip =
     this->depth > 1
     && 0 != (this->get_flags() & boost::archive::no_xml_tag_checking)
     && name_found != t.name();
   if(skip)
     skip_all_elements();
   else
     boost::archive::load(* this->This(), t.value());
   this->load_end(skip ? name_found.c_str() : t.name());
   if(skip) // We still haven't loaded t...
     load_override(t, 0); // ...so load t.
 }
 // Everything that is not a name-value pair goes straight to the base class.
 template<class T>
 void load_override(T & t, BOOST_PFTO int i)
 { boost::archive::xml_iarchive_impl<Archive>::load_override(t, i); }
};
class skip_xml_iarchive:
 public skip_xml_iarchive_impl<skip_xml_iarchive>
{
public:
   skip_xml_iarchive(std::istream & is, unsigned int flags = 0) :
       skip_xml_iarchive_impl<skip_xml_iarchive>(is, flags)
   {}
};
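
The recursive skipping in skip_all_elements() can be tried out in isolation,
without the library's internal grammar. The following is only a rough,
Boost-free sketch of the same idea on a plain std::istream. The helpers
next_tag and skip_element are hypothetical names invented for this
illustration and are not part of Boost.Serialization. The sketch also assumes
simple XML: no comments, no CDATA sections, and no '>' inside attribute values.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: discard character data up to the next '<', then
// return the text between '<' and '>' (for example "b", "/b" or "c/").
std::string next_tag(std::istream& is)
{
    std::string text, tag;
    std::getline(is, text, '<');
    if (!is || is.eof())
        throw std::runtime_error("expected a tag");
    std::getline(is, tag, '>');
    if (!is || is.eof() || tag.empty())
        throw std::runtime_error("unterminated or empty tag");
    return tag;
}

// Hypothetical helper: assuming the start tag <name> has just been consumed,
// skip everything up to and including the matching </name>.
void skip_element(std::istream& is, const std::string& name)
{
    for (;;) {
        const std::string tag = next_tag(is);
        if (tag[0] == '/') {
            // An end tag: it has to close the element being skipped.
            if (tag.substr(1) != name)
                throw std::runtime_error("mismatched end tag");
            return;
        }
        if (tag[tag.size() - 1] == '/')
            continue; // self-closing element, nothing nested to skip
        // A nested start tag: its name ends at the first whitespace.
        skip_element(is, tag.substr(0, tag.find_first_of(" \t\n\r")));
    }
}
```

For instance, after next_tag has returned "added" for the input
<added><x>2</x></added><next>3</next>, calling skip_element(is, "added")
leaves the stream positioned right before <next>, which is roughly what the
archive above does before it retries loading the expected element.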