New subject: [serialization] Skipping unexpected elements in XML archive

20 Jul 2006

Hi,

I'm using the Serialization library, but I need to skip unrecognized data
when reading in an archive. This is for forward compatibility: older
versions of my software need to consume files produced by newer ones,
which may have added elements.

As this functionality is not currently in the library, I have created a
new archive type (as I think Robert Ramey suggested at some point). It
inherits from xml_iarchive_impl and overrides load_override to identify
when it has just read an opening tag with an unexpected name. It then
skips whatever XML it finds within the unexpected element, reads the
closing tag, and returns to normal processing. I use it with the
unaltered xml_oarchive, and the pair satisfy my requirements.
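
To show the skipping idea in isolation, here is a minimal sketch that does not use Boost at all. It assumes a much-simplified XML subset (no attributes, comments, CDATA, or self-closing tags), and the names next_tag, skip_element, read_known, and check_example are hypothetical helpers invented for this illustration; none of them are part of the Serialization library or of the archive posted below.

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>
#include <stdexcept>
#include <string>

// Discard any character data, then read one tag and return the text
// between '<' and '>', e.g. "name" for <name> or "/name" for </name>.
inline std::string next_tag(std::istream& is)
{
    is.ignore(std::numeric_limits<std::streamsize>::max(), '<');
    std::string tag;
    if (!std::getline(is, tag, '>') || is.eof())
        throw std::runtime_error("malformed or truncated input");
    return tag;
}

// The start tag <name> has already been consumed. Skip everything inside
// the element, including nested elements, through the matching </name>.
inline void skip_element(std::istream& is, const std::string& name)
{
    for (;;) {
        const std::string tag = next_tag(is);
        if (tag == "/" + name)
            return;                               // matching close tag
        if (!tag.empty() && tag[0] == '/')
            throw std::runtime_error("mismatched close tag");
        skip_element(is, tag);                    // nested start tag
    }
}

// Read <expected>value</expected>, skipping any unexpected elements that
// appear before it, and return the value text.
inline std::string read_known(std::istream& is, const std::string& expected)
{
    for (;;) {
        const std::string tag = next_tag(is);
        if (!tag.empty() && tag[0] == '/')
            throw std::runtime_error("expected element not found");
        if (tag == expected)
            break;
        skip_element(is, tag);                    // unexpected element
    }
    std::string value, close;
    std::getline(is, value, '<');                 // text up to the close tag
    std::getline(is, close, '>');
    if (close != "/" + expected)
        throw std::runtime_error("mismatched close tag");
    return value;
}

// Usage: an "old" reader that only knows width and height consumes a
// "newer" file containing nested and non-nested unknown elements.
inline void check_example()
{
    std::istringstream in(
        "<width>3</width>"
        "<extra><inner>x</inner> text </extra>"
        "<more>1</more>"
        "<height>4</height>");
    assert(read_known(in, "width") == "3");
    assert(read_known(in, "height") == "4");
}
```

The shape mirrors the description above: read a start tag, compare its name with the expected one, skip the whole element on a mismatch, and then try again for the element that was actually requested.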

I'm including it below in the hopes that someone will find it useful.
I've tested it on MSVC 7.1 (Visual Studio .NET 2003) with a reasonably
complex XML file with both nested and non-nested unexpected elements. If
you'd like, I can post some tests as well (I just need to strip out some
project-specific details).

Please let me know if you see any places where it could be improved, or
where I've completely missed something; this is my first use of the
serialization library, and my understanding is far from comprehensive.

Many thanks to all who've contributed to the Serialization library,
Todd Greer    <tgreer <at> affinegy dot com>
Senior Software Developer, Affinegy LLC

#include <boost/archive/xml_iarchive.hpp>
#include <boost/range/end.hpp>
#include <algorithm>
#include <cassert> // assert, used in at_close_tag
#include <string>

const char whitespace[] = " \t\n\r";

//Skip whitespace, then return true if we're looking at '</'.
//Leave the stream right before the '<'.
template<class IStream>
bool at_close_tag(IStream& is)
{
  char ch;
  using boost::end;
  do
  {
    ch = is.get();
  } while(std::find(whitespace, end(whitespace), ch) != end(whitespace));

  assert(ch == '<');
  ch = is.get();
  bool const end_tag = ch == '/';
  is.putback(ch);
  is.putback('<');
  return end_tag;
}

template<class Archive>
class skip_xml_iarchive_impl:
  public boost::archive::xml_iarchive_impl<Archive>
{
#if BOOST_WORKAROUND(BOOST_MSVC, <= 1300) //see xml_iarchive_impl
public:
#elif defined(BOOST_MSVC)
  friend boost::archive::detail::interface_iarchive<Archive>;
protected:
#else
  friend class boost::archive::detail::interface_iarchive<Archive>;
protected:
#endif

  skip_xml_iarchive_impl(std::istream & is, unsigned int flags = 0):
    boost::archive::xml_iarchive_impl<Archive>(is, flags)
  {}

  void skip_all_elements()
  {
    std::string ignore;
    if(this->This()->gimpl->parse_string(this->This()->get_is(), ignore)
      && ignore.npos != ignore.find_first_not_of(whitespace))
      return;

    //We're either looking at a start tag or an end tag.
    while(!at_close_tag(this->This()->get_is()))
    {//It's a start tag.
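      if(!this->This()->gimpl->parse_start_tag(this->This()->get_is()))
        boost::throw_exception(boost::archive::archive_exception(
          boost::archive::archive_exception::stream_error));

      //depth and load_end live in a dependent base class, so they are
      //reached through this-> for compilers with two-phase name lookup.
      ++this->depth;
      std::string const name = this->This()->gimpl->rv.object_name;
      skip_all_elements();
      this->load_end(name.c_str());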
    }
  }

  template<class T>
    void load_override(
#ifndef BOOST_NO_FUNCTION_TEMPLATE_ORDERING
    const
#endif
    boost::serialization::nvp<T> & t, 
    int
    ){
      if(!t.name())
        return boost::archive::xml_iarchive_impl<Archive>::load_override(t, 0);
      this->load_start(t.name());
      // Name of the element actually found in the stream.
      const std::string name_found = this->This()->gimpl->rv.object_name;
      // don't check start tag at highest level
      bool const skip =
        this->depth > 1
        && 0 != (this->get_flags() & boost::archive::no_xml_tag_checking)
        && name_found != t.name();
      if(skip)
        skip_all_elements();
      else
        boost::archive::load(* this->This(), t.value());
      this->load_end(skip ? name_found.c_str() : t.name());
      if(skip) //We still haven't loaded t...
        load_override(t, 0); //...so load t.
    }

    template<class T>
    void load_override(T & t, BOOST_PFTO int i)
    { boost::archive::xml_iarchive_impl<Archive>::load_override(t, i); }
};

class skip_xml_iarchive: public
skip_xml_iarchive_impl<skip_xml_iarchive>
{
public:
    skip_xml_iarchive(std::istream & is, unsigned int flags = 0) :
        skip_xml_iarchive_impl<skip_xml_iarchive>(is, flags)
    {}
};
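
One possible refinement of the at_close_tag helper above, offered only as a sketch: the original stores the result of get() in a char and relies on assert, so a truncated file goes undetected in release builds. The variant below, under the hypothetical name at_close_tag_checked, throws instead and uses peek()/unget() rather than two putback calls. It assumes C++11 stream semantics (unget clears eofbit first), and it skips whitespace with std::ws, which follows the stream's locale rather than the fixed " \t\n\r" set used above.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>

// Hypothetical variant of at_close_tag: skip whitespace, then return true
// if the stream is looking at "</". Leaves the stream right before the '<'.
inline bool at_close_tag_checked(std::istream& is)
{
    is >> std::ws;                            // skip leading whitespace
    if (is.get() != '<')                      // also catches end of input
        throw std::runtime_error("expected '<'");
    const bool end_tag = (is.peek() == '/');  // look ahead without consuming
    is.unget();                               // step back over the '<'
    return end_tag;
}
```

With a std::istringstream holding "  \n</item>", the function returns true and the next character read is still '<'; with "<item>", it returns false and the stream is likewise left at the '<'.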

