[serialization] Skipping unexpected elements in XML archive
Hi,
I'm using the Serialization library, but I need to skip unrecognized data
when reading in an archive. This is for forward compatibility: older
versions of my software must be able to consume files produced by newer
ones, which may have added elements.
As this functionality is not currently in the library, I have created a
new archive type (as I think Robert Ramey suggested at some point). It
inherits from xml_iarchive_impl and overrides load_override to identify
when it has just read an opening tag with an unexpected name. It then
skips whatever XML it finds within the unexpected element, reads the
closing tag, and returns to normal processing. I use it with the
unaltered xml_oarchive, and the pair satisfy my requirements.
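The skipping step described above can be illustrated without Boost. What follows is only a minimal sketch of the idea, not the archive code itself: `skip_element_body` is a hypothetical helper that assumes the start tag of the unexpected element has already been read, and that no comment or attribute value contains a '>' character.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Boost.Serialization): consume everything
// up to and including the end tag that closes the element whose start tag
// has just been read. Returns false if the input ends or is malformed first.
bool skip_element_body(std::istream& is)
{
    int depth = 1;                       // we are inside one open element
    char ch;
    while (depth > 0 && is.get(ch)) {
        if (ch != '<')
            continue;                    // character data: nothing to track
        std::string tag;
        std::getline(is, tag, '>');      // tag text without '<' and '>'
        if (!is || is.eof() || tag.empty())
            return false;                // no closing '>' or an empty tag
        if (tag[0] == '/')
            --depth;                     // end tag closes one level
        else if (tag[0] == '?' || tag[0] == '!')
            continue;                    // declaration or comment: ignore
        else if (tag[tag.size() - 1] != '/')
            ++depth;                     // start tag opens one level
        // a self-closing tag such as <c/> leaves the depth unchanged
    }
    return depth == 0;
}
```

After a successful call the stream is positioned just past the closing tag of the skipped element, so normal processing can resume with the next sibling.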
I'm including it below in the hopes that someone will find it useful.
I've tested it on MSVC 7.1 (Visual Studio .NET 2003) with a reasonably
complex XML file with both nested and non-nested unexpected elements. If
you'd like, I can post some tests as well (I just need to strip out some
project-specific details).
Please let me know if you see any places where it could be improved, or
where I've completely missed something; this is my first use of the
serialization library, and my understanding is far from comprehensive.
Many thanks to all who've contributed to the Serialization library,
Todd Greer
Todd, thanks for submitting this. I have the same need, plus one more: the XML elements can be reordered, i.e. they may arrive in a different order than the serialization library expects. I've tried to implement this, but I'm stuck, since the serialization library seems to parse only XML fragments rather than the whole file in one shot. Would you have any idea how to solve this?
Thanks,
Peter
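One possible direction for the reordering problem, sketched here without Boost and not something the Serialization library is known to provide, is to buffer the sibling elements first and then look them up by name instead of by position. `index_children` is a hypothetical helper; it assumes sibling names are unique, tags carry no attributes, and an element does not contain a descendant with its own name.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: map each top-level element name in an XML fragment
// to the raw text between its start and end tags.
std::map<std::string, std::string> index_children(const std::string& xml)
{
    typedef std::string::size_type size_type;
    std::map<std::string, std::string> children;
    size_type pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        const size_type name_end = xml.find('>', pos);
        if (name_end == std::string::npos)
            break;                                   // unterminated start tag
        const std::string name = xml.substr(pos + 1, name_end - pos - 1);
        const std::string close = "</" + name + ">";
        const size_type body_end = xml.find(close, name_end + 1);
        if (body_end == std::string::npos)
            break;                                   // missing end tag
        children[name] = xml.substr(name_end + 1, body_end - name_end - 1);
        pos = body_end + close.size();               // continue after the end tag
    }
    return children;
}
```

With such an index, a reader could request "x" before "y" regardless of the order in which they appear in the file, at the cost of holding the fragment in memory.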
Sounds to me like others might find this useful. How about making a short write-up, and maybe a small test/demo, zipping it up and uploading it to the serialization section of the Boost Vault?
Robert Ramey
Todd Greer wrote:
Senior Software Developer, Affinegy LLC

// The names of the first two headers were lost from the archived message.
#include
#include
#include <algorithm>

const char whitespace[] = " \t\n\r";

// Skip whitespace, then return true if we're looking at '</'.
// Leave the stream right before the '<'.
template<class IStream>
bool at_close_tag(IStream& is)
{
    char ch;
    using boost::end;
    do {
        ch = is.get();
    } while(std::find(whitespace, end(whitespace), ch) != end(whitespace));

    assert(ch == '<');
    ch = is.get();
    bool const end_tag = ch == '/';
    is.putback(ch);
    is.putback('<');
    return end_tag;
}

// Members inherited from the dependent base class are reached through
// this-> so that conforming compilers can find them.
template<class Archive>
class skip_xml_iarchive_impl:
    public boost::archive::xml_iarchive_impl<Archive>
{
#if BOOST_WORKAROUND(BOOST_MSVC, <= 1300) // see xml_iarchive_impl
public:
#elif defined(BOOST_MSVC)
    friend boost::archive::detail::interface_iarchive<Archive>;
protected:
#else
    friend class boost::archive::detail::interface_iarchive<Archive>;
protected:
#endif
    skip_xml_iarchive_impl(std::istream & is, unsigned int flags = 0):
        boost::archive::xml_iarchive_impl<Archive>(is, flags)
    {}

    // Skip the content of the element whose start tag has just been read,
    // recursing into any child elements.
    void skip_all_elements()
    {
        std::string ignore;
        // Non-whitespace character data means there are no child elements.
        if(this->This()->gimpl->parse_string(this->This()->get_is(), ignore)
            && ignore.npos != ignore.find_first_not_of(whitespace))
            return;

        // We're either looking at a start tag or an end tag.
        while(!at_close_tag(this->This()->get_is()))
        {   // It's a start tag.
            if(!this->This()->gimpl->parse_start_tag(this->This()->get_is()))
                boost::throw_exception(boost::archive::archive_exception(
                    boost::archive::archive_exception::stream_error));
            ++this->depth;
            std::string const name = this->This()->gimpl->rv.object_name;
            skip_all_elements();
            this->load_end(name.c_str());
        }
    }

    template<class T>
    void load_override(
        #ifndef BOOST_NO_FUNCTION_TEMPLATE_ORDERING
        const
        #endif
        boost::serialization::nvp<T> & t,
        int
    ){
        if(!t.name())
            return boost::archive::xml_iarchive_impl<Archive>::load_override(t, 0);
        this->load_start(t.name());
        // don't check start tag at highest level
        const std::string name_found = this->This()->gimpl->rv.object_name;
        bool const skip = this->depth > 1
            && 0 != (this->get_flags() & boost::archive::no_xml_tag_checking)
            && name_found != t.name();
        if(skip)
            skip_all_elements();
        else
            boost::archive::load(* this->This(), t.value());
        this->load_end(skip ? name_found.c_str() : t.name());
        if(skip) // We still haven't loaded t...
            load_override(t, 0); // ...so load t.
    }

    template<class T>
    void load_override(T & t, BOOST_PFTO int i)
    {
        boost::archive::xml_iarchive_impl<Archive>::load_override(t, i);
    }
};

class skip_xml_iarchive:
    public skip_xml_iarchive_impl<skip_xml_iarchive>
{
public:
    skip_xml_iarchive(std::istream & is, unsigned int flags = 0) :
        skip_xml_iarchive_impl<skip_xml_iarchive>(is, flags)
    {}
};
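The at_close_tag helper above puts two characters back with consecutive putback calls, which not every stream buffer may support. As a hedged alternative, here is a sketch of the same look-ahead using tellg and seekg. `at_close_tag_seek` is a hypothetical name, and the sketch assumes a seekable stream such as a string stream or an ordinary file.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical variant of at_close_tag: skip whitespace, report whether the
// next markup starts with "</", and leave the stream positioned at the '<'.
bool at_close_tag_seek(std::istream& is)
{
    is >> std::ws;                       // discard leading whitespace
    if (!is.good())
        return false;                    // nothing left to inspect
    const std::istream::pos_type start = is.tellg();
    const int first = is.get();
    const int second = is.get();
    is.clear();                          // reset eofbit/failbit if we ran out
    is.seekg(start);                     // rewind to just before the '<'
    return first == '<' && second == '/';
}
```

Unlike the original, this variant returns false instead of asserting when the next character is not '<', which may or may not suit the archive's error handling.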
participants (3): Peter, Robert Ramey, Todd Greer