On Mon, Jun 30, 2014 at 5:07 PM, Michael Powell <mwpowellhtx@gmail.com> wrote:
On Mon, Jun 30, 2014 at 4:44 PM, Michael Powell <mwpowellhtx@gmail.com> wrote:
On Mon, Jun 30, 2014 at 3:14 PM, Michael Powell <mwpowellhtx@gmail.com> wrote:
Hello,
I am building out a general-use XML parser that handles attributes, an arbitrary number of elements, and so on.
So far so good; parsing names and so forth makes sense. However, how do you handle element content? It could be either a string or zero or more other elements (basically following the same rule as the enclosing element rule).
It would seem you need a terminus, the empty element tag, so that parsing populates the parent (initial) element and its children (of the same element kind).
I'll be adapting structs to capture the results. I am also using a couple of helpful references, for instance:
http://www.w3.org/TR/xml11/ http://stackoverflow.com/questions/9473843/boost-spirit-how-to-extend-xml-pa...
Having read the XML specification and some Boost tickets from several years ago, I don't see why the following couldn't represent content:
content %= *(chars_ - chars_("<&")) | *(comment | child_element);
Here comment is defined as expected, and child_element is where recursion into the element grammar (in which content is defined) can occur. It is basically a member variable of the same type as the containing struct (the element grammar).
Indeed, I cooked up a simple(ish) example, and I get the error:
Error 3 error C2460: 'xml::xml_element_grammar<std::_String_const_iterator<std::_String_val<std::_Simple_types<char>>>,boost::spirit::ascii::space_type>::child_element' : uses 'xml::xml_element_grammar<std::_String_const_iterator<std::_String_val<std::_Simple_types<char>>>,boost::spirit::ascii::space_type>', which is being defined i:\source\kingdom software\cppxml\xml\xiparser.h 187 1 xml
Nothing fancy, fairly plain old XML there:
using boost::spirit::qi::phrase_parse;
using boost::spirit::ascii::space;
std::string txt = "<test><one /><two>2</two><three att=\"3\"/></test>";
xml::xml_element_grammar<> g;
xml::xelement element;
bool result = phrase_parse(txt.cbegin(), txt.cend(), g, space, element);
How do you model the case where a parent needs to look like a child, depending on the direction of the grammar's rule? In other words, the defining rule is a "parent", but once it is done parsing, it could very well act as a child of a containing parent.
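For what it's worth, C2460 is the compiler saying that a class cannot hold a member of its own type by value, which is what a child_element member of type xml_element_grammar amounts to. As far as I know, the usual way around it in Spirit is to let a rule refer to itself or to a sibling rule (rules are held by reference inside an expression) rather than nesting a second grammar object. The sketch below is not Spirit code. It is a plain C++ stand-in, with the made-up names nesting_parser and matches, that only shows the shape: recursion goes through a member function, so the same code serves as parent and as child.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a grammar class; this is NOT Boost.Spirit.
struct nesting_parser {
    // nesting_parser child;  // by-value member of the enclosing type:
    //                        // the class is still incomplete here (MSVC: C2460)

    // Toy grammar:  element := '<' element* '>'
    // On success, advances pos past one element and returns true.
    // On failure, leaves pos untouched and returns false.
    bool element(const std::string& in, std::size_t& pos) const {
        if (pos >= in.size() || in[pos] != '<') return false;
        std::size_t p = pos + 1;
        while (element(in, p)) {}  // zero or more children, parsed by the same rule
        if (p >= in.size() || in[p] != '>') return false;
        pos = p + 1;
        return true;
    }
};

// True when the whole input is exactly one (possibly nested) element.
bool matches(const std::string& in) {
    nesting_parser g;
    std::size_t pos = 0;
    return g.element(in, pos) && pos == in.size();
}
```

The innermost "<>" plays the role of the empty element tag: it is the terminus where the recursion stops.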
I made it a little way past this part. I focused on the simpler parts and got those parsing fine.
typedef boost::make_recursive_variant< boost::variant<std::string, std::vector<boost::recursive_variant_> > >::type tag_soup;
I'm not positive, but I think the best way to represent what XML content can be, either a vector of xelement or a std::string, is to model that fork in the road with a recursive_variant_. There's still the parent/child relationship to resolve, though: xelement[child] can have an xelement[parent], and xelement[parent] has children.
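As a rough illustration of the either/or content idea, here is a sketch that uses std::variant from C++17 in place of boost::make_recursive_variant. The names node, node_content, count_elements and make_sample_tree are made up for the example, attributes are left out, and an empty element is modeled as an empty child vector. It assumes a C++17 compiler and standard library.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <variant>
#include <vector>

struct node;  // hypothetical name, forward-declared so the variant can mention it

// Content is either text or a list of child nodes, never both.
using node_content = std::variant<std::string, std::vector<node>>;

struct node {
    std::string name;
    node_content content;
};

// Counts this node plus every nested element below it.
std::size_t count_elements(const node& n) {
    std::size_t total = 1;
    if (const auto* kids = std::get_if<std::vector<node>>(&n.content)) {
        for (const node& k : *kids) total += count_elements(k);
    }
    return total;
}

// Builds the tree for <test><one/><two>2</two><three/></test> by hand.
node make_sample_tree() {
    std::vector<node> kids;
    kids.push_back(node{"one", std::vector<node>{}});
    kids.push_back(node{"two", std::string("2")});
    kids.push_back(node{"three", std::vector<node>{}});
    return node{"test", std::move(kids)};
}
```

One consequence of this shape is that mixed content (text interleaved with child elements) cannot be represented, which may or may not matter for the documents in question.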
I'm also not quite sure how to capture the adapted parts at the strategic points in the rules.
My domain model will look something like this, keeping it as simple as possible:
struct xattribute { std::string name; std::string value; };
typedef std::vector<xattribute> xattribute_vector;
struct xelement;
typedef std::vector<xelement> xelement_vector;
struct xelement { std::string name; std::string content; xattribute_vector attributes; xelement_vector children; };
Thanks...
Best regards,
Michael Powell