On Fri, Apr 16, 2010 at 11:41 AM, EricB <eric.britz@libertysurf.fr> wrote:
Hi all
I'm trying to use Spirit to write a parser for a simple language. The last time I had to do this I was in school (more than 15 years ago). I managed to get something working, but it does not handle "balanced parentheses". I know this is a common problem, but after looking for every possible way of spelling "balanced parentheses" I did not find anything (that I could understand).
The files I want to parse are made of ASCII characters. They can contain commands that should be interpreted (replacement). Commands always start with "^!" (command switch).
An example of a file content:
text text text ^!for-each( A )( B ) text text text text
I want to send the "text" strings to std::cout and detect the commands "^!for-each( A )( B )" in order to process them. At the moment I only have one command, "for-each", where:
  A : is a query that can contain /'" and balanced ()
  B : is the text to be sent to output for each result of the query. This can be a 'script', meaning text + commands, for example:
        aa bb ^!for-each(C)(D) tt yy uu
      therefore B can also contain balanced ().
The rules I wrote do not handle balanced () very well.
For example the following command should be parsed successfully:
^!for-each( a/b/c()/e[]/d ) ( item [name] do ^!for-each(sub/text()) ( print() ) )
I need some advice. Thank you
So here are my rules:

{
    boost::spirit::chlit<> LPAREN('(');
    boost::spirit::chlit<> RPAREN(')');
    boost::spirit::strlit<> CMDSWITCH("^!");

    script // main rule
        = *( (boost::spirit::anychar_p - CMDSWITCH)
           | command
           );

    command
        = boost::spirit::discard_first_node_d[
              CMDSWITCH
              >> ( for_each
                 | boost::spirit::eps_p // for error reporting
                 )
          ];

    for_each
        = boost::spirit::discard_first_node_d[
              boost::spirit::as_lower_d["for-each"]
              >> *boost::spirit::space_p
              >> query
              >> *boost::spirit::space_p
              >> subscript
          ];

    query
        = boost::spirit::inner_node_d[
              LPAREN
              >> *(boost::spirit::anychar_p - ( RPAREN ))
              >> RPAREN
          ];

    subscript
        = boost::spirit::inner_node_d[
              LPAREN
              >> *( (boost::spirit::anychar_p - ( CMDSWITCH | RPAREN ))
                  | command
                  )
              >> RPAREN
          ];
}
Perhaps something like this (untested, not currently at home, but should be valid):

{
    using namespace boost::spirit::qi;    // I am lazy
    using namespace boost::spirit::ascii; // assuming ascii encoding

    script // main rule
        = *( command
           | char_
           )
        ;

    command
        = "^!"
        >> ( for_each
           | eps // why an eps, why not just fail out?
           )
        ;

    for_each
        = no_case["for-each"]
        >> skip(space)
           [  query
           >> subscript
           ]
        ;

    query
        = '(' >> raw[*stringparen_inner] >> ')'
        ;

    subscript
        = '('
        >> *( command // command eats the possible "^!" first, no need to test
            | stringparen_inner
            )
        >> ')'
        ;

    // one element of parenthesized text: either a balanced (...) group
    // or a single character that is not a parenthesis
    stringparen_inner
        = ('(' >> *stringparen_inner >> ')')
        | ~char_("()")
        ;
}

Do note, the above is written in the latest version of Spirit, whereas yours was written in the ancient and slower (and more verbose) version. That should handle nested parentheses and all just fine.
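If it helps to see the recursion without any Spirit machinery, here is a minimal hand-written sketch in plain C++17 of the same idea: a group is '(' followed by any mix of nested groups and non-parenthesis characters, then ')'. This is not Spirit code, and the names parse_group, skip_spaces, scan_script and ForEach are made up for illustration. It only recognizes the exact lowercase spelling "^!for-each", collects top-level commands, and leaves nested commands inside the body text for a second pass.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical record for one detected command.
struct ForEach {
    std::string query; // text between the first pair of parentheses
    std::string body;  // text between the second pair of parentheses
};

// Recursive rule: group = '(' (group | any char except '(' or ')')* ')'.
// `pos` must point at '('. Returns the index just past the matching ')',
// or std::nullopt when the parentheses are not balanced.
std::optional<std::size_t> parse_group(std::string_view s, std::size_t pos) {
    if (pos >= s.size() || s[pos] != '(') return std::nullopt;
    ++pos; // consume '('
    while (pos < s.size() && s[pos] != ')') {
        if (s[pos] == '(') {
            auto next = parse_group(s, pos); // nested group
            if (!next) return std::nullopt;
            pos = *next;
        } else {
            ++pos; // ordinary character
        }
    }
    if (pos >= s.size()) return std::nullopt; // ran out of input before ')'
    return pos + 1; // consume ')'
}

// Advance past whitespace starting at `pos`.
std::size_t skip_spaces(std::string_view s, std::size_t pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
    return pos;
}

// Appends plain text to `text` and top-level commands to `cmds`.
// Returns false when a command has missing or unbalanced parentheses.
bool scan_script(std::string_view s, std::string& text, std::vector<ForEach>& cmds) {
    constexpr std::string_view keyword = "^!for-each";
    std::size_t pos = 0;
    while (pos < s.size()) {
        if (s.substr(pos, keyword.size()) != keyword) {
            text += s[pos++]; // not a command: plain text
            continue;
        }
        const std::size_t q = skip_spaces(s, pos + keyword.size());
        const auto q_end = parse_group(s, q);
        if (!q_end) return false;
        const std::size_t b = skip_spaces(s, *q_end);
        const auto b_end = parse_group(s, b);
        if (!b_end) return false;
        // Strip the outer '(' and ')' from each group.
        cmds.push_back({std::string(s.substr(q + 1, *q_end - q - 2)),
                        std::string(s.substr(b + 1, *b_end - b - 2))});
        pos = *b_end;
    }
    return true;
}
```

To handle nested commands, one option is to call scan_script again on each ForEach::body when the command is executed, which matches the "B can be a script" description in the question.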