[Spirit] Parser omits certain characters

Dear all,
For all of the following, please see the attached test case.
I would like to specify certain characteristics of variables in the following way: d(MY_DPAR_01,-10.3,12.8,100). In this case there is a type identifier (d stands for "double") and then, in parentheses, a variable name (or alternatively an index), a lower and an upper boundary, and an integer parameter. What the list contains depends on the parameter type.
It should then be possible to specify any number of variables, including their properties, in a comma-separated list.
So I need to dissect a string like the following:
"d(MY_DPAR_01,-10.3,12.8,100), d(0,-10.3,12.8,100), i(SOME_IPAR_17, 0,5), b(SOME_BPAR)"
The first step, as I see it, is to separate the "outer" list and to store each variable description in a boost::tuple<char, std::string>, where the char holds the type identifier and the std::string the part inside the parentheses. Then, in the second step, I want to parse that string according to what is expected for this particular type.
I have now run into the problem that the following rule in Boost.Spirit "swallows" all non-alphanumeric characters:
qi::rule<std::string::const_iterator, std::string(), ascii::space_type> varSpec = +(alnum | '_' | ',' | '.' | '+' | '-');
varSpec stands for the text inside the parentheses. I would have expected this to match any string of at least one of the listed character types (i.e. alphanumeric, underscore, etc.).
The rule
qi::rule<std::string::const_iterator, boost::tuple<char, std::string>(), ascii::space_type> varString = char_("dfib") >> '(' >> varSpec >> ')';
is then meant to separate the type descriptor from the description. Parsing with these rules yields the output
d: MYDPAR01103128100
d: 0103128100
i: SOMEIPAR1705
b: SOMEBPAR
on a Mac (Mavericks, Boost 1.54), i.e., only the alphanumeric characters remain. So clearly my assumption about the "varSpec" rule is wrong.
On an Ubuntu 13.10 system (Boost 1.54, g++ 4.8.1) I get the output
d:
d:
i:
b:
which is equally wrong. Funnily, if I define varSpec simply as +char_, varSpec gives the correct result on the Mac (I haven't tried Ubuntu), but varString does not match anything.
So I am at a loss here and would appreciate some help.
Best Regards and thanks for any help you can provide,
Beet

2013/11/1 beet <r.berlich@gemfony.eu>
[...]
qi::rule<std::string::const_iterator, std::string(), ascii::space_type> varSpec = +(alnum | '_' | ',' | '.' | '+' | '-');
Note that '_' here is actually qi::lit('_'), which exposes *no* attribute, unlike qi::char_ (and qi::alnum, etc.), which gives you a char. I'd suggest +qi::char_("a-zA-Z0-9_,.+-") for varSpec.
HTH

Dear Tongari, dear all,
On 01.11.13 03:42, TONGARI J wrote:
2013/11/1 beet <r.berlich@gemfony.eu>
[...]
qi::rule<std::string::const_iterator, std::string(), ascii::space_type> varSpec = +(alnum | '_' | ',' | '.' | '+' | '-');
Note '_' is actually qi::lit('_'), which exposes *no* attribute, unlike qi::char_ (and qi::alnum, etc) which gives you a char. I'd suggest +qi::char_("a-zA-Z0-9_,.+-") for varSpec.
Thanks, this worked nicely. I am now trying to refine my parser. I would like to distinguish the following cases in a string:
0 --> should be parsed into an integer
SOME_VAR --> should be parsed into a string
SOME_VAR[0] --> should be parsed into a string and an integer
I have created the following rule:
qi::rule<std::string::const_iterator, VARTYPE(), ascii::space_type> varReference = ( (attr(0) >> attr("empty") >> uint_) | (attr(1) >> identifier >> '[' >> uint_ >> ']') | (attr(2) >> identifier >> attr(0)) );
Here, VARTYPE is a typedef for boost::tuple<std::size_t, std::string, std::size_t>, and "identifier" stands for lexeme[+char_("0-9a-zA-Z_")] as per your suggestion.
The idea is to provide the user with a mode variable to allow easy distinction between all three cases, and to fill unused parts of the tuple with some placeholder (such as "empty" or 0), with the help of attr(). So the first entry of the tuple represents the mode, the second the variable name and the third an optional index.
Now, parsing a string like "MY_DPAR_02[3]" yields "emptyMY_DPAR_02" for the std::string component of the tuple, and "SOME_IPAR_17" results in "emptySOME_IPAR_17SOME_IPAR_17". Parsing a single 0 yields the correct result.
So the parser seems to go through all three components of varReference until it finds a matching rule. However, instead of overwriting the std::string with the string it has found, it appears to concatenate the strings of all rules it has gone through.
This is a bit mysterious to me and I would appreciate your help. I'm using Boost 1.54 (64 bit); the above happens on Ubuntu Linux 13.10, g++ 4.8.1.
In any case thanks,
Beet

2013/11/5 beet <r.berlich@gemfony.eu>
Dear Tongari, dear all,
[...]
So the parser seems to go through all three components of varReference, until it finds a matching rule. However, instead of overwriting the std::string with the string it has found, it appears to concatenate the strings of all rules it has gone through.
Try this: http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/qi/referenc...
This is a bit mysterious to me and I would appreciate your help.
HTH

On 05.11.13 02:27, TONGARI J wrote:
2013/11/5 beet <r.berlich@gemfony.eu>
[...]
Try this: http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/qi/referenc...
This is a bit mysterious to me and I would appreciate your help.
Thanks a lot! Just for the reference of other readers, the following rule seems to work:
varReference = ( hold[attr(0) >> attr("empty") >> uint_] | hold[attr(1) >> identifier >> '[' >> uint_ >> ']'] | hold[attr(2) >> identifier >> attr(0)] );
where the attribute of varReference is a boost::tuple<std::size_t, std::string, std::size_t> and identifier refers to a string consisting of alphanumeric characters and '_'.
Best Regards,
Beet
Participants (3): beet, ruediger, TONGARI J