[Spirit/Qi] Sequential-or attribute access in semantic action

Adam Butcher

4 Dec 2009

9:23 a.m.

Hi,

I am having a bit of trouble with the semantic action of a sequential-or parser. Specifically, my problem is with accessing the attribute of the sequential-or parser.

My first observation is that the documentation at http://www.boost.org/doc/libs/1_41_0/libs/spirit/doc/html/spirit/qi/quick_re... defines the attribute type of (a || b) as tuple<A, B>. Unless I've misunderstood, this must be a typo as they should both be optional (it is never valid for both to be omitted, but I would expect client code to be able to determine which of (a), (b) or (a >> b) was provided).

My real problem is that applying a semantic action to [what I think is] an expression returning a sequential-or parser results in _1 referring to the attribute of the last parser on the right-hand side of the sequential-or rather than the attribute of the sequential-or parser itself. In the following snippet (assume 'start' has an attribute of type S),

idxspec = '[' >> (start || (':' >> -start)) [ _val = access_index (_r1,lvalue,_1) ] >> ']';

the function 'access_index' is called with an arg3 of type 'optional<S>' rather than the expected 'tuple<optional<S>, optional<optional<S>>>'.

Wrapping the sequential-or expression in a 'repeat' directive as

repeat(1)[start || (':' >> -start)]

solved the problem but obviously gave me an unwanted sequence (albeit with only one element). The attribute type of the above came out to be

vector<tuple<optional<S>, optional<optional<S>>>>

as expected.

To remove the sequence I created a parser directive called 'identity', following the form of 'repeat', which I have attached. My final rule is now

idxspec = '[' >> identity[start || (':' >> -start)] [ _val = access_index (_r1,lvalue,_1) ] >> ']';

which is working fine; the _1 delivers an argument of type

tuple<optional<S>, optional<optional<S>>>

as required. However, I would have expected this behaviour from the first expression. Is this a proto issue, a spirit issue, or have I completely misunderstood something somewhere?

It may or may not be relevant, but I am using both spirit.classic and spirit 2.1 in the same file. This is historical and will be fixed in time, but I guess it could be important. So far I have had no problems with the combination, however.

Additional info: I am using gcc 4.4.2, stlport 5.1.3 and boost 1.41.0.

Regards,
Adam

Attachments:

boost-spirit-nonstandard-qi-directive-identity.hpp (text/plain — 2.3 KB)

Hartmut Kaiser

4 Dec 2009

2:37 p.m.

...

I am having a bit of trouble with the semantic action of a sequential-or parser. Specifically my problem is with accessing the attribute of the sequential-or parser. My first observation is that the documentation at http://www.boost.org/doc/libs/1_41_0/libs/spirit/doc/html/spirit/qi/quick_reference/qi_parsers/operator.html defines the attribute type of (a || b) as tuple<A, B>. Unless I've misunderstood, this must be a typo as they should both be optional (it is never valid for both to be omitted but I would expect that client code be able to determine which out of (a), (b) or (a >> b) were provided.)

Yes, that's a documentation error. The correct attribute propagation rules should be:

a: A, b: B --> (a || b): tuple<optional<A>, optional<B> >
a: A, b: Unused --> (a || b): optional<A>
a: Unused, b: B --> (a || b): optional<B>
a: Unused, b: Unused --> (a || b): Unused

I'll fix that in the docs asap.
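The four propagation rules listed above can be written down as a small type-level sketch. This is only a model in standard C++17, not Spirit's implementation: std::optional and std::tuple stand in for the optional and tuple types Spirit uses, and the names 'unused', 'sequential_or_attribute' and 'S' are hypothetical, introduced purely for illustration.

```cpp
#include <optional>
#include <tuple>
#include <type_traits>

// Hypothetical stand-in for Spirit's "no attribute" marker.
struct unused {};

// a: A, b: B --> (a || b): tuple<optional<A>, optional<B>>
template <typename A, typename B>
struct sequential_or_attribute {
    using type = std::tuple<std::optional<A>, std::optional<B>>;
};

// a: A, b: Unused --> (a || b): optional<A>
template <typename A>
struct sequential_or_attribute<A, unused> {
    using type = std::optional<A>;
};

// a: Unused, b: B --> (a || b): optional<B>
template <typename B>
struct sequential_or_attribute<unused, B> {
    using type = std::optional<B>;
};

// a: Unused, b: Unused --> (a || b): Unused
template <>
struct sequential_or_attribute<unused, unused> {
    using type = unused;
};

template <typename A, typename B>
using sequential_or_attribute_t =
    typename sequential_or_attribute<A, B>::type;

struct S {};  // hypothetical attribute type of 'start'

// (start || (':' >> -start)): the left side has attribute S and the
// right side has attribute optional<S>, so the second tuple element
// is a nested optional.
static_assert(std::is_same_v<
    sequential_or_attribute_t<S, std::optional<S>>,
    std::tuple<std::optional<S>, std::optional<std::optional<S>>>>);
```

Under this model the attribute of the rule from the original post comes out as tuple<optional<S>, optional<optional<S>>>, matching the type Adam reported.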

...

My real problem is that applying a semantic action to [what I think is] an expression returning a sequential-or parser, results in the _1 referring to the attribute of the last parser on the right-hand-side of the sequential-or rather than the attribute of the sequential-or parser itself. In the following snippet (assume 'start' has an attribute of type S),

idxspec = '[' >> (start || (':' >> -start)) [ _val = access_index (_r1,lvalue,_1) ] >> ']';

the function 'access_index' is called with an arg3 of type 'optional<S>' rather than the expected 'tuple<optional<S>, optional<optional<S>>>'.

Well, a sequential-or is (attribute-wise) very much the same as a plain sequence: a >> b. That means if you attach an action to the whole thing _1 will refer to the attribute of the first and _2 to the attribute of the second element in that sequence. I'll add a note to the documentation making this clear.
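The difference between receiving the elements separately and receiving the attribute as one value can be modelled outside Spirit. The sketch below uses plain C++17 only: std::apply plays the role of the unpacking that makes the elements available as separate arguments, and 'S', 'Attr', 'describe' and 'describe_whole' are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <tuple>

using S = int;  // hypothetical stand-in for the attribute type of 'start'
using Attr = std::tuple<std::optional<S>, std::optional<std::optional<S>>>;

// Action written in the "_1, _2" style: one parameter per element
// of the sequence-like attribute.
std::string describe(std::optional<S> const& first,
                     std::optional<std::optional<S>> const& second) {
    // A sequential-or only succeeds if at least one side matched.
    assert(first || second);
    std::string out = first ? "first=" + std::to_string(*first)
                            : std::string("first=none");
    out += second ? ", range" : ", no range";
    return out;
}

// Action written against the whole attribute: the tuple arrives as a
// single value and is unpacked into the two-argument form by hand.
std::string describe_whole(Attr const& attr) {
    return std::apply(describe, attr);
}
```

For example, describe_whole(Attr{7, std::nullopt}) yields "first=7, no range", the same result as calling describe with the two elements passed individually.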

...

Wrapping the sequential-or expression in a 'repeat' directive as

repeat(1)[start || (':' >> -start)]

solved the problem but obviously gave me an unwanted sequence (albeit with only one element). The attribute type of the above came out to be

vector<tuple<optional<S>, optional<optional<S>>>>

as expected.

To remove the sequence I created a parser directive called 'identity' following the form of 'repeat' which I have attached. My final rule is now

idxspec = '[' >> identity[start || (':' >> -start)] [ _val = access_index (_r1,lvalue,_1) ] >> ']';

which is working fine; the _1 delivers an argument of type

tuple<optional<S>, optional<optional<S>>>

as required. However, I would have expected this behaviour from the first expression. Is this a proto issue, a spirit issue or have I completely misunderstood something somewhere?

That's expected as well, as your identity[] directive exposes the _whole_ attribute of the embedded parser as its own attribute.

Regards Hartmut
-------------------
Meet me at BoostCon
http://boostcon.com

Adam Butcher

4:21 p.m.

Hi Hartmut, Thanks for responding so quickly. On Fri, December 4, 2009 2:37 pm, Hartmut Kaiser wrote:

...

Adam Butcher wrote:

...
My real problem is that applying a semantic action to [what I think is] an expression returning a sequential-or parser, results in the _1 referring to the attribute of the last parser on the right-hand-side of the sequential-or rather than the attribute of the sequential-or parser itself. In the following snippet (assume 'start' has an attribute of type S),

idxspec = '[' >> (start || (':' >> -start)) [ _val = access_index (_r1,lvalue,_1) ] >> ']';

the function 'access_index' is called with an arg3 of type 'optional<S>' rather than the expected 'tuple<optional<S>, optional<optional<S>>>'.

Well, a sequential-or is (attribute-wise) very much the same as a plain sequence: a >> b. That means if you attach an action to the whole thing _1 will refer to the attribute of the first and _2 to the attribute of the second element in that sequence. I'll add a note to the documentation making this clear.

Ah I see. So in the above snippet _1 will refer to an attribute of type optional<S> and _2 to an attribute of type optional<optional<S>>. Thanks for clearing my misunderstanding up. I use multi-argument sequences elsewhere so I should have really thought to try that (hindsight is such a wonderful thing!) In my experiments I had switched the '-start' for 'int_' and was convinced that my _1 had become 'int' by the time it reached the function -- this is what confused me into thinking that it was the right-hand-side that was being delivered as _1. I assume now that this must have been to do with my change causing some earlier error and the compiler had substituted 'int' for some unresolved type which had arrived in my function. 'int_' was probably a bad choice for the test!

...

...
My final rule is now

idxspec = '[' >> identity[start || (':' >> -start)] [ _val = access_index (_r1,lvalue,_1) ] >> ']';

which is working fine; the _1 delivers an argument of type

tuple<optional<S>, optional<optional<S>>>

as required. However, I would have expected this behaviour from the first expression. Is this a proto issue, a spirit issue or have I completely misunderstood something somewhere?

That's expected as well, as your identity[] directive exposes the _whole_ attribute of the embedded parser as its own attribute.

I understood that my identity[] directive had effectively wrapped the contained parser into a single 'atom' which yielded a single 'flattened' attribute. I had assumed that the basic behaviour without the identity[] directive would have yielded the same. I had not twigged that the sequential-or was behaving as a sequence and was confused by my earlier erroneous experiment.

So, avoiding the identity[] directive, the following

idxspec = '[' >> (start || (':' >> -start)) [ _val = access_index (_r1,lvalue,_1,_2) ] >> ']';

does as I originally intended. Thanks very much for your help.

This led me on to the following, rather pedantic, musing on the provision of the identity[] directive in the core library to 'flatten' composed attributes.

In re-implementing my action function with two arguments I had difficulty naming the first argument, as it has logically distinct semantics based on whether the second (range specifier) is provided. Previously (with identity[]) it was one argument named 'access_spec' and gave rise to the following implementation:

optional<optional<S>> const& range_spec = at_c<1>(access_spec);
if (range_spec)
{
    optional<S> const& range_min = at_c<0>(access_spec);
    optional<S> const& range_max = *range_spec;
    // ...
}
else
{
    S const& index = *at_c<0>(access_spec);
    // ...
}

In the above, the first element of 'access_spec' is effectively unnamed until it is resolved to be either 'range_min' or 'index'. Using two arguments I was forced to name the first even though its semantics are conditional on the second. Going for arguments named 'range_min' and 'range_spec' gave rise to the following:

if (range_spec)
{
    optional<S> const& range_max = *range_spec;
    // ...
}
else
{
    S const& index = *range_min;
    // ...
}

which leans toward the range alternative -- effectively changing the meaning of 'range_min' in the else case. I wondered if providing identity[] for situations like these may result in more logical user code.

Perhaps I just think too much -- or perhaps I should just have two functions to start with (though that would probably affect the elegance of the grammar spec)!

Thanks again.

Regards,
Adam

Hartmut Kaiser

5:38 p.m.

Adam,

...

...
Well, a sequential-or is (attribute-wise) very much the same as a plain sequence: a >> b. That means if you attach an action to the whole thing _1 will refer to the attribute of the first and _2 to the attribute of the second element in that sequence. I'll add a note to the documentation making this clear.

Ah I see. So in the above snippet _1 will refer to an attribute of type optional<S> and _2 to an attribute of type optional<optional<S>>. Thanks for clearing my misunderstanding up.

Well, I expect it to be an optional<S> only as well, but I'm not sure. Normally this kind of redundancy gets collapsed away (and there is no reason for it to be an optional<optional<S>>).

Ok, I looked: you're right, but I consider this to be a bug. This will be fixed in the next release, then.

...

I use multi-argument sequences elsewhere so I should have really thought to try that (hindsight is such a wonderful thing!) In my experiments I had switched the '-start' for 'int_' and was convinced that my _1 had become 'int' by the time it reached the function -- this is what confused me into thinking that it was the right-hand-side that was being delivered as _1. I assume now that this must have been to do with my change causing some earlier error and the compiler had substituted 'int' for some unresolved type which had arrived in my function. 'int_' was probably a bad choice for the test!

Hmmm. Doesn't sound like it. If you ever come across this again, please drop us a line.

...

This led me on to the following, rather pedantic, musing on the provision of the identity[] directive in the core library to 'flatten' composed attributes. In re-implementing my action function with two arguments I had difficulty naming the first argument as it has logically distinct semantics based on whether the second (range specifier) is provided. Previously (with identity[]) it was one argument named 'access_spec' and gave rise to the following implementation:

[snip]

...

In the above the first element of 'access_spec' is effectively unnamed until it is resolved to be either 'range_min' or 'index'. Using two arguments I was forced to name the first even though its semantics are conditional on the second. Going for arguments named 'range_min' and 'range_spec' gave rise to the following:

[snip]

...

which leans toward the range alternative -- effectively changing the meaning of 'range_min' in the else case. I wondered if providing identity[] for situations like these may result in more logical user code. Perhaps I just think too much -- or perhaps I should just have two functions to start with (though that would probably affect the elegance of the grammar spec)!

That sounds good to me. Would you be willing to contribute the identity[] directive? In this case we would need some docs, tests, and an example or two as well.

Regards Hartmut
-------------------
Meet me at BoostCon
http://boostcon.com

Adam Butcher

6:19 p.m.

Hi Hartmut, On Fri, December 4, 2009 5:38 pm, Hartmut Kaiser wrote:

...

Adam Butcher wrote:

...
Hartmut Kaiser wrote:

...
Well, a sequential-or is (attribute-wise) very much the same as a plain sequence: a >> b. That means if you attach an action to the whole thing _1 will refer to the attribute of the first and _2 to the attribute of the second element in that sequence. I'll add a note to the documentation making this clear.

Ah I see. So in the above snippet _1 will refer to an attribute of type optional<S> and _2 to an attribute of type optional<optional<S>>. Thanks for clearing my misunderstanding up.

Well, I expect it to be an optional<S> only as well, but I'm not sure. Normally this kind of redundancy gets collapsed away (and there is no reason for it to be an optional<optional<S>>.)

Ok, I looked: you're right, but I consider this to be a bug. This will be fixed in the next release, then.

Oh right. I had considered it desired behaviour. The grammar spec is

(start || (':' >> -start))

and not

(start || (':' >> start))

I would expect optional<S> for both only in the latter case.

In the former I am deliberately allowing the specifier to the right of the colon to be omitted -- within the already optional right-hand-side of the sequential-or. This allows for input such as

(1) $x       // $x (assume it references a sequence)
(2) $x[7]    // element 7 of $x
(3) $x[7:]   // a slice of $x beginning at element 7
(4) $x[:7]   // a slice of $x up to element 7
(5) $x[2:7]  // a slice of $x beginning at 2 ending at 7
(6) $x[:]    // equivalent to $x (unspecified bounds)

In the action function, wrt the right-hand-side of the sequential-or, I need to determine:

a) whether a range specifier has been provided at all (given by (bool) at_c<1>(access_spec))
b) whether the range specifier includes an explicit bound (given by (bool) *at_c<1>(access_spec))

The left-hand-side of the sequential-or is either an (optional) 'range_begin' index if a range specifier is given or a lone 'index' specifier if no range specifier is given.

If the two optionals on the right-hand-side were collapsed would I not lose the ability to distinguish between (2) and (3)?

...

...
I use multi-argument sequences elsewhere so I should have really thought to try that (hindsight is such a wonderful thing!) In my experiments I had switched the '-start' for 'int_' and was convinced that my _1 had become 'int' by the time it reached the function -- this is what confused me into thinking that it was the right-hand-side that was being delivered as _1. I assume now that this must have been to do with my change causing some earlier error and the compiler had substituted 'int' for some unresolved type which had arrived in my function. 'int_' was probably a bad choice for the test!

Hmmm. Doesn't sound like it. If you ever come across this again, please drop us a line.

Okay. I did have a go at recreating it but it behaved, annoyingly, as intended. I might have another crack at it at some point.

...

...
This led me on to the following, rather pedantic, musing on the provision of the identity[] directive in the core library to 'flatten' composed attributes. [snip] I wondered if providing identity[] for situations like these may result in more logical user code.

That sounds good to me. Would you be willing to contribute the identity[] directive? In this case we would need some docs, tests, and an example or two as well.

Yes no problem. The implementation was attached to my original mail but when I get the time I will have a look at adding some docs/tests/example also. I've been meaning to learn QuickBook for some time -- this may be just the excuse. Regards Adam

Hartmut Kaiser

6:41 p.m.

...

...
Well, I expect it to be an optional<S> only as well, but I'm not sure. Normally this kind of redundancy gets collapsed away (and there is no reason for it to be an optional<optional<S>>.)

Ok, I looked: you're right, but I consider this to be a bug. This will be fixed in the next release, then.

Oh right. I had considered it desired behaviour. The grammar spec is

(start || (':' >> -start))

and not

(start || (':' >> start))

I would expect optional<S> for both only in the latter case.

In the former I am deliberately allowing the specifier to the right of the colon to be omitted -- within the already optional right-hand-side of the sequential-or. This allows for input such as

(1) $x       // $x (assume it references a sequence)
(2) $x[7]    // element 7 of $x
(3) $x[7:]   // a slice of $x beginning at element 7
(4) $x[:7]   // a slice of $x up to element 7
(5) $x[2:7]  // a slice of $x beginning at 2 ending at 7
(6) $x[:]    // equivalent to $x (unspecified bounds)

In the action function, wrt the right-hand-side of the sequential-or, I need to determine:

a) whether a range specifier has been provided at all (given by (bool) at_c<1>(access_spec))
b) whether the range specifier includes an explicit bound (given by (bool) *at_c<1>(access_spec))

The left-hand-side of the sequential-or is either an (optional) 'range_begin' index if a range specifier is given or a lone 'index' specifier if no range specifier is given.

If the two optionals on the right-hand-side were collapsed would I not lose the ability to distinguish between (2) and (3)?

Yes, you're right, doh! Sorry for the noise.

Regards Hartmut
-------------------
Meet me at BoostCon
http://boostcon.com
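The point about losing information can be checked with a small model. This is a sketch in standard C++17, not the actual action function from the thread: std::optional and std::tuple stand in for the attribute types, and 'Form', 'classify' and 'collapse' are hypothetical names. It classifies the bracketed inputs (2) to (6) from the nested attribute, then shows that collapsing the nested optional maps (2) and (3) onto the same value.

```cpp
#include <cassert>
#include <optional>
#include <tuple>

using S = int;  // hypothetical stand-in for the attribute type of 'start'
using RangeSpec = std::optional<std::optional<S>>;
using AccessSpec = std::tuple<std::optional<S>, RangeSpec>;

enum class Form { Index, SliceFrom, SliceTo, SliceBetween, SliceAll };

// Classify the bracketed forms (2)-(6) from the nested attribute.
Form classify(AccessSpec const& spec) {
    auto const& [first, range_spec] = spec;
    if (!range_spec) {
        // No ':' was parsed, so the left-hand side must have matched.
        assert(first.has_value());
        return Form::Index;                                   // $x[7]
    }
    bool const has_max = range_spec->has_value();
    if (first) {
        return has_max ? Form::SliceBetween : Form::SliceFrom;  // $x[2:7], $x[7:]
    }
    return has_max ? Form::SliceTo : Form::SliceAll;            // $x[:7], $x[:]
}

// What collapsing optional<optional<S>> to optional<S> would do:
// "no range specifier" and "range specifier without upper bound"
// both become an empty optional.
std::optional<S> collapse(RangeSpec const& range_spec) {
    if (range_spec) {
        return *range_spec;
    }
    return std::nullopt;
}
```

With the nested type, $x[7] is AccessSpec{7, std::nullopt} and $x[7:] is AccessSpec{7, std::make_optional(std::optional<S>{})}, which classify as Index and SliceFrom respectively. After collapse, both right-hand sides compare equal, so the two inputs could no longer be told apart.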

Hartmut Kaiser

13 Dec

10:24 p.m.

Adam,

...

...
...
This led me on to the following, rather pedantic, musing on the provision of the identity[] directive in the core library to 'flatten' composed attributes. [snip] I wondered if providing identity[] for situations like these may result in more logical user code.

That sounds good to me. Would you be willing to contribute the identity[] directive? In this case we would need some docs, tests, and an example or two as well.

Yes no problem. The implementation was attached to my original mail but when I get the time I will have a look at adding some docs/tests/example also. I've been meaning to learn QuickBook for some time -- this may be just the excuse.

I was thinking more about your identity[] component and I realized that we actually don't need it. Spirit already has the special placeholder spirit::_0, which refers to the whole attribute of the expression the semantic action is attached to. So your code could be rewritten as:

idxspec = '[' >> (start || (':' >> -start)) [ _val = access_index (_r1, lvalue, _0) ] >> ']';

where _0 refers to the whole tuple<optional<S>, optional<optional<S>>>. Sorry for not remembering this earlier...

HTH
Regards Hartmut
---------------
Meet me at BoostCon
www.boostcon.com
