[spirit] semantic action for mismatches?


caustik

7 Jan 2011

11:54 p.m.

I've come to notice that there seems to be a missing bit of symmetry in Spirit with regard to semantic actions.

If a rule matches, and thus executes its semantic action(s), but a rule which includes that rule mismatches, there seems to be no way to "unwind" the code executed down the chain. For example, if one of your semantic actions allocates memory or increments a reference count, how do you free / release that reference in the mismatch scenario? I've thought about using something like a shared_ptr, but it seems like that gets pretty sloppy and unnatural. Is there something you can think of that would work?

caustik


Joel de Guzman

8 Jan

1:30 a.m.


Sure:

r = p | eps[cleanup];

If p fails with side-effects from its direct or indirect semantic actions, the cleanup semantic action can roll them back.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

caustik

1:57 a.m.


hmmm... Brilliant. You sir, are a gentleman and a scholar. I wonder if this is FAQ material. It's elegant yet a bit non-intuitive.

Joel de Guzman

2:12 a.m.

On 1/8/2011 9:57 AM, caustik wrote:


hmmm...

Brilliant. You sir, are a gentleman and a scholar.

<blush> That was an easy way to earn that comment :-)

...

I wonder if this is FAQ material. It's elegant yet a bit non-intuitive.

There is now:

http://boost-spirit.com/home/articles/doc-addendum/faq/

Thanks!

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

caustik

3:54 a.m.

On Fri, Jan 7, 2011 at 6:12 PM, Joel de Guzman <joel@boost-consulting.com>wrote:


<blush> That was an easy way to earn that comment :-)

hmm I'll have to make you work a little harder then.. So, how about if you want the rule "r" to still mismatch? Since eps always matches, the rule "r" as a whole succeeds. Also, how do you access the result of "p" from inside "cleanup"?

Joel de Guzman

4:04 a.m.

On 1/8/2011 11:54 AM, caustik wrote:

...

So, how about if you want the rule "r" to still mismatch? Since eps always matches, the rule "r" as a whole succeeds.

!eps[cleanup]

...

Also, how do you access the result of "p" from inside "cleanup"?

Use rule local variables to pass data.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

Eric Niebler

4:36 p.m.


Just for reference, xpressive doesn't have this problem because actions are executed lazily. When a sub-pattern matches, its action is placed on a queue. If the pattern matching engine then needs to backtrack, the action is un-queued. Only when the *whole* pattern matches successfully is the entire action sequence executed ... in order, of course.

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com
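The queueing scheme described here can be sketched in plain C++. This is only an illustration of the idea under stated assumptions, not xpressive's actual implementation; `ActionQueue`, `mark`, `defer`, `rollback`, `commit`, and `run_demo` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical deferred-action queue: actions are recorded while matching
// and only executed once the whole match has succeeded.
class ActionQueue {
public:
    // Remember the current queue length before trying a sub-pattern.
    std::size_t mark() const { return pending_.size(); }

    // Record an action instead of running it immediately.
    void defer(std::function<void()> action) { pending_.push_back(std::move(action)); }

    // Backtracking: drop every action queued since `mark`.
    void rollback(std::size_t mark) {
        assert(mark <= pending_.size());
        pending_.resize(mark);
    }

    // Whole match succeeded: run the surviving actions in order.
    void commit() {
        for (auto& action : pending_) action();
        pending_.clear();
    }

private:
    std::vector<std::function<void()>> pending_;
};

// Simulates a first alternative that queues two actions and then fails,
// followed by a second alternative that queues one action and succeeds.
std::vector<int> run_demo() {
    std::vector<int> log;
    ActionQueue queue;

    std::size_t m = queue.mark();
    queue.defer([&log] { log.push_back(1); });
    queue.defer([&log] { log.push_back(2); });
    queue.rollback(m);  // first alternative failed, its actions never run

    queue.defer([&log] { log.push_back(3); });
    queue.commit();     // only the action from the successful path runs
    return log;
}
```

Because nothing runs until `commit`, there is no side effect to undo on a mismatch, which is the symmetry the original question was after.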

caustik

4:41 p.m.

I might give that a shot as I can't seem to get local variables to solve my problem. I would expect these two commands to have the same result (albeit unnecessarily assigning the _a variable, which just gets tossed away anyway), but apparently they do not?

r = p[_a = _1]

r = p

Hartmut Kaiser

5:19 p.m.

...

I might give that a shot as I can't seem to get local variables to solve my problem. I would expect these two commands to have the same result (albeit unnecessarily assigning the _a variable, which just gets tossed away anyway), but apparently they are not?

r = p[_a = _1]

r = p

Automatic attribute propagation from the rhs expression to the lhs rule is disabled as soon as semantic actions are involved. If you want to enforce attribute propagation anyways, write:

rule<...> r;
r %= p[...];

If the rhs has no semantic actions attached, operator=() behaves exactly like operator%=().

Regards Hartmut
---------------
http://boost-spirit.com


caustik

5:39 p.m.

That does the trick. Apologies for the previous top-post. I've been out of the mailing list game for a while now =)

This style ended up working and I can finally move on to other things:

r %= p1[_a = _1] >> ('*' | (eps[bind(&class::undo, &instance, _a)] >> !eps)) >> p2

Hartmut Kaiser

6:03 p.m.


FWIW, this should do the same, but simpler:

r %= p1[_a = _1] >> ('*' | !eps[bind(&class::undo, &instance, _a)]) >> p2;

Regards Hartmut
---------------
http://boost-spirit.com

caustik

6:16 p.m.


Ah, I was thinking that ! behaved like this:

(!eps)[action]

as opposed to this:

!(eps[action])
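The grouping follows from ordinary C++ precedence: postfix subscript binds tighter than unary !. A toy sketch, independent of Spirit, makes this visible. `Expr`, `default_grouping`, and `explicit_grouping` are hypothetical names; the type only records how the operators were applied.

```cpp
#include <string>

// Hypothetical toy type that records the order in which operators apply.
struct Expr {
    std::string text;

    // Attaching an "action" appends it in brackets.
    Expr operator[](const std::string& action) const {
        return Expr{text + "[" + action + "]"};
    }

    // Negation wraps whatever has been built so far.
    Expr operator!() const {
        return Expr{"!(" + text + ")"};
    }
};

// No parentheses: the subscript is applied first, then the negation.
std::string default_grouping() {
    Expr eps{"eps"};
    return (!eps["action"]).text;
}

// Parentheses force the negation first, then the subscript.
std::string explicit_grouping() {
    Expr eps{"eps"};
    return ((!eps)["action"]).text;
}
```

Here `default_grouping()` yields "!(eps[action])" and `explicit_grouping()` yields "!(eps)[action]", matching the two readings discussed above.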

caustik

9:36 p.m.


Now suppose that, to avoid all this, you instead want each rule to return a shared_ptr, e.g.:

rule<Iterator, shared_ptr<MyClass>()> start;

Except you don't want to call "delete" on the MyClass* but your own custom cleanup function. So, is there a way to tell a rule what parameters to pass to its synthesized attribute's constructor, so that the attribute is constructed like this:

shared_ptr<MyClass> synthAttr(ptr, myDeleter);

caustik

10:11 p.m.

Figured it out..

struct sharedMyClass : public shared_ptr<MyClass>
{
    sharedMyClass() : shared_ptr<MyClass>() { }

    // Take ownership and release through MyClass::Release instead of delete.
    sharedMyClass(MyClass *pMyClass) : shared_ptr<MyClass>(pMyClass, MyClass::Release) { }
};

Then just have your rules use sharedMyClass as a synthesized attribute.
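A self-contained sketch of such a wrapper, assuming std::shared_ptr rather than whichever shared_ptr the thread had in scope. The `released` counter and the function `release_count_after_scope` are hypothetical additions that exist only to show the custom deleter being called.

```cpp
#include <cassert>
#include <memory>

struct MyClass {
    static int released;  // counts Release() calls, for this demo only

    // Custom cleanup function used in place of a plain delete.
    static void Release(MyClass* p) {
        assert(p != nullptr);
        ++released;
        delete p;
    }
};
int MyClass::released = 0;

// Wrapper whose constructor always installs MyClass::Release as the deleter,
// so code that only knows how to construct from a raw pointer gets it for free.
struct sharedMyClass : public std::shared_ptr<MyClass> {
    sharedMyClass() : std::shared_ptr<MyClass>() {}
    sharedMyClass(MyClass* pMyClass)
        : std::shared_ptr<MyClass>(pMyClass, &MyClass::Release) {}
};

// Creates one object, copies the handle, lets both copies go out of scope,
// and reports how many times Release() ran.
int release_count_after_scope() {
    int before = MyClass::released;
    {
        sharedMyClass owner(new MyClass);
        sharedMyClass copy = owner;  // shares ownership, same deleter
    }
    return MyClass::released - before;
}
```

The deleter runs once, when the last copy is destroyed, which is what makes this usable as a rule attribute that may be copied or discarded during backtracking.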

Joel de Guzman

10 Jan

12:09 a.m.

On 1/9/2011 12:36 AM, Eric Niebler wrote:


Just for reference, xpressive doesn't have this problem because actions are executed lazily. When a sub-pattern matches, its action is placed on a queue. If the pattern matching engine then needs to backtrack, the action is un-queued. Only when the *whole* pattern matches successfully is the entire action sequence executed ... in order, of course.

We've considered that approach before, but what do you really mean by sub-pattern? AFAICT, lazily evaluating actions can only be done at the topmost (start) node since a sub-parser cannot really know that it will be rolled back. Alas, that approach does not work well with fully typed attributed grammars and type-erased rules.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

Eric Niebler

1:58 a.m.

On 1/9/2011 7:09 PM, Joel de Guzman wrote:


We've considered that approach before, but what do you really mean by sub-pattern?

Anything that isn't the outermost pattern-match.

...

AFAICT, lazily evaluating actions can only be done at the topmost (start) node since a sub-parser cannot really know that it will be rolled back.

Correct.

...

Alas, that approach does not work well with fully typed attributed grammars and type-erased rules.

I don't know about the "fully typed attributed" part, but type-erasure has nothing to do with it. Xpressive regexes are also type-erased. Only regex algorithms begin the pattern match. That entry point is what sets up the action context, and only the end state of the outermost regex causes the chain of actions to execute. Trust me, it works.

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com

caustik

2:03 a.m.


So is it possible for "sub-patterns" to be typed (like synthesized attributes in Spirit), and if so, how? I'm also curious what the different performance characteristics are (Spirit / Xpressive).

Joel de Guzman

2:49 a.m.

On 1/10/2011 10:03 AM, caustik wrote:


So is it possible for "sub-patterns" to be typed (like synthesized attributes in Spirit), and if so, how?

AFAIK, no. xpressive does not have attributes.

...

I'm also curious what the different performance characteristics are (Spirit / Xpressive).

I'm not quite keen on apples-oranges comparisons. Both xpressive and Spirit have their place. That said, I am not aware of any formal benchmarks, but there's an informal one posted by Overmind on the Boost users list:

http://lists.boost.org/Archives/boost/2009/07/153899.php

With that simple test, Spirit beats highly optimized xpressive (1.5 secs vs. 9 secs). You might want to read the whole thread. Of course, your mileage may vary.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net

Stewart, Robert

1:02 p.m.


What's disappointing is that after I did, finally, post real code, that thread fizzled. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

caustik

4:15 p.m.


It would be a nice addition to the documentation to have some thorough benchmark tests. Any particular ideas on what good tests would be? Maybe even just a note in the FAQs with a suggestion on how tests could be run, and asking for users to contribute, would yield some help from others. I would imagine that there have been a few users who ran their own internal tests before choosing a solution; if they were made aware of the interest in gathering those results, they might volunteer that data.

Stewart, Robert

4:30 p.m.

caustik wrote:

...

It would be a nice addition to the documentation to have some thorough benchmark tests. Any particular ideas on what good tests would be?

I agree that benchmarks would be good, if only to allow one to choose among the alternatives for a given scenario. I'm sure there are plenty of cases that favor each alternative, so the set of test cases would need to be fairly broad. The example I had for parsing various number formats was certainly non-trivial and required a good deal of tuning to get good performance out of Xpressive. I was hamstrung by the need to use Spirit.Classic from Boost 1.37 when I did that. It may be that Spirit.Qi from a recent Boost release would fare better.

...

Maybe even just a note in the FAQs with a suggestion on how tests could be run, and asking for users to contribute, would yield some help from others.

Possibly. The benchmarks could be maintained independently from the affected projects or each project could provide its own optimized implementation of a given benchmark. Then, interested parties could build the benchmarks that most closely match current needs to compare the speed of the potential solutions within their own build environments. Maintaining the implementation of each benchmark with a given project means that a project's cheerleaders can tune the implementation as they like without coordination with any other projects and each project community owns their implementation. Doing them in an independent way would probably lead to their languishing without requisite attention over time. I know I haven't given any specifics in response to your questions, but perhaps this will lead to further discussion. (In fact, I suggest that you restart this discussion in a new thread.)

_____
Rob Stewart robert.stewart@sig.com
Software Engineer, Core Software using std::disclaimer;
Susquehanna International Group, LLP http://www.sig.com

Joel de Guzman

4:35 p.m.

On 1/11/2011 12:15 AM, caustik wrote:

...

On Mon, Jan 10, 2011 at 5:02 AM, Stewart, Robert<Robert.Stewart@sig.com>wrote:

...
Joel de Guzman wrote:

...
On 1/10/2011 10:03 AM, caustik wrote:

...
I'm also curious what the different performance characteristics are (Spirit / Xpressive).

I'm not quite keen on apples-oranges comparisons. Both xpressive and Spirit have their place. That said, I am not aware of any formal benchmarks, but there's an informal one posted by Overmind on the Boost users list:

http://lists.boost.org/Archives/boost/2009/07/153899.php

With that simple test, Spirit beats highly optimized xpressive (1.5 secs vs. 9 secs). You might want to read the whole thread.

What's disappointing is that after I did, finally, post real code, that thread fizzled.

It would be a nice addition to the documentation to have some thorough benchmark tests. Any particular ideas on what good tests would be? Maybe even just a note in the FAQs with a suggestion on how tests could be run, and asking for users to contribute, would yield some help from others. I would imagine that there have been a few users who ran their own internal tests before choosing a solution; if they were made aware of the interest in gathering those results, they might volunteer that data.

I don't agree. If it were Spirit vs. other parsers (yacc, ANTLR, etc.), that would be meaningful, yes. But if it is spirit vs. xpressive, it will be an apple-orange comparison. There are things that are best suited for xpressive that spirit can't do and the same the other way around. These are different tools with some overlap. Real world uses for both tools go beyond this common overlap. Benchmarks that test the common denominator are at best for entertainment only. Would you write a compiler with xpressive? I don't think so. Would you do search and replace with Spirit? Nah. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

Stewart, Robert

4:42 p.m.

Joel de Guzman wrote:

...

On 1/11/2011 12:15 AM, caustik wrote:

...
It would be a nice addition to the documentation to have some thorough benchmark tests. Any particular ideas on what good tests would be? Maybe even just a note in the FAQs with a suggestion on how tests could be run, and asking for users to contribute, would yield some help from others. I would imagine that there have been a few users who ran their own internal tests before choosing a solution; if they were made aware of the interest in gathering those results, they might volunteer that data.

I don't agree.

If it were Spirit vs. other parsers (yacc, ANTLR, etc.), that would be meaningful, yes. But if it is spirit vs. xpressive, it will be an apple-orange comparison. There are things that are best suited for xpressive that spirit can't do and the same the other way around. These are different tools with some overlap. Real world uses for both tools go beyond this common overlap. Benchmarks that test the common denominator are at best for entertainment only. Would you write a compiler with xpressive? I don't think so. Would you do search and replace with Spirit? Nah.

Where there is overlap, there is a choice. One can make that choice based solely upon convenience and familiarity, out of complete ignorance, or based upon benchmarks that show solutions to common problems that reveal differences in code written, compilation times, and runtime performance. With benchmarks for various applications of Spirit, there can be yacc, ANTLR, Xpressive, etc. solutions, as appropriate, to assist developers in selecting the right tool. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

caustik

4:42 p.m.

On Mon, Jan 10, 2011 at 8:35 AM, Joel de Guzman <joel@boost-consulting.com>wrote:

...

On 1/11/2011 12:15 AM, caustik wrote:

...
On Mon, Jan 10, 2011 at 5:02 AM, Stewart, Robert <Robert.Stewart@sig.com> wrote:

...

Joel de Guzman wrote:

...
...
On 1/10/2011 10:03 AM, caustik wrote:

I'm also curious what the different performance characteristics are (Spirit / Xpressive).

I'm not quite keen on apples-oranges comparisons. Both xpressive and Spirit have their place. That said, I am not aware of any formal benchmarks, but there's an informal one posted by Overmind on the Boost users list:

http://lists.boost.org/Archives/boost/2009/07/153899.php

With that simple test, Spirit beats highly optimized xpressive (1.5 secs vs. 9 secs). You might want to read the whole thread.

What's disappointing is that after I did, finally, post real code, that thread fizzled.

It would be a nice addition to the documentation to have some thorough benchmark tests. Any particular ideas on what good tests would be? Maybe even just a note in the FAQs with a suggestion on how tests could be run, and asking for users to contribute, would yield some help from others. I would imagine that there have been a few users who ran their own internal tests before choosing a solution; if they were made aware of the interest in gathering those results, they might volunteer that data.

I don't agree.

If it were Spirit vs. other parsers (yacc, ANTLR, etc.), that would be meaningful, yes. But if it is spirit vs. xpressive, it will be an apple-orange comparison. There are things that are best suited for xpressive that spirit can't do and the same the other way around. These are different tools with some overlap. Real world uses for both tools go beyond this common overlap. Benchmarks that test the common denominator are at best for entertainment only. Would you write a compiler with xpressive? I don't think so. Would you do search and replace with Spirit? Nah.

Right, I didn't intend to imply that the performance measurements should specifically target Spirit vs. Xpressive. It's valuable to know how the different "apples" perform in relation to one another, but it's also valuable to have some absolute measurements, simply to get an idea of how much time your code is going to spend executing the grammar. Of course that's machine dependent, but any modern desktop platform will give at least a rough order of magnitude. For my use case, for example, the grammar will be executed across potentially dozens of machines or more in a map-reduce operation, and being able to plan ahead for where my performance bottlenecks are going to be is really useful.
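As a rough illustration of the kind of absolute measurement described above, here is a minimal timing sketch. Everything in it is an assumption made for illustration: `parse_once` is a hypothetical stand-in for "run the grammar on one input" (it merely counts comma-separated fields and is not Spirit code), and `Timing` and `time_parser` are invented names. The result is machine dependent and only good for an order-of-magnitude estimate.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical stand-in for one run of a grammar over one input.
// It counts comma-separated fields and fails on empty input.
bool parse_once(const std::string& input, std::size_t& fields) {
    if (input.empty()) return false;
    fields = 1;
    for (char c : input) {
        if (c == ',') ++fields;
    }
    return true;
}

struct Timing {
    double seconds_per_call;   // average wall-clock time of one parse
    std::size_t total_fields;  // accumulated result, so the work is observable
};

// Runs the parser `iterations` times and reports the average time per call.
Timing time_parser(const std::string& input, int iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    std::size_t total = 0;
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::size_t fields = 0;
        if (parse_once(input, fields)) total += fields;
    }
    const auto stop = clock::now();
    const double elapsed = std::chrono::duration<double>(stop - start).count();
    return Timing{elapsed / iterations, total};
}
```

Swapping a real grammar invocation in for `parse_once` and feeding it representative input would give the per-call figure needed to estimate the cost across many machines.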

Joel de Guzman

2:20 a.m.

On 1/10/2011 9:58 AM, Eric Niebler wrote:

...

On 1/9/2011 7:09 PM, Joel de Guzman wrote:

...
On 1/9/2011 12:36 AM, Eric Niebler wrote:

...
On 1/7/2011 6:54 PM, caustik wrote:

...
I've come to notice that there seems to be a missing bit of symmetry in spirit with regards to semantic actions.

If a rule matches, and thus executes its semantic action(s), but a rule which includes that rule mismatches, there seems to be no way to "unwind" the code executed down the chain. For example, if one of your semantic actions allocates memory or increments a reference count, how do you free / release that reference in the mismatch scenario? I've thought about using something like a shared_ptr, but it seems like that gets pretty sloppy and unnatural. Is there something you can think of that would work?

Just for reference, xpressive doesn't have this problem because actions are executed lazily. When a sub-pattern matches, its action is placed on a queue. If the pattern matching engine then needs to backtrack, the action is un-queued. Only when the *whole* pattern matches successfully is the entire action sequence executed ... in order, of course.
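The queue-and-unqueue idea described above can be sketched roughly as follows. This is only an illustration of the general technique, not xpressive's actual implementation; `ActionQueue`, `mark`, `defer`, `rollback`, `commit`, and `demo_trace` are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical deferred-action queue (not xpressive's real data structure).
class ActionQueue {
public:
    using Action = std::function<void()>;

    // Remember the queue length before attempting a sub-pattern.
    std::size_t mark() const { return pending_.size(); }

    // A sub-pattern matched: record its action, but do not run it yet.
    void defer(Action a) { pending_.push_back(std::move(a)); }

    // The engine backtracked: drop every action queued since mark `m`.
    void rollback(std::size_t m) {
        assert(m <= pending_.size());
        pending_.resize(m);
    }

    // The whole pattern matched: run the surviving actions in order.
    void commit() {
        for (auto& a : pending_) a();
        pending_.clear();
    }

private:
    std::vector<Action> pending_;
};

// Simulates matching "a" followed by one of two alternatives, where the
// first alternative matches partway and then fails.
std::vector<std::string> demo_trace() {
    std::vector<std::string> log;
    ActionQueue q;
    q.defer([&log] { log.push_back("a"); });
    const std::size_t m = q.mark();
    q.defer([&log] { log.push_back("b"); });  // first alternative starts to match
    q.rollback(m);                            // ...then fails, so its action is dropped
    q.defer([&log] { log.push_back("c"); });  // second alternative matches
    q.commit();                               // whole pattern matched: run "a", then "c"
    return log;
}
```

Because nothing runs until `commit`, a dropped action never produces a side effect that would need cleaning up.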

We've considered that approach before, but what do you really mean by sub-pattern?

Anything that isn't the outermost pattern-match.

...
AFAICT, lazily evaluating actions can only be done at the topmost (start) node since a sub-parser cannot really know that it will be rolled back.

Correct.

...
Alas, that approach does not work well with fully typed attributed grammars and type-erased rules.

I don't know about the "fully typed attributed" part, but type-erasure has nothing to do with it. Xpressive regexes are also type-erased. Only regex algorithms begin the pattern match. That entry point is what sets up the action context, and only the end state of the outermost regex causes the chain of actions to execute. Trust me, it works.

Of course I trust you! :-) But what works for xpressive might not work with Spirit. What is the signature of your lazy actions stored in the queue? How do you make it accept and return arbitrary types (attributes)? Inherited attributes may make use of continuation passing style but synthesized attributes can't. That's what I meant. Of course, attributes do not matter in xpressive. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

Eric Niebler

12 Jan

6:10 a.m.

On 1/9/2011 9:20 PM, Joel de Guzman wrote:

...

On 1/10/2011 9:58 AM, Eric Niebler wrote:

...
I don't know about the "fully typed attributed" part, but type-erasure has nothing to do with it. Xpressive regexes are also type-erased. Only regex algorithms begin the pattern match. That entry point is what sets up the action context, and only the end state of the outermost regex causes the chain of actions to execute. Trust me, it works.

Of course I trust you! :-) But what works for xpressive might not work with Spirit.

What is the signature of your lazy actions stored in the queue?

Ah, that's a perceptive question that cuts to the quick. That function accepts only iterators into the matched string and returns void. No attributes.
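To make that signature concrete, here is a small sketch of a queued action that takes only an iterator pair and returns void. The names (`RangeAction`, `QueuedMatch`, `run_all`, `demo_ranges`) are hypothetical and do not come from xpressive. Since the action returns nothing, any result has to leave through state the action captures, which is why no attribute flows back to the parser.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

using Iter = std::string::const_iterator;

// Deferred action: sees only the matched range and returns nothing.
using RangeAction = std::function<void(Iter, Iter)>;

// One queue entry: the matched range plus the action to run on it later.
struct QueuedMatch {
    Iter first;
    Iter last;
    RangeAction action;
};

// Runs the queued actions in order once the whole match has succeeded.
void run_all(const std::vector<QueuedMatch>& queue) {
    for (const auto& m : queue) m.action(m.first, m.last);
}

std::vector<std::string> demo_ranges() {
    const std::string input = "hello world";
    assert(input.size() >= 11);

    std::vector<std::string> words;
    // Results are written into captured state; there is no return value.
    RangeAction store = [&words](Iter f, Iter l) { words.emplace_back(f, l); };

    // Pretend the engine matched the ranges [0, 5) and [6, 11).
    std::vector<QueuedMatch> queue;
    queue.push_back({input.begin(), input.begin() + 5, store});
    queue.push_back({input.begin() + 6, input.begin() + 11, store});

    run_all(queue);
    return words;
}
```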

...

How do you make it accept and return arbitrary types (attributes)? Inherited attributes may make use of continuation passing style but synthesized attributes can't. That's what I meant. Of course, attributes do not matter in xpressive.

Different design, different trade-offs. I still don't know whether Spirit's attributes preclude lazy action invocation, though. I tend to agree with Joel that performance comparisons between xpressive and Spirit are not that interesting. -- Eric Niebler BoostPro Computing http://www.boostpro.com

25 comments

5 participants

participants (5)

caustik
Eric Niebler
Hartmut Kaiser
Joel de Guzman
Stewart, Robert