[spirit] alert when backtracking?

Hello, is there any way to detect at runtime or compile time whether Boost.Spirit needs backtracking? It would be nice to be alerted to the fact that the grammar is not LL(1) and thus has potential for optimization. Regards, Arno -- Dr. Arno Schödl | aschoedl@think-cell.com Technical Director think-cell Software GmbH | Chausseestr. 8/E | 10115 Berlin | Germany http://www.think-cell.com | phone +49 30 666473-10 | US phone +1 800 891 8091 Amtsgericht Berlin-Charlottenburg, HRB 85229 | European Union VAT Id DE813474306 Directors: Dr. Markus Hannebauer, Dr. Arno Schoedl

Hello,
is there any way to detect at runtime or compile time whether Boost.Spirit needs backtracking? It would be nice to be alerted to the fact that the grammar is not LL(1) and thus has potential for optimization.
Regards,
Arno [snip] I'm not that familiar with spirit; however, I just can't imagine how it would be practical (i.e. doesn't take sooo long) to calculate
On 03/07/11 10:53, Arno Schödl wrote: this information at compile time. I base this conclusion on the outline for calculating just the lookahead sets for a simple gramar. This outline is located in the boost vault directory: http://www.boostpro.com/vault/index.php?&directory=Strings%20-%20Text%20Processing in zip file: LewiLL1.lhs_rhs.zip According to that file, it requires converging 3 times over a complete traversal of the grammar. Obviously, to do that at compile time, based on reports of the difficulty of doing subrules, it would take much too much time. OTOH, you could calculate the information offline using the code (or rather an updated version of the code) in this other zip file: cfg_lookahead_extends.zip Of course that still wouldn't give you the final answer. You'd finally have to, again, traverse the grammar tree, testing each alternative to see if the lookahead sets intersect for any pair of elements in an alternative. Of course someone may have thought of alternative, but that's how discouraging it looks to me ATM. -Larry

is there any way to detect at runtime or compile time whether Boost.Spirit needs backtracking? It would be nice to be alerted to the fact that the grammar is not LL(1) and thus has potential for optimization.
Spirit.Qi backtracks only when using alternatives. That's easy enough to spot by looking at the granmmar. Regards Hartmut --------------- http://boost-spirit.com

On 03/07/11 17:45, Hartmut Kaiser wrote:
is there any way to detect at runtime or compile time whether Boost.Spirit needs backtracking? It would be nice to be alerted to the fact that the grammar is not LL(1) and thus has potential for optimization.
Spirit.Qi backtracks only when using alternatives. That's easy enough to spot by looking at the granmmar.
Does spirit backtrack at *every* alternative even if the grammar is LL(1)? IOW, spirit does *not* look ahead 1 token (as an LL(1) parser would) to decide which alternative has to be chosen. Instead, it just keeps trying the alternatives until 1 succeeds, and if none succeed then it backtracks to the last alternative that has not been exhaustively tried. Is that about right? -Larry

On 3/8/2011 9:31 AM, Larry Evans wrote:
On 03/07/11 17:45, Hartmut Kaiser wrote:
is there any way to detect at runtime or compile time whether Boost.Spirit needs backtracking? It would be nice to be alerted to the fact that the grammar is not LL(1) and thus has potential for optimization.
Spirit.Qi backtracks only when using alternatives. That's easy enough to spot by looking at the granmmar.
Does spirit backtrack at *every* alternative even if the grammar is LL(1)? IOW, spirit does *not* look ahead 1 token (as an LL(1) parser would) to decide which alternative has to be chosen. Instead, it just keeps trying the alternatives until 1 succeeds, and if none succeed then it backtracks to the last alternative that has not been exhaustively tried. Is that about right?
Yep, Spirit is PEG: http://en.wikipedia.org/wiki/Parsing_expression_grammar alternatives, to be pedantic, are more accurately called "Ordered choice". Some PEG parsers use tricks such as memoization to minimize the potential quadratic worst case performance, but in Spirit, we did not find the need. Spirit beats ANTLR/PCCTS speed, for example, which is an LL(k) parser. Techniques such as the Nabialek trick and use of syntactic and semantic predicates are enough. Regards, -- Joel de Guzman http://www.boostpro.com http://boost-spirit.com

is there any way to detect at runtime or compile time whether Boost.Spirit needs backtracking? It would be nice to be alerted to the fact that the grammar is not LL(1) and thus has potential for optimization.
Spirit.Qi backtracks only when using alternatives. That's easy enough to spot by looking at the granmmar.
Kleene Stars can also backtrack, right? It tries the rule, and aborts the loop when the rule fails, where failure of the rule may involve backtracking? I know that Spirit.Qi does not build lookahead tables, but simply executes the rule and backtracks. But as long as this backtrack happens after a single character, I'd expect a performance similar to LL(1) parsers like ANTLR - they are effectively doing the same thing. What I'd like to know is if there is more backtracking than one character. Maybe I could write an instrumented iterator that does the check at runtime. Compile time would be nice and possible, but more difficult. Regards, Arno -- Dr. Arno Schödl | aschoedl@think-cell.com Technical Director think-cell Software GmbH | Chausseestr. 8/E | 10115 Berlin | Germany http://www.think-cell.com | phone +49 30 666473-10 | US phone +1 800 891 8091 Amtsgericht Berlin-Charlottenburg, HRB 85229 | European Union VAT Id DE813474306 Directors: Dr. Markus Hannebauer, Dr. Arno Schoedl

On 03/08/11 07:07, Arno Schödl wrote:
is there any way to detect at runtime or compile time whether Boost.Spirit needs backtracking? It would be nice to be alerted to the fact that the grammar is not LL(1) and thus has potential for optimization.
Spirit.Qi backtracks only when using alternatives. That's easy enough to spot by looking at the granmmar.
Kleene Stars can also backtrack, right? It tries the rule, and aborts the loop when the rule fails, where failure of the rule may involve backtracking?
I know that Spirit.Qi does not build lookahead tables, but simply executes the rule and backtracks. But as long as this backtrack happens after a single character, I'd expect a performance similar to LL(1) parsers like ANTLR - they are effectively doing the same thing. What I'd like to know is if there is more backtracking than one character. Maybe I could write an instrumented iterator that does the check at runtime. Compile time would be nice and possible, but more difficult.
Regards,
Arno
[snip] I'd guess that most characters have no semantic actions associated with them, and that, consequently, when semantic actions have to be undone, that this involves backing up the lexer by more than 1 character. Thus, I'd guess the problem reported in this thread: http://sourceforge.net/mailarchive/message.php?msg_id=27132247 would be an example of 'more backtracking than one character'.

is there any way to detect at runtime or compile time whether Boost.Spirit needs backtracking? It would be nice to be alerted to the fact that the grammar is not LL(1) and thus has potential for optimization.
Spirit.Qi backtracks only when using alternatives. That's easy enough to spot by looking at the granmmar.
Kleene Stars can also backtrack, right? It tries the rule, and aborts the loop when the rule fails, where failure of the rule may involve backtracking?
Indeed. Now as I think of it, more parsers do backtrack more than one character (actually all parsers matching more than one character will do so when failing).
I know that Spirit.Qi does not build lookahead tables, but simply executes the rule and backtracks. But as long as this backtrack happens after a single character, I'd expect a performance similar to LL(1) parsers like ANTLR - they are effectively doing the same thing. What I'd like to know is if there is more backtracking than one character. Maybe I could write an instrumented iterator that does the check at runtime. Compile time would be nice and possible, but more difficult.
I'm not sure about compile time detection, but runtime detection should be feasible. Thanks! Regards Hartmut --------------- http://boost-spirit.com
participants (4)
-
Arno Schödl
-
Hartmut Kaiser
-
Joel de Guzman
-
Larry Evans