[boost] Re: Interest in a Recursive Descent Parsing Library

26 Dec 2004


      Joel de Guzman wrote:
...
David Abrahams wrote:
...
Joel de Guzman wrote:
...
Using first-follow makes a parser deterministic, it can't be both.
An LL(1) parser, for instance, doesn't do backtracking.
Yes, but LL(1) is not synonymous with "recursive descent with first/follow."
Hmmm... I don't recall mentioning that. Did I?
No.  I'm arguing that the mere use of first/follow does not in and of
itself make a parser deterministic.
...
...
...
...
...
typically with one symbol lookahead (LL(1)). A strictly predictive
parser is a lot more restrictive in the grammars that one can use,
however. There's no way to make it parse adhoc grammars like the
C pre-processor or, ahem, c++. What I am inclined to do is to mix
determinism with classic spirit's non-determinism. Exactly the
recipe that you are alluding to.
I'm not sure if it is or not -- I'm just getting more confuseder. ;^)
Classic LL(1) does not have the "dynamic failure conditions" that
you mention.
Yes, I know that.  But then, LL(1) is not synonymous with "recursive
descent with first/follow."
Sure. But I don't see the point. What am I missing?
Well, I say "you could use first/follow to prune unproductive parses
while still using dynamic failure conditions to get nondeterminism," and
you come back with "Classic LL(1) doesn't have those."  I assume you're
using that as an argument against my statement, but it's a hollow
argument because classic LL(1) is only one way to use first/follow with
recursive descent.
...
...
...
NLL(1) grammars cannot have ambiguity. If the grammar
is ambiguous, first follow computation will fail (first follow
conflicts).
Hum?  The algorithm for computing the first and follow sets of a symbol
don't have an "otherwise the computation fails" branch, and the
definitions of these sets are simple:
first(a) = the set of tokens that can begin strings derived from a.
  follow(N) = the set of tokens that can follow N in grammatical sequences
You can use this information to eliminate hopeless branches of a parse
and avoid wasting effort backtracking without requiring the grammar to
be LL(1) or LL(2) or... even to be unambiguous at all.
Yes. In theory.
So "first follow computation will fail" is false.  BTW, I don't know
what NLL(1) means.  Nondeterministic LL(1)?  I can't even picture that.
...
...
Maybe we're talking about the same thing.
Yes! That was what I was trying to point out from the start.
However, it's not as simple as it seems.
Why not?
...
...
If you're talking about what I'm describing, I don't think there's
anything particularly radical about it, and I'd be very surprised if it
hadn't been done before.
AFAIK, no. Correct me if I'm wrong.
I once sent you a comparative study of some NLP parsing strategies.  One
of those was using first/follow computation and parsing ambiguous
grammars.  I'm pretty sure they were using backtracking or that they
told me that they had thought of adding it.
...
...
...
An alternative scheme is to let the user switch
strategies (possibly using spirit directives).
Another scheme, used by PCCTS, is to make the parser predictive
by default, and allow the user to use syntactic and semantic
predicates with arbitrary lookahead on specific points in
the grammar.
Eewww, I smell a kludge ;-)
Not really. There's a big problem with backtracking in general:
backtracking and semantic actions do not mix well (e.g. false
triggering).
It's not a problem if you build a parse tree and run the analysis on
that (which is sometimes required for synthesized attributes anyway).
...
In a syntactic and semantic predicate, all actions
are intentionally inhibited.
Have you considered the strategy of
http://docs.biostat.wustl.edu/cgi-bin/info2html?(bison.info.gz)GLR%2520Parse...
(http://tinyurl.com/694kq)
??

-- 
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com