
Joel de Guzman wrote:
David Abrahams wrote:
Is first/follow really linked to determinism? It seems to me that you can use first/follow sets to prune unproductive searches in a nondeterministic parse, too (?)
Right. First-follow makes a parser predictive and thus deterministic,
Okay, either I'm "right" and first-follow can be used to prune unproductive parses in a nondeterministic parse *or* using first-follow makes a parser deterministic. It can't be both :)
Using first-follow makes a parser deterministic, it can't be both. An LL(1) parser, for instance, doesn't do backtracking.
Yes, but LL(1) is not synonymous with "recursive descent with first/follow."
It always has one and only one choice when faced with alternatives. It's roughly like the switch (deterministic) versus the cascading ifs (non-deterministic).
However... (see below)
As I understand "nondeterministic" it just means you're doing backtracking when a parse gets into trouble, unlike in an NFA where you're conceptually moving along all viable paths in parallel. I don't see why the use of first/follow should preclude backtracking. You can always define ambiguous grammars with dynamic failure conditions.
typically with one symbol lookahead (LL(1)). A strictly predictive parser is a lot more restrictive in the grammars that one can use, however. There's no way to make it parse adhoc grammars like the C pre-processor or, ahem, c++. What I am inclined to do is to mix determinism with classic spirit's non-determinism. Exactly the recipe that you are alluding to.
I'm not sure if it is or not -- I'm just getting more confuseder. ;^)
Classic LL(1) does not have the "dynamic failure conditions" that you mention.
Yes, I know that. But then, LL(1) is not synonymous with "recursive descent with first/follow."
NLL(1) grammars cannot have ambiguity. If the grammar is ambiguous, first follow computation will fail (first follow conflicts).
Hum? The algorithm for computing the first and follow sets of a symbol don't have an "otherwise the computation fails" branch, and the definitions of these sets are simple: first(a) = the set of tokens that can begin strings derived from a. follow(N) = the set of tokens that can follow N in grammatical sequences You can use this information to eliminate hopeless branches of a parse and avoid wasting effort backtracking without requiring the grammar to be LL(1) or LL(2) or... even to be unambiguous at all.
I'm developing a scheme where you can mix both predictive LL(1) with classic spirit's non-determinism. The scheme involves using a predictive front, backed up by a non-predictive fall-back step when ambiguity arises. It's roughly like combining the switch with cascading-ifs.
Maybe we're talking about the same thing.
It's still too early if such a scheme will hold water. AFAIK, no one has done it yet.
If you're talking about what I'm describing, I don't think there's anything particularly radical about it, and I'd be very surprised if it hadn't been done before.
An alternative scheme is to let the user switch strategies (possibly using spirit directives).
Another scheme, used by PCCTS, is to make the parser predictive by default, and allow the user to use syntactic and semantic predicates with arbitrary lookahead on specific points in the grammar.
Eewww, I smell a kludge ;-) -- Dave Abrahams Boost Consulting http://www.boost-consulting.com