Re: [boost] Clang: Open-source C/C++ front end under development

1 Sep 2007

      on Sat Sep 01 2007, Douglas Gregor <doug.gregor-AT-gmail.com> wrote:
...
On Sat, 2007-09-01 at 10:41 -0400, David Abrahams wrote:
...
on Fri Aug 31 2007, Doug Gregor <dgregor-AT-osl.iu.edu> wrote:
...
>>> - Video: http://llvm.org/devmtg/2007-05/09-Naroff-CFE.mov
>> The goal of separating semantic analysis from parsing is a noble one,
>> but it sounds like he may be underestimating the amount of semantic
>> analysis that's required to parse C++.  Inside templates we have the
>> benefit of the typename and template keywords to tell us which are the
>> types, but not inside regular code.  AFAICT that means it has to do
>> template instantiation just to tell whether foo<bar>::x is a type or
>> not.  Am I missing something?
> No, you're technically correct. Some semantic analysis is certainly
> required to parse C++, so you can't completely drop semantic analysis
> and still parse.
Isn't "some" a huge understatement?  I mean, c'mon, you need to do
overload resolution!  Just evaluate
boost::detail::is_incrementable<X>::value for some X, for example.
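To make the overload-resolution point concrete, here is a minimal sketch of a trait whose ::value can only be computed by resolving overloads, and of a statement whose parse depends on that value. This is not Boost's actual is_incrementable implementation; is_incrementable_sketch, pick, and not_incrementable are names invented for illustration, and the code assumes C++11 (decltype, std::declval).

```cpp
#include <utility>

typedef char yes_tag;
struct no_tag { char pad[2]; };

// Hypothetical trait: value is true when ++x compiles for an lvalue x of type T.
template <typename T>
struct is_incrementable_sketch {
    // Viable only if ++u is well-formed; otherwise removed by SFINAE.
    template <typename U>
    static yes_tag test(decltype(void(++std::declval<U&>()))*);
    // Fallback, chosen by overload resolution when the overload above is gone.
    template <typename U>
    static no_tag test(...);

    static const bool value = sizeof(test<T>(0)) == sizeof(yes_tag);
};

// Hypothetical selector: x is a value in the primary template...
template <bool B>
struct pick { enum { x = 3 }; };
// ...and a type in the specialization for true.
template <>
struct pick<true> { typedef int x; };

struct not_incrementable {};

int declares_pointer() {
    int n = 5;
    // The trait is true for int, so x is a type and this declares "int* p".
    pick<is_incrementable_sketch<int>::value>::x * p = &n;
    return *p;
}

int multiplies(int y) {
    // The trait is false here, so x is the enumerator 3 and this is 3 * y.
    return pick<is_incrementable_sketch<not_incrementable>::value>::x * y;
}
```

The parser cannot classify either pick<...>::x without instantiating the trait, which in turn means running overload resolution on test<T>(0).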
...
> However, you can keep the two notions in separate modules,
Sure.
...
> and the
> parser will certainly need to call into the semantic analysis module
> to figure out whether a particular name is a type, a value, a
> template, etc... just like a C parser needs to consult a symbol
> table to figure out whether a name is a typedef name or something
> else.
Yeah, only more so.  At one point he said of the parser, "we don't do
constant folding," but clearly you need to do that to decide whether a
name is a type or not.  

  foo<3*5>::x * y;

It seems to me that for C++ with templates, during parsing you have to
do all the semantic analysis that isn't code generation -- and that's a
lot.
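A small sketch of why the constant has to be folded before the statement can be parsed. The template foo and its specialization are made up for illustration: x is an enumerator in the primary template and a type only when the argument evaluates to 15.

```cpp
// Hypothetical primary template: x is a value.
template <int N>
struct foo {
    enum { x = 2 };
};

// Hypothetical specialization: for N == 15, x is a type.
template <>
struct foo<15> {
    typedef int x;
};

int as_expression(int y) {
    // 4*4 folds to 16, the primary template is used, x is the enumerator 2,
    // so "foo<4*4>::x * y" is a multiplication.
    return foo<4 * 4>::x * y;
}

int as_declaration(int& target) {
    // 3*5 folds to 15, the specialization is used, x is the type int,
    // so this statement declares y as a pointer to int.
    foo<3 * 5>::x * y = &target;
    *y = 42;
    return target;
}
```

The two statements have the same token shape; only evaluating 3*5 and 4*4 and then selecting the right specialization tells the parser which one is a declaration.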
...
> What this probably means is that the "minimal" semantic analysis for
> C++ is a whole lot more heavyweight than the minimal semantic
> analysis for C. But you still get some benefit from separating out
> the semantics from the parser, because there are many semantic bits
> that you *can* ignore if you only want an (unchecked) parse tree.
What, other than code generation?

It has often seemed to me that it might make sense to parse C++
nondeterministically, just to avoid some of these issues.  The number
of real instances of ambiguity is probably pretty small.

Anyway, I should probably take this over to the LLVM list...

-- 
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

The Astoria Seminar ==> http://www.astoriaseminar.com