Re: [boost] Clang: Open-source C/C++ front end under development

2 Sep 2007


on Sat Sep 01 2007, Chris Lattner <clattner-AT-apple.com> wrote:
...
...
...
for example, and efficient buffer management (at least in our
context) means that the input to the lexer isn't useful as an
iterator interface.
Well, the kind of input sequence is exactly one thing I would
templatize.
To what benefit?
So people don't have to pay the price of copying their sequence into a
null-terminated memory buffer.
...
In practice, clang requires its input to come from a nul terminated
memory buffer (yes, we do correctly handle embedded nul's in the
input buffer as whitespace).  Here are the pros and cons:
Pros: clang is designed for what we perceive to be the common case.   
In particular, mmap'ing in files almost always implicitly null  
terminates the buffer (if a file is not an even multiple of a page  
size, most major OS's null fill to the end of the page) so we get  
this invariant for free in most cases.  Memory buffers and many  
others are also easy to handle in this scheme.
Further, knowing that we have a sequential memory buffer as an input
makes various optimizations really trivial: for example our block  
comment skipper is vectorized on hosts that support SSE or Altivec.   
Having the nul terminator at the end of the file means that the lexer  
doesn't have to check for "end of buffer" condition in *many* highly  
performance sensitive lexing loops (e.g. lexing identifiers, which  
cannot have a nul in them).
The ability to provide specialized algorithm implementations that take
advantage of special knowledge of the data structure is a strength of
generic programming.
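
As a minimal sketch of that point (the names skip_identifier,
is_ident_char and nul_terminated_buffer are hypothetical, not taken
from clang or Boost): a generic identifier scanner can accept any
iterator range, while an overload for a buffer known to end in a nul
can drop the end-of-buffer test from its inner loop.

```cpp
#include <cassert>

// Hypothetical helper: characters allowed inside a C identifier.
inline bool is_ident_char(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_';
}

// Generic version: works for any iterator range, but has to test for
// the end of the range on every step.
template <typename Iterator>
Iterator skip_identifier(Iterator first, Iterator last) {
    while (first != last && is_ident_char(*first))
        ++first;
    return first;
}

// Hypothetical wrapper that records the invariant *end == '\0'.
struct nul_terminated_buffer {
    const char* begin;
    const char* end;  // points at the terminating nul
};

// Specialized version: a nul is never an identifier character, so the
// terminator itself stops the loop and no end check is needed.
inline const char* skip_identifier(const nul_terminated_buffer& buf) {
    assert(*buf.end == '\0');  // precondition assumed by this overload
    const char* p = buf.begin;
    while (is_ident_char(*p))
        ++p;
    return p;
}
```

Callers holding an arbitrary sequence use the two-iterator form;
callers that already have a nul-terminated buffer get the cheaper
loop by passing the wrapper, without copying anything.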


-- 
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

The Astoria Seminar ==> http://www.astoriaseminar.com
