Re: [boost] Using lexertl instead of spirit

22 Nov 2006

      Sorry, the first message got sent out too early...
...
David Abrahams wrote:
...
...
Yes, and Slex is the other one
Not to mention XPressive?
Xpressive is not really usable as a lexer, and Eric is aware of that. 
I have a Wave lexer implemented with Xpressive here on my hard disk, 
and it functions well; it is only roughly three orders of magnitude 
slower than, for instance, the re2c-based one. The main reasons are:
- no optimization across the different regexes used for token 
representation (no internal NFA/DFA generation)
- no way to tell which alternative matched when using regexes 
containing alternatives
The first rules out using separate regexes, one for each token; the 
second prevents us from using one giant regex with alternatives...
Both are probably just natural restrictions stemming from the fact 
that Xpressive is a regex library, not a lexer generator.
The same issues would probably occur if we tried to use Boost.Regex 
for this task.
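
To illustrate the first point, here is a minimal sketch of a lexer built 
from separate regexes, one per token. It is not the Wave lexer and does 
not use Xpressive: std::regex stands in as an assumed substitute, and the 
token ids, the Rule struct and the tokenize function are hypothetical 
names made up for this example. What it shows is that every rule has to 
be tried at every input position, because nothing merges the patterns 
into a single DFA.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical token ids, for illustration only.
enum class TokenId { Identifier, Integer, Plus, Whitespace, Unknown };

struct Token {
    TokenId id;
    std::string text;
};

// One independent regex per token type. The patterns share no state,
// so no combined NFA/DFA is ever built from them.
struct Rule {
    TokenId id;
    std::regex pattern;
};

std::vector<Token> tokenize(const std::string& input) {
    static const std::vector<Rule> rules = {
        {TokenId::Identifier, std::regex("[A-Za-z_][A-Za-z0-9_]*")},
        {TokenId::Integer,    std::regex("[0-9]+")},
        {TokenId::Plus,       std::regex("\\+")},
        {TokenId::Whitespace, std::regex("[ \\t\\n]+")},
    };

    std::vector<Token> tokens;
    auto pos = input.cbegin();
    while (pos != input.cend()) {
        TokenId best_id = TokenId::Unknown;
        std::size_t best_len = 0;
        // Every rule is run at the current position and the longest
        // match wins: one regex invocation per rule, per token.
        for (const Rule& rule : rules) {
            std::smatch m;
            if (std::regex_search(pos, input.cend(), m, rule.pattern,
                                  std::regex_constants::match_continuous)) {
                const auto len = static_cast<std::size_t>(m.length(0));
                if (len > best_len) {
                    best_len = len;
                    best_id = rule.id;
                }
            }
        }
        if (best_len == 0) {
            best_len = 1;  // no rule matched: emit one Unknown character
        }
        const auto end = pos + static_cast<std::ptrdiff_t>(best_len);
        tokens.push_back({best_id, std::string(pos, end)});
        pos = end;
    }
    return tokens;
}
```

With four rules this costs four regex runs per token; a full C++ token 
set has far more rules, which is where a generated lexer that walks one 
merged automaton has the advantage.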
FYI, I found my old timings of the different lexer types:

Timing results for the different lexer types included with Wave:

               Re2C             Slex              Xlex
===============================================================================
All C++ tokens, lexer gets instantiated for every C++ token
-------------------------------------------------------------------------------
1000 times     1.63[s]          2.08[s]           1047.60[s]
                                                   751.57[s] (hoisted regex_match struct)
===============================================================================

Regards Hartmut

Hartmut Kaiser
