
Dave Handley wrote:
A colleague and I are working on a fast lexical analyser designed to work alongside Spirit. We currently have some working prototypes, and as such I thought that I would gauge interest in this library. We plan to produce a DFA based lexical analyser that provides output as a set of iterable polymorphic flyweighted tokens. These could then be provided as input to Spirit instead of character iterators (albeit with the addition of a token_p parser in Spirit).
I'm definitely interested in having a look at your library. Besides my general interest in Spirit, I'd like to try it out as an alternative lexing component for Wave. I'm pretty sure this will be interesting for you as well, because Wave already has two different lexers implemented, which gives a good opportunity to compare them in a real environment.
The objectives of our project are as follows:
1) As fast as lex/flex.
2) Simple to use.
3) Rules to generate tokens can be provided both statically and dynamically. Static definition would be through an offline pre-process stage much like lex.
4) Easy to interface with Spirit.
As for a dynamic DFA-based lexer, Wave already uses the Spirit-based SLEX, but a static DFA-based solution is very interesting to look at. Is the DFA generated at compile time?
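To make the static-table idea concrete, here is a minimal sketch of a table-driven DFA tokenizer in C++17. It is not the proposed library's design or API, and every name in it (`TokenKind`, `Token`, `tokenize`, the state and character-class enums) is hypothetical. The transition table is written by hand where an offline, lex-like generator would presumably emit it, and tokens are plain value structs rather than polymorphic flyweights.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical token kinds, for illustration only.
enum class TokenKind { Identifier, Integer, Unknown };

struct Token {
    TokenKind kind;
    std::string_view text;  // view into the input buffer; no copy is made
};

// Character classes keep the transition table small.
enum CharClass : std::size_t { Letter, Digit, Other, NumClasses };

// DFA states. InIdent and InInt are accepting; Dead means "no transition".
enum State : std::size_t { Start, InIdent, InInt, Dead, NumStates };

inline CharClass classify(unsigned char c) {
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_') return Letter;
    if (c >= '0' && c <= '9') return Digit;
    return Other;
}

// Static transition table: kTransitions[state][char class] -> next state.
// A generator would normally produce this; here it is written by hand.
constexpr State kTransitions[NumStates][NumClasses] = {
    /* Start   */ {InIdent, InInt,   Dead},
    /* InIdent */ {InIdent, InIdent, Dead},
    /* InInt   */ {Dead,    InInt,   Dead},
    /* Dead    */ {Dead,    Dead,    Dead},
};

// Splits the input into tokens using longest-match on the DFA above.
// Whitespace is skipped; any other unmatched character becomes Unknown.
std::vector<Token> tokenize(std::string_view input) {
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < input.size()) {
        const unsigned char c = static_cast<unsigned char>(input[pos]);
        if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            ++pos;
            continue;
        }
        State state = Start;
        std::size_t cur = pos;
        // Every non-start, non-dead state is accepting, so running until the
        // DFA dies yields the longest match.
        while (cur < input.size()) {
            const State next =
                kTransitions[state][classify(static_cast<unsigned char>(input[cur]))];
            if (next == Dead) break;
            state = next;
            ++cur;
        }
        if (cur == pos) {
            // No transition out of Start: emit a one-character Unknown token.
            tokens.push_back({TokenKind::Unknown, input.substr(pos, 1)});
            ++pos;
        } else {
            const TokenKind kind =
                (state == InIdent) ? TokenKind::Identifier : TokenKind::Integer;
            tokens.push_back({kind, input.substr(pos, cur - pos)});
            pos = cur;
        }
    }
    return tokens;
}
```

Calling `tokenize("foo 42 bar_1 +")` yields four tokens: two identifiers, one integer, and one `Unknown` for the `+`. Because the table is a `constexpr` array, it is fixed at compile time. A dynamic lexer in the SLEX style would instead build an equivalent table at run time from the supplied rules.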
Finally, as a taster of the work we are doing, I have performed some rather unscientific and arbitrary performance tests on a 20Mb VRML 1.0 file. The performance of flex, Spirit, and our new library are as follows:
flex: < 1 second
new library: about 1 second
Spirit: 40-50 seconds
We should definitely try the upcoming Spirit-2 code base as well, since it should be a lot faster than the current version. Is it possible to have a look at your test code as well? That way we could make a comparison as soon as the Spirit-2 code base evolves.
These results were produced with no semantic actions to slow down any of the 3 solutions. They are also not conclusive, since we are almost an order of magnitude slower on an Athlon 64 bit machine at present.
Regards,
Hartmut