Re: [boost] Interest in a fast lexical analyser compatible

28 Dec 2004

I just wrote a quick and dirty comparison between YARD and Spirit, and YARD
runs roughly 10x faster as a toy C++ tokenizer. I know, Joel, I said I
wouldn't do any comparisons, but I couldn't resist, what with Dave's claim
to be outperforming Spirit by 50x!

YARD's speed advantage is due to the fact that YARD generates the parser at
compile time rather than at run time. To be clear, I am not using an
optimized Spirit grammar; I opted instead to implement both grammars in a
naive and straightforward manner. Here is the full Spirit grammar I used:

      single_comment_p = str_p("//") >> *(~ch_p('\n')) >> ch_p('\n');
      full_comment_p = str_p("/*") >> *(anychar_p - str_p("*/")) >> str_p("*/");
      comment_p = single_comment_p | full_comment_p;
      ws = +(space_p | comment_p);
      escape_char_p = ch_p('\\') >> anychar_p;
      string_literal_p = ch_p('"') >> *(escape_char_p | ~ch_p('"'))
                         >> ch_p('"');
      char_literal_p = ch_p('\'') >> (escape_char_p | ~ch_p('\''))
                       >> ch_p('\'');
      ident_p = (alpha_p | ch_p('_')) >> *(alnum_p | ch_p('_'));
      number_p = real_p;
      cpp_token = ws | char_literal_p | string_literal_p | number_p
                  | ident_p[&inc_counter];
      tokens = *(cpp_token | anychar_p);

I would appreciate any suggestions on how to improve the Spirit grammar. The
YARD grammar is far more verbose; here is only a small snippet:

  struct MatchBeginFullComment : public
    re_and<
      MatchChar<'/'>,
      MatchChar<'*'>
    >
  { };

  struct MatchEndFullComment : public
    re_and<
      MatchChar<'*'>,
      MatchChar<'/'>
    >
  { };

  struct MatchFullComment : public
    re_and<
      MatchBeginFullComment,
      MatchEndFullComment
    >
  { };

  struct MatchComment : public
    re_or<
      MatchSingleLineComment,
      MatchFullComment
    >
  { };
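
For anyone unfamiliar with this style, here is a minimal sketch of how
combinators like these could be written so that the whole parser is fixed at
compile time. This is not YARD's actual implementation: the Input cursor, the
static match() convention, the delimited helper and match_length are all
hypothetical names made up for illustration, and only MatchChar, re_and and
re_or mirror the names in the snippet above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical input cursor (an assumption, not YARD's type).
struct Input {
    const char* pos;
    const char* end;
    bool at_end() const { return pos == end; }
};

// Matches exactly one character C and advances past it on success.
template <char C>
struct MatchChar {
    static bool match(Input& in) {
        if (in.at_end() || *in.pos != C) return false;
        ++in.pos;
        return true;
    }
};

// Sequence: A followed by B. Restores the cursor if either part fails.
template <typename A, typename B>
struct re_and {
    static bool match(Input& in) {
        const char* saved = in.pos;
        if (A::match(in) && B::match(in)) return true;
        in.pos = saved;
        return false;
    }
};

// Alternation: try A first, then B. Each rule leaves the cursor
// untouched on failure, so no extra bookkeeping is needed here.
template <typename A, typename B>
struct re_or {
    static bool match(Input& in) {
        return A::match(in) || B::match(in);
    }
};

struct MatchBeginFullComment : re_and<MatchChar<'/'>, MatchChar<'*'>> {};
struct MatchEndFullComment   : re_and<MatchChar<'*'>, MatchChar<'/'>> {};

// Hypothetical helper: match Begin, then skip characters until End matches.
template <typename Begin, typename End>
struct delimited {
    static bool match(Input& in) {
        const char* saved = in.pos;
        if (!Begin::match(in)) return false;
        while (!in.at_end()) {
            if (End::match(in)) return true;
            ++in.pos;
        }
        in.pos = saved;  // unterminated: report failure, consume nothing
        return false;
    }
};

// Number of characters Rule consumes at the start of text (0 if no match).
template <typename Rule>
std::size_t match_length(const std::string& text) {
    Input in{text.data(), text.data() + text.size()};
    const char* start = in.pos;
    return Rule::match(in) ? static_cast<std::size_t>(in.pos - start) : 0;
}
```

Every rule is a type and every match() is a static function, so the compiler
sees the complete call tree and is free to inline it; nothing is assembled or
dispatched through pointers at run time. For instance,
match_length<delimited<MatchBeginFullComment, MatchEndFullComment>>("/* hi */ x")
consumes the eight characters of the comment.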

Anyway, you get the picture: YARD is verbose but quite fast. I will be
including the full source in the next YARD release.

Christopher Diggins
http://sourceforge.net/projects/yard-parser
