[boost] Re: boost::tokenizer

16 Oct 2004

      "Dirk Gregorius" <dirk@dirkgregorius.de> wrote in message
news:004801c4b215$c88ba9b0$0202a8c0@master...
Hi,
...
I like to break a file into tokens for processing. The file contains
comments which are introduced by "//", "#" and ";". Can I setup the
tokenizer directly such that the comments are skipped? If no, what would
you suggest to erase the comments from my string before processing?

Since no one else has suggested these:

IMO, this sounds more like an application for spirit or regex. In spirit you
would do something approximating:

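// Note: this is untested, but gives an idea of the
// facilities available.

std::vector<std::string> tokens;

spirit::file_iterator<> first("input.dat");
spirit::file_iterator<> last(first.make_end());

// The parsers are passed to parse() inline: a plain rule<> is bound to
// a char const* scanner and would not match file_iterator<>.
spirit::parse_info<spirit::file_iterator<> > lResults = parse(
    first, last,
    // token: a run of non-whitespace characters, appended to tokens
    *( lexeme_d[ +(anychar_p - space_p) ][ push_back_a(tokens) ] ),
    // skipper: whitespace, "//" line comments and nested /* */ comments
    +space_p
    | lexeme_d[ comment_p("//")
              | comment_nest_p("/*","*/")
              ] );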

I certainly think Spirit is worth a look on your part.
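
If the regex route appeals more, here is a rough sketch of that idea. It
uses std::regex from the C++11 standard library as a stand-in for
Boost.Regex, the function names strip_comments and tokenize are made up
for illustration, and it assumes the comment markers "//", "#" and ";"
never occur inside quoted strings.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Remove everything from "//", "#" or ";" up to the end of the line.
// Assumption: comment markers never appear inside quoted strings.
std::string strip_comments(const std::string& text)
{
    static const std::regex comment("(//|#|;)[^\\n]*");
    return std::regex_replace(text, comment, "");
}

// Hypothetical helper: split the comment-free text on whitespace.
std::vector<std::string> tokenize(const std::string& text)
{
    std::vector<std::string> tokens;
    std::istringstream in(strip_comments(text));
    for (std::string tok; in >> tok; )
        tokens.push_back(tok);
    return tokens;
}
```

With this, tokenize("alpha beta // note\n# skipped\ngamma ; x\n") should
give the three tokens alpha, beta and gamma. Once the comments are gone,
the remaining string could just as well be handed to boost::tokenizer
instead of the whitespace split shown here.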

Jeff Flinn