
Hi,

I would like to break a file into tokens for processing. The file contains comments, which are introduced by "//", "#" and ";". Can I set up the tokenizer directly such that the comments are skipped? If not, what would you suggest to erase the comments from my string before processing? Here is what I do right now:

// CODE
ifstream is( "file.txt" );
string file, line;
file.reserve( 2 * 1024 * 1024 );
while ( getline( is, line ) )
{
    TrimHead( line );
    // Skip lines that start with "//" (the original test used '&&',
    // which wrongly skipped any line starting with a single '/').
    if ( line.size() < 2 || line[0] != '/' || line[1] != '/' )
        file.append( line + "\n" ); // Need to append "\n" again to get the right tokens - not very nice
}
typedef tokenizer<char_separator<char> > Tokenizer;
char_separator<char> sep( " \t\n" );
Tokenizer tokens( file, sep );
// END CODE

Another idea was to do the following:

// CODE
ifstream is( "file.txt" );
string line( ( istreambuf_iterator<char>( is ) ), istreambuf_iterator<char>() );
EraseComments( line );
// END CODE

Any help is appreciated.

-Dirk
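[Editorial note: `EraseComments` above is the poster's own hypothetical helper; one possible sketch of it in plain standard C++ follows. It assumes the comment markers "//", "#" and ";" never appear inside quoted strings in the input.]

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Remove everything from a comment introducer ("//", "#" or ";")
// to the end of its line. Every output line keeps its '\n' so a
// whitespace-based tokenizer still sees line boundaries.
std::string EraseComments( const std::string& in )
{
    std::string out;
    out.reserve( in.size() );
    std::string::size_type pos = 0;
    while ( pos < in.size() )
    {
        std::string::size_type eol = in.find( '\n', pos );
        if ( eol == std::string::npos ) eol = in.size();
        std::string line = in.substr( pos, eol - pos );

        // Cut at the earliest comment introducer, if any.
        // npos is the largest size_type value, so std::min works here.
        std::string::size_type cut =
            std::min( std::min( line.find( "//" ), line.find( '#' ) ),
                      line.find( ';' ) );
        if ( cut != std::string::npos ) line.erase( cut );

        out += line;
        out += '\n';
        pos = eol + 1;
    }
    return out;
}
```

Usage: `EraseComments("a b // c\nd # e")` yields `"a b \nd \n"`; the surviving text can then be fed to the tokenizer unchanged.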

Hi There,

There might be a solution for your problem, but it will require some more elaboration. Boost.Tokenizer is currently not the only option for this kind of job. There is also a splitting facility incorporated in the StringAlgo library. You can find it in the CVS.

So what's the difference, and what can you do? StringAlgo's facility is built around a concept called a finder. A finder is something that can search a string for some substring and return the location of it (represented by a pair of iterators).

There is a find_iterator facility. It allows you to iterate through the sequence over the substrings retrieved by a finder. There are two find iterators there: the first one iterates over the matching substrings, the second one over the gaps between them.

So what you can do is write a finder that will skip comments and search for your delimiter, then use split_iterator to do the tokenizing. Note that you will need to load the whole file into a string before processing, since all these facilities need at least a forward iterator, so istreambuf_iterator is not sufficient.

For documentation you can check here:
http://www.meta-comm.com/engineering/resources/cs-win32_metacomm/doc/html/st...

HTH,

Regards,
Pavol

Hello,

Thursday, October 14, 2004, 7:46:49 PM, you wrote:
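[Editorial note: the finder/gap idea Pavol describes can be sketched in plain standard C++ — this is an illustration of the concept, not the actual StringAlgo API. The finder returns the next "separator" range (a run of whitespace, or a comment up to end of line); iterating the gaps between successive matches yields the tokens.]

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::string::const_iterator Iter;
typedef std::pair<Iter, Iter> Range;   // a match: [first, second)

// A toy "finder": locates the next range to skip, treating comments
// ("//", "#", ";" up to end of line) exactly like delimiter runs.
Range find_skip( Iter begin, Iter end )
{
    for ( Iter it = begin; it != end; ++it )
    {
        if ( *it == '#' || *it == ';' ||
             ( *it == '/' && it + 1 != end && *(it + 1) == '/' ) )
        {
            Iter stop = it;
            while ( stop != end && *stop != '\n' ) ++stop;
            return Range( it, stop );          // comment
        }
        if ( *it == ' ' || *it == '\t' || *it == '\n' )
        {
            Iter stop = it;
            while ( stop != end &&
                    ( *stop == ' ' || *stop == '\t' || *stop == '\n' ) )
                ++stop;
            return Range( it, stop );          // whitespace run
        }
    }
    return Range( end, end );                  // no match
}

// Iterating the gaps between matches yields the tokens,
// which is what split_iterator does for you in StringAlgo.
std::vector<std::string> tokenize( const std::string& s )
{
    std::vector<std::string> tokens;
    Iter pos = s.begin();
    while ( pos != s.end() )
    {
        Range m = find_skip( pos, s.end() );
        if ( m.first != pos )
            tokens.push_back( std::string( pos, m.first ) ); // the gap
        pos = ( m.second == m.first ) ? s.end() : m.second;
    }
    return tokens;
}
```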

"Dirk Gregorius" <dirk@dirkgregorius.de> wrote in message news:004801c4b215$c88ba9b0$0202a8c0@master... Hi,
I like to break a file into tokens for processing. The file contains comments which are introduced by "//", "#" and ";". Can I setup the tokenizer directly such that the comments are skipped? If no, what would
you
suggest to erase the comments from my string before processing?
Since no one else has suggested these: IMO, this sounds more like an application for Spirit or regex. In Spirit you would do something approximating:

// CODE
// note this is untested but gives an idea of the
// facilities available.
std::vector<std::string> tokens;

spirit::file_iterator<> first( "input.dat" );
spirit::file_iterator<> last( first.make_end() );

// the skip parser eats whitespace and comments; comment_p("#")
// and comment_p(";") cover the other comment styles you mentioned
spirit::rule<> rSkip =
    +space_p
    | lexeme_d[ comment_p("//")
              | comment_p("#")
              | comment_p(";")
              | comment_nest_p("/*", "*/") ];

// a token is a maximal run of non-whitespace characters
spirit::rule<> rToken =
    lexeme_d[ +(anychar_p - space_p) ][ push_back_a(tokens) ];

parse_info<> lResults = parse( first, last, *rToken, rSkip );
// END CODE

Certainly I think this is worth a look on your part.

Jeff Flinn
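[Editorial note: Jeff also mentions regex. As an illustration — using std::regex from C++11 rather than the Boost.Regex of the time — all three comment styles can be stripped in one pass before tokenizing. `StripComments` is a hypothetical helper name.]

```cpp
#include <cassert>
#include <regex>
#include <string>

// Delete everything from a "//", "#" or ";" up to (not including)
// the end of the line, across the whole input in one call.
std::string StripComments( const std::string& in )
{
    static const std::regex comment( "(//|#|;)[^\n]*" );
    return std::regex_replace( in, comment, "" );
}
```

Usage: `StripComments("x // y\nz # w")` yields `"x \nz "`, which can then be handed to any whitespace tokenizer.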
participants (3)
-
Dirk Gregorius
-
Jeff Flinn
-
Pavol Droba