
[Sorry if this has already been reported, fixed, and/or superseded.]

Looking at <http://www.boost.org/libs/wave/doc/token_ids.html>, I see various kinds of tokens. There are tokens for the preprocessor that are seen by the lexer and don't make it to the preprocessing iterator level. The other sets of tokens do make it to that level, modulo any transformations.

The trigraphs are put in the operator token list. However, trigraphs should not be there. They are processed before anything else, even before the preprocessor tokens, so there should be another level of lexer working here. As is, it doesn't seem that you could use "??=", the trigraph for "#", to start a preprocessor directive:

    ??=include <cstdio>  // this should work

On a related note, I thought maybe Wave should use a generator interface:

    template < typename Iterator, typename FileID >
    class phase1
    {
    public:
        phase1( Iterator b, Iterator e, FileID id );

        operator bool() const;  // TRUE while not done
        cpp_p1_char_type  operator ()();
    };

    template < typename Iterator, typename FileID >
    class phase2
    {
    public:
        explicit  phase2( phase1<Iterator, FileID> const &p );

        operator bool() const;  // TRUE while not done
        cpp_p2_line_string_type  operator ()();
    };

    //...

You generally can't rewind, of course. The cpp_p1_char_type would contain the expanded character's identity AND some indicator of its location (starting iterator, file ID, and line, row, and un-lined offset numbers). The cpp_p2_line_string_type would carry the locations for each character in its string. Then the tokens of later phases would know the location of their first characters.

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
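
P.S. To make the phase-1 idea a bit more concrete, here is a rough sketch of a trigraph-replacing generator. It is only an illustration, not anything Wave provides. The p1_char struct is a hypothetical stand-in for cpp_p1_char_type, the FileID parameter and the starting iterator are left out for brevity, "line" and "column" are my guess at what the location should record, and the Iterator is assumed to be a multi-pass (forward) iterator so that it can look two characters ahead. The string literals in the usage note spell the question marks as \? so that a compiler with trigraphs enabled does not rewrite the example itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for cpp_p1_char_type: the translated character plus
// the location of the first source character it came from.
struct p1_char {
    char value;          // character after trigraph replacement
    std::size_t offset;  // un-lined offset of the first source character
    std::size_t line;    // 1-based line number
    std::size_t column;  // 1-based column within the line
};

// Maps the third character of a two-question-mark sequence to its
// replacement, or returns '\0' if the sequence is not a trigraph.
inline char trigraph_replacement(char c) {
    switch (c) {
        case '=':  return '#';
        case '/':  return '\\';
        case '\'': return '^';
        case '(':  return '[';
        case ')':  return ']';
        case '!':  return '|';
        case '<':  return '{';
        case '>':  return '}';
        case '-':  return '~';
        default:   return '\0';
    }
}

// Sketch of a phase-1 generator (assumes a forward iterator over char).
template <typename Iterator>
class phase1 {
public:
    phase1(Iterator b, Iterator e) : cur_(b), end_(e) {}

    // True while characters remain.
    explicit operator bool() const { return cur_ != end_; }

    // Produces the next phase-1 character; must not be called when done.
    p1_char operator()() {
        assert(cur_ != end_);
        p1_char out{*cur_, offset_, line_, column_};
        std::size_t consumed = 1;
        if (*cur_ == '?') {
            Iterator second = cur_;
            ++second;
            if (second != end_ && *second == '?') {
                Iterator third = second;
                ++third;
                if (third != end_) {
                    char r = trigraph_replacement(*third);
                    if (r != '\0') { out.value = r; consumed = 3; }
                }
            }
        }
        // Advance past everything this character was built from.
        for (std::size_t i = 0; i < consumed; ++i) ++cur_;
        offset_ += consumed;
        if (out.value == '\n') { ++line_; column_ = 1; }
        else column_ += consumed;
        return out;
    }

private:
    Iterator cur_, end_;
    std::size_t offset_ = 0, line_ = 1, column_ = 1;
};
```

Feeding it the text "\?\?=include <cstdio>" should yield '#' as the first character, located at offset 0, line 1, column 1, followed by the rest of the directive unchanged, so a later phase could recognize the directive without knowing a trigraph was involved. Sequences such as "\?\?x" that are not trigraphs pass through one character at a time.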