I've run the same tests with gcc 3.2.2 on RH9, without any problems, so I'll
post this on the microsoft.public.vc.language newsgroup.
Keith MacDonald
"Keith MacDonald" wrote:
tokenizer worked in Unicode for me, so I experimented with your example to try to find out what made the difference. To simplify building in different modes, I changed it to the following:
// ==== BEGIN CODE ====
// Unicode Build: cl /D_UNICODE /EHsc /IF:\Dev\boost_1_31_0 tok.cpp
// DBCS Build:    cl /EHsc /IF:\Dev\boost_1_31_0 tok.cpp
//
#include <string>
#include <iostream>
#include <boost/tokenizer.hpp>

#ifdef _UNICODE
typedef std::basic_string<wchar_t> string_t;
#define _T(x) L##x
#define STDOUT std::wcout
#else
typedef std::basic_string<char> string_t;
#define _T(x) x
#define STDOUT std::cout
#endif

typedef string_t::value_type char_t;

typedef boost::tokenizer<
    boost::char_separator<char_t>,
    string_t::const_iterator,
    string_t> MyTokenizer;

const boost::char_separator<char_t> sep(_T("a"));

int main()
{
#ifdef _BUG
    // Tokenizer built from a temporary string
    MyTokenizer token(string_t(_T("abacadaeafag")), sep);
#else
    // Tokenizer built from a named string
    string_t s(_T("abacadaeafag"));
    MyTokenizer token(s, sep);
#endif

    for (MyTokenizer::const_iterator it = token.begin(); it != token.end(); ++it)
        STDOUT << *it;

    return 0;
}
// ==== END CODE ====
The following table shows the output for each combination of _UNICODE and _BUG being defined or undefined:
_UNICODE  _BUG   Output
-----------------------------
undef     def    " bcdefg"
def       def    ""
undef     undef  "bcdefg"
def       undef  "bcdefg"
It seems that, with VC7.1, the tokenizer constructor handles temporary strings incorrectly in both Unicode and MBCS builds.
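One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the tokenizer keeps only a pair of iterators into the string it is given and does not copy it, then a temporary string is destroyed at the end of the declaration and the iterators dangle. That would be undefined behaviour on any compiler, and the gcc result could simply be luck. The sketch below illustrates the idea without Boost; IterTokenizer and concat_tokens are hypothetical names made up for this example, not part of any library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a tokenizer that stores only two iterators
// into the caller's string and never copies the string itself.
class IterTokenizer {
public:
    typedef std::string::const_iterator iter_t;

    IterTokenizer(const std::string& s, char sep)
        : first_(s.begin()), last_(s.end()), sep_(sep) {}

    // Collect the non-empty fields found between separator characters.
    std::vector<std::string> tokens() const {
        std::vector<std::string> out;
        std::string cur;
        for (iter_t it = first_; it != last_; ++it) {
            if (*it == sep_) {
                if (!cur.empty()) { out.push_back(cur); cur.clear(); }
            } else {
                cur += *it;
            }
        }
        if (!cur.empty()) out.push_back(cur);
        return out;
    }

private:
    iter_t first_, last_;  // valid only while the source string is alive
    char sep_;
};

// Safe pattern: the string passed in outlives the tokenizer built from it.
std::string concat_tokens(const std::string& text, char sep) {
    IterTokenizer tok(text, sep);
    const std::vector<std::string> parts = tok.tokens();
    std::string joined;
    for (std::size_t i = 0; i < parts.size(); ++i) joined += parts[i];
    return joined;
}

// Unsafe pattern, left as a comment on purpose because it is undefined
// behaviour (this mirrors the _BUG branch of the test program):
//   IterTokenizer tok(std::string("abacadaeafag"), 'a');
//   tok.tokens();   // iterators point into a destroyed temporary
```

Calling concat_tokens("abacadaeafag", 'a') is safe because the string argument lives until the call returns. If this assumption about the tokenizer is right, the workaround is the one the non-_BUG branch already uses: give the string a name so it outlives the tokenizer.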
Keith MacDonald