[Boost-users] Re: tokenizer and wstring with VC7.1

22 Feb 2004


I've run the same tests with gcc 3.2.2 on RH9, without any problems, so I'll
post this on the microsoft.public.vc.language newsgroup.

Keith MacDonald

"Keith MacDonald" <boost@mailclan.net> wrote in message
news:c177ag$79p$1@sea.gmane.org...
...
tokenizer worked in Unicode for me, so I experimented with your
example to try to find out what made the difference.  To simplify building
in different modes, I changed it to the following:
// ==== BEGIN CODE ====
// Unicode Build: cl /D_UNICODE /EHsc /IF:\Dev\boost_1_31_0 tok.cpp
// DBCS Build:    cl /EHsc /IF:\Dev\boost_1_31_0 tok.cpp
//
#include <string>
#include <iostream>
#include <boost/tokenizer.hpp>
#ifdef _UNICODE
    typedef std::basic_string<wchar_t>  string_t;
    #define _T(x)                       L##x
    #define STDOUT                      std::wcout
#else
    typedef std::basic_string<char>     string_t;
    #define _T(x)                       x
    #define STDOUT                      std::cout
#endif
typedef string_t::value_type            char_t;
typedef boost::tokenizer <
    boost::char_separator<char_t>,
    string_t::const_iterator,
    string_t
> MyTokenizer;
const boost::char_separator<char_t> sep(_T("a"));
int main()
{
#ifdef _BUG
    MyTokenizer token(string_t(_T("abacadaeafag")), sep);
#else
    string_t    s(_T("abacadaeafag"));
    MyTokenizer token(s, sep);
#endif
    for (MyTokenizer::const_iterator it = token.begin(); it != token.end(); ++it)
        STDOUT << *it;
    return 0;
}
// ==== END CODE ====
The following table shows the output for each combination of _UNICODE and
_BUG being defined (def) or not defined (undef):
_UNICODE    _BUG    Output
-----------------------------
undef       def     " bcdefg"
def         def     ""
undef       undef   "bcdefg"
def         undef   "bcdefg"
It seems that, with VC7.1, the tokenizer constructor is handling temporary
strings incorrectly in both the Unicode and MBCS builds.
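One possible reading of the _BUG rows, offered as an assumption rather than a
confirmed diagnosis: boost::tokenizer keeps a pair of iterators into the
container it is given rather than a copy of it, so a temporary string may
already be destroyed by the time begin() is called. The sketch below does not
use Boost at all. IterTokenizer and join_tokens are hypothetical names made up
for illustration, and the class only mimics the "store iterators, not the
string" design to show why the named-string form is the safe one.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in (not Boost code): like a tokenizer that remembers
// only a [first, last) iterator pair and never copies the string itself.
template <typename Char>
class IterTokenizer {
public:
    typedef std::basic_string<Char>              string_type;
    typedef typename string_type::const_iterator iterator;

    // A const reference parameter also binds to temporaries, which is why
    // a _BUG-style call compiles even though the iterators may dangle later.
    IterTokenizer(const string_type& s, Char sep)
        : first_(s.begin()), last_(s.end()), sep_(sep) {}

    // Collect the non-empty tokens found between separator characters.
    std::vector<string_type> tokens() const {
        std::vector<string_type> out;
        string_type current;
        for (iterator it = first_; it != last_; ++it) {
            if (*it == sep_) {
                if (!current.empty()) {
                    out.push_back(current);
                    current.clear();
                }
            } else {
                current += *it;
            }
        }
        if (!current.empty())
            out.push_back(current);
        return out;
    }

private:
    iterator first_;
    iterator last_;
    Char     sep_;
};

// Safe pattern: the string referred to by 's' outlives the tokenizer,
// so the stored iterators stay valid for the whole traversal.
template <typename Char>
std::basic_string<Char> join_tokens(const std::basic_string<Char>& s, Char sep) {
    IterTokenizer<Char> tok(s, sep);
    const std::vector<std::basic_string<Char> > parts = tok.tokens();
    std::basic_string<Char> joined;
    for (std::size_t i = 0; i < parts.size(); ++i)
        joined += parts[i];
    return joined;
}
```

With this sketch, join_tokens applied to the test string and the separator 'a'
yields "bcdefg" for both char and wchar_t, matching the two rows where _BUG is
not defined. Constructing IterTokenizer directly from a temporary and calling
tokens() in a later statement would read a destroyed string, which is undefined
behaviour and could plausibly produce different garbage in different builds.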
Keith MacDonald