The regex documentation says that the size of the data type passed to make_u32regex determines the character encoding (UTF-8, UTF-16 or UTF-32). I passed wchar_t (whose size I think is 4), so that the buffer encoding would be considered UTF-8 by u32regex_search regardless. What I am actually trying to do is a UTF-8 search.

Anjaly G S

On Fri, 2007-09-28 at 10:53 +0100, John Maddock wrote:
Jens Seidel wrote:
That's the valid byte order mark. See e.g. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058
Right: it's a byte order mark for UTF16LE, but the user is trying to read it as a UTF8 sequence.
If the file is indeed UTF16LE then it's up to the user to read it into a sequence of valid UTF16 code points before passing to Boost.Regex.
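As a rough illustration of that reading step, the sketch below turns the raw bytes of a UTF-16LE file into 16-bit code units and drops a leading FF FE byte order mark. The helper name utf16le_bytes_to_units is hypothetical and not part of Boost.Regex, and the sketch assumes the file contents have already been read into a std::string; it does not validate surrogate pairs.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost.Regex): converts the raw bytes of a
// UTF-16LE file into 16-bit code units, skipping a leading byte order mark.
std::vector<std::uint16_t> utf16le_bytes_to_units(const std::string& bytes)
{
    if (bytes.size() % 2 != 0)
        throw std::invalid_argument("UTF-16 input must have an even byte count");

    std::size_t pos = 0;
    // FF FE is the UTF-16LE byte order mark; it is not a valid UTF-8 sequence.
    if (bytes.size() >= 2 &&
        static_cast<unsigned char>(bytes[0]) == 0xFF &&
        static_cast<unsigned char>(bytes[1]) == 0xFE)
        pos = 2;

    std::vector<std::uint16_t> units;
    units.reserve((bytes.size() - pos) / 2);
    for (; pos < bytes.size(); pos += 2) {
        // Little-endian: the low byte comes first, then the high byte.
        const unsigned lo = static_cast<unsigned char>(bytes[pos]);
        const unsigned hi = static_cast<unsigned char>(bytes[pos + 1]);
        units.push_back(static_cast<std::uint16_t>(lo | (hi << 8)));
    }
    return units;
}
```

Since the encoding is chosen from the size of the character type, a begin/end iterator pair over these 2-byte units should then be treated as UTF-16 when handed to u32regex_search, whereas a plain char buffer would be treated as UTF-8.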
HTH, John.
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users