Re: [Boost-users] u32regex_search crashes

1 Oct 2007

I am sorry, my last message had a mistake. I wanted to say that I want
to do a search that treats all the data as though it were UTF-32
rather than UTF-8 (as I incorrectly wrote). I don't know whether I am
making myself clear (I am not very good at expressing my opinion).

What I really want to do is a Unicode search on the available data.

						Anjaly G S

On Mon, 2007-10-01 at 09:42 +0100, John Maddock wrote:
...
Anjaly wrote:
...
In the regex documentation it was said that the size of the data type of
the variable passed to make_u32regex determines the character
encoding (UTF-8, UTF-16 or UTF-32).
*For construction of the regex object*.
The search algorithms operate independently on any of UTF8/16/32.
...
I passed wchar_t (whose size I think
is 4) so that the buffer encoding is treated as UTF-8 by
u32regex_search regardless.  Actually I am trying to do a UTF-8
search.
Except the data file you sent *was not valid UTF8*!
It looks like it's probably UTF16LE; in that case it's up to you to decode
the byte order mark and read the text into something that Boost.Regex can 
handle (for example platform-native UTF16).  ICU should have some file IO 
routines for doing that kind of thing: for example for loading a file into a 
UnicodeString type.
HTH, John.
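
The decoding step John describes (check the byte order mark, then turn the
UTF16LE bytes into something with a known encoding) can be sketched with the
standard library alone. This is only an illustration under stated
assumptions: decode_utf16le is a hypothetical helper, not part of Boost or
ICU, and it assumes the whole file has already been read into a std::string
of raw bytes. ICU's own file and conversion routines, as suggested above,
would normally be the better choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a Boost or ICU function): decode a UTF-16LE byte
// buffer, with an optional FF FE byte order mark, into UTF-32 code points.
std::vector<std::uint32_t> decode_utf16le(const std::string& bytes)
{
    std::size_t pos = 0;

    // Skip the UTF-16LE byte order mark (bytes FF FE) if it is present.
    if (bytes.size() >= 2 &&
        static_cast<unsigned char>(bytes[0]) == 0xFF &&
        static_cast<unsigned char>(bytes[1]) == 0xFE)
        pos = 2;

    if ((bytes.size() - pos) % 2 != 0)
        throw std::runtime_error("odd number of bytes in UTF-16 data");

    // Read one little-endian 16-bit code unit starting at byte index i.
    auto unit_at = [&bytes](std::size_t i) -> std::uint32_t {
        const std::uint32_t low  = static_cast<unsigned char>(bytes[i]);
        const std::uint32_t high = static_cast<unsigned char>(bytes[i + 1]);
        return low | (high << 8);
    };

    std::vector<std::uint32_t> out;
    while (pos < bytes.size()) {
        std::uint32_t u = unit_at(pos);
        pos += 2;
        if (u >= 0xD800 && u <= 0xDBFF) {
            // High surrogate: a low surrogate must follow.
            if (pos >= bytes.size())
                throw std::runtime_error("truncated surrogate pair");
            const std::uint32_t lo = unit_at(pos);
            pos += 2;
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::runtime_error("invalid low surrogate");
            u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        }
        out.push_back(u);
    }
    return out;
}
```

Since the elements of the result are 32 bits wide, passing its begin() and
end() iterators to u32regex_search should make Boost.Regex treat the data as
UTF-32, going by the size rule from the documentation quoted above. That
last step is not shown here because it needs Boost built with ICU support.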
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users