something about UTF8

Hi guys, I want to use boost::regex on Windows XP to match Japanese kanji. The kanji are encoded as UTF-8. I want to make sure that after I use the function MultiByteToWideChar to convert the UTF-8 kanji string to a wstring, I can directly use boost::wregex (from the wstring) to match Japanese. Is that right? Appreciate any help.
Worldwind

wind world wrote:
You would need to check the Windows API docs to make sure you're using the API correctly (does it work with UTF-8 as the source? No idea on that), but yes, once you have the text encoded as UTF-16, wregex will behave as you expect. Otherwise you could build regex with ICU support and then match UTF-8 directly; the downside is that you then have a dependency on ICU, which is not a small library.
HTH, John.
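
A minimal sketch of the convert-then-match approach discussed above, not a definitive implementation. The helper names utf8_to_wide and contains_kanji are hypothetical. It passes CP_UTF8 to MultiByteToWideChar to select UTF-8 as the source encoding. To stay self-contained it uses std::wregex as a stand-in for boost::wregex, which offers the same constructor and regex_search call for this usage. The range U+4E00 to U+9FFF (the basic CJK Unified Ideographs block) is an assumption about what counts as kanji, and the non-Windows branch is a simplified decoder that assumes well-formed input and a 32-bit wchar_t.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: convert UTF-8 bytes to a wide string
// (UTF-16 on Windows, one code point per wchar_t in the fallback).
std::wstring utf8_to_wide(const std::string& utf8)
{
    if (utf8.empty()) return std::wstring();
#ifdef _WIN32
    // CP_UTF8 selects UTF-8 as the source encoding. dwFlags is left at 0,
    // which is the value accepted for CP_UTF8 on older Windows versions.
    const int src_len = static_cast<int>(utf8.size());
    // First call: ask how many wide characters the result needs.
    const int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), src_len, NULL, 0);
    if (len <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(len), L'\0');
    // Second call: perform the conversion into the sized buffer.
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), src_len, &wide[0], len);
    return wide;
#else
    // Simplified stand-in for platforms without the Windows API.
    // Assumes well-formed UTF-8 and a 32-bit wchar_t; no validation.
    std::wstring wide;
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        // Number of continuation bytes implied by the lead byte.
        const int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        if (i + static_cast<std::size_t>(extra) >= utf8.size()) break; // truncated
        unsigned long cp = extra == 0 ? lead : (lead & (0x3F >> extra));
        for (int j = 1; j <= extra; ++j) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + j]) & 0x3F);
        }
        wide.push_back(static_cast<wchar_t>(cp));
        i += static_cast<std::size_t>(extra) + 1;
    }
    return wide;
#endif
}

// Hypothetical helper: true if the UTF-8 text contains at least one
// character in U+4E00..U+9FFF (assumed here to stand for "kanji").
bool contains_kanji(const std::string& utf8)
{
    // The universal character names are resolved by the compiler, so the
    // regex engine only sees two plain wide characters as range endpoints.
    static const std::wregex kanji(L"[\u4E00-\u9FFF]+");
    const std::wstring wide = utf8_to_wide(utf8);
    return std::regex_search(wide, kanji);
}
```

For example, contains_kanji("\xE6\xBC\xA2\xE5\xAD\x97") receives the UTF-8 bytes of two kanji (U+6F22, U+5B57) and should report a match, while a plain ASCII string should not. For the ICU route, Boost.Regex built with ICU support provides boost::make_u32regex and boost::u32regex_search in <boost/regex/icu.hpp>, which take UTF-8 input without the conversion step.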

participants (4)
- John Maddock
- OvermindDL1
- Rune Lund Olesen
- wind world