
18 Dec 2008, 9:39 a.m.
wind world wrote:
Hi guys, I want to use boost::regex on Windows XP to match Japanese kanji. The kanji are encoded as UTF-8. After I use the function MultiByteToWideChar to convert the UTF-8 kanji string to a wstring, can I directly use boost::wregex (built from a wstring) to match Japanese?
You would need to check the Windows API docs to make sure you're using the API correctly (does it work with UTF-8 as the source? No idea on that), but yes, once you have the text encoded as UTF-16, wregex will behave as you expect. Alternatively, you could build Boost.Regex with ICU support and then match UTF-8 directly; the downside is that you then have a dependency on ICU, which is not a small library. HTH, John.
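For reference, a minimal sketch of the convert-then-match route. On Windows it calls MultiByteToWideChar with the CP_UTF8 code page, which is one of the documented source code pages for that function. The helper names utf8_to_wide and contains_nihongo are made up for illustration, and std::wregex is used as a stand-in so the sketch has no external dependencies; the assumption is that boost::wregex and boost::regex_search can be swapped in with the same call shape. The non-Windows branch is a simplified hand-written decoder, included only so the sketch also builds elsewhere.

```cpp
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: convert a UTF-8 byte string to a wide string.
std::wstring utf8_to_wide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
#ifdef _WIN32
    // First call asks for the required length in UTF-16 code units.
    const int src_len = static_cast<int>(utf8.size());
    const int len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                        utf8.data(), src_len, nullptr, 0);
    if (len <= 0) throw std::runtime_error("invalid UTF-8");
    std::wstring wide(static_cast<std::size_t>(len), L'\0');
    // Second call performs the actual conversion into the buffer.
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        utf8.data(), src_len, &wide[0], len);
    return wide;
#else
    // Simplified fallback, assuming a 32-bit wchar_t (as on Linux).
    // It does not reject overlong forms or surrogate code points.
    std::wstring wide;
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char b = static_cast<unsigned char>(utf8[i]);
        // Number of continuation bytes implied by the lead byte.
        const int extra = b < 0x80 ? 0
                        : (b >> 5) == 0x06 ? 1
                        : (b >> 4) == 0x0E ? 2
                        : (b >> 3) == 0x1E ? 3 : -1;
        if (extra < 0 || i + static_cast<std::size_t>(extra) >= utf8.size())
            throw std::runtime_error("invalid UTF-8");
        unsigned long cp = extra == 0 ? b : (b & (0x3F >> extra));
        for (int k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80) throw std::runtime_error("invalid UTF-8");
            cp = (cp << 6) | (c & 0x3F);
        }
        wide.push_back(static_cast<wchar_t>(cp));
        i += static_cast<std::size_t>(extra) + 1;
    }
    return wide;
#endif
}

// Hypothetical helper: does the UTF-8 input contain the word U+65E5 U+672C U+8A9E?
bool contains_nihongo(const std::string& utf8) {
    // The pattern is written with universal character names so the result
    // does not depend on the encoding of the source file.
    static const std::wregex pattern(L"\u65E5\u672C\u8A9E");
    const std::wstring wide = utf8_to_wide(utf8);
    return std::regex_search(wide, pattern);
}
```

Passing the UTF-8 bytes of the three kanji to contains_nihongo should report a match, while plain ASCII input should not. The ICU route mentioned above avoids the conversion step entirely, at the cost of the extra dependency.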