
On Thu, 13 Jan 2011 06:35:53 -0800 (PST) Artyom <artyomtnk@yahoo.com> wrote: [...]
Notes:
1. You can also always assume that strings under Windows are UTF-8 and always convert them to wide strings before system calls.
This is, I think, the better approach, but it differs from what most of Boost does. [...]
An interesting thought... I developed a set of ASCII/UTF-8/16/32 classes for my company not too long ago, and I became fairly familiar with the UTF-8 encoding scheme. Only one issue stopped me from assuming that every std::string is UTF-8-encoded: what if the string *isn't* meant as UTF-8, and contains characters with the high bit set? Nothing technically prevents that, and there's no way to determine with complete certainty whether even a string that is valid UTF-8 was intended that way, or whether the UTF-8-like byte sequences are really meant as their high-ASCII values.

Maybe you know something I don't that would allow me to change it? I hope so; it would simplify some of the code greatly.

--
Chad Nelson
Oak Circle Software, Inc.
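
To make the ambiguity above concrete, here is a minimal sketch of a UTF-8 well-formedness check. It is not code from either poster; the function name is_well_formed_utf8 is hypothetical and chosen only for illustration. It shows that such a check can reject some strings as definitely not UTF-8, but cannot confirm that a string which passes was intended as UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Boost or of the classes mentioned above).
// Returns true if s consists only of well-formed UTF-8 sequences: it rejects
// stray continuation bytes, truncated sequences, overlong forms, UTF-16
// surrogates, and code points above U+10FFFF.
bool is_well_formed_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) {  // plain ASCII byte
            ++i;
            continue;
        }
        std::size_t len = 0;
        unsigned long cp = 0;
        if ((b0 & 0xE0) == 0xC0) {
            len = 2; cp = b0 & 0x1F;
        } else if ((b0 & 0xF0) == 0xE0) {
            len = 3; cp = b0 & 0x0F;
        } else if ((b0 & 0xF8) == 0xF0) {
            len = 4; cp = b0 & 0x07;
        } else {
            return false;  // continuation byte or invalid lead byte
        }
        if (n - i < len) return false;  // sequence runs past the end
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (len == 2 && cp < 0x80) return false;     // overlong encoding
        if (len == 3 && cp < 0x800) return false;    // overlong encoding
        if (len == 4 && cp < 0x10000) return false;  // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        if (cp > 0x10FFFF) return false;             // beyond Unicode range
        i += len;
    }
    return true;
}
```

With this sketch, the Latin-1 bytes "caf\xE9" fail the check, so that string cannot be UTF-8. The two bytes "\xC3\xA9" pass, yet they could equally be the UTF-8 encoding of U+00E9 or two separate characters in an 8-bit code page such as Windows-1252. The check narrows the possibilities; it does not recover the author's intent.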