Re: [boost] [gsoc] Request Feedback for Boost.Ustr Unicode String Adapter

15 Aug 2011

On 13 August 2011 20:10, Daniel James <dnljms@gmail.com> wrote:
...
On 13 August 2011 19:02, Dave Abrahams <dave@boostpro.com> wrote:
...
I think I agree with Artyom here.  *Somebody* has to decide how that
datatype will be interpreted when we receive it.  Unless we refuse
altogether to accept std::string in our interfaces (which sounds like a
bad idea to me), why not make the decision that it's UTF-8?

Because if the native encoding isn't UTF-8 that will give the wrong
result for cases such as:

int main(int argc, char** argv)
{
   // ....
   boost::filesystem::path p(argv[0]);

As a reader of the long discussions of a new string class, it seems to me
the only solution left is to pass the encoding as a separate entity from the
string to those functions that need it, because:

* A new string class only pushes the problem one level further up from the
library level, and imposes unnecessary copying of data on those who don't
need/want it. There's a myriad of string classes already; yet another
adapter/container doesn't make things cleaner.
* Enforcing UTF-8 possibly breaks existing applications, which assume the
current behaviour (whatever that is).

With the above options discarded, I see this (in some form):

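namespace boost {

enum string_encoding
{
  platform_specific,  // current behaviour, kept as the default
  utf_8
};

namespace filesystem {

class path
{
public:
  // The encoding travels next to the string instead of inside it.
  path(const char* str, string_encoding e = platform_specific);
};

} // namespace filesystem
} // namespace boost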

If Boost were to settle for only two viable encodings, i.e.
platform_specific (or whatever name matches the current behaviour in
related libraries) and utf_8, it would at least imply that utf_8 is the
preferred option for portable code, even if libraries default to
platform_specific for backward compatibility. The utf_8 encoding would take
the route that Artyom advocates, but in a more explicit way.
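
A minimal, self-contained sketch of how this could look from the caller's
side. Everything in it is an assumption made for illustration: the `sketch`
namespace, the stand-in `path` class, and the `bytes()` and `encoding()`
accessors are hypothetical and are not part of Boost.Filesystem. The class
performs no conversion; it only records which encoding the caller declared.

```cpp
#include <cassert>
#include <string>

namespace sketch {  // hypothetical namespace, not part of Boost

// The two encodings proposed above.
enum string_encoding
{
    platform_specific,  // whatever the current behaviour is
    utf_8
};

// Hypothetical stand-in for a path class. It stores the raw bytes and
// the encoding the caller declared for them; it does not convert.
class path
{
public:
    path(const char* str, string_encoding e = platform_specific)
        : bytes_(checked(str)), encoding_(e)
    {
    }

    const std::string& bytes() const { return bytes_; }
    string_encoding encoding() const { return encoding_; }

private:
    // Reject null pointers before they reach std::string's constructor.
    static const char* checked(const char* s)
    {
        assert(s != 0);
        return s;
    }

    std::string bytes_;
    string_encoding encoding_;
};

} // namespace sketch
```

Existing call sites such as `sketch::path p(argv[0]);` compile unchanged and
keep the platform_specific default, while portable code opts in explicitly
with `sketch::path q("caf\xC3\xA9.txt", sketch::utf_8);`.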

Well, that's my two euro cents ;)

cheers,

- Christian

Christian Holmquist