Re: [boost] [locale] Review results for Boost.Locale library

27 Apr 2011

      ----- Original Message ----
...
From: Jeremy Maitin-Shepard <jeremy@jeremyms.com>
On 04/25/2011 11:56 PM, Artyom wrote:
...
...
From: Jeremy  Maitin-Shepard<jeremy@jeremyms.com>
The most significant complaint seems to be the fact that
the translation interface is limited to ASCII (or maybe UTF-8
is also supported, it isn't entirely clear).
[snip]
I imagine relative to the work required for the whole library,
these changes would be quite trivial, and might very well
transform the library from completely unacceptable to
acceptable for a number of objectors on the list,
while having essentially no impact on those that
are happy to use the library as is.
I can say a few words on what can be done and what will never be done.
I will never support wide, char16_t or char32_t strings as keys.
It seems that it is mostly possible to get the desired results using
only char * strings as keys [snip]
However, I don't see why you are so opposed to providing additional
overloads. With MSVC currently, only wide strings can represent the
full range of Unicode. You could provide the definitions in an
alternate static/dynamic library from the char * overloads, so that
there would not even be any substantial space overhead.
Here is how the catalog works: it looks the key up in a hash table and,
as the last stage, compares the strings bytewise.

It is fast and efficient.
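
A minimal sketch of the lookup described above, assuming a catalog that stores keys and translations as plain byte strings. The class name byte_catalog and its members are hypothetical and are not Boost.Locale's actual implementation; the point is only that a hash lookup followed by a bytewise comparison never transcodes anything.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a loaded message catalog (not Boost.Locale code).
// Keys are raw bytes: lookup is a hash followed by a bytewise comparison.
class byte_catalog {
public:
    void add(const std::string& key, const std::string& translation) {
        messages_[key] = translation;
    }

    // Returns the translation, or the key itself when nothing is found.
    // No transcoding happens here, so the same text in two different
    // encodings is simply two different keys.
    std::string get(const std::string& key) const {
        auto it = messages_.find(key);
        return it == messages_.end() ? key : it->second;
    }

private:
    std::unordered_map<std::string, std::string> messages_;
};

// Usage: a known key is translated, an unknown key falls through unchanged.
inline void byte_catalog_demo() {
    byte_catalog cat;
    cat.add("Hello", "Shalom");
    assert(cat.get("Hello") == "Shalom");
    assert(cat.get("Bye") == "Bye");
}
```

Because the comparison is bytewise, a key typed in one encoding will not match the same text stored in another, which is why the encoding of the source literals matters in the rest of this discussion.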

In order to support L"", "", u"" and U"" keys I would need either to
store four variants of the same string so that lookup stays
fast (a waste of memory), or to convert the
string from UTF-16/32 to UTF-8 on every call, which means run-time
memory allocation and conversion.

So no, I'm not going to do this, especially since
it is not portable enough.
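
To make the run-time cost concrete, here is a rough sketch of what converting a U"" key to UTF-8 on each call would involve. The function utf32_to_utf8 is written for illustration only, assumes valid input, and is not part of Boost.Locale.

```cpp
#include <cassert>
#include <string>

// Illustrative only: encode a UTF-32 string as UTF-8.
// Assumes valid code points (no surrogates, nothing above 0x10FFFF).
std::string utf32_to_utf8(const std::u32string& in) {
    std::string out;             // temporary buffer created on every call
    out.reserve(in.size() * 4);  // worst case: four bytes per code point
    for (char32_t c : in) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else if (c < 0x800) {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// Usage: ASCII passes through unchanged.
inline void utf32_to_utf8_demo() {
    assert(utf32_to_utf8(U"foo") == "foo");
}
```

A translate() overload taking U"" keys would have to run something like this before the hash lookup, whereas a char * key can be hashed and compared as it is.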
...
...
One possibility is to provide, on a per-domain basis, a key in the po file,
"X-Boost-Locale-Source-Encoding", so that the user would be able to specify
in the special record (which exists in all message catalogs) something
like:
"X-Boost-Locale-Source-Encoding: windows-936"
or
"X-Boost-Locale-Source-Encoding: UTF-8"
Then, when the catalog is loaded, its keys would be converted
to the encoding named by X-Boost-Locale-Source-Encoding.
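
As a sketch of how a loader might read such a field, assuming the special record's text is available as "Name: value" lines (as in a standard po header). The helper find_header_field is hypothetical, and X-Boost-Locale-Source-Encoding is only the field proposed in this thread, not an established one.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: return the value of a "Name: value" line from the
// catalog's special (empty-msgid) record, or "" if the field is absent.
std::string find_header_field(const std::string& metadata,
                              const std::string& name) {
    std::istringstream lines(metadata);
    std::string line;
    const std::string prefix = name + ":";
    while (std::getline(lines, line)) {
        if (line.compare(0, prefix.size(), prefix) != 0)
            continue;  // some other field
        std::string value = line.substr(prefix.size());
        const char* ws = " \t\r";
        std::string::size_type b = value.find_first_not_of(ws);
        if (b == std::string::npos)
            return "";  // field present but empty
        std::string::size_type e = value.find_last_not_of(ws);
        return value.substr(b, e - b + 1);
    }
    return "";
}

// Usage with the header proposed above.
inline void find_header_field_demo() {
    const std::string meta =
        "Content-Type: text/plain; charset=UTF-8\n"
        "X-Boost-Locale-Source-Encoding: windows-936\n";
    assert(find_header_field(meta, "X-Boost-Locale-Source-Encoding")
           == "windows-936");
}
```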
This isn't a property of the message catalog, but rather a property of
the program itself, and therefore should be specified in the program,
and not in the message catalog, it would seem. Something like the
preprocessor define I mentioned would be a way to do this.
The problem with a define is that I want

 translate("foo") to work automatically, without being a define.

So I need to provide the encoding either in the catalog itself
or when I provide the domain name (it has to be done
per domain because one part of a project may use UTF-8,
another cp936, and yet another only US-ASCII).

So I can either specify it when I load a catalog or in
catalog itself.
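
One way the load-time option could look, purely as an illustration: the key encoding is attached to the domain name with a separator. The "domain/encoding" syntax and the names domain_spec and parse_domain_spec are assumptions made for this sketch, not an interface stated anywhere in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical: a message domain together with the encoding of its keys.
struct domain_spec {
    std::string name;
    std::string key_encoding;  // empty when the caller did not specify one
};

// Splits "domain/encoding" at the first '/'. The separator is an assumption
// of this sketch.
domain_spec parse_domain_spec(const std::string& spec) {
    domain_spec result;
    std::string::size_type slash = spec.find('/');
    if (slash == std::string::npos) {
        result.name = spec;
    } else {
        result.name = spec.substr(0, slash);
        result.key_encoding = spec.substr(slash + 1);
    }
    return result;
}

// Usage: two parts of one project with differently encoded sources.
inline void parse_domain_spec_demo() {
    domain_spec gui = parse_domain_spec("gui/cp936");
    domain_spec core = parse_domain_spec("core");
    assert(gui.name == "gui" && gui.key_encoding == "cp936");
    assert(core.name == "core" && core.key_encoding.empty());
}
```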
...
...
wcout << translate("「平和」"); // convert in runtime from cp939 to UTF-16
...
cout << translate("「平和」"); // convert in runtime from cp939 to UTF-8
[snip]
When you say "convert in runtime", it seems you actually mean the keys
will be converted from UTF-8 to cp939 when the messages are loaded, but
the values will remain UTF-8. Untranslated strings would have to be
converted, I suppose.
Yes, when the catalog is loaded the UTF-8 keys will be converted to cp936 for
best performance, but at runtime the original untranslated keys
should be converted to the target locale.
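
A rough sketch of that scheme. To stay self-contained it uses ISO-8859-1 (Latin-1) as the source encoding in place of cp936, since a real cp936 converter needs large mapping tables; all names here are hypothetical and this is not Boost.Locale code.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Latin-1 -> UTF-8: every byte is one code point below 0x100.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// UTF-8 -> Latin-1 for code points below 0x100. Assumes well-formed input;
// anything outside Latin-1 becomes '?'.
std::string utf8_to_latin1(const std::string& in) {
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < in.size()) {
            unsigned char next = static_cast<unsigned char>(in[++i]);
            out += static_cast<char>(((c & 0x03) << 6) | (next & 0x3F));
        } else {
            out += '?';
            // Skip the continuation bytes of the unrepresentable character.
            while (i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80)
                ++i;
        }
    }
    return out;
}

class transcoding_catalog {
public:
    // Load time: the UTF-8 key from the catalog file is converted once
    // into the encoding used by the program's string literals.
    void load(const std::string& utf8_key, const std::string& utf8_value) {
        messages_[utf8_to_latin1(utf8_key)] = utf8_value;
    }

    // Run time: the literal's bytes are looked up as they are. Only an
    // untranslated key is converted, here to a UTF-8 target.
    std::string get(const std::string& source_key) const {
        auto it = messages_.find(source_key);
        if (it != messages_.end())
            return it->second;
        return latin1_to_utf8(source_key);
    }

private:
    std::unordered_map<std::string, std::string> messages_;
};
```

With this layout the common case, a key that has a translation, pays no conversion cost at lookup time; only the fallback path for untranslated keys converts.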

Artyom