
On Fri, Jan 14, 2011 at 1:36 PM, Alexander Churanov <alexanderchuranov@gmail.com> wrote:
John,
As I understand it, the choice is between UTF-8 and UTF-16, since UTF-32 is a waste of memory. Given that, there is never a fixed size for a character or constant-time access to the Nth one; both UTF-8 and UTF-16 are variable-length encodings of the code points that UTF-32 stores directly.
Yes, my comment was in response to a comment about UTF-32 as an internal encoding. I'd only use UTF-16 if the APIs I depended on required it, and the conversion could be done at the interface (for example, in a facade). What interests me is whether there's a good reason to use UTF-8 internally and give UTF-32 the same treatment as UTF-16, or vice versa. I do find the simplicity of a fixed-width encoding alluring. By the way, I disagree with Peter's assessment that "you rarely, if ever, need to access the Nth character," but I will gladly cede that this depends on your problem domain.
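
To make the tradeoff concrete, here is a minimal sketch (my own illustration, not anything proposed in this thread; the function names are made up) contrasting how you reach the Nth code point in UTF-8 versus UTF-32:

    // Sketch: Nth-code-point access under a variable-width vs. fixed-width encoding.
    #include <cstdint>
    #include <cstddef>
    #include <string>
    #include <vector>
    #include <iostream>

    // UTF-8: continuation bytes look like 10xxxxxx, so reaching the Nth code
    // point means a linear scan that skips them -- O(n).
    std::size_t utf8_index_of_code_point(const std::string& s, std::size_t n)
    {
        std::size_t count = 0;
        for (std::size_t i = 0; i < s.size(); ++i) {
            if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) { // lead byte
                if (count == n)
                    return i;   // byte offset where the Nth code point starts
                ++count;
            }
        }
        return s.size();        // n is past the end
    }

    // UTF-32: one code unit per code point, so the Nth character is plain
    // array indexing -- O(1), at the cost of four bytes per character.
    char32_t utf32_code_point(const std::vector<char32_t>& s, std::size_t n)
    {
        return s[n];
    }

    int main()
    {
        std::string utf8 = "a\xC3\xA9z";                    // "aéz": 'é' takes two bytes
        std::vector<char32_t> utf32 = { U'a', U'\u00E9', U'z' };

        std::cout << "byte offset of 3rd code point in UTF-8: "
                  << utf8_index_of_code_point(utf8, 2) << '\n';  // prints 3, not 2
        std::cout << "3rd code point via UTF-32 indexing: "
                  << static_cast<std::uint32_t>(utf32_code_point(utf32, 2)) << '\n';
    }

Whether that linear scan matters in practice is exactly the point of disagreement: if you index by character often, the fixed-width form is simpler; if you mostly iterate or search, the scan cost tends to disappear into work you were doing anyway.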