Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]

21 Jan 2011

On Fri, Jan 21, 2011 at 10:37 AM, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
...
On Thu, 20 Jan 2011 23:26:35 -0800, Patrick Horgan wrote:
...
On 01/20/2011 07:43 AM, Alexander Lamaison wrote:
...
I imagine you wouldn't have UTF-16 and UTF-32 strings being passed about as
a matter of course.  For instance, a UTF-16 string should only be used
just before calling a Windows API function.
If this is the case, it makes sense to make the common case (UTF-8 string)
have a nice name like boost::string and the others which are used for
special situations can have something less snappy like boost::u16string and
boost::u32string.
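
To make the "convert only at the boundary" idea concrete, here is a minimal sketch in portable C++11. The helper name to_utf16 is hypothetical and is not part of Boost or of any proposal in this thread. It assumes the program keeps its text as UTF-8 in std::string and produces UTF-16 only at the point where a wide-character API needs it. Validation is deliberately minimal: overlong forms and encoded surrogates are not rejected.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration, not a Boost API):
// decode UTF-8 code units and re-encode them as UTF-16.
std::u16string to_utf16(const std::string& utf8) {
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of Unicode range");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary planes: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows the result would typically be handed to a wide-character function right away, and the rest of the program would never see the UTF-16 form.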
What would you use for a regular string where you just had, essentially
a vector of char, wchar_t, char8_t, char16_t, char32_t, or unsigned
char, but didn't care about encoding?  I want to differentiate between
this case and the case where I know that there's a particular encoding.
A lot of times you just know you got a string from one system call and
you're passing it to another and you don't care about encoding.
[..]
Good point! boost::u8string then?
Why not boost::string (explicitly stating in the docs that it is UTF-8-based)?
The name u8string suggests to me that it is meant for some special case
of character encoding and that the (encoding-agnostic/native) std::string
is still the way to go.

IMO we should send the message that UTF-8 is
"normal"/"(semi-)standard"/"de-facto-standard"
and that the other encodings like the native_t (or even ansi_t,
ibm_cp_xyz_t, string16_t, string32_t, ...) are the special cases
and should be treated as such.

Matus
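
The naming scheme discussed in this message could be sketched roughly as follows. Everything in the sketch namespace is hypothetical: the tag types, the basic_text template and its members are assumptions made for illustration, not an existing or proposed Boost interface. The point is only that the short name defaults to UTF-8, the other encodings get the longer names, and an encoding-agnostic std::string does not silently convert into an encoding-aware string. No validation of the code units is attempted here.

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {  // hypothetical namespace, not boost::

// Encoding tags (assumed names).
struct utf8_t {};
struct utf16_t {};
struct utf32_t {};

// Maps an encoding tag to the code unit type used for storage.
template <class Encoding> struct code_unit;
template <> struct code_unit<utf8_t>  { using type = char; };
template <> struct code_unit<utf16_t> { using type = char16_t; };
template <> struct code_unit<utf32_t> { using type = char32_t; };

// A string that carries its encoding in its type. UTF-8 is the default.
template <class Encoding = utf8_t>
class basic_text {
public:
    using storage_type =
        std::basic_string<typename code_unit<Encoding>::type>;

    // Explicit: raw code units must be adopted deliberately,
    // since nothing here checks that they are well-formed.
    explicit basic_text(storage_type units) : units_(std::move(units)) {}

    const storage_type& units() const { return units_; }

private:
    storage_type units_;
};

using string    = basic_text<>;         // the common case: UTF-8
using u16string = basic_text<utf16_t>;  // special case
using u32string = basic_text<utf32_t>;  // special case

}  // namespace sketch
```

With this layout, code that does not care about encoding keeps using std::string, and the step from "some bytes" to "known UTF-8" is visible at the call site as sketch::string(bytes).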

Matus Chochlik