Re: [boost] [string] --> [text] ?

28 Jan 2011

On Fri, Jan 28, 2011 at 11:02 AM, Sebastian Redl
<sebastian.redl@getdesigned.at> wrote:
...
On 28.01.2011 08:56, Dean Michael Berris wrote:
...
On Fri, Jan 28, 2011 at 6:47 AM, Gregory Crosswhite
<gcross@phys.washington.edu>  wrote:
...
Since there has been a lot of talk about what the name of a new immutable
string class should be, may I toss the name "boost::text" into the ring?
Hmm... Unfortunately it denotes the wrong thing for my case.
That's why "text" is the proposed name for the other case. +1 from me.
...
This was the point for my 'view' template idea. That the view would
give some semblance of encoding appropriately.
I really don't like the name "view". It has strong connotations of
non-ownership. It's not meaningful for the actual purpose of a text type:
storing text. A text type should store text, not provide a view on a raw
sequence of bytes. A view<some_encoding> would be something I would look for
if I wanted to get the bytes that make up a text in some_encoding. Not
something I would look for if I wanted to store the text.
Calling a text type "view<utf_8>" feels very much to me like calling int
"view<little_endian_32_bit>".
*Exactly*
...
As I said before, encoding is a property of interfacing with things external
to my code. 3rd party libraries, files, network protocols.
...
...
That is, given a boost::text object "t",
one could convert it into a UTF-8 string by calling "t.utf8_c_str()", a
UTF-16 string by calling "t.utf16_c_str()", and so on, depending on what the
underlying API is expecting.
And then you run into the problem of having a ton of member functions
that encapsulate the logic, instead of having multiple types do
the conversion. The member function idea will not scale
appropriately and would be hell to manage.
True. How about t.c_str<desired_encoding>()? Put the actual logic for the
conversion into the encoding type.
+1 although I would not be against

c_str<encoding_tag>(my_text)

if someone shows that this is better than the member function.
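
A minimal sketch of the two spellings discussed above, with the conversion logic placed in the encoding type. Everything here is an assumption made for illustration: the class name "text", the tags utf8_encoding and utf16_encoding, the nested string_type, and the static encode() hook are hypothetical and do not come from any existing Boost library. The sketch also assumes the object stores valid Unicode code points, and it returns an owning string rather than a raw pointer to sidestep lifetime questions.

```cpp
#include <string>
#include <utility>

// Hypothetical encoding tag: owns the UTF-8 conversion logic.
struct utf8_encoding {
    using string_type = std::string;
    static string_type encode(const std::u32string& in) {
        string_type out;
        for (char32_t c : in) {  // assumes every c is a valid Unicode scalar value
            if (c < 0x80) {
                out += static_cast<char>(c);
            } else if (c < 0x800) {
                out += static_cast<char>(0xC0 | (c >> 6));
                out += static_cast<char>(0x80 | (c & 0x3F));
            } else if (c < 0x10000) {
                out += static_cast<char>(0xE0 | (c >> 12));
                out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (c & 0x3F));
            } else {
                out += static_cast<char>(0xF0 | (c >> 18));
                out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
                out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (c & 0x3F));
            }
        }
        return out;
    }
};

// Hypothetical encoding tag: owns the UTF-16 conversion logic.
struct utf16_encoding {
    using string_type = std::u16string;
    static string_type encode(const std::u32string& in) {
        string_type out;
        for (char32_t c : in) {  // c is a copy, so adjusting it below is safe
            if (c < 0x10000) {
                out += static_cast<char16_t>(c);
            } else {
                c -= 0x10000;  // split into a surrogate pair
                out += static_cast<char16_t>(0xD800 | (c >> 10));
                out += static_cast<char16_t>(0xDC00 | (c & 0x3FF));
            }
        }
        return out;
    }
};

// Hypothetical text type; the internal representation is an implementation detail.
class text {
public:
    explicit text(std::u32string code_points) : cp_(std::move(code_points)) {}

    // Member spelling: t.c_str<desired_encoding>(). It only dispatches to the tag.
    template <class Encoding>
    typename Encoding::string_type c_str() const {
        return Encoding::encode(cp_);
    }

private:
    std::u32string cp_;
};

// Free-function spelling: c_str<encoding_tag>(my_text).
template <class Encoding>
typename Encoding::string_type c_str(const text& t) {
    return t.c_str<Encoding>();
}
```

With this shape, supporting another encoding means adding one more tag type; neither the text class nor the free function has to change. One practical difference between the spellings: inside a template where the text object has a dependent type, the member form has to be written as t.template c_str<E>(), while the free function needs no extra keyword.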

NOTE.1: But I would like to see a special encoding tag for the native
encoding, i.e. something like native_char_encoding, native_wchar_encoding,
or platform_encoding_tag<char>/platform_encoding_tag<wchar_t>.

NOTE.2: UTF-8 is assumed by default.
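
The two notes could be expressed at the type level roughly as follows. This is only a sketch under stated assumptions: the tag names follow the spellings suggested in NOTE.1, while the nested char_type, the nested ::type, and the encoded_string alias are hypothetical helpers introduced here. No actual conversion is shown, since the native narrow and wide encodings depend on the platform and locale.

```cpp
#include <string>
#include <type_traits>

// Hypothetical tags; each names the code unit type its strings are built from.
struct utf8_encoding         { using char_type = char; };
struct native_char_encoding  { using char_type = char; };     // platform narrow encoding
struct native_wchar_encoding { using char_type = wchar_t; };  // platform wide encoding

// Maps a character type to the tag for the platform's encoding of it.
// The primary template is left undefined, so unsupported types fail to compile.
template <class CharT> struct platform_encoding_tag;
template <> struct platform_encoding_tag<char>    { using type = native_char_encoding; };
template <> struct platform_encoding_tag<wchar_t> { using type = native_wchar_encoding; };

// Result type of a conversion. UTF-8 is the default when no tag is given (NOTE.2).
template <class Encoding = utf8_encoding>
using encoded_string = std::basic_string<typename Encoding::char_type>;

static_assert(std::is_same<encoded_string<>, std::string>::value,
              "no tag given: UTF-8 code units in a std::string");
static_assert(std::is_same<encoded_string<platform_encoding_tag<wchar_t>::type>,
                           std::wstring>::value,
              "native wide tag: wchar_t code units in a std::wstring");
```

Keeping native_char_encoding distinct from utf8_encoding matters even though both use char: the tag, not the code unit type, is what tells a conversion routine how to interpret the bytes.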
...
boost::text should store text. The encoding of the underlying bytes in
memory shouldn't matter so much.
Yes, I basically don't care what the internal encoding of the string
is if the interface 'plays' with Unicode/UTF-8.

[snip/]

Matus

Matus Chochlik