
On Fri, Jan 28, 2011 at 11:02 AM, Sebastian Redl <sebastian.redl@getdesigned.at> wrote:
On 28.01.2011 08:56, Dean Michael Berris wrote:
On Fri, Jan 28, 2011 at 6:47 AM, Gregory Crosswhite <gcross@phys.washington.edu> wrote:
Since there has been a lot of talk about what the name of a new immutable string class should be, may I toss the name "boost::text" into the ring?
Hmm... Unfortunately it denotes the wrong thing for my case.
That's why "text" is the proposed name for the other case. +1 from me.
This was the point of my 'view' template idea: that the view would give some semblance of an appropriate encoding.
I really don't like the name "view". It has strong connotations of non-ownership. It's not meaningful for the actual purpose of a text type: storing text. A text type should store text, not provide a view on a raw sequence of bytes. A view<some_encoding> would be something I would look for if I wanted to get the bytes that make up a text in some_encoding. Not something I would look for if I wanted to store the text.
Calling a text type "view<utf_8>" feels very much to me like calling int "view<little_endian_32_bit>".
*Exactly*
As I said before, encoding is a property of interfacing with things external to my code: 3rd-party libraries, files, network protocols.
That is, given a boost::text object "t", one could convert it into a UTF-8 string by calling "t.utf8_c_str()", a UTF-16 string by calling "t.utf16_c_str()", and so on, depending on what the underlying API is expecting.
And then you run into the problem of having a ton of member functions that encapsulate the logic, instead of having multiple types do the conversion. The member-function idea will not scale and would be hell to manage.
True. How about t.c_str<desired_encoding>()? Put the actual logic for the conversion into the encoding type.
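For illustration, here is a minimal sketch of what "put the conversion logic into the encoding type" could look like. Everything in it is an assumption made for the example: the names utf8_encoding, utf16_encoding and encode_as are hypothetical, the text class is a stand-in that stores code points, and nothing here is an existing or proposed Boost interface. The member returns an owning string rather than a raw pointer, which sidesteps the question of who owns the buffer behind a c_str-style call.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical encoding types; each one owns its own conversion logic.
struct utf8_encoding {
    using string_type = std::string;
    // Encode code points as UTF-8 bytes. Surrogates are not rejected here.
    static string_type encode(const std::u32string& cps) {
        string_type out;
        for (char32_t cp : cps) {
            assert(cp <= 0x10FFFF);  // outside the Unicode code space
            if (cp < 0x80) {
                out += static_cast<char>(cp);
            } else if (cp < 0x800) {
                out += static_cast<char>(0xC0 | (cp >> 6));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                out += static_cast<char>(0xE0 | (cp >> 12));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            } else {
                out += static_cast<char>(0xF0 | (cp >> 18));
                out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
                out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
                out += static_cast<char>(0x80 | (cp & 0x3F));
            }
        }
        return out;
    }
};

struct utf16_encoding {
    using string_type = std::u16string;
    // Encode code points as UTF-16 code units, using surrogate pairs
    // for anything above the Basic Multilingual Plane.
    static string_type encode(const std::u32string& cps) {
        string_type out;
        for (char32_t cp : cps) {
            assert(cp <= 0x10FFFF);
            if (cp < 0x10000) {
                out += static_cast<char16_t>(cp);
            } else {
                cp -= 0x10000;
                out += static_cast<char16_t>(0xD800 | (cp >> 10));
                out += static_cast<char16_t>(0xDC00 | (cp & 0x3FF));
            }
        }
        return out;
    }
};

// Stand-in for the proposed text type. Storing code points is an
// assumption of this sketch, not something the thread settles.
class text {
public:
    explicit text(std::u32string cps) : code_points_(std::move(cps)) {}

    // One member template replaces utf8_c_str(), utf16_c_str(), ...
    template <class Encoding>
    typename Encoding::string_type encode_as() const {
        return Encoding::encode(code_points_);
    }

private:
    std::u32string code_points_;
};
```

With this shape, supporting another encoding means writing one more encoding type; the text class itself does not change.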
+1, although I would not be against c_str<encoding_tag>(my_text) if someone shows that this is better than the member function.
NOTE.1: But I would like to see a special encoding tag for the native encoding, i.e. something like native_char_encoding, native_wchar_encoding or platform_encoding_tag<char>/platform_encoding_tag<wchar_t>.
NOTE.2: UTF-8 is assumed by default.
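A rough sketch of the free-function variant with a defaulted tag might look like the following. All names are hypothetical (utf8_tag, utf32_tag, encoded, and this particular platform_encoding_tag are invented for the example), the text class is a stand-in that stores UTF-8, and the sketch simply pretends that the native narrow encoding is UTF-8, which a real library would have to determine from the locale or the OS. As an owning string is returned, the function is called encoded rather than c_str.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical encoding tags.
struct utf8_tag {};
struct utf32_tag {};

// Hypothetical native-encoding tag. Assumption for this sketch only:
// the platform's narrow encoding is UTF-8.
template <class Char> struct platform_encoding_tag;
template <> struct platform_encoding_tag<char> : utf8_tag {};

// Stand-in for the proposed text type; assumed to hold well-formed UTF-8.
class text {
public:
    explicit text(std::string utf8) : bytes_(std::move(utf8)) {}
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

namespace detail {
// The internal encoding is already UTF-8, so this is a plain copy.
inline std::string convert(const std::string& utf8, utf8_tag) {
    return utf8;
}

// Decode UTF-8 into code points. Assumes well-formed input; it does
// not validate, it only avoids reading past the end of the buffer.
inline std::u32string convert(const std::string& utf8, utf32_tag) {
    std::u32string out;
    for (std::size_t i = 0; i < utf8.size();) {
        unsigned char b = static_cast<unsigned char>(utf8[i]);
        // Number of continuation bytes implied by the lead byte.
        int extra = b < 0x80 ? 0 : b < 0xE0 ? 1 : b < 0xF0 ? 2 : 3;
        char32_t cp = extra == 0 ? b : b & (0x3F >> extra);
        ++i;
        for (; extra > 0 && i < utf8.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i]) & 0x3F);
        out += cp;
    }
    return out;
}
}  // namespace detail

// Free function with UTF-8 as the default tag, dispatching on the tag type.
template <class Tag = utf8_tag>
auto encoded(const text& t) -> decltype(detail::convert(t.bytes(), Tag{})) {
    return detail::convert(t.bytes(), Tag{});
}
```

Here encoded(t) yields UTF-8, encoded<utf32_tag>(t) yields code points, and encoded<platform_encoding_tag<char>>(t) resolves to whatever the platform tag derives from.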
boost::text should store text. The encoding of the underlying bytes in memory shouldn't matter so much.
Yes, I basically don't care what the internal encoding of the string is if the interface 'plays' with Unicode/UTF-8.
[snip/]
Matus