
On Tue, 18 Jan 2011 08:48:59 -0500, Dave Abrahams wrote:
> At Tue, 18 Jan 2011 13:27:29 +0200, Peter Dimov wrote:
>> Dave Abrahams wrote:
>>> I think the reason to use separate types is to provide a type-safety barrier between your functions that operate on utf-8 and system or 3rd-party interfaces that don't or may not. In principle, that should force you to think about encoding and decoding at all the places where it may be needed, and should allow you to code naturally and with confidence where everybody is operating in utf8-land.
>> Yes, in principle. It isn't terribly necessary if everybody is operating in UTF-8 land though.
> But they won't be. That's not today's reality.
>> It's a bit like defining a separate integer type for nonnegative ints for type safety reasons - useful in theory, but nobody does it.
> I refer you to Boost.Units
>> If you're designing an interface that takes UTF-8 strings,
> ...as we are...
>> it still may be worth it to have the parameters be of a utf8-specific type, if you want to force your users to think about the encoding of the argument each time they call one of your functions...
> Or, you may want to use a UTF-8-specific type to force users of legacy char* interfaces (and ourselves) to think about decoding each time they call a legacy char* interface.
>> this is a legitimate design decision. If you're in control of the whole program, though, it's usually not worth it - you just keep everything in UTF-8.
> By definition, since we're library designers, we don't have said control. And people *will* be using whatever Boost does with "legacy" non-UTF-8 interfaces.
+1 for every point.

Alex

--
Easy SFTP for Windows Explorer (http://www.swish-sftp.org)
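
For concreteness, the kind of type-safety barrier being argued for above might look roughly like the following. This is a minimal sketch only; the utf8_string class and its members are invented here for illustration and are not an existing or proposed Boost interface.

// Minimal sketch of a UTF-8 "barrier" type.  Purely illustrative: the
// class name and members are invented for this example.

#include <iostream>
#include <string>
#include <utility>

class utf8_string
{
public:
    utf8_string() = default;

    // Explicit construction: the caller asserts "these bytes are UTF-8".
    explicit utf8_string(std::string bytes) : bytes_(std::move(bytes)) {}

    // Explicit escape hatches back to raw bytes for legacy interfaces.
    const std::string& bytes() const { return bytes_; }
    const char*        c_str() const { return bytes_.c_str(); }

private:
    std::string bytes_;  // invariant (by convention): well-formed UTF-8
};

// A UTF-8-aware interface: takes utf8_string, not char*.
void utf8_aware_api(const utf8_string& s)
{
    std::cout << "utf8: " << s.bytes() << '\n';
}

// A legacy narrow interface whose encoding is unspecified.
void legacy_api(const char* narrow)
{
    std::cout << "legacy: " << narrow << '\n';
}

int main()
{
    utf8_string s("caf\xc3\xa9");   // explicit: we claim these bytes are UTF-8

    utf8_aware_api(s);              // fine: everything stays in UTF-8 land

    // legacy_api(s);               // does not compile: no implicit conversion,
                                    // so the encoding question is forced on us here
    legacy_api(s.c_str());          // explicit, and visibly suspect if the
                                    // legacy API expects something other than UTF-8
}

The point of the sketch is simply that nothing converts implicitly in either direction, so every boundary with a char*-based interface forces an explicit, visible decision about encoding.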