
On Thu, 20 Jan 2011 09:59:51 +0100 Matus Chochlik <chochlik@gmail.com> wrote:
Do you see another way to provide those conversions, and automatic verification of proper UTF encoding? (Automatic verification is a very good thing; without it, some people won't verify at all or will forget to, and open up their programs to exploitation.)
Yes, implementing it into std::string in some future standard.
'Fraid that's a little beyond my current level of programming skill. ;-)
Besides the ugly name and the fact that it is a new class? No :)
If you can think of a more-acceptable-but-still-descriptive name for it, I'm all ears. :-)
I have an idea: what about boost::string, which could possibly become the next std::string in the future.
And string16 and string32? We'll have to support UTF-32, as the single-codepoint-per-element type, and UTF-16 (distasteful though it may be) is needed for Windows. Or are you suggesting the utf* types in addition to the boost::string type? If so, I believe the idea has merit.
And the solution is long overdue. Creating utf8_t just puts the problem off; it doesn't really solve it.
I see it as merely easing the transition.
OK, if the long term plan is:
1) design and implement boost::string using UTF-8 doing all the things like code-point iteration, character iteration, convenience stuff like starts-with, ends-with, replace, trim, etc., etc. with as much backward compatibility with std::string as possible without hindering progress
2) try really hard to push it to the standard
then I'm on board with that.
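To make the "automatic verification" and "code-point iteration" ideas in this exchange concrete, here is a minimal sketch, not a proposed boost::string interface. The function name decode_utf8 is hypothetical, and the sketch assumes a C++17 compiler (std::optional, std::string_view), which postdates this thread. It decodes a byte sequence into code points and rejects input that is not well-formed UTF-8 (truncated sequences, stray continuation bytes, overlong forms, surrogates, values above U+10FFFF).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical helper, not part of Boost or the standard library.
// Returns the decoded code points, or std::nullopt if `bytes` is not
// well-formed UTF-8.
std::optional<std::vector<char32_t>> decode_utf8(std::string_view bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 0;   // total bytes in this sequence
        char32_t cp = 0;       // code point being assembled
        char32_t min = 0;      // smallest value allowed for this length

        if (lead < 0x80)                { len = 1; cp = lead;        min = 0x0; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; min = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; min = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; min = 0x10000; }
        else return std::nullopt;  // stray continuation byte or invalid lead

        if (bytes.size() - i < len) return std::nullopt;  // truncated sequence

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return std::nullopt;  // not a continuation
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min) return std::nullopt;                     // overlong form
        if (cp > 0x10FFFF) return std::nullopt;                // out of range
        if (cp >= 0xD800 && cp <= 0xDFFF) return std::nullopt; // surrogate

        out.push_back(cp);
        i += len;
    }
    return out;
}
```

A string class along the lines discussed could run a check like this once, at construction or assignment, so that every later operation can assume valid data. Whether the real design would validate eagerly or lazily is not settled in this thread.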
Some of those could be problematic (I've run across references implying that 0x20 isn't the universal word-separation character, so trim would at least need some extra parameters), but for the most part, I'd agree with it.
--
Chad Nelson
Oak Circle Software, Inc.