
On Tuesday 19 February 2008 07:40 am, Phil Endecott wrote:
> This code is not yet Boostified (namespaces, directory layout etc.) Most of it compiles but it has hardly been exercised at all. The functionality includes conversion between UTF-8, UCS-2, UCS-4, ASCII and ISO-8859-*.
>
> Things I'd appreciate feedback on:
> - What should the cs_string look like? Basically everywhere that std::string uses an integer position I have the choice of a character position, a unit position, or an iterator - or not providing that function.
> - What character sets are people interested in using (a) at the "edges" of their programs, and (b) in the "core"?

I don't have a lot of experience using non-ASCII strings in my internal code, aside from occasional forays into UTF-8 for special characters, but wouldn't using UCS-4 for the "core" encoding be the sane thing to do? With a UCS-4 encoding, you could use a basic_string<wchar_t> and continue using the familiar API without worrying about the complications and confusion caused by variable-length encodings.

-- 
Frank
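
To make the trade-off above concrete, here is a minimal sketch contrasting character positions and unit positions in UTF-8 with plain indexing in a fixed-width UCS-4 string. The helper names (utf8_length, utf8_unit_offset, utf8_to_ucs4) are hypothetical and are not part of the code under discussion. The sketch assumes well-formed UTF-8 input and does no validation. It uses std::u32string (C++11) rather than basic_string<wchar_t>, because wchar_t is only 16 bits wide on some platforms, so it cannot be relied on to hold UCS-4 everywhere.

```cpp
#include <cstddef>
#include <string>

// Number of characters (code points) in a UTF-8 string: count every byte
// that is not a continuation byte (continuation bytes look like 10xxxxxx).
// Assumption: the input is well-formed UTF-8.
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Unit (byte) offset of the character at character position `pos`.
// Returns s.size() if `pos` is past the end. Note the linear scan: with a
// variable-length encoding, a character position is not a direct index.
std::size_t utf8_unit_offset(const std::string& s, std::size_t pos) {
    std::size_t i = 0;
    for (; i < s.size(); ++i) {
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) {
            if (pos == 0) return i;
            --pos;
        }
    }
    return i;
}

// Decode UTF-8 into UCS-4, one 32-bit unit per character.
// Assumption: well-formed input; malformed sequences are not detected.
std::u32string utf8_to_ucs4(const std::string& s) {
    std::u32string out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char lead = static_cast<unsigned char>(s[i]);
        // Sequence length is determined by the lead byte.
        std::size_t len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        // Keep only the payload bits of the lead byte.
        char32_t cp = len == 1 ? lead : lead & (0xFF >> (len + 1));
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len && i + k < s.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

For the UTF-8 bytes of "naïve" (six units, five characters), character position 3 corresponds to unit offset 4, whereas after utf8_to_ucs4 the same character is simply element [3] of the result. That constant-time indexing is the convenience a fixed-width "core" buys, at the cost of four bytes per character.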