
"Erik Wien" <wien@start.no> wrote in message news:cl1tqh$qp$1@sea.gmane.org...
Hi. I am in the process of planning a library for handling Unicode strings in C++, and would like to probe the interest in the boost community for something like that. I read through the Unicode discussion that was up back in April, and from what I could gather there was some amount of interest, but no one felt comfortable taking on the task as of yet.
[snip]
I really feel the C++ language needs some form of standardized Unicode support, and developing such a library within the boost community would be a very good way to ensure it fits everybody's needs in the best possible way.
If you have any, and I do mean ANY, thoughts on this, please do not hesitate to reply to this mail and let me know. I'm looking forward to your responses.
FWIW, here are my thoughts.

There is no equivalence between a standard string (std::string, std::wstring) and a sequence of characters conforming to a particular encoding (an encoded-string). An encoded-string can (potentially) be converted to a std::string, but not the other way round, because a std::string does not carry adequate information about what its elements mean.

For an encoding scheme to work, the encoding must be provided, and it must be available at run time. The best way to do this for various encodings is to use packets, with headers providing information about the contents, e.g. type of encoding, number of characters, checksum etc.

These packets (including sequences of packets) could themselves be held and manipulated in std::strings, which could then be used for operations where the encoding is not important. This should give the best combination of performance, in both speed and size.

regards
Andy Little
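
To make the packet idea in the reply a bit more concrete, here is a minimal sketch in C++. Nothing in it comes from the original post beyond the three header fields it mentions (type of encoding, number of characters, checksum): the `Encoding` tags, the 10-byte header layout, the additive checksum, and the names `Packet`, `pack` and `unpack` are all assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical encoding tags; the post does not name any.
enum class Encoding : std::uint8_t { Utf8 = 1, Utf16LE = 2 };

struct Packet {
    Encoding encoding;         // known at run time, travels with the data
    std::uint32_t char_count;  // number of characters, not bytes
    std::string payload;       // raw encoded bytes
};

// Placeholder checksum: sum of all payload bytes modulo 256.
std::uint8_t checksum(const std::string& bytes) {
    unsigned sum = 0;
    for (unsigned char c : bytes) sum += c;
    return static_cast<std::uint8_t>(sum & 0xFF);
}

// Append a 32-bit value in big-endian byte order.
void put_u32(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Read a big-endian 32-bit value starting at pos.
std::uint32_t get_u32(const std::string& in, std::size_t pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[pos + i]);
    return v;
}

// Assumed header layout: 1 byte encoding, 4 bytes character count,
// 4 bytes payload size, 1 byte checksum.
constexpr std::size_t kHeaderSize = 10;

// Serialize a packet (header followed by payload) into a plain std::string.
std::string pack(const Packet& p) {
    assert(p.payload.size() <= 0xFFFFFFFFu);
    std::string out;
    out.push_back(static_cast<char>(p.encoding));
    put_u32(out, p.char_count);
    put_u32(out, static_cast<std::uint32_t>(p.payload.size()));
    out.push_back(static_cast<char>(checksum(p.payload)));
    out += p.payload;
    return out;
}

// Read one packet starting at pos and advance pos past it.
Packet unpack(const std::string& in, std::size_t& pos) {
    if (pos > in.size() || in.size() - pos < kHeaderSize)
        throw std::runtime_error("truncated header");
    Packet p;
    p.encoding = static_cast<Encoding>(static_cast<unsigned char>(in[pos]));
    p.char_count = get_u32(in, pos + 1);
    const std::uint32_t size = get_u32(in, pos + 5);
    const std::uint8_t sum = static_cast<std::uint8_t>(in[pos + 9]);
    if (in.size() - pos - kHeaderSize < size)
        throw std::runtime_error("truncated payload");
    p.payload = in.substr(pos + kHeaderSize, size);
    if (checksum(p.payload) != sum)
        throw std::runtime_error("checksum mismatch");
    pos += kHeaderSize + size;
    return p;
}
```

Because `pack` returns an ordinary std::string, several packets can be concatenated, copied or stored without looking at the encoding at all; only code that calls `unpack` needs to interpret the header. A real design would also have to decide what "number of characters" means for a given encoding, which this sketch leaves to the caller.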