
18 Jul 2005, 11:06 p.m.
Basically we want to map anything that might be contained in a std::string or std::wstring to an XML value string. A little investigation makes me think that the appropriate mechanism is to escape all "non-printable" (uh-oh?) characters, or some subset of "problem characters", using the % escape syntax. It looks to me like a non-obvious problem, but I have yet to delve into it.
I think if you used UTF-8 as the output character set you would avoid all these problems. The conversion from UCS-2, etc. to UTF-8 is fairly straightforward, and should be at least as quick as using % escapes. Most importantly, it is more compact (e.g. for Japanese characters, 2-3 bytes instead of 6 bytes for %XX%XX).

Darren
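
To give a rough idea of how small the UCS-2 to UTF-8 conversion mentioned above can be, here is a minimal sketch. The function name `ucs2_to_utf8` is hypothetical, and the use of `std::u16string` as the UCS-2 container is an assumption (it needs C++11); the code treats every code unit as a Basic Multilingual Plane character and does not combine surrogate pairs.

```cpp
#include <string>

// Hypothetical helper (not from any library): encode UCS-2 code units as UTF-8.
// Assumption: input is BMP-only, so surrogate pairs are not combined.
std::string ucs2_to_utf8(const std::u16string& in) {
    std::string out;
    out.reserve(in.size() * 3);  // worst case: 3 bytes per BMP character
    for (char16_t c : in) {
        if (c < 0x80) {
            // 1 byte: plain ASCII is passed through unchanged
            out.push_back(static_cast<char>(c));
        } else if (c < 0x800) {
            // 2 bytes: 110xxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        } else {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xE0 | (c >> 12)));
            out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With this sketch, a single hiragana character such as U+3042 comes out as three bytes. Note that the XML markup characters (`<`, `&`, and so on) would still need the usual entity escaping when the result is written as an XML value.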