Re: [boost] GSoC Unicode library: second preview

Of course, the library works with UTF-8 and UTF-32 just as well; it makes no difference to the generic algorithms (which don't exist yet, but expect substring searching and the like). It's up to you to choose what makes the most sense for your situation (for example, you may choose UTF-8 because you need to interact a lot with programming interfaces expecting that format).
Ok, this is really good.
They should be fairly easy to find. Either you're using the algorithm that does the task correctly, or you're fiddling with the encoding by hand which is likely to be wrong.
They are easy to find in Unicode-aware unit tests, but not in real programs. I once ran a small test to see which Unicode-aware programs support characters outside the BMP, i.e. I tested a glyph that is encoded as a surrogate pair in UTF-16... The results were a total disaster:

- Windows standard dialogs: displayed the character correctly, but every operation such as deletion treated it as two characters. For example, the file name dialog had problems.
- Notepad and the standard text-area widgets showed the same behavior.
- Qt3 did not support surrogate pairs at all and displayed two square "glyphs" (in Qt4 most of this was fixed).
- The Opera web browser had similar problems with editing and displaying such characters.

So... there is a huge problem with this encoding, because such a simple QA test shouldn't give such bad results for so many programs. Also, all the programs that used UTF-8 or UTF-32 internally passed these tests very well.

So I really **do not** suggest recommending this encoding as the "best" one for internal use.

Artyom
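
To illustrate the failure mode described above, here is a minimal C++ sketch of how a character outside the BMP becomes two UTF-16 code units, and how a "delete last character" operation that works on code units leaves half of the pair behind. The function names (`to_utf16`, `naive_backspace`, `backspace`) are hypothetical and are not part of the proposed Boost library or of any of the programs mentioned; this only shows the general mechanism, not how those programs were actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode one Unicode code point as UTF-16 code units (hypothetical helper).
std::vector<std::uint16_t> to_utf16(std::uint32_t cp)
{
    // Valid scalar values only: up to U+10FFFF, excluding the surrogate range.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000)
        return { static_cast<std::uint16_t>(cp) };   // BMP: one code unit
    cp -= 0x10000;                                   // 20 bits remain
    return { static_cast<std::uint16_t>(0xD800 + (cp >> 10)),     // high surrogate
             static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF)) }; // low surrogate
}

// Naive "delete last character": removes exactly one code unit.
// For a non-BMP character this leaves a lone high surrogate behind.
void naive_backspace(std::vector<std::uint16_t>& s)
{
    if (!s.empty())
        s.pop_back();
}

// Surrogate-aware version: removes both halves of a trailing pair.
void backspace(std::vector<std::uint16_t>& s)
{
    if (s.empty())
        return;
    const std::uint16_t last = s.back();
    s.pop_back();
    const bool was_low = last >= 0xDC00 && last <= 0xDFFF;
    if (was_low && !s.empty() && s.back() >= 0xD800 && s.back() <= 0xDBFF)
        s.pop_back();
}
```

For example, `to_utf16(0x1D11E)` (MUSICAL SYMBOL G CLEF) yields the pair `0xD834 0xDD1E`. After `naive_backspace` one code unit is still in the buffer, which is consistent with the "two characters" behavior seen in the dialogs, while `backspace` leaves the buffer empty. Note that this handles code points only; combining sequences would need grapheme-level handling on top, in any encoding.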