Re: [boost] [rfc] Unicode GSoC project

> What kind of real-world use do people have for random access, anyways? Even UTF-32 isn't random access for the things I can think of that people would care about, what with combining codepoints and ligatures and other such things.
I wrote a couple of Unicode text editors that would have been a nightmare if they had not been operating on UTF-32.
> As an aside, I'd like to see comparisons between compressed UTF-8 and compressed UTF-16, since neither one is random-access anyways, and it seems to me that caring about size of text before compression is about as important as the performance of a program with the optimizer turned off.
Actually in a few cases I have seen it is not the compressed size but the conversion performance [memory/CPU] that hurts. It is much better to get the correct encoding for the correct use case.
Yours, Graham

On Fri, May 15, 2009 at 18:34, Graham <Graham@system-development.co.uk> wrote:
>> What kind of real-world use do people have for random access, anyways? Even UTF-32 isn't random access for the things I can think of that people would care about, what with combining codepoints and ligatures and other such things.
> I wrote a couple of Unicode text editors that would have been a nightmare if they had not been operating on UTF-32.
What sort of thing? I would have thought that the most nightmare-inducing stuff would be replacing an "ffi" ligature with an "ff" ligature if someone hit backspace, figuring out how to edit combining codepoints, and other such stuff that's not much different in the various UTFs.
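[Illustration, not part of either message] The point about combining codepoints can be sketched in a few lines of Python, whose str type is indexed by code point much like a UTF-32 buffer. The helper name combining_clusters and the sample string are made up for this sketch, and the grouping rule is a deliberate simplification: it only attaches code points with a non-zero canonical combining class to the preceding base, which is not full UAX #29 grapheme segmentation and says nothing about ligatures.

```python
import unicodedata


def combining_clusters(text):
    """Group each base code point with the combining marks that follow it.

    Assumption: a simplified rule, not full UAX #29 segmentation. Only
    code points with a non-zero canonical combining class are attached
    to the preceding cluster.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch      # extend the current cluster
        else:
            clusters.append(ch)     # start a new cluster
    return clusters


# 'e' followed by U+0301 COMBINING ACUTE ACCENT, then "tude".
sample = "e\u0301tude"

# Indexing by code point is O(1), but index 1 lands on the accent alone,
# not on the second character a user would see.
second_code_point = sample[1]
clusters = combining_clusters(sample)
```

Here len(sample) counts code points while len(clusters) counts the units a backspace key would more plausibly act on, so the two differ even though every code point is directly addressable.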
>> As an aside, I'd like to see comparisons between compressed UTF-8 and compressed UTF-16, since neither one is random-access anyways, and it seems to me that caring about size of text before compression is about as important as the performance of a program with the optimizer turned off.
> Actually in a few cases I have seen it is not the compressed size but the conversion performance [memory/CPU] that hurts. It is much better to get the correct encoding for the correct use case.
Conversion between what?
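[Illustration, not part of either message] For anyone who wants to try the compressed UTF-8 versus compressed UTF-16 comparison asked about above, here is a rough sketch of how it could be measured. It assumes zlib at its highest level as the compressor and UTF-16 little-endian without a BOM; the function name encoded_sizes and the sample text are hypothetical. The outcome depends heavily on the script and length of the text, so no result is claimed here.

```python
import zlib


def encoded_sizes(text):
    """Return {encoding: (raw_bytes, compressed_bytes)} for UTF-8 and UTF-16.

    Assumptions: zlib level 9 as the compressor, UTF-16 little-endian
    with no byte order mark.
    """
    result = {}
    for name in ("utf-8", "utf-16-le"):
        raw = text.encode(name)
        packed = zlib.compress(raw, 9)
        # Sanity check: the compressed bytes must round-trip to the input.
        if zlib.decompress(packed).decode(name) != text:
            raise ValueError("round trip failed for " + name)
        result[name] = (len(raw), len(packed))
    return result


if __name__ == "__main__":
    # Placeholder sample; substitute a real corpus in the scripts of interest.
    sample = "The quick brown fox jumps over the lazy dog. " * 50
    for name, (raw, packed) in encoded_sizes(sample).items():
        print(name, "raw:", raw, "compressed:", packed)
```

Running it over real documents in several scripts (Latin, CJK, mixed markup) would show whether the gap that exists before compression survives it. It does not measure the conversion cost mentioned in the quoted reply.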
Participants (2): Graham, Scott McMurray