
Mathias Gaunard wrote:
On another note, while I do think IF_LIKELY for UTF-16 is a good idea, doesn't that heavily penalize certain scripts, such as Asian ones, in the case of UTF-8?
Not really:

- In many cases, documents that use an exotic script actually contain large numbers of ASCII characters; consider an HTML page, for example, which will be full of HTML punctuation and tags. (I believe that I became aware of this after reading something written by a Mozilla person who had been investigating Unicode issues.)

- The penalty of a wrong branch hint is not "heavy". We probably have lots of places in our code where the compiler's heuristic is wrong, but we don't notice until we study the code very carefully (as I did with this UTF-8 code). This is why processors still need to implement dynamic branch prediction.

My normal policy for using compiler branch hints like IF_LIKELY is to compile once with profile-driven optimisation, then find the places where it made a significant difference and add branch hints there. I then get close to the profile-driven-optimised performance without needing to actually re-do the profiling.

Regards, Phil.
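
For readers unfamiliar with this kind of hint, here is a minimal sketch of how an IF_LIKELY-style macro might be applied to the ASCII fast path of a UTF-8 decoder. The macro definitions and the decode_utf8 function are hypothetical illustrations, not the code under discussion; the sketch assumes GCC or Clang for __builtin_expect and assumes well-formed input (it does no validation).

```cpp
#include <cstddef>
#include <string>

// Hypothetical definitions: the real IF_LIKELY may be defined differently.
#if defined(__GNUC__) || defined(__clang__)
#  define IF_LIKELY(cond)   if (__builtin_expect(!!(cond), 1))
#  define IF_UNLIKELY(cond) if (__builtin_expect(!!(cond), 0))
#else
#  define IF_LIKELY(cond)   if (cond)
#  define IF_UNLIKELY(cond) if (cond)
#endif

// Decode one code point starting at s[i] and advance i past it.
// Assumes s holds well-formed UTF-8 and i is at a sequence boundary.
inline char32_t decode_utf8(const std::string& s, std::size_t& i)
{
    const unsigned char b0 = static_cast<unsigned char>(s[i++]);

    // Hinted fast path: a single-byte (ASCII) character.
    IF_LIKELY (b0 < 0x80) {
        return b0;
    }

    if (b0 < 0xE0) {  // two-byte sequence: 110xxxxx 10xxxxxx
        const unsigned char b1 = static_cast<unsigned char>(s[i++]);
        return static_cast<char32_t>(((b0 & 0x1F) << 6) | (b1 & 0x3F));
    }

    if (b0 < 0xF0) {  // three-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx
        const unsigned char b1 = static_cast<unsigned char>(s[i++]);
        const unsigned char b2 = static_cast<unsigned char>(s[i++]);
        return static_cast<char32_t>(((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F));
    }

    // four-byte sequence: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    const unsigned char b1 = static_cast<unsigned char>(s[i++]);
    const unsigned char b2 = static_cast<unsigned char>(s[i++]);
    const unsigned char b3 = static_cast<unsigned char>(s[i++]);
    return static_cast<char32_t>(((b0 & 0x07) << 18) | ((b1 & 0x3F) << 12) |
                                 ((b2 & 0x3F) << 6) | (b3 & 0x3F));
}
```

The hint only influences how the compiler lays out the code, typically keeping the ASCII case on the fall-through path; the processor's dynamic branch predictor still adapts at run time when the input is mostly multi-byte.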