
Andrey Semashev wrote:
I'd like to note that Unicode consumes more memory than narrow encodings.
That depends heavily on the encoding used. UTF-8 is the most popular of the memory-saving Unicode encodings, though, and it roughly doubles the size needed for non-ASCII characters compared to ISO-8859-*, for example: most of them take two bytes instead of one, while ASCII characters stay at one byte. It's not that problematic, though. Alternatives that use even less memory exist, but they have other disadvantages.
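To put rough numbers on that, here is a minimal Python sketch that compares the encoded size of a string in a legacy single-byte encoding and in UTF-8. The helper name `encoded_sizes` and the sample strings are just illustrations of mine, not anything from a library under discussion.

```python
def encoded_sizes(text, legacy_codec):
    """Return (bytes in the legacy codec, bytes in UTF-8) for text."""
    return len(text.encode(legacy_codec)), len(text.encode("utf-8"))


if __name__ == "__main__":
    # Sample strings are arbitrary; each one fits its legacy codec.
    samples = [
        ("hello", "iso-8859-1"),    # pure ASCII: same size in both
        ("naïve", "iso-8859-1"),    # one non-ASCII Latin letter
        ("Привет", "iso-8859-5"),   # all Cyrillic: size doubles
    ]
    for text, codec in samples:
        legacy, utf8 = encoded_sizes(text, codec)
        print(f"{text!r}: {codec} = {legacy} bytes, UTF-8 = {utf8} bytes")
```

So the overhead only shows up in proportion to how much non-ASCII text the strings actually contain.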
This may not be desirable in all cases, especially when the application is not intended to support multiple languages in the majority of its strings (which, in fact, is quite a common case).
Algorithms to handle text boundaries, tailored grapheme clusters, collations (some of which are context-sensitive), etc. are needed to process even a single language correctly. So you need Unicode anyway, and it is better to reuse the Unicode machinery than to build on top of a legacy encoding.
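As a small illustration of why even single-language text needs this kind of processing, here is a Python sketch that groups combining marks with their base character. It is a deliberate simplification of mine and not the real grapheme cluster algorithm from UAX #29 (no tailoring, no Hangul or emoji handling); the function name `naive_clusters` is hypothetical.

```python
import unicodedata


def naive_clusters(text):
    """Group each combining mark with the character before it.

    Simplified on purpose: only looks at the canonical combining
    class, which is far from the full UAX #29 rules.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch   # attach the mark to its base
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


if __name__ == "__main__":
    word = "cafe\u0301"  # "café" with a decomposed e + combining acute
    print(len(word))                  # number of code points
    print(len(naive_clusters(word)))  # number of user-perceived characters
```

A plain French word already has more code points than user-perceived characters here, so cutting or counting by code unit gives the wrong answer.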