
On Jan 19, 2011, at 11:30 AM, Peter Dimov wrote:
Alexander Lamaison wrote:
There is a straightforward way for Microsoft to migrate Windows to this future: If they add UTF-8 support to their narrow character interface (I am avoiding calling it ANSI due to the negative connotations that has) and add narrow character APIs for all wide character APIs that lack a narrow counterpart, then I believe we could treat POSIX and Windows identically from an encoding point of view.
It would break any programs that currently use the narrow API with any 'exotic' code page (i.e. pretty much anything except 7-bit ASCII).
It will only break programs that depend on a specific code page. Programs that use the narrow API but require neither a specific code page nor a single-byte code page (the exact opposite of exotic) will work fine: they'll simply see an ANSI code page of 65001 (UTF-8). It will still cause a fair amount of breakage, of course, but in principle, the transition path is obvious and straightforward.
What I intended here (but forgot to say explicitly -- sorry) was that Microsoft could allow a process (or thread) to set its local character set to UTF-8. Then all existing code that pays attention to the narrow representation would find that it is UTF-8 and deal correctly with it. Naturally, this migration would take time -- but Microsoft has done that before. They successfully transitioned a large developer base off 16-bit Windows and onto 32-bit Windows (and, incidentally, introduced the wide character API at the same time).
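
To make the distinction above concrete, here is a minimal sketch of the two kinds of narrow-string code being discussed. Both function names are hypothetical and exist only for illustration; neither comes from Windows or from any library. The first treats the narrow string as opaque bytes and searches only for an ASCII separator, so it should behave the same whether the narrow encoding is a single-byte code page or UTF-8. The second assumes one byte per character, which is the kind of code that would break.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: code-page-agnostic. It only searches for the ASCII
// byte '/', which never occurs inside a UTF-8 multibyte sequence, and
// otherwise copies bytes through untouched.
std::string after_last_slash(const std::string& path) {
    std::size_t pos = path.rfind('/');
    return pos == std::string::npos ? path : path.substr(pos + 1);
}

// Hypothetical helper: assumes a single-byte code page (one byte == one
// character). Under UTF-8 the cut can land in the middle of a multibyte
// sequence and leave an invalid trailing byte.
std::string truncate_to_chars(const std::string& text, std::size_t max_chars) {
    return text.substr(0, max_chars);
}
```

For example, with the UTF-8 bytes for "café" (`"caf\xC3\xA9"`), `after_last_slash("dir/caf\xC3\xA9")` returns the name intact, while `truncate_to_chars("caf\xC3\xA9", 4)` returns four bytes ending in a lone `0xC3`, which is not valid UTF-8.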