
On 04/27/2011 10:49 AM, Matthew Chambers wrote:
Claims like "non-Western programmers will never use this library" are out of place on this list.
I disagree. It's a relevant data point, even if you don't agree with the reasoning behind it.
I am a total novice at localization but there are a whole lot of "non-Western" programmers who know English/ASCII well enough to use it as keys (I speculate the majority do). Indians in particular have considerable English proficiency and there are quite a lot of them. I chatted with a Chinese grad student here and his impression is that China and South Korea probably follow Japan's pattern of having many programmers without English proficiency, so Ryou does have a point but could have made it in a much nicer way.
I think you're missing his point. Ryou spells it out for you here:
The real obstacle is ASCII. If ASCII is used instead of UTF-8, UTF-16, or UTF-32, we have to use an ASCII-compatible encoding. On Windows and for Japanese, that is CP932 (Microsoft's variant of Shift-JIS).
In that case, we can't translate a program by simply replacing the text. Because Windows can't tell which encoding it is (it can be anything, and we can't detect it heuristically), we have to specify it explicitly.
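To make the detection problem concrete, here is a minimal Python sketch. It is only an illustration: the sample strings and variable names are made up, and it does not use or model the boost.locale API. It shows that bytes written as CP932 also decode without error under other single-byte code pages, that the same bytes are rejected as UTF-8, and that a pure-ASCII key has identical bytes in all of these encodings.

```python
# Hypothetical illustration; strings and names are not from the thread.
japanese = "日本語"
raw = japanese.encode("cp932")  # bytes as a CP932-encoded source file would hold them

# The same bytes decode without error under other legacy code pages,
# so nothing in the data itself says which reading was intended.
as_cp932 = raw.decode("cp932")
as_latin1 = raw.decode("latin-1")
as_cp1251 = raw.decode("cp1251", errors="replace")

# UTF-8 is stricter: these particular bytes are not valid UTF-8 at all.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# A pure-ASCII key has the same byte sequence in every one of these encodings.
key = "Open file"
key_bytes = {key.encode(enc) for enc in ("ascii", "cp932", "latin-1", "utf-8")}
key_is_stable = len(key_bytes) == 1

print(repr(as_cp932), repr(as_latin1), repr(as_cp1251))
print("valid UTF-8:", valid_utf8, "| ASCII key stable:", key_is_stable)
```

This is why an ASCII key sidesteps the ambiguity while a non-ASCII key in a legacy code page needs its encoding stated out of band.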
The pre-Unicode encoding schemes for program text sucked and he's not going back.
On 04/27/2011 10:49 AM, Matthew Chambers wrote:
Specifically, as Steve Bush said earlier, the idea of a program localized only for the east Asian markets is plausible. But at the same time I think it's perfectly acceptable for boost.locale to NOT target that particular use case and instead go for (I assume) the much larger use case of programs that are intended for ALL markets.
As long as everyone recognizes at the outset that it's not going to make everyone happy, possibly to the point of it not being used by those with particular requirements (i.e. Asia).
It would be very informative to see statistics about programmers: their primary development language and secondary language proficiencies.
Of course they know how to spell things in English. We have, what, 26 or 52 characters and they learn 2000 or so for professional work. In the context of program text, Japanese identifiers may be easier to spell in 7-bit ASCII than they are in Japanese.
But oversimplification seems to carry connotations, too. Asian text appears to very quickly diverge from what we think of as "plain text" into what we would consider "desktop publishing". It's probably not something that can be done halfway, yet Ryou is (understandably) unwilling to go back to pre-Unicode. I see no reason to think that Ryou is not representative of Asian developers. If we want this library to be useful to them, we probably need to make Ryou happy.
- Marsh