
On Fri, Oct 28, 2011 at 13:58, Peter Dimov <pdimov@pdimov.com> wrote:
Alf P. Steinbach wrote:
How do I make the following program work with Visual C++ in Windows, using
a narrow character string?
<code>
#include <stdio.h>
#include <fcntl.h>      // _O_U8TEXT
#include <io.h>         // _setmode, _fileno
#include <windows.h>

int main()
{
    //SetConsoleOutputCP( 65001 );
    //_setmode( _fileno( stdout ), _O_U8TEXT );
    printf( "Blåbærsyltetøy! 日本国 кошка!\n" );
}
</code>
Output to a console wasn't our topic so far (and is not one of my strong points), but the specific problem with this program is that the embedded literal is not UTF-8, as the warning C4566 tells us, so there is no way for you to get UTF-8 in the output. (You should be able to set VC++'s code page to 65001, but I don't think you can.)
int main()
{
    // Pass the text as an argument, not as the format string.
    printf( "%s", utf8_encode( L"кошка" ).c_str() );
}
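utf8_encode is not a standard function, and its definition is not shown in this thread. The following is only a minimal sketch of what such a helper might look like, assuming it takes a std::wstring and returns a std::string. It is written portably so that it copes with both 16-bit wchar_t (UTF-16, as in Visual C++) and 32-bit wchar_t; append_utf8 is a hypothetical internal helper. On Windows one would more likely wrap WideCharToMultiByte with CP_UTF8 instead.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: append one Unicode code point to out as UTF-8.
static void append_utf8( std::string& out, unsigned long cp )
{
    if( cp < 0x80 )
    {
        out += static_cast<char>( cp );
    }
    else if( cp < 0x800 )
    {
        out += static_cast<char>( 0xC0 | ( cp >> 6 ) );
        out += static_cast<char>( 0x80 | ( cp & 0x3F ) );
    }
    else if( cp < 0x10000 )
    {
        out += static_cast<char>( 0xE0 | ( cp >> 12 ) );
        out += static_cast<char>( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
        out += static_cast<char>( 0x80 | ( cp & 0x3F ) );
    }
    else
    {
        out += static_cast<char>( 0xF0 | ( cp >> 18 ) );
        out += static_cast<char>( 0x80 | ( ( cp >> 12 ) & 0x3F ) );
        out += static_cast<char>( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
        out += static_cast<char>( 0x80 | ( cp & 0x3F ) );
    }
}

// Sketch of utf8_encode (an assumption, not the code from the thread):
// converts a wide string to UTF-8, combining UTF-16 surrogate pairs and
// replacing unpaired surrogates or out-of-range values with U+FFFD.
std::string utf8_encode( const std::wstring& ws )
{
    std::string out;
    for( std::size_t i = 0; i < ws.size(); ++i )
    {
        unsigned long cp = static_cast<unsigned long>( ws[i] );

        // High surrogate followed by a low surrogate: one code point.
        if( cp >= 0xD800 && cp <= 0xDBFF && i + 1 < ws.size() )
        {
            unsigned long const next = static_cast<unsigned long>( ws[i + 1] );
            if( next >= 0xDC00 && next <= 0xDFFF )
            {
                cp = 0x10000 + ( ( cp - 0xD800 ) << 10 ) + ( next - 0xDC00 );
                ++i;
            }
        }

        // Anything still in the surrogate range, or beyond Unicode, is invalid.
        if( ( cp >= 0xD800 && cp <= 0xDFFF ) || cp > 0x10FFFF )
        {
            cp = 0xFFFD;
        }
        append_utf8( out, cp );
    }
    return out;
}
```

Note that the wide literal itself still has to survive compilation: the compiler must read the source file in the right encoding for L"кошка" to contain the intended code points.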
You don't need to configure anything; in fact, you cannot do it properly in VS. What you can do is:
1) don't use wide-char literals with non-ASCII characters
2) use UTF-8 literals for narrow-char strings.
All you need is to save the source as UTF-8 WITHOUT BOM. Works like a charm on VS2005 and VS2010. Apparently it's portable. The IDE can detect UTF-8 even without BOM ("☑ Auto-detect UTF-8 encoding without signature").
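Whether a narrow literal really ended up as UTF-8 in the binary can be checked at run time. The sketch below is one possible way to do that, not something from this thread: is_valid_utf8 is a hypothetical helper that tests whether a byte string is well-formed UTF-8. A literal that the compiler converted to a legacy code page such as Windows-1251 will normally fail the check, although this is only a heuristic, since some legacy byte sequences happen to be valid UTF-8 too.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if s is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong forms,
// surrogates and code points above U+10FFFF.
bool is_valid_utf8( const std::string& s )
{
    std::size_t i = 0;
    while( i < s.size() )
    {
        unsigned char const b = static_cast<unsigned char>( s[i] );
        std::size_t len = 0;
        unsigned long cp = 0;

        if( b < 0x80 )                      { len = 1; cp = b; }
        else if( b >= 0xC2 && b <= 0xDF )   { len = 2; cp = b & 0x1F; }
        else if( ( b & 0xF0 ) == 0xE0 )     { len = 3; cp = b & 0x0F; }
        else if( b >= 0xF0 && b <= 0xF4 )   { len = 4; cp = b & 0x07; }
        else                                { return false; }

        if( i + len > s.size() ) { return false; }   // truncated sequence

        for( std::size_t k = 1; k < len; ++k )
        {
            unsigned char const c = static_cast<unsigned char>( s[i + k] );
            if( ( c & 0xC0 ) != 0x80 ) { return false; }   // not a continuation byte
            cp = ( cp << 6 ) | ( c & 0x3F );
        }

        if( len == 3 && cp < 0x800 ) { return false; }                        // overlong
        if( len == 4 && ( cp < 0x10000 || cp > 0x10FFFF ) ) { return false; } // overlong or too large
        if( cp >= 0xD800 && cp <= 0xDFFF ) { return false; }                  // surrogate

        i += len;
    }
    return true;
}
```

Calling is_valid_utf8( "кошка" ) in a program built from a BOM-less UTF-8 source would then be expected to return true if the bytes passed through the compiler untouched.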
This is not a practical problem for "proper" applications because Russian text literals should always come from the equivalent of gettext and never be embedded in code.
+1

Personally I'm happy with printf( "Blåbærsyltetøy! 日本国 кошка!\n" ); writing UTF-8. Even if I cannot configure the console, I can still redirect it to a file, and it will correctly save this as UTF-8. Preventing data loss is more important to me.

-- Yakov