
Mathias Gaunard wrote:
On 30/04/2011 18:45, Vladimir Prus wrote:
Mathias Gaunard wrote:
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worse problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would automatically prepend the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
Well, if you have a command that can prepend a BOM to a file, you can easily modify 'actions compile-c-c++' in msvc.jam to run that command.
It would be nice if I could only do this when the source files have been tagged as utf-8 or something like that.
Well, it would be trivial to implement syntax like:

    utf8cpp file : file.cpp ;
    exe whatever : whatever.cpp file ;

If that's what you're asking for. In fact, here's a complete example that should almost work, in any Jamfile:

    import type ;
    type.register UTF8CPP : : CPP ;

    import generators ;
    generators.register-standard $(__name__).add-bom : CPP : UTF8CPP ;

    actions add-bom
    {
        add-bom $(>) -o $(<)
    }

    utf8cpp file : file.cpp ;
    exe whatever : whatever.cpp file ;

I say "almost" because somebody who is actually interested in all this should write the 'add-bom' utility, test everything, and do various other boring things, like sending a patch and/or checking in. I think the ball is now in your court.

- Volodya

--
Vladimir Prus
Mentor Graphics
+7 (812) 677-68-40
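
For anyone picking this up, the 'add-bom' utility invoked by the action above does not exist yet. A minimal sketch of what it could look like follows, assuming it is written in Python and accepts the same command line as the action (`add-bom <input> -o <output>`). The function name `add_bom` and the choice to leave an existing BOM alone are assumptions of this sketch, not anything Boost.Build provides or requires.

```python
#!/usr/bin/env python3
"""Hypothetical 'add-bom' utility: copy a file, prepending a UTF-8 BOM."""
import argparse
import codecs
import sys


def add_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 BOM prepended, unless one is already there."""
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(
        description="Prepend a UTF-8 BOM to a source file.")
    parser.add_argument("input", help="source file to read")
    parser.add_argument("-o", "--output", required=True,
                        help="file to write")
    args = parser.parse_args(argv)

    # Work on raw bytes so the file contents are never re-encoded.
    with open(args.input, "rb") as src:
        data = src.read()
    with open(args.output, "wb") as dst:
        dst.write(add_bom(data))
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

To be callable from the action as written, the script would have to be reachable on PATH under the name 'add-bom' (on Windows, probably through a small wrapper). It is untested against the Jamfile example above.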