Re: [boost] [Boost-bugs] [ boost-Bugs-1461533 ] Non-basic-source-character-set characters conflict with MSVC

I have attached a list to this email (There are 354 in the list as of this writing). I couldn't figure out how to attach the file to the bug report after-the-fact. Note that these are the most likely candidate source files to cause an error when building the Boost sources -- some test cases and example sources were omitted. Blessings, Foster On 3/30/06, Marshall Clow <marshall@idio.com> wrote:
-- Foster T. Brereton - Computer Scientist Software Technology Lab, Adobe Systems Incorporated fbrereto@adobe.com -- http://opensource.adobe.com

"Foster Brereton" <fosterb.boost@gmail.com> wrote:
Thanks! I have attached the file to the bug report. [ You may need to be logged into SF to attach a file ] -- -- Marshall Marshall Clow Idio Software <mailto:marshall@idio.com> It is by caffeine alone I set my mind in motion. It is by the beans of Java that thoughts acquire speed, the hands acquire shaking, the shaking becomes a warning. It is by caffeine alone I set my mind in motion.

Foster Brereton wrote:
Many of these are caused by a name in the copyright clause containing non-ASCII characters. Replacing ö and ø with o (even oe) in people's names doesn't seem very polite to me, and such "rechristening" may have legal implications.

I agree that mangling the proper spelling of an individual's name is inappropriate, and that a proper solution to this issue would circumvent that option. Perhaps each Boost library could have a copyright file associated with it (<library>.copyright.utf8, or some other naming convention), and the boilerplate within the sources of that Boost library could reference that copyright file (which in turn would reference the Boost License file at the root of the source tree). The high-ASCII text could then be in that external file, avoiding the compiler, and nobody's name gets inappropriately altered. Thoughts? Alternatives? Blessings, Foster On 3/30/06, Peter Dimov <pdimov@mmltd.net> wrote:

On 3/31/06, Foster Brereton <fosterb.boost@gmail.com> wrote:
It gets mangled anyway (what you see is code-page dependent; in my case a Greek sum sign instead of the a: in ja:rvi's name, which is much worse than seeing "jarvi").
> Thoughts? Alternatives?
Not sure about the correct term; it's called something like "flying accents", so that e.g. an A with an accent is written as A'. It's pretty straightforward, supported by many tools (like TeX or vim), and resembles Unicode's way (U+00C1 written as U+0041 + U+0301). Unfortunately I don't know about any standards (or even official documents). br, andras
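To make the "flying accents" idea concrete, here is a minimal C++ sketch. It assumes the input is Latin-1 encoded, and both the function name to_flying_accents and the small mapping table are made up for illustration; they do not come from any standard or existing tool. Bytes that are not in the table pass through unchanged.

```cpp
#include <string>

// Hypothetical helper: rewrites a few Latin-1 accented letters as
// "base letter + ASCII mark". The table is illustrative, not complete.
std::string to_flying_accents(const std::string& latin1) {
    std::string out;
    for (unsigned char byte : latin1) {
        switch (byte) {
            case 0xC1: out += "A'"; break;  // A with acute
            case 0xE1: out += "a'"; break;  // a with acute
            case 0xE4: out += "a:"; break;  // a with diaeresis
            case 0xF6: out += "o:"; break;  // o with diaeresis
            case 0xF8: out += "o/"; break;  // o with stroke
            default:   out += static_cast<char>(byte); break;  // left as-is
        }
    }
    return out;
}
```

Called on the Latin-1 bytes of "Järvi", this yields the pure-ASCII spelling "Ja:rvi" used earlier in this message.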

Andras Erdei wrote:
What if I ever become a Boost member (seems unlikely right now :-) ), would it be acceptable for me to write my name in Hebrew there? I think not... Should I feel offended because I have to sign my name down this message as "Yuval" rather than the original "יובל" (which most of you can't even read because your OS/mail-reader doesn't support Hebrew)? Again I think not... Either all text, including names, is written in English only, or it can be written in any language, but not in source files, or all source files are converted to Unicode. Going the middle way of allowing English, plus some selected European languages (because they happen to be somewhat close to English) sounds wrong to me (and can be considered unfair by some, but that's not my point). Thanks, Yuval

"Yuval Ronen" wrote
The characters allowed in source files are actually laid down in the C++ standard, AFAIK. That is limited to the characters allowed in the grammar, and I'm fairly sure that doesn't include e.g. the copyright symbol etc. regards Andy Little

Andy Little wrote:
The standard is actually not very helpful on this score: For the first phase of translation: "Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set"
-- AlisdairM

"AlisdairM" wrote
OK, that's very helpful... For Boost source files that makes the answer quite simple: because Boost works with multiple vendors, the set of characters allowable in Boost source files should consist of (set of characters allowed by vendor A) & (set of characters allowed by vendor B) & ... (set of characters allowed by vendor X) | (set of characters which after mapping become characters allowed outside comments). Which, assuming vendor X is unknown, resolves to: (set of characters which after mapping become characters allowed outside comments). regards Andy Little
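The intersection part of that expression can be sketched in C++ as follows. This is only an illustration: CharSet, byte_range and common_characters are hypothetical names, and the per-vendor sets in the usage note are invented, not taken from any compiler's documentation.

```cpp
#include <bitset>
#include <vector>

// One bit per possible byte value; a set bit means "this vendor accepts it".
using CharSet = std::bitset<256>;

// Hypothetical helper: marks every byte value in [first, last] as allowed.
CharSet byte_range(unsigned first, unsigned last) {
    CharSet set;
    for (unsigned value = first; value <= last && value < 256; ++value) {
        set.set(value);
    }
    return set;
}

// Intersects the per-vendor sets, as in "(vendor A) & (vendor B) & ...".
// With no vendors at all, every byte value is reported as common.
CharSet common_characters(const std::vector<CharSet>& vendors) {
    CharSet common;
    common.set();  // start from "everything allowed"
    for (const CharSet& allowed : vendors) {
        common &= allowed;
    }
    return common;
}
```

For example, if one imagined vendor accepts byte_range(0, 255) and another only byte_range(0, 127), the common set holds just the 128 ASCII values, so a byte such as 0xA9 falls outside it.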

Andy Little wrote:
Why "outside comments"? AFAIU, the problem is high-ASCII characters found in comments, which emit the warning the OP complained about. He actually didn't say that explicitly, but I don't think there are any such characters in Boost code itself, only in comments. IOW, this discussion is about "inside comments", not outside...

On 4/2/06, Andy Little <andy@servocomm.freeserve.co.uk> wrote:
I am getting the reports third-hand, but it is my understanding that the problems we are having are with certain high-ASCII characters in comments only. I am not sure if the same characters in a non-comment context would generate the same warning. Here's what I have from one of the developers that originally posted the issue:
<quote>
the warning C4819 (an unsuitable character against current code page) caused the error C2220 when header files (.h and .hpp) which are referred from the compiling source file contain upper ASCII characters. In case of source files which contain upper ASCII characters, C4819 warning occurred, but C2220 error did not occur. I tried to correct the upper ASCII characters in only header files on my local client and the build successfully passed. So this problem seems to be improved if we correct upper ASCII characters in only the header files although it's not the perfect way. Thanks.
</quote>
Blessings, Foster
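For anyone who wants to check their own headers, a rough C++ sketch of such a scan follows. It simply flags lines containing any byte above 0x7F, on the assumption that those are the "upper ASCII" characters the report refers to; the function name lines_with_high_bytes is made up, and this is not how the attached list was produced.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the 1-based numbers of all lines that
// contain at least one byte outside the 7-bit ASCII range.
std::vector<std::size_t> lines_with_high_bytes(std::istream& in) {
    std::vector<std::size_t> hits;
    std::string line;
    std::size_t number = 0;
    while (std::getline(in, line)) {
        ++number;
        for (unsigned char byte : line) {
            if (byte > 0x7F) {
                hits.push_back(number);
                break;  // one hit per line is enough
            }
        }
    }
    return hits;
}
```

The function takes any std::istream, so it can be fed a std::ifstream opened on a header or, for a quick experiment, a std::istringstream holding a few lines of text.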

"Peter Dimov" wrote
I only looked at the first offender, <boost/archive/detail/auto_link_warchive.hpp>. In the copyright is a high-ASCII © symbol. (I don't know if it'll survive the trip, but it's a little C in a circle.) The same directory contains other headers, so I figured: why aren't these showing in the rogues' gallery? Lo and behold, these files contain (C), which is not high ASCII of course. IOW, in the first case in the list at least, it's a trivial fix without (presumably) great legal implications. P.S. I followed this by plugging the offending copyright character into the Windows Explorer search facility in the boost directory. Funnily enough, I came up with a very similar-looking rogues' gallery to the list there .... ;-) regards Andy Little
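The "trivial fix" for the copyright symbol could look something like the C++ sketch below. It assumes the symbol appears either as the single Latin-1 byte 0xA9 or as the UTF-8 pair 0xC2 0xA9; the function name ascii_copyright is hypothetical, and whether such a substitution is acceptable for a given file is a separate question.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces the copyright sign with the ASCII "(C)".
// Handles the UTF-8 encoding (0xC2 0xA9) and the Latin-1 byte (0xA9).
std::string ascii_copyright(const std::string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(text[i]);
        const bool utf8_pair =
            byte == 0xC2 && i + 1 < text.size() &&
            static_cast<unsigned char>(text[i + 1]) == 0xA9;
        if (utf8_pair) {
            out += "(C)";
            ++i;  // skip the second byte of the UTF-8 sequence
        } else if (byte == 0xA9) {
            out += "(C)";
        } else {
            out += text[i];
        }
    }
    return out;
}
```

All other bytes, including any other high-ASCII characters such as accented letters in names, are copied through untouched.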
participants (7)
- AlisdairM
- Andras Erdei
- Andy Little
- Foster Brereton
- Marshall Clow
- Peter Dimov
- Yuval Ronen