MSVC doesn't like some high-ASCII characters on non-US systems

All,
We are getting warnings when trying to compile Boost source files with MSVC 2003 and 2005 when there are high-ASCII characters in the sources. The reported problem only shows up when the computer's region is set to Japanese, but there could be other affected regions as well. A viable workaround is changing the setting for non-Unicode programs in the Control Panel/Regional and Language Options from Japanese to English (United States), but this is very inconvenient for members of our staff whose native region is Japanese.
One (abridged) example of the warning we get from MSVC is:
While compiling boost\type_traits\is_base_and_derived.hpp: warning C4819: The file contains a character that cannot be represented in the current code page (932). Save the file in Unicode format to prevent data loss.
A caveat to this problem is that the warning shows up when there are high-ASCII characters in comments, so all copyright symbols and high-ASCII characters used in names have the potential to generate these compilation warnings for us.
Another caveat to this problem is not all high-ASCII characters produce this problem. My guess is that some of the high-ASCII characters have a valid code page translation (e.g., the copyright symbol), and so the warning is not emitted. An implication of this would be that some characters will work for some OS regions and not for others, resulting in a varied set of errors based on the region one has chosen.
The reason this is a problem is because our projects have warnings-as-errors turned on. We are not interested in disabling warnings-as-errors to mitigate the issue because of the far-reaching side effects of such a change.
Has anyone else seen this problem? Are there other potential fixes other than removing all high-ASCII characters from the source files?
Blessings,
Foster
--
Foster T. Brereton - Computer Scientist
Software Technology Lab, Adobe Systems Incorporated
fbrereto@adobe.com -- http://opensource.adobe.com
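Foster's guess about valid code page translations can be checked outside the compiler. The sketch below is only an illustration: it uses Python's `cp932` codec as a stand-in for whatever decoding MSVC performs (an assumption; the two need not agree on every byte), and the helper name `decodes_as_cp932` is made up for this example. In code page 932 the bytes 0xA1-0xDF are single-byte half-width katakana, so a Latin-1 copyright sign (0xA9) decodes cleanly, whereas 0xF8 (Latin-1 o-slash) is a lead byte that needs a valid trail byte after it.

```python
def decodes_as_cp932(data: bytes) -> bool:
    """Return True if the bytes form valid text in code page 932 (Shift-JIS)."""
    try:
        data.decode("cp932")
        return True
    except UnicodeDecodeError:
        return False


# Header comments as they would be stored on disk in ISO-8859-1 (Latin-1).
copyright_line = "// © Copyright John Maddock 2003".encode("latin-1")
name_line = "Explanation by Terje Slettebø and Rani Sharoni.".encode("latin-1")

# 0xA9 is a complete single-byte character in cp932, so nothing is flagged.
print(decodes_as_cp932(copyright_line))
# 0xF8 followed by a space is not a valid double-byte sequence.
print(decodes_as_cp932(name_line))
```

The second line is the one from is_base_and_derived.hpp, the file named in the warning above, which is at least consistent with the guess.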

We are getting warnings when trying to compile Boost source files with MSVC 2003 and 2005 when there are high-ASCII characters in the sources. The reported problem only shows up when the computer's region is set to Japanese, but there could be other affected regions as well.
Boost/1.33.0/boost $ file *.hpp | grep -v ASCII
lexical_cast.hpp: ISO-8859 C++ program text
multi_index_container_fwd.hpp: ISO-8859 C program text
multi_index_container.hpp: ISO-8859 C program text
progress.hpp: ISO-8859 C++ program text
property_map_iterator.hpp: ISO-8859 C++ program text
ref.hpp: ISO-8859 C++ program text
token_functions.hpp: ISO-8859 C++ program text
tokenizer.hpp: ISO-8859 C++ program text
Has anyone else seen this problem? Are there other potential fixes other than removing all high-ASCII characters from the source files?
Perhaps converting the Boost headers into UTF-8, rather than a mixture of ASCII and ISO-8859, would be more region-neutral. I don't know if Boost has standardised on an encoding, but being Latin-1 centric is probably not necessary or intentional.
Nigel
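Nigel's suggestion amounts to transcoding each affected header. A minimal sketch, assuming the files really are ISO-8859-1 as `file` reports; the function name `latin1_to_utf8` is illustrative, and whether a given MSVC version then reads the result as UTF-8 (for example, whether it needs a byte-order mark) is not established anywhere in this thread.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8; pure-ASCII input comes back equal."""
    return data.decode("latin-1").encode("utf-8")


# One of the offending comment lines, as raw Latin-1 bytes (0xE4 is a-umlaut).
original = b"// Copyright (C) 2002 Jaakko J\xe4rvi"
converted = latin1_to_utf8(original)
print(converted)
```

Because every byte value is defined in Latin-1, the decode step cannot fail, so a file that is actually in some other encoding would be converted silently and wrongly.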

On 3/29/06, Foster Brereton <fosterb.boost@gmail.com> wrote:
The reason this is a problem is because our projects have warnings-as-errors turned on. We are not interested in disabling warnings-as-errors to mitigate the issue because of the far-reaching side effects of such a change.
Has anyone else seen this problem? Are there other potential fixes other than removing all high-ASCII characters from the source files?
Disabling that particular warning?

Olaf van der Spek wrote:
On 3/29/06, Foster Brereton <fosterb.boost@gmail.com> wrote:
The reason this is a problem is because our projects have warnings-as-errors turned on. We are not interested in disabling warnings-as-errors to mitigate the issue because of the far-reaching side effects of such a change.
Has anyone else seen this problem? Are there other potential fixes other than removing all high-ASCII characters from the source files?
Disabling that particular warning?
AFAIK, there is no way in MSVC to disable a specific warning other than using a #pragma, but this is an option only for one's own code, not for third-party libraries over which one has no control (putting this pragma in every file before #including a Boost header is cumbersome, IMO).
I also think that since Boost supply generic libraries and not specific GUI, there is no reason for Boost code to go beyond plain low ASCII, *including comments*. You know what? Especially for comments. A world-wide project such as Boost must have at least the spoken language (in addition to the programming language) as a common denominator. Isn't this why this mailing list is in English only? Why should comments be allowed in other languages? This low ASCII rule can also be easily checked by an automatic tool.
Yuval

Yuval Ronen wrote:
Olaf van der Spek wrote:
On 3/29/06, Foster Brereton <fosterb.boost@gmail.com> wrote:
The reason this is a problem is because our projects have warnings-as-errors turned on. We are not interested in disabling warnings-as-errors to mitigate the issue because of the far-reaching side effects of such a change.
Has anyone else seen this problem? Are there other potential fixes other than removing all high-ASCII characters from the source files?
Disabling that particular warning?
AFAIK, there is no way in MSVC to disable a specific warning other than using a #pragma, but this is an option only for one's own code, not for third-party libraries over which one has no control (putting this pragma in every file before #including a Boost header is cumbersome, IMO).
I also think that since Boost supply generic libraries and not specific GUI, there is no reason for Boost code to go beyond plain low ASCII, *including comments*.
In VC7.1 at least, see the /wd compile option, which in the IDE is accessible from Project Properties | C/C++ | Advanced | Disable Specific Warnings.
Jeff

Jeff Flinn wrote:
In VC7.1 at least, see the /wd compile option, which in the IDE is accessible from Project Properties | C/C++ | Advanced | Disable Specific Warnings.
Cool, thanks for the pointer. Funny I never noticed that... Nevertheless, I think that my other argument (about the common denominator) still holds.

I also think that since Boost supply generic libraries and not specific GUI, there is no reason for Boost code to go beyond plain low ASCII
Here is the result of using a Python script to grep for non-ASCII lines in the Boost headers (Boost 1.33.0). The issues mostly revolve around copyright symbols and European names. (How inconvenient for developers to have non-ASCII names! :-)
Nigel
-------------
#!/usr/bin/env python3
# Print every line that contains a byte outside the 7-bit ASCII range.
import sys

for name in sys.argv[1:]:  # skip argv[0], which is this script itself
    with open(name, "rb") as f:  # read raw bytes so no encoding is assumed
        for line in f:
            try:
                line.decode("ascii")
            except UnicodeDecodeError:
                # Show the offending line, interpreting its bytes as Latin-1.
                print("%s: %s" % (name, line.decode("latin-1").rstrip("\r\n")))
-------------
./archive/detail/auto_link_archive.hpp: // © Copyright Robert Ramey 2004
./archive/detail/auto_link_warchive.hpp: // © Copyright Robert Ramey 2004
./archive/detail/basic_config.hpp: // © Copyright Robert Ramey 2004
./archive/detail/decl.hpp: // © Copyright Robert Ramey 2004
./archive/detail/utf8_codecvt_facet.hpp: // Copyright © 2001 Ronald Garcia, Indiana University (garcia@osl.iu.edu)
./config/abi_prefix.hpp: // © Copyright John Maddock 2003
./config/abi_suffix.hpp: // © Copyright John Maddock 2003
./config/compiler/sunpro_cc.hpp: // (Jens Maurer according to Gottfried Ganßauge 04 Mar 2002)
./config/compiler/vacpp.hpp: // (C) Copyright Markus Schöpflin 2002 - 2003.
./detail/allocator_utilities.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./detail/atomic_count_gcc.hpp: // Copyright (c) 2002 Lars Gullik Bjønnes <larsbj@lyx.org>
./detail/utf8_codecvt_facet.hpp: // Copyright © 2001 Ronald Garcia, Indiana University (garcia@osl.iu.edu)
./filesystem/config.hpp: // © Copyright Beman Dawes 2003
./filesystem/convenience.hpp: // © Copyright Beman Dawes, 2002
./filesystem/convenience.hpp: // © Copyright Vladimir Prus, 2002
./filesystem/exception.hpp: // Copyright © 2002 Beman Dawes
./filesystem/exception.hpp: // Copyright © 2001 Dietmar Kühl
./filesystem/operations.hpp: // Copyright © 2002, 2003 Beman Dawes
./filesystem/operations.hpp: // Copyright © 2002 Jan Langer
./filesystem/operations.hpp: // Copyright © 2001 Dietmar Kühl
./filesystem/path.hpp: // © Copyright Beman Dawes 2002-2003
./format/alt_sstream_impl.hpp: off_type off = off_type(pos); // operation guaranteed by §27.4.3.2 table 88
./format/alt_sstream_impl.hpp: BOOST_ASSERT(0); // §27.4.3.2 allows undefined-behaviour here
./format/internals.hpp: // set our params to standard's default state. cf § 27.4.4.1 of the C++ norm
./graph/adjacency_list_io.hpp: // Author: François Faure
./graph/property_iter_range.hpp: // (C) Copyright François Faure, iMAGIS-GRAVIR / UJF, 2001. Permission
./graph/property_iter_range.hpp: // 02 May 2001 François Faure
./lambda/algorithm.hpp: // Copyright (C) 2002 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/bind.hpp: // Copyright (C) 1999-2001 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/casts.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/construct.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/control_structures.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/core.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/actions.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/arity_code.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/bind_functions.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/control_constructs_common.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/control_structures_impl.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/function_adaptors.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/is_instance_of.hpp: // Copyright (C) 2001 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/lambda_config.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/lambda_functors.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/lambda_functor_base.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/lambda_fwd.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/lambda_traits.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/member_ptr.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/operators.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/operator_actions.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/operator_lambda_func_base.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/operator_return_type_traits.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/ret.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/return_type_traits.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/detail/select_functions.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/exceptions.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/if.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/lambda.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/loops.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/numeric.hpp: // Copyright (C) 2002 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lambda/switch.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./lexical_cast.hpp: // enhanced with contributions from Terje Slettebø,
./multi_index/composite_key.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/access_specifier.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/archive_constructed.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/auto_space.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/base_type.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/bucket_array.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/converter.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/copy_map.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/def_ctor_tuple_cons.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/duplicates_iterator.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/hash_index_args.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/hash_index_iterator.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/hash_index_iterator_fwd.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/hash_index_node.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/hash_index_proxy.hpp: /* Copyright 2003-2004 Joaquín M López Muñoz.
./multi_index/detail/has_tag.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/header_holder.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/index_base.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/index_iterator.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/index_iterator_fwd.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/index_loader.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/index_matcher.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/index_node_base.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/index_proxy.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/index_saver.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/invariant_assert.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/is_index_list.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/modify_key_adaptor.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/msvc_index_specifier.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/node_type.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/no_duplicate_tags.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/ord_index_args.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/ord_index_node.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/ord_index_ops.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/prevent_eti.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/safe_mode.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/scope_guard.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/seq_index_node.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/seq_index_ops.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/unbounded.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/detail/value_compare.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/hashed_index.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/hashed_index_fwd.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/identity.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/identity_fwd.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/indexed_by.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/key_extractors.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/member.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/mem_fun.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/ordered_index.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/ordered_index_fwd.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/safe_mode_errors.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/sequenced_index.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/sequenced_index_fwd.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index/tag.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index_container.hpp: * Copyright 2003-2005 Joaquín M López Muñoz.
./multi_index_container_fwd.hpp: /* Copyright 2003-2005 Joaquín M López Muñoz.
./numeric/conversion/bounds.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/cast.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/conversion_traits.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/converter.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/converter_policies.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/detail/bounds.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/detail/conversion_traits.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/detail/converter.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/detail/int_float_mixture.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/detail/is_subranged.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/detail/meta.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/detail/sign_mixture.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/detail/udt_builtin_mixture.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/int_float_mixture.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/int_float_mixture_enum.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/is_subranged.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/sign_mixture.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/sign_mixture_enum.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/udt_builtin_mixture.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/conversion/udt_builtin_mixture_enum.hpp: // © Copyright Fernando Luis Cacciola Carballal 2000-2004
./numeric/interval/arith.hpp: * Copyright 2002-2003 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/arith2.hpp: * Copyright 2002-2003 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/checking.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/compare/explicit.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/compare.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/constants.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/bcc_rounding_control.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/bugs.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/c99sub_rounding_control.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/c99_rounding_control.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/interval_prototype.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/msvc_rounding_control.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/ppc_rounding_control.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/sparc_rounding_control.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/test_input.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/x86gcc_rounding_control.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/detail/x86_rounding_control.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/ext/x86_fast_rounding_control.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/hw_rounding.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/interval.hpp: * Copyright 2002-2003 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/limits.hpp: * Copyright 2002-2003 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/rounded_arith.hpp: * Copyright 2002-2003 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/rounded_transc.hpp: * Copyright 2002-2003 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/rounding.hpp: * Copyright 2002-2003 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/transc.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval/utility.hpp: * Copyright 2002-2003 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./numeric/interval.hpp: * Copyright 2002 Hervé Brönnimann, Guillaume Melquiond, Sylvain Pion
./program_options/detail/utf8_codecvt_facet.hpp: // Copyright © 2001 Ronald Garcia, Indiana University (garcia@osl.iu.edu)
./progress.hpp: m_os << elapsed() << " s\n" // "s" is System International d'Unités std
./property_map_iterator.hpp: // property iterator, generalized from ideas by François Faure
./python/detail/dealloc.hpp: // Copyright Gottfried Ganßauge 2003.
./python/opaque_pointer_converter.hpp: // Copyright Gottfried Ganßauge 2003.
./python/ptr.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./python/return_opaque_pointer.hpp: // Copyright Gottfried Ganßauge 2003.
./ref.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./serialization/config.hpp: // © Copyright Robert Ramey 2004
./spirit/core/primitives/primitives.hpp: // end_parser class (suggested by Markus Schöpflin)
./spirit/fusion/sequence/at.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/detail/as_tuple_element.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/detail/io.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/detail/manip.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/detail/sequence_equal_to.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/detail/sequence_greater.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/detail/sequence_greater_equal.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/detail/sequence_less.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/detail/sequence_less_equal.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/detail/sequence_not_equal_to.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/equal_to.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/get.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/greater.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/greater_equal.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/io.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/less.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/less_equal.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/make_tuple.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/not_equal_to.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/tie.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/tuple.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./spirit/fusion/sequence/tuple_forward.hpp: Copyright (c) 1999-2003 Jaakko Järvi
./tokenizer.hpp: // © Copyright Jeremy Siek and John R. Bandela 2001.
./token_functions.hpp: // 01 Oct 2004 Joaquín M López Muñoz
./tuple/detail/tuple_basic.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./tuple/detail/tuple_basic_no_partial_spec.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./tuple/tuple.hpp: // Copyright (C) 1999, 2000 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./tuple/tuple_comparison.hpp: // Copyright (C) 2001 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./tuple/tuple_io.hpp: // Copyright (C) 2001 Jaakko Järvi (jaakko.jarvi@cs.utu.fi)
./type_traits/conversion_traits.hpp: // Copyright 1999, 2000 Jaakko Jrvi (jaakko.jarvi@cs.utu.fi)
./type_traits/is_base_and_derived.hpp: Explanation by Terje Slettebø and Rani Sharoni.
./type_traits/is_convertible.hpp: // Copyright 1999, 2000 Jaakko Jrvi (jaakko.jarvi@cs.utu.fi)
./utility/enable_if.hpp: // Copyright 2003 © The Trustees of Indiana University.
./utility/enable_if.hpp: // Authors: Jaakko Järvi (jajarvi at osl.iu.edu)
./wave/util/cpp_include_paths.hpp: // for the case of '#include "file"' directives, they are not searched for
./wave/util/cpp_include_paths.hpp: // '#include <file>' directives. If additional directories are specified
./wave/util/cpp_include_paths.hpp: // for '#include "file"' directives. Therefore, the current directory is
./wave/util/cpp_macromap.hpp: argument->push_back(token_type(T_PLACEMARKER, "§",
./wave/util/cpp_macromap.hpp: argument->push_back(token_type(T_PLACEMARKER, "§",
./wave/util/macro_helpers.hpp: // Each occurrence of white space between the arguments
./wave/util/macro_helpers.hpp: // Each occurrence of white space between the arguments

Nigel Stewart wrote:
I also think that since Boost supply generic libraries and not specific GUI, there is no reason for Boost code to go beyond plain low ASCII
Here is the result of using a Python script to grep for non-ascii lines in the boost headers. (Boost 1.33.0)
The issues mostly revolve around copyright symbols and European names. (How inconvenient for developers to have non-ASCII names! :-)
Here are a few thoughts:
* German ä, ö and ü can be replaced by ae, oe and ue respectively - this is standard practice in German.
* é could be replaced by e' (e apostrophe) but this is not very satisfactory, and only really works at the end of a word. Such a substitution is permissible for replacing accents at the end of Italian words.
* Replace each accented letter with its nearest unaccented equivalent: é -> e, ä -> a, Å -> A, ç -> c, æ -> ae, ñ -> n, ø -> o, ð -> dh, etc.
This might not be acceptable to the authors, however, because it could end up changing the meaning (and, very likely, the pronunciation) of their names. However I'm sure each language must have equivalents (such as oe for German ö, etc) when accented characters are not available. The Wikipedia page http://en.wikipedia.org/wiki/Diacritic might be helpful here.
Paul

Paul Giaccone wrote:
Nigel Stewart wrote:
I also think that since Boost supply generic libraries and not specific GUI, there is no reason for Boost code to go beyond plain low ASCII
Here is the result of using a Python script to grep for non-ascii lines in the boost headers. (Boost 1.33.0)
The issues mostly revolve around copyright symbols and European names. (How inconvenient for developers to have non-ASCII names! :-)
Here are a few thoughts:
* German ä, ö and ü can be replaced by ae, oe and ue respectively - this is standard practice in German.
* é could be replaced by e' (e apostrophe) but this is not very satisfactory, and only really works at the end of a word. Such a substitution is permissible for replacing accents at the end of Italian words.
* Replace each accented letter with its nearest unaccented equivalent: é -> e, ä -> a, Å -> A, ç -> c, æ -> ae, ñ -> n, ø -> o, ð -> dh, etc.
This might not be acceptable to the authors, however, because it could end up changing the meaning (and, very likely, the pronunciation) of their names. However I'm sure each language must have equivalents (such as oe for German ö, etc) when accented characters are not available. The Wikipedia page http://en.wikipedia.org/wiki/Diacritic might be helpful here.
A few remarks about Spanish: the most common offending characters in this language are:
á é í ó ú ñ
and the corresponding uppercase versions, plus
¿ ¡
(there are other, much less frequent non-ASCII characters, like ü). When these are unavailable for whatever reason (one's using a foreign keyboard or typewriter, for instance), the usual substitution rules are:
á --> a
é --> e
í --> i
ó --> o
ú --> u
¿ ¡ are just omitted
Removal of the vowel diacritics results in a change of the stressed syllable, but we Spaniards don't usually frown at that (or even notice): after all, very rarely will this change collide with a different existing word. I've never seen in Spanish the Italian custom of adjoining a ' character as a substitute for the accent. Removal of inverted exclamation and question marks is also widely accepted, and even regularly practised in informal writing.
This leaves us with ñ. I've seen the following substitutions:
ñ --> n
ñ --> nn
ñ --> gn
ñ --> ny
the most usual ones being the first and second: I couldn't say which is the winner between these two.
Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo
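The Spanish substitution rules above map directly onto a translation table. This is a rough sketch, not a tool anyone in the thread proposed: the names `SPANISH_TO_ASCII` and `spanish_to_ascii` are invented here, and for n-tilde it arbitrarily picks the plain "n" form out of the four candidates listed.

```python
# Substitution table following the rules listed above. For n-tilde the
# single-letter form "n" is chosen; "nn" was named as about equally common.
SPANISH_TO_ASCII = str.maketrans({
    "á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u",
    "Á": "A", "É": "E", "Í": "I", "Ó": "O", "Ú": "U",
    "ñ": "n", "Ñ": "N",
    "¿": None, "¡": None,  # inverted marks are simply omitted
})


def spanish_to_ascii(text: str) -> str:
    """Apply the usual Spanish fallbacks for unavailable accented characters."""
    return text.translate(SPANISH_TO_ASCII)


print(spanish_to_ascii("Joaquín M López Muñoz"))
print(spanish_to_ascii("¿Telefónica, Investigación y Desarrollo?"))
```

Characters outside the table, such as the less frequent ü, pass through untouched, so the result is not guaranteed to be pure ASCII.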

"Paul Giaccone" <paulg@cinesite.co.uk> wrote in message news:44323C3B.7040503@cinesite.co.uk...
FWIW These characters are rendered quite strangely in my (Western European?) version of this message. They may be rendered differently for others so I provided a description
ë, A followed by two tiny left braces
ö A followed by a reversed capital P with the round bit filled in
ü A followed by a 1/4
é A followed by a (C)
ä A followed by a round sunny symbol
Å A followed by three dots
ç A followed by two S's or squiggly snakes
æ A followed by a split vertical bar
ñ A followed by a +- sign
ø A followed by a squiggle at the foot
ð A followed by the symbol for degrees (small circle near the top)
I'm guessing they are rendered differently for the authors ?
regards
Andy Little

Andy Little wrote:
"Paul Giaccone" <paulg@cinesite.co.uk> wrote in message news:44323C3B.7040503@cinesite.co.uk...
FWIW These characters are rendered quite strangely in my (Western European?) version of this message. They may be rendered differently for others so I provided a description
ë, A followed by two tiny left braces
The original message uses UTF-8. Your reply is windows-1252.
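The mismatch described here is easy to reproduce. A small sketch (the function name `utf8_read_as_cp1252` is made up for illustration) that encodes each character as UTF-8 and then decodes the bytes as windows-1252, which is what a reader assuming the wrong charset effectively does:

```python
def utf8_read_as_cp1252(text: str) -> str:
    """Show how UTF-8 encoded text looks when a reader assumes windows-1252."""
    return text.encode("utf-8").decode("cp1252")


# The characters from the message being discussed.
for ch in "ëöüéäÅçæñøð":
    print(ch, "->", utf8_read_as_cp1252(ch))
```

Each of these characters is two bytes in UTF-8, and the first byte is 0xC3 in every case, which windows-1252 displays as a capital A with a tilde. That accounts for every description starting with "A followed by".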

"Peter Dimov" wrote
The original message uses UTF-8. Your reply is windows-1252.
Thanks for the technical explanation.
My main point, however, is that it is best to avoid any high-ASCII characters where possible because the way they are rendered can be unpredictable and not what the author intended!
(I'm still pushing the point: don't use high-ASCII characters in source code, FWIW.)
regards
Andy Little

If only paper tape had been 9 holes wide :-((
Paul
--
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
Phone and SMS text +44 1539 561830, Mobile and SMS text +44 7714 330204
mailto: pbristow@hetp.u-net.com
http://www.hetp.u-net.com/index.html
http://www.hetp.u-net.com/Paul%20A%20Bristow%20info.html

Andy Little wrote:
My main point, however, is that it is best to avoid any high-ASCII characters where possible, because the way they are rendered can be unpredictable and not what the author intended!
(I'm still pushing the point: don't use high-ASCII characters in source code, FWIW.)
Couldn't agree with you more (at least until we can be sure that both the standard and available compilers support Unicode, and even then I'd restrict non-English comment text to names only).

Andy Little wrote:
"Paul Giaccone" <paulg@cinesite.co.uk> wrote in message news:44323C3B.7040503@cinesite.co.uk...
FWIW These characters are rendered quite strangely in my (Western European?) version of this message. They may be rendered differently for others so I provided a description
ë, A followed by two tiny left braces
ö A followed by a reversed capital P with the round bit filled in
ü A followed by a 1/4
é A followed by a (C)
ä A followed by a round sunny symbol
Å A followed by three dots
ç A followed by two S's or squiggly snakes
æ A followed by an split vertical bar
ñ A followed by a +- sign
ø , A followed by a squiggle at the foot
ð A followed by the symbol for degrees (small circle near the top)
I'm guessing they are rendered differently for the authors ?
Thanks for that, Andy - I didn't realise that they would get mashed up by the software. They came out fine in the message stored in the "Sent" section of my mail software. For the record, here's the garbled section of my message again, with everything spelled out in full:
* German a-umlaut, o-umlaut and u-umlaut can be replaced by ae, oe and ue respectively - this is standard practice in German.
* e-acute could be replaced by e' (e apostrophe) but this is not very satisfactory, and only really works at the end of a word. Such a substitution is permissible for replacing accents at the end of Italian words.
* Replace each accented letter with its nearest unaccented equivalent: e-acute -> e, a-umlaut -> a, A-ring -> A, c-cedilla -> c, ae-ligature -> ae, n-tilde -> n, o-slash -> o, edh (an Icelandic letter) -> dh, etc.
Paul
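The third option in the list above amounts to a lookup table. A minimal sketch follows, assuming the input is UTF-8; the function name `to_ascii_fallback` is hypothetical, and the table contains only the mappings spelled out in the message. The accented letters are written as byte escapes so that the sketch itself stays pure ASCII.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: replace a few accented letters (given as UTF-8 byte
// sequences) with the nearest unaccented ASCII equivalent from the list above.
// Anything not in the table is copied through unchanged.
std::string to_ascii_fallback(const std::string& utf8) {
    static const std::vector<std::pair<std::string, std::string>> table = {
        {"\xC3\xA9", "e"},   // e-acute
        {"\xC3\xA4", "a"},   // a-umlaut
        {"\xC3\x85", "A"},   // A-ring
        {"\xC3\xA7", "c"},   // c-cedilla
        {"\xC3\xA6", "ae"},  // ae-ligature
        {"\xC3\xB1", "n"},   // n-tilde
        {"\xC3\xB8", "o"},   // o-slash
        {"\xC3\xB0", "dh"},  // edh
    };

    std::string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        bool replaced = false;
        for (const auto& entry : table) {
            // compare() clips at the end of the string, so a partial
            // sequence at the very end simply fails to match.
            if (utf8.compare(i, entry.first.size(), entry.first) == 0) {
                out += entry.second;
                i += entry.first.size();
                replaced = true;
                break;
            }
        }
        if (!replaced) {
            out += utf8[i];
            ++i;
        }
    }
    return out;
}
```

With this table, `to_ascii_fallback("Mu\xC3\xB1oz")` gives `"Munoz"`. Whether such a substitution is acceptable to the people whose names are being changed is a separate question.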

Paul Giaccone wrote:
* German ë, ö and ü can be replaced by ae, oe and ue respectively - this is standard practice in German.
Similarly, the German sharp s (ß) can be replaced by "ss" or the older, disused form "sz".
Sebastian Redl

* Paul Giaccone (paulg@cinesite.co.uk) [20060404 11:34]:
* German ë, ö and ü can be replaced by ae, oe and ue respectively - this is standard practice in German.
No, you can't simply replace it, especially in names, as you lose the ability to discern those names that actually do contain ae, oe or ue. In these cases the old TeX way of writing them would be preferable.
Philipp

Philipp Thomas wrote:
* Paul Giaccone (paulg@cinesite.co.uk) [20060404 11:34]:
* German ë, ö and ü can be replaced by ae, oe and ue respectively - this is standard practice in German.
No, you can't simply replace it, especially in names, as you lose the ability to discern those names that actually do contain ae, oe or ue. In these cases the old TeX way of writing them would be preferable.
Ah, I was under the impression that they were always interchangeable (and that you could always write "Gerhard Schroeder" for "Gerhard Schröder" [the former Chancellor of Germany]). As you are German, I assume, and my knowledge of German is very limited, I'm sure you know better than I do.
However, if this is the only objection to doing this, surely it is a small price to pay. It is very unlikely that Boost will have two developers whose names differ in this way, and even if that should happen, there are other ways that they could be distinguished (for example, "Klaus Schroeder who wrote the X library" and "Klaus Schroeder who wrote the Y library", or "Klaus Schroeder (I)" and "Klaus Schroeder (II)", rather like the system IMDb uses for namesakes).
When you say "the old TeX way", do you mean writing \~n for n-tilde, for example? I'm not sure Joaquin would like to see his last name written as Mu\~noz - to me, it looks ugly or like it has been mistyped or mangled. No, to me, this makes the names difficult to read and can't be the appropriate solution.
Paul

* Paul Giaccone (paulg@cinesite.co.uk) [20060404 16:48]:
(and that you could always write "Gerhard Schroeder" for "Gerhard Schröder" [the former Chancellor of Germany]).
You can do that. But the other way round is not always correct, i.e. you can't blindly change "Gerhard Schroeder" to "Gerhard Schröder" without knowing which one's the correct form.
When you say "the old TeX way", do you mean writing \~n for n-tilde, for
I agree with you that it'd be awkward to read. It was the first thing that came to my mind when searching for a simple way to express the written form of a letter. BTW, need I say that IMNSHO that MSVC warning is very stupid? Philipp

BTW, need I say that IMNSHO that MSVC warning is very stupid?
I think it raises a broader question about compiler support for text encodings. Ideally Boost could standardise on UTF-8 and bring C++ into a globalised, multi-lingual world. But, I'm not sure what the C++ standard says about handling of non-ASCII encodings... MSVC seems to be saying that it interprets source code in a locale-dependent manner.
Nigel

* Nigel Stewart (ns@fluent.com) [20060404 17:19]:
But, I'm not sure what the C++ standard says about handling of non-ASCII encodings...
The C++ standard says *nothing* about the encoding of comments. On a quick glance, it does handle phases of translation (2.1), encoding of identifiers (2.10) and encoding of character literals (2.13.2). Philipp

Philipp Thomas wrote:
BTW, need I say that IMNSHO that MSVC warning is very stupid?
This warning saved hours of debugging for me :)
On Russian keyboards letters 'с' (that's a Russian letter 's' which looks _exactly_ like 'c') and English 'c' are on the same key, so it's quite easy to type the wrong letter. -- With respect, Alex Besogonov (cyberax@elewise.com)
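A stray look-alike letter of this kind can also be found without relying on the compiler warning, by scanning the source for bytes outside the ASCII range. The following is a minimal sketch, not an existing tool: the names `NonAsciiHit` and `find_non_ascii` are hypothetical, and columns are counted in bytes rather than characters.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Position of one non-ASCII byte in the scanned text.
struct NonAsciiHit {
    std::size_t line;    // 1-based line number
    std::size_t column;  // 1-based byte offset within the line
};

// Report every byte >= 0x80 in the given source text.
// A multi-byte UTF-8 character therefore produces one hit per byte.
std::vector<NonAsciiHit> find_non_ascii(const std::string& source) {
    std::vector<NonAsciiHit> hits;
    std::istringstream in(source);
    std::string text;
    std::size_t line_no = 0;
    while (std::getline(in, text)) {
        ++line_no;
        for (std::size_t i = 0; i < text.size(); ++i) {
            if (static_cast<unsigned char>(text[i]) >= 0x80) {
                hits.push_back({line_no, i + 1});
            }
        }
    }
    return hits;
}
```

For instance, if the second line of a file is `int count;` with the first letter of the identifier typed as the Cyrillic letter (UTF-8 bytes D1 81), the function reports two hits on line 2, at byte columns 5 and 6.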

"Paul Giaccone" <paulg@cinesite.co.uk> skrev i meddelandet news:44323C3B.7040503@cinesite.co.uk...
Here are a few thoughts:
* German ë, ö and ü can be replaced by ae, oe and ue respectively - this is standard practice in German.
* é could be replaced by e' (e apostrophe) but this is not very satisfactory, and only really works at the end of a word. Such a substitution is permissible for replacing accents at the end of Italian words.
* Replace each accented letter with its nearest unaccented equivalent: é -> e, ä -> a, Å -> A, ç -> c, æ -> ae, ñ -> n, ø -> o, ð -> dh, etc.
This might not be acceptable to the authors, however, because it could end up changing the meaning (and, very likely, the pronunciation) of their names. However I'm sure each language must have equivalents (such as oe for German ö, etc) when accented characters are not available. The Wikipedia page http://en.wikipedia.org/wiki/Diacritic might be helpful here.
Not only is it inconvenient for the authors, there is the extra dimension of changing names in copyright statements. Does that affect the validity? Bo Persson