
joaquin@tid.es wrote:
Beman Dawes wrote:
There is a fresh run of the trunk inspect report up at http://mysite.verizon.net/beman/inspect.html
A couple of things are different:
* A "Hall of Shame" has been added to highlight which libraries are the worst offenders. I'm open to suggestions on where to cut off reporting. Maybe limit it to the worst 10 libraries?
* A check for non-ASCII characters in source files has been added by Marshall Clow. It is picking up non-ASCII characters in people's names in copyright messages; that's why Boost multi-index looks so bad in the report. We need to come up with a preferred approach for those with non-ASCII characters in their names.
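To make the check concrete, here is a minimal sketch of how such a scan might work. It is not the code Marshall Clow added to the inspect tool; the function name `find_non_ascii` and the sample copyright line are hypothetical, and the sample is assumed to be UTF-8 encoded.

```python
def find_non_ascii(data: bytes):
    """Return (line, column, byte) for every byte above 0x7F.

    Hypothetical helper for illustration; line and column are 1-based
    and the column counts bytes, not characters.
    """
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits


# Sample input (an assumption): a copyright line with three accented letters.
# The name is written with escapes so that this example is itself pure ASCII.
sample = (
    "// Copyright 2003 Joaqu\u00edn M L\u00f3pez Mu\u00f1oz\n"
    "int x = 0;\n"
).encode("utf-8")

for line_no, col, byte in find_non_ascii(sample):
    print(f"line {line_no}, column {col}: byte 0x{byte:02X}")
```

Each accented letter takes two bytes in UTF-8, so a single name in a copyright notice produces several hits, which is consistent with a library looking bad in the report over nothing more than an author's name.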
I think the options are:
1. The inspect tool is modified so as to bypass author names (possibly taken from an author names file). In a sense, this defeats the whole purpose of the non-ASCII check, I guess.
There just isn't a way in standard C++ to deal with non-ASCII characters that preserves their correct display on all systems and avoids errors and/or warnings on some Asian language systems.
2. Suppress all diacritical marks:
Joaquín M López Muñoz --> Joaquin M Lopez Munoz
I think that's really the only viable choice. Authors are free to use any transformation they desire, as long as the result is entirely ASCII.
3. Encode with HTML entities:
Joaquín M López Muñoz --> Joaqu&iacute;n M L&oacute;pez Mu&ntilde;oz
That makes the name much less readable except when viewed with a web browser or other HTML aware renderer.
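For comparison, options 2 and 3 can be sketched as two small transformations. This is only an illustration under stated assumptions: `strip_diacritics` and `to_html_entities` are hypothetical names, not part of the inspect tool or of any Boost library. Only the Python standard library is used.

```python
import unicodedata
from html.entities import codepoint2name


def strip_diacritics(text: str) -> str:
    """Option 2: decompose each character, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_html_entities(text: str) -> str:
    """Option 3: replace each non-ASCII character with an HTML entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # ASCII passes through unchanged
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity
        else:
            out.append(f"&#{cp};")  # numeric fallback
    return "".join(out)


# The name is written with escapes so that this example is itself pure ASCII.
name = "Joaqu\u00edn M L\u00f3pez Mu\u00f1oz"
print(strip_diacritics(name))  # Joaquin M Lopez Munoz
print(to_html_entities(name))  # Joaqu&iacute;n M L&oacute;pez Mu&ntilde;oz
```

Both results are pure ASCII for this name. Note that dropping combining marks does not guarantee ASCII in general: letters such as "ø" or "ß" have no decomposition and would need an explicit replacement chosen by the author.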
Whatever approach is agreed upon I'll happily apply asap to Boost.MultiIndex.
Unless someone else comes up with an unexpected solution, I think you are going to have to become Joaquín M López Muñoz --> Joaquin M Lopez Munoz :-)
--Beman