[locale] Review results for Boost.Locale library

The formal review of Artyom Beilis' Boost.Locale library, originally scheduled to run from April 7th through the 16th and extended through the 22nd, is now finished.

Fifteen people cast votes on it, ten for acceptance and five against. Though not overwhelming, the two-to-one majority clearly indicates the consensus of the list. As such:

The Boost.Locale library IS ACCEPTED into Boost.

The details follow, starting with the voters in favor, in the order their reviews were received:

* John Bytheway
* Steven Watanabe
* Sebastian Redl
* Fabio Fracassi ("possibly conditional")
* Noah Roberts
* Steve Bush
* Volker Lukas
* Paul A. Bristow
* Matus Chochlik
* Gevorg Voskanyan

Those opposed:

* Ryou Ezoe
* Phil Endecott (but "borderline")
* Edward Diener
* Mathias Gaunard
* Vincente BOTET ("count my vote as 1/2 or 1/4 vote")

There was also an early off-list review from Darren Cook which did not include a vote, only listing issues.

There was a great deal of discussion around the library, and a number of issues detailed. The major ones (and some less major ones that were repeated by several reviewers) are summarized below, along with the initials of the reviewer(s) who brought them up and Artyom's responses. Note that although I followed the discussions closely, these are only the issues brought up in the formal reviews themselves.

* Issue: The date_time interface uses enums for periods, which is error-prone and inconsistent with other date/time libraries. (JB)
  Response: Will be addressed.

* Issue: The reference documentation needs more detail, or some things are not clear; some terms used in the documentation are not defined or are defined too briefly; the headers that items are defined in can't be found from the documentation. (JB, SW, FF, PE, ED, MC)
  Response: Will be addressed.

* Issue: There are no prev/next links in the documentation, or the tutorial can't easily be navigated. (DC, JB, SR, FF, MC)
  Response: Looking into it.

* Issue: Few examples include output. (JB)
  Response: Will be addressed.

* Issue: No examples in Asian languages; the library may have design flaws that are not apparent without them. (DC)
  Response: [Artyom has indicated to me (privately) that he's adding such examples. -- CN]

* Issue: The translation system requires narrow-character tags, making it English-centric. (RE, ED)
  Response: The library implements the most popular and most widely used message-catalog format. It is not perfect, but it is the best system currently available. There may be a work-around if you really must use wide-character languages under Windows.

* Issue: Support for wchar_t/UTF-16 is unclear. (RE, ED)
  Response: Wide characters are fully supported.

* Issue: Doesn't support the Win32 API as a backend. (RE)
  Response: A misunderstanding; the Win32 API backend is fully supported.

* Issue: boost::locale::format is not compatible with boost::format. (FF, NR)
  Response: boost::format is too limited for use in localization, and throws on any error, which means that a translator error could crash the program.

* Issue: boost::locale::date_time is not compatible with boost::date_time. (FF)
  Response: An unavoidable consequence of the differences between them, which are necessary due to support for locale independence and non-Gregorian calendars.

* Issue: Boundary analysis is only available when using ICU. (FF)
  Response: At present, only the ICU backend supports proper boundary analysis.

* Issue: Little documentation on the toolchain needed to extract strings and translate them, or the versions required. (FF, NR)
  Response: Will be addressed.

* Issue: Concerns about relying on GPL/LGPL-licensed tools, or their availability on all platforms, or recommendations to write a Boost version of these tools. (NR)
  Response: Reimplementing these is non-trivial and unnecessary; the licensing for these tools does not affect the programs developed with them. All are available for all platforms; explicit instructions for getting the latest versions for Windows will be added.

* Issue: The use of strings instead of symbols for language and encoding makes run-time errors out of what could be compile-time errors. The most common ones should be symbols. (PE)
  Response: There are dozens of character encodings, and even more locales, and no way to determine which ones are the most common. Not all encodings are supported by all backends or OS configurations. Names ignore case and non-alphanumeric characters, which should minimize errors that could be generated from them. utf_to_utf transcoding will be added.

* Issue: Error handling (in conversions) is very basic. (PE)
  Response: An unavoidable limitation of the backends.

* Issue: The code could use more commenting. (PE, VL)
  Response: Noted; will be addressed in the future.

* Issue: Some documentation phrasing is confusing, or could use a native English speaker's input. (SR, VL)
  Response: Will be addressed as discovered.

* Issue: There are no lists of valid language, country, encoding, or variant strings. (ED)
  Response: They are listed in the ISO-639 and ISO-3166 standards, which are referenced in the library's documentation. These standards are updated occasionally, and should be referred to directly for the latest information.

* Issue: Only works on contiguous, entirely-in-memory strings. (MG)
  Response: All current backends require this, and it satisfies the vast majority of use-cases.

* Issue: Boundary analysis goes through the entire string and returns a vector of positions. (MG)
  Response: Not perfect, but given the limitations of the existing backends, it is reasonable.

* Issue: The library's interface is not generic enough, or independent enough of the libraries that it wraps. (VB)
  Response: The interface is similar to that of every other i18n library, and should make as few assumptions as possible. It should not be changed.

* Issue: The date-time code should be merged into Boost.DateTime. (VB)
  Response: Date-time code is locale-dependent by its nature, and is more natural in Boost.Locale. Updating Boost.DateTime to do everything that Boost.Locale's does would require a lot of work, and in all the time it has existed, only the Gregorian calendar has been implemented. There are Boost libraries that overlap others, so this is not a novelty.

--
Chad Nelson
Oak Circle Software, Inc.

* * *

From: Chad Nelson <chad.thecomfychair@gmail.com>
The formal review of Artyom Beilis' Boost.Locale library, originally scheduled to run from April 7th through the 16th and extended through the 22nd, is now finished.
Fifteen people cast votes on it, ten for acceptance and five against. Though not overwhelming, the two-to-one majority clearly indicates the consensus of the list. As such:
The Boost.Locale library IS ACCEPTED into Boost.
I want to thank the Boosters who participated in the review and gave me good insights and recommendations on how to make the library better. I'm glad that the Boost community has accepted this library. I especially want to thank Chad Nelson for serving as review manager, solving many problems and issues on time, and for delivering the review result so quickly.

Thank you all,
Artyom

The most significant complaint seems to be the fact that the translation interface is limited to ASCII (or maybe UTF-8 is also supported; it isn't entirely clear). Even though various arguments have been made for using only ASCII text literals in the program, it seems that it would be relatively easy to support other languages. As someone else has mentioned, even if the text really is in English, ASCII may not be sufficient, as it may be desirable to include some special symbol (the copyright symbol, for instance), and having to deal with this by creating a translation from "ASCII English to appease the translation system" to "real English to display to users" would seem to be an unjustifiable additional burden. However, I don't think anyone is as familiar with the limitations of gettext-related tools as Artyom, so he is the best person to discuss exactly how this might be supported. Previously he briefly described a makeshift approach that required the use of a macro, which didn't seem like a legitimate solution.

It seems that xgettext (at least version 0.18.1, which I tested on my machine) supports non-ASCII program source provided that the --from-code option is given, so it seems that the user could keep the source code in any arbitrary character set/encoding and it would still work (xgettext simply converts the strings to UTF-8). It also appears to successfully extract strings that are specified with an L prefix, so that should not be a problem either. I suppose there is some question as to how well existing tools for translating the messages deal with non-ASCII, but as the tools can be improved fairly easily if necessary, I don't think this is a significant concern.

We can assume that the compiler knows the correct character set of the source code file, as trying to fool it would seem to be inherently error-prone.
This seems to rule out the possibility of char * literals containing UTF-8 encoded text on MSVC, until C++1x Unicode literals are supported. The biggest nuisance is that we need to know the compile-time character set/encoding (so that we know how to interpret "narrow" string literals), and there does not appear to be any standard way in which this is recorded (maybe I'm mistaken, though). However, it is easy enough for the user to simply specify this as a preprocessor define (the build system could add it to the compile flags, and it needs to be known anyway in order to invoke xgettext; presumably it would just be based on the active locale at the time the compiler is invoked). If none is specified, it could default to UTF-8 (this can also be used for greater efficiency in the case that the compile-time encoding is not UTF-8 but the source code happens to contain only ASCII messages). By knowing the compile-time character set, all ambiguity is removed.

The translation database can be assumed to be keyed on UTF-8, so to translate a message, it needs to be converted to UTF-8. There should presumably be versions of the translation functions that take narrow strings, wide strings, and additional versions for the C++1x Unicode literals once they are supported by compilers (I expect that to be very soon, at least for some compilers). If a wide string is specified, it will be assumed to be in UTF-16 or UTF-32 depending on sizeof(wchar_t), and converted to UTF-8. UTF-32 is generally undesirable, I imagine, but in practice should nonetheless work, and using wide strings might be the best approach for code that needs to compile on both Windows and Linux. For the narrow version, if the compile-time narrow encoding is UTF-8, the conversion is a no-op; otherwise, the conversion will have to be done. (The C++1x u8 literal version would naturally require no conversion either.)
Note that in the common case of UTF-8 narrow literals, which is the only case currently supported, there would be no performance penalty. The documentation could explicitly warn that there is a performance penalty for not using UTF-8, but I think this penalty is likely to be acceptable in many cases. If normalization proves to be an issue, then the conversion to UTF-8 could include normalization (perhaps controlled by another preprocessor definition) and the output of xgettext could also be normalized.

I imagine that relative to the work required for the whole library, these changes would be quite trivial, and might very well transform the library from completely unacceptable to acceptable for a number of objectors on the list, while having essentially no impact on those who are happy to use the library as is.

From: Jeremy Maitin-Shepard <jeremy@jeremyms.com>
The most significant complaint seems to be the fact that the translation interface is limited to ASCII (or maybe UTF-8 is also supported, it isn't entirely clear).
[snip]
I imagine relative to the work required for the whole library, these changes would be quite trivial, and might very well transform the library from completely unacceptable to acceptable for a number of objectors on the list, while having essentially no impact on those that are happy to use the library as is.
I can say a few words on what can be done and what will never be done. I will never support wide, char16_t, or char32_t strings as keys. The current interface provides a facet that has:

    template<typename CharType>
    class messages_facet {
        ...
        CharType const *get(int domain_id,char const *msg) const = 0;
        ...
    };

And 2 or 4 instantiations of it installed: messages_facet<char>, messages_facet<wchar_t>, messages_facet<char16_t> and messages_facet<char32_t>.

Supporting

    CharType const *get(int domain_id,char const *msg) const = 0;
    CharType const *get(int domain_id,wchar_t const *msg) const = 0;
    CharType const *get(int domain_id,char16_t const *msg) const = 0;
    CharType const *get(int domain_id,char32_t const *msg) const = 0;

is just a waste of memory, as each source string would have to be converted to 4 variants for the fastest comparison, or converted at runtime... Wasteful. Thus I would only consider supporting "char const *" literals.

One possibility is to provide, on a per-domain basis, a key in the po file, "X-Boost-Locale-Source-Encoding", so the user would be able to specify in the special record (which exists in all message catalogs) something like:

    "X-Boost-Locale-Source-Encoding: windows-936"

or

    "X-Boost-Locale-Source-Encoding: UTF-8"

Then when the catalog is loaded, its keys would be converted to the X-Boost-Locale-Source-Encoding.
So if you are an MSVC user and you really want to have localized keys, you have the following options:

Option A:
---------

source.cpp: // without BOM, windows-936 encoded

    #pragma setlocale("Japanese_Japan.936")
    translate("平和");
    // L"平和" works well
    wcout << translate("「平和」"); // converted at runtime from cp936 to UTF-16
    cout << translate("「平和」");  // converted at runtime from cp936 to UTF-8

myprogram.po:

    msgid ""
    msgstr ""
    "X-Boost-Locale-Source-Encoding: windows-936\n"
    "Content-Type: charset=UTF-8\n"

    msgid "平和"
    msgstr "שלום"

    # not translated
    msgid "「平和」"
    msgstr ""

Option B:
---------

source.cpp: // with BOM, UTF-8 encoded, still windows-936 locale

    #pragma setlocale("Japanese_Japan.936")
    translate("平和"); // MSVC would actually make this cp936
    // L"平和" works well
    wcout << translate("「平和」"); // converted at runtime from cp936 to UTF-16
    cout << translate("「平和」");  // converted at runtime from cp936 to UTF-8

myprogram.po:

    msgid ""
    msgstr ""
    "X-Boost-Locale-Source-Encoding: windows-936\n"
    "Content-Type: charset=UTF-8\n"

    msgid "平和"
    msgstr "שלום"

    # not translated
    msgid "「平和」"
    msgstr ""

Option C (in future C++11):
---------

source.cpp: // with BOM, UTF-8 encoded

    translate(u8"平和"); // would be UTF-8
    // L"平和" works well
    wcout << translate(u8"「平和」"); // converted at runtime from UTF-8 to UTF-16
    cout << translate(u8"「平和」");  // just copied to the stream as is

myprogram.po:

    msgid ""
    msgstr ""
    "Content-Type: charset=UTF-8\n" # it would assume UTF-8 sources

    msgid "平和"
    msgstr "שלום"

    # not translated
    msgid "「平和」"
    msgstr ""

Option D (works now):
---------

source.cpp: // without BOM, UTF-8 encoded

    translate("平和"); // MSVC would use it as UTF-8
    // L"平和" does not work!!
    wcout << translate("「平和」"); // converted at runtime from UTF-8 to UTF-16
    cout << translate("「平和」");  // just copied to the stream as is

myprogram.po:

    msgid ""
    msgstr ""
    "Content-Type: charset=UTF-8\n" # it would assume UTF-8 sources

    msgid "平和"
    msgstr "שלום"

    # not translated
    msgid "「平和」"
    msgstr ""

This can be done and I can implement it. But do not expect anything beyond this.

Also note that converting a message from cp936 to, for example, windows-1255 (the Hebrew narrow Windows encoding) would swap out all non-ASCII characters... But that is the problem of the developer who chose to use non-ASCII keys.

Artyom

On 04/25/2011 11:56 PM, Artyom wrote:
From: Jeremy Maitin-Shepard<jeremy@jeremyms.com>
The most significant complaint seems to be the fact that the translation interface is limited to ASCII (or maybe UTF-8 is also supported, it isn't entirely clear).
[snip]
I imagine relative to the work required for the whole library, these changes would be quite trivial, and might very well transform the library from completely unacceptable to acceptable for a number of objectors on the list, while having essentially no impact on those that are happy to use the library as is.
I can say a few words on what can be done and what will never be done.
I will never support wide, char16_t or char32_t strings as keys.
It seems that it is mostly possible to get the desired results using only char * strings as keys, but there is one limitation: it is not possible to represent strings containing characters that don't fit in a single non-Unicode character set, e.g. it seems it would not be possible to have a char * string literal with both Japanese and Hebrew text. As this is unlikely to be needed, it might be a reasonable limitation, though. However, I don't see why you are so opposed to providing additional overloads. With MSVC currently, only wide strings can represent the full range of Unicode. You could provide the definitions in an alternate static/dynamic library from the char * overloads, so that there would not even be any substantial space overhead.
The current interface provides a facet that has:

    template<typename CharType>
    class messages_facet {
        ...
        CharType const *get(int domain_id,char const *msg) const = 0;
        ...
    };

And 2 or 4 instantiations of it installed: messages_facet<char>, messages_facet<wchar_t>, messages_facet<char16_t> and messages_facet<char32_t>.
Supporting

    CharType const *get(int domain_id,char const *msg) const = 0;
    CharType const *get(int domain_id,wchar_t const *msg) const = 0;
    CharType const *get(int domain_id,char16_t const *msg) const = 0;
    CharType const *get(int domain_id,char32_t const *msg) const = 0;

is just a waste of memory, as each source string would have to be converted to 4 variants for the fastest comparison, or converted at runtime... Wasteful.

Thus I would only consider supporting "char const *" literals.
One possibility is to provide, on a per-domain basis, a key in the po file, "X-Boost-Locale-Source-Encoding", so the user would be able to specify in the special record (which exists in all message catalogs) something like:

    "X-Boost-Locale-Source-Encoding: windows-936"

or

    "X-Boost-Locale-Source-Encoding: UTF-8"

Then when the catalog is loaded, its keys would be converted to the X-Boost-Locale-Source-Encoding.
This isn't a property of the message catalog, but rather a property of the program itself, and therefore, it would seem, it should be specified in the program, not in the message catalog. Something like the preprocessor define I mentioned would be a way to do this.
So if you are an MSVC user and you really want to have localized keys, you have the following options:
Option A:
---------

source.cpp: // without BOM, windows-936 encoded

    #pragma setlocale("Japanese_Japan.936")
    translate("平和");
    // L"平和" works well
    wcout << translate("「平和」"); // converted at runtime from cp936 to UTF-16
    cout << translate("「平和」");  // converted at runtime from cp936 to UTF-8

[snip]
When you say "convert in runtime", it seems you actually mean the keys will be converted from UTF-8 to cp936 when the messages are loaded, but the values will remain UTF-8. Untranslated strings would have to be converted, I suppose.
Option B:
---------

source.cpp: // with BOM, UTF-8 encoded, still windows-936 locale

    #pragma setlocale("Japanese_Japan.936")
    translate("平和"); // MSVC would actually make this cp936
    // L"平和" works well
    wcout << translate("「平和」"); // converted at runtime from cp936 to UTF-16
    cout << translate("「平和」");  // converted at runtime from cp936 to UTF-8

[snip]
Okay, same as Option A, except that it is possible to specify wide literals using the full range of Unicode characters, rather than being limited to the local charset.
Option C (in future C++11):
---------

source.cpp: // with BOM, UTF-8 encoded

    translate(u8"平和"); // would be UTF-8
    // L"平和" works well
    wcout << translate(u8"「平和」"); // converted at runtime from UTF-8 to UTF-16
    cout << translate(u8"「平和」");  // just copied to the stream as is
Clearly this is a good solution, if only it were supported.
Option D (works now):
---------

source.cpp: // without BOM, UTF-8 encoded

    translate("平和"); // MSVC would use it as UTF-8
    // L"平和" does not work!!

[snip]
I think it is obvious this isn't a feasible solution, as this breaks wide string literals, which are likely to be needed by anyone using MSVC.
    wcout << translate("「平和」"); // converted at runtime from UTF-8 to UTF-16
    cout << translate("「平和」");  // just copied to the stream as is

myprogram.po:

    msgid ""
    msgstr ""
    "Content-Type: charset=UTF-8\n" # it would assume UTF-8 sources

    msgid "平和"
    msgstr "שלום"

    # not translated
    msgid "「平和」"
    msgstr ""
This can be done and I can implement it. But do not expect anything beyond this.
Also note that converting a message from cp936 to, for example, windows-1255 (the Hebrew narrow Windows encoding) would swap out all non-ASCII characters...
I'm not exactly sure why a conversion like this might happen, and it is also not clear that this is a serious problem. (Likely the Hebrew speaker would not be able to read Japanese anyway.)
But that is the problem of the developer who chose to use non-ASCII keys.

----- Original Message ----
From: Jeremy Maitin-Shepard <jeremy@jeremyms.com>

On 04/25/2011 11:56 PM, Artyom wrote:
From: Jeremy Maitin-Shepard<jeremy@jeremyms.com>
The most significant complaint seems to be the fact that the translation interface is limited to ASCII (or maybe UTF-8 is also supported, it isn't entirely clear).
[snip]
I imagine relative to the work required for the whole library, these changes would be quite trivial, and might very well transform the library from completely unacceptable to acceptable for a number of objectors on the list, while having essentially no impact on those that are happy to use the library as is.
I can say a few words on what can be done and what will never be done.
I will never support wide, char16_t or char32_t strings as keys.
It seems that it is mostly possible to get the desired results using only char * strings as keys [snip]
However, I don't see why you are so opposed to providing additional overloads. With MSVC currently, only wide strings can represent the full range of Unicode. You could provide the definitions in an alternate static/dynamic library from the char * overloads, so that there would not even be any substantial space overhead.
Here is how the catalog works: it searches for the key in a hash table and, as the last stage, compares the strings bytewise. It is fast and efficient. In order to support L"", "", u"", and U"" keys, I would need to create 4 variants of the same string to make sure lookup stays fast (a waste of memory), or I would need to convert the string from UTF-16/32 to UTF-8, which means run-time memory allocation and conversion. So no, I'm not going to do this, especially since it is not portable enough.
One possibility is to provide, on a per-domain basis, a key in the po file, "X-Boost-Locale-Source-Encoding", so the user would be able to specify in the special record (which exists in all message catalogs) something like:

    "X-Boost-Locale-Source-Encoding: windows-936"

or

    "X-Boost-Locale-Source-Encoding: UTF-8"

Then when the catalog is loaded, its keys would be converted to the X-Boost-Locale-Source-Encoding.
This isn't a property of the message catalog, but rather a property of the program itself, and therefore, it would seem, it should be specified in the program, not in the message catalog. Something like the preprocessor define I mentioned would be a way to do this.
The problem with a define is that I want translate("foo") to work automatically, not to be a define. So I either need to provide the encoding in the catalog itself, or when I provide the domain name (the reason it is done per domain is that one part of a project may use UTF-8, another cp936, and another plain US-ASCII). So I can specify it either when I load a catalog or in the catalog itself.
    wcout << translate("「平和」"); // converted at runtime from cp936 to UTF-16
    cout << translate("「平和」");  // converted at runtime from cp936 to UTF-8

[snip]
When you say "convert in runtime", it seems you actually mean the keys will be converted from UTF-8 to cp936 when the messages are loaded, but the values will remain UTF-8. Untranslated strings would have to be converted, I suppose.
Yes: when the catalog loads, the UTF-8 keys will be converted to cp936 for best performance, but at runtime the original untranslated keys have to be converted to the target locale.

Artyom

On 04/27/2011 12:07 AM, Artyom wrote:
Here is how the catalog works: it searches for the key in a hash table and, as the last stage, compares the strings bytewise.

It is fast and efficient.

In order to support L"", "", u"", and U"" keys, I would need to create 4 variants of the same string to make sure lookup stays fast (a waste of memory), or I would need to convert the string from UTF-16/32 to UTF-8, which means run-time memory allocation and conversion.

So no, I'm not going to do this, especially since it is not portable enough.
Why not simply provide a compile-time or run-time option to allow the user to specify the following:

- encoding of narrow keys to be given as char * arguments, or specify that none is to be supported (in which case narrow keys cannot be used at all), the default being UTF-8;
- whether wchar_t * arguments are supported (the encoding will be assumed to be UTF-16 or UTF-32 depending on sizeof(wchar_t)) [by default, not supported]
- whether char16_t arguments are supported [by default, not supported]
- whether char32_t arguments are supported [by default, not supported]

The library would simply convert the UTF-8 encoded keys in the message catalogs to each of the supported key argument encodings. In most cases, there would only be a single supported encoding. Because the narrow version could be disabled, with Japanese text and UTF-16 wchar_t this would actually _save_ space, since UTF-16 is more efficient than UTF-8 for encoding Japanese text.

More to the point, you as a library author can offer this functionality (since it shouldn't be too much of an implementation burden) even if you as a user of your library wouldn't want to use it (because you are happy to provide English string literals).

I agree that it is very unfortunate that wchar_t can mean either UTF-16 or UTF-32 depending on the platform, but in practice the same source code containing L"" string literals can be used on both Windows and Linux to reliably specify Unicode string literals (provided that care is taken to ensure the compiler knows the source code encoding). The fact that UTF-32 (which Linux tends to use for wchar_t) is space-inefficient does in some ways render Linux a second-class citizen if a solution based on wide string literals is used for portability, but using UTF-8 on MSVC is basically just impossible, rather than merely less efficient, so there doesn't seem to be another option. (Assuming you are unwilling to rely on the Windows "ANSI" narrow encodings.)
One possibility is to provide per-domain basis a key in po file "X-Boost-Locale-Source-Encoding" so user would be able to specify in special record (which exists in all message catalogs) something like:
"X-Boost-Locale-Source-Encoding: windows-936" or "X-Boost-Locale-Source-Encoding: UTF-8"
Then when the catalog would be loaded its keys would be converted to the X-Boost-Locale-Source-Encoding.
This isn't a property of the message catalog, but rather a property of the program itself, and therefore should be specified in the program, and not in the message catalog, it would seem. Something like the preprocessor define I mentioned would be a way to do this.
Two problem with define that I want
translate("foo") to work automatically and not being a define.
So I either need to provide an encoding in catalog itself or when I provide domain name (the reason it is done per domain name as one part of the project may use UTF-8 and other cp936 and other may use US-ASCII at all)
So I can either specify it when I load a catalog or in catalog itself.
Okay.

On 27/04/2011 21:42, Jeremy Maitin-Shepard wrote:
Why not simply provide a compile-time or run-time option to allow the user to specify the following:
- encoding of narrow keys to be given as char * arguments, or specify that none is to be supported (in which case narrow keys cannot be used at all), the default being UTF-8;
- whether wchar_t * arguments are supported (the encoding will be assumed to be UTF-16 or UTF-32 depending on sizeof(wchar_t)) [by default, not supported]
- whether char16_t arguments are supported [by default, not supported]
- whether char32_t arguments are supported [by default, not supported]
The library would simply convert the UTF-8 encoded keys in the message catalogs to each of the supported key argument encodings. In most cases, there would only be a single supported encoding. Because the narrow version could be disabled, with Japanese text and UTF-16 wchar_t, this would actually _save_ space since UTF-16 is more efficient than UTF-8 for encoding Japanese text.
Why is it so complicated? The user gives a string and says what encoding it is in; the library converts it to the catalog encoding and looks it up, then returns the localized string, converting again if needed.

Unlike what Artyom said earlier, converting a string does not necessarily require dynamic memory allocation, and localization is not particularly performance-critical anyway. If that runtime conversion is a concern, it's also possible to do it at compile time, at least with C++0x (the syntax is ugly in C++03).

Actually, I fail to understand what the problem is. Is it just the MSVC BOM problem? I think that should be handled by the build system.
I agree that it is very unfortunate that wchar_t can mean either UTF-16 or UTF-32 depending on the platform
How is that unfortunate? You can tell which one depending on the size of wchar_t.
but in practice the same source code containing L"" string literals can be used on both Windows and Linux to reliably specify Unicode string literals (provided that care is taken to ensure the compiler knows the source code encoding). The fact that UTF-32 (which Linux tends to use for wchar_t) is space-inefficient does in some ways render Linux a second-class citizen if a solution based on wide string literals is used for portability, but using UTF-8 on MSVC is basically just impossible, rather than merely less efficient, so there doesn't seem to be another option. (Assuming you are unwilling to rely on the Windows "ANSI" narrow encodings.)
You can always use a macro USTRING("foo") that expands to u8"foo" or u"foo" on systems with unicode string literals and L"foo" elsewhere.

On 04/27/2011 03:11 PM, Mathias Gaunard wrote:
On 27/04/2011 21:42, Jeremy Maitin-Shepard wrote:
Why not simply provide a compile-time or run-time option to allow the user to specify the following:
- encoding of narrow keys to be given as char * arguments, or specify that none is to be supported (in which case narrow keys cannot be used at all), the default being UTF-8;
- whether wchar_t * arguments are supported (the encoding will be assumed to be UTF-16 or UTF-32 depending on sizeof(wchar_t)) [by default, not supported]
- whether char16_t arguments are supported [by default, not supported]
- whether char32_t arguments are supported [by default, not supported]
The library would simply convert the UTF-8 encoded keys in the message catalogs to each of the supported key argument encodings. In most cases, there would only be a single supported encoding. Because the narrow version could be disabled, with Japanese text and UTF-16 wchar_t, this would actually _save_ space since UTF-16 is more efficient than UTF-8 for encoding Japanese text.
Why is it so complicated?
User gives string and says what encoding it is in, the library converts to the catalog encoding and looks it up, then returns the localized string, converting again if needed.
Unlike what Artyom said earlier, converting a string does not necessarily require dynamic memory allocation, and localization is not particularly performance critical anyway.
It may often not be performance critical. In some cases, it might be though. Consider the case of a web server, where the work done by the web server machines themselves may essentially just consist of pasting together strings from various sources. (There is possibly a separate database server, etc.) This is also precisely the use case for which Artyom designed the library, I think. In this setting it is fairly clear why converting the messages once when loaded is better than doing it when needed.
If that runtime conversion is a concern, it's also possible to do that at compile time, at least with C++0x (syntax is ugly in C++03).
Maybe it can be done, but I don't think it is a viable possibility.
Actually, I fail to understand what the problem is. Is it just the MSVC BOM problem? I think it should be handled by the build system.
I agree that it is very unfortunate that wchar_t can mean either UTF-16 or UTF-32 depending on the platform
How is that unfortunate? You can tell which one depending on the size of wchar_t.
It is unfortunate simply because it is not uniform, even though it is possible to work around that, and furthermore, it is unfortunate because UTF-32 is generally not wanted.
but in practice the same source code containing L"" string literals can be used on both Windows and Linux to reliably specify Unicode string literals (provided that care is taken to ensure the compiler knows the source code encoding). The fact that UTF-32 (which Linux tends to use for wchar_t) is space-inefficient does in some ways make render Linux a second-class citizen if a solution based on wide string literals is used for portability, but using UTF-8 on MSVC is basically just impossible, rather than merely less efficient, so there doesn't seem to be another option. (Assuming you are unwilling to rely on the Windows "ANSI" narrow encodings.)
You can always use a macro USTRING("foo") that expands to u8"foo" or u"foo" on systems with unicode string literals and L"foo" elsewhere.
You can, but it adds complexity, etc...

On 28/04/2011 00:32, Jeremy Maitin-Shepard wrote:
User gives string and says what encoding it is in, the library converts to the catalog encoding and looks it up, then returns the localized string, converting again if needed.
Unlike what Artyom said earlier, converting a string does not necessarily require dynamic memory allocation, and localization is not particularly performance critical anyway.
It may often not be performance critical. In some cases, it might be though. Consider the case of a web server, where the work done by the web server machines themselves may essentially just consist of pasting together strings from various sources. (There is possibly a separate database server, etc.) This is also precisely the use case for which Artyom designed the library, I think. In this setting it is fairly clear why converting the messages once when loaded is better than doing it when needed.
Converting between encodings without memory allocation could be even cheaper than concatenating strings.
If that runtime conversion is a concern, it's also possible to do that at compile time, at least with C++0x (syntax is ugly in C++03).
Maybe it can be done, but I don't think it is a viable possibility.
It could work if you only need it for short strings and you can spend time at compile time to do that conversion.
It is unfortunate simply because it is not uniform, even though it is possible to work around that, and furthermore, it is unfortunate because UTF-32 is generally not wanted.
It is uniform since it's always Unicode (except on some platforms that very few people care about).
but in practice the same source code containing L"" string literals can be used on both Windows and Linux to reliably specify Unicode string literals (provided that care is taken to ensure the compiler knows the source code encoding). The fact that UTF-32 (which Linux tends to use for wchar_t) is space-inefficient does in some ways make render Linux a second-class citizen if a solution based on wide string literals is used for portability, but using UTF-8 on MSVC is basically just impossible, rather than merely less efficient, so there doesn't seem to be another option. (Assuming you are unwilling to rely on the Windows "ANSI" narrow encodings.)
You can always use a macro USTRING("foo") that expands to u8"foo" or u"foo" on systems with unicode string literals and L"foo" elsewhere.
You can, but it adds complexity, etc...
How so? It solves exactly the problem you explained, i.e. avoiding wasting memory on UTF-32 when you can. If USTRING is too long, you can just use _U or something like that.

On 27/04/11 23:11, Mathias Gaunard wrote: <snip>
If that runtime conversion is a concern, it's also possible to do that at compile time, at least with C++0x (syntax is ugly in C++03).
Do you imagine that user-defined literals allow this? Sorry, but they don't, according to my reading of n3290 [lex.ext]. Only user-defined integer and floating-point literals allow compile-time access to the characters. Unless you're willing to write all strings in [0-9a-fA-F] (in which case you'd certainly be happy with ASCII!) this doesn't help much. If there's some other way to get compile-time access to characters more neatly than in C++03, then please share it; I would like to know! John Bytheway

On 29/04/2011 20:37, John Bytheway wrote:
On 27/04/11 23:11, Mathias Gaunard wrote: <snip>
If that runtime conversion is a concern, it's also possible to do that at compile time, at least with C++0x (syntax is ugly in C++03).
Do you imagine that user-defined literals allow this?
No.
If there's some other way to get compile-time access to characters more neatly than in C++03, then please share it; I would like to know!
In C++0x, "some_literal"[some_constant_integer_expression] is a constant expression.

On 25.04.2011 22:31, Jeremy Maitin-Shepard wrote:
We can assume that the compiler knows the correct character set of the source code file, as trying to fool it would seem to be inherently error prone. This seems to rule out the possibility of char * literals containing UTF-8 encoded text on MSVC, until C++1x Unicode literals are supported.
The biggest nuisance is that we need to know the compile-time character set/encoding (so that we know how to interpret "narrow" string literals), and there does not appear to be any standard way in which this is recorded (maybe I'm mistaken though).
The source character set is pretty much irrelevant. It's the execution character set that is problematic. A compiler will translate string literals in the source from the source character set to the execution character set for storage in the binary. GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worse problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding. So let's assume that further down, it's the execution set that's known.
By knowing the compile-time character set, all ambiguity is removed. The translation database can be assumed to be keyed based on UTF-8, so to translate a message, it needs to be converted to UTF-8. There should presumably be versions of the translation functions that take narrow strings, wide strings, and additional versions for the C++1x unicode literals once they are supported by compilers (I expect that to be very soon, at least for some compilers). If a wide string is specified, it will be assumed to be in UTF-16 or UTF-32 depending on sizeof(wchar_t), and converted to UTF-8. UTF-32 is generally undesirable, I imagine, but in practice should nonetheless work and using wide strings might be the best approach for code that needs to compile on both Windows and Linux. For the narrow version, if the compile-time narrow encoding is UTF-8, the conversion is a no-op. Otherwise, the conversion will have to be done. (The C++1x u8 literal version would naturally require no conversion also.)
The issue with making the narrow version automatically transcode the input from the narrow encoding to UTF-8 is that it is a compatibility issue with C++11 u8 literals. For some reason, there is no way in the type system to distinguish between normal narrow and u8 literals. In other words, if you ever make the translate() functions assume a narrow literal to be in the locale character set, you can't use u8 literals there anymore. Sebastian

On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worse problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would automatically prepend the BOM to source files when using MSVC, but unfortunately he was never able to help me do this. I think that with a Unicode library getting into Boost, this feature is becoming even more important.

From: Mathias Gaunard <mathias.gaunard@ens-lyon.org>
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worse problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would automatically prepend the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
The problem is that even if the source is UTF-8 with a BOM, "שלום" would be encoded according to the locale's 8-bit codepage (such as 1255 or 936) and not as a UTF-8 string (codepage 65001). It is rather stupid, but this is how MSVC works, or how it understands the place of UTF-8 in this world. Unicode and Visual Studio is just broken... Artyom

On Tue, Apr 26, 2011 at 9:27 PM, Artyom <artyomtnk@yahoo.com> wrote:
From: Mathias Gaunard <mathias.gaunard@ens-lyon.org>
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worse problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would automatically prepend the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
The problem is that even if the source is UTF-8 with a BOM, "שלום" would be encoded according to the locale's 8-bit codepage (such as 1255 or 936) and not as a UTF-8 string (codepage 65001).
It is rather stupid, but this is how MSVC works, or how it understands the place of UTF-8 in this world.
It's not stupid. It's because the ANSI versions of the Win32 API expect these encodings. To me, having the encoding of ordinary string literals follow the source file's encoding is the stupid idea.
Unicode and Visual Studio is just broken...
Artyom _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
-- Ryou Ezoe

On 26/04/2011 14:27, Artyom wrote:
The problem is that even if the source is UTF-8 with a BOM, "שלום" would be encoded according to the locale's 8-bit codepage (such as 1255 or 936) and not as a UTF-8 string (codepage 65001).
It is rather stupid, but this is how MSVC works, or how it understands the place of UTF-8 in this world.
Unicode and Visual Studio is just broken...
That's not broken; this is the expected behaviour. The execution character set is necessarily ANSI with that compiler, and the compiler performs the source character set to execution character set conversion as expected. To be able to put UTF-8 in string literals, you should use Unicode string literals (C++0x only) or wide string literals (but then you end up with UTF-16).

On Tue, Apr 26, 2011 at 9:27 PM, Artyom <artyomtnk@yahoo.com> wrote:
From: Mathias Gaunard <mathias.gaunard@ens-lyon.org>
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worse problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would automatically prepend the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
The problem is that even if the source is UTF-8 with a BOM, "שלום" would be encoded according to the locale's 8-bit codepage (such as 1255 or 936) and not as a UTF-8 string (codepage 65001).
It is rather stupid, but this is how MSVC works, or how it understands the place of UTF-8 in this world.
Unicode and Visual Studio is just broken...
I seriously question the author's ability to understand the real-world situation. This library is not only useless, but also harmful for localization: it encourages people to use ASCII. The reason there are so many ASCII-compatible encodings is, I think, partly quick workarounds. A lot of existing code expected ASCII, and Unicode was not a viable solution at the time, so in order to handle their language people created encodings that were compatible with ASCII. It worked most of the time. No matter how firmly you say "this library expects ASCII input and it's the programmer's responsibility to pass ASCII; anything else deserves to be broken", people will use these ASCII-compatible encodings with existing code, because it works most of the time. They want to use their language, and they want an encoding which can express their language, so they use ASCII-compatible encodings wherever ASCII is expected. We have to get rid of ASCII. What a shame that a localization library expects ASCII input.
Artyom
-- Ryou Ezoe

On Tue, Apr 26, 2011 at 9:27 PM, Artyom <artyomtnk@yahoo.com> wrote:
From: Mathias Gaunard <mathias.gaunard@ens-lyon.org>
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worse problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would automatically prepend the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
The problem is that even if the source is UTF-8 with a BOM, "שלום" would be encoded according to the locale's 8-bit codepage (such as 1255 or 936) and not as a UTF-8 string (codepage 65001).
It is rather stupid, but this is how MSVC works, or how it understands the place of UTF-8 in this world.
Unicode and Visual Studio is just broken...
The real obstacle for localization is not software that wasn't programmed with localization in mind. We can replace hard-coded text (whether the source code is provided or not); it's a tedious but straightforward task. Most programs don't need a runtime language switch anyway, so hard-coded text is all right. The real obstacle is ASCII. If ASCII is used instead of UTF-8, UTF-16, or UTF-32, we have to use an ASCII-compatible encoding; on Windows, for Japanese, that's CP932 (the Microsoft variant of Shift-JIS). In that case we can't translate a program by simply replacing the text. Because Windows can't tell which encoding it is (it can be anything; we can't detect it heuristically), we have to specify it explicitly. For example, the fdwCharSet argument of every CreateFont API call must be changed to SHIFTJIS_CHARSET. On Windows, software should use UTF-16. If a locale library expects ASCII input, even though it supports wchar_t output, I wonder how many people will actually use it. By accepting only ASCII input, this library encourages the use of ASCII: another obstacle for real-world localization.
Artyom
-- Ryou Ezoe

The problem is that even if the source is UTF-8 with a BOM, "שלום" would be encoded according to the locale's 8-bit codepage (such as 1255 or 936) and not as a UTF-8 string (codepage 65001).
It is rather stupid, but this is how MSVC works, or how it understands the place of UTF-8 in this world.
Unicode and Visual Studio is just broken...
The real obstacle for localization is not software that wasn't programmed with localization in mind. We can replace hard-coded text (whether the source code is provided or not); it's a tedious but straightforward task. Most programs don't need a runtime language switch anyway, so hard-coded text is all right.
Really? If so, you probably haven't read the manuals or done any real localization task beyond simply converting several strings from one language to another. And no, that isn't called localization; there are way too many things beyond it. This remark, and other remarks, for example the fact that you said "plural forms should not be used" at some point in the review, make me doubt whether you have any real experience with localizing software at all.
The real obstacle is ASCII. If ASCII is used instead of UTF-8, UTF-16, or UTF-32, we have to use an ASCII-compatible encoding; on Windows, for Japanese, that's CP932 (the Microsoft variant of Shift-JIS).
You keep mixing up US-ASCII and the Windows "ANSI encoding" - they are different things.
In that case we can't translate a program by simply replacing the text. Because Windows can't tell which encoding it is (it can be anything; we can't detect it heuristically), we have to specify it explicitly.
For example, the fdwCharSet argument of every CreateFont API call must be changed to SHIFTJIS_CHARSET.
On Windows, software should use UTF-16. If a locale library expects ASCII input, even though it supports wchar_t output, I wonder how many people will actually use it.
I don't understand how you still don't get what I have written many, many times: the input is a narrow string, but the output can be either a narrow or a wide string, according to your needs.
By accepting only ASCII input, this library encourages the use of ASCII: another obstacle for real-world localization.
US-ASCII is used as the baseline of all encodings, so the software is independent of the various ways different compilers decide what encoding should be used. Ryou, before you continue this "Crusade" against Boost.Locale, please read the other post I added on how I can implement a way to handle natural-language-specific keys. You hadn't even responded to it. So please, until you really read things deeply and try to understand the rationale (even if you disagree), don't bother to ask again, because I can't talk to someone who does not respond to my posts on the list. In the other mail I provided a solution for how to make
std::wstring something = wgettext("平和");
or
std::wstring something = wgettext(u8"平和"); // when MSVC will support it
work correctly, without any problems, in a transparent way. And I said that I will implement this in Boost.Locale even though I think it is wrong from a localization point of view. But you are still talking about "bad-ascii-boost-locale". So enough is enough. I'm willing to have constructive discussions, but I'm not going to listen any more until you change your attitude. I hope I haven't offended you in any way, but I just can't continue the discussion the way it is. Best Regards, Artyom

Before you continue this "Crusade" against Boost.Locale,
My view of this affair is as follows: Artyom advocates universal use of UTF-8 - see: http://stackoverflow.com/questions/1049947/should-utf-16-be-considered-harmf... I believe this will be deeply unpopular with Asian programmers who, for whatever reasons, hate UTF-8 and love UTF-16. Artyom has also said that MSVC is "broken" and (as I understand it) that all programmers should use ASCII for locale's gettext translate. However, just because boost is a portable library and boost locale is used for localisation doesn't mean that programmers are writing *portable applications*. In other words, people may well wish to USE boost locale in a UTF-16 environment, e.g. Windows, and have nothing to do with UTF-8 or ASCII. As I understand it, those people are not well supported by the gettext bit of locale. I don't think that can be resolved at this stage, but one should not try to present a necessity as a virtue - which will be resented.

From: Steve Bush <sb2@neosys.com>
Before you continue this "Crusade" against Boost.Locale,
My view of this affair is as follows:
Artyom advocates universal use of UTF-8 - see:
http://stackoverflow.com/questions/1049947/should-utf-16-be-considered-harmf...
I believe this will be deeply unpopular with Asian programmers who, for whatever reasons, hate UTF-8 and love UTF-16.
That does not mean that the library does not fully support wide/UTF-16 strings. It makes a lot of effort to support them.
Artyom has also said that MSVC is "broken" and (as I understand it) that all programmers should use ASCII for locale's gettext translate.
I've mentioned more than once that ASCII keys are not for a technical reason but for a much deeper, localization-related reason. I could technically add support for L"" and whatever-you-want as gettext keys, but that would be wrong, the same way it is wrong today to use "gets()". The real reason is not technical, and not even the fact that MSVC does not know how to create a proper UTF-8 literal; the reason is much deeper - a linguistic reason, a usability reason, even for pure Windows developers. I have written up this reason more than once and I will not repeat it over and over again.
However, just because boost is a portable library and boost locale is used for localisation doesn't mean that programmers are writing *portable applications*. In other words, people may well wish to USE boost locale in a UTF-16 environment, e.g. Windows, and have nothing to do with UTF-8 or ASCII.
If I were developing this library for "portable applications" only, I would not provide wide character support at all, as it is very unportable. But I made a huge effort (that, by the way, I didn't originally need) to support the Windows development style.
As I understand it, those people are not well supported by the gettext bit of locale.
gettext catalogs have nothing to do with the actual string encodings that can be used (for both ids and outputs), as I mentioned in the other mail I sent. Bottom line: sometimes design decisions should be made to make software better, even if not all potential users like them or understand the rationale right now. Best, Artyom

Artyom, I am simply pointing out that dropping the "UTF-16 is harmful/MSVC is broken" rhetoric might avoid *unnecessarily* prejudicing Asian programmers against Boost.Locale. I emphasise *unnecessarily* because, with the possible exception of gettext, there does not actually seem to be any problem with the library for Asian programmers! Many thanks for the wonderful library.

Artyom wrote:
I've mentioned more than once that ASCII keys are not for a technical reason but for a much deeper, localization-related reason.
I could technically add support for L"" and whatever-you-want as gettext keys, but that would be wrong, the same way it is wrong today to use "gets()". The real reason is not technical, and not even the fact that MSVC does not know how to create a proper UTF-8 literal; the reason is much deeper - a linguistic reason, a usability reason, even for pure Windows developers.
I have written up this reason more than once and I will not repeat it over and over again.
Do you think you could add that reason to the documentation? Maybe it's there, but I did not notice it on a quick look. Thanks, -- Vladimir Prus Mentor Graphics +7 (812) 677-68-40

On 4/27/2011 12:57 AM, Artyom wrote:
So enough is enough. I'm willing to do constructive discussions but I'm not going to listen any more till you'll change the attitude.
I hope I hadn't offended you in any way, but I just can't continue the discussion the way it is.
Best Regards, Artyom
I actually applaud you for being more patient than I would have been, given Ryou's persistently hostile tone. Although you may be reaping a bit of the karmic hostility you sowed with some of your posts on the boost-build list. ;) Claims like "non-Western programmers will never use this library" are out of place on this list. I am a total novice at localization, but there are a whole lot of "non-Western" programmers who know English/ASCII well enough to use it for keys (I speculate the majority do). Indians in particular have considerable English proficiency, and there are quite a lot of them. I chatted with a Chinese grad student here, and his impression is that China and South Korea probably follow Japan's pattern of having many programmers without English proficiency, so Ryou does have a point, but could have made it in a much nicer way. Specifically, as Steve Bush said earlier, the idea of a program localized only for the east Asian markets is plausible. But at the same time I think it's perfectly acceptable for boost.locale to NOT target that particular use case and instead go for (I assume) the much larger use case of programs that are intended for ALL markets. It would be very informative to see statistics about programmers: their primary development language and secondary language proficiencies. This is a hard thing to google because it thinks I'm referring to programming languages. :( -Matt

On 04/27/2011 10:49 AM, Matthew Chambers wrote:
Claims like "non-Western programmers will never use this library" are out of place on this list.
I disagree. It's a relevant data point, even if you don't agree with the reasoning behind it.
I am a total novice at localization but there are a whole lot of "non-Western" programmers who know English/ASCII well enough to use it as keys (I speculate the majority do). Indians in particular have considerable English proficiency and there are quite a lot of them. I chatted with a Chinese grad student here and his impression is that China and South Korea probably follow Japan's pattern of having many programmers without English proficiency, so Ryou does have a point but could have made it in a much nicer way.
I think you're missing his point. Ryou spells it out for you here:
The real obstacle is ASCII. If ASCII is used instead of UTF-8, UTF-16, or UTF-32, we have to use ASCII compatible encoding. In Windows and for Japanese, it's CP932(Microsoft variant of Shift-JIS).
In that case, we can't translate a program simply replacing the text. Because Windows can't tell which encoding it is(it can be anything, we can't detect it heuristically), we have to explicitly specify it.
The pre-Unicode encoding schemes for program text sucked and he's not going back.
On 04/27/2011 10:49 AM, Matthew Chambers wrote:
Specifically, as Steve Bush said earlier, the idea of a program localized only for the east Asian markets is plausible. But at the same time I think it's perfectly acceptable for boost.locale to NOT target that particular use case and instead go for (I assume) the much larger use case of programs that are intended for ALL markets.
As long as everyone recognizes at the outset that it's not going to make everyone happy, possibly to the point of it not being used by those with particular requirements (i.e. Asia).
It would be very informative to see statistics about programmers: their primary development language and secondary language proficiencies.
Of course they know how to spell things in English. We have, what, 26 or 52 characters and they learn 2000 or so for professional work. In the context of program text, Japanese identifiers may be easier to spell in 7-bit ASCII than they are in Japanese. But oversimplification seems to carry connotations, too. Asian text appears to very quickly diverge from what we think of as "plain text" into what we would consider "desktop publishing". It's probably not something that can be done halfway, yet Ryou is (understandably) unwilling to go back to pre-Unicode. I see no reason to think that Ryou is not representative of Asian developers. If we want this library to be useful to them, we probably need to make Ryou happy. - Marsh

Marsh Ray wrote:
On 04/27/2011 10:49 AM, Matthew Chambers wrote:
Claims like "non-Western programmers will never use this library" are out of place on this list.
I disagree.
It's a relevant data point, even if you don't agree with the reasoning behind it.
It's a relevant piece of information, but I think the point was that there are many non-Western programmers (for some definition of "Western") that would use the library. Therefore, a modified form of the assertion would have been better: "No Asian programmers I know of will ever use this library." That might not be as strong, though, unless Ryou also gave scope to the number of Asian programmers he knows that would think similarly. Anyway, I think the concern was for the statement's tone and hyperbole. _____ Rob Stewart robert.stewart@sig.com Software Engineer using std::disclaimer; Dev Tools & Components Susquehanna International Group, LLP http://www.sig.com

On 4/27/2011 11:58 AM, Stewart, Robert wrote:
Marsh Ray wrote:
On 04/27/2011 10:49 AM, Matthew Chambers wrote:
Claims like "non-Western programmers will never use this library" are out of place on this list.
I disagree.
It's a relevant data point, even if you don't agree with the reasoning behind it.
It's a relevant piece of information, but I think the point was that there are many non-Western programmers (for some definition of "Western") that would use the library. Therefore, a modified form of the assertion would have been better: "No Asian programmers I know of will ever use this library." That might not be as strong, though, unless Ryou also gave scope to the number of Asian programmers he knows that would think similarly.
Anyway, I think the concern was for the statement's tone and hyperbole.
Yes, the hostile tone and hyperbole are what is out of place, not the assertion. The assertion itself is factually wrong for the general (non-strict) geographical definition of "Western" since India has a large English proficient population and isn't geographically "Western." If we go beyond geographical definitions, Japan is more Western than India (http://www.henryjacksonsociety.org/stories.asp?id=1171). It would have been much simpler and more accurate to write "east Asian" - presumably he didn't do that in the first place to amplify the hyperbole. -Matt

On 04/27/2011 11:58 AM, Stewart, Robert wrote:
Marsh Ray wrote:
On 04/27/2011 10:49 AM, Matthew Chambers wrote:
Claims like "non-Western programmers will never use this library" are out of place on this list.
I disagree.
It's a relevant data point, even if you don't agree with the reasoning behind it.
It's a relevant piece of information, but I think the point was that there are many non-Western programmers (for some definition of "Western") that would use the library. Therefore, a modified form of the assertion would have been better: "No Asian programmers I know of will ever use this library." That might not be as strong, though, unless Ryou also gave scope to the number of Asian programmers he knows that would think similarly.
My guess is there's probably not a lot of diversity of opinion on this point, at least in Japan.
Anyway, I think the concern was for the statement's tone and hyperbole.
You're talking to a guy whose native language has multiple independent dimensions in the grammar itself for politeness, respect, and formality. Just imagine what kind of linguistic jackhammer we sound like to him. You should probably consider it an honor to be receiving direct criticism under the circumstances. Or at least you should look harder for a way to interpret his statements at face value and give him any possible benefit of the doubt.

I see him saying things like:

On 04/26/2011 08:47 PM, Ryou Ezoe wrote:
I seriously concerns the author's ability to understand the real world situation. This library is not only useless, but also harmful for localization. It encourage people to use ASCII.
To someone from a Western alphabetic language... not knowing how to print characters... sure, this might sound like he was accusing you of not having graduated first grade. But in Japan they are studying new characters up through high school in a regimented, standardized way. The character set you use has a very well-defined relationship with your level of education, more so even than your choice of word usage in English. I think you really don't understand the reality of his requirements.

Within recent memory, systems in Asia used odd mixtures of multi-byte regional encodings like Shift-JIS. These guys have had a long history of trying to fit professional-looking text into encodings that were designed to work better elsewhere. Unicode (in the form of UTF-16) came along and surely made everyone's life easier. But still, if the compiler doesn't handle UTF-8, then the compiler doesn't handle UTF-8. It's that simple. Yes, it's possible to represent Japanese text in alternate character sets, but it "looks wrong", and in this case the appearance carries semantic value. This stuff is very difficult for an outsider to get right - or even appreciate how far wrong they are. In English text we really have very little to compare with to understand how this looks. Here's how I imagine it:

* Imagine receiving a résumé from a job applicant printed in block letters. In crayon.
* Imagine applying to work for a Japanese company, but because you don't know the proper Japanese characters, you simply format your text like this: http://www.dafont.com/theme.php?cat=201
* Imagine you won a contract to build a sign for a Mexican restaurant, but you order the wrong letters and use German gothic script lettering: http://en.wikipedia.org/wiki/Blackletter
* http://i.imgur.com/a3jQY.png
* http://bit.ly/hVNGux

It seems very reasonable to me that Ryou would be skeptical that such a library would end up being useful to him.
Furthermore, if he uses it rather than something more established, and his program outputs something unprofessional, who is that going to reflect on? Don't get me wrong, I think it would be awesome if you guys were able to create something that was well-regarded everywhere. But if you make something to your own requirements, don't act so surprised when not everyone rushes to use it. - Marsh

On 04/27/2011 11:46 AM, Marsh Ray wrote:
[comments about ASCII requirement]
A suggestion was made at one point that an ASCII transliteration of Japanese would be a possible solution, and I think most of us can agree that it should be rejected immediately as a non-viable solution. However, forcing MSVC users to use one of the Windows "ANSI" encodings, such as windows-936, wouldn't be nearly such an impediment.

On 04/27/2011 02:23 PM, Jeremy Maitin-Shepard wrote:
On 04/27/2011 11:46 AM, Marsh Ray wrote:
[comments about ASCII requirement]
A suggestion was made at one point that an ASCII transliteration of Japanese would be a possible solution, and I think most of us can agree that it should be rejected immediately as a non-viable solution.
However, forcing MSVC users to use one of the Windows "ANSI" encodings, such as windows-936, wouldn't be nearly such an impediment.
I've worked at places where I couldn't use certain Boost libraries because I couldn't change the necessary compiler setting to work around a well-documented bug in MSVC. At this place they also questioned whitespace changes which were not even affecting visible formatting. Thankfully I don't work there any more.

I don't know how much of a change it is for a software developer in Japan to convince his team to change their source file encoding so they can use some library. It's probably non-trivial. What I do know, however, is that the set of users who are candidates for ending up as happy users of a library is always much greater when the library developers don't think they can "force" users to adopt specific compiler settings or agree to other nonstandard constraints.

- Marsh

On 04/27/2011 03:05 PM, Marsh Ray wrote:
On 04/27/2011 02:23 PM, Jeremy Maitin-Shepard wrote:
On 04/27/2011 11:46 AM, Marsh Ray wrote:
[comments about ASCII requirement]
A suggestion was made at one point that an ASCII transliteration of Japanese would be a possible solution, and I think most of us can agree that it should be rejected immediately as a non-viable solution.
However, forcing MSVC users to use one of the Windows "ANSI" encodings, such as windows-936, wouldn't be nearly such an impediment.
I've worked at places where I couldn't use certain Boost libraries because I couldn't change the necessary compiler setting to work around a well-documented bug in MSVC. At this place they also questioned whitespace changes which were not even affecting visible formatting. Thankfully I don't work there any more.
I don't know how much of a change it is for a software developer in Japan to convince his team to change their source file encoding so they can use some library. It's probably non-trivial.
What I do know, however, is that the set of users who are candidates for ending up as happy users of a library is always much greater when the library developers don't think they can "force" users to adopt specific compiler settings or agree to other nonstandard constraints.
I'm not talking about the encoding of the source, but about whether L"" literals are used or not. (If "narrow" literals are used in MSVC, they will be in the execution character set, which is never UTF-8 and in the case of Japanese users might likely be windows-936.) Regardless, though, it seems that Artyom has proposed to support wchar_t and (once supported by the compiler) char16_t and char32_t literals, with the only caveat that the literal type must match the output type (which is kind of an unnecessary limitation but simplifies the interface and in practice is unlikely to be a problem).

On 4/27/2011 1:46 PM, Marsh Ray wrote:
Anyway, I think the concern was for the statement's tone and hyperbole.
You're talking to a guy whose native language has multiple independent dimensions in the grammar itself for politeness, respect, and formality. Just imagine what kind of linguistic jackhammer we sound like to him. You should probably consider it an honor to be receiving direct criticism under the circumstances. Or at least you should look harder for a way to interpret his statements at face value and give him any possible benefit of the doubt.

No matter what I imagine, it does not tell me if we actually sound impolite to him or not. If he does misunderstand us as being impolite, are you suggesting that it's appropriate (on this list) for him to respond in kind (as opposed to sending a message like "Everybody seems pretty rude on this list!")?
I see him saying things like:
On 04/26/2011 08:47 PM, Ryou Ezoe wrote:
I seriously concerns the author's ability to understand the real world situation. This library is not only useless, but also harmful for localization. It encourage people to use ASCII.
To someone from a Western alphabetic language... not knowing how to print characters... sure, this might sound like he was accusing you of not having graduated first grade.

More hyperbole? :) It doesn't sound like he's accusing Artyom of not graduating first grade. He is accusing him of not understanding localization in general, when in fact CJK is a specific use case for localization, not the general case. And Artyom clearly has a very thorough understanding of localization, at least for C++ and possibly outside the CJK use cases. Suggesting otherwise is outright hostile in any language, even Klingon.
Within recent memory, systems in Asia used odd mixtures of multi-byte regional encodings like Shift-JIS. These guys have had a long history of trying to fit professional-looking text into encodings that were designed to work better elsewhere. Unicode (in the form of UTF-16) comes along and surely made everyone's life easier. But still, if the compiler doesn't handle UTF-8, then the compiler doesn't handle UTF-8. It's that simple.
Yes, it's possible to represent Japanese text in alternate character sets but it "looks wrong", and in this case the appearance carries semantic value. This stuff is very difficult for an outsider to get right - or even appreciate how far wrong they are.

I appreciate the potential for frustration, but it's not Artyom's fault. In fact the results of the review suggest (to me at least) that he's already gone above and beyond the call of duty for the general localization case. The English language is the best thing the British did for India.
In English text we really have very little to compare with to understand how this looks. Here's how I imagine it:

I am amused by these examples but they clearly confound the user experience with the developer experience. It seems obvious to me that it is entirely possible to use boost.locale to create an identical user experience. Ryou was arguing that it would be (much) harder for many Japanese programmers to do so.
Furthermore, if he uses it rather than something more established, and his program outputs something unprofessional, who is that going to reflect on?

The developer. The only thing that could possibly be blamed on the library author is the difficulty of making the output look professional, not the fact that the developer didn't expend the effort to use it properly.
To be clear, I value the subset of Ryou's criticism that is constructive. This whole library review has been educational to me and makes me glad that all my users know English whether it's their primary language or not. ;) -Matt

On 4/26/2011 6:41 AM, Mathias Gaunard wrote:
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worst problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would allow automatically prepending the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
I think that with a Unicode library getting in Boost, this feature is becoming even more important.
Do you mean editing the source file in-place? That would be pure crazy. On the other hand, copying the source file to the build directory while prepending the BOM and then building from that would be pretty cool! Which one of these were you wanting? -Matt

On 26/04/2011 21:56, Matthew Chambers wrote:
Do you mean editing the source file in-place?
No.
On the other hand, copying the source file to the build directory while prepending the BOM and then building from that would be pretty cool! Which one of these were you wanting?
It should work like any other preprocessing step. It's not any different than running the Qt MOC tool for example.

Mathias Gaunard wrote:
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worst problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would allow automatically prepending the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
Well, if you have a command that can prepend BOM to a file, you can easily modify 'actions compile-c-c++' in msvc.jam to run that command.

- Volodya
--
Vladimir Prus
Mentor Graphics
+7 (812) 677-68-40

On 30/04/2011 18:45, Vladimir Prus wrote:
Mathias Gaunard wrote:
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worst problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would allow automatically prepending the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
Well, if you have a command that can prepend BOM to a file, you can easily modify 'actions compile-c-c++' in msvc.jam to run that command.
It would be nice if I could only do this when the source files have been tagged as utf-8 or something like that.

Mathias Gaunard wrote:
On 30/04/2011 18:45, Vladimir Prus wrote:
Mathias Gaunard wrote:
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worst problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would allow automatically prepending the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
Well, if you have a command that can prepend BOM to a file, you can easily modify 'actions compile-c-c++' in msvc.jam to run that command.
It would be nice if I could only do this when the source files have been tagged as utf-8 or something like that.
Well, it would be trivial to implement syntax like:

utf8cpp file : file.cpp ;
exe whatever : whatever.cpp file ;

If that's what you're asking for. In fact, here's a complete example that should almost work, in any Jamfile:

import type ;
type.register UTF8CPP : : CPP ;

import generators ;
generators.register-standard $(__name__).add-bom : CPP : UTF8CPP ;

actions add-bom
{
    add-bom $(>) -o $(<)
}

utf8cpp file : file.cpp ;
exe whatever : whatever.cpp file ;

I say "almost" because somebody who is actually interested in all this should write the 'add-bom' utility, test everything, and do various other boring things, like sending a patch and/or checking in. I think the ball is now in your court.

- Volodya
--
Vladimir Prus
Mentor Graphics
+7 (812) 677-68-40

On 01/05/2011 09:32, Vladimir Prus wrote:
Well, it would be trivial to implement syntax like:
utf8cpp file : file.cpp ; exe whatever : whatever.cpp file ;
If that's what you're asking for. In fact, here's a complete example that should almost work, in any Jamfile:
import type ; type.register UTF8CPP : : CPP ;
import generators ; generators.register-standard $(__name__).add-bom : CPP : UTF8CPP ;
actions add-bom { add-bom $(>) -o $(<) }
utf8cpp file : file.cpp ; exe whatever : whatever.cpp file ;
I say "almost" because somebody who is actually interested in all this should write the 'add-bom' utility, test everything, and do various other boring things, like sending a patch and/or checking in.
I think the ball is now in your court.
I'm not sure that's necessary, but what if I wanted to do that with header files as well, without having to list them all?

Mathias Gaunard wrote:
On 01/05/2011 09:32, Vladimir Prus wrote:
Well, it would be trivial to implement syntax like:
utf8cpp file : file.cpp ; exe whatever : whatever.cpp file ;
If that's what you're asking for. In fact, here's a complete example that should almost work, in any Jamfile:
import type ; type.register UTF8CPP : : CPP ;
import generators ; generators.register-standard $(__name__).add-bom : CPP : UTF8CPP ;
actions add-bom { add-bom $(>) -o $(<) }
utf8cpp file : file.cpp ; exe whatever : whatever.cpp file ;
I say "almost" because somebody who is actually interested in all this should write the 'add-bom' utility, test everything, and do various other boring things, like sending a patch and/or checking in.
I think the ball is now in your court.
I'm not sure that's necessary, but what if I wanted to do that with header files as well, without having to list them all?
It would require:

- Using globbing to specify the headers
- Using slightly more contrived code to define 'utf8hpp' that can process multiple sources.
- Instead of adding 'file' to sources as above, referring to it using the 'implicit-dependency' build property, so that the include path is set up to pick up the processed headers.

--
Vladimir Prus
Mentor Graphics
+7 (812) 677-68-40

On 01/05/2011 13:38, Vladimir Prus wrote:
I'm not sure that's necessary, but what if I wanted to do that with header files as well, without having to list them all?
It would require:
- Using globbing to specify the headers
Wouldn't it be possible to directly find all the headers included from a specific source file? Doesn't Boost.Build do this already to deal with dependencies and recompilation?

Mathias Gaunard wrote:
On 01/05/2011 13:38, Vladimir Prus wrote:
I'm not sure that's necessary, but what if I wanted to do that with header files as well, without having to list them all?
It would require:
- Using globbing to specify the headers
Wouldn't it be possible to directly find all the headers included from a specific source file?
And bravely assume they are all UTF8?
Doesn't Boost.Build do this already to deal with dependencies and recompilation?
It does, but it's done at the build engine level, when it's a bit late to go back and create new targets.

You seem to be moving the goals, though. What problem are you actually trying to solve? What libraries in Boost want to use UTF8 sources, and for what purposes? Will those libraries work on compilers that have no way of parsing UTF8 files?

- Volodya
--
Vladimir Prus
Mentor Graphics
+7 (812) 677-68-40

On 01/05/2011 16:26, Vladimir Prus wrote:
You seem to be moving the goals, though. What problem are you actually trying to solve? What libraries in Boost want to use UTF8 sources, and for what purposes? Will those libraries work on compilers that have no way of parsing UTF8 files?
I'm trying to provide an alternative to the "-finput-charset=utf-8" GCC option setting for MSVC. This option affects all files read by GCC for that particular invocation, and it would be nice if it could do the same thing for MSVC, which unfortunately involves prepending a UTF-8 BOM to all files the compiler will read.

AMDG

On 05/01/2011 08:46 AM, Mathias Gaunard wrote:
On 01/05/2011 16:26, Vladimir Prus wrote:
You seem to be moving the goals, though. What problem are you actually trying to solve? What libraries in Boost want to use UTF8 sources, and for what purposes? Will those libraries work on compilers that have no way of parsing UTF8 files?
I'm trying to provide an alternative to the "-finput-charset=utf-8" GCC option setting for MSVC.
This option affects all files read by GCC for that particular invocation, and it would be nice if it could do the same thing for MSVC, which unfortunately involves prepending a UTF-8 BOM to all files the compiler will read.
What about:

a) preprocess the source
b) add a BOM to the preprocessor output
c) compile the result

In Christ,
Steven Watanabe

From: Mathias Gaunard <mathias.gaunard@ens-lyon.org>

On 30/04/2011 18:45, Vladimir Prus wrote:
On 26/04/2011 11:17, Sebastian Redl wrote:
GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worst problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
A long time ago, I asked Vladimir Prus to help me add an option to Boost.Build that would allow automatically prepending the BOM to source files when using MSVC, but unfortunately he was never able to help me do this.
Well, if you have a command that can prepend BOM to a file, you can easily modify 'actions compile-c-c++' in msvc.jam to run that command.
It would be nice if I could only do this when the source files have been tagged as utf-8 or something like that.
A few points:

1. -fexec-charset in MSVC can be simulated with #pragma setlocale(".XXXX") where XXXX is the codepage. However, 65001 (UTF-8) can't be used!
2. -finput-charset can likewise be defined by the same setlocale pragma (and can't be 65001 (UTF-8) either), or the input can be UTF-8 if you add a BOM. In fact the BOM is only needed for files that contain non-ASCII characters.

But the bigger question is what exactly you want to do with the BOM and how it would help you make "cross-platform" software. If you write for MSVC, add the BOM in the first place; if you write cross-platform/cross-compiler software, MSVC's incompatibility with the rest of the world actually makes it impossible to use UTF-8 in a cross-platform way, because the only real Unicode strings with MSVC would be L"" literals, and those are encoded as UTF-16 while all the non-Windows world uses UTF-32 as its wide character encoding.

So basically I can say that until the Microsoft Visual Studio team takes UTF-8 seriously and either supports the 65001 codepage as expected or provides GCC-like options for the input and exec encodings, I don't see how this BOM would be useful.

Does anybody know how to open a bug or feature request for MSVC, such that MSVC11 /201[^0] would support it properly?

My $0.02

Artyom

On 01/05/2011 20:44, Artyom wrote:
But the bigger question is what exactly do you want to do with BOM and how it would help you to make the "cross-platform" software?
The goal is to allow all compilers to recognize that the source is encoded in UTF-8. This is what you need to write cross-platform source that contains non-ASCII characters.
the only real Unicode strings with MSVC would be L"" and they are actually would be encoded with UTF-16 encoding while all non-Windows world uses UTF-32 as wide character encodings.
How is that a problem at all? And using narrow string literals with UTF-8 content masquerading as ANSI is a hack, sorry. That's not the C++-endorsed solution.
So basically I can say that untill Microsoft Visual Studio team would take UTF-8 seriously and either support 65001 codepage as expected or provide GCC's like options for input and exec encodings I don't see how this BOM would be useful.
I don't really care about what the execution character set is. I definitely do not want to change it, it should be the user locale.

From: Mathias Gaunard <mathias.gaunard@ens-lyon.org>
On 01/05/2011 20:44, Artyom wrote:
But the bigger question is what exactly do you want to do with BOM and how it would help you to make the "cross-platform" software?
The goal is to allow all compilers to recognize that the source is encoded in UTF-8. This is what you need to write cross-platform source that contains non-ASCII characters.
It is not enough. You can't do it properly in a cross-platform way, as you can't currently get UTF-8, UTF-16, or UTF-32 string literals for cross-platform code until all compilers support the C++0x u/U/u8 literals, and at this point NONE of the existing popular compilers support them (checked: MSVC, GCC, Intel, SunCC).
the only real Unicode strings with MSVC would be L"" and they are actually would be encoded with UTF-16 encoding while all non-Windows world uses UTF-32 as wide character encodings.
How is that a problem at all?
And using narrow string literals with UTF-8 content masquerading as ANSI is a hack, sorry. That's not the C++-endorsed solution.
First of all, the "ANSI" codepage exists only on Windows and has nothing to do with cross-platform software. The C++ standard does not know what "ANSI" encodings are.
So basically I can say that untill Microsoft Visual Studio team would take UTF-8 seriously and either support 65001 codepage as expected or provide GCC's like options for input and exec encodings I don't see how this BOM would be useful.
I don't really care about what the execution character set is. I definitely do not want to change it, it should be the user locale.
No, you never want it to be the user's locale, because that makes compilation locale-dependent:

source.cpp / with UTF-8 BOM
--------------------------------
std::string test="שלום-سلام-Мир"

In Israel it would be "שלום-???-????" in CP1255
In Egypt it would be "????-سلام-???" in CP1256
In Russia it would be "????-????-Мир" in CP1251
In France it would be "????-???-???" in CP1252

So no, you always want the execution character set to be well defined, unless all your sources are written using US-ASCII, which is a subset of all character sets.

Artyom Beilis.

On 02/05/2011 08:47, Artyom wrote:
It is not enough.
You can't do it properly in a cross-platform way, as you can't currently get UTF-8, UTF-16, or UTF-32 string literals for cross-platform code until all compilers support the C++0x u/U/u8 literals, and at this point NONE of the existing popular compilers support them (checked: MSVC, GCC, Intel, SunCC)
Wide string literals can perfectly well be assumed to be UTF-16 or UTF-32, and that's portable. Also, as I suggested, you can use a macro that allows the usage of Unicode string literals where they're available.

On 04/26/2011 02:17 AM, Sebastian Redl wrote:
We can assume that the compiler knows the correct character set of the source code file, as trying to fool it would seem to be inherently error prone. This seems to rule out the possibility of char * literals containing UTF-8 encoded text on MSVC, until C++1x Unicode literals are supported.
On 25.04.2011 22:31, Jeremy Maitin-Shepard wrote:
The biggest nuisance is that we need to know the compile-time character set/encoding (so that we know how to interpret "narrow" string literals), and there does not appear to be any standard way in which this is recorded (maybe I'm mistaken though).
The source character set is pretty much irrelevant. It's the execution character set that is problematic. A compiler will translate string literals in the source from the source character set to the execution character set for storage in the binary. GCC has options to control both the source (-finput-charset) and the execution character set (-fexec-charset). They both default to UTF-8. However, MSVC is more complicated. It will try to auto-detect the source character set, but while it can detect UTF-16, it will treat everything else as the system narrow encoding (usually a Windows-xxxx codepage) unless the file starts with a UTF-8-encoded BOM. The worst problem is that, except for a very new, poorly documented, and probably experimental pragma, there is *no way* to change MSVC's execution character set away from the system narrow encoding.
So let's assume that further down, it's the execution set that's known.
Yes, it is the execution character set that I meant (I assumed that, as is the case for MSVC, the execution character set is the same as the character set given by the current locale at compile time.)
By knowing the compile-time character set, all ambiguity is removed. The translation database can be assumed to be keyed based on UTF-8, so to translate a message, it needs to be converted to UTF-8. There should presumably be versions of the translation functions that take narrow strings, wide strings, and additional versions for the C++1x unicode literals once they are supported by compilers (I expect that to be very soon, at least for some compilers). If a wide string is specified, it will be assumed to be in UTF-16 or UTF-32 depending on sizeof(wchar_t), and converted to UTF-8. UTF-32 is generally undesirable, I imagine, but in practice should nonetheless work and using wide strings might be the best approach for code that needs to compile on both Windows and Linux. For the narrow version, if the compile-time narrow encoding is UTF-8, the conversion is a no-op. Otherwise, the conversion will have to be done. (The C++1x u8 literal version would naturally require no conversion also.)
The issue with making the narrow version automatically transcode the input from the narrow encoding to UTF-8 is that it is a compatibility issue with C++11 u8 literals. For some reason, there is no way in the type system to distinguish between normal narrow and u8 literals. In other words, if you ever make the translate() functions assume a narrow literal to be in the locale character set, you can't use u8 literals there anymore.
This is a problem with every interface that takes char * arguments, and isn't specific to this particular case. One solution is to use a different name, since it isn't possible to overload on type. Another solution specific to this case is for the user to specify an execution charset via preprocessor define of UTF-8, in which case no conversion will be done regardless, and the user should just make sure not to use it with non-UTF-8 narrow strings. [Aside: It seems that only MSVC users (or users that want to write code that is portable to MSVC) will bother to use the u8 prefix at all, since on GCC it by default has no effect, given the default execution charset of UTF-8.]
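[The narrow-literal ambiguity discussed above can be sketched outside C++. Below is a minimal Python illustration, not Boost.Locale code: cp1252 stands in for MSVC's system codepage, and the byte strings model what a compiler would store for a plain narrow literal versus a u8 literal. -- Ed.]

```python
# A message containing a non-ASCII character, used as a translation key.
msg = "café"

# On an MSVC-style system, a plain narrow literal is stored in the
# system codepage (here Windows-1252), not UTF-8.
narrow_cp1252 = msg.encode("cp1252")   # b'caf\xe9'

# A u8 literal (or GCC's default execution charset) stores UTF-8 bytes.
narrow_utf8 = msg.encode("utf-8")      # b'caf\xc3\xa9'

# C++ gives both byte strings the same type (const char*), so a
# translate() that unconditionally transcodes "system codepage -> UTF-8"
# round-trips the cp1252 input correctly...
assert narrow_cp1252.decode("cp1252") == msg

# ...but mangles input that was already UTF-8 (classic mojibake):
mojibake = narrow_utf8.decode("cp1252")
assert mojibake == "cafÃ©"
assert mojibake.encode("utf-8") != narrow_utf8
```

Because both byte strings arrive through the same pointer type, a transcoding translate() cannot detect and skip already-UTF-8 input, which is why the thread suggests either a differently named function or a user-declared execution charset.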

On Mon, Apr 25, 2011 at 4:31 PM, Jeremy Maitin-Shepard <jeremy@jeremyms.com> wrote:
having to deal with this by creating a translation from "ASCII English to appease translation system" to "Real English to display to users" would seem to be an unjustifiable additional burden.
I always recommend that the text in the code be written in "programmer speak" and then translated to English by the UI team. Programmers typically aren't good at usability, and they often end up changing the text in their code once QA or UX sees the wording of the messages presented to users. A request to change the wording of a message shouldn't be a code-file change; it should be a translation-file change. Now, whether that "programmer speak" is in English or some other language is another story - the main story of the other emails, I guess. But I just wanted to take this opportunity to encourage developers to get user-presented strings out of their code and into some other file that a separate UI team can maintain for you (even if that UI team is you today, it might not be in the future). Tony

On Sat, Apr 23, 2011 at 10:46 PM, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
The formal review of Artyom Beilis' Boost.Locale library, originally scheduled to run from April 7th through the 16th and extended through the 22nd, is now finished.
Fifteen people cast votes on it, ten for acceptance and five against. Though not overwhelming, the two-to-one majority clearly indicates the consensus of the list. As such:
The Boost.Locale library IS ACCEPTED into Boost.
The details follow, starting with the voters in favor, in the order their reviews were received:

* John Bytheway
* Steven Watanabe
* Sebastian Redl
* Fabio Fracassi ("possibly conditional")
* Noah Roberts
* Steve Bush
* Volker Lukas
* Paul A. Bristow
* Matus Chochlik
* Gevorg Voskanyan
Those opposed:

* Ryou Ezoe
* Phil Endecott (but "borderline")
* Edward Diener
* Mathias Gaunard
* Vincente BOTET ("count my vote as 1/2 or 1/4 vote")
There was also an early off-list review from Darren Cook which did not include a vote, only listing issues.
There was a great deal of discussion around the library, and a number of issues detailed. The major ones (and some less major ones that were repeated by several reviewers) are summarized below, along with the initials of the reviewer(s) who brought them up and Artyom's responses. Note that although I followed the discussions closely, these are only the issues brought up in the formal reviews themselves.
* Issue: date_time interface uses enums for periods, which is error prone and inconsistent with other date/time libraries. (JB)
* Response: Will be addressed.

* Issue: Reference documentation needs more detail, or some things are not clear; some terms used in the documentation are not defined or are defined too briefly; can't find the headers that items are defined in from the documentation. (JB, SW, FF, PE, ED, MC)
* Response: Will be addressed.

* Issue: There are no prev/next links in the documentation, or the tutorial can't easily be navigated. (DC, JB, SR, FF, MC)
* Response: Looking into it.

* Issue: Few examples include output. (JB)
* Response: Will be addressed.

* Issue: No examples in Asian languages; the library may have design flaws that are not apparent without them. (DC)
* Response: [Artyom has indicated to me (privately) that he's adding such examples. -- CN]

* Issue: The translation system requires narrow-character tags, making it English-centric. (RE, ED)
* Response: The library implements the most popular and most widely used message-catalog format. It is not perfect, but it is the best system currently available. There may be a work-around if you really must use wide-character languages under Windows.

* Issue: Support for wchar_t/UTF-16 is unclear. (RE, ED)
* Response: Wide characters are fully supported.

* Issue: Doesn't support the Win32 API as a backend. (RE)
* Response: A misunderstanding; the Win32 API backend is fully supported.

* Issue: boost::locale::format is not compatible with boost::format. (FF, NR)
* Response: boost::format is too limited for use in localization, and throws on any error, which means that a translator error could crash the program.

* Issue: boost::locale::date_time is not compatible with boost::date_time. (FF)
* Response: An unavoidable consequence of the differences between them, which are necessary due to support for locale independence and non-Gregorian calendars.

* Issue: Boundary analysis is only available when using ICU. (FF)
* Response: At present, only the ICU backend supports proper boundary analysis.

* Issue: Little documentation on the toolchain needed to extract strings and translate them, or the versions required. (FF, NR)
* Response: Will be addressed.

* Issue: Concerns about relying on GPL/LGPL-licensed tools, or their availability on all platforms, or recommendations to write a Boost version of these tools. (NR)
* Response: Reimplementing these is non-trivial and unnecessary; the licensing for these tools does not affect the programs developed with them. All are available for all platforms; will add explicit instructions for getting the latest versions for Windows.

* Issue: Use of strings instead of symbols for language and encoding makes run-time errors out of what could be compile-time errors. The most common ones should be symbols. (PE)
* Response: There are dozens of character encodings, and even more locales, and no way to determine which ones are the most common. Not all encodings are supported by all backends or OS configurations. Names ignore case and non-alphanumeric characters, which should minimize errors that could be generated from them. utf_to_utf transcoding will be added.

* Issue: Error handling (in conversions) is very basic. (PE)
* Response: An unavoidable limitation of the backends.

* Issue: Code could use more commenting. (PE, VL)
* Response: Noted; will be addressed in the future.

* Issue: Some documentation phrasing is confusing, or could use a native English speaker's input. (SR, VL)
* Response: Will be addressed as discovered.

* Issue: There are no lists of valid language, country, encoding, or variant strings. (ED)
* Response: Listed in the ISO-639 and ISO-3166 standards, which are referenced in the library's documentation. These standards are updated occasionally, and should be referred to directly for the latest information.

* Issue: Only works on contiguous, entirely-in-memory strings. (MG)
* Response: All current backends require this, and it satisfies the vast majority of use-cases.

* Issue: Boundary analysis goes through the entire string and returns a vector of positions. (MG)
* Response: Not perfect, but given the limitations of the existing backends, it is reasonable.

* Issue: The library's interface is not generic enough, or independent enough of the libraries that it wraps. (VB)
* Response: The interface is similar to that of every other i18n library, and should make as few assumptions as possible. It should not be changed.

* Issue: The date-time code should be merged into Boost.DateTime. (VB)
* Response: Date-time code is locale-dependent by its nature, and is more natural in Boost.Locale. Updating Boost.DateTime to do everything that Boost.Locale's library does would require a lot of work, and in all the time it has existed, only the Gregorian calendar has been implemented. There are Boost libraries that overlap others, so this is not a novelty.

-- Chad Nelson
Oak Circle Software, Inc.
* * *
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
I can't believe it is accepted. Only fifteen people explicitly stated their opinion, and there was only one native non-western language speaker: me.

I really hate to complain this much, but this library is totally useless for non-English speakers. The library author said it is important to use English, because it allows us to easily translate to other languages. Does it really matter if non-English people don't bother to use this library? Believe me, there are so many Boost users who don't know English at all. They are able to use the Boost libraries because there are enough translations. This English-dependent library cannot be used by these users. They CAN use English identifiers in the source file, like names of functions, classes, variables, etc. (badly, if you ask me). But text is written in their language. There is no way they use English text. The text in software is not written by programmers alone; non-programmers also write text. So this library requires English from everybody. If everybody can use English, there is no need for localization in the first place. That means, if everybody can use this English-dependent library, the purpose of this library, localization, is unnecessary.

I think I found another issue. Even English text cannot be expressed by ASCII alone; there are many characters ASCII doesn't have, such as ®. So this library needs to maintain two almost identical English texts: one for the real English text which contains non-ASCII characters, and one for the unique ASCII identifier that is an argument of translate().

But enough argument. I just want to assure you all, this library will never be used by non-English speakers. Blaming people for not using English doesn't work. They don't care. Just like you don't care about their language. -- Ryou Ezoe

Ryou Ezoe wrote:
I can't believe it is accepted. Only fifteen people explicitly stated their opinion, and there was only one native non-western language speaker: me.
What is your definition of a western and a non-western language?
I really hate to complain this much, but this library is totally useless for non-English speakers.
I understand the message translation module of Boost.Locale is not currently well-suited for use with non-ASCII keys. But what about other modules: number formatting, collation, boundary analysis, etc.? Do you consider them useless for non-English speakers too?
The library author said it is important to use English, because it allows us to easily translate to other languages. Does it really matter if non-English people don't bother to use this library?
Believe me, there are so many Boost users who don't know English at all. They are able to use the Boost libraries because there are enough translations.
This English-dependent library cannot be used by these users. They CAN use English identifiers in the source file, like names of functions, classes, variables, etc. (badly, if you ask me). But text is written in their language. There is no way they use English text. The text in software is not written by programmers alone; non-programmers also write text. So this library requires English from everybody.
If everybody can use English, there is no need for localization in the first place.
I can't follow the logic here. Even if all programmers knew English, there would still be a need for localization, because not all *users* necessarily know English. Unless all users of the software are programmers themselves.
That means, if everybody can use this English-dependent library, the purpose of this library, localization, is unnecessary.
Not true. Software localization is primarily aimed at users, not developers, as I already stressed above. Boost.Locale is not a localized library; it is a library which can help create localized software. And even when people know English, that doesn't mean they don't want software to speak their language. Moreover, localization is not all about the language; it's more about the local culture, and the conventions used within that culture. For example, I can read and understand English well. That means I can use software with English as the user interface language just fine, without any need for translation (though I'd prefer an Armenian translation if available, English will do for me otherwise). The same cannot be said about other cultural aspects. For example, if some software shows me distances in miles and masses in pounds, those values will not make any sense to me, and I'd be forced to convert them into kilometers and kilograms to understand how much they are. And of course I'll be angry and dissatisfied with that software. Localization is a much larger domain than just message text translation, so Boost.Locale needs to be evaluated in that whole context, rather than in just one specific (even if very important) aspect.
I think I found another issue. Even English text cannot be expressed by ASCII alone; there are many characters ASCII doesn't have, such as ®.
So this library needs to maintain two almost identical English texts: one for the real English text which contains non-ASCII characters, and one for the unique ASCII identifier that is an argument of translate().
That's a working approach. Do you have a proposal for another approach which could be implemented in Boost.Locale to make it better in that regard?
But enough of argument.
The arguments you made earlier have convinced me that using English keys is not always appropriate. Not only because programmers might not be able to write English, but also because the product being developed might not target an English-speaking audience at all, e.g. a product targeting the Chinese, Korean and Japanese markets only. In that case making an English translation would be just a waste of time and money. However, your arguments didn't convince me that Boost.Locale should be rejected on that ground, for the following reasons:

1. Using the C++11 u8 encoding prefix will resolve the problem.
2. The current message translation system is based on widely accepted existing practice, and works well in the majority of use cases.
3. I have not yet seen a truly working proposal from you which can address the problem properly in Boost.Locale using C++03 features only.
4. Boost.Locale is much more than just message translation. I can imagine many legitimate usages of Boost.Locale not involving message translation at all.
I just want to assure you all, this library will never be used by non-English speakers.
Not even when they have u8 prefix? Not even when they don't need message translation but other features of Boost.Locale?
Blaming people for not using English doesn't work. They don't care.
I don't think anyone was making such accusations in these discussions. There were only suggestions to use ASCII-based keys for now, as they are more portable.
Just like you don't care about their language.
If Artyom didn't care what language is being used by different users, he wouldn't have developed a localization library in the first place. He would rather declare "In order to use my software, you should talk to it in my language". And note that English is not Artyom's native language either.
Ryou Ezoe
Regards, Gevorg

On Mon, Apr 25, 2011 at 1:18 AM, Gevorg Voskanyan <v_gevorg@yahoo.com> wrote:
I understand the message translation module of Boost.Locale is not currently well-suited for use with non-ASCII keys. But what about other modules: number formatting, collation, boundary analysis, etc.? Do you consider them useless for non-English speakers too?
For Japanese, these features are all useless. Although I don't know Chinese and Korean, I think the following is also true for them.

Number and date formatting: There are so many possible ways to express numbers. Some people want comma separation every 3 digits, others want every 4 digits. Some want 1000000 to be 100万 (万 means 10000), some want 百万 (百 means 100). Formatting based on locale doesn't work because there is no uniform format.

Collation and conversions: Japanese doesn't have the concepts of case and accent. Since we don't have these concepts, we never need them.

Boundary analysis: What is the definition of a boundary, and how does it analyze? It sounds too smart for the small thing it actually does; I'd rather call it strtok with hard-coded delimiters. Japanese doesn't separate words with spaces, so unless we perform really complicated natural language processing (which can never be perfect, since there is no complete Japanese dictionary), we can't split Japanese text by words.

Also, Japanese doesn't have a concept of word wrap, so "find appropriate places for line breaks" is unnecessary. Actually, there are some rules for line breaks in Japanese, but these rules are too complicated and require more than text processing. The same goes for Chinese and Korean.

Of course, strtok is still a handy tool and I appreciate yet another design. But I think it's better handled by a more generic library, like Boost String Algorithms.

-- Ryou Ezoe

From: Ryou Ezoe <boostcpp@gmail.com>
Number and date formatting: There are so many possible ways to express numbers. Some people want comma separation every 3 digits, others want every 4 digits. Some want 1000000 to be 100万 (万 means 10000), some want 百万 (百 means 100). Formatting based on locale doesn't work because there is no uniform format.
Have you actually read the manuals? This is the output of:

    std::cout << bl::format("{1}\n{1,num}\n{1,spell}\n") % 1000000;

in the ja_JP.UTF-8 locale:

    1000000
    1,000,000
    百万

Not so bad, isn't it?
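[For readers without an ICU build at hand, the grouping step shown by "{1,num}" can be approximated in plain Python. This is only a rough sketch of the idea, not Boost.Locale's API; note that Python's "," format spec hard-codes Western 3-digit groups, whereas Boost.Locale takes the grouping from the locale, and the "{1,spell}" step (which produced 百万 above) relies on ICU's rule-based number formatter and has no stdlib equivalent. -- Ed.]

```python
n = 1000000

# "{1}" - plain, locale-independent output of the number.
assert f"{n}" == "1000000"

# "{1,num}" - locale-aware digit grouping; the ',' format spec shows
# the same idea with fixed 3-digit Western groups.
assert f"{n:,}" == "1,000,000"

# "{1,spell}" (spelled-out form such as 百万) needs ICU's rule-based
# number formatting; there is no standard-library counterpart here.
print(f"{n} -> {n:,}")
```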
Collation and conversions: Japanese doesn't have the concepts of case and accent. Since we don't have these concepts, we never need them.
Irrelevant; even when this feature is not required for CJK, it is still required, just like many other things (spaces, plural forms for other languages).
Boundary analysis: What is the definition of a boundary, and how does it analyze? It sounds too smart for the small thing it actually does; I'd rather call it strtok with hard-coded delimiters. Japanese doesn't separate words with spaces, so unless we perform really complicated natural language processing (which can never be perfect, since there is no complete Japanese dictionary), we can't split Japanese text by words.
OK, this is word splitting:

|私|は|日本|の|東京都|に|住|んでいます|。|私|は|大|きな|家|に|住|んでいます|。

of the text:

私は日本の東京都に住んでいます。私は大きな家に住んでいます。

I assume it is not perfect, and I don't know Japanese well enough to say, but I can at least see words like:

私 - I
日本 - Japan
東京都 - City of Tokyo

But this is clearly not just space-based separation. Also, for some languages like Thai, ICU uses dictionaries. So it is not a naive algorithm that separates text by spaces.
Also, Japanese doesn't have a concept of word wrap, so "find appropriate places for line breaks" is unnecessary. Actually, there are some rules for line breaks in Japanese, but these rules are too complicated and require more than text processing. The same goes for Chinese and Korean.
This is a possible line-break separation of the same sentences above:

|私|は|日|本|の|東|京|都|に|住|ん|で|い|ま|す。|私|は|大|き|な|家|に|住|ん|で|い|ま|す。|

At least I can see that it does not allow a line to start with "。".
Of course, strtok is still a handy tool and I appreciate yet another design. But I think it's better handled by a more generic library, like Boost String Algorithms.
It is far more complicated than strtok. Bottom line: I see that you hadn't really tried to use this library or understand how it works. I'm sorry, but it makes me doubt the review you sent. Artyom

On Mon, Apr 25, 2011 at 6:04 AM, Artyom <artyomtnk@yahoo.com> wrote:
From: Ryou Ezoe <boostcpp@gmail.com>
Number and date formatting: There are so many possible ways to express numbers. Some people want comma separation every 3 digits, others want every 4 digits. Some want 1000000 to be 100万 (万 means 10000), some want 百万 (百 means 100). Formatting based on locale doesn't work because there is no uniform format.
Have you actually read the manuals?
This is the output of :
std::cout << bl::format("{1}\n{1,num}\n{1,spell}\n") % 1000000 ;
in ja_JP.UTF-8 locale
1000000
1,000,000
百万
Not so bad, isn't it?

Not bad. Still, I doubt anybody wants to use Boost.Locale just for that.
Collation and conversions: Japanese doesn't have the concepts of case and accent. Since we don't have these concepts, we never need them.
Irrelevant; even when this feature is not required for CJK, it is still required, just like many other things (spaces, plural forms for other languages).
Boundary analysis: What is the definition of a boundary, and how does it analyze? It sounds too smart for the small thing it actually does; I'd rather call it strtok with hard-coded delimiters. Japanese doesn't separate words with spaces, so unless we perform really complicated natural language processing (which can never be perfect, since there is no complete Japanese dictionary), we can't split Japanese text by words.
Ok this is word splitting
|私|は|日本|の|東京都|に|住|んでいます|。|私|は|大|きな|家|に|住|んでいます|。
of the text:
私は日本の東京都に住んでいます。私は大きな家に住んでいます。
To me, it looks like splitting by contiguous kanas and kanjis. I don't think I'd ever need that kind of splitting.
I assume it is not perfect, and I don't know Japanese well enough to say, but I can at least see words like:
私 - I 日本 - Japan 東京都 - City of Tokyo
But this is clearly not just space-based separation. Also, for some languages like Thai, ICU uses dictionaries.
So it is not a naive algorithm that separates text by spaces.
Also, Japanese doesn't have a concept of word wrap, so "find appropriate places for line breaks" is unnecessary. Actually, there are some rules for line breaks in Japanese, but these rules are too complicated and require more than text processing. The same goes for Chinese and Korean.
This is a possible line-break separation of the same sentences above:
|私|は|日|本|の|東|京|都|に|住|ん|で|い|ま|す。|私|は|大|き|な|家|に|住|ん|で|い|ま|す。|
At least I can see that it does not allow a line to start with "。".
We have a lot of characters that should not be the initial character of a line, but there is no uniform rule. And it must work together with font rendering; simple text processing doesn't suffice.
Of course, strtok is still a handy tool and I appreciate yet another design. But I think it's better handled by a more generic library, like Boost String Algorithms.
It is far more complicated than strtok.
Bottom line: I see that you hadn't really tried to use this library or understand how it works.
I'm sorry, but it makes me doubt the review you sent.
Artyom
-- Ryou Ezoe

Hi, FYI, the document "Requirements for Japanese Text Layout" is published by the W3C[1]. The "CJKV book" by O'Reilly[2] is also a great book when considering i18n, covering the CJKV languages. [1] http://www.w3.org/TR/jlreq/ [2] http://oreilly.com/catalog/9780596514471 Best regards, -- Ryo IGARASHI, Ph.D. rigarash@gmail.com

Hello,

Boost.Locale uses ICU's implementation of the standard Unicode segmentation algorithm. The rationale behind it is that it is a well-known, debugged, high-quality implementation. Currently ICU is the best-known cross-platform generic library that implements the Unicode algorithms and localization facilities.

If it is not good enough and does not do the required thing, it is much easier to talk to the ICU people. There are already custom segmentation rules for Japanese. If they are not good enough, it is possible to talk to them, provide patches, or even talk to the Unicode consortium if the existing basic algorithm (not the locale-specific rules) does not lead to good results.

Also, this library does not deal directly with text layout, as that is out of the scope of this library.

Artyom

----- Original Message ----
From: Ryo IGARASHI <rigarash@gmail.com> To: boost@lists.boost.org Sent: Mon, April 25, 2011 9:02:44 AM Subject: Re: [boost] [locale] Review results for Boost.Locale library
Hi,
FYI, the document "Requirements for Japanese Text Layout" is published by W3C[1]. "CJKV book" by O'Reilly[2] is also a great book when you consider i18n which covers CJKV language.
[1] http://www.w3.org/TR/jlreq/ [2] http://oreilly.com/catalog/9780596514471
Best regards, -- Ryo IGARASHI, Ph.D. rigarash@gmail.com

Hi, Artyom, On Mon, Apr 25, 2011 at 8:49 PM, Artyom <artyomtnk@yahoo.com> wrote:
Boost.Locale uses ICU's implementation of the standard Unicode segmentation algorithm. The rationale behind it is that it is a well-known, debugged, high-quality implementation. Currently ICU is the best-known cross-platform generic library that implements the Unicode algorithms and localization facilities.
I agree with you that ICU is the best known library for supporting Unicode.
If it is not good enough and does not do the required thing, it is much easier to talk to the ICU people. There are already custom segmentation rules for Japanese. If they are not good enough, it is possible to talk to them, provide patches, or even talk to the Unicode consortium if the existing basic algorithm (not the locale-specific rules) does not lead to good results.
Also, this library does not deal directly with text layout, as that is out of the scope of this library.
Of course I know that text layout is out of the scope of your library. I just want to stress the fact that line breaking in some languages (Japanese in this case) is highly dependent upon text layout. For example, some 'line breaking forbidden' rules are relaxed in newspaper-like text layout situations, as shown in "Requirements for Japanese Text Layout"[1]. [1] http://www.w3.org/TR/jlreq/#en-subheading2_1_7 Best regards, -- Ryo IGARASHI, Ph.D. rigarash@gmail.com

On 04/25/2011 10:13 AM, Ryo IGARASHI wrote:
I just want to stress the fact that line breaking in some languages (Japanese in this case) is highly dependent upon text layout. For example, some 'line breaking forbidden' rules are relaxed in newspaper-like text layout situations, as shown in "Requirements for Japanese Text Layout"[1].
There is, of course, a long history of outsiders under-appreciating the subtleties of Asian writing systems. I see an alternate reading of many of these comments about CJK text processing: "it's too complicated for non-native speakers to understand, we won't use a general-purpose library for this anyway, you should probably just give up". But sentiments like this just encourage Boost people to dig into it deeper. If you guys keep talking about how impossible it is, we'll end up with a library for doing CJK processing fully at compile time using template metaprogramming. :-) - Marsh

25.04.2011 0:01, Ryou Ezoe wrote:
Collation and conversions: Japanese doesn't have the concepts of case and accent. Since we don't have these concepts, we never need them.

You need them if you want to sell your software on foreign markets. Otherwise, you don't need a localization library at all.
Also, Japanese doesn't have a concept of word wrap. So "find appropriate places for line breaks" is unnecessary.

It is unnecessa
ry for you but m
ay be important for the foreign users of your so
ftware. :-)
-- Sergey Cheban

Ryou Ezoe wrote:
Collation and Conversions: Japanese doesn't have concepts of case and accent. Since we don't have these concepts, we never need it.
OK, that tells case conversions and normalization don't apply to Japanese. But what about collation? Isn't there any "dictionary order" defined for Japanese words? Just curious. Thanks, Gevorg

Hi, Gevorg, On Mon, Apr 25, 2011 at 11:06 PM, Gevorg Voskanyan <v_gevorg@yahoo.com> wrote:
OK, that tells case conversions and normalization don't apply to Japanese. But what about collation? Isn't there any "dictionary order" defined for Japanese words? Just curious.
"Dictionary order" depends on what kind of information in the dictionary. For example, we use complex sorting algorithm for 'Kanji' letter dictionary. However, for language dictionary (Japanese-Japanese dictionary), we use pronunciation order. But this is impossible to decide by program since each 'Kanji' letter have usually 3-4 (sometimes more) completely different pronunciation only to be decided by the context in principle. Just FYI. Best regards, -- Ryo IGARASHI, Ph.D. rigarash@gmail.com

From: Ryo IGARASHI <rigarash@gmail.com> On Mon, Apr 25, 2011 at 11:06 PM, Gevorg Voskanyan <v_gevorg@yahoo.com> wrote:
OK, that tells case conversions and normalization don't apply to Japanese. But what about collation? Isn't there any "dictionary order" defined for Japanese words? Just curious.
"Dictionary order" depends on what kind of information in the dictionary. For example, we use complex sorting algorithm for 'Kanji' letter dictionary.
However, for language dictionaries (Japanese-Japanese dictionaries), we use pronunciation order. But this is impossible to decide by program, since each 'Kanji' letter usually has 3-4 (sometimes more) completely different pronunciations, which in principle can only be decided from context.
Just FYI.
These are the sizes of the collation rules for different languages in ICU 4.4, by size (top 5):

630641 2010-04-28 18:28 zh.txt
439431 2010-04-28 18:28 ko.txt
438456 2010-04-28 18:28 ja.txt
23851 2010-04-28 18:28 kn.txt
23594 2010-04-28 18:28 bn.txt

I've looked into the ja.txt file, and it includes a huge dictionary of Kanji letters sorted by their order. I can't check it on my own, but I assume that the collation rules for Japanese are not that simple.

Also, there are customization parameters for collation in locale names, like ja_JP.UTF-8@collation=unihan. These keywords are taken from http://www.unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers :

"big5han" - Pinyin ordering for Latin, big5 charset ordering for CJK characters (used in Chinese)
"dict" (dictionary) - For a dictionary-style ordering (such as in Sinhala)
"direct" - Hindi variant
"gb2312" (gb2312han) - Pinyin ordering for Latin, gb2312han charset ordering for CJK characters (used in Chinese)
"phonebk" (phonebook) - For a phonebook-style ordering (such as in German)
"phonetic" - Requests a phonetic variant if available, where text is sorted based on pronunciation. It may interleave different scripts, if multiple scripts are in common use.
"pinyin" - Pinyin ordering for Latin and for CJK characters; that is, an ordering for CJK characters based on a character-by-character transliteration into pinyin (used in Chinese)
"reformed" - Reformed collation (such as in Swedish)
"search" - A special collation type dedicated to string search
"stroke" - Pinyin ordering for Latin, stroke order for CJK characters (used in Chinese)
"trad" (traditional) - For a traditional-style ordering (such as in Spanish)
"unihan" - Pinyin ordering for Latin, Unihan radical-stroke ordering for CJK characters (used in Chinese)

So I can't check, but I can assume it does something right...

Artyom

On Mon, Apr 25, 2011 at 11:06 PM, Gevorg Voskanyan <v_gevorg@yahoo.com> wrote:
Ryou Ezoe wrote:
Collation and Conversions: Japanese doesn't have concepts of case and accent. Since we don't have these concepts, we never need it.
OK, that tells case conversions and normalization don't apply to Japanese. But what about collation? Isn't there any "dictionary order" defined for Japanese words? Just curious.
As Ryo IGARASHI said, we have multiple pronunciations for each kanji, so sorting by pronunciation is just not possible. In the end, we just sort by internal code point. Sorting by code point is not the best solution, but at least it's consistent if we use one encoding.
Thanks, Gevorg
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
-- Ryou Ezoe

From: Ryou Ezoe <boostcpp@gmail.com>
Sort by code point is not the best solution. But at least, it's consistent if we use one encoding.
No, it is not: UCS encodings have different orders in different representations. UTF-8 and UTF-32 order is consistent, i.e. for all a, b: utf8(a) < utf8(b) iff utf32(a) < utf32(b). However, this does not hold for UTF-16, where code points outside of the BMP have a different ordering, i.e. it may be that codepoint(a) > codepoint(b) but UTF-16(a) sorts before UTF-16(b).

Artyom
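The UTF-16 anomaly Artyom describes is easy to check directly. The sketch below (Python, illustrative only, not from the thread) compares a BMP character against a supplementary-plane character under the three encodings:

```python
# U+FF5E (a BMP character) vs. U+10000 (outside the BMP)
a, b = "\uff5e", "\U00010000"

# Code point order: U+FF5E < U+10000
assert ord(a) < ord(b)

# UTF-8 (and UTF-32) byte order agrees with code point order
assert a.encode("utf-8") < b.encode("utf-8")

# UTF-16 code unit order disagrees: U+10000 is encoded as the
# surrogate pair D800 DC00, whose lead unit sorts before the
# single code unit FF5E
assert b.encode("utf-16-be") < a.encode("utf-16-be")
```

So a sort that compares raw UTF-16 code units places supplementary characters before the BMP range U+E000..U+FFFF, unlike a sort over decoded code points.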

On Monday, April 25, 2011, Artyom wrote:
From: Ryou Ezoe <boostcpp@gmail.com>
Sort by code point is not the best solution. But at least, it's consistent if we use one encoding.
No it is not, UCS encoding has different order in different representations:
UTF-8 and UTF-32 order is consistent i.e.
Sorry if I'm adding to the pedantry, but is UCS an "encoding"? My completely non-expert impression was that UCS was a character set, and UTF-8/16/32, etc. were different encodings of that character set?

On Tue, Apr 26, 2011 at 4:12 AM, Frank Mori Hess <frank.hess@nist.gov> wrote:
On Monday, April 25, 2011, Artyom wrote:
From: Ryou Ezoe <boostcpp@gmail.com>
Sort by code point is not the best solution. But at least, it's consistent if we use one encoding.
No it is not, UCS encoding has different order in different representations:
UTF-8 and UTF-32 order is consistent i.e.
Sorry if I'm adding to the pedantry, but Is UCS an "encoding"? My completely non-expert impression was that UCS was a character set, and UTF-8/16/32, etc. were different encodings of that character set?
Sorry. I should say encodings of UCS. -- Ryou Ezoe

On Tue, Apr 26, 2011 at 3:55 AM, Artyom <artyomtnk@yahoo.com> wrote:
From: Ryou Ezoe <boostcpp@gmail.com>
Sort by code point is not the best solution. But at least, it's consistent if we use one encoding.
No it is not, UCS encoding has different order in different representations:
UTF-8 and UTF-32 order is consistent i.e.
for each a,b in utf8(a) < utf8(b) iff utf32(a) < utf32(b)
However, this does not hold for UTF-16, where code points outside of the BMP have a different ordering, i.e.
It may be that codepoint (a) > codepoint(b) but UTF-16(a) sorted before UTF-16(b)
What do you mean? No matter which UTF you use, the code point is the same. You can't compare UTF-8 strings by comparing each octet.
Artyom
-- Ryou Ezoe

What do you mean? No matter which UTF you use, the code point is the same. You can't compare UTF-8 strings by comparing each octet.
Since the discussion is about sorting on a pure code point basis, it is worth noting that sorting UTF-8 strings on an octet basis *is* actually identical to sorting on their decoded code points.
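Gevorg's observation reflects a deliberate design property of UTF-8 (lexicographic byte order preserves code point order), and it can be verified empirically. A small Python sketch, illustrative only and not from the thread:

```python
# A mix of ASCII, Latin-1, CJK, BMP, and supplementary-plane characters
chars = ["\u0041", "\u00e9", "\u4e2d", "\ufffd", "\U00010000", "\U0001f600"]

# Sort once by raw UTF-8 octets, once by decoded code point
by_octets = sorted(chars, key=lambda s: s.encode("utf-8"))
by_codepoints = sorted(chars, key=ord)

# The two orders coincide
assert by_octets == by_codepoints
```

This is why, as noted later in the thread, comparing UTF-8 strings at the octet level is both correct (for code point order) and more efficient than decoding first.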

On 25/04/2011 21:50, Ryou Ezoe wrote:
On Tue, Apr 26, 2011 at 3:55 AM, Artyom<artyomtnk@yahoo.com> wrote:
From: Ryou Ezoe<boostcpp@gmail.com>
Sort by code point is not the best solution. But at least, it's consistent if we use one encoding.
No it is not, UCS encoding has different order in different representations:
UTF-8 and UTF-32 order is consistent i.e.
for each a,b in utf8(a)< utf8(b) iff utf32(a)< utf32(b)
However, this does not hold for UTF-16, where code points outside of the BMP have a different ordering, i.e.
It may be that codepoint (a)> codepoint(b) but UTF-16(a) sorted before UTF-16(b)
What do you mean? No matter which UTF you use, the code point is the same. You can't compare UTF-8 strings by comparing each octet.
Actually, you can. And you should actually do it at the octet level for efficiency.

On 24/04/2011 22:01, Ryou Ezoe wrote:
Collation and Conversions: Japanese doesn't have the concepts of case and accent. Since we don't have these concepts, we never need them.
I believe all CJK characters can be decomposed into radicals, which are equivalent, so you might want to do normalization. Also, converting between halfwidth and fullwidth katakana could have some uses.
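The halfwidth/fullwidth conversion Mathias mentions is part of Unicode compatibility normalization (NFKC folds halfwidth katakana into their fullwidth equivalents). A quick illustration using Python's stdlib unicodedata (chosen here only for demonstration; the thread's context is ICU, which implements the same normalization forms):

```python
import unicodedata

# Halfwidth katakana ｶﾀｶﾅ (U+FF76 U+FF80 U+FF76 U+FF85)
half = "\uff76\uff80\uff76\uff85"

# NFKC maps each halfwidth form to its fullwidth equivalent
full = unicodedata.normalize("NFKC", half)

# Fullwidth katakana カタカナ (U+30AB U+30BF U+30AB U+30CA)
assert full == "\u30ab\u30bf\u30ab\u30ca"
```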
Boundary analysis: What is the definition of a boundary, and how does it analyse? It sounds too smart for the small things it actually does.
It uses the boundary analysis algorithms defined by the Unicode standard, which don't use heuristics or anything like that. Remember, Boost.Locale is just a wrapper around ICU, which is the really smart library.
I'd rather call it strtok with hard-coded delimiters. Japanese doesn't separate words with spaces, so unless we perform really complicated natural language processing (which can never be perfect, since we will never have a complete Japanese dictionary), we can't split Japanese text into words. Also, Japanese doesn't have a concept of word wrap, so "find appropriate places for line breaks" is unnecessary. Actually, there are some rules for line breaks in Japanese.
You can still break at punctuation marks, and there are places where you should definitely not break. Thai, Lao, Chinese and Japanese do require the use of dictionaries or heuristics to correctly distinguish words. However, the default algorithm provided by Unicode still gives a best-effort implementation without those things.
participants (18)

- Artyom
- Chad Nelson
- Frank Mori Hess
- Gevorg Voskanyan
- Gottlob Frege
- Jeremy Maitin-Shepard
- John Bytheway
- Marsh Ray
- Mathias Gaunard
- Matthew Chambers
- Ryo IGARASHI
- Ryou Ezoe
- Sebastian Redl
- Sergey Cheban
- Steve Bush
- Steven Watanabe
- Stewart, Robert
- Vladimir Prus