[boost] [locale] Review of Boost.Locale library

17 Apr 2011

      This is my review of the Boost Locale library. It is divided into two 
parts, a review of the documentation and a review of the library itself.
I will refer to Boost Locale as just Locale, with a capital L, for the 
remainder of this review. I will use the term C++ locale to refer to the 
C++ standard library locale implementation.

1. Documentation

The layout of the main page is decent, but I would have expected a 
discussion there, or as a first topic, of what Locale brings that the 
C++ standard locale does not have. I was disappointed not to find such a 
discussion.

The documentation topics are specified as Tutorials. I do not think they 
are Tutorials, which is fine with me since I much prefer topics rather 
than examples when trying to understand a library.

a. Introduction to C++ Standard Library localization support

The common critical problems of the C++ locale portion of the standard 
library seem spurious to me. The problems mentioned are really about 
implementations or programmer usage, not the C++ locale library itself. 
The only valid problem I find mentioned there is that the C++ standard 
library did not attempt to standardize any locale names. This makes 
using C++ locales based on locale names non-portable.
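
To illustrate the portability point, here is a minimal sketch using only 
the standard library. The helper name locale_name_is_accepted is 
hypothetical. The only name the standard guarantees is "C"; names such as 
"en_US.UTF-8" (POSIX style) or "English_United States.1252" (Windows 
style) are accepted on some platforms and rejected on others.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if this implementation accepts `name`
// as a locale name. The std::locale constructor throws std::runtime_error
// when the name is not recognized.
bool locale_name_is_accepted(const std::string& name) {
    try {
        std::locale loc(name.c_str());
        (void)loc;  // only the success of construction matters here
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Calling this with "C" succeeds everywhere; the result for any other name 
depends on the platform and on which locales are installed.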

Unfortunately the issues there make a very weak argument for the Locale 
library itself.

b. Locale generation

I would have liked it if the doc here specified where one finds valid 
lists of language, country, encoding, and variant which make up a locale 
name. Without this information, the one valid problem mentioned 
regarding C++ locale is also a problem with Locale.

The note about wide strings and 8-bit encoding makes no sense to me at 
all. If I am using a wide string encoding, why would I not be using wide 
string iostreams ?

c. Collation

There is no explanation of what 'collation' is. This is very 
disappointing, as it makes the rest of the discussion difficult to follow.
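
For reference, collation is the locale-dependent ordering of strings. A 
minimal sketch with the standard std::collate facet shows the idea; it 
uses no Locale classes, and the helper name collate_compare is 
hypothetical.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: compares two strings with the collate facet of `loc`.
// Returns a negative value if a sorts before b, zero if they are
// equivalent, and a positive value if a sorts after b.
int collate_compare(const std::string& a, const std::string& b,
                    const std::locale& loc) {
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size());
}
```

In the classic "C" locale the comparison is by character code, so "Zebra" 
sorts before "apple". A language-aware locale would normally be expected 
to order them the other way round.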

The examples were worthless to me since the classes involved have not 
been mentioned or discussed. Also the examples are woefully incomplete 
even in what they represent.

This is one reason why I dislike documentation which attempts to teach 
by example. It always seems to assume that throwing examples at the 
reader, before anything about the classes/templates in the examples has 
been mentioned, is somehow an effective way of learning a 
library. Instead it just creates confusion and unfortunately serves as a 
way for a library implementer to avoid explaining how the 
classes in his library actually work or relate to each other.

d. Conversions

"You may notice that there are existing functions to_upper and to_lower 
in the Boost.StringAlgo library. The difference is that these function 
operate over an entire string instead of performing incorrect 
character-by-character conversions."

In the second sentence, "these function" grammatically refers to the 
functions in Boost.StringAlgo, but I doubt that is what is meant.

I do not understand how these conversion functions use a locale. The 
example gives: boost::locale::to_upper(gruben) used in a stream. Is this 
function using the locale imbued in the iostream ?
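
The two possibilities I have in mind can be sketched with the standard 
library alone. Both helper names are hypothetical, and this is not a claim 
about how Locale implements to_upper. The conversion here is the 
per-character kind that the quoted documentation calls incorrect; it is 
only meant to show where the locale comes from.

```cpp
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: the caller passes the locale explicitly.
std::string upper_with(const std::string& text, const std::locale& loc) {
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    std::string result(text);
    for (std::string::size_type i = 0; i < result.size(); ++i)
        result[i] = ct.toupper(result[i]);  // per-character conversion
    return result;
}

// Hypothetical helper: the locale is taken from the stream being written to.
void write_upper(std::ostream& out, const std::string& text) {
    out << upper_with(text, out.getloc());
}
```

With either form the reader of the code can see which locale is in effect, 
which is the information I could not find in the documentation.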

Again this is what happens when one creates examples without first 
explaining topics in a rational and orderly way.

e. Numbers, Time and Currency formatting and parsing

A bunch of ICU flags are mentioned but with no indication about how 
these are supposed to be used by iostreams. These flags look like they 
are supposed to be used by C-like format printf statements but since 
Locale uses iostreams I can not understand their purpose with Locale.
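
For comparison, standard iostreams set formatting options through 
manipulators that change the state of the stream, not through a 
printf-style format string. A minimal sketch, with a hypothetical helper 
name and no Locale flags involved:

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: formats `value` with a fixed number of decimals.
// std::fixed and std::setprecision modify the stream state; the imbued
// locale decides details such as the decimal point character.
std::string format_fixed(double value, int digits) {
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out << std::fixed << std::setprecision(digits) << value;
    return out.str();
}
```

If the ICU flags are meant to be used in a similar way, the documentation 
should show it.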

f. Messages Formatting (Translation)

GNU gettext should be explained where it interfaces with Locale. Just 
telling someone to learn GNU gettext is not adequate. Other than that 
the explanation is pretty thorough.

g. Character Set Conversions

An explanation of what character sets are, and what character set 
conversions entail, should be the beginning of this documentation.
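
As an indication of the kind of introduction I mean: a character set 
conversion re-encodes the same characters as different bytes. A minimal 
sketch for one simple case, ISO-8859-1 (Latin-1) to UTF-8, follows. The 
function name is hypothetical and this does not use Locale's conversion 
functions.

```cpp
#include <string>

// Hypothetical helper: converts ISO-8859-1 (Latin-1) bytes to UTF-8.
// Every Latin-1 byte is the code point of its character, so bytes below
// 0x80 are copied unchanged and the rest become two-byte UTF-8 sequences.
std::string latin1_to_utf8(const std::string& latin1) {
    std::string utf8;
    for (std::string::size_type i = 0; i < latin1.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(latin1[i]);
        if (c < 0x80) {
            utf8 += static_cast<char>(c);
        } else {
            utf8 += static_cast<char>(0xC0 | (c >> 6));    // leading byte
            utf8 += static_cast<char>(0x80 | (c & 0x3F));  // continuation byte
        }
    }
    return utf8;
}
```

For example, the single Latin-1 byte 0xFC (u with umlaut) becomes the two 
UTF-8 bytes 0xC3 0xBC.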

h. Localized Text Formatting

"Each format specifier is enclosed within {} brackets.."

These are not "brackets" but "braces". Brackets are '[]'.

i. In general

It is confusing to me how generated locales affect the functionality of 
the different sections presented under 'Using Boost.Locale'. In a number 
of situations I am looking at classes or functions and I have no idea 
how these pick up a locale. I do understand that when used with 
iostreams the locale is determined by the locale imbued in the iostream. 
But outside of iostreams I do not understand from the documentation what 
locale is being used. If it is the C++ global locale, the documentation 
should say so. This entire issue about how locales are actually being 
used in various parts of the library should be explained as part of an 
overall explanation of the library. I find the lack of such an overall 
explanation of the library the major flaw in the documentation.
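
For reference, this is how the standard library itself behaves: a default 
constructed std::locale is a copy of the current global locale, and a 
stream starts with a copy of the global locale until another one is 
imbued. A minimal sketch with hypothetical helper names:

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: name of the locale a freshly constructed stream
// starts with, which is a copy of the global locale at that moment.
std::string default_stream_locale_name() {
    std::ostringstream out;
    return out.getloc().name();
}

// Hypothetical helper: name of the stream's locale after an explicit imbue.
std::string imbued_stream_locale_name(const std::locale& loc) {
    std::ostringstream out;
    out.imbue(loc);
    return out.getloc().name();
}
```

Whether the Locale functions that take no stream follow the same rule is 
exactly what the documentation should state.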

The documentation itself is well-ordered and the explanations generally 
decent. But I find it next to impossible to understand a library when 
the documentation does not take the time to explain concepts/topics and 
instead substitutes examples as a means of understanding a library.

2. Library itself

I did not look at the code itself and have no interest in critiquing the 
source. Others do this much better than I ever can.

The library offers a great deal of functionality and a great positive of 
the library is that it works with multiple backends and brings those 
backends into the C++ world.

In general I think that using the global locale is a bad programming 
practice when one specifically intends to work with locales. 
Unfortunately it was hard for me to understand how individual locales 
are used with each of the parts of the library from the documentation. 
But I will assume for the time being, because it seems the only correct 
design, that each part of the library which is documented can work with 
some non-global locale which is created and passed around as necessary.

The only design flaw which I could discover in the library was in 
message translation. The fact that translation always begins from English 
( or perhaps some other narrow character language ) to something else is 
horrendous. I can understand that the Locale implementer wanted to use 
something popular and that already exists, but an idea so tremendously 
flawed in its conception either needs to be changed, if possible, or 
discarded for something better. I do understand that translation is just 
one part of this large library, but I hope that the implementer 
understands how ridiculous it is to assume that non-English programmers 
are going to be willing to translate from English to language X rather 
than from their own language to language X.
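
To make the objection concrete, here is a toy sketch of a gettext-style 
lookup. ToyCatalog is hypothetical; it is neither GNU gettext nor Locale's 
implementation. The point is that the string written in the source code is 
itself the lookup key, and it is what the user sees when no translation 
exists.

```cpp
#include <map>
#include <string>

// Hypothetical toy catalog for one target language.
class ToyCatalog {
public:
    // Registers a translation for the string as it appears in the source.
    void add(const std::string& source, const std::string& translated) {
        entries_[source] = translated;
    }

    // Returns the translation, or the source string itself if none exists.
    std::string translate(const std::string& source) const {
        std::map<std::string, std::string>::const_iterator it =
            entries_.find(source);
        return it == entries_.end() ? source : it->second;
    }

private:
    std::map<std::string, std::string> entries_;
};
```

With such a scheme, whatever language the source strings are written in 
becomes both the key and the fallback, which is why restricting that 
language to English matters so much.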

I am assuming that all other parts of the library support both narrow 
character encodings and wide character encodings fully, and that at 
least UTF-16 is always supported using wide characters. It was really 
hard for me to make out from the docs whether or not this was the case.

I believe a great deal of work was put into the library, and that this 
work is invaluable in bringing locale usage into C++ in a better way 
than it is currently supported in C++ locale.

But I would have to vote "No" on accepting the library into 
Boost at the current time, with some provisos which, if met, would most 
likely change my vote to "Yes" in the future.

1) The documentation should explain the differences and improvements of 
Locale over C++ locale in a good general way.

2) The documentation should explain the use of locales for each of the 
topics.

3) A number of topics should discuss what they are about in general, and 
more time should be given to discuss how the classes/templates relate to 
each other.

4) Message translation should be reconsidered. I don't mean to say that 
the way it is done is enough to have the library rejected, but I can not 
believe that it is not a flawed system no matter what the popularity of 
GNU gettext might be.

My major trouble with the library, which has led to my "No" vote, is 
that I can not really understand how to use the library from the 
documentation or what the library really offers over and above C++ 
locales. I realize the library programmer may not himself be a native 
English speaker, but if I can not really understand a library from its 
documentation in a way that makes sense to me I can not vote for its 
inclusion into Boost. I strongly suspect that if I were to understand 
the functionality of the library through a more rigorous explanation of 
its topics, and each topic's relationship to a locale class and various 
encodings, I might well vote for its inclusion into Boost. But for now, 
and in the state in which the documentation stands for me, I can not do 
so. So I hope this review will be understood at least partially as a 
request to improve the docs as much as anything else.

Edward Diener