[locale] Review of the proposed Boost.Locale library

16 Apr 2011

      Hello,

this is my review of the Locale library proposed for inclusion in 
Boost.

I vote to accept the proposed Boost.Locale as a Boost library.

- What is your evaluation of the design?
The library is nicely structured into individual components, as far as 
I can infer from the documentation and having a glance at some of the 
source files.

I like that it is possible to use different backends, utilizing the 
large ICU library when available.

- What is your evaluation of the implementation?
I was having only a superficial look at some of the source files, so I 
can not comment much. What I have seen looks clean to me. While 
reading the source code I think I could benefit from even more code 
comments. There are a lot already, but on the other hand there exists 
for example a source file libs/locale/src/shared/mo_lambda.cpp 
apparently containing a whole state machine or something without any 
word how it is used. I also stumbled over an odd comment in file 
libs/locale/src/std/numeric.cpp in line 163: "// workaround common bug 
- substitute NBSP with ordinary space". As a reader who did not 
already write the code I can not infer much from comments such as this 
one - I mean "What is the common bug then?" If this comment refers to 
the situation that e.g. the locale implementation of GCCs standard 
library implementation generates invalid UTF-8 when using NBSP as 
thousands seperator, then the comment should say that. Well, my point 
about comments is a minor issue of course. I do not suggest that the 
author goes over all the code again just to add or change comments.

I am a bit worried over the size of the libraries binary. On GNU/Linux 
x86_64 the built libboost_locale.so is almost 10MB large and codesize 
according to the "size" utility is almost 1MB. This comes in addition 
to the underlying ICU library. Is there any potential for reducing 
codesize? Due to the same concern about codesize I have also one 
concrete suggestion relating to the comparator template of the 
collation component. I suggest to not make the default comparision 
level a template parameter, instead to store it as a member. This way, 
when instantiating container templates the number of instantiations 
can be kept down. Or am I mistaken in thinking this would lead to more 
compact code? 

- What is your evaluation of the documentation?
It is nicely structured. I could easily find the documentation for any 
specific component I was using and the documentation is also very 
helpful if you have a task at hand and want to find out which component 
to use (and whether such a component is available in the proposed 
Boost.Locale).

There are some issues I have:
Phrasing is sometime odd. For example right in the beginning, under 
the headline "What is Boost.Locale" there is a sentence which reads 
"C++ offers a very good base for localization via the existing C++ 
locale facets std::num_put, std::ctype, std::collate etc. But these 
are very limited and sometimes buggy by design." I can only guess what 
that means. Superficially it seems like a contradiction to me, 
something which is excellent can not be buggy by design, not even 
sometimes. Okay, minor issue again, it just stands out because it is
only the fourth or fifth sentence or so.

There are some typos (I guess) where it is important to be correct:
Under "Messages Formatting (Translation)", subheading "Indirect 
Message Translation" all functions are documented as "The source text 
is not copied.". From the emphasis in the original document I think I 
can conclude that some of these functions copy and some do not.

The reference section of the documentation should document the 
requirements on the CharType which many functions/classes are 
parameterized upon (can it be char/wchar_t only?).

The reference documentation for token_iterator and break_iterator 
should not have individual entries for operator++, operator--, 
operator== and other operators which are always required for 
iterators, unless the documentation adds information. Instead, it 
should just be stated to which iterator concepts these templates 
conform.

Documentation for the dereferening operators (operator*) of these 
templates states that the underlying sequences iterated over can not 
be modified with these operators. As this is the case, could a usage 
hint be incorporated stating that users should always use 
const_iterators as arguments to these templates? If a user sometimes 
employs for example std::string::iterator and sometimes 
std::string::const_iterator this would lead to unnecessary code bloat.
Also, operator-> is missing from these templates.  

- What is your evaluation of the potential usefulness of the library?
It is potentially very useful. The only objection I could conceive of 
is that ICU already provides much of the functionality of
Boost.Locale. This library also duplicates functionality of Matthew 
Wilsons excellent Fastformat library 
(http://fastformat.sourceforge.net/), though the latter does not 
feature localization support as complete as that of the proposed 
Boost.Locale.

- Did you try to use the library?  With what compiler?  Did you have 
any problems?
I used the CMake way to build this library on an Opensuse system on 
x86_64. Compiler is GCC 4.5.1 (or a Svn snapshot roughly
corresponding). The Boost version built against is 1.46.1, as
delivered by Opensuse. I did not encounter any problems building the 
library.

I played around with some of the components of the library and ran the 
supplied test cases. There is one inconsistency I noted, the following 
program generates output "Currency: $100.11" when the environment 
variable LANG is set to "en_US.UTF-8" and variable LC_MONETARY is set 
to "de_DE.UTF-8". Shouldn't the library honor LC_MONETARY? If LANG is 
set to "de_DE.UTF-8" "Currency: 100,11 €" is put out.
Program:
-------------------------------------------------
#include <ios>
#include <iostream>
#include <ostream>
#include <locale>

#include "boost/locale.hpp"

int main() {
  namespace loc = boost::locale;
  using std::cout;

  loc::generator gen;
  cout.imbue(gen(""));

  cout << "Currency: " << loc::as::currency << 100.11 << '\n';
}
-------------------------------------------------

- How much effort did you put into your evaluation? A glance? A quick 
reading? In-depth study?
Reading the tuturial section of the Documentation. Glancing over the 
reference section. I also had a look at some of the source files, but 
only superficially. 

- Are you knowledgeable about the problem domain?
Only marginally.

With best regards,
Volker Lukas

Volker Lukas

Artyom

tags

participants (2)