Re: [boost] [locale] Formal review of Boost.Locale library starts tomorrow

8 Apr 2011

      ...

...
- I  get the impression that Artyom has been using the library
extensively and it  is better for that experience.
It is actually integrated part of CppCMS project.

For example this wiki (written in C++/CppCMS)

  http://art-blog.no-ip.info/wikipp/en/page/main

Is localized using Boost.Locale and it uses collation,
date/time formatting message translation and multiple
locales in same process.

So yes, I definitely use it all the time :-).
...
Only one worry presented itself to me as I  read the docs.  Consider the
following example:
date_time  some_point = period::year * 1995 + period::january +
period::day*1;
It  looks like a period can be implicitly cast to a date_time.  This
seems  dangerous.  I feel the start of 1995 should be obtained more
explicitly  through the calendar.
Actually date_time by default initialized to current time. So if you
set a year 1995 it would be today's date and time back into 1995.

I prefer not to use explicit cast as it makes life much easier
for users and also 

  period::year * 1995 + period::january + period::day*1

Is a periods set has its own type that are set later together.

Also small notices period::january is alias for period::month*0 and
has different type then period::month.
...
The prose of the documentation is very  good, and the tutorials do a good
job of introducing the various  aspects.
I did not look at the reference in detail, but I am more  concerned that
it does not offer the necessary level of detail.   Consider for example
the documentation for  normalize
(<http://cppcms.sourceforge.net/boost_locale/html/group__convert.html#ga11672e3cc3ed7eecf1dd07060265aab2>).
.
  I went to the reference for this because the  tutorial
(<http://cppcms.sourceforge.net/boost_locale/html/conversions.html>)  did
not explain.  I see that it performs Unicode normalization, but  there is
no hint as to what value I may pass as the 'n' parameter.   Where are
these documented?  This is a rhetorical question; I know the  answer is
"at the top of the page", but a link would be useful.
Strange that Doxygen didn't inserted like to norm_type automatically. I'll
add it and try to look over similar places.
...
There  are many terms used in the documentation which may not be familiar
to all  readers.  e.g. what it means to "fold case" and what the
definitions of  the various normalization forms, collation levels, etc.
are.  Some terms  are covered in the glossary, but not all and only
briefly.  There are  some links to Wikipedia articles, but not many.  I
suggest that links to  other resources such as more Wikipedia articles,
Unicode Consortium  documents, etc., especially in the glossary or in a
new "references" appendix  but also wherever else appropriate (perhaps
linking via a list in the  appendix).
Good point, I'll address it.
...
I second the request for previous/next links.  It's hard  to browse
without them.
Ok, I'll see how to do this, unfortunately Doxygen does not know
to do them automatically, but at the top you always have
a link upper and you can go forward.
...
There are many code examples, most of which  involve writing something to
stdout, but few of them include examples of  possible output.  It would
be extremely helpful to add example output  (preferably more than one
example where the output can vary by locale) to  more of the examples.
Ok, I'll try to address this where I can.
...
Assorted minor issues:
In "Locale  Generation":
- You don't mention what character delimits multiple  "variant" options
in a locale identifier.  Is this even possible?   This suggests so, but
later the boost::local::info::variant function implies  that it is not
possible.
Actually variant support vary by backend. Unix supports
predefined set like @latin or @euro and ICU handles them
well. Windows backends does not recognize them and
ICU provides its own syntax. Like

   calendar=hebrew;collation=phonetic

I'll add these examples.
...
In "Conversions"
- You should mention  at least briefly what fold_case and normalize do.
Good point.
...
In "Messages  formatting":
- There are four "The source text is not copied." with  different
emphasis.  I think the second and fourth should read "The  source text is
copied.".  In that same paragraph the formatting is wrong  near "Plural
form translation: Translate a message"; it's missing a newline  and
wrongly indented.
Ok, missed that you are right.
...
- The example of multiple message domains  defines some hypothetical
domains, but doesn't use them in the example; it  would be clearer to
connect the two.
Good point. I'll address it.
...
- In the full list of xgettext  parameters most but not all of the
function names given are links to their  respective docs.  Could they all be?
Actually it is something that Doxygen did for me implicitly, the point
is that u16* and u32* are hidden as they mostly not in use. I'll see if
this is possible to fix this.
...
In "Localized Text  Formatting":
- You do not explain how to include a literal '{' character  in the string.
Good point, forgot this, it is '{{' and '}}'
...
- The syntax grammar seems wrong.  'key' seems to be  being defined to be
just one character, I guess there should be a  '*'?
Yes should be +
...
- "date, time , datetime" has a stray space after "time".
Ok I'll fix this.
...
-  It says "See as::ftime manipulator.", but the "as::ftime" is not  a
link.  Could it be?
I'll add it.
...
In "Working with dates, times, timezones  and calendars.":
- Just how general is the calendar format  here?
What do you mean?

ICU supports several calendars (Hebrew, Islamic, Japanese, Chinese and some 
more)
and their support vary by ICU version.
...
In "Using Localization Backends":
- You discuss on what  platforms the standard library backend is useful.
 Where you say "on Linux  with GCC or Intel compilers", would it be
better to say "when using the GCC  standard library, for example with
GCC, the Intel compiler" since I guess  e.g. clang (if used with the gcc
std lib) would work too?
I can't tell, not tested with clang. 

But I must say that toooo many standard libraries has broken locale
support. For example GCC's libstd++ supports locales only
on Linux based on libc's (now POSIX but in earlier days non-standard
functions like newlocale/duplocale).

So I would not be surprised that clang's version of libstd++ compiled
with only C/POSIX locales in.

For my experience there are very few standard C++ libraries that do this
well. Even stlport's locales quite broken (see faults of SunStudio
with stlport in tested platforms/compilers list)
...
- A comment  in an example has stray Doxygen markup: "select \c std
backend as default",  and again later: "since it is not supported by \c std".
Noted.
...
Misc:
-  There was some earlier discussion of whether Boost.Locale should
provide the  ability to load gettext catalogs from other sources than  the
filesystem.  Was this resolved?  It might warrant a mention in  the
rationale.
Ok this is excellent question.

Few days ago I passed once again on older reviews and found out that
I forgot the request.

It is implemented and documented now in trunk but it
was fully tested only in last nightly build (as part of CppCMS)

  http://art-blog.no-ip.info/files/nightly-build-report.html

So I decided not to take a chance of breaking current release
and not included it in review version.

Basically there is an additional callback of type:

   boost::function<std::vector<char>(std::string const &file_name,std::string 
const &encoding)> 

In messages_info structure. See:

  http://cppcms.sourceforge.net/boost_locale/html/gnu__gettext_8hpp_source.htm...

So a custom file system support can be installed in few
lines of code (example exists in trunk as well)
...
...
Please explicitly state in your review whether the library  should be
accepted.
Yes it should.
I send this review now  since I suspect I will not get around to looking
at the library in more  detail or try it myself.  If I do manage that I
will be sure to report  back further.
For now, thanks very much to Artyom and good luck with the  review.
John  Bytheway
Thank you very much for the review and many important
inputs.

Artyom