
- I get the impression that Artyom has been using the library extensively and it is better for that experience.
It is actually integrated part of CppCMS project. For example this wiki (written in C++/CppCMS) http://art-blog.no-ip.info/wikipp/en/page/main Is localized using Boost.Locale and it uses collation, date/time formatting message translation and multiple locales in same process. So yes, I definitely use it all the time :-).
Only one worry presented itself to me as I read the docs. Consider the following example:
date_time some_point = period::year * 1995 + period::january + period::day*1;
It looks like a period can be implicitly cast to a date_time. This seems dangerous. I feel the start of 1995 should be obtained more explicitly through the calendar.
Actually date_time by default initialized to current time. So if you set a year 1995 it would be today's date and time back into 1995. I prefer not to use explicit cast as it makes life much easier for users and also period::year * 1995 + period::january + period::day*1 Is a periods set has its own type that are set later together. Also small notices period::january is alias for period::month*0 and has different type then period::month.
The prose of the documentation is very good, and the tutorials do a good job of introducing the various aspects.
I did not look at the reference in detail, but I am more concerned that it does not offer the necessary level of detail. Consider for example the documentation for normalize (<http://cppcms.sourceforge.net/boost_locale/html/group__convert.html#ga11672e3cc3ed7eecf1dd07060265aab2>). . I went to the reference for this because the tutorial (<http://cppcms.sourceforge.net/boost_locale/html/conversions.html>) did not explain. I see that it performs Unicode normalization, but there is no hint as to what value I may pass as the 'n' parameter. Where are these documented? This is a rhetorical question; I know the answer is "at the top of the page", but a link would be useful.
Strange that Doxygen didn't inserted like to norm_type automatically. I'll add it and try to look over similar places.
There are many terms used in the documentation which may not be familiar to all readers. e.g. what it means to "fold case" and what the definitions of the various normalization forms, collation levels, etc. are. Some terms are covered in the glossary, but not all and only briefly. There are some links to Wikipedia articles, but not many. I suggest that links to other resources such as more Wikipedia articles, Unicode Consortium documents, etc., especially in the glossary or in a new "references" appendix but also wherever else appropriate (perhaps linking via a list in the appendix).
Good point, I'll address it.
I second the request for previous/next links. It's hard to browse without them.
Ok, I'll see how to do this, unfortunately Doxygen does not know to do them automatically, but at the top you always have a link upper and you can go forward.
There are many code examples, most of which involve writing something to stdout, but few of them include examples of possible output. It would be extremely helpful to add example output (preferably more than one example where the output can vary by locale) to more of the examples.
Ok, I'll try to address this where I can.
Assorted minor issues:
In "Locale Generation":
- You don't mention what character delimits multiple "variant" options in a locale identifier. Is this even possible? This suggests so, but later the boost::local::info::variant function implies that it is not possible.
Actually variant support vary by backend. Unix supports predefined set like @latin or @euro and ICU handles them well. Windows backends does not recognize them and ICU provides its own syntax. Like calendar=hebrew;collation=phonetic I'll add these examples.
In "Conversions"
- You should mention at least briefly what fold_case and normalize do.
Good point.
In "Messages formatting":
- There are four "The source text is not copied." with different emphasis. I think the second and fourth should read "The source text is copied.". In that same paragraph the formatting is wrong near "Plural form translation: Translate a message"; it's missing a newline and wrongly indented.
Ok, missed that you are right.
- The example of multiple message domains defines some hypothetical domains, but doesn't use them in the example; it would be clearer to connect the two.
Good point. I'll address it.
- In the full list of xgettext parameters most but not all of the function names given are links to their respective docs. Could they all be?
Actually it is something that Doxygen did for me implicitly, the point is that u16* and u32* are hidden as they mostly not in use. I'll see if this is possible to fix this.
In "Localized Text Formatting":
- You do not explain how to include a literal '{' character in the string.
Good point, forgot this, it is '{{' and '}}'
- The syntax grammar seems wrong. 'key' seems to be being defined to be just one character, I guess there should be a '*'?
Yes should be +
- "date, time , datetime" has a stray space after "time".
Ok I'll fix this.
- It says "See as::ftime manipulator.", but the "as::ftime" is not a link. Could it be?
I'll add it.
In "Working with dates, times, timezones and calendars.":
- Just how general is the calendar format here?
What do you mean? ICU supports several calendars (Hebrew, Islamic, Japanese, Chinese and some more) and their support vary by ICU version.
In "Using Localization Backends":
- You discuss on what platforms the standard library backend is useful. Where you say "on Linux with GCC or Intel compilers", would it be better to say "when using the GCC standard library, for example with GCC, the Intel compiler" since I guess e.g. clang (if used with the gcc std lib) would work too?
I can't tell, not tested with clang. But I must say that toooo many standard libraries has broken locale support. For example GCC's libstd++ supports locales only on Linux based on libc's (now POSIX but in earlier days non-standard functions like newlocale/duplocale). So I would not be surprised that clang's version of libstd++ compiled with only C/POSIX locales in. For my experience there are very few standard C++ libraries that do this well. Even stlport's locales quite broken (see faults of SunStudio with stlport in tested platforms/compilers list)
- A comment in an example has stray Doxygen markup: "select \c std backend as default", and again later: "since it is not supported by \c std".
Noted.
Misc:
- There was some earlier discussion of whether Boost.Locale should provide the ability to load gettext catalogs from other sources than the filesystem. Was this resolved? It might warrant a mention in the rationale.
Ok this is excellent question. Few days ago I passed once again on older reviews and found out that I forgot the request. It is implemented and documented now in trunk but it was fully tested only in last nightly build (as part of CppCMS) http://art-blog.no-ip.info/files/nightly-build-report.html So I decided not to take a chance of breaking current release and not included it in review version. Basically there is an additional callback of type: boost::function<std::vector<char>(std::string const &file_name,std::string const &encoding)> In messages_info structure. See: http://cppcms.sourceforge.net/boost_locale/html/gnu__gettext_8hpp_source.htm... So a custom file system support can be installed in few lines of code (example exists in trunk as well)
Please explicitly state in your review whether the library should be accepted.
Yes it should.
I send this review now since I suspect I will not get around to looking at the library in more detail or try it myself. If I do manage that I will be sure to report back further.
For now, thanks very much to Artyom and good luck with the review.
John Bytheway
Thank you very much for the review and many important inputs. Artyom