Re: [boost] [Boost.Locale] Upcoming Boost.Locale version

I like to think in terms of [serialized] objects and not strings. Besides having to internationalize images, sometimes I need to do the same for arrays of strings or any type of object. In the use case of an array of strings, I may in English have a selection of radii ["25 miles", "50 miles", "75 miles", "100 miles", "150 miles", "200 miles"]. However, in a different language I may have ["50km", "100km", "150km", "250km"]. Because the unit is different I may have more or fewer items in the array. The items in the array may not even be strings but a unit object that has both a runtime scalar and a runtime unit.
There is a general approach to solving such issues, even if the library does not support it directly. You use the following:

    double factor = atof(dgettext("Distance conversion factor for km","1.0").c_str());
    int local_distance = static_cast<int>(metric_distance * factor);
    std::cout << format(translate("{1,num} kms","{1,num} kms",local_distance)) % local_distance;

That's it. You just define a translation string and use it as a localization resource. The translator translates this factor to ~0.62 and kms to miles according to the locale.
So using ICU, how would I store and retrieve an array of objects using Boost serialization into one field?
For serialization you should always use locale-insensitive data, for example the metric system. As for time, we always use POSIX time, which always relates to UTC.
Does ICU support any kind of content type, metadata or filename extension that could identify such items to tools? Will your Locale library have any add-on or enhancement to simplify this use case; perhaps translate<object type> versus translate?
This is done differently. Let's define the following:

    class distance {
    public:
        int meters;
    };

    std::ostream &operator<<(std::ostream &out,distance const &d)
    {
        // get the stream-specific locale
        std::locale l=out.getloc();
        // get the stream-specific factor
        double factor = atof(dgettext("Distance conversion factor for km","1.0",l).c_str());
        // convert the stored meters to km, then to the local unit
        int local_distance = static_cast<int>(d.meters / 1000.0 * factor);
        // format the distance in stream-specific units
        out << format(translate("{1,num} kms","{1,num} kms",local_distance)) % local_distance;
        return out;
    }

Note: localization is different from serialization, and it is wrong to look at them as the same thing. Serialization provides a computer-to-computer interface, while localization provides a computer-to-human interface, and it is much more complicated.

For example, to serialize a time point it is enough to write it to the stream as a number representing POSIX time:

    time_t time_point=time(0);
    std::cout << time_point; // would be something like 32452345

For a human you would write:

    time_t time_point=time(0);
    std::cout << as::date_time << time_point; // may be something like 03/02/2010 12:30:02

Artyom
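The idea of an operator<< that reads its units from the stream's own locale can be tried out without any message catalog. The following is only a minimal sketch using the standard library: the `distance_units` facet and the `show` helper are hypothetical names invented for illustration, standing in for the translated conversion factor and unit label that a translator would supply. It is not Boost.Locale's API.

```cpp
#include <locale>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical facet (not part of Boost.Locale): carries the per-locale
// conversion factor from km and the unit label a translator would provide.
class distance_units : public std::locale::facet {
public:
    static std::locale::id id;
    distance_units(double factor_from_km, std::string label)
        : factor_(factor_from_km), label_(std::move(label)) {}
    double factor() const { return factor_; }
    const std::string& label() const { return label_; }
private:
    double factor_;
    std::string label_;
};
std::locale::id distance_units::id;

// The value is always stored in a locale-independent unit.
struct distance {
    int meters;
};

std::ostream& operator<<(std::ostream& out, const distance& d) {
    // Ask the stream, not the global state, which locale applies.
    const std::locale loc = out.getloc();
    double factor = 1.0;          // default: stay in km
    std::string label = "km";
    if (std::has_facet<distance_units>(loc)) {
        const distance_units& u = std::use_facet<distance_units>(loc);
        factor = u.factor();
        label = u.label();
    }
    // Convert meters to km, then to the local unit, truncating to an integer.
    const int local = static_cast<int>(d.meters / 1000.0 * factor);
    return out << local << ' ' << label;
}

// Hypothetical helper: format a distance under a given locale.
std::string show(const distance& d, const std::locale& loc) {
    std::ostringstream out;
    out.imbue(loc);
    out << d;
    return out.str();
}
```

A locale that installs the facet with a factor of about 0.62 and the label "miles" prints miles, while a locale without the facet falls back to km. The stored value stays in meters either way.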

-----Original Message-----
From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Artyom
Sent: Tuesday, August 24, 2010 2:23 PM
To: boost@lists.boost.org
Subject: Re: [boost] [Boost.Locale] Upcoming Boost.Locale version

<snip>
For human you would write
time_t time_point=time(0); std::cout << as::date_time << time_point;
// may be something like 03/02/2010 12:30:02

So is that 3 Feb 2010 or 2 Mar 2010 ;-)
Or are the reader and the locale assumed the same?
(When "Assumption is the mother of all foul ups?")

Paul

Paul A. Bristow, Prizet Farmhouse, Kendal LA8 8AB UK +44 1539 561830 07714330204 pbristow@hetp.u-net.com

<snip>
For human you would write
time_t time_point=time(0); std::cout << as::date_time << time_point;
// may be something like 03/02/2010 12:30:02
So is that 3 Feb 2010 or 2 Mar 2010 ;-)
Or are the reader and the locale assumed the same?
(When "Assumption is the mother of all foul ups?")
Paul
Exactly, that is why treating localization as serialization is incorrect. For internal representation you should always pick a locale-independent representation: for example, for numbers the C locale or binary, and for dates an ISO format like 2010-02-03 12:30:02 GMT.
Artyom
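A locale-independent representation of the kind described here can be sketched with the standard library alone. The function names `to_iso_utc` and `to_wire` are hypothetical, chosen for illustration. The first writes a POSIX time in the unambiguous year-month-day order, and the second writes a number through a stream pinned to the classic "C" locale so that no grouping or localized digits can appear.

```cpp
#include <ctime>
#include <locale>
#include <sstream>
#include <string>

// Format a POSIX time as "YYYY-MM-DD HH:MM:SS GMT".
// Only numeric conversions are used, so the result does not depend on
// the reader's date conventions.
std::string to_iso_utc(std::time_t t) {
    // std::gmtime returns a pointer to shared static storage; a
    // multithreaded program would need a reentrant alternative.
    const std::tm* utc = std::gmtime(&t);
    if (!utc) {
        return std::string();
    }
    char buf[32];
    const std::size_t n =
        std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S GMT", utc);
    return std::string(buf, n);
}

// Write an integer for machine consumption: the stream is explicitly
// imbued with the classic locale, whatever the global locale is.
std::string to_wire(long long value) {
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out << value;
    return out.str();
}
```

With this form, 03/02/2010 can no longer be read two ways: the same instant is always written as 2010-02-03.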

I like to think in terms of [serialized] objects and not strings. Besides having to internationalize images, sometimes I need to do the same for arrays of strings or any type of object. In the use case of an array of strings, I may in English have a selection of radii ["25 miles", "50 miles", "75 miles", "100 miles", "150 miles", "200 miles"]. However, in a different language I may have ["50km", "100km", "150km", "250km"]. Because the unit is different I may have more or fewer items in the array. The items in the array may not even be strings but a unit object that has both a runtime scalar and a runtime unit.
There is a general approach to solving such issues, even if the library does not support it directly.
You use the following:

    double factor = atof(dgettext("Distance conversion factor for km","1.0").c_str());
    int local_distance = static_cast<int>(metric_distance * factor);
    std::cout << format(translate("{1,num} kms","{1,num} kms",local_distance)) % local_distance;

That's it.
You just define a translation string and use it as a localization resource. The translator translates this factor to ~0.62 and kms to miles according to the locale.
The conversion of ["25 miles", "50 miles", "75 miles", "100 miles", "150 miles", "200 miles"] is ["40.2336 km", "80.4672 km", "120.70080000000001 km", "160.9344 km", "241.40160000000003 km", "321.8688 km"]. Conversion is not an option because an end user would not want to see fractional or more-or-less oddball numbers such as 241 or 322. Because of this, an end user would be presented with a different array of strings, one that makes more sense in that locale, such as ["50km", "100km", "150km", "250km"]. There is a different number of items in the array because, with a different length scale factor, it wouldn't make sense from the end user's perspective to preserve the same number of items, and business requirements can drive this further. As such, I need to I18N the array of strings as a whole, not each string individually.

ICU supports a simple integer array but not a simple string array. So what support does your library have for the ICU complex types? Remember that ICU arrays and tables can recursively contain other ICU arrays and tables. I think both your library and ICU are wonderful, but I would like some mappings and convenience functions, perhaps in another library, that provide the following:

- simple type integer array => vector<int> or any collection of int
- complex type array of primitive => vector<primitive or string or any> or any collection of primitives, strings, any(s)
- complex type table of primitive => map<primitive or string or any> or any associative collection of primitives, strings, any(s)
- multiple Boost ICU serialization archives for more complex types, capable of generating and reading using the ICU API, and another output and input archive for creating the .txt document passed to genrb.exe

With these there would be no need for users of the library to keep writing std::ostream &operator<<(std::ostream &out, ... const &d) over and over again for each type, especially if the predefined types already have serializers.
Originally I had thought of storing an object as a string in one field, but I didn't know or understand what ICU complex types are capable of. Consider the following excerpt from the ICU documentation. http://userguide.icu-project.org/locale/resources gave an example of a menu translated/serialized as such:

    menu {
        id { "mainmenu" }
        items {
            {
                id { "file" }
                name { "&File" }
                items {
                    { id { "open" } name { "&Open" } }
                    { id { "save" } name { "&Save" } }
                    { id { "exit" } name { "&Exit" } }
                }
            }
            {
                id { "edit" }
                name { "&Edit" }
                items {
                    { id { "copy" } name { "&Copy" } }
                    { id { "cut" } name { "&Cut" } }
                    { id { "paste" } name { "&Paste" } }
                }
            }
            ...
        }
    }

Now imagine a menu object that was serializable/translatable and could be retrieved with a single line: menu rb.translate<menu>("menu");. In this example a menu in a different locale could have not only different text, but also different keyboard handlers and even a different number of [sub]menu items. I don't know if this functionality belongs in your library, built on top of your library, or in a better C++ ICU wrapper than what is currently provided by ICU. It is still a need though. Please send me your thoughts, or let me know if I need to clarify anything.

Conversion is not an option because an end user would not want to see fractional or more-or-less oddball numbers such as 241 or 322. Because of this, an end user would be presented with a different array of strings, one that makes more sense in that locale, such as ["50km", "100km", "150km", "250km"].
Agreed, then just use the following:

    static char const *distances[] = {
        NOOP_("50km"), NOOP_("100km"), NOOP_("150km"), NOOP_("250km")
    };
    cout << translate(distances[i]) << endl;

And the translator would know how to set these distances in their own way, to something like "30 miles", "60 miles", etc.
So what support does your library have for the ICU complex types?
Just to make it clear:
- I don't use ICU for resource translation
- I do not even expose ICU's API to the user.

See: http://cppcms.sourceforge.net/boost_locale/html/tutorial.html#a5bfc9e07964f8...
And: http://cppcms.sourceforge.net/boost_locale/html/tutorial.html#19ca14e7ea6328...
So you don't use ICU resources. Rather, you use GNU Gettext-like catalogs, which have much better tools, are much more popular, and are much easier for the average translator and developer to understand.
complex type table of primitive => map<primitive or string or any> or any associative collection of primitives, strings, any(s) multiple boost ICU serialization archives for more complex types capable of generating reading using the ICU api and other output and input archive for creating the .txt document passed to genrb.exe
No, there are plenty of ways to store very complex data in gettext catalogs. It is a matter of taste how to access and parse them. Sometimes, simpler solutions are just better.
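One possible way to keep a whole list in a single catalog entry, so that a translation may contain a different number of items, is to store it as one delimited message and split it after lookup. This is only a sketch of that idea, not something the library prescribes: the `catalog` map stands in for a real gettext catalog, and `lookup`, `split_list`, `radii` and the `|` separator are hypothetical choices made for illustration.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for a message catalog: message id -> translated message.
using catalog = std::map<std::string, std::string>;

// Return the translation, or the id itself when none exists
// (the same fallback gettext-style lookups use).
std::string lookup(const catalog& c, const std::string& id) {
    const auto it = c.find(id);
    return it == c.end() ? id : it->second;
}

// Split a delimited message into its items, skipping empty ones.
std::vector<std::string> split_list(const std::string& text, char sep = '|') {
    std::vector<std::string> items;
    std::istringstream in(text);
    std::string item;
    while (std::getline(in, item, sep)) {
        if (!item.empty()) {
            items.push_back(item);
        }
    }
    return items;
}

// The whole radius list is one translatable unit, so each locale
// decides both the wording and the number of entries.
std::vector<std::string> radii(const catalog& c) {
    return split_list(lookup(c, "50km|100km|150km|250km"));
}
```

With an empty catalog the four metric entries come back unchanged. A catalog that maps the same id to "25 miles|50 miles|75 miles|100 miles|150 miles|200 miles" yields six entries.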
http://userguide.icu-project.org/locale/resources gave an example of a menu translated/serialized as such
    menu {
        id { "mainmenu" }
        items {
            {
                id { "file" }
                name { "&File" }
                items {
                    { id { "open" } name { "&Open" } }
                    { id { "save" } name { "&Save" } }
                    { id { "exit" } name { "&Exit" } }
                }
            }
            {
                id { "edit" }
                name { "&Edit" }
                items {
                    { id { "copy" } name { "&Copy" } }
                    { id { "cut" } name { "&Cut" } }
                    { id { "paste" } name { "&Paste" } }
                }
            }
            ...
        }
    }
Example of the same structure in gettext (context+key):

    "Main menu", "&File"
    "Main menu->File", "&Open"
    "Main menu->File", "&Save"
    "Main menu->File", "&Exit"
    "Main menu", "&Edit"
    "Main menu->Edit", "&Copy"
    "Main menu->Edit", "&Cut"
    "Main menu->Edit", "&Paste"

Please note, you also get a working English translation for free.
Artyom
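The context+key scheme can be modelled in a few lines to show why the English text comes for free. This is a sketch with a plain map standing in for a gettext catalog: `menu_catalog` and `translate_in_context` are hypothetical names, and the German strings in the usage below are sample data only.

```cpp
#include <map>
#include <string>
#include <utility>

// A message is identified by (context, key), e.g. ("Main menu->File", "&Open").
using message_key = std::pair<std::string, std::string>;
using menu_catalog = std::map<message_key, std::string>;

// Look up a message in its context. When the catalog has no entry,
// the key itself is returned, which is already valid English text.
std::string translate_in_context(const menu_catalog& c,
                                 const std::string& context,
                                 const std::string& key) {
    const auto it = c.find(message_key(context, key));
    return it == c.end() ? key : it->second;
}
```

For example, a catalog holding ("Main menu", "&File") => "&Datei" returns "&Datei" for that entry, while an untranslated entry such as ("Main menu->Edit", "&Copy") comes back as "&Copy". The context also keeps identical keys in different menus apart.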
participants (3)
- Artyom
- Jarrad Waterloo
- Paul A. Bristow