New subject: [locale] Support of non US-ASCII character set for messages keys

27 Apr 2011

      Hello,

After reviewing the whole discussion I've decided
to make the following changes to the interface to
provide better support for non-US-ASCII keys.

The thing that actually convinced me is the requirement
to be able to include characters like © in the text...

Currently there are the following classes:

   template<typename CharType>
   class message_format :  public std::locale::facet {
   public:
     ...
     typedef CharType char_type;
     virtual char_type const *get(int domain_id,char const *context,char const *id) const = 0;
     ...
   };

   class message {
   public:
     ...
     explicit message(char const *id);
     ...
     // convert message to localized message
     template<typename CharType>
     std::basic_string<CharType> str(std::locale const &locale) const;

   };

   ...
   inline message translate(char const *id);
   inline std::string gettext(char const *id,std::locale const &loc=std::locale());
   inline std::wstring wgettext(char const *id,std::locale const &loc=std::locale());
   ...

Basically, a message is created from a narrow id only and can then be
converted to multiple output formats: narrow, wide and so on.

   std::cout << translate("Hello") << std::endl;
   std::wcout << translate("Hello") << std::endl;

And you could call:

   message msg = translate("Hello");
   std::string hello = msg.str<char>();
   std::wstring whello = msg.str<wchar_t>();

All of these work together.

I'll change it in the following way:

   template<typename CharType>
   class message_format :  public std::locale::facet {
   public:
     ...
     typedef CharType char_type;
     virtual char_type const *get(int domain_id,char_type const *context,char_type const *id) const = 0;
     ...
   };

   template<typename CharType>
   class basic_message {
   public:
     typedef CharType char_type;
     typedef std::basic_string<char_type> string_type;
     ...
     explicit basic_message(char_type const *id);
     ...
     // convert message to localized message
     string_type str(std::locale const &locale) const;

   };
   typedef basic_message<char> message;
   typedef basic_message<wchar_t> wmessage;
   typedef basic_message<char16_t> u16message;
   typedef basic_message<char32_t> u32message;

   ...
   inline message translate(char const *id);
   inline wmessage translate(wchar_t const *id);
   inline std::string gettext(char const *id,std::locale const &loc=std::locale());
   inline std::wstring wgettext(wchar_t const *id,std::locale const &loc=std::locale());
   ...

Now you would have to write:

   std::cout << translate("Hello") << std::endl;
   std::wcout << translate(L"Hello") << std::endl;

And you should call:

   message msg = translate("Hello");
   wmessage wmsg = translate(L"Hello");
   std::string hello = msg.str();
   std::wstring whello = wmsg.str();
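
To make the narrow/wide split concrete, here is a minimal sketch of how
the overloads would pick the message type from the character type of the id.
It is a toy stand-in, not the Boost.Locale code: the catalog is a plain
std::map passed to str() (the real str() takes a std::locale), and
narrow_hello(), wide_hello() and the "Shalom" translation are made up
for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for the proposed basic_message; NOT the Boost.Locale class.
template<typename CharType>
class basic_message {
public:
    typedef CharType char_type;
    typedef std::basic_string<char_type> string_type;
    typedef std::map<string_type, string_type> catalog_type; // hypothetical catalog

    explicit basic_message(char_type const *id) : id_(id) {}

    // Assumption: a map lookup replaces the message_format<CharType> facet.
    // An id without a translation is returned unchanged.
    string_type str(catalog_type const &catalog) const
    {
        typename catalog_type::const_iterator p = catalog.find(id_);
        return p == catalog.end() ? id_ : p->second;
    }

private:
    string_type id_;
};

typedef basic_message<char> message;
typedef basic_message<wchar_t> wmessage;

// The character type of the id selects the message type.
inline message translate(char const *id) { return message(id); }
inline wmessage translate(wchar_t const *id) { return wmessage(id); }

// Narrow id, narrow catalog, narrow result.
inline std::string narrow_hello()
{
    message::catalog_type catalog;
    catalog["Hello"] = "Shalom";
    return translate("Hello").str(catalog);
}

// Wide id, wide catalog, wide result.
inline std::wstring wide_hello()
{
    wmessage::catalog_type catalog;
    catalog[L"Hello"] = L"Shalom";
    return translate(L"Hello").str(catalog);
}
```

The point of the sketch is only that a message built from a narrow id can no
longer produce a wide string, and the other way around; the id and the result
always share one character type.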

Additionally, you would be able to specify the encoding
of the source strings when adding a domain.

  boost::locale::generator gen;
  gen.add_messages_domain("myprogram","windows-936");

The default would always be UTF-8.

So if you write in the program:

  std::cout << translate("平和") << std::endl;

Under GCC with UTF-8 sources you have nothing more to do.

If you are using MSVC then you'll have to provide
a charset name as shown above or use u8"平和"
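
Why the source encoding matters can be seen from the bytes of the key itself.
The sketch below spells out the © character in UTF-8 and in windows-1252
with explicit escapes, so it does not depend on how the file is saved.
The helper names are made up, and the plain byte comparison is an assumption
about how a key would be matched if no encoding were declared.

```cpp
#include <cassert>
#include <string>

// U+00A9 (copyright sign) as it appears in a UTF-8 source: two bytes.
inline std::string copyright_utf8() { return "\xC2\xA9"; }

// The same character as it appears in a windows-1252 source: one byte.
inline std::string copyright_cp1252() { return "\xA9"; }

// Hypothetical key match: a plain byte-by-byte comparison.
inline bool same_key(std::string const &a, std::string const &b)
{
    return a == b;
}
```

The two strings show the same character to the programmer but are different
byte sequences, so a key compiled from a non-UTF-8 source would not match a
UTF-8 key unless the domain's encoding is declared.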

Of course this would break the API for users who
currently use Boost.Locale (and I know of at least several
projects that will suffer).

But this would probably bring it to some logical
point and prevent these questions from arising again.

Of course you should remember that untranslated
non-US-ASCII strings would be converted at
run time to the current locale's encoding.

Regards,

  Artyom Beilis

P.S.: Of course the documentation will still discourage 
      programmers from using non-US-ASCII keys as they
      may not be displayed properly in local character
      sets and may confuse users.
