
Lars listed some of the reasons I had in mind when asking for more flexibility around the dictionary loading features, so I will try to be clearer and more specific.

After some thought, I'll try to explain what I know I would need from a localisation library, based on my limited experience with some tools made in-house at work. (In video games we don't use a "standard" localization tool: most available tools, as you already mentioned, assume a classic filesystem and have other inflexibilities that are incompatible with some game engine architectures, even for desktop games, so this kind of tool is often made in-house.)

 A. Dictionary source
 B. Dictionary loading control
 C. Dictionary format

Those customization points shouldn't be directly related (orthogonal?).

A. Dictionary source

Currently, if my understanding is correct, the boost::locale library will always assume that dictionary files are on the (standard?) filesystem. As Lars and others already said, there are some cases (embedded and games) where this is simply not the case: the game structure requires getting data from somewhere else, such as baked resource packs, the network, RAM or elsewhere.

I've seen some domain-specific libraries, used in the video-game industry and in other industries, that provide a simple way to fix this without managing all the cases themselves:

 1. They provide a way to load domain-specific data (in our case, dictionaries) from any source by allowing the user to provide a custom "data stream" class that the library will use to pull/read the data. For example OGRE (graphics rendering engine) allows loading textures and models from anywhere by providing such a mechanism. Some people use it to feed the engine with textures and meshes from the network, with a central server providing the resources to the clients (for some simulation applications, if I remember correctly).

 2. They provide "helper" functions that assume the data is on the filesystem, simplifying the use of dictionary files when no custom data source is required.

That makes those libraries data-source-independent. In fact I just remembered that all the libraries I have used so far (for games and non-games) provide such a way to plug any source of data into the library. It also gives the user an easy way to change the data source later if needed.

B. Dictionary loading control

This is about "when is a dictionary loaded in memory and usable without having to process something first?". If my understanding is correct, boost::locale will automatically load the dictionary when needed? I guess it will load the dictionary when the corresponding language/domain is invoked?

Anyway, some way to manually load and unload dictionaries (or the dictionaries related to a locale?) would help to control the application's performance/flow. For example, most games first load all "whole app life" resources on startup, then load "world-chunk-specific" resources each time they need them, and unload those resources at some point without exiting the whole app, just to free the memory for another world chunk. Once a world chunk has been loaded, there must not be any allocation/deallocation, because that would easily slow down the frame rate and make it fluctuate in an unpleasant way.

In my own case I also have to manage user-made modules that carry some localization information, which should be loaded when the module is used but not otherwise. The module structure of my application and its memory limitations make it impossible to load all modules at startup: that would be too much, and I don't even know how many modules will be available some time after the release. Manual control over when to load/unload what is required for my current "big" game.

So some manual control on this side would be of great help. Maybe some kind of strategy could be provided by the user?

C. Dictionary format

You already pointed out the way to provide a custom format for dictionaries, so this is good from my point of view. A lot of companies use simple Excel/CSV files to manipulate localization texts, which makes it simple to provide texts to localization companies for translation. I would only criticize the "domain" thing, but that's a gettext philosophy thing (read further).

So those points of customization would be necessary for almost all my own projects (games or not). I understand that I might not be in the general case - not sure about this. However, I think all programming libraries should be data-source agnostic at least.

That said, I have to say that I have often searched for good alternatives to gettext, as it always seemed inadequate for my use (at work AND at home) for reasons other than the previous ones:

 - the ids have to be strings (or am I wrong?) - letting the user provide custom ids would help to manage tools/performance on their side
 - it assumes that the string id carries some context information that makes it possible to know which localization is needed. It looks like a hack to me, because I think each unique text should have a unique id. That way you can have the same English words with different ids, allowing different words in another language. The domain string seems to be some hack to fix this case. I would prefer some way to get a unique id for each text, provided by the user.

As boost::locale follows the gettext philosophy, I don't see how it would be possible to change this without changing the backend.

I'm not sure I'm clear about all of this, so ask me if you don't understand something (sorry, my English isn't perfect).

The current boost::locale is already a great piece of work that I'd like to use as soon as I have a case suited to its use. So I forgot to say: good work :)

About this:
Actually, one of the most widely deployed localization libraries, gettext, has much harder and stricter restrictions, and yet Boost.Locale implements everything gettext gives and much more.
I haven't seen many complaints from gettext developers about the ability to load dictionaries (unlike other issues).
Most libraries are not held to the same standard that Boost libraries are.
There are (good and sometimes bad) reasons why almost all game-industry developers don't use (all of) boost and gettext. However, I'm making a desktop game that heavily uses boost and so far it has been a really good idea - the alternative was POCO. I'd like to help fight the often-wrong belief that the STL and boost are bad for games (and I'm not the only one, it seems: http://gamedev.stackexchange.com/questions/268/stl-for-games-yea-or-nay - see the answer with the most points, not the accepted one).

I planned to write my own solution for my game's localisation, which has a somewhat complex structure based on user-provided modules, but if boost::locale provides a solution for the points I've listed, then I can plug it into my game without a problem and that will simplify a lot of things (assuming performance is sufficient for my needs).

For the moment I'll keep following how boost::locale evolves until I reach the point where I need to make a final decision.

Joël Lamotte

On Sat, Sep 11, 2010 at 21:22, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
On 11/09/2010 19:30, Artyom wrote:
Yes. Are you using the latest version 2.x from the SourceForge site, or did you take "/trunk"? The latest boost.locale sits in its own branch, "rework".
I didn't use your library.
What exactly did you see? In what case? Do you save into a file or into std::wcout?
do_in gets called in the file to memory case. I'm talking of a codecvt facet that converts UTF-8 in files to UTF-16 in memory.
The behaviour I've observed is the following: the implementation of fstream in MSVC9 seems to call 'in' char per char, calling again and appending one character when partial is returned.
Then, in case of 'ok', it just reads the first wchar_t written on the output, and ignores the second that would be written in the case of surrogates.
But then, looking at your library, you seem to do some weird (and dangerous!) reinterpret casting, which suggests you're not making the fstream interface directly with a std::codecvt<wchar_t, char, std::mbstate_t> facet. How did you make that work?
Can you send me sample code that shows the issue?
Attached is a testcase that demonstrates the bug in MSVC9. It prints "65 65 65 65 65" instead of "65 66 65 66 65 66 65 66 65 66".
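
The attached testcase is not part of this archived message. As a rough illustration only, and not a reconstruction of the attachment, the sketch below shows the kind of one-to-two conversion under discussion. The facet name `pair_facet`, the helper `decode_all`, and the choice to turn every input byte into the two units 65 and 66 (standing in for a surrogate pair) are all assumptions made for the example. The facet is driven directly through `std::codecvt::in`, not through an `fstream`, so it shows what a caller that honours `to_next` would receive, not the MSVC9 behaviour itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>
#include <vector>

// Hypothetical facet: each input byte expands to two wchar_t units (65, 66),
// mimicking a code point that needs two units in the internal encoding.
class pair_facet : public std::codecvt<wchar_t, char, std::mbstate_t> {
protected:
    result do_in(std::mbstate_t&,
                 const char* from, const char* from_end, const char*& from_next,
                 wchar_t* to, wchar_t* to_end, wchar_t*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_next != from_end) {
            // Not enough room for both units: report partial, consume nothing more.
            if (to_end - to_next < 2) return partial;
            *to_next++ = 65;
            *to_next++ = 66;
            ++from_next;
        }
        return ok;
    }
};

// Hypothetical helper: converts the whole input in one call and keeps every
// unit the facet wrote, using to_next to find the end of the output.
std::vector<wchar_t> decode_all(const std::string& bytes) {
    pair_facet facet;
    std::mbstate_t state = std::mbstate_t();
    std::vector<wchar_t> out(bytes.size() * 2);
    const char* from_next = nullptr;
    wchar_t* to_next = out.data();
    facet.in(state, bytes.data(), bytes.data() + bytes.size(), from_next,
             out.data(), out.data() + out.size(), to_next);
    out.resize(static_cast<std::size_t>(to_next - out.data()));
    return out;
}
```

With a five-byte input, `decode_all` returns ten units alternating 65 and 66. A caller that kept only the first unit after each `ok` would drop every 66, which matches the symptom described above.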