
On 19/04/2011 20:30, Artyom wrote:
As you know, the API is a subject to change in Boost, and I would say that it is changed toooo freely and toooo often.
Right, that's because the API is what makes Boost libraries good. If it's not good enough, it needs to change. And to avoid APIs from changing too much in released libraries, the best solution is to make sure the library has the best API *when it is accepted into Boost*. And I think there is still some room for improvement with the Boost.Locale API.
I can accept that some operations may be better to work on arbitrary streams but most of them just don't need it.
For example collation... When exactly do you need to compare arbitrary text streams?
Because the data does not exist in memory, may be computed on the fly, or whatever really. A possible application is simply to chain operations in a pipeline, i.e. without having to apply one operation completely, then the other on that result, etc (and do the intermediate temporary buffer allocations).
So this is a quick glance to the future that currently only on planning stage and is not a part of the review.
I thought to provide stream API for charset conversion but put it on hold as it is not really a central part, especially when codecvt it there.
I believe it *is* the very central part of any text processing system.
Such things should appear in code review.
I'm really welcome to learn how to do things better. The question is whether such changes are critical or not.
All those comments at the end of my original message only affect the internals, so I don't care so much. It's mostly a style thing.
Take a deeper look to the section.
It is different from backend selection.
If I want to add a backend, I only want to add a new repository with the implementation for that backend. I do not want to have to hack all shared files by adding some additional ifdefs.
I'm also a bit concerned about the generalized usage of virtual functions anywhere; it seems unnecessary in quite a few places, please prefer using templates when dynamic binding is not required.
More specific points?
Couldn't boost::locale::util::base_converter be a concept rather than an abstract base class?
Because there is no need to duplicate a complex code via template metaprogamming if a simple function call can be made.
This sentence doesn't make any sense to me. Template meta-programming is not a mean to duplicate code. Nor is normal template usage, which is what I suggested instead of virtual functions, template meta-programming.
Don't overrate template metaprogramming and don't underestimate virtual functions.
You could remove the reference to the base class by a template parameter and get rid of the virtualness of the member functions. The virtualness is already in codecvt, why would you want to put it twice in a row?
A lot of new and vectors too, I'd prefer if ideally the library never allocated anything.
I'm sorry but it is just something that can't and would never happen.
This request has no reasonable base especially for such complex topic.
Usage of templates instead of inclusion polymorphism would allow to avoid newing the object and using a smart pointer, for example.
I'm not sure what exact location bothers use but anywhere (unless I miss something) there are minimal data copying, and I relate heavily on RVO.
I didn't say copying, I said allocation and usage of new. grep -r -F "new " * should give you the exact locations.
If you see some not-required copying tell me.
Plus the instances of allocation in the boundary stuff (when you build an index vector and when you copy into a new basic_string) appears to be unnecessary.
More specific location? I don't remember such thing, I just need better pointers to answer.
I've been very precise. You unnecessarily allocate a new string and copy the contents in the operator* of token_iterator. I've also given you a trivial fix for that particular problem, though it doesn't solve the underlying problem of generating (and allocating) the index vector first. I don't see how I can be more precise than this.
Passing an arbitrary range to the functions is convenient, forcing the user to copy it to a vector and pass the begin and end pointers before it can call the functions isn't.
Where exactly, because I need better context as above.
The interface of your functions is something like std::string foo(char* begin, char* end), Let's say I've got a filter_range<std::vector<char> > as input. To use your function, I need to do std::vector<char> v; std::copy(boost::begin(input), boost::end(input), std::back_inserter(v)); std::string s = foo(v.data(), v.data() + v.size()); A more convenient interface would directly allow std::string s = foo(input);