
Hi,

On Tue, Apr 06, 2004 at 06:29:54PM +0400, Vladimir Prus wrote:
[snip]
This argument is quite questionable. IMHO you either stick with narrow or with wide characters in the whole application. Otherwise you are forced to make conversions on the border lines. I don't really see a point in the mixed type approach.
Ok, let me rephrase. You're writing a boost::http_proxy library and want to make it customizable via program_options. So you need to provide a function 'get_options_descriptions'. What will the function return? If there's only one options_descriptions class, there's no question. If there are two versions, then which one do you return? No matter what you decide, the main application might need to do conversions, just because it either needs unicode or does not need it.
Well, the http library has two options. Either it can be char_type independent, or it can simply accept only the char* variants. In the case of an http library, the latter will probably be the choice, because it is quite a domain-specific library.

I see that we are generally arguing about whether the program_options library's domain is generic enough to support char and wchar_t natively (and be templated), or whether it is enough to provide an interface via conversions and support only one encoding internally. I'm in favor of the first approach.

The library works with various sources of information, and its purpose is to restructure the information from these sources into something more usable. For such a utility I would assume that information passed on input has the same encoding and format as the information on the output. From the nature of the library it seems that it should be possible to avoid unnecessary conversions into some intermediate encoding.

Another association might be a container. The library is a kind of container: it parses the input and provides a container-like interface for the information stored there. I find it natural that a container uses the same encoding for its internals as it provides in its external interface.
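To make the idea a bit more concrete, here is a rough sketch of what a char_type independent interface could look like. The names used (basic_options_description, get_options_descriptions) are only illustrative -- they are not taken from the actual library:

    #include <map>
    #include <string>

    // Illustration only: a library whose interface is parameterized on the
    // character type, so input and output share one encoding.
    template <class CharT>
    struct basic_options_description
    {
        // option name -> description, both in the caller's chosen encoding
        std::map<std::basic_string<CharT>, std::basic_string<CharT> > options;

        void add(const std::basic_string<CharT>& name,
                 const std::basic_string<CharT>& description)
        {
            options[name] = description;
        }
    };

    typedef basic_options_description<char>    options_description;
    typedef basic_options_description<wchar_t> woptions_description;

    // A component such as an http proxy library could expose its options for
    // whichever character type the application uses, without conversions:
    template <class CharT>
    basic_options_description<CharT> get_options_descriptions()
    {
        basic_options_description<CharT> desc;
        // ... register the component's options here ...
        return desc;
    }

The application then instantiates whichever width it uses throughout, so the information keeps one encoding from parsing to retrieval.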
And why should an existing operator>> which works only with istream be fixed to support wistream, if some other option needs unicode support?
I don't understand this point. [snip]
I generally tend to ignore speed issues, since with linear time algorithms and contemporary processors it's not likely to be important. OTOH, code size *is* important. I've just compiled one of the library examples, with static linking and full optimization. It takes 152K.
Probably it's partly gcc's fault, or maybe it can be reduced, but for now that's how it is. An empty program takes several K. Now, if I tell someone "here's a good library for parsing the command line, but it will add 152K to the application size", that someone will say "thanks, I'll parse the command line by hand".
However, if the library is shared and available on every Linux installation, then the code size is not an issue.
I don't think that an overhead of 152kB is too big. We are living in a world of GBs; a few kB does not really change much. If an application uses some STL stuff, it won't be very small anyway. (Probably not the best example, but I compiled the following program with gcc 3.3.1 in cygwin with -O3 and stripped the debug info afterwards:

    #include <iostream>
    using namespace std;

    int main()
    {
        cout << "a test" << endl;
        return 0;
    }

The resulting binary is 200kB.)

I would strongly prefer simpler usage of the library over saving the 152kB of overhead.
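(For reference, the build was roughly the following; the file name is only an example:)

    g++ -O3 test.cpp -o test.exe
    strip test.exe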
If my application is unicode, and all the input I have is unicode, it is really annoying to convert everything to and fro when interfacing with a library like program_options.
You don't have to convert anything. Parsers will accept wstring, and for values where you need unicode you'll use wstring as well.
[snip]
Some of the conversions are unavoidable. E.g. if you have a unicode-enabled library, you'd still need to accept ascii input (because you can't expect all input sources to be unicode -- main on Linux is never unicode).
If you want to support the legacy operator>>, you'd need a conversion to ascii.
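(To make the quoted point concrete, the kind of narrow-to-wide conversion meant here could look roughly like the sketch below. It is only an illustration: it relies on the current locale via std::mbstowcs, and none of the names come from the library under discussion.)

    #include <cstdlib>   // std::mbstowcs
    #include <string>
    #include <vector>

    // Illustration only: convert narrow argv (as main() receives it) into
    // wide strings using the current locale.  Assumes the global locale has
    // been set with std::setlocale(LC_ALL, ""); error handling is minimal.
    std::vector<std::wstring> widen_args(int argc, char* argv[])
    {
        std::vector<std::wstring> wide;
        for (int i = 0; i < argc; ++i)
        {
            std::string narrow(argv[i]);
            std::vector<wchar_t> buffer(narrow.size() + 1);
            std::size_t n = std::mbstowcs(&buffer[0], narrow.c_str(), buffer.size());
            if (n == static_cast<std::size_t>(-1))
                n = 0;   // invalid multibyte sequence; keep an empty argument
            wide.push_back(std::wstring(&buffer[0], n));
        }
        return wide;
    }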
I'm not a Linux expert; I'm mainly working on Windows. If I decide to use unicode, I have the whole API in unicode without any need for conversions. Actually, in the project I'm working on now, I have encountered a need for a conversion only once: I'm using the date_time library, and there was no support for wide strings at the time. Fortunately it is fixed now :)

Regards,
Pavol