
On Mon, Jan 18, 2010 at 1:30 AM, Domagoj Saric <dsaritz@gmail.com> wrote:
'A while back', in the "error_code debate" I used a lexical_cast<> example for demonstrating certain concerns/aspects but the whole post was so long (as usual :) that it was probably read by less than 0.1% of people :) Anyways I'm extracting and reposting this bit now as I think it warrants attention.
The first problem is the ('standard') "std::streams vs efficiency" issue: on msvc 9.0 sp1 the single line boost::lexical_cast<int>( "321" ); caused 14 calls to new and 26 calls to EnterCriticalSection() (not to mention vtable initializations, virtual function calls, usage of locales...) the first time it was called, and 3 calls to new and 15 calls to EnterCriticalSection() on subsequent calls... It also caused the binary (which does not otherwise use streams) to increase by 50 kB! ...which is IMNHO abhorrent... (with my usual 'put things in perspective' examples http://www.theprodukkt.com/kkrieger http://www.youtube.com/watch?v=3vzcMdkvPPg :)
As a start maybe this problem could be sufficiently "lessened" by providing lexical_cast specializations/overloads that use C library functions (strtol, itoa and the like), as they suffer less from bloat/performance issues than std::streams. Ideally, IMHO, lexical_cast<> should provide the power/configurability of boost::numeric_cast (so that one can, for example, say "I do not need/want to use locales", "assume ASCII strings" and things like that).
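To make the strtol idea concrete, here is a minimal sketch of what such a conversion could look like for int. The name strtol_cast is hypothetical (it is not part of Boost), and the error handling is only one possible choice. Note that strtol skips leading whitespace, which lexical_cast does not accept, so a real specialization would probably have to check for that too.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of Boost: converts a decimal string to int
// through strtol instead of going through a std::stringstream.
inline int strtol_cast(const std::string& text)
{
    errno = 0;
    char* end = 0;
    const long value = std::strtol(text.c_str(), &end, 10);

    // Reject inputs without any digits ("", "abc") and inputs with
    // trailing characters ("12abc", or an embedded '\0').
    if (end == text.c_str() || end != text.c_str() + text.size())
        throw std::invalid_argument("strtol_cast: not a number");

    // Reject values that do not fit into long or into int.
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        throw std::out_of_range("strtol_cast: out of range");

    return static_cast<int>(value);
}
```

Nothing here touches locales-aware stream machinery or allocates, which is the point of the suggestion; whether it is actually smaller/faster on a given compiler would need measuring.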
The second problem is the use of exceptions (or rather, of using only exceptions): if, for example, one uses lexical_cast<> deep in the bowels of some parser it is probably "natural" to use a bad_cast exception to report an error to the outside world, but if one uses lexical_cast<> to convert a string entered into a widget by a user into an int it would mostly be simpler to get a simple error code if the user entered an invalid string, and to issue a warning and retry "at the face of the place" (without the need for try-catch blocks)... In other words maybe a dual/hybrid approach of using both exceptions and error codes (through different overloads) would be justified.
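A rough sketch of how such a dual interface might look, again for int only. Both names (try_parse_int, parse_int) are made up for illustration and do not exist in Boost.Lexical_cast; the throwing form is simply layered on top of the non-throwing one.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical names; neither function exists in Boost.Lexical_cast.

// Non-throwing form: returns false on failure and leaves `out` untouched.
inline bool try_parse_int(const std::string& text, int& out)
{
    errno = 0;
    char* end = 0;
    const long value = std::strtol(text.c_str(), &end, 10);

    // The whole string must have been consumed, and at least one digit read.
    const bool consumed_all =
        end != text.c_str() && end == text.c_str() + text.size();
    if (!consumed_all || errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return false;

    out = static_cast<int>(value);
    return true;
}

// Throwing form for callers that prefer exceptions (e.g. deep in a parser).
inline int parse_int(const std::string& text)
{
    int value = 0;
    if (!try_parse_int(text, value))
        throw std::invalid_argument("parse_int: bad input '" + text + "'");
    return value;
}
```

In the widget case the caller would just write if (!try_parse_int(field_text, age)) { /* warn and ask again */ } with no try-catch block, while parser code can keep using the throwing form.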
Boost.Lexical_cast is *slow* for sure. I have been making my own overloads for my own projects that have it use Boost.Spirit2.1 behind it instead of the stream. Boost.Spirit2 in the trunk (2.2, 2.3?), most likely to be released in Boost 1.43, has the ability to build parsers based on return types, which would allow us to have a very generic lexical_cast that would be a *GREAT* deal faster. Even for your example of boost::lexical_cast<int>( "321" ), spirit2 could still parse that faster than the native fast C function atoi. You can already do that pretty easily with Spirit2 in the trunk like this (not sure if this code has the proper identifier spelling, but close enough, it works like this):

    int result;
    std::string input = "321";
    boost::spirit::qi::gen_parse(input.begin(), input.end(), result);
    assert(result==321);

And yes, as stated, that will execute faster than the native C functions atoi/strtol/etc... Spirit includes a benchmark that you can run yourself as proof.
Plus with spirit, you can customize your grammar inline too, like:

    tuple<int,double> result;
    std::string input = "[ 42, 3.14 ]";
    boost::spirit::qi::phrase_parse(input.begin(), input.end(), '['>>int_>>','>>double_>>']', result, blank);
    assert(result==make_tuple(42,3.14));

Or even other more complicated things like:

    tuple<int,std::vector<double> > result;
    std::string input = "( 42, [3.14,1.2, 3.4] )";
    boost::spirit::qi::phrase_parse(input.begin(), input.end(), '('>>int_>>','>>'['>>double_%','>>']'>>')', result, blank);
    // assert(result==make_tuple(42,std::vector<double>(3.14,1.2,3.4))); // line of pseudo-code

Or the above using the generator parser:

    tuple<int,std::vector<double> > result;
    std::string input = "42,3.14,1.2,3.4";
    boost::spirit::qi::gen_phrase_parse(input.begin(), input.end(), result, lit(','));
    // assert(result==make_tuple(42,std::vector<double>(3.14,1.2,3.4))); // line of pseudo-code

So if you want speed, use something else, like Boost.Spirit. Boost.Lexical_cast is made for simplicity, not speed, although much of it could certainly be sped up if Boost.Spirit becomes part of its back-end, and with some template magic it can even fall back to stringstream if spirit does not know how to parse a type directly (a type for which you would otherwise need to supply a grammar).
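The "fall back to stringstream" part could be sketched roughly like this. To keep the example free of Boost dependencies, strtol stands in for the fast Spirit-based path, so this only shows the dispatch shape, not the actual Spirit integration; cast_backend and fast_cast are hypothetical names.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical back-end selector. The primary template is the generic
// (slow) stringstream path, used for any type without a fast path.
template <typename Target>
struct cast_backend
{
    static Target apply(const std::string& text)
    {
        std::istringstream in(text);
        Target value = Target();
        // Require a successful extraction that consumes the whole input.
        if (!(in >> value) ||
            in.peek() != std::istringstream::traits_type::eof())
            throw std::invalid_argument("fast_cast: stream conversion failed");
        return value;
    }
};

// Fast path for int. In the scheme described above this would call a
// Spirit parser; strtol is used here only as a stand-in.
template <>
struct cast_backend<int>
{
    static int apply(const std::string& text)
    {
        errno = 0;
        char* end = 0;
        const long value = std::strtol(text.c_str(), &end, 10);
        if (end == text.c_str() || end != text.c_str() + text.size() ||
            errno == ERANGE || value < INT_MIN || value > INT_MAX)
            throw std::invalid_argument("fast_cast: not an int");
        return static_cast<int>(value);
    }
};

// Single entry point; the specialization is chosen at compile time.
template <typename Target>
Target fast_cast(const std::string& text)
{
    return cast_backend<Target>::apply(text);
}
```

So fast_cast<int>("321") takes the specialized path, while something like fast_cast<double>("3.5") quietly goes through the stream until somebody adds a faster back-end for it.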