Re: [boost] [lexical_cast] Fast conversion from boost::iterator_range<std::string::iterator>?

On Oct 13, 2011 Michel Morin wrote:
lexical_cast made optimizations for some types (such as `std::string`) to achieve better performance. But, from my experience, it seems that lexical_cast does not make any optimization for `boost::iterator_range<std::string::iterator>`.
Is there any plan to add optimization for such iterator ranges? Currently,
- `lexical_cast<int>(std::string(iter_rng.begin(), iter_rng.end()))` is faster than `lexical_cast<int>(iter_rng)`.
- Assuming it is safe to use `std::atoi`, `std::atoi(&iter_rng.front())` is faster than both of the above two methods.
(`iter_rng` is an instance of `boost::iterator_range<std::string::iterator>`.)
Regards, Michel
This post was on the boost users list but never got a reply. I figure it might be a better fit on the developer list. I too would like to see optimized conversions work on string ranges; what's the point in doing a fast conversion if you have to do a string copy first? :) -Matt
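To make the trade-off in the question concrete, here is a minimal standard-library sketch of the two strategies: converting straight from an iterator pair, and copying into a `std::string` first. Both `parse_uint_range` and `parse_via_copy` are hypothetical helpers written for illustration; neither is part of Boost or of lexical_cast.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of Boost): parse a non-negative decimal
// integer directly from [first, last) without building a std::string.
// Returns false on empty input, a non-digit character, or overflow.
template <class Iterator>
bool parse_uint_range(Iterator first, Iterator last, unsigned long& out)
{
    if (first == last) return false;
    const unsigned long max_value = static_cast<unsigned long>(-1);
    unsigned long value = 0;
    for (; first != last; ++first) {
        const char c = *first;
        if (c < '0' || c > '9') return false;
        const unsigned long digit = static_cast<unsigned long>(c - '0');
        if (value > (max_value - digit) / 10) return false;  // would overflow
        value = value * 10 + digit;
    }
    out = value;
    return true;
}

// The copying alternative from the post: materialize the range as a
// std::string, then hand it to a C conversion routine.
template <class Iterator>
long parse_via_copy(Iterator first, Iterator last)
{
    assert(first != last);                  // caller promises a non-empty range
    const std::string copy(first, last);    // extra allocation and copy
    return std::strtol(copy.c_str(), nullptr, 10);
}
```

The first function touches only the characters inside the range, so it needs neither a copy nor a terminating zero; the second pays for the copy in exchange for reusing an existing conversion routine.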

On Wed, Nov 23, 2011 at 5:55 PM, Matthew Chambers <matt.chambers42@gmail.com> wrote:
On Oct 13, 2011 Michel Morin wrote:
lexical_cast made optimizations for some types (such as `std::string`) to achieve better performance. But, from my experience, it seems that lexical_cast does not make any optimization for `boost::iterator_range<std::string::iterator>`.
Is there any plan to add optimization for such iterator ranges? Currently, - `lexical_cast<int>(std::string(iter_rng.begin(), iter_rng.end()))` is faster than `lexical_cast<int>(iter_rng)`.
Why's that?
- Assuming it is safe to use `std::atoi`, `std::atoi(&iter_rng.front())` is faster than both of the above two methods.
Is front() guaranteed to return a 0-terminated string? -- Olaf
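The concern about zero termination can be illustrated with a small sketch. It assumes a C++11-conforming `std::string` (contiguous storage with a terminator after the last character); `atoi_on_range_start` is a hypothetical name used only here.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns what std::atoi sees when handed a pointer to the first character
// of the sub-range [first, last) of a std::string. `last` is used only for
// the precondition check: atoi itself has no idea where the range ends.
inline int atoi_on_range_start(std::string::const_iterator first,
                               std::string::const_iterator last)
{
    assert(first != last);        // &*first is only valid for a non-empty range
    return std::atoi(&*first);    // keeps reading until a non-digit or the terminator
}
```

For a range covering only the first three characters of "123456", the call still yields 123456, because `std::atoi` keeps reading past the end of the range. The result is right only when the range happens to end where the digits end.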

On 11/23/2011 11:55 AM, Olaf van der Spek wrote:
On Wed, Nov 23, 2011 at 5:55 PM, Matthew Chambers wrote:
On Oct 13, 2011 Michel Morin wrote:
lexical_cast made optimizations for some types (such as `std::string`) to achieve better performance. But, from my experience, it seems that lexical_cast does not make any optimization for `boost::iterator_range<std::string::iterator>`.
Is there any plan to add optimization for such iterator ranges? Currently, - `lexical_cast<int>(std::string(iter_rng.begin(), iter_rng.end()))` is faster than `lexical_cast<int>(iter_rng)`.
Why's that?
Why is it faster? Presumably because lexical_cast has optimized specializations for std::string but not for iterator_range<std::string::iterator>. So creating a new string from the range invokes the optimized specialization and ends up being faster.
- Assuming it is safe to use `std::atoi`, `std::atoi(&iter_rng.front())` is faster than both of the above two methods.
Is front() guaranteed to return a 0-terminated string?
Certainly not. I consider that caveat to be part of the "assuming it is safe" precondition. He's just saying that there is potential to be faster. Although he didn't compare lexical_cast<int> on a string without copying the string, which should be comparable to atoi. I put together a test (http://codepad.org/lKxMXOwP):

1000000 iterations of "123":
  atoi: 0.0695092 seconds
  strtol: 0.0652839 seconds
  Spirit: 0.546789 seconds
  lexical_cast(string): 0.615594 seconds
  lexical_cast(iterator_range): 4.25534 seconds
  lexical_cast(iterator_range->string): 1.07405 seconds

1000000 iterations of "123567890":
  atoi: 0.0856611 seconds
  strtol: 0.0829571 seconds
  Spirit: 0.905017 seconds
  lexical_cast(string): 0.762137 seconds
  lexical_cast(iterator_range): 7.46185 seconds
  lexical_cast(iterator_range->string): 1.23397 seconds

1000000 iterations of "1.23456":
  atof: 0.520452 seconds
  strtod: 0.519113 seconds
  Spirit: 1.02579 seconds
  lexical_cast(string): 2.93515 seconds
  lexical_cast(iterator_range): 6.62648 seconds
  lexical_cast(iterator_range->string): 3.73831 seconds

1000000 iterations of "1.23456789e42":
  atof: 0.711819 seconds
  strtod: 0.719144 seconds
  Spirit: 1.48992 seconds
  lexical_cast(string): 3.31997 seconds
  lexical_cast(iterator_range): 9.87004 seconds
  lexical_cast(iterator_range->string): 4.20932 seconds

-Matt

On 11/23/2011 4:13 PM, Matthew Chambers wrote:
I put together a test (http://codepad.org/lKxMXOwP):
1000000 iterations of "123": atoi: 0.0695092 seconds strtol: 0.0652839 seconds Spirit: 0.546789 seconds lexical_cast(string): 0.615594 seconds lexical_cast(iterator_range): 4.25534 seconds lexical_cast(iterator_range->string): 1.07405 seconds
1000000 iterations of "123567890": atoi: 0.0856611 seconds strtol: 0.0829571 seconds Spirit: 0.905017 seconds lexical_cast(string): 0.762137 seconds lexical_cast(iterator_range): 7.46185 seconds lexical_cast(iterator_range->string): 1.23397 seconds
1000000 iterations of "1.23456": atof: 0.520452 seconds strtod: 0.519113 seconds Spirit: 1.02579 seconds lexical_cast(string): 2.93515 seconds lexical_cast(iterator_range): 6.62648 seconds lexical_cast(iterator_range->string): 3.73831 seconds
1000000 iterations of "1.23456789e42": atof: 0.711819 seconds strtod: 0.719144 seconds Spirit: 1.48992 seconds lexical_cast(string): 3.31997 seconds lexical_cast(iterator_range): 9.87004 seconds lexical_cast(iterator_range->string): 4.20932 seconds
I forgot to say the test environment: boost 1.47, MSVC 9 SP1, Core 2 Q9400
-Matt
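The test program itself lives behind the codepad link and is not reproduced in the thread. As a rough idea of what such a measurement loop can look like, here is a sketch under stated assumptions: `time_conversions`, `convert_with_atoi` and `convert_with_strtol` are hypothetical names, and `std::chrono` (C++11) is used for timing, which is not necessarily what the original test used.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>

// Hypothetical timing helper: calls `convert(text)` `iterations` times,
// sums the results into `checksum` (so the calls are less likely to be
// optimized away) and returns the elapsed wall-clock time in seconds.
template <class Convert>
double time_conversions(const char* text, long iterations,
                        Convert convert, long& checksum)
{
    assert(iterations > 0);
    typedef std::chrono::steady_clock clock;
    const clock::time_point start = clock::now();
    long sum = 0;
    for (long i = 0; i < iterations; ++i)
        sum += convert(text);
    const std::chrono::duration<double> elapsed = clock::now() - start;
    checksum = sum;
    return elapsed.count();
}

inline long convert_with_atoi(const char* text) { return std::atoi(text); }
inline long convert_with_strtol(const char* text) { return std::strtol(text, nullptr, 10); }
```

Printing the checksum next to each timing is a cheap way to confirm that every method actually produced the same values.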

On 11/23/2011 04:15 PM, Matthew Chambers wrote:
On 11/23/2011 4:13 PM, Matthew Chambers wrote:
I put together a test (http://codepad.org/lKxMXOwP):
1000000 iterations of "123": atoi: 0.0695092 seconds strtol: 0.0652839 seconds Spirit: 0.546789 seconds lexical_cast(string): 0.615594 seconds lexical_cast(iterator_range): 4.25534 seconds lexical_cast(iterator_range->string): 1.07405 seconds
1000000 iterations of "123567890": atoi: 0.0856611 seconds strtol: 0.0829571 seconds Spirit: 0.905017 seconds lexical_cast(string): 0.762137 seconds lexical_cast(iterator_range): 7.46185 seconds lexical_cast(iterator_range->string): 1.23397 seconds
1000000 iterations of "1.23456": atof: 0.520452 seconds strtod: 0.519113 seconds Spirit: 1.02579 seconds lexical_cast(string): 2.93515 seconds lexical_cast(iterator_range): 6.62648 seconds lexical_cast(iterator_range->string): 3.73831 seconds
1000000 iterations of "1.23456789e42": atof: 0.711819 seconds strtod: 0.719144 seconds Spirit: 1.48992 seconds lexical_cast(string): 3.31997 seconds lexical_cast(iterator_range): 9.87004 seconds lexical_cast(iterator_range->string): 4.20932 seconds
I forgot to say the test environment: boost 1.47 MSVC 9 SP1 Core 2 Q9400
-Matt
The spirit code isn't really optimal (see the slightly improved version attached). Also, did you turn on optimization?

Apparently not (MSVC2010, 64bit):

1000000 iterations of "123":
  atoi: 0.0366391 seconds
  strtol: 0.0327922 seconds
  Spirit: 0.0117357 seconds
  lexical_cast(string): 0.0702884 seconds
  lexical_cast(iterator_range): 0.742229 seconds
  lexical_cast(iterator_range->string): 0.0797006 seconds

1000000 iterations of "123567890":
  atoi: 0.0489029 seconds
  strtol: 0.0473159 seconds
  Spirit: 0.0209374 seconds
  lexical_cast(string): 0.0842183 seconds
  lexical_cast(iterator_range): 1.02365 seconds
  lexical_cast(iterator_range->string): 0.0912515 seconds

1000000 iterations of "1.23456":
  atof: 0.295518 seconds
  strtod: 0.279579 seconds
  Spirit: 0.0308074 seconds
  lexical_cast(string): 1.2818 seconds
  lexical_cast(iterator_range): 2.15233 seconds
  lexical_cast(iterator_range->string): 1.35335 seconds

1000000 iterations of "1.23456789e42":
  atof: 0.422748 seconds
  strtod: 0.407293 seconds
  Spirit: 0.0479103 seconds
  lexical_cast(string): 1.47888 seconds
  lexical_cast(iterator_range): 2.61988 seconds
  lexical_cast(iterator_range->string): 1.52793 seconds

That's more like it :-P

Regards Hartmut
---------------
http://boost-spirit.com
http://stellar.cct.lsu.edu

On 11/23/2011 4:35 PM, Hartmut Kaiser wrote:
On 11/23/2011 04:15 PM, Matthew Chambers wrote: The spirit code isn't really optimal (see the slightly improved version attached).
OK. That would be nice in the docs...took me 15 minutes to figure out what I did have! :(
Also, did you turn on optimization?
I DID have optimization on but had some other junk on: debug symbols, /CLR (this was huge!), and the default _SECURE_SCL=1. After changing these, it got better but still not quite as good as MSVC 10. _SECURE_SCL was a major difference between MSVC9 and 10 (where it defaults to off).
Apparently not (MSVC2010, 64bit):
1000000 iterations of "123": atoi: 0.0366391 seconds strtol: 0.0327922 seconds Spirit: 0.0117357 seconds lexical_cast(string): 0.0702884 seconds lexical_cast(iterator_range): 0.742229 seconds lexical_cast(iterator_range->string): 0.0797006 seconds
1000000 iterations of "123567890": atoi: 0.0489029 seconds strtol: 0.0473159 seconds Spirit: 0.0209374 seconds lexical_cast(string): 0.0842183 seconds lexical_cast(iterator_range): 1.02365 seconds lexical_cast(iterator_range->string): 0.0912515 seconds
1000000 iterations of "1.23456": atof: 0.295518 seconds strtod: 0.279579 seconds Spirit: 0.0308074 seconds lexical_cast(string): 1.2818 seconds lexical_cast(iterator_range): 2.15233 seconds lexical_cast(iterator_range->string): 1.35335 seconds
1000000 iterations of "1.23456789e42": atof: 0.422748 seconds strtod: 0.407293 seconds Spirit: 0.0479103 seconds lexical_cast(string): 1.47888 seconds lexical_cast(iterator_range): 2.61988 seconds lexical_cast(iterator_range->string): 1.52793 seconds
That's more like it :-P
I added error checking to strto[dl] (what's the point of using it otherwise?). My new figures (32-bit):

1000000 iterations of "123":
  atoi: 0.0434303 seconds
  strtol: 0.0385471 seconds
  Spirit: 0.27993 seconds
  lexical_cast(string): 0.337947 seconds
  lexical_cast(iterator_range): 3.13472 seconds
  lexical_cast(iterator_range->string): 0.441562 seconds

1000000 iterations of "123567890":
  atoi: 0.0635996 seconds
  strtol: 0.0627828 seconds
  Spirit: 0.550669 seconds
  lexical_cast(string): 0.440172 seconds
  lexical_cast(iterator_range): 4.85826 seconds
  lexical_cast(iterator_range->string): 0.516084 seconds

1000000 iterations of "1.23456":
  atof: 0.495972 seconds
  strtod: 0.507613 seconds
  Spirit: 0.685259 seconds
  lexical_cast(string): 2.74444 seconds
  lexical_cast(iterator_range): 4.62169 seconds
  lexical_cast(iterator_range->string): 2.8685 seconds

1000000 iterations of "1.23456789e42":
  atof: 0.700639 seconds
  strtod: 0.707499 seconds
  Spirit: 0.967519 seconds
  lexical_cast(string): 3.16936 seconds
  lexical_cast(iterator_range): 6.19912 seconds
  lexical_cast(iterator_range->string): 3.29152 seconds

Arash Partow pointed me at his thorough article: http://www.codeproject.com/KB/recipes/Tokenizer.aspx
Based on Boost 1.48 (not sure how many performance changes went into lexical_cast between 1.47 and 1.48). He's using Profile Guided Optimization though; it seems to make a big difference which has not been my experience in bigger projects.
-Matt
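The thread does not show what the added error checking looked like. A common pattern for `std::strtol` is sketched below; `checked_strtol` is a hypothetical wrapper, and the exact checks in the benchmark may well have differed.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical wrapper: converts a zero-terminated string with std::strtol
// and reports failure instead of silently returning 0 or a clamped value.
inline bool checked_strtol(const char* text, long& out)
{
    assert(text != nullptr);
    char* end = nullptr;
    errno = 0;
    const long value = std::strtol(text, &end, 10);
    if (end == text) return false;       // no digits were consumed
    if (*end != '\0') return false;      // trailing characters after the number
    if (errno == ERANGE) return false;   // value does not fit in a long
    out = value;
    return true;
}
```

With these checks the C routine gives roughly the same guarantees as a throwing conversion, which makes the timing comparison fairer than a bare `atoi` call.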

Something is definitely fishy with your Spirit numbers.

Regards Hartmut
---------------
http://boost-spirit.com
http://stellar.cct.lsu.edu

On 11/24/2011 7:30 AM, Hartmut Kaiser wrote:
Something is definitely fishy with your Spirit numbers.
Yes, there's something wrong with the numbers alright. We have extensive int and real number tests in Spirit. I did a run and here's what I got:

32bit:
  atoi_test: 2.7353607652 [s] {checksum: 3b6f0e0}
  strtol_test: 2.5913341972 [s] {checksum: 3b6f0e0}
  spirit_int_test: 1.2209585612 [s] {checksum: 3b6f0e0}
  atof_test: 2.7674816420 [s] {checksum: 84a4f7d}
  strtod_test: 3.0302511938 [s] {checksum: 84a4f7d}
  spirit_double_test: 0.8215847269 [s] {checksum: 84a4f7d}

///////////////////////////////////////////////////////////////////////////

64bit:
  atoi_test: 2.5954563140 [s] {checksum: ea987f7a}
  strtol_test: 2.4382622754 [s] {checksum: ea987f7a}
  spirit_int_test: 1.0699516219 [s] {checksum: ea987f7a}
  atof_test: 2.7788751173 [s] {checksum: 84a4f7d}
  strtod_test: 3.0187541826 [s] {checksum: 84a4f7d}
  spirit_double_test: 0.8012355484 [s] {checksum: 84a4f7d}

Check it out. It's in: boost/libs/spirit/optimization/

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://boost-spirit.com

On 11/23/2011 7:32 PM, Joel de Guzman wrote:
On 11/24/2011 7:30 AM, Hartmut Kaiser wrote:
Something is definitely fishy with your Spirit numbers.
Yes, there's something wrong with the numbers alright. We have extensive int and real number tests in Spirit. I did a run and here's what I got:
Check it out. It's in:
boost/libs/spirit/optimization/
It doesn't include lexical_cast or iterator_range which is the topic here. My Spirit discrepancy is interesting (and I'd like to resolve it) but not on topic.

MSVC 9 SP1
----------
Compile options (minus some warning flags):
/Ot /GL /D "WIN32" /D "NDEBUG" /D "_UNICODE" /D "UNICODE" /FD /EHsc /MD /Fo"Release\\" /Fd"Release\vc90.pdb" /c /TP
Link options (minus some manifest stuff):
/OPT:REF /OPT:ICF /LTCG /DYNAMICBASE /NXCOMPAT /MACHINE:X86
----------

1000000 iterations of "123":
  atoi: 0.0391888 seconds
  strtol: 0.038651 seconds
  Spirit: 0.273361 seconds
  lexical_cast(string): 0.248789 seconds
  lexical_cast(iterator_range): 2.24887 seconds
  lexical_cast(iterator_range->string): 0.340937 seconds

1000000 iterations of "123567890":
  atoi: 0.0631489 seconds
  strtol: 0.0616906 seconds
  Spirit: 0.52429 seconds
  lexical_cast(string): 0.311549 seconds
  lexical_cast(iterator_range): 3.60336 seconds
  lexical_cast(iterator_range->string): 0.411227 seconds

1000000 iterations of "1.23456":
  atof: 0.489091 seconds
  strtod: 0.50077 seconds
  Spirit: 0.578644 seconds
  lexical_cast(string): 2.88651 seconds
  lexical_cast(iterator_range): 5.84474 seconds
  lexical_cast(iterator_range->string): 3.00146 seconds

1000000 iterations of "1.23456789e42":
  atof: 0.689908 seconds
  strtod: 0.701619 seconds
  Spirit: 0.871071 seconds
  lexical_cast(string): 3.27655 seconds
  lexical_cast(iterator_range): 7.47095 seconds
  lexical_cast(iterator_range->string): 3.41462 seconds

MSVC 10
-------
Compile options (minus some warning flags):
/Zi /O2 /Ot /Oy- /GL /D "WIN32" /D "NDEBUG" /D "_UNICODE" /D "UNICODE" /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope /Fp"Release\test.pch" /Fa"Release\" /Fo"Release\" /Fd"Release\vc100.pdb" /Gd /analyze-
Link options (minus some manifest stuff):
/PDB:"D:\test\test\Release\test.pdb" /SUBSYSTEM:CONSOLE /OPT:REF /OPT:ICF /PGD:"D:\test\test\Release\test.pgd" /LTCG /TLBID:1 /DYNAMICBASE /NXCOMPAT /MACHINE:X86
-------

1000000 iterations of "123":
  atoi: 0.0428806 seconds
  strtol: 0.0381644 seconds
  Spirit: 0.0155017 seconds
  lexical_cast(string): 0.176925 seconds
  lexical_cast(iterator_range): 1.75922 seconds
  lexical_cast(iterator_range->string): 0.193789 seconds

1000000 iterations of "123567890":
  atoi: 0.0587083 seconds
  strtol: 0.0606732 seconds
  Spirit: 0.0342255 seconds
  lexical_cast(string): 0.201997 seconds
  lexical_cast(iterator_range): 2.49349 seconds
  lexical_cast(iterator_range->string): 0.210316 seconds

1000000 iterations of "1.23456":
  atof: 0.450875 seconds
  strtod: 0.448144 seconds
  Spirit: 0.0557749 seconds
  lexical_cast(string): 2.53753 seconds
  lexical_cast(iterator_range): 4.62941 seconds
  lexical_cast(iterator_range->string): 2.55487 seconds

1000000 iterations of "1.23456789e42":
  atof: 0.656821 seconds
  strtod: 0.657229 seconds
  Spirit: 0.0721354 seconds
  lexical_cast(string): 2.87113 seconds
  lexical_cast(iterator_range): 5.72209 seconds
  lexical_cast(iterator_range->string): 2.91933 seconds

The Spirit issue seems to be between MSVC 9 and 10. Still off topic, but is anybody able to get better Spirit results with MSVC 9?

Finally, I got lexical_cast down to its best performance with BOOST_LEXICAL_CAST_ASSUME_C_LOCALE (again using MSVC 10). There is some kind of optimization going on with the iterator_range->string branch, so let's ignore that. I added a sscanf line for better comparison with http://www.boost.org/doc/libs/1_48_0/doc/html/boost_lexical_cast/performance... :

1000000 iterations of "123":
  atoi: 0.0395041 seconds
  strtol: 0.037376 seconds
  sscanf: 0.141029 seconds
  lexical_cast(string): 0.0169919 seconds
  lexical_cast(iterator_range): 1.54364 seconds

1000000 iterations of "123567890":
  atoi: 0.0575238 seconds
  strtol: 0.0577509 seconds
  sscanf: 0.194656 seconds
  lexical_cast(string): 0.0428486 seconds
  lexical_cast(iterator_range): 2.23957 seconds

1000000 iterations of "1.23456":
  atof: 0.437103 seconds
  strtod: 0.45665 seconds
  sscanf: 0.590672 seconds
  Spirit: 0.0519114 seconds
  lexical_cast(iterator_range): 4.63105 seconds

1000000 iterations of "1.23456789e42":
  atof: 0.63978 seconds
  strtod: 0.673893 seconds
  sscanf: 0.888611 seconds
  lexical_cast(string): 2.88724 seconds
  lexical_cast(iterator_range): 5.65133 seconds

Still we see that lexical_cast is quite slow for string->float (at least on MSVC 10) which doesn't match the GCC results on the performance.html page and it's abysmal for converting iterator_ranges.

In case someone else wants to try, the latest code I'm using to test: http://codepad.org/P6Os5MKf

I'm hoping Antony Polukhin will chime in and say how hard it would be to add iterator_range specializations and redirect the existing string->T specializations to forward to the iterator_range ones. I don't understand the TMP in the new lexical_cast. Also, [Antony] please add something like: "Tests were compiled with BOOST_LEXICAL_CAST_ASSUME_C_LOCALE defined for optimum performance." to the test description on the performance.html page?

Thanks,
-Matt

Hi, On 23 November 2011 17:55, Matthew Chambers <matt.chambers42@gmail.com> wrote:
On Oct 13, 2011 Michel Morin wrote:
lexical_cast made optimizations for some types (such as `std::string`) to achieve better performance. But, from my experience, it seems that lexical_cast does not make any optimization for `boost::iterator_range<std::string::iterator>`.
Is there any plan to add optimization for such iterator ranges? Currently,
- `lexical_cast<int>(std::string(iter_rng.begin(), iter_rng.end()))` is faster than `lexical_cast<int>(iter_rng)`.
- Assuming it is safe to use `std::atoi`, `std::atoi(&iter_rng.front())` is faster than both of the above two methods.
(`iter_rng` is an instance of `boost::iterator_range<std::string::iterator>`.)
Regards, Michel
This post was on the boost users list but never got a reply. I figure it might be a better fit on the developer list. I too would like to see optimized conversions work on string ranges; what's the point in doing a fast conversion if you have to do a string copy first? :)
-Matt
You might be interested in boost.coerce, a type-to-string and string-to-type conversion library. The code can be found at http://svn.boost.org/svn/boost/sandbox/coerce/ and an initial version of the documentation at http://vexocide.org/coerce/. It has the following customization points: http://vexocide.org/coerce/coerce/traits/string/is_string.html and http://vexocide.org/coerce/coerce/traits/string/string_traits.html, which would allow you to teach it about boost::iterator_range<std::string::iterator>. If I find time I'll write a small example. Kind regards, Jeroen

On 11/23/2011 4:49 PM, Jeroen Habraken wrote:
You might be interested in boost.coerce, a type-to-string and string-to-type conversion library. The code can be found at http://svn.boost.org/svn/boost/sandbox/coerce/ and an initial version of the documentation at http://vexocide.org/coerce/.
It has the following customization points: http://vexocide.org/coerce/coerce/traits/string/is_string.html and http://vexocide.org/coerce/coerce/traits/string/string_traits.html, which would allow you to teach it about boost::iterator_range<std::string::iterator>.
If I find time I'll write a small example.
Unless I'm missing something, using Spirit in a header-only conversion library is pretty much a non-starter due to the compile time -explosion- hit. If you moved the inclusion of Spirit into a CPP I'd definitely consider it. Thanks, -Matt

Thanks Matt for reviving the thread! I just created a trac ticket to draw the author's attention. https://svn.boost.org/trac/boost/ticket/6430 Regards, Michel

Sorry for not writing for a long time, totally forgot about this thread.
2011/11/23 Matthew Chambers <matt.chambers42@gmail.com>:
I too would like to see optimized conversions work on string ranges; what's the point in doing a fast conversion if you have to do a string copy first? :)
There are a lot of libraries, that can have tuned conversions for lexical_cast. The bad thing, is that lexical_cast can be customized ONLY via overloading operator>>(stream&) and operator<<(stream&), and that is not very fast. That is the design. As simple, as possible. Multiple customization points will make the design obfuscated. It would be also a bad idea, to include a lot of different library headers to lexical_cast.hpp (will increase compilation times, add unnecessary dependencies...)
2011/11/24 Matthew Chambers <matt.chambers42@gmail.com>:
On 11/23/2011 11:55 AM, Olaf van der Spek wrote:
On Wed, Nov 23, 2011 at 5:55 PM, Matthew Chambers wrote:
On Oct 13, 2011 Michel Morin wrote:
- Assuming it is safe to use `std::atoi`, `std::atoi(&iter_rng.front())` is faster than both of the above two methods.
Is front() guaranteed to return a 0-terminated string?
Certainly not. I consider that caveat to be part of the "assuming it is safe" precondition.
If you assume, that it is safe to use &iter_rng.front() (extremely unsafe!), you can add overload for lexical_cast to the iterators_range_io.hpp. Something like this:

    template <class OutT>
    OutT lexical_cast(const iterator_range<char>& iter_rng)
    {
        return lexical_cast<OutT>(&iter_rng.front());
    }

Then it will benefit from all the char* optimizations.
2011/11/28 Matthew Chambers <matt.chambers42@gmail.com>:
Finally, I got lexical_cast down to its best performance with BOOST_LEXICAL_CAST_ASSUME_C_LOCALE (again using MSVC 10). There is some kind of optimization going on with the iterator_range->string branch, so let's ignore that. I added a sscanf line for better comparison with http://www.boost.org/doc/libs/1_48_0/doc/html/boost_lexical_cast/performance...
Thanks for that info. It looks like std::locale class has a totally different implementation under VC. I'll take care of that and tune lexical_cast implementation for VC.
Still we see that lexical_cast is quite slow for string->float (at least on MSVC 10) which doesn't match the GCC results on the performance.html page and it's abysmal for converting iterator_ranges.
Not string->float. You are using string->double conversion. Under VC it is not tuned, because a tuned version is not exactly precise. And trading accuracy for speed is a questionable solution. By the way, last time I was looking through the Spirit implementation, it was fast, but not perfectly accurate (incorrect values in 18-20th sign after dot)
I'm hoping Antony Polukhin will chime in and say how hard it would be to add iterator_range specializations and redirect the existing string->T specializations to forward to the iterator_range ones. I don't understand the TMP in the new lexical_cast. Also, [Antony] please add something like:
Optimizations for iterator ranges will not be added in nearest releases.
"Tests were compiled with BOOST_LEXICAL_CAST_ASSUME_C_LOCALE defined for optimum performance." to the test description on the performance.html page?
They were compiled without BOOST_LEXICAL_CAST_ASSUME_C_LOCALE. Differences between GCCs and VCs implementations of std::locale are huge. I'll optimize lexical_cast for VC soon.
Best regards, Antony Polukhin
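The stream-based customization described in this message (overloading `operator>>` and `operator<<`) can be illustrated without Boost. The sketch below is not lexical_cast's implementation; `stream_convert` is a hypothetical stand-in for a generic fallback that goes through a string stream, and `Celsius` is an invented user type.

```cpp
#include <istream>
#include <sstream>
#include <string>

// A user-defined type made convertible the only way a stream-based
// fallback allows: by providing a stream extraction operator.
struct Celsius { double degrees; };

inline std::istream& operator>>(std::istream& in, Celsius& c)
{
    return in >> c.degrees;
}

// Hypothetical generic fallback (NOT Boost's code): read the target from
// a string stream and require that the whole input was consumed.
template <class Target>
bool stream_convert(const std::string& source, Target& out)
{
    std::istringstream stream(source);
    stream.unsetf(std::ios_base::skipws);   // do not silently skip whitespace
    Target value;
    if (!(stream >> value)) return false;   // extraction failed
    if (stream.get() != std::istringstream::traits_type::eof())
        return false;                       // leftover characters
    out = value;
    return true;
}
```

Every conversion on this path constructs a stream and goes through the locale machinery, which is one plausible reason a generic path is so much slower than a type-specific one.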

Hi Antony, Thanks for your response. Antony Polukhin wrote:
There are a lot of libraries, that can have tuned conversions for lexical_cast. The bad thing, is that lexical_cast can be customized ONLY via overloading operator>>(stream&) and operator<<(stream&), and that is not very fast. That is the design. As simple, as possible. Multiple customization points will make the design obfuscated.
It would be also a bad idea, to include a lot of different library headers to lexical_cast.hpp (will increase compilation times, add unnecessary dependencies...)
Makes sense. But, IMHO, it's worth optimizing codes for string ranges, since string ranges are often used in string algorithms.
If you assume, that it is safe to use &iter_rng.front() (extremely unsafe!),
IIUC, the main concern to implement optimized code is that how to obtain begin and end pointer to CharT from `iter_rng` safely. With C++11-conforming standard library, it is safe to implement

    bool operator<<(::boost::iterator_range<std::string::iterator> const& str)
    {
        if (!str.empty()) {
            start = const_cast<CharT*>(&str.front());
            finish = start + str.size();
        } else {
            start = 0;
            finish = 0;
        }
        return true;
    }

because `&*(s.begin() + n) == &*s.begin() + n` (`s` is an object of std::basic_string<…>) is guaranteed. But, in C++03, the above code is not safe; the standard does not guarantee that std::basic_string<…> is stored contiguously (though many standard library implementations store it contiguously).
Regards, Michel
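The pointer extraction discussed here can be tried in isolation with plain `std::string` iterators. This is a sketch that assumes a C++11-conforming library (contiguous string storage); `pointer_range` is a hypothetical helper, not something lexical_cast provides.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Turns a pair of std::string iterators into a [begin, end) pointer pair
// without copying. Relies on &*(s.begin() + n) == &*s.begin() + n, which
// C++11 guarantees for std::basic_string and C++03 does not.
inline std::pair<const char*, const char*>
pointer_range(std::string::const_iterator first,
              std::string::const_iterator last)
{
    const char* none = nullptr;
    if (first == last)                    // never dereference an empty range
        return std::make_pair(none, none);
    const char* start = &*first;
    const char* finish = start + (last - first);
    assert(&*(last - 1) == finish - 1);   // contiguity check on the last element
    return std::make_pair(start, finish);
}
```

The empty-range branch mirrors the `else` branch of the quoted `operator<<`: dereferencing `first` when the range is empty would be undefined behaviour.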

On Thu, Jan 26, 2012 at 6:14 PM, Michel Morin <mimomorin@gmail.com> wrote:
IIUC, the main concern to implement optimized code is that how to obtain begin and end pointer to CharT from `iter_rng` safely. With C++11-conforming standard library, it is safe to implement
    bool operator<<(::boost::iterator_range<std::string::iterator> const& str)
    {
        if (!str.empty()) {
            start = const_cast<CharT*>(&str.front());
            finish = start + str.size();
        } else {
            start = 0;
            finish = 0;
        }
        return true;
    }
because `&*(s.begin() + n) == &*s.begin() + n` (`s` is an object of std::basic_string<…>) is guaranteed.
But, in C++03, the above code is not safe; the standard does not guarantee that std::basic_string<…> is stored contiguously (though many standard library implementations store it contiguously).
What about using c_str() instead of begin()? It does provide that guarantee in C++03. -- Olaf

Olaf van der Spek wrote:
On Thu, Jan 26, 2012 at 6:14 PM, Michel Morin <mimomorin@gmail.com> wrote:
IIUC, the main concern to implement optimized code is that how to obtain begin and end pointer to CharT from `iter_rng` safely. With C++11-conforming standard library, it is safe to implement
    bool operator<<(::boost::iterator_range<std::string::iterator> const& str)
    {
        if (!str.empty()) {
            start = const_cast<CharT*>(&str.front());
            finish = start + str.size();
        } else {
            start = 0;
            finish = 0;
        }
        return true;
    }
because `&*(s.begin() + n) == &*s.begin() + n` (`s` is an object of std::basic_string<…>) is guaranteed.
But, in C++03, the above code is not safe; the standard does not guarantee that std::basic_string<…> is stored contiguously (though many standard library implementations store it contiguously).
What about using c_str() instead of begin()? It does provide that guarantee in C++03.
We cannot use c_str() because we only have iterators. P.S. Antony, I created a trac ticket but could not change the owner to you. Could you take a look at it? https://svn.boost.org/trac/boost/ticket/6453 Regards, Michel

participants (8)
- Antony Polukhin
- Hartmut Kaiser
- Jeroen Habraken
- Joel de Guzman
- Matthew Chambers
- Michel Morin
- Olaf van der Spek
- Thomas Heller