
On Sat, Jul 18, 2009 at 2:13 AM, Eric Niebler<eric@boostpro.com> wrote:
Michael Caisse wrote:
OvermindDL1 wrote:
Parsing: 42.5
<snip>
spirit-grammar(threadsafe/reusable): 3.1393
Thank you for pulling this together. Would you mind sharing your test suite?
Er, I meant to attach it; it is attached now. :) It requires Boost trunk. The timer .hpp file I include is part of the Boost.Spirit2.1 examples/test/somewhere_in_there area, but I included it with my cpp file too so you do not need to hunt for it. The defines at the top control which parts get compiled: 0 disables a part, 1 enables it.

My build is built with Visual Studio 8 (2005) with SP1. Compiler options are basically defaults, except for getting rid of the secure CRT crap that Microsoft screwed up (enabling that crap slows down Spirit parsers on my system, a *lot*). The exe I built is in the 7zip file attached. As stated, I have heard that Visual Studio handles template stuff like Spirit better than GCC, so I am very curious what GCC's timings on this file would be.

There are still more changes I intend to make, but I really want the original code in a form that I can use. To be honest, I had to change the core::to_number lines (commented out) to boost::lexical_cast (right below the commented version), so the xpressive version could be slightly faster if I actually had the implementation of core::to_number available, and core::to_number was well made. The xpressive code also throws a nice 100-line-long warning in my build log, all about a conversion from double to int_64. I have no clue how to fix that since I do not know xpressive, so I would be glad if someone could get rid of that nasty warning in my nice clean build log. In my compiler, my Spirit2.1 grammar builds perfectly clean; I would like it if xpressive was the same way.

I honestly do not know *why* the Spirit version is so much faster than the xpressive version. The spirit-quick version (the non-threadsafe one) I whipped up in about 2 minutes. The threadsafe version took about 5 minutes, the grammar/threadsafe/reusable version took about 10 minutes, and I know a lot more work was put into the xpressive version, especially with the auto macros added and all such as well. I would love it if someone could find out why.

If someone else with MSVC, and someone with GCC and perhaps other things, could build it and post the results that it prints out too, I would be much appreciative. I do have a Linux computer here but, to be honest, no clue what to pass to gcc to build something. The command line switches I pass to MSVC are rather monstrous, so trying to convert them to GCC's seems nightmarish from my point of view.

On Sat, Jul 18, 2009 at 2:13 AM, Eric Niebler<eric@boostpro.com> wrote:
Yes, please. I know Spirit2 is great tech, but I have to wonder how it's over 10X faster than the hand-coded parser.

And I have not tested the hand-coded parser as I cannot get it to compile. If you can get me a code-complete standalone version of it, I would be very happy. :)
Either way, Windows users, could you please run the attached exe (that is in the 7zip file) and paste the results it tells you in an email to this thread, along with your Windows version and basic hardware?

Before I attach this, I am going to run the release exe through a profiler right quick. With 1000000 iterations (one million, so the xpressive version does not take so long), with just the xpressive version enabled, the top 10 slowest functions:

CS:EIP  Symbol + Offset  64-bit  CPU clocks  IPC  DC miss rate  DTLB L1M L2M rate  Misalign rate  Mispredict rate
0x421860 strcmp 2248 1.98 0 0 0 0
0x42bc84 __strgtold12_l 1196 1.1 0 0 0.02 0.01
0x4068a0 std::operator<<<std::char_traits<char> > 744 1.06 0 0 0 0.02
0x41d864 TrailUpVec 686 0.03 0.11 0 0 0
0x40e0e0 std::num_get<char,std::istreambuf_iterator<char,std::char_traits<char> > >::_Getffld 571 0.94 0 0 0 0.01
0x42d344 __mtold12 447 2.2 0 0 0 0
0x4170a0 std::basic_istream<char,std::char_traits<char> >::operator>> 406 0.38 0 0 0.05 0.08
0x414150 boost::xpressive::detail::posix_charset_matcher<boost::xpressive::cpp_regex_traits<char> >::match<std::_String_const_iterator<char,std::char_traits<char>,std::allocator<char> >,boost::xpressive::detail::static_xpression<boost::xpressive::detail::true_matcher, 358 1.36 0 0 0 0
0x419231 std::_Lockit::~_Lockit 334 0.26 0 0 0 0
0x42b200 _ld12tod 333 1.05 0 0 0.01 0.01

10 functions, 700 instructions, Total: 48191 samples, 50.01% of samples in the module, 31.99% of total session samples

So it looks like strcmp is massively hobbling it, taking almost twice the time of the next highest user.

Now for 1000000 (one million) iterations of just the spirit quick version (all calls, surprisingly few):

CS:EIP  Symbol + Offset  64-bit  CPU clocks  IPC  DC miss rate  DTLB L1M L2M rate  Misalign rate  Mispredict rate
0x4188c9 _pow_pentium4 358 1.04 0 0 0 0
0x404d70 ??$phrase_parse@PBDU?$expr@Ubitwise_or@tag@proto@boost@@U?$list2@ABU?$expr@Ushift_right@tag@proto@boost@@U?$list2@ABU?$expr@Ushift_right@tag@proto@boost@@U?$list2@ABU?$expr@Usubscript@tag@proto@boost@@U?$list2@ABU?$terminal073d7121f2c9203b84cbac5f1ea1214c 116 1.71 0 0 0 0
0x405080 boost::spirit::qi::detail::real_impl<double,boost::spirit::qi::real_policies<double> >::parse<char const *,double> 76 1.21 0 0 0 0
0x405f90 boost::spirit::qi::detail::extract_int<__int64,10,1,-1,boost::spirit::qi::detail::positive_accumulator<10>,0>::parse_main<char const *,__int64> 68 2.35 0 0 0 0
0x405550 boost::spirit::qi::detail::extract_int<double,10,1,-1,boost::spirit::qi::detail::positive_accumulator<10>,0>::parse_main<char const *,double> 66 1.82 0 0 0 0
0x4053e0 boost::spirit::qi::detail::`anonymous namespace'::scale_number<double> 63 1.14 0 0 0 0
0x404300 parse_price_spirit_quick<char const *> 62 1.31 0 0 0 0.03
0x4054e0 boost::spirit::qi::detail::fail_function<char const *,boost::fusion::unused_type const ,boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::blank,boost::spirit::char_encoding::ascii> > >::operator()<boost::spirit::qi::action<boost: 59 1.78 0 0 0 0
0x404f30 boost::spirit::qi::skip_over<char const *,boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::blank,boost::spirit::char_encoding::ascii> > > 58 1.59 0 0 0 0
0x417b90 floor 48 0.67 0 0 0 0
0x417b16 _ftol2 46 2.37 0 0 0 0
0x4018f0 dotNumber 42 0.86 0 0 0 0
0x404fa0 boost::spirit::qi::action<boost::spirit::qi::real_parser_impl<double,boost::spirit::qi::real_policies<double> >,void (__cdecl*)(double)>::parse<char const *,boost::fusion::unused_type const ,boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::s 41 1.12 0 0 0 0
0x405660 boost::spirit::qi::detail::extract_int<double,10,1,-1,boost::spirit::qi::detail::positive_accumulator<10>,1>::parse_main<char const *,double> 31 1.29 0 0 0 0
0x417890 _CIpow 31 1.68 0 0 0 0
0x405af0 boost::spirit::qi::int_parser_impl<__int64,10,1,-1>::parse<char const *,boost::fusion::unused_type const ,boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::blank,boost::spirit::char_encoding::ascii> >,__int64> 29 0.48 0 0 0 0
0x405010 boost::spirit::qi::action<boost::spirit::qi::real_parser_impl<double,boost::spirit::qi::real_policies<double> >,void (__cdecl*)(double)>::parse<char const *,boost::fusion::unused_type const ,boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::s 27 1.04 0 0 0 0
0x4174c0 _allmul 27 1 0 0 0 0
0x405b60 boost::spirit::qi::not_predicate<boost::spirit::qi::literal_char<boost::spirit::char_encoding::standard,1,0> >::parse<char const *,boost::fusion::unused_type const ,boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::blank,boost::s 25 1 0 0 0 0
0x404ec0 bo$phrase_parse@PBDU?$expr@Ubitwise_or@tag@proto@boost@@U?$list2@ABU?$expr@Ushift_right@tag@proto@boost@@U?$list2@ABU?$expr@Ushift_right@tag@proto@boost@@U?$list2@ABU?$expr@Usubscript@tag@proto@boost@@U?$list2@ABU?$terminal073d7121f2c9203b84cbac5f1ea1214c 23 0.17 0 0 0 0.12
0x417bd0 _floor_pentium4 17 0.24 0 0 0 0
0x4188b0 _CIpow_pentium4 14 0 0 0 0 0
0x401970 main 9 0.11 0 0 0 0.3
0x404f10 boost::spirit::qi::skip_over<char const *,boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::blank,boost::spirit::char_encoding::ascii> > > 4 0 0 0 0 0
0x40cc02 _flsbuf 1 0 0 0 0 0
0x40e8b0 __SEH_prolog4 0 0 0 0 0 0

26 functions, 447 instructions, Total: 6513 samples, 100.00% of samples in the module, 69.20% of total session samples

Now for the same, but with the spirit grammar version, since it is so much slower than the quick version for some reason (all calls again, not that many):

CS:EIP  Symbol + Offset  64-bit  CPU clocks  IPC  DC miss rate  DTLB L1M L2M rate  Misalign rate  Mispredict rate
0x419909 _pow_pentium4 365 0.97 0 0 0 0
0x4056a0 boost::function4<bool,char const * &,char const * const &,boost::spirit::context<boost::fusion::cons<__int64 &,boost::fusion::nil>,boost::fusion::vector0<void> > &,boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::blank,boost::sp 129 1.19 0 0 0 0.02
0x405780 boost::detail::function::function_obj_invoker4<boost::spirit::qi::detail::parser_binder<boost::spirit::qi::alternative<boost::fusion::cons<boost::spirit::qi::reference<boost::spirit::qi::rule<char const *,__int64 __cdecl(void),boost::proto::exprns_::expr<boos 99 1.12 0 0 0 0.03
0x406f50 boost::spirit::qi::detail::extract_int<__int64,10,1,-1,boost::spirit::qi::detail::positive_accumulator<10>,0>::parse_main<char const *,__int64> 81 1.28 0 0 0 0
0x406100 boost::spirit::qi::detail::real_impl<double,boost::spirit::qi::real_policies<double> >::parse<char const *,double> 77 1.38 0 0 0 0
0x406bc0 boost::spirit::qi::rule<char const *,__int64 __cdecl(void),boost::proto::exprns_::expr<boost::proto::tag::terminal,boost::proto::argsns_::term<boost::spirit::tag::char_code<boost::spirit::tag::blank,boost::spirit::char_encoding::ascii> >,0>,boost::fusion::unu 77 0.87 0 0 0 0.04
0x406c30 boost::spirit::qi::action<boost::spirit::qi::int_parser_impl<__int64,10,1,-1>,boost::phoenix::actor<boost::phoenix::composite<boost::phoenix::assign_eval,boost::fusion::vector<boost::spirit::attribute<0>,boost::phoenix::composite<boost::phoenix::multiplies_ev 74 1.61 0 0 0 0
0x406620 boost::spirit::qi::detail::extract_int<double,10,1,-1,boost::spirit::qi::detail::positive_accumulator<10>,0>::parse_main<char const *,double> 64 1.22 0 0 0 0
0x4050b0 boost::spirit::qi::phrase_parse<char const *,price_grammar<char const *>,boost::proto::exprns_::expr<boost::proto::tag::terminal,boost::proto::argsns_::term<boost::spirit::tag::char_code<boost::spirit::tag::blank,boost::spirit::char_encoding::ascii> >,0>,__in 56 0.29 0 0 0 0.11
0x406460 boost::spirit::qi::detail::`anonymous namespace'::scale_number<double> 53 1.79 0 0 0 0
0x405810 boost::detail::function::function_obj_invoker4<boost::spirit::qi::detail::parser_binder<boost::spirit::qi::alternative<boost::fusion::cons<boost::spirit::qi::reference<boost::spirit::qi::rule<char const *,__int64 __cdecl(void),boost::proto::exprns_::expr<boos 52 1.98 0 0 0 0.02
0x418b56 _ftol2 50 1.68 0 0 0 0
0x401940 main 45 0.67 0 0 0 0.04
0x405fe0 boost::spirit::traits::action_dispatch<boost::spirit::qi::real_parser_impl<double,boost::spirit::qi::real_policies<double> > >::operator()<dot_number_to_long_long_function,double,boost::spirit::context<boost::fusion::cons<__int64 &,boost::fusion::nil>,boost:: 43 1.19 0 0 0 0
0x405f70 boost::spirit::qi::action<boost::spirit::qi::real_parser_impl<double,boost::spirit::qi::real_policies<double> >,dot_number_to_long_long_function>::parse<char const *,boost::spirit::context<boost::fusion::cons<__int64 &,boost::fusion::nil>,boost::fusion::vecto 41 0.83 0 0 0 0
0x405930 boost::detail::function::function_obj_invoker4<boost::spirit::qi::detail::parser_binder<boost::spirit::qi::action<boost::spirit::qi::real_parser_impl<double,boost::spirit::qi::real_policies<double> >,dot_number_to_long_long_function>,boost::mpl::bool_<0> >,bo 36 2 0 0 0 0
0x418bd0 floor 34 1.12 0 0 0 0
0x405e60 boost::spirit::qi::action<boost::spirit::qi::real_parser_impl<double,boost::spirit::qi::real_policies<double> >,dot_number_to_long_long_function>::parse<char const *,boost::spirit::context<boost::fusion::cons<__int64 &,boost::fusion::nil>,boost::fusion::vecto 33 0.15 0 0 0 0.28
0x4182a0 _allmul 33 3.42 0 0 0 0
0x4188d0 _CIpow 27 0.52 0 0 0 0
0x406730 boost::spirit::qi::detail::extract_int<double,10,1,-1,boost::spirit::qi::detail::positive_accumulator<10>,1>::parse_main<char const *,double> 26 2.62 0 0 0 0
0x406560 boost::spirit::qi::int_parser_impl<__int64,10,1,-1>::parse<char const *,boost::spirit::context<boost::fusion::cons<__int64 &,boost::fusion::nil>,boost::fusion::vector0<void> >,boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::bla 19 0.16 0 0 0 0
0x418c10 _floor_pentium4 16 0 0 0 0 0
0x406ca0 boost::spirit::qi::not_predicate<boost::spirit::qi::literal_char<boost::spirit::char_encoding::standard,1,0> >::parse<char const *,boost::spirit::context<boost::fusion::cons<__int64 &,boost::fusion::nil>,boost::fusion::vector0<void> >,boost::spirit::qi::char_ 11 0.36 0 0 0 0
0x4198f0 _CIpow_pentium4 11 0 0 0 0 0
0x40b090 _flush 1 0 0 0 0 0

26 functions, 451 instructions, Total: 7342 samples, 100.00% of samples in the module, 71.73% of total session samples
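
For anyone who wants to poke at this without the attachment: the xpressive profile is dominated by C runtime and iostream symbols (strcmp, __strgtold12_l, num_get::_Getffld, basic_istream::operator>>, _Lockit), while the Spirit profiles show extract_int and real_impl instead. That may be related to the boost::lexical_cast substitution mentioned above, though that is a guess and not something the profile proves. Below is a minimal sketch, not the attached test code, that times a stream-based conversion against a direct digit loop. The names parse_with_stream, parse_by_hand and time_parser are made up for illustration, and it assumes a C++11 compiler for std::chrono.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: convert through an input stream, which is the
// general route a stream-based conversion takes.
inline double parse_with_stream(const std::string& text) {
    std::istringstream in(text);
    double value = 0.0;
    in >> value;
    return value;
}

// Hypothetical helper: accumulate digits directly, with no streams or
// locales involved. Handles only an optional '-', digits, and an optional
// '.' followed by digits; it does not guard against overflow on long input.
inline double parse_by_hand(const std::string& text) {
    std::size_t i = 0;
    bool negative = false;
    if (i < text.size() && text[i] == '-') { negative = true; ++i; }
    std::int64_t mantissa = 0;  // all digits, with the dot removed
    double scale = 1.0;         // 10^(number of digits after the dot)
    bool seen_dot = false;
    for (; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '.' && !seen_dot) { seen_dot = true; continue; }
        if (c < '0' || c > '9') break;
        mantissa = mantissa * 10 + (c - '0');
        if (seen_dot) scale *= 10.0;
    }
    const double value = static_cast<double>(mantissa) / scale;
    return negative ? -value : value;
}

// Call parse(text) `iterations` times and return the elapsed seconds.
// The volatile sink keeps the optimizer from discarding the calls.
template <typename Parser>
double time_parser(Parser parse, const std::string& text, long iterations) {
    volatile double sink = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (long n = 0; n < iterations; ++n) sink = parse(text);
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_parser(parse_with_stream, "42.5", 1000000) and time_parser(parse_by_hand, "42.5", 1000000) in an optimized build gives two numbers to compare. The absolute values will depend on compiler and hardware, so treat them as a rough indication only.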