
on Tue Jun 02 2009, Eric Niebler <eric-AT-boostpro.com> wrote:
Joel de Guzman wrote:
Sebastian Redl wrote:
Eric Niebler wrote:
I confess that I'm not actually benchmarking compile speed; rather, I'm benchmarking the number of template instantiations as reported by Steven's template profiler. I'm profiling TMP-heavy code like some of Proto's and xpressive's tests and cherry-picking the worst offenders. The Fusion vector_n_chooser patch knocked off hundreds of template instantiations, for instance.
That's not necessarily a good benchmark, especially if you replace it by preprocessor metaprogramming which leads to more non-template code. GCC is extremely slow at instantiating templates, but this is not necessarily true for other compilers - I believe, for example, that Clang will be faster at instantiating templates than parsing raw code. (No benchmarks - but I know the code.)
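To make the trade-off under discussion concrete, here is a minimal sketch. It is not Fusion's actual code: the names nil, count_rec and count_flat are hypothetical, and the arity limit of 4 is arbitrary. It contrasts a recursive metafunction, which needs about one instantiation per argument, with flat per-arity specializations of the kind a library would typically generate with the preprocessor (written out by hand here). The flat form answers with a single instantiation, but the compiler has to parse every specialization up front, which is the extra non-template code mentioned above.

```cpp
#include <cstddef>

// Hypothetical placeholder type marking an unused argument slot.
struct nil {};

// Approach 1: recursion. Counting N arguments instantiates about N + 1
// distinct specializations of count_rec, each shifting the list by one.
template <class T0 = nil, class T1 = nil, class T2 = nil, class T3 = nil>
struct count_rec {
    static const std::size_t value = 1 + count_rec<T1, T2, T3>::value;
};

template <>
struct count_rec<nil, nil, nil, nil> {
    static const std::size_t value = 0;
};

// Approach 2: one partial specialization per arity. A query instantiates
// exactly one class, but all of these must be parsed whether used or not.
// In a real library they would usually be preprocessor-generated
// (an assumption about typical practice; hand-written here).
template <class T0 = nil, class T1 = nil, class T2 = nil, class T3 = nil>
struct count_flat {
    static const std::size_t value = 4;
};

template <class T0, class T1, class T2>
struct count_flat<T0, T1, T2, nil> {
    static const std::size_t value = 3;
};

template <class T0, class T1>
struct count_flat<T0, T1, nil, nil> {
    static const std::size_t value = 2;
};

template <class T0>
struct count_flat<T0, nil, nil, nil> {
    static const std::size_t value = 1;
};

template <>
struct count_flat<nil, nil, nil, nil> {
    static const std::size_t value = 0;
};
```

Both count_rec<int, char>::value and count_flat<int, char>::value name the same constant; only the amount of work the compiler does to get there differs. Which one is cheaper in wall-clock terms depends on the compiler, so it is worth timing both rather than relying on the instantiation count alone.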
Cool! I wonder how that's possible. I have it from Walter Bright (Zortech, Symantec, Digital Mars) that instantiating a template is inherently expensive, and certain features of the C++ language (ADL, partial specialization, etc.) force that to be the case. If Clang has found a way to solve these problems, that's good news indeed.
It may be "inherently expensive" by some measure, but most compilers were implemented by people for whom template instantiation speed was way down the list of priorities, and most got their template implementations before "interesting TMP" was even available for them to test against. In some cases they do *really* dumb things.
I read from the Wikipedia entry that Clang's C++ support is 2-3 years from being usable, though.
I wouldn't bet against Doug Gregor when he's firing on all cylinders :-)
Agreed 100%
OK. When compiling Fusion's vector_make.cpp test ...
Before ...
$ time g++ -I ../../../.. -c vector_make.cpp
real 0m1.670s
user 0m1.216s
sys 0m0.325s
After ...
$ time g++ -I ../../../.. -c vector_make.cpp
real 0m1.208s
user 0m0.684s
sys 0m0.309s
Going by user time (1.216s before, 0.684s after), my recent changes make this test compile nearly twice as fast for gcc-3.4 (cygwin). For MSVC, the wins are less dramatic.
Your point is taken, though ... instantiation count is merely a rule of thumb and the real measure is clock time. It is, in my experience and with compilers actually in use today, a very good rule of thumb, though.
Well, it's great to get the instantiation count down, but consider that what you're replacing it with may not be any faster :-) If you *are* getting a win from PP metaprogramming, there's a good chance that you could improve the speed a lot more, e.g. by using the "z" parameter as described in http://www.boostpro.com/tmpbook/preprocessor.html#horizontal-repetition
-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com