Re: [boost] [fusion] improving compile times

3 Jun 2009


      Doug Gregor wrote:
...
On Tue, Jun 2, 2009 at 10:09 AM, Eric Niebler <eric@boostpro.com> wrote:
...
Cool! I wonder how that's possible. I have it from Walter Bright (Zortech,
Symantec, Digital Mars) that instantiating a template is inherently
expensive, and certain features of the C++ language (ADL, partial
specialization, etc.) force that to be the case.
Template instantiation is expensive, but you're probably seeing some
O(n^2) or worse effects because of a poor choice in data structures.
Indeed, I recall Walter saying it's an N^2 problem. I've convinced him 
to write a blog entry about why template instantiation in C++ is 
inherently slow, so soon we'll know why he thinks so. Maybe your work in 
Clang can prove him wrong.
...
In GCC, for example, a surprising amount of time is wasted determining
whether the class template specialization X<T1, T2, ..., TN> refers to
an already-known template instantiation (or specialization), because
GCC stores all of the template instantiations in a linked list. Thus,
you pay  each time you name X<T1, T2, ..., TN>, even if you don't
instantiate it.
That sucks.
...
That's why we see quadratic (or worse) behavior for
template metaprograms with GCC. I suspect that other compilers have
the same problem.
...
If Clang has found a way to
solve these problems, that's good news indeed.
We're working on it. I did some simple benchmarking with the
ultra-boring Fibonacci template metaprogram last Friday, just to see
how Clang is doing, and posted the results here:
http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20090529/a392b024/att...
Lookin' good!
...
The cost of template *instantiation* for the Fibonacci example is
quite small for both compilers, since we're just talking about
creating a class with its special member functions and a single static
data member.
Would it go faster if the compiler didn't have to create the special 
member functions? Could we use the declared-but-not-defined trick to 
suppress their generation and speed up template instantiations for 
metafunctions?
...
However, GCC is exhibiting quadratic behavior because
every time we name Fibonacci<I> for some value I, it's doing a linear
search to see if there's already a specialization for that value of I.
Clang is scaling much better here because our search for an
already-named specialization is constant time in the average case.
I can't promise that the improvements we see in Fibonacci will extend
to real template metaprograms, because I haven't tried it. Nor can I:
Clang lacks both member templates and class template partial
specialization [*], which means that we can't compile a serious
template metaprogram with Clang. Obviously, template metaprogramming
is important to me, personally, so we'll do our best to scale this
well for real template metaprograms.
Where can I read more and follow the team's progress?

-- 
Eric Niebler
BoostPro Computing
http://www.boostpro.com