[Niall Douglas]
Do also bear in mind that MSVC is not an AST-based compiler, so all your benchmarks will look totally different on MSVC anyway. What may have O(N^N) scaling on an AST compiler might well be O(N) on MSVC - in fact, that is exactly why MSVC isn't AST-based: internal Microsoft code is so complex that it is uncompilable by an AST-based compiler.
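For context, the kind of benchmark at issue is heavily recursive template instantiation, where the front end must create, check, and cache many distinct specializations. A minimal sketch, not code from the thread, in which roughly N specializations get instantiated:

    // fib<N> forces about N distinct instantiations; each one has to be
    // created, checked, and cached by the compiler front end.
    template <int N>
    struct fib {
        static const int value = fib<N - 1>::value + fib<N - 2>::value;
    };
    template <> struct fib<0> { static const int value = 0; };
    template <> struct fib<1> { static const int value = 1; };

    int main() {
        return fib<20>::value % 256;  // force the instantiation chain
    }

Whether those cached specializations live in a full AST or are handled by re-expanding token streams is exactly the architectural difference being debated here.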
[Stephan T. Lavavej]
1. That's not why. The reason is that manipulating tokens seemed like a good idea 20-30 years ago, when that was sufficient to deal with C and C++ (before templates, variadic templates, decltype, constexpr, expression SFINAE, etc.) and computers were far smaller and slower.
2. MS has a whole bunch of code, but only a tiny fraction approaches the complexity of Boost (in terms of "doing tricky things with the language"; stuff that's just loops and pointers is not complicated by this metric, no matter how twisty). Our STL implementation is usually the high-water mark for complexity, and while we do tricky stuff, I know that Boost is trickier.
3. I am not really qualified to talk about compiler tech, but I cannot see how keeping an AST around would change the *asymptotic* complexity of translation (it introduces constant factors, which are an issue in practice). The AST contains a strict superset of the information available from looking at a few tokens at a time. By definition, any speedups with a token-based approach that can't be obtained through an AST would be due to nonconformance.

STL
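For readers unfamiliar with the term, expression SFINAE (one of the features listed in point 1) removes an overload from consideration when an expression in its declaration is ill-formed for the argument types, instead of producing a hard error. A minimal sketch, with illustrative names only:

    #include <cstddef>
    #include <vector>

    // Expression SFINAE: this overload drops out of overload resolution
    // whenever t.size() is not a valid expression for T.
    template <typename T>
    auto size_of(const T& t, int) -> decltype(t.size()) {
        return t.size();
    }

    // Fallback, chosen when T has no usable .size() member
    // (the int/long second parameter just ranks the overloads).
    template <typename T>
    std::size_t size_of(const T&, long) {
        return 0;
    }

    int main() {
        std::vector<int> v{1, 2, 3};
        std::size_t a = size_of(v, 0);   // uses v.size()
        std::size_t b = size_of(42, 0);  // int has no .size(), falls back to 0
        return static_cast<int>(a + b);
    }

Supporting this requires the front end to perform full semantic analysis inside decltype during overload resolution, which is part of what makes such code "tricky" for any compiler, token-based or AST-based.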