On Fri, Jun 29, 2007 at 02:05:59AM -0500, Michael Marcin wrote:
> There is a senior engineer that I work with who believes templates are slow and prefers to write C or assembly routines.
What does he base his belief on? And did he provide *any* proof for his reasoning? (Well, if he's in a higher position than you, he might not be required to do so. People listen to him because he's in a higher position, not because he has good arguments. Been there, experienced that.)

Templates are a *purely* compile-time mechanism and are thus fully amenable to compile-time optimization. You might try explaining that templates are just an "advanced preprocessor".

[Or he might just be "rationalizing" his unwillingness to learn something new. If that is the case, then any argument with him is probably lost a priori. Your best bet would be to show the people willing to listen that he's wrong. But even *if* you show that templates are no less efficient than C+macros, you will still have to show what *advantage* they have over C. So you need to base your argument on two things: 1) no efficiency loss, and 2) advantages over C. And you need to make that argument to people who are 1) willing to listen, and 2) able to override the senior programmer. Those people will probably also be interested in the time the rest of the team would need to learn templates, etc.]
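To make the "advanced preprocessor" point concrete, here is a minimal sketch (the function names are mine, just for illustration): compile the file below with optimization (e.g. -O2) and look at the assembly for the two wrappers. Any reasonable compiler emits the same instructions for both, because the template is instantiated and inlined at compile time.

    #define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

    template <typename T>
    inline T max_template(T a, T b)
    {
        return a > b ? a : b;
    }

    // Two otherwise identical call sites; with optimization enabled both
    // wrappers should compile down to the same code.
    int use_macro(int x, int y)    { return MAX_MACRO(x, y); }
    int use_template(int x, int y) { return max_template(x, y); }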
> Write some interesting code and generate the assembly for it. Analyze this assembly manually and save it off in source control. When the test suite is run, compile that code down to assembly again and have the test suite do a simple byte comparison of the two files.
I don't understand this part. What do you want to compare, the macro vs. the template version? That will certainly *not* yield identical object files, because an object file contains symbol names, etc., along with the generated code.
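If you really want to compare generated code textually, you would at least have to normalize the assembly first: throw away labels, symbol names and assembler directives and compare only the instruction lines. A rough sketch of what I mean (my own code; the filtering rules are crude guesses about typical "gcc -S" output, so treat it as a starting point, not a working tool):

    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>

    // Crude filter: keep only indented lines that are neither assembler
    // directives (".file", ".globl", ...) nor labels ("foo:", ".L2:").
    static std::vector<std::string> instructions(const char* path)
    {
        std::vector<std::string> out;
        std::ifstream in(path);
        std::string line;
        while (std::getline(in, line)) {
            std::string::size_type i = line.find_first_not_of(" \t");
            if (i == std::string::npos || i == 0)     continue; // blank, or label/directive at column 0
            if (line[i] == '.')                       continue; // indented directive
            if (line.find(':') != std::string::npos)  continue; // label (very crude)
            out.push_back(line.substr(i));
        }
        return out;
    }

    int main(int argc, char** argv)
    {
        if (argc != 3) {
            std::cerr << "usage: cmpasm reference.s current.s\n";
            return 2;
        }
        return instructions(argv[1]) == instructions(argv[2]) ? 0 : 1;
    }

Even then the comparison stays fragile: register allocation and instruction scheduling can legitimately change with a different compiler version or different flags.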
> Write the templated and C versions of the algorithms. Run the test suite to generate the assembly for each version. Write a parser and a heuristic to analyze the generated code of each. Grade and compare each version.
Unfortunately, it's almost meaningless to analyze the run-time performance of a program (beyond algorithmic complexity) without the actual input. "Register usage" is a vague term, and the number of function calls need not matter much either (think of infrequent code paths, large functions, cache effects, etc.).
> Does anyone have any input/ideas/suggestions?
How about traditional profiling? Write a test suite that feeds the same input to the C and C++ versions and compares their run-times. Compile once with optimizations and once with profiling enabled, then compare the run-times and the hot spots reported by the profiler.
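A bare-bones sketch of such a harness (all the names below are made up for illustration; plug in the real routines): feed the same data to the C routine and the templated routine, check that they agree, and time both.

    #include <cstddef>
    #include <cstdio>
    #include <ctime>
    #include <numeric>
    #include <vector>

    // Pretend this is the existing hand-written C routine (it would normally
    // live in its own .c file); the name is made up for this example.
    extern "C" long sum_c(const long* p, std::size_t n)
    {
        long s = 0;
        for (std::size_t i = 0; i < n; ++i)
            s += p[i];
        return s;
    }

    // The templated counterpart (again just a placeholder).
    template <typename T>
    T sum_tmpl(const T* p, std::size_t n)
    {
        return std::accumulate(p, p + n, T(0));
    }

    int main()
    {
        std::vector<long> data(10 * 1000 * 1000);
        for (std::size_t i = 0; i < data.size(); ++i)
            data[i] = long(i % 97);

        std::clock_t t0 = std::clock();
        long r1 = sum_c(&data[0], data.size());
        std::clock_t t1 = std::clock();
        long r2 = sum_tmpl(&data[0], data.size());
        std::clock_t t2 = std::clock();

        if (r1 != r2) {
            std::fprintf(stderr, "results differ\n");
            return 1;
        }
        std::printf("C version: %.3fs   template version: %.3fs\n",
                    double(t1 - t0) / CLOCKS_PER_SEC,
                    double(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }

Run it a few times and take the best numbers, and make sure the optimizer cannot throw the calls away (using the results, as above, usually takes care of that).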