
Boris Gubenko wrote:
Steven Watanabe wrote:
Ok. It would be great if compilers supported this directly.
Just fyi: the cxx compiler has verbose template instantiation mode. On Tru64, for example:
Steven wrote: I only see function template instantiations. I'm more interested in class template instantiations, because that's where the metaprogramming is done. Am I missing something?
I'd also like to have all template instantiations, not just those that are triggered from inside other template instantiations, although this doesn't make a huge difference.
This is the icc documentation for the -prof-gen flag. Apparently it instruments the code for every basic block to enable profile guided optimization later on. This should include template instantiations and basic blocks from inlined functions.

-----------------------------------------
prof-gen, Qprof-gen

Instruments a program for profiling.

IDE Equivalent
    Windows: General > PGO Phase

Architectures
    IA-32 architecture, Intel(r) 64 architecture, IA-64 architecture

Syntax
    Linux and Mac OS X: -prof-gen
                        -prof-genx
    Windows:            /Qprof-gen
                        /Qprof-genx

Arguments
    None

Default
    OFF    Programs are not instrumented for profiling.

Description
    This option instruments a program for profiling to get the execution count of each basic block. It also creates a new static profile information file (.spi).

    If -prof-genx or /Qprof-genx is specified, extra information (source position) is gathered for code-coverage tools. If you do not use a code-coverage tool, this option may slow parallel compile times. If you are doing a parallel make, this option will not affect it.

    These options are used in phase 1 of the Profile Guided Optimizer (PGO) to instruct the compiler to produce instrumented code in your object files in preparation for instrumented execution.
------------------------------------------

Later on you would use -prof-gen-sampling to, among other things, create a map from object code to line number in the source code. This map should be a superset of the data you are looking for, which is the instantiation count for templates. For templates that don't end up with any object code (meta-functions), I think you would find no instantiations in the map, whereas you might still find that the compiler evaluated the meta-function many times with your warning-based approach. I guess it depends on what you are looking for.
------------------------------------------------------
prof-gen-sampling, Qprof-gen-sampling

Prepares application executables for hardware profiling (sampling) and causes the compiler to generate source code mapping information.

IDE Equivalent
    None

Architectures
    IA-32 architecture

Syntax
    Linux and Mac OS X: -prof-gen-sampling
    Windows:            /Qprof-gen-sampling

Arguments
    None

Default
    OFF    Application executables are not prepared for hardware profiling and the compiler does not generate source code mapping information.

Description
    This option prepares application executables for hardware profiling (sampling) and causes the compiler to generate source code mapping information.

    The application executables are prepared for hardware profiling by using the profrun utility followed by a recompilation with option -prof-use (Linux and Mac OS X) or /Qprof-use (Windows). This causes the compiler to look for and use the hardware profiling information written by profrun (by default, into a file called pgopti.hpi).

    This option also causes the compiler to generate the information necessary to map hardware profile sample data to specific source code lines, so it can be used for optimization in a later compilation. The compiler generates both a line number and a column number table in the debug symbol table.

    This process can be used, for example, to collect cache miss information for use by option ssp on a later compilation.

Alternate Options
    None

See Also
    prof-use, Qprof-use compiler options
    ssp, Qssp compiler options
----------------------------------------------

My own interest is that I would like to do performance profiling and tuning of template instantiated code with VTune. It looks like these compiler options in icc are better suited to that than to the static information you are looking for.

Is there a way to write a meta-function that implements a counter?
The idea is that each time a template is instantiated by the compiler, a meta-function counter (inserted by a script similar to your warning) would be evaluated. Then you could collect the counts computed at compile time and print them to a file at runtime, along with the associated type name information from the counter's template parameter.

    template <typename T>
    struct meta_counter {...};

    template <something>
    struct my_template {
        //inserted by script
        meta_counter<typeof_this_template>::increment_somehow;
        ...
    };

I'm not sure how retrieving the count would work. Perhaps you could create a global const set to the value and just use nm to retrieve it. I have no idea how to fully implement this right now, but perhaps Steven can run with the idea.

Thanks,
Luke
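
A rough sketch of how the registration half of this idea might look. It is not a true compile-time counter. Instead, each specialization registers its type name once during static initialization, so the program can list at runtime which specializations the compiler instantiated. Apart from meta_counter and my_template, every name here (instantiation_registry, type_tag, force_instantiation, record, report_instantiations) is made up for illustration, and the "inserted by script" line is only an assumption about what such a script could emit.

```cpp
#include <cassert>
#include <iostream>
#include <set>
#include <string>
#include <typeinfo>

// Hypothetical registry of the specializations seen so far. A function-local
// static avoids relying on initialization order between separate objects.
inline std::set<std::string>& instantiation_registry() {
    static std::set<std::string> names;
    return names;
}

// Always-complete wrapper, so typeid never requires T itself to be complete.
template <typename T> struct type_tag {};

// Hypothetical meta_counter: one static flag per T. Its dynamic initializer
// records an implementation-defined (usually mangled) name for T.
template <typename T>
struct meta_counter {
    static const bool registered;
    static bool record() {
        instantiation_registry().insert(typeid(type_tag<T>).name());
        return true;
    }
};
template <typename T>
const bool meta_counter<T>::registered = meta_counter<T>::record();

// Binding the flag to a reference template parameter odr-uses it, which makes
// the compiler instantiate the flag's definition and hence its initializer.
template <const bool&> struct force_instantiation {};

template <typename T>
struct my_template {
    // inserted by script (assumed form)
    typedef force_instantiation<meta_counter<my_template>::registered> counted;
    typedef T type;
};

// Naming a member forces the enclosing class template to be instantiated.
typedef my_template<int>::type    first_use;
typedef my_template<double>::type second_use;
typedef my_template<int>::type    repeated_use;  // same specialization as first_use

// Print one line per registered specialization.
inline void report_instantiations(std::ostream& out) {
    const std::set<std::string>& names = instantiation_registry();
    for (std::set<std::string>::const_iterator it = names.begin();
         it != names.end(); ++it) {
        out << *it << '\n';
    }
}
```

Calling report_instantiations(std::cout) from main would list the two specializations used above. Note the limits: this reports distinct specializations per program, not how many times the compiler evaluated a meta-function, and it does not produce a constant that nm could read.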