[fusion] [mpl] associative containers, compile time performance

(Boost 1.42, MSVC++ 9.0 SP1) Recently, I was finally forced to do some serious refactoring because the 'unbareable' turned to 'not working at all' (read: the compiler was hitting the 32 bit limit with 'out of heap space' errors), I got from hundreds of thousands of template instantiations down to 'just' thousands in a particular TU [meassured with Steven Watanabe's profiler...Steven you do miracles ;-D ...I also managed to get the Templight profiler to work with MSVC++ 9.0 but it just crashed with a project like this one...], and a peak cl.exe allocation of about 0.5 GB with a clean rebuild taking under 2 minutes which is acceptable... In the process I found out that: - the Fusion associative containers are always slower than random access containers (even if you use them only for/with by-key access). For example, I have a Fusion vector of different 'parameter' classes (it is therefore a 'unique random access set') for which a I wish to genereate a container of corresponding widget classes. Considering I later need to access the widget container using a parameter type as a key ('give me the widget for this parameter') it seems logical to use a map ( of pair<Parameter, Widget>s) but as it turns out accessing this map is several times! slower (compile-time wise) than using a plain vector for the widgets and then doing a linear search on the original parameter container to find the index of the parameter and then accessing the widget vector using that index... - the MPL associative containers are faster than random access ones only if you use them solely for by-key access, if any sequential or indexed access (using begin+advance) is also used (even if 'minor' to the amount of by-key access) this immediately kills the compiler. The size of the container might have also played a part in the distinction...in the end what worked best for me was to use mpl::maps for smaller collections (less than 10 types) that are only accessed by key, and mpl::vectors for large collections (50 types) that need to be accessed by index and by key... If these things are somehow 'ancient knowledge' maybe it would be worthwhile to document them so...In fact the Fusion documentation for associative sequences actually states "An Associative Sequence allows efficient retrieval of elements based on keys..." while in my case this turned out quite the opposite of truth...maybe this was referening to runtime-efficiency but then again this should be more clearly stated in the documentation (especially considering that, given the compile time nature of Fusion, it is reasonable to apriori expect that there is no runtime/generated code difference between different Fusion containers with most half decent compilers)... Looking at the implementations of Fusion associative containers it can be seen that, in fact, they do use vectors for implementation and do linear searches... One more thing that might help with Fusion associative container related compile-time defficiencies is to add the non-variadic/fixed sized versions for sets and maps (e.g. I frequently have this information and could specify the size exactly instead of being forced to use the variadic versions)...Just specifying FUSION_MAX_SET_SIZE=10 instead of 20 provided a measurable compile-time improvement... Additionaly, fusion::joint_view lacks the vital at<>() implementation(s) even though it states it is a model of a forward sequence so one is forced to use begin+advance... As for MPL, it still lacks the implementation of at<Seq, Key, Default> (that would be of significant help for my case) even though it is documented for 'who knows how long'... Even more usefull would be if MPL associative containers would not only provide the order<> 'getter' but also allow access using the order value (with same efficiency as key-based access of course)...this is usefull when you need to 'export' your 'typelist' outside of your library/'DLL' and allow outside code to specify the type that it wishes to create using an ID (like a factory that takes an ID as a parameter)...without the ability to use the 'order' as the ID you are forced to use the index...which then forces you to use vectors (because indicies simply 'kill' the compiler when used with a set) even though you also need key-based access at other places... ps. What does BOOST_MPL_LIMIT_UNROLLING actually do/configure? :) -- "What Huxley teaches is that in the age of advanced technology, spiritual devastation is more likely to come from an enemy with a smiling face than from one whose countenance exudes suspicion and hate." Neil Postman
participants (1)
-
Domagoj Saric