large variant performance compared (50 elements)

In a project we are using variants holding shared_ptrs. These variants typically hold about 20 types but are growing as the project progresses. Each time we add types, we run into issues with excessive compilation time, huge pdb files, or out-of-heap-space errors from the compiler or linker.

I took some time to compare various approaches using variants. Here are the results:

stdafx.h

    #include <boost/mpl/vector.hpp>
    #include <boost/mpl/vector/vector100.hpp>
    #include <boost/variant/variant.hpp>
    #include <boost/variant/apply_visitor.hpp>
    #include <boost/variant/static_visitor.hpp>

test.h

    class CTest
    {
    public:
        void foo(const v_t& rt);
    };

type.h

    class C1 { public: int i; };
    class C2 { public: int i; };
    class C3 { public: int i; };
    ...
    class C50 { public: int i; };

    //v1
    typedef boost::variant<
        boost::shared_ptr<C1>,
        boost::shared_ptr<C2>,
        <...>
        boost::shared_ptr<C49>,
        boost::shared_ptr<C50>
    > v_t;

    //v2
    typedef boost::mpl::vector50<
        boost::shared_ptr<C1>,
        boost::shared_ptr<C2>,
        <...>
        boost::shared_ptr<C49>,
        boost::shared_ptr<C50>
    > Mplv_t;
    typedef boost::make_variant_over<Mplv_t>::type v_t;

    //v3
    typedef boost::variant<
        C1,
        C2,
        <...>
        C49,
        C50
    > v_t;

test.cpp

    namespace
    {
        class CVisit : public boost::static_visitor<>
        {
        public:
            template<typename T>
            void operator()(const T& rt) const
            {
                std::cout << typeid(T).name() << std::endl;
            }
        };
    }

    void CTest::foo(const v_t& v)
    {
        //A: (invoke visitor directly, no new variant)
        boost::apply_visitor(CVisit(), v);

        //B: Copy variant using copy constructor
        v_t v2(v);
        <...>

        //C: Copy variant using assignment operator
        v_t v2;
        v2 = v;
        <...>
    }

main.cpp

    int main(int argc, char* argv[])
    {
        CTest t;
        v_t v = boost::shared_ptr<C10>(new C10);
        t.foo(v);
        return 0;
    }

Setup:
  Boost 1.38
  Msvc2005, Release target
  Optimize MaxSpeed (/O2)
  Debug info enabled (/Zi)
  Build time = rebuild solution; cl + link

Foo logic: A = visit only, B = copy variant using copy constructor, C = copy variant using assignment operator.

    Variant  Foo logic  Obj size  Exe size  Pdb size  Build time  Peak commit
    -------  ---------  --------  --------  --------  ----------  -----------
    v1       A           2.4MB    348KB      9.0MB    0:49        500MB
    v2       A           1.4MB    348KB     11.7MB    1:02        600MB
    v3       A           1.9MB    344KB      3.9MB    0:25        300MB
    v1       B           3.1MB    352KB      9.0MB    0:50        500MB
    v2       B           2.2MB    352KB     11.7MB    1:03        600MB
    v3       B           2.1MB    344KB      3.9MB    0:26        300MB
    v1       C          71.3MB    356KB     10.7MB    5:18        1.2GB
    v2       C          80.6MB    356KB     13.6MB    8:46        1.4GB
    v3       C           2.4MB    344KB      3.9MB    0:26        300MB

Is there any way to explain these (huge) differences, and which approach is preferred?
- Why does the use of the assignment operator have such a huge impact?
- Why the difference between using shared_ptr or not?
- When to use the numbered variant or boost::variant<...>?

Help is greatly appreciated!

Paul

AMDG On 1/7/2011 2:45 PM, Paul wrote:
Thank you for doing this.
I wouldn't have expected so much of a difference, but the dispatching of the assignment operator is relatively complex compared to the copy constructor and apply_visitor, to handle exception safety. You might check what happens if you add boost::blank to the variant.
> - Why the difference between using shared_ptr or not?
It's probably because C1, C2, ... are all POD. What happens if you add a destructor?
> - When to use the numbered variant or boost::variant<...>?
It shouldn't make a significant difference.

In Christ,
Steven Watanabe

AMDG On 1/7/2011 2:45 PM, Paul wrote:
Looks like there's quadratic behavior.

Total instantiations: 45697

    Location                                                  count    cum.
    -----------------------------------------------------------------------
    ..\trunk\boost/variant/variant.hpp(666)                    9604    9604
    ..\trunk\boost/variant/variant.hpp(627)                    9604   19208
    ..\trunk\boost/variant/detail/visitation_impl.hpp(163)     4949   24157
    ..\trunk\boost/variant/detail/visitation_impl.hpp(140)     4949   29106

In Christ,
Steven Watanabe

Hi Steven,

I'm really impressed with the amount of detail you have been able to produce in your analysis in such a short time. How did you count the number of instantiations, and can I do the same? This type of analysis is really helpful for our compile-time reduction work. I will try several of the suggestions, including the suggested fix, this evening and respond on the thread with the results.

Paul

AMDG On 1/8/2011 3:26 AM, Paul wrote:
The tool I wrote is available through Subversion http://svn.boost.org/svn/boost/sandbox/tools/profile_templates In Christ, Steven Watanabe

"Steven Watanabe" <watanabesj@gmail.com> wrote in message news:4D289EAB.6000403@providere-consulting.com...
> The tool I wrote is available through Subversion http://svn.boost.org/svn/boost/sandbox/tools/profile_templates
thanks for this!

"Steven Watanabe" <watanabesj@gmail.com> wrote in message news:4D289EAB.6000403@providere-consulting.com...
> The tool I wrote is available through Subversion http://svn.boost.org/svn/boost/sandbox/tools/profile_templates
basic question: How do I compile this? Here are my attempts:

I compiled boost before:
=======
C:\Users\Peter\Sources\boost_1_45_0>.\bjam.exe msvc architecture=x86 address-model=64 install --prefix=C:\users\peter
=======

Using bjam.exe:
================
C:\Users\Peter\Sources\profile_templates>..\boost_1_45_0\bjam.exe
...found 1 target...
================

Using cl.exe on the commandline:
================
C:\Users\Peter\Sources\profile_templates\src>cl postprocess.cpp /Zi /EHa
Microsoft (R) C/C++ Optimizing Compiler Version 16.00.30319.01 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

postprocess.cpp
Microsoft (R) Incremental Linker Version 10.00.30319.01
Copyright (C) Microsoft Corporation. All rights reserved.

/out:postprocess.exe
/debug
postprocess.obj
LINK : fatal error LNK1104: cannot open file 'libboost_regex-vc100-mt-s-1_45.lib'

C:\Users\Peter\Sources\profile_templates\src>cl postprocess.cpp /Zi /EHa

C:\Users\Peter\Sources\profile_templates\src>cl filter.cpp /Ox /EHa
Microsoft (R) C/C++ Optimizing Compiler Version 16.00.30319.01 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

filter.cpp
Microsoft (R) Incremental Linker Version 10.00.30319.01
Copyright (C) Microsoft Corporation. All rights reserved.

/out:filter.exe
filter.obj

C:\Users\Peter\Sources\profile_templates\src>cl postprocess.cpp /Ox /EHa
Microsoft (R) C/C++ Optimizing Compiler Version 16.00.30319.01 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

postprocess.cpp
Microsoft (R) Incremental Linker Version 10.00.30319.01
Copyright (C) Microsoft Corporation. All rights reserved.

/out:postprocess.exe
postprocess.obj
LINK : fatal error LNK1104: cannot open file 'libboost_regex-vc100-mt-s-1_45.lib'

C:\Users\Peter\Sources\profile_templates\src>set include
INCLUDE=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE;C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\ATLMFC\INCLUDE;C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\include;c:\Users\Peter\include\boost-1_45

C:\Users\Peter\Sources\profile_templates\src>dir \Users\Peter\lib\libboost_regex*.lib /s /b
C:\Users\Peter\lib\libboost_regex-vc100-mt-1_45.lib
C:\Users\Peter\lib\libboost_regex-vc100-mt-gd-1_45.lib
C:\Users\Peter\lib\libboost_regex-vc100-mt-gd.lib
C:\Users\Peter\lib\libboost_regex-vc100-mt.lib

C:\Users\Peter\Sources\profile_templates\src>
================

"Steven Watanabe" <watanabesj@gmail.com> wrote in message news:4D2A6C71.5070600@providere-consulting.com...
First I figured I had to install boost book. This was a nightmare! I gave up on this. So I just started reading the .qbk with a text editor. But it still does not work.

C:\Users\Peter\sources7\diode>set BOOST_BUILD_PATH
BOOST_BUILD_PATH=c:\Users\Peter\Sources\boost_1_45_0

C:\Users\Peter\sources7\diode>type *.jam

jamroot.jam

import template-profile ;
template-profile diode : main.cpp ;

C:\Users\Peter\sources7\diode>type ..\..\user-config.jam
using msvc : 10.0 : "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\amd64\cl.exe" : address-model=64 ;
modules.load template-profile : c:/users/peter/Sources/profile_templates/template-profile.jam ;

C:\Users\Peter\sources7\diode>bjam address-model=64
error: Unable to find file or target named
error:     '/boost//regex'
error: referred from project at
error:     '/c:/Users/Peter/Sources/profile_templates/src'

C:\Users\Peter\sources7\diode>bjam -v
Boost.Jam Version 3.1.19. OS=NT.
Copyright 1993-2002 Christopher Seiwald and Perforce Software, Inc.
Copyright 2001 David Turner.
Copyright 2001-2004 David Abrahams.
Copyright 2002-2008 Rene Rivera.
Copyright 2003-2008 Vladimir Prus.

C:\Users\Peter\sources7\diode>

"Steven Watanabe" <watanabesj@gmail.com> wrote in message news:4D3CD78C.3060500@providere-consulting.com...
thanks! I bow before my MPL teacher! http://translation.babylon.com/german/to-english/

Steven,

I got it to work! I had to make some changes to the jamfiles, however, to get it going in my work area (non-sandbox).

How should I interpret the call graphs? See the sample below. Is the part below Parents the call stack? I need some help doing analysis with this nice tool.

Paul

Call Graph

I:\fcm3prod\BuildSet\repository\Boost\boost\boost/mpl/aux_/preprocessed/plain/fold_impl.hpp(131) (48)
  Parents:
    I:\fcm3prod\BuildSet\repository\Boost\boost\boost/mpl/fold.hpp(32) (48)
    I:\fcm3prod\BuildSet\repository\Boost\boost\boost/mpl/transform.hpp(138) (24)
    I:\fcm3prod\BuildSet\repository\Boost\boost\boost/variant/variant.hpp(114) (24)
    I:\fcm3prod\BuildSet\repository\Boost\boost\boost/variant/variant.hpp(216) (24)
    I:\fcm3prod\BuildSet\repository\Boost\boost\boost/variant/variant.hpp(908) (48)
    b.cpp(25) (48)
  Children:
    I:\fcm3prod\BuildSet\repository\Boost\boost\boost/detail/reference_content.hpp(115) (336/50)
    I:\fcm3prod\BuildSet\repository\Boost\boost\boost/detail/reference_content.hpp(85) (336/50)
    I:\fcm3prod\BuildSet\repository\Boost\boost\boost/mpl

On 11-1-2011 21:48, Paul wrote:
> <...>
More specifically, what does (336/50) mean here?

Question: what would be an acceptable number of cumulative instantiations for a cpp file? Of course I could break it down into several cpp files (which would need explicit instantiations, since the inclusion model cannot be used in that case). I have doubts whether it's wise to do so.

Paul

AMDG On 1/11/2011 1:27 PM, Paul wrote:
I can try to improve this. I know that it needs a Jamroot and there needs to be a use-project boost : path/to/boost ; somewhere. Is there anything else?
I don't think it really means anything. I've had a lot of trouble with the call graph, and I've never managed to get the numbers to come out to something meaningful.
I don't advise using this metric to determine what's acceptable. If it compiles in a reasonable amount of time using a reasonable amount of resources, then it's acceptable, with reasonable being defined by what you're willing to pay. The number of template instantiations doesn't translate directly into compiler resource usage. The tool is designed more to help you figure out what to look at, given that you already want to improve compilation time.
In Christ, Steven Watanabe

The Jamfile.v2 in profile_template/src:

    #
    # Copyright (c) 2008
    # Steven Watanabe
    #
    # Distributed under the Boost Software License, Version 1.0. (See
    # accompanying file LICENSE_1_0.txt or copy at
    # http://www.boost.org/LICENSE_1_0.txt)

    import modules ;

    local boost = [ modules.peek : BOOST ] ;

    project src : : requirements <include>path-to-boost <library-path>path-to-boost <link>static ;

    exe postprocess : postprocess.cpp $(boost)/libs/regex/build//boost_regex : <variant>release ;
    exe filter : filter.cpp $(boost)/libs/regex/build//boost_regex : <variant>release ;

The Jamfile of the cpp being profiled (much is workarea/project specific):

    # Jamfile.v2
    #
    # Copyright (c) 2008
    # Steven Watanabe
    #
    # Distributed under the Boost Software License, Version 1.0. (See
    # accompanying file LICENSE_1_0.txt or copy at
    # http://www.boost.org/LICENSE_1_0.txt)

    import modules ;

    modules.load template-profile : ../../../../../../profile_templates/template-profile.jam ;

    template-profile archive : Archive.cpp
        : <include>../../../../include
          <include>added-some-more-project-specific-include-paths
          <define>BOOST_ALL_DYN_LINK=1
        ;

I've done some testing based on the suggestions:

Adding a destructor to the 'dummy' types doesn't seem to make a difference. Adding boost::blank as the first type, however, makes a huge difference:

    typedef boost::mpl::vector50<
        boost::blank,
        boost::shared_ptr<C1>,
        boost::shared_ptr<C2>,
        boost::shared_ptr<C3>,
        ...

    Variant  Foo logic  Test                             Obj size  Exe size  Pdb size  Build time  Peak commit
    -------  ---------  -------------------------------  --------  --------  --------  ----------  -----------
    v2       C          Original                         80.6MB    356KB     13.6MB    8:46        1.4GB
    v2       C          Add boost::blank as first type    2.6MB    348KB     11.6MB    1:03        600MB

The drawback of this approach seems to be that each visitor needs an operator() that accepts a boost::blank&; not a big problem anyway.

I failed to apply the patch file to the 1.38 revision; either there is a problem with the patch.exe that I downloaded (for win32) or the patch file doesn't match this revision. If you can send me a modified variant.hpp based on the attached file, I will be happy to run some tests.

I was wondering whether the observed behavior is a bug or not; after all, it seems to be functionally correct, it just doesn't scale very well. If this problem can be fixed in variant.hpp then I guess that's preferred, but what are the consequences at the functional level and/or for runtime performance?

Paul

AMDG On 1/8/2011 11:56 AM, Paul wrote:
Now that I've looked through the code, it's the copy constructor and default constructor that matter.
I was able to apply the patch. I've attached it. (zipped since it's rather large.)
The functionality should be unaffected. The runtime cost is one call through a function pointer. In Christ, Steven Watanabe

Steven,

Thanks for the modified header. I've tested and it seems to work, see results below. Would this be a change that is a candidate for a future boost release? And are you able to do just that?

    Variant  Foo logic  Test                             Obj size  Exe size  Pdb size  Build time  Peak commit
    -------  ---------  -------------------------------  --------  --------  --------  ----------  -----------
    v2       C          Original                         80.6MB    356KB     13.6MB    8:46        1.4GB
    v2       C          Add boost::blank as first type    2.6MB    348KB     11.6MB    1:03        600MB
    v2       C          Test with patch  <====            4.6MB    352KB     11.8MB    1:10        600MB

AMDG On 1/8/2011 1:19 PM, Paul wrote:
> Thanks for the modified header.
> I've tested and it seems to work, see results below.
Cool.

> Would this be a change that is candidate for a future boost release? And are you able to do just that?
I've committed it to the trunk. https://svn.boost.org/trac/boost/changeset/67798

In Christ,
Steven Watanabe

AMDG On 1/9/2011 5:37 AM, Mathias Gaunard wrote:
This is in the code path that's executed when
1) The source type doesn't have a nothrow copy constructor,
2) The source type doesn't have a nothrow move constructor, and
3) None of the variant types has a nothrow default constructor.

The cost is already 2 switch statements (more if there are a large number of types), 1 new/delete pair, 1 call to the copy constructor of the type that the variant already holds, 2 calls to its destructor, and 1 call to the copy constructor of the new type.

a) If the small extra overhead actually matters to you, you should probably be avoiding this branch to begin with.
b) There's no guarantee that it actually makes the code slower, since it significantly reduces the code size. (The O(n^2) is function templates that are doing real work.)

In Christ,
Steven Watanabe

I guess I was too quick to be optimistic ;) It seems there is another performance issue when I extend the number of types further to 75 (on the project we expect to grow to +/- 60 types).

    typedef boost::mpl::vector75<
        boost::shared_ptr<C1>,
        boost::shared_ptr<C2>,
        boost::shared_ptr<C3>,
        ...

My machine simply runs out of memory (commit 3.0GB) while the compiler keeps consuming 100% CPU for minutes... From what I can see now, no combination (adding boost::blank or not, patching variant.hpp or not) manages to compile and link in <1min and <1GB of RAM. A test using boost::blank + the patched variant just finished in 4:00min.

Steven, can you maybe investigate with your tool where the load/instantiations are coming from?

PS: I have downloaded your tool with svn; I intend to test/use it myself as well. ;)

Paul

AMDG On 1/8/2011 2:48 PM, Paul wrote:
Does the cost increase continuously, with the number of elements?
> Steven, can you maybe invest with your tool where the load/instantiations are coming from?
Maybe in a few days. boost::mpl::vector75 doesn't exist without some extra work.
> PS: I have downloaded your tool with svn, i intend to test/use it for myself as well. ;)
In Christ, Steven Watanabe

On 01/08/11 18:32, Steven Watanabe wrote: [snip]
Why not just use variadic mpl: http://svn.boost.org/svn/boost/sandbox/variadic_templates/ which has no limit on mpl::vector size? It would simply involve adding another -I to the compile flags. [snip]

On 01/08/11 20:51, Larry Evans wrote:
Or even try the attachment to see how one_of_maybe alternative to boost::variant: http://svn.boost.org/svn/boost/sandbox/variadic_templates/ boost/composite_storage/pack/container_one_of_maybe.hpp compares. -regards, Larry

On 9-1-2011 1:32, Steven Watanabe wrote:
> Does the cost increase continuously, with the number of elements?

Some more test results: I've measured compile + link times for 10, 20, 30, 40, ... elements.

    typedef boost::mpl::vector75<
        boost::shared_ptr<C1>,
        boost::shared_ptr<C2>,
        boost::shared_ptr<C3>,
        ...

No boost::blank was added to the sequence and variant.hpp was not patched (so this is the original version). I also removed the copy/assignment, so it's only construction of the variant and one static visit.

    size: 10 = 0:09
    size: 20 = 0:12
    size: 30 = 0:19
    size: 40 = 0:34
    size: 50 = 1:02
    size: 60 = 1:51
    size: 70 = 3:09
    size: 80 = 5:07   cl.exe working set = 1.7GB
    size: 90 = 9:58   compiler out of heap space (commit 3.2GB)

> Maybe in a few days. boost::mpl::vector75 doesn't exist without some extra work.

Sizes beyond 50 (if I remember correctly) are not supported by default in the boost distribution, but there is a script somewhere to generate the headers (such as boost/mpl/vector/vector100.hpp, 150, 200, ...).

It looks to me like the variant may not be feasible for >50 types. Of course the limit may be larger or smaller depending on the hardware and your patience ;) and also on the extent to which the variant is used. We use, for instance, a simple struct to wrap the variant (forward-declaring the struct) so the load is not exposed that much.

Paul
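As a rough illustration of how quickly the build times listed in this message grow, the sketch below converts them to seconds and computes the ratio between successive sizes. The names `Sample`, `reported_samples` and `growth_ratios` are made up for this example, and the size-90 run is left out because the compiler ran out of heap space before finishing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One measurement: number of bounded types and build time in seconds.
struct Sample { int types; int seconds; };

// Build times reported in the message, converted from m:ss to seconds.
// The failed size-90 run is intentionally omitted.
inline std::vector<Sample> reported_samples() {
    return {{10, 9}, {20, 12}, {30, 19}, {40, 34},
            {50, 62}, {60, 111}, {70, 189}, {80, 307}};
}

// Ratio of each build time to the previous one (one entry fewer than samples).
inline std::vector<double> growth_ratios(const std::vector<Sample>& samples) {
    std::vector<double> ratios;
    for (std::size_t k = 1; k < samples.size(); ++k) {
        ratios.push_back(static_cast<double>(samples[k].seconds) /
                         samples[k - 1].seconds);
    }
    return ratios;
}
```

With these numbers, every step of 10 extra types multiplies the build time by roughly 1.3 to 1.8. Linear growth would give a much smaller factor for the later steps (for example 80/70, about 1.14, for the last one), so the measurements point to clearly super-linear cost.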

On 01/09/11 12:55, Paul wrote:
Paul,

Can you post your test driver and whatever make-like file (maybe a .jam, CMake or plain Makefile) you used to produce and report these results? I'd like to try it on the variadic templates one_of_maybe template mentioned in my other posts to this thread.

BTW, there is a non-variadic version of one_of_maybe sitting in the boost vault:

http://www.boostpro.com/vault/index.php?action=downloadfile&filename=composite_tagged_seq.zip&directory=Data%20Structures&PHPSESSID=13e034cbcc2d4c0a4c6e0b779540ef00

in case you want to try that. However, I have not tested that in a while and it requires explicit specification, via an enumerator, of what you're "injecting" into the variant. I.e.

    T_i t_i;
    v.inject<E_i>(t_i);

where v is the variant with component types T_1, T_2, ..., T_n, and E_i is the corresponding enumerator. For example:

    enum E_I
    { e_1
    , e_2
    ...
    , e_n
    };

Or, you could just use mpl::integral_c<unsigned,0> as the first template arg to the one_of_maybe template, and then all the E_i's would be unsigned values.

TIA.

Larry

Larry, All i have is a vs2005 solution. I stripped the overhead so you should be able to use it; however you might need to add some search paths to boost/lib, etc... Paul

At Fri, 07 Jan 2011 23:45:04 +0100, Paul wrote:
If it's really just shared_ptr's in the variant, I would seriously consider replacing it with shared_ptr<void> (with a type tag if necessary). Just a thought. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
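A minimal sketch of what this suggestion could look like, assuming a `type_info` pointer as the tag. The class `TaggedPtr` and its members are hypothetical names, not anything from Boost, and `std::shared_ptr` stands in for `boost::shared_ptr` so the example only needs the standard library.

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>

// Hypothetical wrapper: one type-erased pointer plus a runtime type tag.
class TaggedPtr {
public:
    TaggedPtr() : tag_(nullptr) {}

    // Remember the static type T the pointer was created with.
    template <typename T>
    explicit TaggedPtr(const std::shared_ptr<T>& p)
        : ptr_(p), tag_(&typeid(T)) {}

    // True if the stored pointer was created from a shared_ptr<T>.
    template <typename T>
    bool holds() const { return tag_ != nullptr && *tag_ == typeid(T); }

    // Returns the typed pointer, or an empty pointer on a tag mismatch.
    template <typename T>
    std::shared_ptr<T> get() const {
        if (!holds<T>()) return std::shared_ptr<T>();
        return std::static_pointer_cast<T>(ptr_);
    }

private:
    std::shared_ptr<void> ptr_;   // keeps the object alive, type erased
    const std::type_info* tag_;   // identifies the original type
};

// Stand-ins for the C1..C50 classes from the original post.
struct C1 { int i; };
struct C2 { int i; };
```

Because nothing here depends on the number of candidate types, the compile cost stays flat as types are added. The price is that a tag mismatch is only detected at run time, and exhaustive, type-safe visitation would have to be built on top.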

At Sat, 08 Jan 2011 02:17:36 -0500, Dave Abrahams wrote:
Seriously, could be a big simplification and speedup vs. using variant. not-going-to-mention-it-again-ly y'rs, Dave -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave,

Another alternative would be to pass shared/base pointers and use double dispatching to provide functions on concrete types. This, however, burdens the model objects with visitation code, which is fine for 1 or 2 things but very disturbing for >10 functions. With the variant you can move the visitation logic completely to the client/caller code. Hopefully that's not too vague...?

Your option requires dynamic casting and it doesn't allow goodies such as enable_if on subsets of the variant types, which is all resolved at compile time. I do agree, however, that it's much simpler, and in fact we should perhaps have explored this option more back then... ;)

Thanks for thinking along.

Paul

On 8-1-2011 21:53, Dave Abrahams wrote:

At Sat, 08 Jan 2011 22:29:33 +0100, Paul wrote:
Yes. That's essentially the same problem as extracting an object from a variant. However, yes, if you want typesafe visitation you'd need to build that on top, which essentially replicates logic in visitor. In that case I would consider the shared_ptr<void> approach as a possible future optimization but not a game changer. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

At Sat, 8 Jan 2011 23:34:47 +0200, Igor R wrote:
No, tags could be an enum or a type_info*, for example -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On 08/01/2011 23:34, Dave Abrahams wrote:
An enum (or an integer) is a much better idea than a type_info*, since it can be used in a switch statement, and can be mapped naturally to a type in a list of types as an index. Writing a variant replacement is actually quite easy, and doing so would greatly reduce your compile times. Variant is old, full of quirks, and doesn't scale well. Why it even requires its MPL input sequence to be Front Extensible (which it doesn't even state in its documentation) is beyond me. This is a very annoying limitation that makes it impractical to use with a large amount of types, since compatibility with joint_view would be very nice in that situation. The best option is probably to declare your variant-like type using a PP list of types. This way you can directly generate the matching switch statement with minimal overhead. If you need to take a MPL list as input, the best advice is probably to use Steven Watanabe's switch library, as it can generate a switch statement that converts runtime integers into compile-time ones.
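The enum-plus-switch idea can be sketched by hand for three types as below. All names (`Which`, `tag_of`, `PtrVariant`, `NameVisitor`) are hypothetical, `std::shared_ptr` stands in for `boost::shared_ptr`, and the switch is written out manually; in the approach described above, the enumerators, the `tag_of` specializations and the `case` labels would be generated from a single preprocessor list of types.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct C1 { int i; };
struct C2 { int i; };
struct C3 { int i; };

// One enumerator per bounded type.
enum Which { which_C1, which_C2, which_C3 };

// Compile-time map from a bounded type to its enumerator.
template <typename T> struct tag_of;
template <> struct tag_of<C1> { static const Which value = which_C1; };
template <> struct tag_of<C2> { static const Which value = which_C2; };
template <> struct tag_of<C3> { static const Which value = which_C3; };

// Variant-like holder for shared pointers: type-erased pointer plus tag.
class PtrVariant {
public:
    template <typename T>
    PtrVariant(const std::shared_ptr<T>& p)
        : ptr_(p), which_(tag_of<T>::value) {}

    Which which() const { return which_; }

    // Dispatch on the tag; each case restores the static type.
    template <typename Visitor>
    typename Visitor::result_type apply(Visitor& visitor) const {
        switch (which_) {
        case which_C1: return visitor(std::static_pointer_cast<C1>(ptr_));
        case which_C2: return visitor(std::static_pointer_cast<C2>(ptr_));
        case which_C3: return visitor(std::static_pointer_cast<C3>(ptr_));
        }
        assert(false && "invalid tag");
        return typename Visitor::result_type();
    }

private:
    std::shared_ptr<void> ptr_;
    Which which_;
};

// Example visitor returning the name of the held type.
struct NameVisitor {
    typedef std::string result_type;
    std::string operator()(const std::shared_ptr<C1>&) const { return "C1"; }
    std::string operator()(const std::shared_ptr<C2>&) const { return "C2"; }
    std::string operator()(const std::shared_ptr<C3>&) const { return "C3"; }
};
```

Unary visitation here instantiates one visitor call per bounded type and nothing else, so the template work grows linearly with the number of types.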

On 9-1-2011 17:29, Steven Watanabe wrote:
Fully agree on that Steven. Would help though to also mention its limitations + rationale in the documentation. It is mentioned to set /Zm on out of memory errors but the root-cause why you got the out-of-memory problem in the first place is not addressed. Paul

On 1/10/2011 7:35 AM, Mathias Gaunard wrote:
Agreed. Perhaps it's time for V2 that does not necessarily have to be fully backward compatible. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

AMDG On 1/9/2011 7:57 PM, Joel de Guzman wrote:
Why exactly would we need to break backwards compatibility? Eliminating the Front Extensible requirement shouldn't break anything. I don't know of anything in the interface of variant that would seriously interfere with a better implementation. I know that assignment is a mess, but there's a good reason it works the way it does. In Christ, Steven Watanabe

On 1/10/2011 12:22 PM, Steven Watanabe wrote:
Indeed! :-) Also, the Front Extensible quirk is easy to fix. I have code for it. In fact, I mentioned it to Eric Friedman at BoostCon. Compile time? I think that too can be improved without having to rewrite again from scratch. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

On 1/10/2011 12:29 AM, Steven Watanabe wrote:
Agreed 100%. Someone should do a full upgrade for Boost.variant. We use it a lot and its quirks are somewhat annoying (e.g. I had to write my own as_variant from Fusion sequences because of the Front Extensible requirement). A more efficient upgrade would be very welcome! Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

On 9-1-2011 15:03, Mathias Gaunard wrote:

We use the following logic to work with subsets:

    //Join the two mpl sequences
    typedef boost::mpl::joint_view<Mplv1_t, Mplv2_t>::type Mplv_t;

    typedef boost::make_variant_over<
        boost::mpl::copy<
            //make_variant_over requires an 'extensible sequence' while joint_view creates a forward sequence
            Mplv_t,
            boost::mpl::back_inserter<boost::mpl::vector0<> >
        >::type
    >::type v_t;

This solution probably takes quite some compile time as well, but since the sequences only hold 5 or 6 types it's sufficient. It does allow the use of enable_if(contains<mplv1_t, T>) in the static-visitor functor, which gives some really nice code, meaning efficient and easy to understand.

Paul

It looks like we are heading for an alternative approach (again). Based on boost::variant, I'm writing my own variant that is optimized to hold only shared_ptrs. First tests seem promising: I can finally create variants with 200 types in 2 sec compilation :)

However, binary invocation quickly becomes a problem even here; it generates 'paths' quadratically, so 50 bounded types produce 2500 paths....

Maybe someone can provide some feedback on the following implementation?

Thanks,
Paul

    template<typename Typelist>
    class CLoPtrVariant
    {
    public:
        CLoPtrVariant()
            : m_uiSelect(0)
        {
            // precondition assertions
            BOOST_STATIC_ASSERT((boost::mpl::is_sequence<Typelist>::value));
        }

        template<typename Type>
        CLoPtrVariant(const boost::shared_ptr<Type>& rspValue)
        {
            assign(rspValue);
        }

        CLoPtrVariant(const CLoPtrVariant& rOperand)
        {
            m_spSelect = rOperand.m_spSelect;
            m_uiSelect = rOperand.m_uiSelect;
        }

        template<typename Typelist2>
        CLoPtrVariant(const CLoPtrVariant<Typelist2>& rOperand)
        {
            CConvertVariant<Typelist> convert(*this);
            apply_visitor(convert, rOperand);
        }

        template<typename Type>
        CLoPtrVariant& operator=(const boost::shared_ptr<Type>& rspValue)
        {
            assign(rspValue);
            return *this;
        }

        template<typename Type>
        CLoPtrVariant& operator=(boost::shared_ptr<Type>& rspValue)
        {
            assign(rspValue);
            return *this;
        }

        bool operator<(const CLoPtrVariant& rRhs) const
        {
            if(m_uiSelect != rRhs.m_uiSelect)
            {
                return (m_uiSelect < rRhs.m_uiSelect);
            }
            else
            {
                return (m_spSelect < rRhs.m_spSelect);
            }
        }

        bool operator==(const CLoPtrVariant& rRhs) const
        {
            return (m_uiSelect == rRhs.m_uiSelect && m_spSelect == rRhs.m_spSelect);
        }

        bool operator!=(const CLoPtrVariant& rRhs) const
        {
            return !(*this == rRhs);
        }

        template<typename VisitorType>
        typename VisitorType::result_type apply_visitor(VisitorType& rVisitor) const
        {
            switch (m_uiSelect)
            {
    #ifdef BOOST_PP_LOCAL_ITERATE
    #define BOOST_PP_LOCAL_MACRO(n) \
            case n: \
                return apply_visitor_impl<VisitorType, boost::mpl::int_<n>
            <...>

        size_t GetWhich() const
        {
            return m_uiSelect;
        }

        void SetWhich(size_t which)
        {
            LO_CHECK(!m_spSelect);
            m_uiSelect = which;
        }

    private:
        template<typename Type>
        void assign(const boost::shared_ptr<Type>& rspValue)
        {
            typedef boost::mpl::find<Typelist, boost::shared_ptr<Type>
            <...>

        template<typename VisitorType, typename IndexType>
        inline typename VisitorType::result_type apply_visitor_impl(VisitorType& rVisitor, boost::mpl::true_ /*is_unrolled_t*/) const
        {
            typedef boost::mpl::at<Typelist, IndexType>::type type_t;
            return rVisitor(boost::static_pointer_cast<type_t::value_type>(m_spSelect));
        }

        template<typename VisitorType, typename IndexType>
        inline typename VisitorType::result_type apply_visitor_impl(VisitorType& rVisitor, boost::mpl::false_ /*is_unrolled_t*/) const
        {
            //Should never be here at runtime; only required to block code generation that deref's the sequence out of bounds
            BOOST_ASSERT(false);
            return VisitorType::result_type();
        }

        template<typename VisitorType, typename IndexType>
        inline typename VisitorType::result_type apply_visitor_impl(VisitorType& rVisitor) const
        {
            typedef typename boost::mpl::less<IndexType, boost::mpl::size<Typelist>::type>::type is_unrolled_t;
            return apply_visitor_impl<VisitorType, IndexType>(rVisitor, is_unrolled_t());
        }

    private:
        boost::shared_ptr<void> m_spSelect;
        size_t m_uiSelect;
    };

    //Helper function-template to construct the variant type
    template<typename Typelist>
    struct make_variant_over
    {
    public:
        typedef CLoPtrVariant<Typelist> type;
    };

    //Unary visitation
    template<typename Visitor, typename Visitable>
    inline typename Visitor::result_type apply_visitor(const Visitor& visitor, Visitable& visitable)
    {
        return visitable.apply_visitor(visitor);
    }

    template<typename Visitor, typename Visitable>
    inline typename Visitor::result_type apply_visitor(Visitor& visitor, Visitable& visitable)
    {
        return visitable.apply_visitor(visitor);
    }

    //Binary visitation
    template <typename Visitor, typename Visitable2>
    class CBinaryUnwrap1
    {
    public:
        typedef typename Visitor::result_type result_type;

    private:
        Visitor& visitor_;
        Visitable2& visitable2_;

    public:
        CBinaryUnwrap1(Visitor& visitor, Visitable2& visitable2)
            : visitor_(visitor)
            , visitable2_(visitable2)
        {
        }

    public:
        template<typename Value1>
        result_type operator()(Value1& value1)
        {
            CBinaryUnwrap2<Visitor, Value1> unwrapper(visitor_, value1);
            return apply_visitor(unwrapper, visitable2_);
        }
    };

    template<typename Visitor, typename Value1>
    class CBinaryUnwrap2
    {
    public:
        typedef typename Visitor::result_type result_type;

    private:
        Visitor& visitor_;
        Value1& value1_;

    public:
        CBinaryUnwrap2(Visitor& visitor, Value1& value1)
            : visitor_(visitor)
            , value1_(value1)
        {
        }

    public:
        template <typename Value2>
        result_type operator()(Value2& value2)
        {
            return visitor_(value1_, value2);
        }
    };

    template<typename Visitor, typename Visitable1, typename Visitable2>
    inline typename Visitor::result_type apply_visitor(const Visitor& visitor, Visitable1& visitable1, Visitable2& visitable2)
    {
        CBinaryUnwrap1<const Visitor, Visitable2> unwrapper(visitor, visitable2);
        return apply_visitor(unwrapper, visitable1);
    }

    template<typename Visitor, typename Visitable1, typename Visitable2>
    inline typename Visitor::result_type apply_visitor(Visitor& visitor, Visitable1& visitable1, Visitable2& visitable2)
    {
        CBinaryUnwrap1<Visitor, Visitable2> unwrapper(visitor, visitable2);
        return apply_visitor(unwrapper, visitable1);
    }

    //Base class for visitor classes
    template<typename R = void>
    class static_visitor
    {
    public:
        typedef R result_type;

    protected:
        // for use as base class only
        static_visitor() { }
        ~static_visitor() { }
    };

    template<typename Base, typename Derived>
    struct is_base_of_smartptr
        : boost::is_base_of<typename Base::value_type, typename Derived::value_type>
    {
    };

    //Convert variant types
    template<typename Typelist>
    class CConvertVariant
        : public static_visitor<>
    {
    public:
        CConvertVariant(CLoPtrVariant<Typelist>& rVariant)
            : m_rVariant(rVariant)
        {
        }

        template<typename Pos, typename Type>
        void assign_variant(boost::shared_ptr<Type>& rValue, boost::mpl::false_) //convertible
        {
            typedef boost::mpl::deref<Pos>::type type_t;
            m_rVariant = boost::static_pointer_cast<type_t::value_type>(rValue);
        }

        template<typename Pos, typename Type>
        void assign_variant(boost::shared_ptr<Type>& rValue, boost::mpl::true_) //not convertible
        {
            BOOST_STATIC_ASSERT((boost::mpl::false_)); //Compiler error here indicates that (one of) the variants bounded types is not convertible to the target variant type
            m_rVariant = rValue;
        }

        template<typename T>
        void operator()(boost::shared_ptr<T>& rValue) //T is not const to match the variant bounded types
        {
            typedef boost::mpl::find_if<Typelist, is_base_of_smartptr<boost::mpl::_1, boost::shared_ptr<T> > >::type pos_t;
            typedef boost::mpl::end<Typelist>::type end_t;
            typedef boost::is_same<pos_t, end_t>::type not_convertible;
            assign_variant<pos_t>(rValue, not_convertible());
        }

    private:
        CLoPtrVariant<Typelist>& m_rVariant;
    };

    //Delayed visitation (std algorithm support)
    template<typename VisitorType>
    class CVisitDelayed
    {
    public:
        typedef typename VisitorType::result_type result_type;

    private:
        VisitorType& visitor_;

    public:
        explicit CVisitDelayed(VisitorType& visitor)
            : visitor_(visitor)
        {
        }

    public:
        //Unary
        template<typename Visitable>
        result_type operator()(Visitable& visitable)
        {
            return apply_visitor(visitor_, visitable);
        }

        //Binary
        template<typename Visitable1, typename Visitable2>
        result_type operator()(Visitable1& visitable1, Visitable2& visitable2)
        {
            return apply_visitor(visitor_, visitable1, visitable2);
        }
    };

    template<typename VisitorType>
    inline CVisitDelayed<VisitorType> apply_visitor(VisitorType& visitor)
    {
        return CVisitDelayed<VisitorType>(visitor);
    }

On 01/15/11 08:54, Paul wrote:
[snip] I think maybe that should be expected. After all, if the first argument of a binary signature comes from a set S1 and the second from a set S2, and the sizes of S1 and S2 are N1 and N2, then there have to be N1*N2 different signatures; hence, if N1=50 and N2=50, there are 2500 different signatures. IOW, the binary visitor would have to have member functions something like:

  template
  < typename... S1
  , typename... S2
  >
  struct bin_viz
  {
    void operator()(S1_0& s1, S2_0& s2);
    void operator()(S1_0& s1, S2_1& s2);
    void operator()(S1_0& s1, S2_2& s2);
    ...
    void operator()(S1_0& s1, S2_n& s2);
    void operator()(S1_1& s1, S2_0& s2);
    void operator()(S1_1& s1, S2_1& s2);
    void operator()(S1_1& s1, S2_2& s2);
    ...
    void operator()(S1_1& s1, S2_n& s2);
    void operator()(S1_2& s1, S2_0& s2);
    void operator()(S1_2& s1, S2_1& s2);
    void operator()(S1_2& s1, S2_2& s2);
    ...
    void operator()(S1_2& s1, S2_n& s2);
    .
    .
    .
    void operator()(S1_m& s1, S2_0& s2);
    void operator()(S1_m& s1, S2_1& s2);
    void operator()(S1_m& s1, S2_2& s2);
    ...
    void operator()(S1_m& s1, S2_n& s2);
  };

where S1 is a typelist with members S1_0, S1_1, ..., S1_m and S2 is a typelist with members S2_0, S2_1, ..., S2_n. IOW, there would be about m*n member functions; hence, the quadratic compile times. However, I'm just guessing now; I've not actually measured it or tried to show this is the reason, but it would be the first place I'd look. HTH. -regards, Larry
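To make the N1*N2 argument concrete, here is a minimal sketch that counts how many distinct (T, U) signatures a generic binary visitor is actually called with. It is an illustration only: it uses C++17 std::variant and std::visit as a stand-in for boost::variant and apply_visitor, and the types A, B, C and the names RecordPair and count_binary_signatures are hypothetical, not part of the code discussed in this thread.

```cpp
#include <cstddef>
#include <set>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <variant>

// Hypothetical stand-ins for the bounded types S1_0..S1_m and S2_0..S2_n.
struct A {};
struct B {};
struct C {};

using Var = std::variant<A, B, C>;
using SignatureSet = std::set<std::pair<std::type_index, std::type_index>>;

// Generic binary visitor: every distinct (T, U) pair is a separate
// instantiation of operator(), which the compiler has to generate.
struct RecordPair {
    SignatureSet& seen;

    template <typename T, typename U>
    void operator()(const T&, const U&) const {
        seen.insert({std::type_index(typeid(T)), std::type_index(typeid(U))});
    }
};

// Visit every combination of alternatives and count the distinct
// signatures that were reached at run time.
inline std::size_t count_binary_signatures() {
    const Var values[] = {A{}, B{}, C{}};
    SignatureSet seen;
    for (const Var& lhs : values)
        for (const Var& rhs : values)
            std::visit(RecordPair{seen}, lhs, rhs);
    return seen.size();
}
```

With three alternatives on each side, count_binary_signatures() reaches 3*3 = 9 signatures; with 50 alternatives on each side the same visitor would need 2500 instantiations, which is consistent with the quadratic growth guessed at above.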

On 01/15/11 17:43, Larry Evans wrote:
Actually, I now remember encountering this problem and reporting it here: http://article.gmane.org/gmane.comp.parsers.spirit.general/20163 BTW, on a related note (although not much use to you since you don't have a variadic template compiler available), there's a comparison of the variant container and the one_of_maybe container in the boost vault: http://www.boostpro.com/vault/index.php?action=downloadfile&filename=variant_test.zip&directory=Data%20Structures& It has a nice graph of compile times vs number of components. It shows one_of_maybe is faster and grows more slowly as the number of components increases. HTH. -Larry

On 01/15/11 08:54, Paul wrote:
[snip] Could you post a test driver showing the problem. I'd like to see how: http://svn.boost.org/svn/boost/sandbox/variadic_templates/boost/composite_st... performs on the same or similar problem. TIA. Larry

la> [snip]
Could you post a test driver showing the problem. I'd like to see how:
In rough (but working) code:

  class CVisitBinary
    : public static_visitor<bool>
  {
  public:
    template<typename T, typename U>
    bool operator()(const T&, const U&) const
    {
      return false;
    }

    template<typename T>
    bool operator()(const T&, const T&) const
    {
      return true;
    }
  };

  template<typename Typelist, typename ElementType1, typename ElementType2>
  void TestVariantBinary()
  {
    typedef Layout::make_variant_over<Typelist>::type variant_t;

    variant_t v1_1(boost::shared_ptr<ElementType1>(new ElementType1));
    variant_t v1_2(boost::shared_ptr<ElementType1>(new ElementType1));
    variant_t v2_1(boost::shared_ptr<ElementType2>(new ElementType2));

    BOOST_CHECK(apply_visitor(CVisitBinary(), v1_1, v1_2));  //Same type
    BOOST_CHECK(apply_visitor(CVisitBinary(), v1_1, v1_1));  //Same type, same instance
    BOOST_CHECK(!apply_visitor(CVisitBinary(), v1_1, v2_1)); //Different types
  }

  //Generate 200 types (class C1, C2, ... C200)
  #define BOOST_PP_LOCAL_MACRO(n) \
    class BOOST_PP_CAT(I, n) \
    { \
    public: \
    }; \
    \
    class BOOST_PP_CAT(C, n) \
      : public BOOST_PP_CAT(I, n) \
    { \
    public: \
      int i; \
    };

  #define BOOST_PP_LOCAL_LIMITS (0, 200)
  #include BOOST_PP_LOCAL_ITERATE()

  //Generate mpl sequences
  #define SHARED_PTR_C(z, n, data) boost::shared_ptr<BOOST_PP_CAT(C, n)>
  #define SHARED_PTR_I(z, n, data) boost::shared_ptr<BOOST_PP_CAT(I, n)>

  typedef boost::mpl::vector10<BOOST_PP_ENUM(10, SHARED_PTR_C, 0)> MplVectorC10_t;

  void test()
  {
    TestVariantBinary<MplVectorC10_t, C3, C4>(); //2 phase visitation giving 100 paths
  }
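For readers without the Layout headers, the same "same type / different type" check can be sketched in a self-contained form. This is only an approximation of the driver above: it swaps in C++17 std::variant, std::visit and std::shared_ptr for the Boost and Layout equivalents, uses just two bounded types, and the names SameType and holds_same_type are hypothetical.

```cpp
#include <memory>
#include <variant>

// Two hypothetical element types, standing in for the generated C3 and C4.
struct C3 { int i = 0; };
struct C4 { int i = 0; };

using variant_t = std::variant<std::shared_ptr<C3>, std::shared_ptr<C4>>;

// Binary visitor: the (T, T) overload is more specialized, so partial
// ordering selects it whenever both operands hold the same bounded type.
struct SameType {
    template <typename T, typename U>
    bool operator()(const T&, const U&) const { return false; }

    template <typename T>
    bool operator()(const T&, const T&) const { return true; }
};

// Two-phase visitation over both variants; with N bounded types this
// instantiates N*N call paths.
inline bool holds_same_type(const variant_t& a, const variant_t& b) {
    return std::visit(SameType{}, a, b);
}
```

For this particular check, comparing a.index() == b.index() (or which() on a boost::variant) would avoid binary visitation entirely; the visitor form is shown because it is the construct whose compile time is being measured.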

On 01/17/11 14:18, Paul wrote:
However, it appears to need, at least, the #includes:

  #include <boost/shared_ptr.hpp>
  #include <boost/mpl/vector.hpp>
  #include <boost/preprocessor/cat.hpp>
  #include <boost/preprocessor/iteration/local.hpp>
  #include <boost/test/minimal.hpp>

Anything else? previous post: http://article.gmane.org/gmane.comp.lib.boost.user/65156 ? [snip]
Is Layout a new namespace? I don't see it in this post or the previous one? [snip] Thanks for any help, Paul. -regards, Larry

On 01/17/11 15:24, Larry Evans wrote:
1) In assign_variant(,true), I had to disable the BOOST_STATIC_ASSERT because it was always causing a compiler error.
2) I had to forward declare:

  CLoPtrVariant(const CLoPtrVariant<Typelist2>& rOperand)

because the body used CConvertVariant, which hadn't been declared yet.

However, I'm still getting a compile error with gcc4.5.1. A partial list of the compilation output is also attached. Any ideas what's going wrong? -Larry

On 01/17/11 18:35, Larry Evans wrote: [snip]
[snip] Apparently gcc4.5.1 is more stringent about converting value types to reference types. Changing:

  template
  < typename VisitorType
  , typename IndexType
  >
  inline typename VisitorType::result_type
  apply_visitor_impl
  ( VisitorType& rVisitor
  , boost::mpl::true_ /*is_unrolled_t*/
  ) const
  {
    typedef typename boost::mpl::at<Typelist, IndexType>::type type_t;
    return rVisitor
      ( boost::static_pointer_cast<typename type_t::value_type>
        ( m_spSelect
        )
      );
  }

to:

  template
  < typename VisitorType
  , typename IndexType
  >
  inline typename VisitorType::result_type
  apply_visitor_impl
  ( VisitorType& rVisitor
  , boost::mpl::true_ /*is_unrolled_t*/
  ) const
  {
    typedef typename boost::mpl::at<Typelist, IndexType>::type::value_type value_t;
    boost::shared_ptr<value_t> spc(boost::static_pointer_cast<value_t>(m_spSelect));
    return rVisitor(spc);
  }

at around line 163 of CLoPtrVariant.hpp solved the problem.
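The underlying rule is that the result of a pointer cast is a temporary, and a temporary cannot bind to a non-const lvalue reference parameter. A minimal sketch of the same fix, using std::shared_ptr instead of boost::shared_ptr and hypothetical names (Base, Derived, Visitor, visit_as_derived) that are not from CLoPtrVariant.hpp:

```cpp
#include <memory>

struct Base { virtual ~Base() = default; };
struct Derived : Base { int i = 42; };

// Visitor whose call operator takes a non-const lvalue reference,
// like the visitors in the thread.
struct Visitor {
    int operator()(std::shared_ptr<Derived>& sp) const { return sp->i; }
};

inline int visit_as_derived(const std::shared_ptr<Base>& select) {
    Visitor visitor;
    // return visitor(std::static_pointer_cast<Derived>(select));
    //   ^ rejected by conforming compilers: the cast yields a temporary,
    //     which cannot bind to std::shared_ptr<Derived>&.
    std::shared_ptr<Derived> spc(std::static_pointer_cast<Derived>(select));
    return visitor(spc);  // a named local is an lvalue, so this binds
}
```

Visual C++ has historically accepted the commented-out form as a language extension, which would explain why the original compiled with vs2005 but not with g++; that is an inference from the symptoms, not something verified in this thread.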

This might also indicate you are trying to assign a type to the variant that is not one of the bounded types. See your callstack.
Odd, since it's a template/argument-naming dependent function and the poi is way down, it should give you no problems!? Anyway, it's not used in your test, so you may also strip it.

On 01/27/11 13:26, Paul wrote:
After changing the code to:

  BOOST_MPL_ASSERT((boost::mpl::false_)); //Compiler error here indicates that (one of) the variants bounded types is no

to allow the compilation to proceed a little further, the g++ compiler just prints:

  CLoPtrVariant.hpp: In member function 'void Layout::CConvertVariant<Typelist>::assign_variant(boost::shared_ptr<Type>&, mpl_::true_)':
  CLoPtrVariant.hpp:315:7: error: no matching function for call to 'assertion_failed(mpl_::failed************ mpl_::bool_<false>::************&)'

So, there's no callstack available. BTW, if the BOOST_STATIC_ASSERT (or in the above modification, the BOOST_MPL_ASSERT) always fails, as it always must since it is passed the compile time constant, boost::mpl::false_, then the following assignment statement:

  m_rVariant = rValue;

can never execute; so, why is it there? OTOH, why bother with assign_variant at all since the assert can take place in the calling routine.
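One common way to get an assertion that only fires when the "not convertible" overload is actually selected is to make its condition depend on a template parameter. The sketch below shows that idea with C++11 static_assert rather than BOOST_STATIC_ASSERT or BOOST_MPL_ASSERT; always_false, Converter, Apple and Pear are hypothetical names for illustration, not part of CLoPtrVariant.hpp.

```cpp
#include <type_traits>

// Always false, but dependent on T, so the assertion below is only
// evaluated when the enclosing function template is instantiated.
template <typename T>
struct always_false : std::false_type {};

struct Apple {};
struct Pear {};

template <typename Target>
struct Converter {
    // Chosen when the source type is the target type: conversion allowed.
    int assign(const Target&) const { return 1; }

    // Fallback for every other type: selecting it is a compile-time error.
    template <typename Other>
    int assign(const Other&) const {
        static_assert(always_false<Other>::value,
                      "source type is not convertible to the target type");
        return 0;
    }
};
```

Here Converter<Apple>().assign(Apple()) compiles and picks the first overload, while Converter<Apple>().assign(Pear()) would trigger the assertion. An assertion on a non-dependent constant, by contrast, may be diagnosed as soon as the template is defined, which matches the behaviour reported with g++ above.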
Does poi mean "point of instantiation"? If so, then I don't see how that's relevant. AFAICT, the problem is that there's no declaration of CConvertVariant before it's used and the g++ compiler, understandably, issues the diagnostic. Now if CConvertVariant were declared after the CTOR, but *within* the CLoPtrVariant class, then I think it would work; however, since CConvertVariant is declared outside of CLoPtrVariant, there's a problem. Maybe visual c++ allows this freedom; however, I don't think that is portable.

On 01/17/11 14:18, Paul wrote:
With the program in the 1st attachment, when compiled with gcc4.5.1 with the -ftime-report option, I get the timing results shown in the 2nd attachment. Plotting the results using: http://soft.proindependent.com/qtiplot.html showed quadratic-looking curves with boost::variant the clear winner. Also with qtiplot, the curves were fitted (pretty closely, judging from the fitted quadratics). The log output from the fitting is shown in the 3rd attachment, which shows the polynomial coefficients. So, based on these fitting results, you might find the boost::variant visitors faster than the CLoPtrVariant visitors. I've no idea why a specialized visitor, like CLoPtrVariant, would be slower than the more general one, like boost::variant. Indeed, the most general visitor, OneOfMaybe, is the slowest, so there's no obvious general rule :( HTH. -Larry

Larry, Sorry about not having complete compiling test code; I don't have a gcc environment available, so it's difficult for me to get things right. In the meantime I had to make some fixes as well; maybe you could try with the attached code? You will probably need to make the same changes/fixes as before, but it does have an important improvement to reduce the number of instantiations. It's working on the win platform with vs2005; solution compilation time was reduced drastically, by about 50%, and total heap usage also dropped about 1.5GB. In your test app, please also try with a higher number of types, like 100 or even 200, just to see where the limits actually are. Best regards, Paul

On 01/24/11 14:30, Paul wrote:
I'm pretty sure that if you had started your improvements from my patches to your original code, instead of from your original code, it would compile with g++. Please try that and see whether it also compiles with vs2005. -regards, Larry

On 01/24/11 18:17, Larry Evans wrote:
OTOH, if you'd like to plot the results yourself, you might find the attached .py file useful. Of course you'll need python installed as well as the python libraries described here: http://docs.scipy.org/doc/ http://matplotlib.sourceforge.net/contents.html

The attached .py takes 1 argument, a filename. The contents of such a file are:

  line 1: a plot title.
  line 2: a list of space-delimited column names.
  lines 3-...: a table of x values vs y values; x values are in column 1, y values are in the remaining columns.

The resulting plot contains n curves, where n is the number of columns - 1. An example input file is also attached. HTH. -Larry

Hopefully the attached header will pass your test? Since I have no gcc or python environment currently available, it will be more difficult for me to test this. I'm curious whether you find the same improvement in our variant compared to boost::variant as we have seen on our project. Paul

On 01/27/11 13:19, Paul wrote:
Paul, it still does not compile with gcc. The code attached to my earlier post forward declared:

  template<typename Typelist2> CLoPtrVariant(const CLoPtrVariant<Typelist2>& rOperand);

This was necessary because the body used:

  CConvertVariant<Typelist> convert(*this);

which had not been declared yet. Only after CConvertVariant was declared could g++ compile the body of:

  template<typename Typelist2> CLoPtrVariant(const CLoPtrVariant<Typelist2>& rOperand)

Please start from the code I provided earlier, which *does* separate the declaration of that constructor from its definition, which occurs *after* the declaration of:

  template<typename Typelist> CConvertVariant

The code you recently posted does *not* separate the declaration from the definition of that constructor. The same applies to another class; I think that class has "wrapper" in its name. Also, use typename within the templates' typedefs, as shown in my earlier post. Then I may be able to compile the code with g++ *and* the code will be more portable.

Since g++ is freely available, I'm wondering why you can't just install it on your machine and compile your code with g++ to make sure your code is portable to g++. Is there some company which you work for that has some policy which forbids that?

To summarize, why haven't you started your modifications from the code I provided earlier, which did separate the declaration from the definition of:

  template<typename Typelist2> CLoPtrVariant(const CLoPtrVariant<Typelist2>& rOperand)

and which did provide the needed typename prefixes to the several typedefs where that was required by g++ (and, I assume, by the c++ standard)? -regards Larry
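The declaration/definition split being asked for can be sketched as follows. This is a reduced illustration with hypothetical classes Holder and Converter standing in for CLoPtrVariant and CConvertVariant; it is not the code from the attachments.

```cpp
template <typename T>
class Holder {
public:
    Holder() : value_() {}
    explicit Holder(const T& v) : value_(v) {}

    // Declaration only: the body needs Converter, which is declared
    // further down, so the definition is deferred until after it.
    template <typename U>
    Holder(const Holder<U>& other);

    const T& get() const { return value_; }
    void set(const T& v) { value_ = v; }

private:
    T value_;
};

// Helper declared after Holder, playing the role of the convert visitor.
template <typename T>
class Converter {
public:
    explicit Converter(Holder<T>& target) : target_(target) {}

    template <typename U>
    void operator()(const U& v) const { target_.set(static_cast<T>(v)); }

private:
    Holder<T>& target_;
};

// Definition placed after Converter, so the name is visible when the
// compiler parses this body.
template <typename T>
template <typename U>
Holder<T>::Holder(const Holder<U>& other) : value_() {
    Converter<T> convert(*this);
    convert(other.get());
}
```

With this layout, constructing a Holder<long> from a Holder<int> goes through the out-of-class constructor, and the template name Converter is already declared at the point where that body is parsed, which is what g++ requires.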

On 01/27/11 19:32, Larry Evans wrote:
variants_compare.zip in: http://www.boostpro.com/vault/index.php?&direction=0&order=&directory=Data%20Structures The zip also has a comparison of just simple assignment. In all cases, CLoPtrVariant is the clear winner. Unfortunately, OneOfMaybe suffers pretty badly in the visitation comparison :( -Larry
participants (9): Dave Abrahams, Igor R, Jeff Flinn, Joel de Guzman, Larry Evans, Mathias Gaunard, Paul, Peter Foelsche, Steven Watanabe