
Thanks for your feedback.
I think the generated code gets somewhat simplified once issue (1) is addressed.
It would help, but I think won't get rid of all the branches. Your refactoring might help more.
I think it would be a mistake to just blindly copy the value of b when b.m_initialized is false, if for no other reason than doing so will lead to endless user complaints about compiler and valgrind warnings. Also, invoking undefined behavior can result in the compiler doing very nasty and unexpected things, even in the absence of runtime issues from reading an "uninitialized" location. Consider the possibility that the compiler can prove that the optional being copied from is uninitialized, and so can conclude that the read of its value is undefined behavior. Probably the *best* one can hope for in such a situation is a compiler warning, and many far worse results are possible.
Consider the completely legal code below: struct cheap_optional_int{ cheap_optional_int() : m_initialized() {} // don't init m_data bool m_initialized; int m_data; }; void assign_boost_cheap_optional_int(cheap_optional_int& a,cheap_optional_int& b){ a=b; // default impl } The compiler generates nothing but 32-bit moves from the source to the destination. This is completely fine for valgrind. It only complains if a branch based is taken based on uninitialized data. 00000000 <assign_boost_cheap_optional_int(cheap_optional_int&, cheap_optional_int&)>: 0: 53 push %ebx 1: 8b 44 24 0c mov 0xc(%esp),%eax 5: 8b 58 04 mov 0x4(%eax),%ebx 8: 8b 08 mov (%eax),%ecx a: 8b 44 24 08 mov 0x8(%esp),%eax e: 89 08 mov %ecx,(%eax) 10: 89 58 04 mov %ebx,0x4(%eax) 13: 5b pop %ebx 14: c3 ret Sorry the assembler is so poorly formatted after it's mailed. The cool thing is cheap_optional_int has_trivial_destructor and has_trivial_copy because we haven't overridden the defaults. Unfotunately overriding the default ctor/dtor always breaks these, even if the code could be optimized out. It may not even be possible for a compiler to solve. Chris _____________________________________________ From: Hite, Christopher Sent: Wednesday, January 25, 2012 6:29 PM To: 'boost@lists.boost.org' Subject: [optional] generates unnessesary code for trivial types When decompiling my code I noticed a bunch of unnessesary code caused by boost::optional. 1) deconstruction typedef boost::optional<int> optional_int; void deconstruct_boost_optional(optional_int& o){ o.~optional_int(); } One would expect this to do nothing. Instead gcc 4.6.0 with O3 generates: if(m_initialized){ // do nothing m_initialized = false; } 00000000 <deconstruct_boost_optional(boost::optional<int>&)>: 0: 8b 44 24 04 mov 0x4(%esp),%eax 4: 80 38 00 cmpb $0x0,(%eax) 7: 74 03 je c <deconstruct_boost_optional(boost::optional<int>&)+0xc> 9: c6 00 00 movb $0x0,(%eax) c: f3 c3 repz ret This one could be easily fixed by removing the bit that sets m_initialized to false, since we're deconstructing anyway. 2) assignment also generates these problems: void assign_boost_optional(optional_int& o){ o=13; } Here there's a semantic issue: we have to decide to use the copy constructor or operator=. This is also wasteful for POD types or any type which has_trivial_copy<>. 3) Even more expensive is if we want to copy an optional<int> void assign_boost_optional(optional_int& a,optional_int& b){ a=b; } 00000000 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)>: 0: 8b 44 24 04 mov 0x4(%esp),%eax 4: 8b 54 24 08 mov 0x8(%esp),%edx 8: 80 38 00 cmpb $0x0,(%eax) b: 74 0b je 18 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x18> d: 80 3a 00 cmpb $0x0,(%edx) 10: 75 16 jne 28 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x28> 12: c6 00 00 movb $0x0,(%eax) 15: c3 ret 16: 66 90 xchg %ax,%ax 18: 80 3a 00 cmpb $0x0,(%edx) 1b: 74 09 je 26 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x26> 1d: 8b 52 04 mov 0x4(%edx),%edx 20: c6 00 01 movb $0x1,(%eax) 23: 89 50 04 mov %edx,0x4(%eax) 26: f3 c3 repz ret 28: 8b 52 04 mov 0x4(%edx),%edx 2b: 89 50 04 mov %edx,0x4(%eax) 2e: c3 ret Three possible branches! Theoretically single 64 bit copy do the job. I'm tempted to say: it would be best if for any T has_trivial_copy< optional<T> > iff has_trivial_copy<T>. It might make a sense to make an exception for huge T, where the copying an unused T is more expensive than the branching. 4) has_trivial_destructor<T> should impl has_trivial_destructor< optional<T> > , but this is hard to implement without specialization of optional. Checking has_trivial_destructor might take care of the complexity of optional<T&> since has_trivial_destructor< T& >. I'd be willing to fix #1. The other issues need some discussion. Chris