
One easy issue I can see (in code right in front of me) is:
const NT& result=a*d-b*c;
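Binding the result to a const reference instead of a plain variable is something that quite a number of people do, and (I think) it would fail if operator- returned a reference: the reference would end up bound to an operand temporary that is destroyed at the end of the full expression, rather than to a lifetime-extended result.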
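To make the lifetime problem concrete, here is a minimal sketch. Everything in it is hypothetical and invented for illustration: `Num` is a toy type that counts live instances, and `sub_by_value` / `sub_by_rvalue_ref` stand in for the two possible signatures of operator-. It is not the library's code. The dangling reference is deliberately never read, so the example itself stays well defined.

```cpp
#include <cassert>
#include <utility>

// Hypothetical toy number type (not the real library type) that counts
// how many instances are currently alive.
struct Num {
    static int live;
    double v;
    explicit Num(double x) : v(x) { ++live; }
    Num(const Num& o) : v(o.v) { ++live; }
    Num(Num&& o) noexcept : v(o.v) { ++live; }
    ~Num() { --live; }
};
int Num::live = 0;

Num operator*(const Num& a, const Num& b) { return Num(a.v * b.v); }

// Safe variant: reuses the rvalue operand's value but returns a new object.
Num sub_by_value(Num&& a, const Num& b) { a.v -= b.v; return std::move(a); }

// Unsafe variant: returns a reference to its own (temporary) argument.
Num&& sub_by_rvalue_ref(Num&& a, const Num& b) { a.v -= b.v; return std::move(a); }

// Number of objects still alive, beyond a..d, after binding the result.
int extra_live_after_value_binding() {
    Num a(1), b(2), c(3), d(4);
    const int before = Num::live;
    const Num& result = sub_by_value(a * d, b * c);
    assert(result.v == -2.0);     // safe to read: the temporary's lifetime is extended
    (void)result;
    return Num::live - before;    // the returned object is still alive
}

int extra_live_after_ref_binding() {
    Num a(1), b(2), c(3), d(4);
    const int before = Num::live;
    const Num& result = sub_by_rvalue_ref(a * d, b * c);
    (void)result;                 // never read: it refers to a destroyed temporary
    return Num::live - before;    // nothing extra survived the full expression
}
```

With the by-value return exactly one extra object survives the statement (the one `result` refers to); with the reference return none does, so `result` dangles. Some compilers may warn about the second binding, which is rather the point.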
Killer argument :-(
Note that it also means that the unary operator+ (which I hope no one ever uses) is "unsafe" in a number of libraries.
Nod :-(

So, I've updated to return by value, discovered that my move constructors weren't actually moving (more care is needed when calling base class constructors to prevent them from copying!), and generally added more move support throughout. Results are now:

Horner polynomial case:

  Expression templates: 0 temporaries.
  Move semantics: 0 temporaries!

This isn't quite as good as it looks: in the test one temporary is created in the move case, but it gets wiped out by the move constructor. So were we assigning to an already constructed variable, these two cases would differ by one temporary. Oh, and I'm using allocations as a proxy measurement for temporaries, but in reality we are creating temporaries in the move case (lots of them), they just don't need to allocate.

Bessel function test:

  Time for mpf_float_50 = 4.96608 seconds
  Total allocations for mpf_float_50 = 2592537
  Time for mpf_float_50 (no expression templates) = 5.2808 seconds
  Total allocations for mpf_float_50 (no expression templates) = 4174720

And again with BOOST_NO_RVALUE_REFERENCES defined:

  Time for mpf_float_50 = 5.20759 seconds
  Total allocations for mpf_float_50 = 2594441
  Time for mpf_float_50 (no expression templates) = 6.10358 seconds
  Total allocations for mpf_float_50 (no expression templates) = 6498034

So basically, the rvalue ref no-ET code does as well as the no-rvalue expression template code, except of course turning on rvalue ref support helps the ET code as well, so ETs still win overall. In other words, as you might expect, best performance comes from having both.

Hope that makes sense, John.
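For anyone who hasn't been bitten by the base-class constructor issue mentioned above, here is a rough sketch of how it can happen. The names (`Backend`, `BuggyNumber`, `FixedNumber`, `copies_made_by_move`) are made up for this example and are not the actual library classes; the copy/move counters play the same role as counting allocations does in the figures above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical base class owning heap storage; counts copies and moves.
struct Backend {
    static int copies;
    static int moves;
    std::vector<int> limbs;
    Backend() : limbs(8, 0) {}
    Backend(const Backend& o) : limbs(o.limbs) { ++copies; }
    Backend(Backend&& o) noexcept : limbs(std::move(o.limbs)) { ++moves; }
};
int Backend::copies = 0;
int Backend::moves = 0;

// Buggy: inside the constructor 'o' is a named variable and therefore an
// lvalue, so Backend(o) selects the base class COPY constructor.
struct BuggyNumber : Backend {
    BuggyNumber() = default;
    BuggyNumber(BuggyNumber&& o) : Backend(o) {}
};

// Fixed: std::move(o) turns it back into an rvalue, so the base class
// move constructor is selected and the storage is stolen, not copied.
struct FixedNumber : Backend {
    FixedNumber() = default;
    FixedNumber(FixedNumber&& o) noexcept : Backend(std::move(o)) {}
};

// Returns how many base class copies a single "move" construction of T costs.
template <class T>
int copies_made_by_move() {
    Backend::copies = 0;
    Backend::moves = 0;
    T src;
    T dst(std::move(src));
    (void)dst;
    // Exactly one base sub-object was constructed from src, one way or the other.
    assert(Backend::copies + Backend::moves == 1);
    return Backend::copies;
}
```

Here `copies_made_by_move<BuggyNumber>()` reports one copy while `copies_made_by_move<FixedNumber>()` reports none, even though both classes declare something that looks like a move constructor.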