
Hi Thorsten,
The point is: making the function 'inline', either with the explicit keyword or by placing it in the class body, *increases* the chances that it will be inlined. And since inlining here can cause code bloat, it's better not to increase those chances.
I agree I should investigate how my lib performs in this regard. I do think that we want all the inlining that we can get. For example,
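To make the distinction concrete, here is a minimal sketch of the two forms being compared. The type and function names are hypothetical and chosen only for illustration; they are not taken from the library under discussion.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types, named only for this illustration.
struct InClass {
    std::vector<int> data;
    // Defined in the class body, so implicitly 'inline'.
    void add(int x) { data.push_back(x); }
};

struct OutOfClass {
    std::vector<int> data;
    void add(int x);  // declared here, defined below
};

// Defined outside the class without the 'inline' keyword: no inlining hint.
// (A non-template definition like this belongs in a single .cpp file.)
void OutOfClass::add(int x) { data.push_back(x); }

// Both variants behave identically; only the hint to the compiler differs.
void check_same_behaviour() {
    InClass a;
    OutOfClass b;
    a.add(1); a.add(2);
    b.add(1); b.add(2);
    assert(a.data == b.data);
}
```

Whether either call is actually expanded inline remains the compiler's decision; the in-class form only makes it more likely.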
vector<int> v; v += 1,2;
should preferably be expanded to
vector<int> v; v.push_back( 1 ); v.push_back( 2 );
I don't see any benefit of another layer of function-calls (except larger code size:-)).
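For readers unfamiliar with the syntax, the following is a rough sketch of how an expression like `v += 1,2` can be made to work. It assumes a proxy object returned by `operator+=` that overloads the comma operator; the name `list_inserter_sketch` is hypothetical and this is not the library's actual implementation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical proxy, not the library's actual class.
template <class Container>
class list_inserter_sketch {
    Container& c_;
public:
    explicit list_inserter_sketch(Container& c) : c_(c) {}

    // Each comma appends one more element and returns the proxy for chaining.
    template <class T>
    list_inserter_sketch& operator,(const T& value) {
        c_.push_back(value);
        return *this;
    }
};

// 'v += x' appends x and hands back a proxy that handles the ', y' part.
template <class T, class V>
list_inserter_sketch<std::vector<T> > operator+=(std::vector<T>& c, const V& value) {
    c.push_back(value);
    return list_inserter_sketch<std::vector<T> >(c);
}

// Parsed as ((v += 1), 2): one operator+= call, then one operator, call.
std::vector<int> make_one_two() {
    std::vector<int> v;
    v += 1, 2;
    return v;
}
```

In this sketch, `operator+=` and `operator,` are the extra layer of function calls; if both are inlined, what remains is the two `push_back` calls.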
Right, in this case inline expansion is ok. Though even if there's another layer of function calls, the only drawback would be a single instantiation of that other function, plus extra time for argument passing.
I've just sketched an example which can be found at
http://zigzag.cs.msu.su:7813/inline
There are two files -- one with in-class definition and one with out-of-class definition. Both are compiled with -O3 but the function is inlined only in the first example and the number of instructions needed for each call grows from 4 to 13. In a real example the difference might be smaller, but it also might be larger :-(
With vc7.1 and como4.3 the results are:
-rwxrwxrwx 1 nesotto None 135168 Apr 9 00:12 cl_inline.exe
-rwxrwxrwx 1 nesotto None 135168 Apr 9 00:08 cl_inline2.exe
-rwxrwxrwx 1 nesotto None 950272 Apr 9 00:07 como_inline.exe
-rwxrwxrwx 1 nesotto None 950272 Apr 9 00:08 como_inline2.exe
950K for a tiny program? Oh, anyway, I think this can have two explanations:
1. Those compilers have different opinions about inlining (which does not disprove the statement that 'inline' increases chances for inlining).
2. The binary size is rounded to some 'page' boundary.
It was some time ago, so the situation might have improved in gcc or in libstdc++, but generally, unnecessary inlining will still increase the code size.
perhaps; except when it decreases code size. :-) Arguably it would be good to see more of what other compilers do; and I would like to see your code optimized for size (-O3 is for speed, right?).
It would also be good to have tests with 1 call, 10 calls, 20 calls and so on, so that we can draw a nice plot, instead of counting the number of assembly instructions ;-)
Ok, I've written a script to measure that, please see the results in
http://zigzag.cs.msu.su:7813/inline/data.pdf
and data for optimization for space (-Os) in
http://zigzag.cs.msu.su:7813/inline/data_space_optimization.cvs
(and it is not much different from -O3)
- Volodya