
- Inline functions are the best way to improve performance. I've found this to be true in my own work. So have lots of other people. Prove us wrong.
Let me have a go at describing some bloat. It's not in my interest to prove anything. I'll just tell you the cost we see in real-life applications.
Thanks for providing some real numbers - much better than endless speculation! If there are any big offenders that are Boost libraries then let's hear about it: I'm sure we can take it ;-)
On Linux, GCC 4.5.x, x86_64, we have executables which load 598 DSO images mapped in 611 memory regions, corresponding to:
- 263'431'108 100.0% bytes mapped from shared libraries
- 261'638'634 99.3% bytes total allocated to sections
- 1'792'474 0.7% bytes padding not in any section (= rounding to page)
The breakdown by sections that are actually loaded into memory:
- 114'086'965 43.3% code (.text)
Ouch, that's one big application.
- 65'205'187 24.8% dynamic symbols and related tables
- 26'825'486 10.2% unwind tables
- 24'356'624 9.2% plt + relocations + related tables
- 19'000'712 7.2% global data
- 11'325'928 4.3% global common data (.bss)
- 749'576 0.3% various shared library headers
- 82'138 0.0% global constructors and destructors
- 5'890 0.0% glibc memory management voodoo
- 128 0.0% thread-specific data
That's ~55% "real stuff", ~25% of symbol tables, ~10% unwind tables, ~10% relocations and PLT. The application virtual memory size is about a gigabyte, so this is a major fraction of the overall footprint.
There are 544'533 symbols which represent 142'548'100 bytes. Of this there are 272'190 weak symbols, or 43'565'063 bytes.
A significant fraction of those weak symbols represent template duplication across libraries, but that's not the only form of bloat we see. There are 2'599 symbols with at least 10 duplicates, total 5'832'419 bytes, and 118 vtables with at least 10 duplicates (about 300k).
Are you able to use separate file template instantiation to reduce duplication?
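For readers unfamiliar with the technique being asked about, here is a minimal sketch of explicit template instantiation. The `Stack` class and the file names in the comments are hypothetical and only serve as illustration. `extern template` is standard since C++11, and GCC accepted it as an extension before that. Everything is shown in one translation unit so that it compiles on its own; in a real project the parts would be split across the files named in the comments.

```cpp
#include <cstddef>
#include <vector>

// --- hypothetical header "stack.hpp" ---------------------------------
template <typename T>
class Stack {
public:
    void push(const T& value) { items_.push_back(value); }
    T pop() {
        T top = items_.back();
        items_.pop_back();
        return top;
    }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;
};

// Explicit instantiation declaration: tells every includer NOT to
// instantiate Stack<int> implicitly, because one library provides it.
extern template class Stack<int>;

// --- hypothetical source file "stack.cpp" ----------------------------
// Explicit instantiation definition: the single place where the code
// for all members of Stack<int> is emitted.
template class Stack<int>;

// --- hypothetical client code ----------------------------------------
// Uses Stack<int> without generating another out-of-line copy of it.
int sum_and_drain(Stack<int>& s) {
    int total = 0;
    while (s.size() > 0) {
        total += s.pop();
    }
    return total;
}
```

The compiler may still inline small members at call sites, but the out-of-line copies, and the weak symbols that go with them, should then come from one library instead of from every library that includes the header.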
So over half of the symbols and about a third of the size are ill-advisedly generated inline functions, virtual function tables (19'043 vtables = 2'802'928 bytes) and type info objects and names (45'961 typeinfo objs + names = 3'142'851 bytes). This goes with accompanying symbol tables, PLTs, unwind tables, and so on.
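One common way to keep vtables and type info from being duplicated is sketched below. It assumes the Itanium C++ ABI, which GCC on Linux x86_64 follows. Under that ABI the vtable and type info of a class are emitted only in the translation unit that defines its "key function", meaning the first virtual function that is neither pure nor inline. A class whose virtual functions are all inline has no key function, so its vtable is emitted as a weak symbol wherever it is needed. The `Shape` and `Circle` classes and the file names are hypothetical.

```cpp
#include <string>

// --- hypothetical header "shape.hpp" ---------------------------------
class Shape {
public:
    // Declared here but defined out of line: this destructor is the
    // key function, so Shape's vtable lives only in "shape.cpp".
    virtual ~Shape();
    virtual std::string name() const { return "shape"; }
};

class Circle : public Shape {
public:
    virtual ~Circle();                 // key function for Circle
    virtual std::string name() const;  // also defined out of line
};

// --- hypothetical source file "shape.cpp" ----------------------------
Shape::~Shape() {}
Circle::~Circle() {}
std::string Circle::name() const { return "circle"; }

// --- hypothetical client code ----------------------------------------
// Dispatches through the single vtable emitted for each class.
std::string describe(const Shape& s) { return s.name(); }
```

If `~Shape()` were instead defined inline in the header, `Shape` would have no key function and every library using it could carry its own copy of the vtable and type info.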
A significant fraction of the 60+ MB symbol tables is obviously for long mangled names.
That is one big problem with templates that I'll freely admit to. Compiler vendors are certainly aware of this, and some of them have reduced mangled name size over the years, but I guess it'll always be an issue :-(
Regards, John.