[Proto] Inlining failure on gcc 4.x on 64-bit Linux

I'm reporting a performance problem we encountered today. We have a large expression operating on a table of 51200000 elements. In 32-bit mode, our Proto code for NT² works perfectly and outputs a gprof profile like: http://codepad.org/Uyyvkq4X The runtime is something like 1.5s for the whole computation. In 64-bit mode on Linux, using gcc 4.4 or 4.3, we get a massive 8.6s of runtime (!!) and the profile is: http://codepad.org/6VgRsbyR It seems the call to proto::default_ is not inlined (hence the 5.12M calls) and takes 80% of the runtime. I don't think anything is 32/64-bit specific in Proto, so is it a bug in gcc? I don't have the code handy here but will post it tomorrow.

On 7/19/2010 2:30 PM, joel falcou wrote:
I'm reporting a performance problem we encountered today. We have a large expression operating on a table of 51200000 elements.
In 32-bit mode, our Proto code for NT² works perfectly and outputs a gprof profile like: http://codepad.org/Uyyvkq4X
The runtime is something like 1.5s for the whole computation.
In 64-bit mode on Linux, using gcc 4.4 or 4.3, we get a massive 8.6s of runtime (!!) and the profile is: http://codepad.org/6VgRsbyR
It seems the call to proto::default_ is not inlined (hence the 5.12M calls) and takes 80% of the runtime.
I don't think anything is 32/64-bit specific in Proto, so is it a bug in gcc?
There is no platform-specific code in proto. I'd report it as a gcc bug.

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com
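Before filing the gcc report, it may help to check whether inlining alone explains the gap. The sketch below is not Proto code and does not reproduce the NT² expression: `plain_eval`, `forced_eval`, `run` and the `NT2_SKETCH_FORCE_INLINE` macro are hypothetical names made up for illustration. It only shows the general shape of such a test, assuming a small functor called once per element, with one variant left to the compiler's heuristics and one marked with GCC's `always_inline` attribute.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical macro: asks GCC to inline regardless of its cost model.
// On other compilers it falls back to a plain inline hint.
#if defined(__GNUC__)
#  define NT2_SKETCH_FORCE_INLINE inline __attribute__((always_inline))
#else
#  define NT2_SKETCH_FORCE_INLINE inline
#endif

// Stand-in for a per-element evaluator. This is NOT proto::default_,
// only a functor with the same "one call per element" shape.
struct plain_eval
{
    // Inlining is left to the compiler's heuristics.
    double operator()(double a, double b) const { return a * b + a; }
};

struct forced_eval
{
    // Same computation, with inlining forced under GCC.
    NT2_SKETCH_FORCE_INLINE double operator()(double a, double b) const
    {
        return a * b + a;
    }
};

// Applies the evaluator to every element pair and sums the results.
template <class Eval>
double run(std::vector<double> const& x, std::vector<double> const& y)
{
    Eval eval;
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        sum += eval(x[i], y[i]);
    return sum;
}
```

Timing `run<plain_eval>` against `run<forced_eval>` on a large input, once with `-m32` and once with `-m64`, would show whether the two variants diverge only in the 64-bit build. Compiling with `-Winline` additionally makes gcc warn about functions declared inline that it declined to inline, which could be useful material for the bug report.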